Functions for managing release branches
release_branch_management.Rmd
The following functions provide tools for managing branches for specific PIP releases. These are mainly intended to be used for auxiliary data, which are stored in dedicated repositories, such as aux_gdp for GDP data or aux_ppp for PPP data. These functions are a starting point to enable vintage control of auxiliary data, so we can track which version of aux data was used for specific releases. They also ensure that release branches are consistently updated with the latest changes from the development branch.
The following workflow is recommended:
Step 1 - Check if repo has release branch
As a first step, check whether the auxiliary data repo already has a release branch. Release branches are named after the release date in YYYYMMDD format (e.g., 20241105 for the November 5, 2024 release).
To do so, call get_repo_branches(). It returns a list with (1) all_branches, the names of all branches; (2) release_branches, the names of any release branches present; and (3) has_release_branch, a logical specifying whether or not the repo has a release branch.
repo_branches <- get_repo_branches(repo  = "aux_ppp",
                                   owner = getOption("pipfun.ghowner"))
names(repo_branches)
#> [1] "all_branches" "release_branches" "has_release_branch"
# printing all elements of the list
repo_branches$all_branches
#> [1] "DEV_v2" "DEV" "PROD" "main"
repo_branches$release_branches
#> character(0)
repo_branches$has_release_branch
#> [1] FALSE
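As a rough illustration of the naming convention, the sketch below flags release-style names in a vector of branch names. It is not how get_repo_branches() is implemented: the helper is_release_name() and the extra branch "20241105" are hypothetical and only serve the example.

```r
# Branch names as shown above, plus one hypothetical release branch
all_branches <- c("DEV_v2", "DEV", "PROD", "main", "20241105")

# Assumed convention: a release branch name is exactly eight digits
# (YYYYMMDD) that also parse as a valid calendar date
is_release_name <- function(x) {
  eight_digits <- grepl("^[0-9]{8}$", x)
  parsed       <- as.Date(x, format = "%Y%m%d")  # NA when not a valid date
  eight_digits & !is.na(parsed)
}

release_branches   <- all_branches[is_release_name(all_branches)]
has_release_branch <- length(release_branches) > 0

release_branches
#> [1] "20241105"
has_release_branch
#> [1] TRUE
```

The date check on top of the regular expression rejects names such as "20241399", which have eight digits but are not real dates.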
Let’s use a fake auxiliary data repo, aux_test, to illustrate how the functions work.
Step 2 - If absent, create release branch
To create a release branch, simply call create_new_branch(), specifying the name of the new branch as well as the name of the reference branch it should be created from.
create_new_branch(repo       = "aux_test",
                  new_branch = format(Sys.Date(), "%Y%m%d"),
                  ref_branch = "main")
#> Git credentials are missing or invalid in non-interactive mode.
#> → branch 20250414 already exists in repo PIP-Technical-Team/aux_test
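Steps 1 and 2 can be combined so that a branch is only requested when it is missing. The sketch below uses a mocked list with the same element names as the get_repo_branches() output, and a fixed release date instead of Sys.Date() so that the result is reproducible. The mock and the variable names are assumptions for illustration, and nothing here contacts GitHub.

```r
# Mocked result, shaped like the get_repo_branches() output shown in Step 1
repo_branches <- list(
  all_branches       = c("DEV_v2", "DEV", "PROD", "main"),
  release_branches   = character(0),
  has_release_branch = FALSE
)

# Build the branch name from an explicit release date (YYYYMMDD)
release_date <- as.Date("2024-11-05")
new_branch   <- format(release_date, "%Y%m%d")

# Only create the branch when it is not already in the repo
needs_creation <- !(new_branch %in% repo_branches$all_branches)

if (needs_creation) {
  message("Branch ", new_branch, " is missing: call create_new_branch() here.")
} else {
  message("Branch ", new_branch, " already exists: nothing to create.")
}
```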
Step 3 - Check if release branch is up to date with development branch
To check whether the release branch is up to date with the most recent version of the development branch, call compare_branch_content(). This function compares the content of the two branches based on the tree SHAs of their latest commits. A tree SHA is a string holding the SHA hash of the tree object associated with a commit. In Git, a tree represents the entire directory structure and content of a commit, so its SHA is a unique identifier of the state of the files and folders in that branch.
Comparing the commit SHAs of the latest commits, by contrast, is not enough to determine whether two branches have the same content, because:
- Tree SHA: Represents the state of the repository’s content (files and directories) at a given point in time.
- Commit SHA: Includes metadata such as the author, timestamp, and commit message, in addition to referencing the tree SHA. Even if the content is identical, differences in metadata can cause the commit SHAs to differ.
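The distinction can be made concrete with two mocked commits that carry the same tree but different metadata. In the sketch below the tree SHA is taken from the main/test_main example in this article, while the commit SHAs, authors and messages are hypothetical placeholders.

```r
# Two mocked commits: identical tree, different metadata.
# The commit SHAs are hypothetical placeholders, not real hashes.
commit_main <- list(
  sha      = "1111111111111111111111111111111111111111",
  tree_sha = "b1430b7286e86c933e61902de5677c42064081d0",
  message  = "Update PPP data"
)
commit_test_main <- list(
  sha      = "2222222222222222222222222222222222222222",
  tree_sha = "b1430b7286e86c933e61902de5677c42064081d0",
  message  = "Update PPP data (re-committed)"
)

# Commit SHAs differ because the metadata differs ...
same_commit <- identical(commit_main$sha, commit_test_main$sha)

# ... but the tree SHAs match, so the files and folders are identical
same_content <- identical(commit_main$tree_sha, commit_test_main$tree_sha)

same_commit
#> [1] FALSE
same_content
#> [1] TRUE
```

On a local clone, git rev-parse main^{tree} prints the tree SHA of the latest commit on main, which is one way to inspect these values by hand.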
As a result, compare_branch_content() returns a list with: (1) the tree SHA of branch 1, (2) the tree SHA of branch 2, and (3) a TRUE/FALSE value indicating whether the two branches have identical content. If the tree SHAs of both branches are the same, same_content will be TRUE; otherwise, it will be FALSE.
br_compare <- compare_branch_content(repo    = "aux_test",
                                     branch1 = format(Sys.Date(), "%Y%m%d"),
                                     branch2 = "DEV")
#> ! The branches 20250414 and DEV have different content at their latest commits.
# Output list
names(br_compare)
#> [1] "tree_sha_1" "tree_sha_2" "same_content"
br_compare$tree_sha_1
#> [1] "b02c728e9ad5c6a52950b6200c923208ccb7db38"
br_compare$tree_sha_2
#> [1] "df6cdf58ecba3397d668b4c6d17995600895dd98"
br_compare$same_content
#> [1] FALSE
# Example with branches of the same content
compare_branch_content(repo    = "aux_test",
                       branch1 = "main",
                       branch2 = "test_main")
#> ✔ The branches main and test_main have the same content at their latest commits.
#> $tree_sha_1
#> [1] "b1430b7286e86c933e61902de5677c42064081d0"
#>
#> $tree_sha_2
#> [1] "b1430b7286e86c933e61902de5677c42064081d0"
#>
#> $same_content
#> [1] TRUE
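The returned list can drive the decision to move on to Step 4. The sketch below rebuilds the first result shown above as a plain list (copied by hand, so no GitHub access is needed) and derives a needs_update flag from it. The flag name is an assumption for illustration.

```r
# Result copied from the release-branch vs. DEV comparison above
br_compare <- list(
  tree_sha_1   = "b02c728e9ad5c6a52950b6200c923208ccb7db38",
  tree_sha_2   = "df6cdf58ecba3397d668b4c6d17995600895dd98",
  same_content = FALSE
)

# Sanity check: same_content should agree with comparing the tree SHAs directly
stopifnot(
  br_compare$same_content ==
    identical(br_compare$tree_sha_1, br_compare$tree_sha_2)
)

# The release branch needs an update only when the content differs
needs_update <- !isTRUE(br_compare$same_content)

if (needs_update) {
  message("Release branch differs from DEV: proceed to Step 4.")
} else {
  message("Release branch already matches DEV: nothing to do.")
}
```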
Step 4 - Update branches
Note (RT): as of now, I have developed two functions that update branches using two different approaches:
- update_branches(): update the target branch to point to the latest commit of the source branch (this overwrites the history of the target branch).
- merge_branch_into(): update the target branch by merging the source branch into it.
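The difference between the two approaches can be pictured with a toy model in which a branch is just a vector of commit ids, oldest first. This is a deliberate simplification (real Git history is a graph, not a vector), and the commit ids are hypothetical.

```r
# Toy model: both branches share c1 and c2, then diverge
source_branch <- c("c1", "c2", "s3")
target_branch <- c("c1", "c2", "t3")

# Approach 1 (move the pointer): the target takes over the source history,
# so the target-only commit "t3" is no longer reachable from the branch
target_after_reset <- source_branch

# Approach 2 (merge): the target keeps its own commits, gains the source
# commits, and records a merge commit "m4" on top
target_after_merge <- c(union(target_branch, source_branch), "m4")

target_after_reset
#> [1] "c1" "c2" "s3"
target_after_merge
#> [1] "c1" "c2" "t3" "s3" "m4"
```

In this picture, the first approach is the one to choose when the release branch should mirror the source exactly, and the second when commits made directly on the release branch must be preserved.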
# Approach 1:
update_branches(repo    = "aux_test",
                branch1 = "DEV_v2",
                branch2 = format(Sys.Date(), "%Y%m%d"))
#> Git credentials are missing or invalid in non-interactive mode.
#> Git credentials are missing or invalid in non-interactive mode.
#> ✔ The SHAs of the latest commits on both branches are the same.
#> ✔ The branches DEV_v2 and 20250414 have the same content at their latest commits.
#> ! Branches are already up-to-date.
#> [1] TRUE
# Approach 2:
merge_branch_into(repo          = "aux_test",
                  source_branch = "DEV",
                  target_branch = format(Sys.Date(), "%Y%m%d"))
#> ! The branches DEV and 20250414 have different content at their latest commits.
#> ! Error merging branches: GitHub API error (403): Resource not accessible by integration
#> [1] FALSE
In both cases, you can set the force option to FALSE to make the function ask for confirmation before proceeding with the operation.
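A confirmation step of this kind is commonly written in R with interactive() and readline(). The helper below is a hypothetical sketch of that pattern, not the code used by update_branches() or merge_branch_into(). In particular, the default force = TRUE and the choice to decline in non-interactive sessions are assumptions.

```r
# Hypothetical helper: returns TRUE when the operation should proceed
confirm_action <- function(prompt, force = TRUE) {
  if (force) {
    return(TRUE)   # forced: proceed without asking
  }
  if (!interactive()) {
    return(FALSE)  # assumption: decline when nobody can answer the prompt
  }
  answer <- readline(paste0(prompt, " [y/N]: "))
  tolower(trimws(answer)) %in% c("y", "yes")
}

# With force = TRUE no question is asked
confirm_action("Overwrite the release branch with DEV_v2?", force = TRUE)
#> [1] TRUE
```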
Additional: Delete branches
To delete a branch, you can use the delete_branch() function. Here’s an example that demonstrates creating and subsequently deleting a branch:
# Create a new branch named "to_delete"
create_new_branch(repo       = "aux_test",
                  new_branch = "to_delete")
#> Git credentials are missing or invalid in non-interactive mode.
#>
#> "GitHub API error (403): Resource not accessible by integration"
#> Error in `create_new_branch()`:
#> ! Failed creating branch to_delete. Check connection and try again
# Delete the branch named "to_delete" (asks for confirmation in interactive mode)
delete_branch(repo             = "aux_test",
              branch_to_delete = "to_delete")
#> Error in `delete_branch()`:
#> ! branch to_delete does not exists. Nothing to delete
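In the run above, the deletion fails because the creation failed first. In a script, one way to avoid the second error is to wrap the first call in tryCatch() and skip the deletion when it did not succeed. The sketch below shows the pattern with a stand-in function that only mimics the error message; failing_operation() and try_branch_op() are hypothetical and do not call GitHub.

```r
# Stand-in for a branch operation that fails, mimicking the error above
failing_operation <- function() {
  stop("Failed creating branch to_delete. Check connection and try again")
}

# Run an operation and report failure as FALSE instead of stopping the script
try_branch_op <- function(op) {
  tryCatch(
    {
      op()
      TRUE
    },
    error = function(e) {
      message("Branch operation failed: ", conditionMessage(e))
      FALSE
    }
  )
}

created <- try_branch_op(failing_operation)

if (created) {
  message("Branch created: it can now be deleted.")
} else {
  message("Skipping deletion because the branch was never created.")
}
```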