Chapter 21 Git Version Control | Bioconductor Packages: Development, Maintenance, and Peer Review

21 Git Version Control

The Bioconductor project is maintained in a Git source control system. Package maintainers update their packages by pushing changes to their git repositories.

This chapter contains several sections that will cover typical scenarios encountered when adding and maintaining a Bioconductor package.

21.1 Essential work flow

A minimal workflow is to checkout, update, commit, and push changes to your repository. Using BiocGenerics as an example:

git clone git@git.bioconductor.org:packages/BiocGenerics
cd BiocGenerics
## add a file, e.g., `touch README`
## edit file, e.g., `vi DESCRIPTION`
BiocGenerics$ git add README      # a new file must be added before it can be committed
BiocGenerics$ git commit README DESCRIPTION
BiocGenerics$ git push

This requires that Bioconductor knows the SSH keys you use to establish your identity.

Two useful commands are

BiocGenerics$ git diff     # review changes prior to commit
BiocGenerics$ git log      # review recent commits

If the repository is already cloned, the work flow is to make sure that you are on the ‘devel’ branch, pull any changes, then introduce your edits.

BiocGenerics$ git checkout devel
BiocGenerics$ git pull
## add, edit, commit, and push as above

21.2 Where to Commit Changes

New features and bug fixes are introduced on the devel branch of the GIT repository.

BiocGenerics$ git checkout devel
BiocGenerics$ git pull
## edit 'R/foo.R' and commit on devel
BiocGenerics$ git commit R/foo.R
[devel c955179] your commit message
1 file changed, 10 insertions(+), 3 deletions(-)
BiocGenerics$ git push

To make more extensive changes see Fix bugs in devel and release.

Bug fixes can be ported to the current release branch. Use cherry-pick to port the commit(s) you would like to apply there. E.g., for release 3.6, porting the most recent commit on devel:

BiocGenerics$ git checkout RELEASE_3_6
BiocGenerics$ git cherry-pick devel
BiocGenerics$ git push
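The porting step can be rehearsed in a throwaway repository before touching a real package. The following is a minimal sketch: the scratch directory, file contents, and demo identity are all made up for illustration, and only the branch names devel and RELEASE_3_6 come from the text above. No push is performed.

```shell
# Rehearse porting the latest devel commit to a release branch.
# The scratch directory, file contents and identity are hypothetical.
demo=$(mktemp -d)
(
  set -e
  cd "$demo"
  git init -q
  git checkout -q -b devel
  git config user.name "Demo Maintainer"
  git config user.email "demo@example.org"

  mkdir R
  echo 'foo <- function() 1' > R/foo.R
  git add R/foo.R
  git commit -q -m "initial commit"

  # The release branch starts from the same commit as devel.
  git branch RELEASE_3_6

  # Fix a bug on devel.
  echo 'foo <- function() 2' > R/foo.R
  git commit -q -m "bug fix: foo returns 2" R/foo.R

  # Port the most recent devel commit to the release branch.
  git checkout -q RELEASE_3_6
  git cherry-pick devel >/dev/null
  git log --oneline -1    # the ported "bug fix" commit, now on RELEASE_3_6
)
```

The cherry-picked commit gets a new commit id on the release branch, so the same fix appears under different ids on devel and RELEASE_3_6.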

21.3 Checks and version bumps

Each commit pushed to the Bioconductor repository should build and check without errors or warnings

BiocGenerics$ cd ..
R CMD build BiocGenerics
R CMD check BiocGenerics_1.22.3.tar.gz

Each commit, in either release or devel, should include a bump in the z portion of the x.y.z package versioning scheme.
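The z bump can be done by hand in an editor, but a small helper avoids slips such as turning 0.99.9 into 0.100.0. This is a sketch only: bump_patch_version is a hypothetical function, not a Bioconductor tool, and it assumes the DESCRIPTION file has a single line of the form "Version: x.y.z".

```shell
# Increment the z part of "Version: x.y.z" in a DESCRIPTION file.
# bump_patch_version is a hypothetical helper, not a Bioconductor tool.
bump_patch_version() {
  desc="${1:-DESCRIPTION}"
  awk '
    $1 == "Version:" {
      n = split($2, part, ".")
      part[n] = part[n] + 1          # 0.99.9 becomes 0.99.10, not 0.100.0
      version = part[1]
      for (i = 2; i <= n; i++) version = version "." part[i]
      print "Version: " version
      next
    }
    { print }
  ' "$desc" > "$desc.tmp" && mv "$desc.tmp" "$desc"
}

# Usage: run in the package directory, then commit the change.
# bump_patch_version DESCRIPTION
# git commit -m "version bump" DESCRIPTION
```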

Builds occur once per day and take approximately 24 hours. See the build report (upper left corner) for the git commits captured in the most recent build.

21.4 Annotation packages

Traditional Annotation packages are not stored in git due to the size of annotation files. To update an existing Annotation package, please email the Bioconductor team. A member of the team will be in contact to receive the updated package.

Newer annotation packages can be stored in git because they are required to use data hosted on the AnnotationHub server; the larger files are not included directly in the package. To contribute a new Annotation package, please contact the Bioconductor team for guidance and read the documentation on Create A Hub Package.

Currently, direct updates to annotation packages, even those stored in git, are not supported. If you wish to update an annotation package, make the required changes and push to git.bioconductor.org. Then send an email to the Bioconductor team requesting that the package be propagated.

21.5 Subversion to Git Transition

The essential steps for transitioning from SVN to git are summarized in the following subsections.

21.5.1 New package workflow

Goal: You developed a package on GitHub, following the Bioconductor new package Contributions README guidelines, submitted it to Bioconductor, and your package has been moderated. As part of the moderation process, the package to be reviewed has been added as a repository on the Bioconductor git server.

During and after the review process, package authors must push changes that include a version number ‘bump’ to the Bioconductor git repository. This causes the package to be built and checked on Linux, macOS, and Windows operating systems, and forms the basis for the review process.

In this document, package authors will learn best practices for pushing to the Bioconductor git repository.

  1. SSH keys. As part of the initial moderation step, Bioconductor will use the SSH public keys available at https://github.com/<your-github-id>.keys.

    After the review process is over, additional SSH keys can be added and contact information edited using the BiocCredentials application.

  2. Configure the “remotes” of your local git repository. You will need to push any future changes to your package to the Bioconductor git repository to issue a new build of your package.

    Add a remote named upstream to your package’s local git repository using:

     git remote add upstream git@git.bioconductor.org:packages/<YOUR-REPOSITORY-NAME>.git

    Check that you have updated the remotes in your repository; you’ll see an origin remote pointing to github.com, and an upstream remote pointing to bioconductor.org

     $ git remote -v
    
     origin  <link to your github> (fetch)
     origin  <link to your github> (push)
     upstream git@git.bioconductor.org:packages/<YOUR-REPOSITORY-NAME>.git (fetch)
     upstream git@git.bioconductor.org:packages/<YOUR-REPOSITORY-NAME>.git (push)

    NOTE: As a package developer, you must use the SSH protocol (as in the above command) to gain read/write access to your package in the Bioconductor git repository.

  3. Add and commit changes to your local repository. During the review process you will likely need to update your package. Do this in your local repository by first making sure your repository is up-to-date with your github.com and git.bioconductor.org repositories.

     git fetch --all
    
     ## merge changes from Bioc (upstream remote at git@git.bioconductor.org)
     git merge upstream/devel
    
     ## merge changes from GitHub (origin remote)
     git merge origin/devel

    Note. If your GitHub default branch is main, replace devel with main:

     ## merge changes from GitHub (origin remote)
     git merge origin/main

    Make changes to your package devel branch and commit them to your local repository

     git add <files changed>
     git commit -m "<informative commit message>"
  4. ‘Bump’ the package version. Your package version number is in the format ‘major.minor.patch’. Initial package submissions should have a version number of 0.99.0, as indicated by the Version field in the DESCRIPTION file. Increment the patch version number by 1, e.g., to 0.99.1, 0.99.2, …, 0.99.9, 0.99.10, …

    Bumping the version number before pushing is essential. It ensures that the package is built across platforms.

    Remember to add and commit these changes to your local repository.

  5. Push changes to the Bioconductor and GitHub repositories. Push the changes in your local repository to the Bioconductor and GitHub repositories.

     ## push to BioC (upstream remote at git@git.bioconductor.org)
     git push upstream devel
    
     ## push to GitHub (origin remote)
     git push origin devel

    Note. If your GitHub default branch is main, replace devel with main:

     ## push to Bioc (upstream remote at git@git.bioconductor.org)
     git push upstream main:devel
    
     ## push to GitHub (origin remote)
     git push origin main
  6. Check the updated build report. If your push to git.bioconductor.org included a version bump, you’ll receive an email directing you to visit your issue on github.com, https://github.com/Bioconductor/Contributions/issues/; a comment is also posted on the issue indicating that a build has started.

    After several minutes a second email and comment will indicate that the build has completed, and that the build report is available. The comment includes a link to the build report. Follow the link to see whether further changes are necessary.

  7. See the other scenarios in this chapter for working with Bioconductor and GitHub repositories.

21.5.2 Create a new GitHub repository for an existing Bioconductor package

Goal: As a maintainer, you’d like to create a new GitHub repository for your existing Bioconductor repository, so that your user community can engage in the development of your package.

  1. Create a new GitHub account if you don’t have one.

  2. Set up remote access to GitHub via SSH or HTTPS. See GitHub's guidance on which remote URL to use, and add your public key to your GitHub account.

  3. Once you have submitted your keys, you can log in to the BiocCredentials application to check that the correct keys are on file with Bioconductor.

  4. Create a new GitHub repo on your account, with the name of the existing Bioconductor package.

    We use “BiocGenerics” as an example for this scenario.

    After pressing the ‘Create repository’ button, ignore the instructions that GitHub provides, and follow the rest of this document.

  5. On your local machine, clone the empty repository from GitHub.

    Use https URL (replace <developer> with your GitHub username)

    git clone https://github.com/<developer>/BiocGenerics.git

    or SSH URL

    git clone git@github.com:<developer>/BiocGenerics.git
  6. Add a remote to your cloned repository.

    Change the current working directory to your local repository cloned in the previous step.

    cd BiocGenerics
    git remote add upstream git@git.bioconductor.org:packages/BiocGenerics.git
  7. Fetch content from remote upstream,

    git fetch upstream
  8. Merge upstream with origin’s devel branch,

    git merge upstream/devel

    NOTE: If you see the error fatal: refusing to merge unrelated histories, then the repository cloned in step 5 was not empty. Either clone an empty repository, or see Sync existing repositories.

  9. Push changes to your origin devel,

     git push origin devel

    NOTE: Run the command git config --global push.default matching to always push local branches to the remote branch of the same name, allowing use of git push origin rather than git push origin devel.

  10. (Optional) Add a branch to GitHub,

    ## Fetch all updates
    git fetch upstream
    
    ## Checkout new branch RELEASE_3_6, from upstream/RELEASE_3_6
    git checkout -b RELEASE_3_6 upstream/RELEASE_3_6
    
    ## Push updates to remote origin's new branch RELEASE_3_6
    git push -u origin RELEASE_3_6
  11. Check your GitHub repository to confirm that the devel (and optionally RELEASE_3_6) branches are present.

  12. Once the GitHub repository is established follow Push to GitHub and Bioconductor to maintain your repository on both GitHub and Bioconductor.

21.5.3 Maintain a Bioconductor-only repository for an existing package

Goal: Developer wishes to maintain their Bioconductor repository without using GitHub.

21.5.3.1 Clone and setup the package on your local machine.

  1. Make sure that you have SSH access to the Bioconductor repository; be sure to submit your SSH public key or github id to Bioconductor.

  2. Clone the package to your local machine,

     git clone git@git.bioconductor.org:packages/<ExamplePackage>

    NOTE: If you clone with https you will NOT get read+write access.

  3. View existing remotes

     git remote -v

    which will display

     origin    git@git.bioconductor.org:packages/<ExamplePackage>.git (fetch)
     origin    git@git.bioconductor.org:packages/<ExamplePackage>.git (push)

    This indicates that your git repository has only one remote origin, which is the Bioconductor repository.

  4. In other work flows, the origin remote has been renamed to upstream. It may be convenient to make this change to your own repository

     git remote rename origin upstream

    and confirm that git remote -v now associates the upstream repository name with git@git.bioconductor.org.
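The check on the remote can also be scripted. The sketch below assumes the remote is named upstream, as in the step above; check_bioc_remote is a hypothetical helper name, and the test is only a string match on the URL, so it does not prove that your SSH key is accepted.

```shell
# Succeeds when the given remote (default: upstream) uses the Bioconductor SSH URL.
# check_bioc_remote is a hypothetical helper name.
check_bioc_remote() {
  remote="${1:-upstream}"
  url=$(git remote get-url "$remote" 2>/dev/null) || {
    echo "no remote named '$remote'" >&2
    return 1
  }
  case "$url" in
    git@git.bioconductor.org:packages/*)
      echo "$remote -> $url (SSH URL, as required for write access)" ;;
    *)
      echo "$remote -> $url (not the Bioconductor SSH URL)" >&2
      return 1 ;;
  esac
}

# Usage, inside the package directory:
# check_bioc_remote upstream
```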

21.5.3.2 Commit changes to your local repository

  1. Before making changes to your repository, make sure to pull changes or updates from the Bioconductor repository. This is needed to avoid conflicts.

     git pull
  2. Make the required changes, then add and commit your changes to your devel branch.

     git add <files changed>
     git commit -m "My informative commit message"
  3. (Alternative) If the changes are non-trivial, create a new branch where you can easily abandon any false starts. Merge the final version onto devel

     git checkout -b feature-my-feature
     ## add and commit to this branch. When the change is complete...
     git checkout devel
     git merge feature-my-feature

21.5.3.3 Push your local commits to the Bioconductor repository

  1. Push your commits to the Bioconductor repository to make them available to the user community.

    Push changes to the devel branch using:

     git checkout devel
     git push upstream devel

Make sure there is a valid version bump for the changes to propagate to users.
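One way to catch a forgotten bump before pushing is to compare the Version in your working tree with the one already on the Bioconductor branch. This is a rough sketch: version_bumped is a hypothetical helper, it assumes a remote named upstream that has been fetched recently, and it only checks that the two versions differ, not that the new one is higher.

```shell
# Succeeds when the Version in the working tree differs from the one on a
# reference branch (default: upstream/devel). Run `git fetch upstream` first.
# version_bumped is a hypothetical helper name.
version_bumped() {
  ref="${1:-upstream/devel}"
  remote_version=$(git show "$ref:DESCRIPTION" | awk '$1 == "Version:" { print $2 }')
  local_version=$(awk '$1 == "Version:" { print $2 }' DESCRIPTION)
  echo "local $local_version, $ref $remote_version"
  [ "$local_version" != "$remote_version" ]
}

# Usage, inside the package directory:
# git fetch upstream
# version_bumped && git push upstream devel
```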

21.5.3.4 (Optional) Merge changes to the current release branch

Merging devel into the release branch should be avoided. Selected bug fixes should be cherry-picked from devel to release. See the section Fix bugs in devel and release.

21.6 More scenarios for repository creation

21.6.1 Sync an existing GitHub repository with Bioconductor

Goal: Ensure that your local, Bioconductor, and GitHub repositories are all in sync.

  1. Clone the GitHub repository to a local machine. Change into the directory containing the repository.

  2. Configure the “remotes” of the GitHub clone.

     git remote add upstream git@git.bioconductor.org:packages/<YOUR-REPOSITORY>.git
  3. Fetch updates from all (Bioconductor and GitHub) remotes. You may see “warning: no common commits”; this will be addressed after resolving conflicts, below.

     git fetch --all
  4. Make sure you are on the devel branch.

     git checkout devel
  5. Merge updates from the GitHub (origin) remote

     git merge origin/devel
  6. Merge updates from the Bioconductor (upstream) remote

     git merge upstream/devel

    Users of git version >= 2.9 will see an error message (“fatal: refusing to merge unrelated histories”) and need to use

     git merge --allow-unrelated-histories upstream/devel
  7. Resolve merge conflicts if necessary.

  8. After resolving conflicts and committing changes, look for duplicate commits (e.g., git log --oneline | wc returns twice as many commits as in SVN) and consider following the steps to force Bioconductor devel to GitHub devel.

  9. Push to both Bioconductor and GitHub repositories.

     git push upstream devel
     git push origin devel
  10. Repeat for the release branch, replacing devel with the name of the release branch, e.g., RELEASE_3_6. It may be necessary to create the release branch in the local repository.

     git checkout RELEASE_3_6
     git merge upstream/RELEASE_3_6
     git merge origin/RELEASE_3_6
     git push upstream RELEASE_3_6
     git push origin RELEASE_3_6

    NOTE: If you are syncing your release branch for the first time, you have to make a local copy of the RELEASE_X_Y branch, by

     git checkout -b <RELEASE_X_Y> upstream/<RELEASE_X_Y>

    Following this one time local checkout, you may switch between RELEASE_X_Y and devel with git checkout <RELEASE_X_Y>. If you do not use the command to get a local copy of the release branch, you will get the message,

    (HEAD detached from origin/RELEASE_X_Y)

    Remember that only devel and the current release branch of Bioconductor repositories can be updated.
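The first-time versus later checkout of a release branch can be folded into one helper, so that you do not end up on a detached HEAD. This is a sketch under the assumption that the Bioconductor remote is named upstream and has already been fetched; checkout_release is a hypothetical function name.

```shell
# Switch to a release branch, creating the local tracking branch on first use.
# checkout_release is a hypothetical helper; "upstream" is the remote name used above.
checkout_release() {
  branch="$1"
  if git show-ref --verify --quiet "refs/heads/$branch"; then
    git checkout "$branch"                        # local copy already exists
  else
    git checkout -b "$branch" "upstream/$branch"  # one-time local checkout
  fi
}

# Usage, after `git fetch upstream`:
# checkout_release RELEASE_3_6
```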

21.6.2 Create a local repository for private use

Goal: A user (not the package developer) would like to modify functions in a package to meet their needs. There is no GitHub repository for the package.

  1. Clone the package from the Bioconductor repository. As an end user, you do not have write access to the repository, so use the https protocol

     git clone https://git@git.bioconductor.org/packages/<ExamplePackage>
  2. Make changes to your local repository and commit them. A good practice is to make the changes in a new branch

     git checkout -b feature-my-feature
     ## modify
     git commit -a -m "feature: a new feature"

    and then merge the feature onto the branch corresponding to the release in use, e.g.,

     git checkout <RELEASE_X_Y>
     git merge feature-my-feature
  3. Rebuild (to create the vignette and help pages) and reinstall the package in your local machine by running in the parent directory of ExamplePackage

     R CMD build ExamplePackage
     R CMD INSTALL ExamplePackage_<version.number>.tar.gz
  4. The package with the changes should be available in your local R installation.

21.7 Scenarios for code update

21.7.1 Pull upstream changes

Goal: Your Bioconductor repository has been updated by the core team. You want to fetch these commits from Bioconductor, merge them into your local repository, and push them to GitHub.

NOTE: It is always a good idea to fetch updates from Bioconductor before making more changes. This will help prevent merge conflicts.

These steps update the devel branch.

  1. Make sure you are on the appropriate branch.

     git checkout devel
  2. Fetch content from Bioconductor

     git fetch upstream
  3. Merge upstream with the appropriate local branch

     git merge upstream/devel

    Get help on Resolve merge conflicts if these occur.

  4. If you also maintain a GitHub repository, push changes to GitHub’s (origin) devel branch

     git push origin devel

To pull updates to the current RELEASE_X_Y branch, replace devel with RELEASE_X_Y in the lines above.

See instructions to Sync existing repositories with changes to both the Bioconductor and GitHub repositories.

21.7.2 Push to GitHub and Bioconductor repositories

Goal: During everyday development, you commit changes to your local repository devel branch, and wish to push these commits to both GitHub and Bioconductor repositories.

NOTE: See Pull upstream changes for best practices before committing local changes.

  1. We assume you already have a GitHub repository with the right setup to push to Bioconductor’s git server. If not, please see the FAQs on how to get access and follow the instructions to Maintain GitHub and Bioconductor repositories. We use a clone of the BiocGenerics package in the following example.

  2. To check that remotes are set up properly, run the command inside your local machine’s clone.

    git remote -v

    which should produce the result (where <developer> is your GitHub username):

    origin  git@github.com:<developer>/BiocGenerics.git (fetch)
    origin  git@github.com:<developer>/BiocGenerics.git (push)
    upstream    git@git.bioconductor.org:packages/BiocGenerics.git (fetch)
    upstream    git@git.bioconductor.org:packages/BiocGenerics.git (push)
  3. Make and commit changes to the devel branch

     git checkout devel
     ## edit files, etc.
     git add <name of file changed>
     git commit -m "My informative commit message describing the change"
  4. (Alternative) When changes are more elaborate, best practice is to use a local branch for development.

     git checkout devel
     git checkout -b feature-my-feature
     ## multiple rounds of edit, add, commit

    Merge the local branch to devel when the feature is ‘complete’.

     git checkout devel
    
     # Pull upstream changes before merging
     # http://bioconductor.org/developers/how-to/git/pull-upstream-changes/
    
     git merge feature-my-feature
  5. Push updates to GitHub’s (origin) devel branch

     git push origin devel
  6. Next, push updates to Bioconductor’s (upstream) devel branch

    git push upstream devel
  7. Confirm changes, e.g., by visiting the GitHub web page for the repository.

21.7.3 Resolve merge conflicts

Goal: Resolve merge conflicts in branch and push to GitHub and Bioconductor repositories.

  1. You will know you have a merge conflict when you see something like this:

     git merge upstream/devel
     Auto-merging DESCRIPTION
     CONFLICT (content): Merge conflict in DESCRIPTION
     Automatic merge failed; fix conflicts and then commit the result

    This merge conflict occurs when the package developer makes a change, and also a collaborator or a Bioconductor core team member makes a change to the same file (in this case the DESCRIPTION file).

    How can you avoid this? Pull upstream changes before committing any changes. In other words, fetch and merge remote branches before a push.

  2. If in spite of this you have conflicts, you need to fix them. See which file has the conflict,

     git status

    This will show you something like this:

     On branch devel
     Your branch is ahead of 'origin/devel' by 1 commit.
       (use "git push" to publish your local commits)
     You have un-merged paths.
       (fix conflicts and run "git commit")
       (use "git merge --abort" to abort the merge)
    
     Un-merged paths:
       (use "git add <file>..." to mark resolution)
    
       both modified:   DESCRIPTION
    
     no changes added to commit (use "git add" and/or "git commit -a")
  3. Open the file in your favorite editor. Conflicts look like:

     <<<<<<< HEAD
     Version: 0.23.2
     =======
     Version: 0.23.3
     >>>>>>> upstream/devel

    Everything between <<<<<<< and ======= refers to HEAD, i.e., your current change. Everything between ======= and >>>>>>> refers to the remote/branch shown there, i.e., upstream/devel.

    You want to keep the most accurate change, by deleting what is necessary. In this case, keep the latest version:

     Version: 0.23.3
  4. Add and commit the file as you would any other change.

     git add DESCRIPTION
     git commit -m "Fixed conflicts in version change"
  5. Push to both your GitHub and Bioconductor repositories,

    git push origin devel
    git push upstream devel
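When several files conflict, it helps to list them and to confirm that no conflict markers remain before committing. The helpers below are a sketch with hypothetical names (conflicted_files, show_conflict_markers); they assume file names without spaces and the default conflict marker style.

```shell
# List files that still have unresolved merge conflicts.
# conflicted_files and show_conflict_markers are hypothetical helper names.
conflicted_files() {
  git diff --name-only --diff-filter=U
}

# Print file:line for every conflict marker left in those files.
show_conflict_markers() {
  for f in $(conflicted_files); do
    # /dev/null as a second file makes grep print the file name
    grep -n -E '^(<<<<<<<|=======|>>>>>>>)' "$f" /dev/null
  done
}

# Usage, after a failed `git merge upstream/devel`:
# conflicted_files
# show_conflict_markers
```

An empty result from conflicted_files means every conflicted file has been marked as resolved with git add.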

21.7.4 Abandon changes

Goal: You want to start fresh after failing to resolve conflicts or some other issue. If you intend to go nuclear, please contact the mailing list.

21.7.4.1 Force Bioconductor devel to GitHub devel

One way to discard your work and start over is to replace your local and GitHub devel branches with the Bioconductor devel branch.

Note: This works only if you haven’t pushed the change causing the issue to the Bioconductor repository.

Note: Any references to commits on current devel (e.g., in GitHub issues) will be invalidated.

  1. Checkout a new branch, e.g., devel_backup, with tracking set to track the Bioconductor devel branch upstream/devel.

     git checkout -b devel_backup upstream/devel
  2. Rename the branches you currently have on your local machine. First, rename devel to devel_deprecated. Second, rename devel_backup to devel. This process is called the classic Switcheroo.

     git branch -m devel devel_deprecated
     git branch -m devel_backup devel
  3. You will now have to “force push” the changes to your GitHub (origin) devel branch.

     git push -f origin devel
  4. (Optional) If you have commits on your devel_deprecated branch that you would like to port to your new devel branch, use git's cherry-pick feature.

    Take a look at which commit you want to cherry-pick on to the new devel branch, using git log devel_deprecated, copy the correct commit id, and use:

     git cherry-pick <commit id>

    Push these cherry-picked changes to GitHub and Bioconductor repositories.

21.7.4.2 Reset to a previous commit

If you find yourself in a place where you want to abandon changes already committed to Bioconductor or GitHub, use reset to undo the commits on your local repository and push -f to force the changes to the remotes. Remember that the HEAD commit id is the most recent parent commit of the current state of your local repository.

git reset --hard <commit id>

Example:

git reset --hard e02e4d86812457fd9fdd43adae5761f5946fdfb3
HEAD is now at e02e4d8 version bump by bioc core

To make the changes permanent, you will then need to push the changes to GitHub, and then email the Bioconductor core team to force push to the repository on Bioconductor.

## You

git push -f origin

Bioconductor core team will do the rest after you email.

21.7.4.3 Delete your local copy and GitHub repo, because nothing is working

CAUTION: These instructions come with many disadvantages. You have been warned.

  1. Delete your local repository, e.g., rm -rf BiocGenerics

  2. Delete (or rename) your GitHub repository.

  3. Follow the instructions to Maintain GitHub and Bioconductor repositories for an existing Bioconductor repository, then Pull upstream changes.

21.7.4.3.1 Disadvantages of going “nuclear”
  1. You will lose all your GitHub issues

  2. You will lose your custom collaborator settings in GitHub.

  3. You will lose any GitHub-specific changes.

21.7.5 Fix bugs in devel and release

Goal: When a bug is present in both the release and devel branches of Bioconductor, a maintainer will have to introduce a patch in the default git branch and in the current release branch (e.g., RELEASE_3_14).

  1. First Sync existing repositories.

     git fetch --all
     git checkout devel
     git merge upstream/devel
     git merge origin/devel
     git checkout <RELEASE_X_Y>
     git merge upstream/<RELEASE_X_Y>
     git merge origin/<RELEASE_X_Y>
  2. On your local machine, be sure that you are on the devel branch.

     git checkout devel

    Make the changes needed to fix the bug and add the modified files to the commit. Remember to bump the version number in the DESCRIPTION file in a separate commit. Only bug-fix changes should be introduced in this commit.

     git add <files changed>

    Commit the modified files. It is helpful to tag the commit message with “bug fix”.

     git commit -m "bug fix: my bug fix"

    Bump the version of the package by editing the Version field in the DESCRIPTION file, and commit the change in a separate commit. This allows you to cherry-pick only the bug fix and avoids version number conflicts with the Bioconductor branches when the bug fix is ported to release.

     ## after version bump
     git add DESCRIPTION
     git commit -m "version bump in devel"
  3. (Alternative) If the changes are non-trivial i.e., with multiple commits, create a new branch where you can easily abandon any false starts.

     git checkout devel
     git checkout -b bugfix-my-bug
     ## add and commit to this branch to fix the bug

    Merge the final version of the branch into the default branch.

     git checkout devel
     git merge bugfix-my-bug
  4. Switch to the release branch and cherry-pick the commit hash or range of hashes from the default branch that correspond to the bug fix (more examples in git cherry-pick --help). Remember to edit the DESCRIPTION file to update the release version of the package according to Bioconductor’s version numbering scheme.

     git checkout <RELEASE_X_Y>
     ## example hash from git log: 2644710
     git cherry-pick <hash>
     ## Bump the version and commit the change
     git add DESCRIPTION
     git commit -m "version bump in release"

    NOTE: If you are patching your release for the first time, you have to make a local copy of the RELEASE_X_Y branch with

     git checkout -b <RELEASE_X_Y> upstream/<RELEASE_X_Y>

    Following this one time local checkout, you may switch between RELEASE_X_Y and devel with git checkout <RELEASE_X_Y>. If you do not use the command to get a local checkout of the release branch, you will get the message:

     (HEAD detached from origin/RELEASE_X_Y)
  5. Push your changes to both the GitHub and Bioconductor devel and <RELEASE_X_Y> branches. Make sure you are on the correct branch on your local machine.

    For the devel branch,

     git checkout devel
     git push upstream devel
     git push origin devel

    For the release branch,

     git checkout <RELEASE_X_Y>
     git push upstream <RELEASE_X_Y>
     git push origin <RELEASE_X_Y>

21.7.6 Bug fixes due to API changes

Packages that make use of web APIs will need to be updated when there are significant API changes. This is a common occurrence with Bioconductor packages that use REST APIs, e.g., UniProt.ws, AnVIL, etc. To provide a smooth experience for users, it is important to update the package as soon as possible.

21.7.6.1 Minor API changes

The steps to apply bug fixes due to API changes are similar to the steps for fixing bugs in the devel and release branches. The main difference is that the bug may be due to minor API issues, certificate issues, API version issues, etc. The maintainer can readily update the R package to adapt to the new API. Note that the functionality of the package largely remains the same.

21.7.6.2 Major API changes

It may be possible that an organization’s API changes are not backwards compatible and that they do not provide the same functionality for the existing Bioconductor package to remain functional. In such case, it is recommended to contact the organization that maintains the API to get more information about the changes and to request a backward compatible API, if possible. When API changes do not provide a similar functionality to that of the R package, the package maintainer may need to submit a new package that works with the new API and to deprecate the old package.

21.7.7 Resolve Duplicate Commits

Goal: You want to get rid of the duplicate (or triplicate) commits in your git commit history.

Before you begin, update your package to the existing version on Bioconductor.

21.7.7.1 Steps:

  1. IMPORTANT: Make a backup of the branch with duplicate commits; call it devel_backup (or RELEASE_3_7_backup). The name of the backup branch should be identifiable and specific to the branch you are trying to fix (i.e., devel or some <RELEASE_X_Y> branch).

    On devel, (make sure you are on devel by git checkout devel)

     git branch devel_backup
  2. Identify the commit from which the duplicates originated. These are, more often than not, merge commits.

  3. Reset your branch to the commit before the merge commit.

     git reset --hard <commit_id>
  4. Now cherry pick your commits from the devel_backup branch.

     git cherry-pick <commit_id>
    1. The commits you cherry-pick should be only 1 version of the duplicate commit, i.e don’t cherry-pick the same commit twice.

    2. Include the branching and version bump commits in your cherry-pick. Make the package history look as normal as possible.

  5. (Optional) In some cases, there are conflicts you need to fix for the cherry-pick to succeed. Please read the documentation on how to resolve conflicts

  6. Finally, contact the bioc-devel mailing list to ask the Bioconductor core team to sync your repository with the version on Bioconductor. You cannot do this yourself, because maintainers are not permitted to --force push changes that alter the git history.

21.7.7.2 How to check if your package has duplicate commits

One way is to check the log. You should see the same commit message with the same changes, but with a different commit ID, if you try

git log

or

git log --oneline
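Scanning the log by eye gets tedious for long histories. As a rough first pass, the sketch below counts repeated commit subjects on the current branch. It is a heuristic, not the Bioconductor detection script: duplicate_subjects is a hypothetical helper, and unrelated commits that happen to share a message (for example "version bump") will also be reported.

```shell
# Print commit subjects that occur more than once on the current branch,
# prefixed by their count. A heuristic only: unrelated commits may share a subject.
# duplicate_subjects is a hypothetical helper name.
duplicate_subjects() {
  git log --format=%s | sort | uniq -c | awk '$1 > 1'
}

# Usage, inside the package directory:
# duplicate_subjects
```

For any subject it reports, compare the corresponding commits with git show to confirm that they carry the same changes under different commit IDs.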

21.7.7.3 Script to detect duplicate commits

Run this script written in python to detect duplicate commits which are specific to Bioconductor repositories.

Usage example:

python detect_duplicate_commits.py /BiocGenerics 1000

python detect_duplicate_commits.py <path_to_package> <number of commits to check>

21.8 GitHub scenarios

21.8.1 Add collaborators and leverage GitHub features

Goal: You would like to take advantage of the social coding features provided by GitHub, while continuing to update your Bioconductor repository.

21.8.1.1 Maintaining Collaborators on GitHub

  1. Adding a new collaborator

  2. Removing collaborator

21.8.1.2 Pull requests on GitHub

  1. Merging a pull request

21.8.1.3 Push GitHub changes to the Bioconductor repository

Once you have accepted pull requests from your package community on GitHub, you can push these changes to Bioconductor.

  1. Make sure that you are on the branch to which the changes were applied, for example devel.

     git checkout devel
  2. Fetch and merge the GitHub changes to your local repository.

     git fetch origin
     git merge

    Resolve merge conflicts if necessary.

  3. Push your local repository to the upstream Bioconductor repository.

     git push upstream devel

    To push GitHub release branch updates to the Bioconductor release branch, replace devel with the name of the release branch, e.g., RELEASE_3_6.
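
The three steps above can be collected into one small shell function. This is a sketch that assumes the chapter's convention that origin is the GitHub remote and upstream is the Bioconductor remote; the function name sync_github_to_bioc is hypothetical, and the merge target origin/<branch> is spelled out explicitly rather than relying on the branch's tracking configuration.

```shell
# Hypothetical helper: bring GitHub (origin) changes into the local branch,
# then push that branch to Bioconductor (upstream).
# Usage: sync_github_to_bioc [branch]   (branch defaults to devel)
sync_github_to_bioc() {
    branch="${1:-devel}"
    git checkout "$branch" &&
    git fetch origin &&
    git merge "origin/$branch" &&   # fails here if conflicts must be resolved
    git push upstream "$branch"
}
```

Calling sync_github_to_bioc RELEASE_3_6 would do the same for a release branch. The && chain stops at the first failing step, so nothing is pushed if the merge reports conflicts.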

21.8.2 Add or Transfer Maintainership of a Package

Goal: At some point you may no longer be able to maintain your package in accordance with the Bioconductor package guidelines. In that case, add or transfer maintainership so that the package continues to be properly maintained and avoids deprecation and removal.

  1. Find a new maintainer

    You may have a collaborator or colleague volunteer to take over. If not, ask on the bioc-devel mailing list.

  2. Request the change by email

    The original maintainer should send an email requesting that the maintainer of the package be updated. Include the package name and the contact information for the new maintainer.

  3. Update Package DESCRIPTION file

    Update the package DESCRIPTION file with the new maintainer’s information and push the change to the Bioconductor repository at git.bioconductor.org.

21.8.3 Remove Large Data Files and Clean Git Tree

Goal: Git remembers. Sometimes large data files are added to a git repository (intentionally or unintentionally), causing the repository to become large. When this happens, it’s not enough to just remove the files. You must also remove them from the git tree (the repository history), or else your repository will remain large.

There are a few ways to remove large files from a git history. Here, we’ll outline two options: 1) git filter-repo, and 2) the BFG repo cleaner. These steps should be run on your local copy and (if necessary) pushed to your own GitHub repository.

21.8.3.1 Removing Large Files from History with filter-repo

As of 2023, the recommended way is to first locate any large files, and then remove them with git filter-repo.

  1. Identify large files using this git rev-list script:
git rev-list --objects --all \
| git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
| sed -n 's/^blob //p' \
| sort --numeric-sort --key=2 \
| cut -c 1-12,41- \
| $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest

This will list all the objects in your repository, including objects in your history, in order of file size. Use this to identify the names of files to remove.

  2. Remove them with filter-repo.

This is a separate tool that you’ll have to install (with e.g. pip3 install git-filter-repo). Then, you can rewrite your repository history to remove files like this, where <file-glob> identifies your files:

git filter-repo --path-glob '<file-glob>' --invert-paths

For example, to remove all .RData files, you could use:

git filter-repo --path-glob '*.RData' --invert-paths

This command may reset your remotes. Check with git remote and if needed, you can add remotes back in using something like this:

git remote add origin git@github.com:<username>/<repo_name>
git push --set-upstream origin devel --force

Finally, we have to push with --mirror to reset the remote.

git push --force --mirror

Finally, notify all collaborators that they must re-clone the repository: because history has been rewritten, existing clones are no longer compatible with it.
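
Before asking collaborators to re-clone, it may be worth confirming that the files are really gone from every reachable commit. The sketch below lists the paths in history that match an extended regular expression. The function name paths_in_history is hypothetical, and the check is only one assumed way to verify the result; it is not part of git filter-repo.

```shell
# Hypothetical helper: list every path reachable from any ref whose name
# matches an extended regular expression. No output (and a non-zero exit
# status from grep) means nothing in history matches.
# Usage: paths_in_history <path_to_repo> <regex>
paths_in_history() {
    git -C "$1" rev-list --objects --all |
        awk 'NF > 1 { sub(/^[^ ]+ /, ""); print }' |   # drop object IDs, keep paths
        sort -u |
        grep -E "$2"
}
```

For example, paths_in_history . '\.RData$' should print nothing once all .RData files have been removed from history. Running it before the rewrite shows that a file deleted in a later commit is still listed.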

21.8.3.2 Removing Large Files from History with BFG repo cleaner

Another option is to use BFG. The steps below assume origin is a user-maintained GitHub repository.

NOTE: Anyone that is maintaining the package repository (with a local copy) should run steps 1-3.

  1. Download BFG Repo-Cleaner

  2. Run BFG Repo-Cleaner on your package directory

    In the location of your package, run the following command

    java -jar <path to download>/bfg-1.13.0.jar --strip-blobs-bigger-than 100M <your package>

    Note: The above command removes any file that is 100 MB or larger. Adjust this argument based on the size of the files you are cleaning up; it should be lower than the offending file size.

  3. Run clean up

    cd <your package>
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
  4. Push Changes

    git push -f origin
  5. Request updates on the git.bioconductor.org repository location.

    The Bioconductor git server does not allow force pushes (-f) to git.bioconductor.org. Please email explaining that the package has been cleaned of large data files and needs to be reset.

21.9 Frequently Asked Questions (FAQs)

  1. I can’t access my package.

    You will need to log in to the BiocCredentials application. If you have not logged in before, you must first activate your account.

    There are two cases:

    1. If there is no SSH key registered, you must add one.

    2. If there is already an SSH key registered, check the packages you have access to in the ‘Profile’ interface.

    You can alternatively check if you have access to your package using the command line

       ssh -T git@git.bioconductor.org

    If you have access to your package, but cannot git pull or push, please check FAQ #13, #14, and #15.

  2. I’m a developer for Bioconductor, my package ExamplePackage is on the new server https://git.bioconductor.org. What do I do next?

    Take a look at Maintain GitHub and Bioconductor repositories. This will give you the information needed.

    NOTE: This situation is for packages which were previously maintained on SVN and have never been accessed through Git. It is not for newly accepted packages through GitHub.

  3. I have a GitHub repository already set up for my Bioconductor package at www.github.com/<developer>/<ExamplePackage> , how do I link my repository in GitHub and https://git.bioconductor.org ?

    Take a look at New package workflow. Step 2 gives you information on how to add the remote and link both GitHub and Bioconductor repositories.

  4. I’m unable to push or merge my updates from my GitHub repository to my Bioconductor package on git@git.bioconductor.org, how do I go about this?

    If you are unable to push or merge to either your GitHub account or Bioconductor repository, it means you do not have the correct access rights. If you are a developer for Bioconductor, you will need to submit your SSH public key to the BiocCredentials app.

    You should also make sure to check that your public key is set up correctly on GitHub. Follow Adding a new SSH key to your GitHub account.

  5. I’m not sure how to fetch the updates from git.bioconductor.org with regards to my package, how do I do this?

    Take a look at Sync existing repositories. This will give you the information needed.

  6. I’m just a package user, do I need to do any of this?

    As a package user, you do not need any of this developer-related documentation. It is, however, a good primer if you want to become a contributor to Bioconductor.

    You can also open pull requests and issues on the Bioconductor packages you use, if they have a GitHub repository.

  7. I’m new to git and GitHub, where should I learn?

    There are many resources where you can learn about git and GitHub.

  8. I’m a Bioconductor package maintainer, but I don’t have access to the Bioconductor server where my packages are being maintained. How do I gain access?

    Please submit your SSH public key using the BiocCredentials app. Your key will be added to our server and you will get read and write access to your package.

    All developers of Bioconductor packages are required to do this, if they don’t already have access. Please identify which packages you need read/write access to in the email.

  9. What is the relationship between the origin and upstream remote?

    In git lingo, origin is just the default name for the remote from which a repository was originally cloned. It could equally have been given another name. We recommend that origin be set to the developer’s GitHub repository.

    Similarly, upstream is the name for a remote which is hosted on the Bioconductor server.

    It is important that all the changes/updates you have on your origin are equal to upstream, in other words, you want these two remotes to be in sync.

    Follow Sync existing repositories for details on how to achieve this.

    Image explaining GitHub and Bioconductor relationship for a developer

  10. Can I have more than one upstream remote, if yes, is this recommended?

    You can have as many remotes as you please. But you can have only one remote with the name upstream. We recommend having the remote origin set to GitHub, and upstream set to the Bioconductor git server to avoid confusion.

  11. Common names used in the scenarios

    developer: This should be your GitHub username, e.g., mine is nturaga.

    BiocGenerics: This is being used as an example to demonstrate git commands.

    ExamplePackage: This is being used as a placeholder for a package name.

    SVN trunk and git devel branch are now the development branches.

  12. I’m a Bioconductor developer only on the Bioconductor server. I do not have/want a GitHub account. What should I do?

    You do not have to get a GitHub account if you do not want one. It is, however, a really good idea to maintain your package publicly and interact with the community via the social coding features available on GitHub.

    This workflow is described in Maintain a Bioconductor-only repository.

  13. I cannot push to my package. I get the error,

     $ git push origin devel
     fatal: remote error: FATAL: W any packages/myPackage nobody DENIED by fallthru
     (or you mis-spelled the reponame)

    (You might have renamed the origin remote to upstream; if so, substitute upstream for origin.) Check your remote:

     $ git remote -v
     origin  https://git.bioconductor.org/packages/myPackage.git (fetch)
     origin  https://git.bioconductor.org/packages/myPackage.git (push)

    As a developer you should be using the SSH protocol, but the origin remote is HTTPS. Use

     git remote set-url origin git@git.bioconductor.org:packages/myPackage

    to change the remote to the SSH protocol. Note the : after the host name in the SSH protocol, rather than the / in the HTTPS protocol. Confirm that the remote has been updated correctly with git remote -v.

    If your remote is correct and you still see the message, then your SSH key is invalid. See the next FAQ.
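
Rewriting the URL by hand is easy to get wrong (the : versus / noted above). A minimal sketch of the conversion follows, assuming the remote uses the https://git.bioconductor.org/packages/<name> pattern shown in this FAQ; the function name bioc_ssh_url is hypothetical.

```shell
# Hypothetical helper: convert a Bioconductor HTTPS remote URL to its SSH form.
# Any other URL is printed unchanged.
# Usage: git remote set-url origin "$(bioc_ssh_url "$(git remote get-url origin)")"
bioc_ssh_url() {
    printf '%s\n' "$1" |
        sed 's|^https://git\.bioconductor\.org/|git@git.bioconductor.org:|'
}
```

After updating the remote, git remote -v should show the git@git.bioconductor.org: form for both fetch and push.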

  14. Before sending a question to the Bioc-devel mailing list about git, please check the output of the following commands for correctness so that we can help you better.

    • As a developer, make sure you are using SSH as your access protocol. Check the output of git remote -v for consistency, and include it in your email to bioc-devel if you are unsure. The remote should look like

        origin  git@git.bioconductor.org:packages/myPackage.git (fetch)
        origin  git@git.bioconductor.org:packages/myPackage.git (push)

      or

        origin  git@github.com:<github username>/myPackage.git (fetch)
        origin  git@github.com:<github username>/myPackage.git (push)
        upstream  git@git.bioconductor.org:packages/myPackage.git (fetch)
        upstream  git@git.bioconductor.org:packages/myPackage.git (push)
    • Check whether you have access to the bioc-git server by running ssh -T git@git.bioconductor.org. This shows a list of packages with READ (R) and WRITE (W) permissions. As a developer, you should have R W next to your package. Access is determined by the SSH public key you are using; the default key for SSH authentication is id_rsa.

        R       admin/..*
        R       packages/..*
        R   admin/manifest
        R   packages/ABAData
        R   packages/ABAEnrichment
        R   packages/ABSSeq
        R W     packages/ABarray
        R   packages/ACME
        R   packages/ADaCGH2
        R   packages/AGDEX
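
Scanning the ssh -T listing for one package can be automated. The sketch below assumes the listing has the layout shown above (permissions first, repository path last); the function name has_write_access is hypothetical.

```shell
# Hypothetical helper: read the output of `ssh -T git@git.bioconductor.org`
# on standard input and succeed only if the named package is listed with
# both R and W permissions.
# Usage: ssh -T git@git.bioconductor.org | has_write_access myPackage
has_write_access() {
    awk -v repo="packages/$1" '
        $NF == repo && $1 == "R" && $2 == "W" { found = 1 }
        END { exit !found }
    '
}
```

With the listing above, has_write_access ABarray would succeed, while has_write_access ACME would fail because that package is listed with R only.
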
  15. SSH key not being recognized because of different name?

    If you have named your SSH public key differently from id_rsa as suggested by ssh-keygen, you may find it useful to set up a ~/.ssh/config file on your machine. Simply make a ~/.ssh/config file if it does not exist, and add,

     host git.bioconductor.org
         HostName git.bioconductor.org
         IdentityFile ~/.ssh/id_rsa_bioconductor
         User git

    In this example, my private key is called id_rsa_bioconductor instead of id_rsa.

    You may find it useful to check the BiocCredentials app to see what SSH key you have registered.

  16. SSH key asking for a password and I don’t know it? How do I retrieve it?

    There are a few possibilities here,

    • You have set a password. The bioc-devel mailing list cannot help you with this. You have to submit a new key on the BiocCredentials app.

    • The permissions on your SSH key are wrong. Verify that the permissions on the SSH IdentityFile are 400. SSH rejects keys that are too widely readable without saying so clearly; the failure just looks like a credential rejection. The solution, in this case, is (if your SSH key for Bioconductor is called id_rsa):

        chmod 400 ~/.ssh/id_rsa
    • You have the wrong remote set up. Check git remote -v to make sure the SSH access protocol is being used. Your bioc-git server remote should be git@git.bioconductor.org:packages/myPackage.

  17. Can I create and push new branches to my repository on git.bioconductor.org?

    No. Maintainers only have access to devel and the current RELEASE_X_Y branch. New branches cannot be created and pushed to the Bioconductor server. We recommend that maintainers keep additional branches on their GitHub repository if they are maintaining one.

  18. How can I fix my duplicate commits issue and find the required documentation?

    The detailed documentation to resolve duplicate commits can be found at the link.

21.9.0.1 More questions?

If you have additional questions which are not answered here already, please send an email to the bioc-devel mailing list.