“Vendor branch” in a git monorepo
By David Röthlisberger. Tweet your comments @drothlis.
Published 4 Oct 2022.
If your software depends on a third-party library, “vendoring” the dependency means committing the source code of that library into a “vendor” folder alongside your own source code. A “vendor branch” is a technique for integrating new releases of that third-party library when you are maintaining your own custom patches to it. Git makes this technique particularly easy, but I haven’t seen it documented well, so here’s my attempt.
The “vendor branch” terminology comes from Subversion.
Directory structure
As an example, let’s imagine a C codebase that vendors curl, using the following directory structure:
my_file.c
third_party/
    curl/
In the olden days you would extract the tarball of the latest curl release into third_party/curl/, and commit all the files. Later in the article we’ll see how to do this with git subtree instead.
The vendor branch
The “vendor branch” contains only the third-party dependency, but with the same directory structure as above. This branch doesn’t share any history with our main branch.
To create such a branch:
$ git checkout --orphan vendor/curl
Switched to a new branch 'vendor/curl'
$ git rm -rf .
$ git status
On branch vendor/curl
No commits yet
nothing to commit (create/copy files and use "git add" to track)
Now we create the subdirectory, put the curl source code there, and commit it:
$ mkdir -p third_party/curl
$ tar --directory=third_party/curl --extract --file=/path/to/curl-7.80.0.tar.gz --strip-components=1
$ git add third_party/curl
$ git commit --message="curl 7.80.0"
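Next, we merge the vendor branch into our normal branch. The two branches share no history, so git 2.9 and later refuse the merge unless we pass --allow-unrelated-histories:
$ git checkout main
$ git merge --allow-unrelated-histories vendor/curl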
At this point our git history looks like this:
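To see the same graph on your own machine, here is a minimal sketch that replays the steps so far in a throwaway repository and prints the history with git log --graph. The README file standing in for the extracted tarball, the placeholder identity, and the scratch directory are assumptions made for illustration. The --allow-unrelated-histories flag is there because git 2.9 and later refuse to merge branches with no common ancestor unless told otherwise.

```shell
#!/bin/sh
# Sketch: rebuild the first vendor import in a throwaway repository.
set -eu

repo=$(mktemp -d)                        # scratch location, not your real repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the unborn branch "main" on any git version
git config user.name "Example Dev"       # placeholder identity for the scratch repo
git config user.email "dev@example.com"

# Our own code on main.
echo 'int main(void) { return 0; }' > my_file.c
git add my_file.c
git commit -q -m "Add my_file.c"

# Orphan vendor branch that contains only the third-party code.
git checkout -q --orphan vendor/curl
git rm -q -rf .
mkdir -p third_party/curl
echo 'upstream 7.80.0' > third_party/curl/README   # stand-in for the extracted tarball
git add third_party/curl
git commit -q -m "curl 7.80.0"

# The two branches share no history, so the first merge needs the flag.
git checkout -q main
git merge -q --allow-unrelated-histories -m "Merge vendor/curl (curl 7.80.0)" vendor/curl

# Show both branches and the merge commit that joins them.
git log --graph --oneline --all
```

The merge commit at the top should have two parents: our commit on main and the “curl 7.80.0” commit on vendor/curl.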
Integrating a new vendor release
We continue development on our main branch. We might even make some custom changes to the third-party library — we do this by editing the files in third_party/curl/ directly and committing them like any other change.
Our git history is looking like this. The highlighted commit is changing one of the vendored files (third_party/curl/Makefile) but it’s on our main branch, not the vendor branch:
When we want to upgrade the third-party library, we check out our “vendor” branch (which doesn’t have any of our custom changes) and we replace the files there with the new version:
$ git checkout vendor/curl
$ git rm -rf third_party/curl
$ mkdir -p third_party/curl
$ tar --directory=third_party/curl --extract --file=/path/to/curl-7.81.0.tar.gz --strip-components=1
$ git add third_party/curl
$ git commit --message="curl 7.81.0"
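If you upgrade often, the steps above can be wrapped in a small helper. What follows is only a sketch: import_release is a hypothetical name, and the two tiny tarballs (curl-1.tar.gz and curl-2.tar.gz) are made up so that the example runs without downloading anything. It also shows why we git rm the directory before extracting: a file that upstream deleted in the new release disappears from the vendor branch too.

```shell
#!/bin/sh
# Sketch: the upgrade steps wrapped in a hypothetical helper, import_release.
set -eu

# Usage: import_release TARBALL MESSAGE
# Assumes we are at the repository root, the vendor/curl branch exists, and
# the tarball has a single top-level directory (hence --strip-components=1).
import_release() {
    git checkout -q vendor/curl
    git rm -q -rf third_party/curl     # so files deleted upstream disappear too
    mkdir -p third_party/curl
    tar --directory=third_party/curl --extract --file="$1" --strip-components=1
    git add third_party/curl
    git commit -q --message="$2"
}

# Demo with two made-up releases; release 2 drops old.c and adds new.c.
work=$(mktemp -d)
mkdir -p "$work/curl-1" "$work/curl-2"
echo 'old' > "$work/curl-1/old.c"
echo 'keep v1' > "$work/curl-1/keep.c"
echo 'keep v2' > "$work/curl-2/keep.c"
echo 'new' > "$work/curl-2/new.c"
tar --directory="$work" --create --gzip --file="$work/curl-1.tar.gz" curl-1
tar --directory="$work" --create --gzip --file="$work/curl-2.tar.gz" curl-2

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/vendor/curl   # start directly on the vendor branch
git config user.name "Example Dev"             # placeholder identity
git config user.email "dev@example.com"

# First import, done by hand as earlier in the article.
mkdir -p third_party/curl
tar --directory=third_party/curl --extract --file="$work/curl-1.tar.gz" --strip-components=1
git add third_party/curl
git commit -q --message="curl 1"

# Upgrade through the helper, then summarise what the new release changed.
import_release "$work/curl-2.tar.gz" "curl 2"
git diff --stat HEAD~1 HEAD
```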
Next, we merge the vendor branch into our main branch:
$ git checkout main
$ git merge vendor/curl
If there are any conflicts caused by the custom curl patches on our main branch, this is the time to resolve them.
Now our git history looks like this:
It’s easy to see what has changed between the two curl releases. Our custom changes to curl are still there on our main branch. The git history clearly shows which changes are ours vs. upstream. This is the “vendor branch” technique.
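Both questions can be answered with git diff. The sketch below rebuilds the whole scenario in a throwaway repository and then asks for each diff in turn. The write_release function is a hypothetical stand-in for extracting a tarball, and the file contents are invented; only the two git diff commands at the end are the point.

```shell
#!/bin/sh
# Sketch: ask git which changes are upstream's and which are ours.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"       # placeholder identity
git config user.email "dev@example.com"
echo 'int main(void) { return 0; }' > my_file.c
git add my_file.c
git commit -q -m "Add my_file.c"

# write_release VERSION: hypothetical stand-in for extracting a curl tarball.
write_release() {
    mkdir -p third_party/curl
    echo "curl $1" > third_party/curl/README
    echo 'all:' > third_party/curl/Makefile
}

# First import on the vendor branch, merged into main.
git checkout -q --orphan vendor/curl
git rm -q -rf .
write_release 7.80.0
git add third_party/curl
git commit -q -m "curl 7.80.0"
git checkout -q main
git merge -q --allow-unrelated-histories -m "Merge curl 7.80.0" vendor/curl

# Our custom patch, committed on main.
echo '# local patch' >> third_party/curl/Makefile
git commit -q -a -m "Patch curl Makefile"

# Upgrade on the vendor branch, then merge again.
git checkout -q vendor/curl
git rm -q -rf third_party/curl
write_release 7.81.0
git add third_party/curl
git commit -q -m "curl 7.81.0"
git checkout -q main
git merge -q -m "Merge curl 7.81.0" vendor/curl

# Upstream's changes between the two releases.
git diff --stat vendor/curl~1 vendor/curl

# Our own patches on top of the current release.
git diff vendor/curl main -- third_party/curl
```

The first diff compares the last two commits on the vendor branch, so it contains only upstream changes. The second compares the vendor branch with main, limited to third_party/curl, so once the merge is done it contains only our patches.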
git subtree
git subtree does exactly what I have described above, except that the code on the “vendor” branch comes from another git repo instead of a tarball, and the code on the “vendor” branch isn’t under the third_party/curl/ directory. Instead, git subtree moves the code to third_party/curl/ when merging it to our main branch.
Instead of creating the vendor branch ourselves, we would use git subtree add like this:
$ git status
On branch main
nothing to commit, working tree clean
$ git subtree add --squash --prefix=third_party/curl git@github.com:curl/curl.git curl-7_80_0
git fetch git@github.com:curl/curl.git curl-7_80_0
From github.com:curl/curl
* tag curl-7_80_0 -> FETCH_HEAD
Added dir 'third_party/curl'
This gives us the following git history:
The top two commits were created by git subtree. The second-from-the-top commit contains all the code from the git repo we specified (git@github.com:curl/curl.git) at the revision we specified (tag curl-7_80_0) — all squashed into a single commit, discarding its own git history. Then the merge commit moves that code to the specified prefix (third_party/curl/) as well as merging it to our main branch.
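You can check this for yourself without touching the network. The sketch below uses a tiny local repository as a stand-in for curl’s repo, with a made-up tag v1, and then inspects the two commits that git subtree add creates. It assumes git subtree is installed; it ships in git’s contrib directory and some distributions package it separately.

```shell
#!/bin/sh
# Sketch: run git subtree add against a local stand-in for the upstream repo.
set -eu

# Placeholder identity for every commit made in this script.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)

# A tiny local repository standing in for git@github.com:curl/curl.git.
git init -q "$work/upstream"
cd "$work/upstream"
echo 'upstream release 1' > README
git add README
git commit -q -m "Upstream release 1"
git tag v1                               # made-up lightweight tag

# Our monorepo, with one commit of our own on main.
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
echo 'int main(void) { return 0; }' > my_file.c
git add my_file.c
git commit -q -m "Add my_file.c"

git subtree add --squash --prefix=third_party/curl "$work/upstream" v1

# The squash commit is the second parent of the merge. Its tree holds the
# upstream files at the top level, not under third_party/curl/.
git ls-tree --name-only HEAD^2

# git subtree records the prefix and the upstream revision in its message.
git log -1 --format=%B HEAD^2

# After the merge, the same files live under the prefix on main.
git ls-tree -r --name-only HEAD
```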
After we have continued development on our main branch, we can upgrade the third-party library with git subtree pull:
$ git subtree pull --squash --prefix=third_party/curl git@github.com:curl/curl.git curl-7_81_0
From github.com:curl/curl
* tag curl-7_81_0 -> FETCH_HEAD
Merge made by the 'recursive' strategy.
In the git history you can see that git subtree has created a graph similar to our explicit vendor branch: