merge two files with nvim -d or vimdiff


Yesterday the Arch community announced the successful migration to git. Congrats on that.

Reading the news article and the steps for using the new repositories, I stumbled over the line "merge the pacman pacnew /etc/pacman.conf.pacnew file".

I compared the two files (my existing one and the new one) with nvim -d (or vimdiff) and thought: is it possible to merge the files like I am used to within a git repository?


Solution

  • My experience is from Fedora/CentOS, where rpm might produce *.rpmsave or *.rpmnew files, but I am assuming *.pacnew files behave in the same way.


    and thought is it possible to merge the files like I am used to within a git repository?

    Yes it is. It requires a little bit of setup but it is absolutely doable.

    Step 1, Install etckeeper

    Etckeeper is a program that

    hooks into package managers like apt, yum, dnf or pacman to automatically commit changes made to /etc during package upgrades.

    THIS IS A PROGRAM THAT EVERYONE SHOULD INSTALL, regardless of any plans to use git merge.

    Etckeeper has had full support for Arch since 2021.

    Step 2, create an additional worktree for /etc

    Worktree in general

    When you clone or initialize a repository with git, you end up with three things: the repo object storage in the .git directory, where git stores commits; the index (also called staging area or cache), which is also stored in .git; and a working directory containing the files that correspond to the current commit. There is a one-to-one relation between them.

    However, git supports connecting multiple working directories to the same repo object storage through the worktree command. The additional worktrees share the original .git directory but get their own indexes, and their files are checked out in different directories. A given branch can be checked out in only one worktree at a time (although you can of course check out any specific commit, or use a mirroring branch).
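
    The behaviour described above can be tried safely in a throwaway repository. The following is a minimal sketch, not part of the actual /etc setup: the temporary directory, the file name, the placeholder identity and the branch name scratch are all made up for illustration.

```shell
set -e
# Throwaway sandbox; nothing outside this temporary directory is touched.
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/main    # name the initial branch "main"
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.invalid"

echo "first version" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Add a second worktree on a new branch "scratch" that starts at main.
git worktree add -b scratch "$sandbox/second" main

# Both directories are backed by the same object storage.
git worktree list

# A branch can be checked out in only one worktree at a time, so this fails.
if git worktree add "$sandbox/third" main 2>/dev/null; then
    echo "unexpected: main was checked out twice"
else
    echo "main is already checked out in $sandbox/repo"
fi
```

    The second directory gets its own checkout and index, while commits made in either place end up in the same .git directory.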

    Typical use cases for creating additional worktrees are for instance

    Worktree for storing the contents of /etc

    If you use /etc as your worktree when doing an interactive rebase, notice that git checks out older commits, so /etc suddenly contains old versions of various config files. This is not unlikely to cause problems for some services. And when there are conflicts, git inserts the "<<<<<"/">>>>>" markers, leaving you with config files that definitely have invalid syntax, which is all but certain to cause problems.

    For these reasons, you should treat /etc as the production system that you normally do not touch, and create an additional worktree as the staging system where you are free to break things and make modifications.

    cd /etc
    # Create branch worktree-main from main and check it out in a new worktree.
    git worktree add -b worktree-main /root/etc-worktree main
    cd /root/etc-worktree
    

    Step 3, taking advantage of git for merging

    In order to make use of git's merge power you need to keep your own changes separate from the upstream changes on separate branches (this sounds more complicated than it actually is). By having a branch which contains the upstream changes, you can then merge this branch and git is able to use its merge algorithm properly.

    You only need to do this for files that end up producing .pacnew or .rpmnew files and you can do this retroactively, so it is fine to just start with one main branch and one worktree-main branch, and add upstream branches as needed over time.

    You need one branch per file with upstream changes you want to merge (i.e. this only applies to files that you have modified locally, which prevents the package manager from updating them in place).

    Example

    One of my machines started out being installed with Fedora 26 in 2017. Why do I know that? Because etckeeper is the second program to install on my list of what to do on a new machine (only installing a full vim version has higher priority), and the first commit which /etc/hosts is present in has commit message "Fresh install of Fedora 26". Did I mention how awesome etckeeper is?

    Over time I have added a few entries to /etc/hosts (and thanks to etckeeper I know exactly when!). The initial hosts from Fedora 26 only contained two lines, but then in September 2022 (how do I know when?...) an update to the setup package came with a new hosts file which contained 5 additional comment lines. Since I had modified hosts this new version ended up as /etc/hosts.rpmnew.

    So this is how I handled that update:

    I ran git log -p hosts to find the newest commit from before I started modifying the file. It was actually commit 1d49be6b Fresh install of Fedora 26. So that is the starting point for the upstream update branch.
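
    As a hedged sketch of that lookup in a throwaway repository (the hosts entries, commit messages and placeholder identity are invented): when the file was only ever changed by you after the installation commit, the commit that added it is the base. If upstream updates were applied cleanly in between, the base is instead the last commit that still holds an unmodified upstream version, which you would pick from the log by hand.

```shell
set -e
sandbox=$(mktemp -d)
git init -q "$sandbox/etc"               # stands in for /etc
cd "$sandbox/etc"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.invalid"

printf '127.0.0.1 localhost\n' > hosts
git add hosts
git commit -q -m "Fresh install"

printf '192.0.2.10 nas\n' >> hosts
git commit -q -am "Add nas to hosts"

# Every commit that touched the file, newest first.
git log --oneline -- hosts

# The commit that first added the file; in this sandbox it is also the
# last pristine upstream version, so it is the base for the upstream branch.
base=$(git log --diff-filter=A --format=%h -- hosts)
git log -1 --format='%h %s' "$base"
```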

    My branch naming policy is to prefix all such upstream branches with rpmnew since *.rpmnew files will be the source for the content of those. You can decide to use the full path in the branch or just the file name. For some cases like /etc/mock/default.cfg using rpmnew/mock/default.cfg is obviously a much better choice than rpmnew/default.cfg while using rpmnew/sshd_config for /etc/ssh/sshd_config might be ok, just make a choice (where a hybrid approach is a possibility).

    So with that in mind, I created an rpmnew/hosts branch and updated it with the new hosts.rpmnew content.

    cd /root/etc-worktree
    git branch rpmnew/hosts 1d49be6b
    git switch rpmnew/hosts
    cp /etc/hosts.rpmnew hosts
    git add hosts
    git commit -m /etc/hosts.rpmnew
    rm /etc/hosts.rpmnew
    

    Notice that this resets the worktree's content back to 2017, but that's fine because the worktree is not used for anything else. If I had worked directly in /etc, that would have been a huge problem.

    So the upstream changes branch is updated and I can then merge it into the main branch (via worktree-main first).

    cd /root/etc-worktree
    git switch worktree-main
    git merge --ff main  # `worktree-main` should strictly follow `main` and only
                         # occasionally be ahead when doing a merge.
    git merge rpmnew/hosts
    

    The last merge command might result in conflicts, and if so you need to resolve those. I highly recommend using KDiff3 for conflict resolution.
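
    To see what a conflicted merge looks like before it happens for real, here is a small sandbox sketch. It merges directly into main for brevity instead of going through worktree-main, and the file contents and placeholder identity are invented. The mergetool call is only shown as a comment because it is interactive; kdiff3 and vimdiff are tool names git knows about, and newer git versions should also list nvimdiff in the output of git mergetool --tool-help.

```shell
set -e
sandbox=$(mktemp -d)
git init -q "$sandbox/etc"               # stands in for the staging worktree
cd "$sandbox/etc"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.invalid"

printf '127.0.0.1 localhost\n' > hosts
git add hosts
git commit -q -m "Fresh install"

# Upstream and local edits change the same line, which forces a conflict.
git branch rpmnew/hosts
printf '127.0.0.1 localhost myhost\n' > hosts
git commit -q -am "Local change"
git switch -q rpmnew/hosts
printf '127.0.0.1 localhost localhost.localdomain\n' > hosts
git commit -q -am "/etc/hosts.rpmnew"
git switch -q main

# git merge exits non-zero on conflicts, so do not let `set -e` abort here.
git merge rpmnew/hosts || echo "merge stopped with conflicts"

# List the files that still contain conflict markers.
conflicted=$(git diff --name-only --diff-filter=U)
echo "$conflicted"

# Interactive step, not run here: open each conflicted file in a merge tool.
#   git mergetool --tool=kdiff3      # or --tool=vimdiff
# Alternatively, back out and leave the branch as it was:
git merge --abort
```

    Because all of this happens in the staging worktree, a half-resolved or aborted merge never touches the live files in /etc.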

    After the merge is done, worktree-main is one merge commit ahead of main. This is the commit we want /etc to move to.

    cd /etc
    git switch main
    git merge --ff worktree-main
    

    And that's it. /etc was moved forward from today's content to today's content plus the update to hosts, and git made use of its merge functionality along the way.
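
    If you want to rehearse the whole sequence before touching a real /etc, the following sketch replays it in a temporary directory. The directories, hosts entries and placeholder identity are invented stand-ins; the branch names are the ones used above.

```shell
set -e
sandbox=$(mktemp -d)
etc="$sandbox/etc"                 # stands in for /etc
wt="$sandbox/etc-worktree"         # stands in for /root/etc-worktree

git init -q "$etc"
cd "$etc"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.invalid"

# Pristine file from the installation, then a local edit at the end.
printf '127.0.0.1 localhost\n::1 localhost\n' > hosts
git add hosts
git commit -q -m "Fresh install"
base=$(git rev-parse HEAD)
printf '192.0.2.10 nas\n' >> hosts
git commit -q -am "Add nas to hosts"

# Step 2: staging worktree on its own branch.
git worktree add -b worktree-main "$wt" main

# Step 3: upstream branch from the pristine commit, carrying the new file.
cd "$wt"
git branch rpmnew/hosts "$base"
git switch -q rpmnew/hosts
printf '# Upstream comment\n127.0.0.1 localhost\n::1 localhost\n' > hosts
git commit -q -am "/etc/hosts.rpmnew"

# Merge upstream into the staging branch, then fast-forward main.
git switch -q worktree-main
git merge -q --ff main
git merge -q -m "Merge rpmnew/hosts" rpmnew/hosts
cd "$etc"
git merge -q --ff worktree-main

cat hosts
```

    In this sandbox the upstream change and the local change touch different parts of the file, so the merge goes through without conflicts and the final hosts contains both the new comment line and the locally added entry.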