I want to allow manual changes to automatically generated files (from templates), while still allowing for updates to the templates or to the data. I envision using git to track the human-made and code-generated portions of the generated files and merge them intelligently.
Somewhat along the lines of
What I want is similar to what is often done for embedded code, where the IDE autogenerates code (for example, filling in register addresses) but still allows manual changes to that code. However, such tools usually restrict changes to small segments, which is a restriction that I cannot accept.
I think that it can be done with separate branches and a suitable merge/rebase strategy, but when thinking about it I identify many difficult corner cases. Therefore I am looking for references where something similar has been done already or a more detailed strategy that avoids at least the following issues:
I can separate the templates from the autogenerated files into different directories, but that does not fix the checkout problem.
Ideally I'd want to write some Python code that handles the complete workflow (git checkouts, template generation, merges and user feedback) with a single command.
I see two ways of handling this.
Either you rebase your manual changes on top of "upstream" changes from time to time, when applicable, in a similar way to making temporary commits, except that in your case the marker does not mean temporary but manual (say, a "Manual: " prefix in the commit message).
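A rebase-based workflow only stays manageable if the manual commits really sit on top of the generated ones. The following is a minimal sketch of such a sanity check, assuming the "Manual: " prefix convention mentioned above and assuming the subjects come from something like `git log --format=%s` (newest first). The function name is hypothetical.

```python
from typing import List

# Prefix convention suggested above; the exact spelling is an assumption.
MANUAL_PREFIX = "Manual: "


def manual_commits_on_top(subjects_newest_first: List[str]) -> bool:
    """Return True if every manual commit is newer than every generated commit.

    `subjects_newest_first` is a list of commit subject lines in the order
    `git log` prints them (newest first). Any subject without the prefix is
    treated as a generated/upstream commit.
    """
    seen_generated = False
    for subject in subjects_newest_first:
        if subject.startswith(MANUAL_PREFIX):
            if seen_generated:
                # A manual commit is buried below a generated one.
                return False
        else:
            seen_generated = True
    return True
```

If the check fails, the history needs reordering (for instance with an interactive rebase) before the manual commits can be cleanly replayed on top of newly generated output.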
The other way, if you want to keep the manual changes in the history over time as normal commits, is to do the same[1] as when you have your /etc directory under version control (e.g. using Etckeeper) and some upstream package update creates either a *.rpmnew or a *.rpmsave file.
So to pull in an rpmnew update of, say, /etc/hosts, I need to figure out what version the update is relative to, i.e. distinguish between the automatic upstream changes and my local changes. Running git log -p hosts might show some custom "10.0.0.50 www.example.com" entries I have added, until the newest upstream change shows up:
commit 769700d4ea13c89959333eeda66574507d5a3237
Author: Mr Root <root@localhost>
Date: Fri May 27 17:34:04 2022 +0200
committing changes in /etc made by "-bash"
Package changes:
...
-0:setup-2.13.9.1-2.fc35.noarch
+0:setup-2.13.10-1.fc35.noarch
...
diff --git a/hosts b/hosts
index 849c10d..740a59a 100644
--- a/hosts
+++ b/hosts
@@ -1,2 +1,7 @@
+# Loopback entries; do not change.
+# For historical reasons, localhost precedes localhost.localdomain:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
+# See hosts(5) for proper format and other examples:
+# 192.168.1.10 foo.mydomain.org foo
+# 192.168.1.13 bar.mydomain.org bar
Thus I know that the new hosts file update should be relative to commit 769700d4. Therefore I run:
cd /root/etc.worktree
git checkout main.worktree
git merge --ff main
git branch rpmnew/hosts 769700d4
git checkout rpmnew/hosts
cp /etc/hosts.rpmnew hosts
git add hosts
git commit -m /etc/hosts.rpmnew
git checkout main.worktree
git merge rpmnew/hosts # Resolve any conflicts using KDiff3, https://github.com/hlovdal/git-resolve-conflict-using-kdiff3
cd /etc
git checkout main
git merge main.worktree
rm hosts.rpmnew
The next time a /etc/hosts.rpmnew file is created, you of course skip the branch creation and just re-use the existing branch.
The above example is a bit more complicated because it deals with /etc, where you do not want to disturb the main worktree while updating, but I include the full details since they should not be too hard to grasp.
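For an ordinary source repository, the same steps might be scripted roughly as follows. This is only a sketch under stated assumptions: the branch names `generated` and `main`, the commit message, and the functions `plan_update` and `run_update` are all hypothetical, and `regenerate` stands for whatever callable renders your templates into the working tree. Only plain `git branch`, `git checkout`, `git add`, `git commit` and `git merge` are used.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

Command = List[str]


def plan_update(
    paths: List[str],
    base_commit: Optional[str] = None,
    generated_branch: str = "generated",
    main_branch: str = "main",
    message: str = "Generated: refresh template output",
) -> Tuple[List[Command], List[Command]]:
    """Return the git commands to run before and after regenerating files.

    Pass `base_commit` only on the first run, when the branch holding the
    pure generated output does not exist yet; later runs re-use the branch.
    """
    prepare: List[Command] = []
    if base_commit is not None:
        prepare.append(["git", "branch", generated_branch, base_commit])
    prepare.append(["git", "checkout", generated_branch])
    finish: List[Command] = [
        ["git", "add", "--", *paths],
        ["git", "commit", "-m", message],
        ["git", "checkout", main_branch],
        ["git", "merge", generated_branch],
    ]
    return prepare, finish


def run_update(
    regenerate: Callable[[], None],
    repo_dir: str,
    paths: List[str],
    base_commit: Optional[str] = None,
) -> None:
    """Run the plan inside `repo_dir`, regenerating files in between.

    check=True aborts on the first failing command. That includes a commit
    with nothing to commit and a merge that stops on conflicts, which then
    has to be resolved by hand.
    """
    prepare, finish = plan_update(paths, base_commit=base_commit)
    for cmd in prepare:
        subprocess.run(cmd, cwd=repo_dir, check=True)
    regenerate()  # hypothetical: writes fresh template output to `paths`
    for cmd in finish:
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Keeping the plan as plain lists separate from the execution makes it easy to print the commands as a dry run before letting the script touch the repository.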
[1] Specifically for /etc, you want to avoid checking out older versions because that could make programs misbehave when the content of /etc changes. So for that scenario you definitely want to do the rpmnew/rpmsave recovery in a separate worktree and then only merge the result into the main /etc directory at the end. This should not be an issue for your average source code repository.
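If you do need the separate-worktree variant, one way to get a scratch checkout is `git worktree`. The sketch below assumes a throwaway worktree created per update, which differs from the permanent /root/etc.worktree directory used in the commands above. The helper names, the path and the branch names are hypothetical.

```python
import contextlib
import subprocess
from typing import Iterator, List


def add_worktree_cmd(path: str, branch: str, start: str = "main") -> List[str]:
    # `git worktree add -b <new-branch> <path> <commit-ish>` creates a new
    # branch at `start` and checks it out in a separate directory.
    # It fails if `branch` already exists.
    return ["git", "worktree", "add", "-b", branch, path, start]


def remove_worktree_cmd(path: str) -> List[str]:
    # Removes the directory only; the branch survives and can be merged later.
    return ["git", "worktree", "remove", path]


@contextlib.contextmanager
def scratch_worktree(repo_dir: str, path: str, branch: str, start: str = "main") -> Iterator[str]:
    """Yield a scratch checkout so the main working tree is never disturbed."""
    subprocess.run(add_worktree_cmd(path, branch, start), cwd=repo_dir, check=True)
    try:
        yield path
    finally:
        # Fails if the worktree still has uncommitted changes, by design.
        subprocess.run(remove_worktree_cmd(path), cwd=repo_dir, check=True)
```

All checkouts and merges of older versions would then happen inside the `with scratch_worktree(...)` block, and only the final `git merge` of the scratch branch runs in the real directory.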