Tags: git, macos, macos-catalina, icloud-drive, icloud-documents

iCloud Drive Desktop Sync vs. Git - deleted files reappear and duplicates with number suffixes


Some words about my setup:

I have noticed the following every now and then for some time. Yesterday, I backed up my MacBook Pro to macOS Catalina 10.15.2, and this seems to have exacerbated a peculiarity I had noticed in my git-initialized project folders:

Often, files that I deleted from my local worktree randomly reappear in the worktree (sometimes a day or more later) as untracked files.

Secondly, quite regularly, my existing files suddenly appear to be duplicated: copies with number suffixes show up, e.g. foo 2 for file foo, or bar 6 for file bar. These copies also show up in git status as untracked files.

[Screenshot: example of duplicates appearing in worktree index]

I also observed this behavior inside the .git folder.

[Screenshot: example of duplicates appearing in .git folder]

* Edit: It is noteworthy that the <filename> 2 duplicates seem to stem from an earlier point in time, sometimes even a month back (see "config 2" in the .git folder screenshot above). I also noted (though this is not shown in the screenshots I provided) that the number suffix is sometimes arbitrary, e.g. "6", with no sequence of lower-numbered duplicates (1-5) leading up to it.

I have observed this every now and then, but today it was all over the place. The problem may have appeared especially when I ran git operations such as git commit, git reset, etc.

My assumption is that this has something to do with .git not working well with iCloud Drive Desktop file sync.

So for now I will disable the iCloud Drive Desktop file sync option and see whether that solves it.

In the meantime, is anyone here familiar with the issue I have described and can anyone point me in the right direction, please?

These posts seem to be related:

Can Git and iCloud Drive be effectively used together?

https://apple.stackexchange.com/questions/255172/icloud-drive-and-git-repository/353123

Github repo cloned to synced iCloud drive on multiple computers


Solution

  • Short Answer: Keep your repository folders outside of your iCloud Drive-synced folders, and you should be fine. To be safe, do not combine VCS and file-syncing services for the same directories/files. Use GitHub/GitLab/Bitbucket/etc. for synced access and centralised safekeeping.

    Long Answer: iCloud Drive is a "consumer" product, meant for home users. If you are a developer working with version control software, you are considered a "professional" - and you will find that iCloud Drive (like other file-syncing solutions) is not a robust solution for your version-controlled folders. iCloud Drive (like other file-syncing services) is not aware of your VCS setup, and gets confused when you perform operations that make sweeping changes to directories - like switching branches or pulling changes. If you'd like to access your repositories on various computers/devices simultaneously, and have a 'central backup' of your repository files, just use one of the many repository hosting services - like GitHub, GitLab, Bitbucket, etc.

    Even Longer Answer: The key problem that all 'auto-syncing' software has is: how do we determine when a file has been changed and should be synced? Do we check the actual file contents, assuming that a file with the same name should be the same file? How about tracking name changes? How about when we transfer files from one computer to another, and permissions (or dates) might change?

    Often, file-syncing software will watch the directories as you work, for any changes. Once it detects that you've changed something in there, it will go through its routine to determine which files have changed, and to re-sync the needed ones.

    There are many VCS operations - like pulling the latest changes in the repository, switching branches, or rolling back to a previous commit - which are likely to cause the file-syncing software to trigger its syncing routine. Depending on the actual syncing algorithm (how it determines what has changed, what steps are taken to sync, and how fast it is at actually performing the sync), it is likely that it will detect 'false positives', which will cause you to end up with duplicates.
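The 'false positives' mentioned above can be illustrated with a minimal sketch. It does not show how iCloud Drive (or any real syncing service) works internally; it only contrasts two hypothetical ways of answering "has this file changed?". The function names (mtime_signature, content_signature, rewrite_same_content, demo) are made up for this example. The assumed scenario is a file that gets rewritten with identical bytes, as can happen when you switch to another branch and back.

```python
import hashlib
import os
import tempfile


def mtime_signature(path):
    """Cheap check: modification time (in ns) and size only."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def content_signature(path):
    """Expensive check: SHA-256 of the actual file contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def rewrite_same_content(path):
    """Simulate a VCS operation that rewrites a file with identical bytes."""
    with open(path, "rb") as fh:
        data = fh.read()
    with open(path, "wb") as fh:
        fh.write(data)
    # Bump the timestamp explicitly so the demo does not depend on the
    # timestamp resolution of the underlying file system.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))


def demo():
    """Return (mtime_says_changed, content_says_changed) for one rewrite."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo.txt")
        with open(path, "w") as fh:
            fh.write("hello\n")
        before = (mtime_signature(path), content_signature(path))
        rewrite_same_content(path)
        after = (mtime_signature(path), content_signature(path))
        return before[0] != after[0], before[1] != after[1]


if __name__ == "__main__":
    mtime_changed, content_changed = demo()
    print("timestamp-based check reports a change:", mtime_changed)
    print("content-based check reports a change:", content_changed)
```

A watcher that relies only on timestamps would flag this file for syncing even though its contents are the same, which is the kind of false positive described above.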

    In the particular case of the 'iCloud Drive + git' pair, we have a deadly combination: git is very fast at making sweeping changes to entire directory structures, and iCloud Drive is notoriously bad at correctly detecting what has actually changed - and is also very slow at syncing. This means that as git goes about switching branches and updating your working tree, iCloud Drive is likely to wrongly detect that files have changed when they haven't. It will then tag these files for syncing. But because it is extremely slow at syncing, by the time it's halfway through making its first duplicate copies, you might have already made another git change to your repository - which will cause you to now have 'file 3', and then 'file 4' and so on.
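If a repository has already picked up such copies, a small script can help list them before you clean up. The following is a rough sketch, not an official tool: the function name find_numbered_duplicates is hypothetical, and the naming pattern ("<name> <number>", optionally followed by an extension) is an assumption based on the examples in the question ("foo 2", "bar 6", "config 2"). It only reports files, not directories, and it only flags a file when a sibling without the number suffix exists, so legitimate names can still show up and should be reviewed by hand.

```python
import os
import re
import sys

# Assumed pattern: "<stem> <number><optional extension>",
# e.g. "foo 2" or "bar 6.txt".
SUFFIX_RE = re.compile(r"^(?P<stem>.+) (?P<num>\d+)(?P<ext>\.[^./ ]+)?$")


def find_numbered_duplicates(root):
    """Return sorted (suspected_duplicate, original) path pairs under root.

    A file is reported only if a file with the same name minus the
    " <number>" suffix exists in the same directory.
    """
    pairs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        names = set(filenames)
        for name in filenames:
            match = SUFFIX_RE.match(name)
            if not match:
                continue
            original = match.group("stem") + (match.group("ext") or "")
            if original in names:
                pairs.append(
                    (os.path.join(dirpath, name), os.path.join(dirpath, original))
                )
    return sorted(pairs)


if __name__ == "__main__":
    # Usage: python find_duplicates.py /path/to/repository
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for duplicate, original in find_numbered_duplicates(target):
        print(f"{duplicate}  (suspected copy of {original})")
```

The script only prints candidates and deletes nothing. Because os.walk also descends into the .git folder, copies such as "config 2" would be listed as well.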

    Hopefully this will change in the future, but in the meantime, the safest solution is to simply NOT keep your version-controlled repositories in any folder that is automatically synced. In this particular case, if you keep your repositories in any folder that is not 'Documents' or 'Desktop' - and not watched by iCloud Drive - then you should not have an issue with git.
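To check whether an existing repository sits in one of those folders, something like the following sketch could be used. The function names (default_synced_roots, is_inside_synced_folder) are hypothetical. The list of folders is an assumption: 'Desktop' and 'Documents' come from the paragraph above and are only synced when the corresponding iCloud Drive option is enabled, while ~/Library/Mobile Documents is added here as the assumed local location of iCloud Drive itself. Adjust the list to your own setup.

```python
from pathlib import Path


def default_synced_roots(home):
    """Folders assumed to be watched by iCloud Drive (adjust as needed)."""
    home = Path(home)
    return [
        home / "Desktop",
        home / "Documents",
        home / "Library" / "Mobile Documents",  # assumption, not from the answer
    ]


def is_inside_synced_folder(repo_path, synced_roots):
    """Return True if repo_path is one of synced_roots or lies below one."""
    repo = Path(repo_path).resolve()
    for root in synced_roots:
        root = Path(root).resolve()
        if repo == root or root in repo.parents:
            return True
    return False


if __name__ == "__main__":
    roots = default_synced_roots(Path.home())
    repo = Path.cwd()
    if is_inside_synced_folder(repo, roots):
        print(f"{repo} appears to be inside a synced folder; consider moving it.")
    else:
        print(f"{repo} does not appear to be inside a synced folder.")
```

Run from inside a repository, this only reports where the repository lives; it does not inspect the actual iCloud settings on the machine.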

    Note that this is not an issue only with iCloud Drive and git. If you use any file syncing service (Dropbox, Google Drive, OwnCloud, Box, etc.) and any VCS (git, svn, fossil, etc.), you are likely to run into some kind of duplication, corruption, or security issue. :(

    Lastly, it is worth mentioning that the benefits provided by iCloud Drive - and other file-syncing services - are 'availability' (being able to access the repository from multiple computers and devices, keeping them synced) and 'security' (having a central location with a safe copy of all your files). You already get these benefits if you use any of the repository hosting services, such as GitHub, GitLab, Bitbucket, etc. So, in general, file-syncing your repositories is something that you don't actually need to do - just use the repository hosting services you are probably already using! ;-)