I'm having a really hard time understanding how the .gitignore file works.
This is what my file looks like:
custom/history
cache
*.log
custom/modules/*/Ext
upload
sugar-cron*
custom/application/Ext
custom/Extenstion/modules/*/Ext/Language
!custom/modules/*/Language/cs_CZ.*
!custom/modules/*/Language/en_us.*
custom/Extenstion/application/Ext/Language
!custom/Extenstion/application/Ext/Language/cs_CZ.*
!custom/Extenstion/application/Ext/Language/en_US.*
.htaccess
config.php
config_override.php
files.md5
This is what my git status looked like:
apache@cb772759c68a sugarcrm$ git status
# On branch master
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# LOG.txt
# deploy_backup/
nothing added to commit but untracked files present (use "git add" to track)
So I wanted to get rid of the two untracked files, but to my surprise a whole bunch of other files were removed too.
apache@cb772759c68a sugarcrm$ git clean -fd
Removing Disabled/upload:/
Removing LOG.txt
Removing custom/Extension/modules/Bugs/Ext/Language/
Removing custom/Extension/modules/Cases/Ext/Language/
Removing custom/Extension/modules/EmailAddresses/
Removing custom/Extension/modules/EmailParticipants/
Removing custom/Extension/modules/ForecastManagerWorksheets/
Removing custom/Extension/modules/ForecastWorksheets/
Removing custom/Extension/modules/Forecasts/
Removing custom/Extension/modules/Meetings/Ext/Layoutdefs/
Removing custom/Extension/modules/Meetings/Ext/WirelessLayoutdefs/
Removing custom/Extension/modules/Meetings/Ext/clients/
Removing custom/Extension/modules/ModuleBuilder/
Removing custom/Extension/modules/OutboundEmail/
Removing custom/Extension/modules/PdfManager/
Removing custom/Extension/modules/ProjectTask/Ext/Language/
Removing custom/Extension/modules/Quotas/
Removing custom/Extension/modules/Quotes/Ext/Dependencies/
Removing custom/Extension/modules/Targets/
Removing custom/Extension/modules/Tasks/Ext/Language/
Removing custom/Extension/modules/TimePeriods/
Removing custom/application/
Removing custom/install/
Removing custom/modules/Administration/
Removing custom/modules/Bugs/
Removing custom/modules/Cases/
Removing custom/modules/Contracts/
Removing custom/modules/Emails/
Removing custom/modules/HHP_Products/
Removing custom/modules/KBContents/
Removing custom/modules/Project/
Removing custom/modules/ProjectTask/
Removing custom/modules/ProspectLists/
Removing custom/modules/Prospects/
Removing custom/modules/Quotas/
Removing custom/modules/Reports/
Removing custom/modules/RevenueLineItems/
Removing custom/modules/Schedulers/
Removing custom/modules/Tags/
Removing custom/modules/Teams/
Removing custom/modules/hhp_assignment_zip/
Removing custom/modules/hhp_zipcode/
Removing custom/working/modules/Calls/
Removing custom/working/modules/Leads/clients/
Removing deploy_backup/
Removing deploy_log/
Removing dist/identity-provider/tests/docker/saml-test/config/simplesamlphp/config/
Removing vendor/sugarcrm/identity-provider/tests/docker/saml-test/config/simplesamlphp/config/
First point: the removed files were not shown by git status, so obviously they were covered by the .gitignore "mask". Can anyone explain how any of these files match any of the patterns in .gitignore? For example, vendor/sugarcrm/identity-provider/tests/docker/saml-test/config/simplesamlphp/config/. Can anyone help me build a proper .gitignore?
Second point: I thought that .gitignore "protects" these unversioned files from git clean, that Git literally does not take any action on them. But obviously it does delete them. How can I avoid deleting unversioned files while using git clean?
EDIT: I confused git clean with git rm; I was talking about git clean the whole time.
EDIT 2: It turned out that the deleted directories that didn't match the .gitignore were "empty" after all (they had subdirectories, but the directory tree contained no files).
You've misinterpreted what git clean removes by default and with -d. (Note: I'm not a big fan of git clean myself; it's way too easy to have it remove precious files.)
As phd notes, listing a file in .gitignore specifically disables, by default, having git clean clean it away. However, git clean is (significantly) more complicated than that. We'll get into this in a bit.
First, though, let's address one peculiarity of .gitignore entries. If you already know all this (but nobody seems to :-) ) you can skip down to the git clean-specific sections below.
1. A file that is tracked (is in the index right now) is never ignored, so whether it matches a .gitignore or equivalent (e.g., .git/info/exclude) pattern is irrelevant.
   The phrase "is in the index right now" means just that. When you use git add or git rm --cached to add or remove a file, that changes its tracked-ness. You can also use git ls-files --stage to dump out a complete list of every file in the index along with its staging data (mode, hash, and stage slot number), or omit --stage to get just the names.
2. A file (not a directory) that has been found by Git, that is not in the index right now, is untracked. Git does not store directories, so directories never appear in the index.[1] Tracked or untracked is purely a property of files.
3. An untracked file can also be an ignored file. If so, git add won't add it, even if you name it explicitly on the command line (though you can both name it explicitly and use --force to add it).
   This means files (but not directories) fall into one of three categories: tracked, untracked (only), or untracked-and-ignored. This matters for git status, which only complains about untracked files (not untracked-and-ignored ones), and, in a moment, for git clean as well.
4. Last, when Git is doing a full directory-tree search / scan (as in git add . for instance) and encounters a directory that it might be able to skip (one with no tracked files within it), Git will check whether the directory itself matches a .gitignore pattern, and if so, not look inside it. This speeds up git status and git add -A / git add . on such directories (sometimes enormously, if you can ignore an entire vendor tree or SDK for instance).
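The three file categories can be checked directly in a scratch repository. This is only a minimal sketch: the temporary directory and the file names (tracked.txt, untracked.txt, ignored.log) are made up for illustration, and it assumes a reasonably recent Git on the PATH.

```shell
# Throwaway repository; every file name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q

printf '*.log\n' > .gitignore
touch tracked.txt untracked.txt ignored.log

# Only files that are in the index count as tracked.
git add .gitignore tracked.txt

# Names only; add --stage to also see mode, hash, and stage slot number.
tracked=$(git ls-files)
printf 'tracked:\n%s\n' "$tracked"

# Porcelain status codes: "A" = staged (tracked), "??" = untracked,
# "!!" = untracked-and-ignored (shown only because of --ignored).
status=$(git status --porcelain --ignored)
printf '%s\n' "$status"

# Naming an ignored file explicitly is refused unless --force is given.
if git add ignored.log 2>/dev/null; then
    add_result=added
else
    add_result=refused
fi
echo "plain git add: $add_result"

git add --force ignored.log
after_force=$(git ls-files)
```

Plain git status (without --ignored) would list only untracked.txt as untracked, which is the behavior described in point 3.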
Rule 4 is why, if you want to not ignore particular file paths that live underneath some directory path, you must instruct Git to specifically not-ignore the directory. If you ignore the directory, Git may never look inside the directory. This affects these three lines in particular:
custom/Extenstion/application/Ext/Language
!custom/Extenstion/application/Ext/Language/cs_CZ.*
!custom/Extenstion/application/Ext/Language/en_US.*
If you have ignored the entire directory custom/Extenstion/application/Ext/Language, Git won't look inside it and will never find any file matching custom/Extenstion/application/Ext/Language/cs_CZ.* to un-ignore it. It's therefore necessary to exempt the directory itself from ignored status: you should change the first line to read custom/Extenstion/application/Ext/Language/*, so that Git must look inside the directory. The subsequent lines ending with cs_CZ.* and en_US.* will then override the ignored status for the Czech and US-English files.
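The difference between ignoring a directory and ignoring its contents can be seen in a scratch repository. This is a sketch under assumptions: the directory name lang and the two language files are hypothetical stand-ins for the real paths, and -uall is used only so that git status lists individual files.

```shell
# Throwaway repository; "lang" and its files are made-up stand-ins.
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir -p lang
touch lang/cs_CZ.php lang/de_DE.php

# Variant 1: ignore the directory itself. Git never looks inside,
# so the negated pattern on the second line has no effect.
printf 'lang\n!lang/cs_CZ.*\n' > .gitignore
dir_ignored=$(git status --porcelain -uall)
printf 'directory ignored:\n%s\n' "$dir_ignored"

# Variant 2: ignore the directory's contents. Git must scan the
# directory, so the negated pattern can re-include the Czech file.
printf 'lang/*\n!lang/cs_CZ.*\n' > .gitignore
contents_ignored=$(git status --porcelain -uall)
printf 'contents ignored:\n%s\n' "$contents_ignored"
```

In the first variant only .gitignore itself shows up as untracked; in the second, lang/cs_CZ.php appears as well while lang/de_DE.php stays ignored.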
[1] In fact, they can appear in the index, but only so as to be treated as special cases. git ls-files, which can show you the index contents, skips right over them.
git clean -d clearly modifies Rule 4
Git can only remove a directory if it's empty. This is a general OS-enforced rule: if a directory d contains some files d/f1, d/f2, and so on, and you were to remove d without removing the files first, you'd have a problem with the files. The system forces you to first remove the files within the directory. This applies to sub-directories as well: you can't remove d if d/sub exists, even if d/sub is itself an empty directory. Only empty directories can be removed.
Running git clean without -d not only leaves Rule 4 installed, but actually extends it. For instance, in the example we started with, Git notices that (1) custom/Extenstion/application/Ext/Language is a directory; (2) the directory matches an ignore pattern; so (3) provided there are no files in custom/Extenstion/application/Ext/Language that are already tracked, Git can and will skip the entire directory (and of course not remove it, since git clean is running without -d).
Suppose that there's another directory named xyzzy/ that has no files listed in the index. This directory might be completely empty. In that case, there are no untracked files within it, by definition; so git clean without -d should do nothing to it. Or it might have files; these files are by definition untracked (and hence may be untracked-and-ignored), but you said not to remove directories, so git clean still doesn't even bother to look inside. This is the slightly odd case: Git often doesn't bother to look inside unknown directories.[2] (You see this with git status as well: you have to use git status -uall to find the files inside a mystery directory. But git add -A or git add . has to look inside, unless the directory is ignored, which is why Rule 4 is a bit complicated in the general case.)
Running with -d, though, apparently throws Rule 4 out completely. Again, in order to remove a directory, Git must first remove all the files within the directory. To do that, Git has to enumerate the contents as well. So if you tell git clean to use -d, it seems appropriate to disable Rule 4 entirely. The directory-ness of a path name will force Git to scan the directory's contents. Either we already needed to look inside because there are tracked files, or we need to look inside to remove files in order to remove the directory.
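A dry run makes the effect of -d visible without deleting anything. The sketch below assumes a scratch repository with hypothetical names: stray.txt, the xyzzy/ directory from the text, and hollow/, a directory tree that contains sub-directories but no files.

```shell
# Throwaway repository; all names are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir -p xyzzy/sub hollow/a/b
touch stray.txt xyzzy/sub/file.txt

# Without -d, Git does not descend into directories that hold no
# tracked files, so only the top-level untracked file is listed.
without_d=$(git clean -n)
printf 'git clean -n:\n%s\n' "$without_d"

# With -d, whole directories are listed, including the tree that
# contains nothing but empty sub-directories.
with_d=$(git clean -n -d)
printf 'git clean -n -d:\n%s\n' "$with_d"
```

The hollow/ case mirrors the "empty after all" directories mentioned in the question's second edit.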
[2] Note that "unknown" is not the same as "untracked". It's not even a Git term; I've made it up here. However, as we'll see, it might be nice if Git did define the phrase "untracked directory".
What git clean removes
Running git clean -n will show you what it would remove. This listing uses some shorthand: removing a directory implies removing all the files within that directory, including (recursively) sub-directories with sub-files. Running with -n first is safer than going straight to -f, since -f prints the same kind of listing but only reports what it did remove, whereas -n shows what it would remove.
By default, git clean removes files that are untracked, but not files that are untracked-and-ignored. That is, go back to point 3 above and look at the three classifications of files: git clean removes the middle classification (only). Adding -X (uppercase X) tells Git: don't remove untracked-only files; instead, remove untracked-and-ignored files.
Adding -x tells Git: don't read the usual ignore-directives files such as .gitignore. At this point, no files will be ignored, so that (regardless of which files are tracked) no files can be untracked-and-ignored. Combining this with -X would make no sense,[3] so git clean forbids you to use both -x and -X.
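The three modes can be compared with dry runs. This is a minimal sketch in a scratch repository; notes.txt and build.log are hypothetical file names, and .gitignore is added to the index so that it is tracked and never a candidate for removal.

```shell
# Throwaway repository; file names are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q

printf '*.log\n' > .gitignore
git add .gitignore          # tracked, so git clean leaves it alone
touch notes.txt build.log   # untracked-only and untracked-and-ignored

default=$(git clean -n)         # untracked-only files
only_ignored=$(git clean -n -X) # untracked-and-ignored files
everything=$(git clean -n -x)   # ignore rules are not consulted

printf 'default:\n%s\n' "$default"
printf -- '-X:\n%s\n' "$only_ignored"
printf -- '-x:\n%s\n' "$everything"

# Git refuses the combination of -x and -X outright.
if git clean -n -x -X 2>/dev/null; then
    both=accepted
else
    both=rejected
fi
echo "-x with -X: $both"
```

The default run lists only notes.txt, -X lists only build.log, and -x lists both.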
Running git clean with -d adds empty-directory removal. Here, things get particularly squirrelly, though. It seems as though Git's tracked, untracked, and untracked-and-ignored classification breaks down a bit. The documentation says that -d will:
Remove untracked directories in addition to untracked files.
But Git has no definition of untracked directories. "Tracked-ness" is exclusively a property of files. We did see, in a footnote, that directories sneak into the index as invisible entities (for purposes of speeding up various Git operations), but that doesn't really mean that directories are tracked.
We can make one up: an "untracked directory" might be a directory that contains no tracked files. I think (but have not proven to my own satisfaction) that this definition works and explains git clean's behavior. It would help a lot if the Git documentation actually defined this properly, though.
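The made-up definition is easy to test by hand: a directory counts as "untracked" here if git ls-files prints nothing for it. The helper is_untracked_dir and the src/ and docs/ paths below are hypothetical, and this is only a sketch of that working definition, not something Git itself provides.

```shell
# Throwaway repository; paths and the helper name are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir -p src docs/drafts
touch src/main.c docs/drafts/a.txt
git add src/main.c

# Working definition from the text: no index entries under the path.
is_untracked_dir() {
    [ -z "$(git ls-files -- "$1")" ]
}

for d in src docs; do
    if is_untracked_dir "$d"; then
        echo "$d: untracked directory (no tracked files inside)"
    else
        echo "$d: contains tracked files"
    fi
done

# In this small case the dry run agrees with the definition:
# only docs/ is listed for removal.
cleaned=$(git clean -n -d)
printf '%s\n' "$cleaned"
```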
[3] Combining -x and -X with -e could have some practical uses, but Git still forbids this, at least as of today.