How do I safely disable/remove the largefiles directory from a Mercurial repository?


In the past, I used the largefiles extension in Mercurial to store data alongside the code I was working on. I think this was a mistake, and I would like to remove the "largefiles" directory (8 GB). Our network user directories are limited to 10 GB, and I need the space. I have not used any large files for a long time, and I will not miss them when they are gone for good.

So my questions are:

  1. Can I remove the largefiles directory under .hg without damaging the repo?
  2. If I do, will I be able to check out old code, even if some large datafiles are missing?
  3. Should I remove those files from all clones of that repo to avoid polluting all repos again with largefiles from another clone?

Solution

  • For your first question, I did an experiment:

    1. Created a repo with a large file.
    2. hg update null
    3. Deleted .hg\largefiles
    4. hg update

    The large files came back! It turns out that, at least on Windows, large files are also cached in %UserProfile%\AppData\Local\largefiles. Since this was my only largefiles-enabled repository, the cache contained only my one large file, so I deleted that too. This cache holds large files from every local largefiles-enabled repository, so be careful about what you delete from it.

    Keeping two copies is less wasteful than it sounds: if a repository is on the same drive as %UserProfile%, the two copies are hardlinked. I have two drives in my system, and a repository on the other drive is still copied to the AppData location, but it is not hardlinked, so it doubles your disk usage.
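To check whether the files in a repository's largefiles store are shared with another location (such as the user cache) or take up space of their own, you can count hard links. This is a minimal sketch that assumes a POSIX shell with `find`, which the experiment above did not use; the function names are hypothetical and not part of Mercurial.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Mercurial.

# Print how many regular files under directory $1 have more than one
# hard link, i.e. share their data with a file somewhere else.
count_hardlinked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Print how many regular files under directory $1 have exactly one
# link, i.e. occupy disk space that no other path shares.
count_unshared() {
    find "$1" -type f -links 1 | wc -l | tr -d ' '
}
```

Running `count_hardlinked .hg/largefiles` and `count_unshared .hg/largefiles` from the repository root shows how many stored large files are shared and how many are separate copies.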

    Once all copies of the large file were deleted, an hg update gave:

    1 files updated, 0 files merged, 0 files removed, 0 files unresolved
    getting changed largefiles
    largefile.dat: can't get file locally
    (no default or default-push path set in hgrc)
    0 largefiles updated, 0 removed
    

    I then removed the largefiles= entry (under [extensions]) from .hg\hgrc to disable the extension. At this point the repository worked fine, but changesets that used to have large files still had the .hglf directory containing their hashes. So the answer to your second question is yes, you can check out old code.

    For your third question, to eliminate all traces of largefiles and hashes, create a file with:

    exclude .hglf
    

    and run:

    hg convert --filemap <file> <srcrepo> <destrepo>
    

    Your users will then have to clone this new, modified repository, because convert rewrites the changesets (and therefore their hashes), so the new repository is unrelated to the old one.
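The filemap and convert steps can be wrapped in a small script. This is only a sketch, assuming a POSIX shell and that the bundled convert extension is not already enabled in your hgrc (it is switched on for the one command with `--config`). The function names and any paths you pass in are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper names; only "hg convert --filemap" and the
# "exclude .hglf" rule come from the steps described above.

# Write a filemap to path $1 that drops the .hglf directory
# from every converted changeset.
write_filemap() {
    printf 'exclude .hglf\n' > "$1"
}

# Convert source repository $1 into a new repository $2 without .hglf.
# Assumption: convert is not enabled in hgrc, so enable it here.
strip_largefiles() {
    filemap=$(mktemp) || return 1
    write_filemap "$filemap"
    hg --config extensions.convert= convert --filemap "$filemap" "$1" "$2"
    status=$?
    rm -f "$filemap"
    return "$status"
}
```

For example, `strip_largefiles ./myrepo ./myrepo-clean` would leave the original untouched and write the converted history to a new directory. Try it on a copy first.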