In this F8 conference video (starting at 8:40) from 2015, they talk about the advantages of using Mercurial and a single repository across Facebook.
How does this work in practice? Using Mercurial, can I check out a subdirectory (like in SVN)? If so, how? Do I need a Facebook Mercurial extension for this?
P.S.: I only found answers like this or this from 2010 on SO, and I am not sure whether they still apply given all the effort FB has put into it.
From your question it is not clear whether you are looking for a workflow (the monorepo vs. multiple repos debate) or for performance and scaling for a huge code base.
For the workflow, I suggest googling for monorepo. It has its pros and cons; you need to understand your situation and current workflow to decide. For performance and scaling, keep reading.
The idea of remotefilelog is not to check out a subdirectory (as you mention); the idea is to check out everything. To do that efficiently, you need two extensions actively developed by Facebook:
- remotefilelog, which reduces hg clone and hg pull time.
- fsmonitor (previously called hgwatchman; it is now part of Mercurial core). This dramatically reduces the time of local operations such as hg status. Note that fsmonitor is independent of remotefilelog. You can start experimenting with it, since it doesn't require any setup on the server side.

With a recent Mercurial (which I strongly suggest) you can shave off the additional startup time of the Python interpreter using CommandServer + CHg.
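Since fsmonitor needs no server-side setup, trying it comes down to enabling the bundled extension in an hgrc file. The following is a minimal sketch, not a definitive recipe: the helper name enable_fsmonitor is hypothetical, and it assumes that watchman is already installed on the machine and that chg is available for the timing comparison.

```shell
# Hypothetical helper: enable the bundled fsmonitor extension in an hgrc file.
# Assumes watchman is installed separately; fsmonitor relies on it.
enable_fsmonitor() {
    hgrc="$1"
    # Do nothing if fsmonitor is already listed in the file.
    if grep -q '^fsmonitor[[:space:]]*=' "$hgrc" 2>/dev/null; then
        return 0
    fi
    # Mercurial accepts repeated [extensions] sections, so appending is safe.
    printf '\n[extensions]\nfsmonitor =\n' >> "$hgrc"
}

# Usage sketch (run inside a repository; requires hg, watchman and chg):
#   enable_fsmonitor .hg/hgrc
#   time hg status     # first run lets watchman crawl the working copy
#   time hg status     # later runs should be noticeably faster
#   time chg status    # same command without the Python startup cost
```

Enabling the extension per repository (in .hg/hgrc) rather than globally keeps the experiment limited to the one large repo where it matters.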
Some additional notes:
- I have used fsmonitor. It works very well: on huge repos the time of hg status is reduced from 10 seconds to less than 1 second (and most of that 1 second is Python startup time; see above for CHg). If your repository is really huge, you might need to fine-tune some inotify kernel parameters (or the equivalent on Mac OS X). The fsmonitor documentation has all the information you need.
- I have not tried remotefilelog, although I read everything I found about it and I am sure it works. Depending on how development is done (whether everybody always has Internet connectivity, whether the organization has its own master repo) there can be a caveat: it partially transforms the decentralized hg into a centralized VCS like svn. Some operations that can normally be done offline (for example hg log and the first hg update to a changeset in the past) will now require connectivity to the master repository.
- I have used the largefiles extension extensively on a huge repo. It has the same drawbacks as remotefilelog, plus some confusing corner cases for users who want to use hg just to get things done without taking the time to understand how it works. If I were to manage another huge repo, I would use remotefilelog instead of largefiles, although their use cases are not really the same.
- Another option is subrepositories (doc1, doc2). The problem is that it changes the behavior of hg depending on where you are in the source tree. Again, if the developers don't care about really understanding how hg works, it will be just too confusing.

Additional information:
- Search for mercurial facebook.
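For the inotify tuning mentioned in the fsmonitor note above, a quick way to see whether a Linux machine is likely to need it is to compare the number of directories in the working copy with the per-user watch limit. This is a rough sketch under stated assumptions: the helper names count_dirs and watches_sufficient are hypothetical, the check is Linux-only, and treating "one watch per directory" as the cost is an approximation rather than an exact accounting of what watchman does.

```shell
# Hypothetical helper: number of directories under DIR, including DIR itself.
count_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

# Hypothetical helper: succeed if the directory count fits within the
# inotify watch limit. The limit is read from the Linux procfs entry by
# default; a different file can be passed as the second argument.
watches_sufficient() {
    dirs=$(count_dirs "$1")
    limit=$(cat "${2:-/proc/sys/fs/inotify/max_user_watches}")
    [ "$dirs" -le "$limit" ]
}

# Usage sketch (run from the root of the working copy on Linux):
#   if watches_sufficient .; then
#       echo "current inotify watch limit looks sufficient"
#   else
#       echo "consider raising fs.inotify.max_user_watches"
#   fi
```

If the check fails, the fsmonitor and watchman documentation describe which kernel parameters to raise and by how much.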