Re: [boost] [svn/git/hg] Support for modularization of Boost?

11 Apr 2012

Beman Dawes <bdawes@acm.org> writes:
...
> Modularization seems to have been missed in the discussions of
> Subversion, Git, and Mercurial. Do distributed version control systems
> in general and Git in particular have any important
> advantages/disadvantages over svn for highly modularized projects?
> Please, let's not waste everyone's time with a rehash of general DCVS
> vs CCVS pros and cons. We have beat that to death. Let's focus this
> thread on modularization support, particularly as it applies to Boost.

Modularization in Mercurial is done with something we call
subrepositories:

  http://mercurial.selenic.com/wiki/Subrepository

This lets you recursively check out and operate on nested repositories.
The repositories can be of different kinds: Hg, Git, or SVN.

The system is quite simple. There are two version-controlled files:

* .hgsub tells Mercurial where each nested repository comes from and
  where it should be put in the working copy. You manage this file.

* .hgsubstate tells Mercurial which exact revision to check out for each
  subrepository. Mercurial manages this file.
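
As a rough illustration of what .hgsub contains, here is a minimal Python
sketch that reads "path = source" lines, where a "[git]" or "[svn]" prefix
on the source marks a non-Mercurial subrepo. The paths and URLs are made up
for the example, parse_hgsub is a hypothetical helper (not part of
Mercurial), and the sketch simply stops at the first section header instead
of handling it.

```python
# Hypothetical .hgsub contents; the paths and URLs are invented.
SAMPLE_HGSUB = """\
libs/foo = https://hg.example.org/foo
libs/bar = [git]https://git.example.org/bar.git
libs/baz = [svn]https://svn.example.org/baz/trunk
"""


def parse_hgsub(text):
    """Map each working-copy path to a (kind, source) pair."""
    subrepos = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            # Section headers are outside the scope of this sketch.
            break
        path, source = (part.strip() for part in line.split("=", 1))
        kind = "hg"  # no prefix means a Mercurial subrepo
        for prefix in ("[git]", "[svn]"):
            if source.startswith(prefix):
                kind = prefix[1:-1]
                source = source[len(prefix):].strip()
        subrepos[path] = (kind, source)
    return subrepos


if __name__ == "__main__":
    for path, (kind, source) in parse_hgsub(SAMPLE_HGSUB).items():
        print(path, kind, source)
```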

Mercurial always checks out the revision recorded in .hgsubstate. Each
commit in the main repository therefore references exactly one revision
of each subrepo, which gives you a consistent snapshot of the entire
tree. This is very similar to using pegged svn:externals in Subversion.
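
To make the pinning concrete, the sketch below reads .hgsubstate-style
lines of the form "<revision> <path>" and reports which revision each
subrepo is pinned to. The revision hashes and paths are invented, and
parse_hgsubstate is a hypothetical helper rather than anything Mercurial
provides.

```python
# Hypothetical .hgsubstate contents: one "<revision> <path>" line per subrepo.
SAMPLE_HGSUBSTATE = """\
0123456789abcdef0123456789abcdef01234567 libs/foo
89abcdef0123456789abcdef0123456789abcdef libs/bar
"""


def parse_hgsubstate(text):
    """Map each subrepo path to the single revision it is pinned to."""
    pinned = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        revision, path = line.split(" ", 1)
        pinned[path.strip()] = revision
    return pinned


if __name__ == "__main__":
    for path, revision in sorted(parse_hgsubstate(SAMPLE_HGSUBSTATE).items()):
        print(f"{path} is pinned to {revision[:12]}")
```

Because the file is committed along with everything else, each revision of
the main repository carries its own copy of this mapping.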

My company has written a subrepository guide that might be helpful:

  http://mercurial.aragost.com/kick-start/en/subrepositories/

I know that subrepositories are used in various big companies -- we've
helped some of them by improving the subrepo support in Mercurial. There
is now support for recursive 'hg status' and 'hg diff' and the goal is
to make subrepos more and more transparent for users.

That's all the good stuff about subrepos. The bad part is that working
with them isn't as smooth as working with a single repo -- it's simply a
more complex workflow.

If you begin referencing the same subrepos from several main repos (and
if you don't, why are you using subrepos in the first place?) then
you'll want to configure the server side to link the subrepos. That's
not obvious at first, but I've documented it here:

  http://stackoverflow.com/a/7318280/110204

I would also like to warn against coupling that is too tight. When you
use subrepos you introduce a very tight coupling: you say "73f4e05287b4
in foo needs exactly 0f1e621d3d3b from bar". You often don't need
dependencies quite that strong:

  http://stackoverflow.com/a/8637556/110204
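
The difference between the two kinds of dependency can be sketched in a
few lines of Python. This is only an illustration: the function names and
the idea of expressing the looser dependency as a version range are
assumptions made for the example, not something subrepos or any particular
tool provide.

```python
def satisfies_exact(pinned_revision, available_revision):
    """Subrepo-style dependency: only one exact revision will do."""
    return available_revision == pinned_revision


def satisfies_range(minimum, below, available_version):
    """Looser dependency: any version in [minimum, below) will do."""
    return minimum <= available_version < below


if __name__ == "__main__":
    # foo pinned to exactly 0f1e621d3d3b of bar: a newer bar is rejected.
    print(satisfies_exact("0f1e621d3d3b", "0f1e621d3d3b"))
    print(satisfies_exact("0f1e621d3d3b", "aaaaaaaaaaaa"))
    # foo needing "bar >= 1.2 and < 2.0" (hypothetical version numbers).
    print(satisfies_range((1, 2), (2, 0), (1, 5)))
```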

I'm happy to see the talk about 0install, which I understand is a tool
that will manage the dependencies better.

Mercurial doesn't have anything like 'git subtree', though 'hg convert'
has the ability to rewrite history so that it looks like a subtree was
in a different place (with a file map). I think that could be used to
make a tool very similar to 'git subtree'.
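
The effect of such a file map on the paths in a repository can be
illustrated with a small Python sketch. It does not call or reimplement
'hg convert'; it only shows the path rewriting you would get from
including one subtree and renaming it to a new location. The remap
function and the sample paths are hypothetical.

```python
def remap(paths, subtree, destination=""):
    """Keep only files under `subtree` and move them under `destination`.

    An empty destination places the subtree's files at the top level.
    """
    prefix = subtree.rstrip("/") + "/"
    target = destination.rstrip("/")
    result = []
    for path in paths:
        if not path.startswith(prefix):
            continue  # files outside the subtree are dropped
        tail = path[len(prefix):]
        result.append(f"{target}/{tail}" if target else tail)
    return result


if __name__ == "__main__":
    files = ["libs/foo/a.hpp", "libs/foo/src/a.cpp", "libs/bar/b.hpp", "README"]
    print(remap(files, "libs/foo"))
    print(remap(files, "libs/foo", "foo"))
```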

Please let me know if you have more questions about subrepos -- since I
was responsible for adding many of the --subrepos flags in Mercurial, I
know the capabilities quite well.

-- 
Martin Geisler

Mercurial links: http://mercurial.ch/