Re: [boost] [git] Mercurial?

22 Mar 2012

Julian Gonggrijp <j.gonggrijp@gmail.com> writes:
...
What is the meaning of a commit?
One possible interpretation is that a commit is a snapshot of your
project. A snapshot is something that you store for future reference.
Because in a sense it's a form of documentation, one will take care to
submit well-crafted commits that include enough useful changes to
license a new snapshot. In principle, every commit is assumed to
introduce some form of progress compared to the previous. Making
changes to such a history of snapshots is almost necessarily a form of
fraud.
This is the kind of mental model of a commit that is stimulated by
svn. You can see it from the terminology: making a commit causes the
repository to move to the next revision number.
Another possible interpretation is that a commit represents a unit of
work. This tends to favour many small commits over few big commits.
Since anything you do before you're sure that it's the right thing is
also work, shabby commits are part of the deal. The consequence is
that it must be very cheap to isolate any messy state in temporary
side tracks. Now the VCS is not only a collection of snapshots, but
also a tool to manage your recent pieces of work before you finally
commit* to some of them.
This kind of mental model is stimulated by git. It explains why git
users make a fuss about amending, rebasing and efficient branching and
merging.
I'm afraid I don't agree with this. The version control systems that
came after CVS switched to storing repository-wide snapshots. CVS was a
collection of RCS files and so completely file-centric. SVN, Mercurial,
Git, ... are all repository-centric. Conceptually they work by storing a
series (linear or not) of working copy snapshots.

Darcs is a possible exception to this: I think it might fit more closely
to your unit-of-work model since it models the repository state as a
result of a number of patches (units-of-work).
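
To make the distinction concrete, here is a toy sketch of the two views.
It only illustrates the concepts and is not how Git, Mercurial or Darcs
actually store data; the names add_file and apply_all are made up for
the example.

```python
# Snapshot view: every commit records the complete tree.
snapshots = [
    {"README": "hello"},
    {"README": "hello", "main.c": "int main(){}"},
]


# Patch view: the state is whatever you get by applying the patches.
def add_file(name, content):
    """Return a patch (a function) that adds one file to a tree."""
    def patch(tree):
        new_tree = dict(tree)      # leave the input tree untouched
        new_tree[name] = content
        return new_tree
    return patch


def apply_all(patches):
    """Build the repository state by applying patches in order."""
    tree = {}
    for patch in patches:
        tree = patch(tree)
    return tree


patches = [add_file("README", "hello"), add_file("main.c", "int main(){}")]

# Both views describe the same final state.
final_tree = apply_all(patches)
```

In the snapshot view the history is the list itself; in the patch view
the history is the set of changes, and a snapshot is derived from them.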
...
There is no point in arguing that one mental model is superior to the
other until you fully grasp both of them. I urge anyone who feels
tempted to make agitated remarks to let this sink in for at least a
few hours.
That said, I'm confident enough to think that I can give two solid
arguments why the units-of-work model is ultimately more productive.
I also prefer people to do five units-of-work (commits) instead of one
huge one. Smaller commits should be more self-contained and will be
easier to review.

But I much prefer that the project is in a working state after every
single unit-of-work. This is because I think of each commit as a
snapshot that might be checked out alone.

This commonly happens when using the bisect command: the tool (Git or
Mercurial) assists you in a binary search for a faulty commit. It
updates to the middle of the unchecked range, and you have to build and
test that commit. When doing that, it's annoying to run into commits
that fail because of problems other than the one you're investigating.
Such false positives make automated bisecting impossible.

Each project must make up its own policy here.

-- 
Martin Geisler

aragost Trifork
Professional Mercurial support
http://www.aragost.com/mercurial/