Re: [boost] RE process (prospective from a retired FreeBSD committer)...

31 Jan 2011

      At Mon, 31 Jan 2011 12:27:24 +0300,
Vladimir Prus wrote:
...
Dave Abrahams wrote:
...
...
...
...
2. Our 'merge to release branch' process has the advantage that
even if somebody breaks library X in trunk completely, he might
just forget to merge it to release. So, we get improved stability,
at the cost of slow progress. It's often the case that somebody
forgets to merge changes to release branch, especially for
patches and fixes applied outside of one's official libraries.
So, in comparison, to achieve the same rate of changes and quality,
(1) requires discipline with commits and checking test results,
while (2) requires the same, and additional dancing and coordination
with merges. So, (2) is strictly more complex and error prone, and
should only be done if specific maintainers want separate branches
for their components.
Anyone can make a separate branch at any time no matter which
procedure is used.
And? (2) still requires more effort.
No argument.  I am just saying that the one condition under which
you say 2 "should only be done" isn't even really valid.
I disagree. Everybody can make a branch, but that's always additional 
overhead, and it's extra overhead when you have to coordinate merging
(or other form of inclusion)
Again, the form of inclusion is a single command that says "update my
idea of Boost to reference the latest release of every submodule."
...
of that branch into release with release managers. So, you are
proposing that using a branch and coordinating with release managers
be required from every Boost developer, which is just creating more
work.
Meh.  What coordination?
...
...
...
"will"? In practice, right now, developers do forget.
This line of argument eventually devolves into "library maintainers
will forget to check their changes in at all, so they should work on a
network share that we snapshot for release."  I think marking a change
as ready for release should be a fairly conscious decision, and one
that's separate from saying "let's test this code out."  As long as
that's the case, someone can forget.  IMO the trade-off is well worth
it.
I think that running "xxx commit" should be a conscious decision. It's
not at all clear to me that after that, additional conscious decisions
should be required.
Understood.  Some people don't like to do their work on a branch.  It
does cost one extra step when you're ready to declare the work
production-ready and for public consumption.

If you really want to do that, you can commit all your work on the
STABLE branch of your repository and it will get sucked into the next
Boost release automatically (unless it causes problems).
...
...
...
And again, there are often commits made by those who are not
official maintainers of library X (because X might not even have
maintainers). The chances of such changes falling through are
even higher.
I don't see how.  Right now someone has to explicitly decide which
changes are moved to the release branch, and something equivalent
will happen
Right now, any developer can do this himself, without depending on
somebody else to pick up the changes.
IIUC you are really pointing to the fact that every developer can
modify every file in Boost, and thus can make changes to any library
that will automatically go out in the next release unless someone
reverts them.  We could maintain that system of course.  I don't think
it would be a very good idea, but we could.
...
...
...
...
...
2. Because 'git cherry-pick' is fundamentally unable to record
any mergeinfo, this means that any time you want to merge a
single specific fix to release branch, you would have problems.
What release branch?  Who said anything about cherry-pick?  What
problems?
Suppose you have library X with 200 new changes. For next release, it is
necessary to include one of those changes.
You cherry-pick that commit onto the branch in X's repo that currently
points at X's last release.
And as we've determined, we have chances of running into merge conflicts
later.
Yes, it can happen, and this is one scenario where SVN will outperform
Git.  However, it doesn't happen often, and apparently if you're using
Git "properly" you don't really get into such a situation.  If I apply
the "logic of the Git ethic" here, I come up with this:

* Your 200 new changes are not yet publicly available (in a
  non-volatile branch), so if they're visible at all they are in a
  "look-but-don't-touch" condition

* You cherry-pick the one change onto that release branch

* You *immediately* rewrite the branch that contained the change so
  that it:

  a) no longer contains that change, and
  b) is rebased on the release branch

Operating strictly via this ethic is a new concept to me.  I have
violated it blatantly in the past and seldom run into merge conflicts,
so you can probably usually get away with not following it. However, I
think the more collaborators you have, the more necessary it becomes.
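
As a rough illustration of the three steps listed above, here is a toy Python model of the bookkeeping. It is not Git itself: the `Commit` class, the `cherry_pick` and `rebase` helpers, and all identifiers such as `r1` or `t3` are hypothetical, and five pending changes stand in for the 200 in the example. The assumption being modeled is that a rebase skips any change whose content is already present on the target branch, which is how the rewritten branch ends up no longer carrying the cherry-picked fix as a separate commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str    # hypothetical identifier standing in for a commit hash
    patch: str  # stands in for the textual change the commit introduces


def cherry_pick(branch, commit, new_sha):
    """Copy one change onto a branch; the copy gets a new identity."""
    return branch + [Commit(new_sha, commit.patch)]


def rebase(topic, onto):
    """Replay topic's own commits on top of `onto`, skipping changes already there."""
    shared = {c.sha for c in onto}       # commits both branches already have
    applied = {c.patch for c in onto}    # changes already present on the target
    replayed = [
        Commit(c.sha + "'", c.patch)     # replayed commits get new identities
        for c in topic
        if c.sha not in shared and c.patch not in applied
    ]
    return onto + replayed


base = [Commit("r1", "last release")]
release = list(base)
topic = base + [Commit(f"t{i}", f"change {i}") for i in range(1, 6)]

fix = topic[3]                             # the one change the release needs
release = cherry_pick(release, fix, "r2")  # cherry-pick onto the release branch
topic = rebase(topic, release)             # drop the fix and rebase in one step

print([c.patch for c in release])  # ['last release', 'change 3']
print([c.patch for c in topic])
# ['last release', 'change 3', 'change 1', 'change 2', 'change 4', 'change 5']
```

After the rewrite, the topic branch starts with the full release history and contains the fix exactly once, so a later merge of the topic into the release has nothing to reconcile for that change.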
...
...
...
Second, could you name, exactly, all libraries whose developers
expressed a need for individual releases?
No, of course I can't name all of them exactly.  Can you?  I doubt
it.
It's you who is on the mission to prove this modularization effort
is gonna help anybody.
No, I know I can't prove any of this... and frankly I don't think I
need to.  Most people understand that decoupling, where practical, is
a good idea and that a monolithic Boost has lots of downsides.
...
...
...
...
...
...
which, together with delays added by release managers, means that
a fix to a component will be added to the unified thing after
considerable delay
I don't see where the delay comes from.
Either release managers look at the thing they merge (which adds
considerable delay), or they don't, in which case direct push to
release branch is more efficient.
They don't look, unless there are test failures or other problems.
And they don't merge.  At most we're talking about updating submodule
references.
Which still has to be done manually?
No.  From above: "The individual library developers determined which
versions will be automatically pulled into the unified thing" by which
I mean there's a "button" release managers can push to assemble a
collection of the latest release versions of all the libraries in the
collection.
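
A minimal sketch of what such a "button" might compute, assuming each library's repository exposes the revision its own release branch currently points at. The function names, the manifest layout, and the revision identifiers are all hypothetical; only the library names are real Boost libraries. Nothing is merged here: the collection just records one pinned revision per library.

```python
# Hypothetical state: the revision each library's release branch points at now.
release_heads = {"any": "a3f", "filesystem": "9c1", "spirit": "e77"}

# Hypothetical manifest recorded the last time the button was pushed.
previous_manifest = {"any": "a3f", "filesystem": "5b0", "spirit": "e77"}


def push_button(heads):
    """Pin every library at whatever its maintainers last marked as released."""
    return dict(heads)


def changed(old, new):
    """List the libraries whose pinned revision differs between two manifests."""
    return sorted(lib for lib in new if old.get(lib) != new[lib])


manifest = push_button(release_heads)
print(changed(previous_manifest, manifest))  # ['filesystem']
```

The list of changed libraries is what the subsequent test period would need to focus on; libraries whose release branch did not move are pinned at the same revision as before.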
When is this button pushed? If it's pushed right before release, it
means the release state was never tested together.
It's pushed some suitable time before the next scheduled release, and
then there's a period where that release state is tested, examined,
and prepared to go out.
...
If you want to have it pushed regularly, what guarantees are there
that it will be pushed?
C'mon, man!  The release managers have to do *something* every three
months.  Pushing a button doesn't seem too burdensome.
...
And if you want it to be pushed automatically, why not
just have developers push into the release branch directly?
They do!  They push into the release branch of their library's
repository.  That's the only release branch there is.  Keeping the
modules well-separated *guarantees* that at integration time there's
no chance of merge conflicts.  They are simply checked out into
separate directory trees.
...
...
IIUC there is strong interest within at least part of the KDE
community in following a Ryppl-like modularized approach.  They have
been looking into various kinds of package management infrastructure
for this job.  I'll be happy to try to find out more if you want.
Mailing list pointers will help.
working on it.
...
...
...
Maybe that suggests that we're actually solving the wrong problem?
Exactly the opposite.  KDE's moving in that direction IIUC; FreeBSD is
moving in that direction.  I'm confident we could quickly find 10
other projects that are trying to modularize and zero modularized
projects that are trying to switch to a monolithic organization.  When
have you ever heard of that happening?
Nice question. Let me avoid answering it
No, thanks.  Please answer.  A one-word answer is easy to type :-)
...
and ask whether you are aware of many modularizations whereby a
component the size of boost::any is split into a separate git
repository?
Yes, it happens very commonly with emacs packages.
...
...
...
...
The problem isn't where the code is located, it's:
a. how much coordination is required to get things done
Yes. And given that right now, coordination basically boils down to
library authors merging their things before deadline (or not merging),
and given that merge is a single command,
svn switch, svn merge, svn switch == a single command?
I somehow manage with just svn merge ;-)
You at least have to commit.  If you do most of your work on the
trunk, you must keep the release branch checked out somewhere all the
time, so you don't need to switch.  I eventually started doing that
too.
...
...
But anyway,
that's irrelevant because...
...
it's not clear to me how you are improving on that.
...it's not about the number of commands, it's about the sense of
speed and freedom with which library developers can operate.
Okay. Though for me, nothing beats the speed and freedom of
having my changes put on the release branch directly.
Fair enough.  I am the first to admit this is highly subjective.

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com