
Going back to a very early post, Daniel James wrote:
2009/5/21 troy d. straszheim <troy@resophonic.com>:
That's true, of course. And of course this has little to do with the practice of repeatedly testing and 'merging around' unmaintained, unreleased code.
Generally, it isn't that much of a problem, since most developers just deal with a couple of directories. It isn't great, but for the most part we can just ignore most of boost. I suppose if you're working on a new build system, you can't.
I think there are a variety of situations in which one can't afford to ignore most of boost, or deal with only one 'component' at a time. Boost 'power users' (a term I've started using and am sure will grow to hate, apologies in advance) may want to maintain patchsets against the entire trunk, for instance. A GSOC student who is doing a port of fusion to c++0x might temporarily need any number of workarounds sprinkled throughout boost.
Sorry, my mail was too negative.
I'm finding your follow-up very productive (thanks), no worries.
[snip] If that results in a better testing system and better modularization, that could improve the situation massively - having a single tree for such disparate libraries that are developed at different rates doesn't help.
There's an ongoing assumption that modularization will help, and some of it I am fuzzy on. It helps to have headers and src under the same directory - I think there is wide agreement on that - but why?

- It enables one to more easily trim down boost to prepare subdistributions. Uncontroversial.
- It enables you to put boost components on the ends of svn:externals and version them separately: very controversial. (A sketch of what this might look like follows below.)
- And the one I'm interested in discussing now: it enables one to more easily 'switchmerge' a single library. This is true, but isn't the need for it just a consequence of the fact that merging in svn is onerous/broken?
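For concreteness, the svn:externals approach might look something like this (old-style externals syntax; repository URLs and revision numbers are invented for illustration) - a top-level 'superproject' pins each library to its own repository at its own revision:

    $ svn propget svn:externals .        # hypothetical output
    boost/spirit -r53210 http://svn.example.org/spirit/trunk/boost/spirit
    libs/spirit  -r53210 http://svn.example.org/spirit/trunk/libs/spirit
    boost/math   -r53102 http://svn.example.org/math/trunk/boost/math
    libs/math    -r53102 http://svn.example.org/math/trunk/libs/math

Each component can then be updated and versioned independently, which is presumably where the controversy comes from: no two checkouts need contain the same combination of revisions.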
I'm just being careful, trying to establish how much of this is by design, and how much has just 'evolved' on its own...
It hasn't entirely evolved on its own. A few years back the release and trunk branches were a complete mess (the current situation looks very clean in comparison). We cleaned up a lot of the conflicts and established the rule that changes had to be made to trunk and go through regression testing before being merged to release, which has made things a lot better. It isn't ideal, but it works a lot better than what we used to have, which might be why we generally feel content.
Yes, clearly an improvement.
The problem is that we can't branch release from trunk, as we don't know what new stuff is release-worthy. And we don't want to overwrite trunk from release, as that might lose something and would interfere with more long-term development.
It seems clear that the problem is lack of granularity. The trunk is a rather full, messy waiting room for the release branch. In the run-up to a release, the release manager has little say in what uses testing resources.
Working on a branch is a nice idea in theory, in practice it turns out to be a pain since boost is so large, especially with subversion.
Is the prioritization backwards here? The linux kernel is twice as big. They can merge in seconds, and they do (on the order of 4x/day, iirc).
And we really do need to get code into the regression testing system as soon as possible. With git it is a little better, as you can create a micro branch and rebase it, but git isn't really in a state where most boost developers could use it. Modularization would help a lot.
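For readers who haven't tried it, the micro-branch workflow is roughly the following (branch and remote names are made up for the example):

    $ git checkout -b fix-spirit-include origin/trunk  # tiny topic branch
    $ git commit -am "add missing include"
    $ git fetch origin                                 # meanwhile, trunk moves on
    $ git rebase origin/trunk fix-spirit-include       # replay the fix on the new tip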
Let me push back at you a bit here. It is easy to provide a git setup that behaves nearly identically to subversion, if one wants to do things in a centralized way, i.e. just checkout, update, local diff, commit.

Rebasing is nice in that it allows one to forward a nice linear history to the upstream repository, but IIUC it was developed only after the linux kernel folks realized that looking at commit history was actually useful, and they started to have opinions on what looks good. In any event, the fact that code has been rebased is invisible to public repositories.

As I currently see it, the primitive we lack is the release/testing manager's ability to efficiently *merge* the next chunk of code from downstream into the testing queue (something like the sketch below). Whether the code he's merging is the result of nice clean rebases or a spiderweb of merges is irrelevant, at least until people start to care about how merge history looks (which may be never).

So, does modularization help a lot because it allows one to merge by copying directories around? It does appear to have other costs, for instance having to use scripts to construct include directories full of forwarding headers, potentially combined with walls of svn:externals, etc.
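To illustrate the merge primitive I have in mind (repository and branch names are invented), the release/testing manager would do something like:

    $ git remote add spirit-dev git://example.org/spirit-dev.git  # one-time setup
    $ git fetch spirit-dev
    $ git merge spirit-dev/for-testing  # pull the next chunk into the testing queue
    # run the regression tests; if they pass, merge onward toward release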
If you look at your summary of differences, many of them are in Boost.Spirit and Boost.Math, which are well maintained and are perhaps long-term changes that aren't ready for release yet. But there are a lot of small changes in largely unmaintained libraries - I think they just need people to go through them and clean them up.
I think we are able to automatically identify which libraries most headers belong to, so a tool which summarized the unmerged headers might help. Anything which is left unmerged and untouched for some time would be a cause for concern.
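A rough sketch of such a tool, assuming the usual trunk and release URLs and a crude path-to-library mapping (top-level headers like boost/shared_ptr.hpp won't be attributed correctly):

    #!/bin/sh
    # Summarize, per library, how many paths differ between release and trunk.
    TRUNK=http://svn.boost.org/svn/boost/trunk
    RELEASE=http://svn.boost.org/svn/boost/branches/release
    svn diff --summarize "$RELEASE" "$TRUNK" \
      | sed -e 's|^[A-Z! ]* *||' -e "s|$RELEASE/||" -e "s|$TRUNK/||" \
      | awk -F/ '$1 == "boost" || $1 == "libs" { print $2 }' \
      | sort | uniq -c | sort -rn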
Right, there are ways to get in there and clean up the trunk; this is tractable. But it seems that if the system remains unchanged, the same situation will naturally evolve again. Thanks again for your time and patience, -t