
WARNING: This is a long post. The tl;dr version: 1) follow the Linux kernel development model; 2) employ a web-of-trust system using GPG keys; 3) create a Boost Foundation akin to the Linux Foundation; 4) lower the barrier to entry for would-be contributors; 5) use Git/Mercurial for distributed development. On Fri, Dec 17, 2010 at 11:23 PM, Jim Bell <Jim@jc-bell.com> wrote:
On 1:59 PM, John Maddock wrote:
Interesting. So if I wanted to get SVN access and start working on things in a private branch [...]
It depends - if the library has an active maintainer, then yes, you just ask the maintainer or file a ticket [...]
I accept that there may be an issue with there not being enough folks for the above to work all that well though. The basic problem is that to maintain quality we've generally required all library maintainers to already have one accepted Boost library, so I guess the question we're struggling with is how to broaden the field without risking screwing things up too badly :-0
The crux of Boost.Guild's debate. And so many topics touch on this.
So how would you measure, or design a test, to determine how badly things would get screwed up under various scenarios?
I think we need to look outside the box for a solution to this. Let me cite an example of a way to broaden the contributor pool without bogging down the release process or the development cycles of individual developers. The most successful example of a large open source project with tons of contributors and an active community is the Linux kernel. There Linus, the project's BDFL, chooses to trust maintainers to make the decisions about maintaining and improving the different subsystems. Anybody -- as in absolutely *anybody* -- is encouraged to clone the repository, make changes, and submit those changes to the maintainer(s) actively responsible for that part of the kernel.

You will find different kernels released by different maintainers, but the "de facto" kernel is the one Linus releases. Note that Linus doesn't check each and every line of code that goes into the kernel; he trusts a number of maintainers to do that for the subsystems they're responsible for. This "inner circle" is a small group, around ten people, who delegate their overall responsibility across a wider number of subsystem maintainers.

For your code to reach the "mainline" kernel, you typically submit it to the maintainer of the module you're patching, who shepherds it in by signing off on it and merging it into his repository; he then asks the maintainer of the subsystem his module belongs to to pull from him, and those maintainers in turn ask Linus to pull from their repositories when it comes time to stabilize and go through the release process. This sounds like a slow process, but because development is decentralized, people with different timelines and paces can work on different parts of the kernel without any one thing bogging the work down.
The release process does impose a code/feature freeze, but that just means the higher-up maintainers focus on stabilizing the code and cutting a release -- you can keep working in your own repository, changing whatever you want, and when you feel your work is worth pulling in, you ask someone else to pull from you. This model allows for faster innovation, greater involvement, and a lower barrier to entry.

Now, that doesn't remove the maintainer dilemma -- but the beauty of the system is that even when the maintainer of a subsystem suddenly goes MIA, the community can decide to pull from a different person's repository. Being the maintainer of a subsystem then stops being the prerogative of the original maintainers and becomes mostly the contributors' choice.

Let me explain a little more. If I'm developer A, and maintainer B is supposed to be handling module X, all I do is clone B's repository locally, make my changes, and ask B to pull them. I can send him the changes via email, post them publicly (signed with my GPG key), or expose my repository publicly so that anybody -- not just B -- can get them. That should be simple enough to understand. What happens when B goes MIA or unresponsive? Also simple: I ask someone else -- maybe Linus, maybe some other higher-level maintainer, or just someone the community already trusts -- to pull my changes in. Losing maintainer B is no longer a hindrance, because the community can start pulling from each other and stamping their trust and confidence on the code. Later on, the community effectively elects, by way of pull requests, whom it trusts to maintain a subsystem.

This may sound like a pie-in-the-sky dream, but it is already the reality of Linux kernel development -- the one project I know of that spans the globe with thousands of contributors. The model is proven to scale. So how does the trust system work?
The Linux development team uses GPG heavily -- your key needs to have been signed by others, and the people who signed your key must themselves be trustworthy (meaning their keys have in turn been signed by other trustworthy people, and so on). So having your key signed by Linus Torvalds means something: he is vouching for your credibility, your "realness". That web of trust keeps people honest, because if you start screwing up or doing something bad by community standards, people can revoke their signatures on your key -- which is like a no vote in parliament.

There are a lot of lessons Boost can certainly learn from the Linux kernel development process. One is to decentralize development and maintain only a "canonical" or "official" release of each library. With some people maintaining ports of a library to different architectures, others concentrating on warnings removal, and the web of trust tying it all together, the development process stays sustainable and scalable. Encouraging people to fork and innovate, then folding their forks back into the main line, is a good and scalable way of developing a system progressively. The release process then becomes a matter of the BDFL, or the community's trusted people, pulling from the maintainers and stabilizing to get a suitable release out.

What the Linux kernel project has that Boost doesn't (yet) is a Linux Foundation that actually funds the development effort. The Linux Foundation ensures that people who want to do kernel development full-time (like Linus and others like him) get compensated for the shepherding and the innovation -- there is, of course, a process for qualifying for Linux Foundation funding.
Note that this is different from the Apache Foundation, which has a business-oriented, parliamentary involvement process (one I once thought would be a good model for Boost, but changed my mind about after a few conversations at BoostCon 2010). A Boost foundation with stakeholders funding it -- to ensure that Boost keeps going as a project and to compensate the people who would work on it full-time but otherwise can't because of their employment (or lack thereof) -- would be a Good Thing (TM), IMHO.
Thinking out loud here... one option might be for someone to say "I'm going to try and give library X a decent update" and solicit the mailing list for bug/feature requests, then carry out the work in the sandbox and ask for a mini-review - at which point you're sort of lumbered with being the new maintainer ;-)
If someone is that motivated. But could something useful happen if ten people, each 1/10th as motivated, were to apply themselves?
I think the part about having to announce it on the mailing list and ask for permission is the wrong way to go about it. If someone wants to do it, they should just be able to do it -- and if their code is good and the community (or the people the community trusts) vote with their pulls, it gets folded in appropriately. As for having ten people work on it, I don't think that changes the situation. The current system -- where the maintainers are BDFLs of the libraries they "maintain", nobody else may take a library in a different direction, and the community has no say in the matter -- is, I think, not scalable. I would much rather have ten implementations of a Bloom filter, let people choose which one is better, and then have that implementation pulled into the main release. The same goes for all the libraries already in the repository.

Just to note, what I'm driving at here is the need to lower the barrier to entry into Boost while keeping a means of ensuring quality for the "official"/"canonical" Boost release. Right now there are already a handful of release managers who, I think, don't do release management full-time, but who could manage just pulling changes from projects that move at a different pace than the release process. This of course requires that the libraries be broken up into individual pieces and that the release process become a stabilization effort rather than an active development effort.

The issue of dependency management is, I think, overblown with hypothetical situations -- the Linux kernel is one monolithic kernel whose lower-level subsystem details still change every so often (things almost everything else depends on: the scheduler, memory management APIs, etc.), and they never had to complicate the matter of dependencies among the parts.
Of course, needless to say, Boost really ought to go with either Git or Mercurial to make this kind of distributed development trivial. ;) Have a good one, guys, and I hope this helps.

-- 
Dean Michael Berris
about.me/deanberris