What's happened to Ryppl?

At Tue, 11 Jan 2011 10:36:56 +0000, Robert Jones wrote:
What's happened to Ryppl?
Fair enough; now that I'm digging myself out of the new year's pile-up, it's time for a status update. [Replies will go to the ryppl-dev list by default; see http://groups.google.com/group/ryppl-dev/subscribe for information on posting]

---------

There are basically three parallel efforts in Ryppl:

I. Modularize Boost

* Eric Niebler produced a script to shuffle Boost into separate Git repositories. He has been maintaining that script to keep up with minor changes in the Boost SVN, but doesn't have time at the moment to do much more. Fortunately, I don't think there's much more to be done on that.

* John Wiegley has developed a comprehensive strategy for rewriting SVN history to preserve as much information as possible about the evolution of everything, and he's working on implementing that. I expect results soon.

II. CMake-ify the modularized Boost

* A bunch of work has been done on this, but we never got to the point where *everything* built, installed, and passed the tests.

* Denis Arnaud has been maintaining a (non-modularized) Boost-CMake distribution; see https://groups.google.com/group/ryppl-dev/msg/b619c95964b0e003?hl=en and others in that thread for details.

These two efforts can be merged; I'm sure of it.

III. Dependency management

* I have been working on libsat-solver. Sat-solver is the underlying technology of the zypp installer used by OpenSuse; it contains all the important bits needed by any installation and dependency-management system, and it has the right licensing. It's a CMake-based project.

* These are the jobs:

1. Porting to Mac. I did a good chunk of this job (http://gitorious.org/opensuse/sat-solver/merge_requests/2 --- including submitting some CMake patches upstream!) but there's still more to do. Since sat-solver includes all kinds of Ruby bindings and whatnot that we don't really need for this project, those parts probably need to be ported in order for the changes to be accepted upstream.

2. Replace the example program's use of RPM with Git.

3. Port to Windows. Mateusz Loskot made a bunch of progress on this (http://groups.google.com/group/ryppl-dev/browse_thread/thread/7292998aadb04b...) but it's not yet complete.

-------------

Our priorities, in descending order, are:

1. Set up buildbots for modularized Boost so we can test work on the CMake-ification. This step will also serve as a proof of concept for modularized Boost's testing infrastructure.

2. Complete the CMake-ification.

3. History-preserving modularization: it's an estimate of course, but I expect John to have that handled within a few weeks.

4. Do the dependency-management part.

As usual, we welcome your assistance, participation, and interest! If there's any part of this that you'd like to work on, please speak up. Regards, -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams
* John Wiegley has developed a comprehensive strategy for rewriting SVN history to preserve as much information as possible about the evolution of everything, and he's working on implementing that. I expect results soon.
I wanted to chime in here and say hello; and that yes, I'm working on the modularization project with input from both Dave and Eric. I have the beginnings of the migration script up here: https://github.com/jwiegley/boost-migrate

It's very rough right now, as I'm still exploring the completeness of the Subversion 'dump' format, and how to use Git plumbing to avoid the migration process taking days upon days to complete.

I fully expect that when completed, the migration script will not only exactly replicate the existing Subversion repository, revision for revision (with some revisions being omitted if they only change properties/directories, and others being split if their transactions affect multiple branches simultaneously), but it will also modularize that history at the same time, preserving as much relevant history within each module as possible.

If I had full-time to work on this, I'd expect the script to be completed within 3-4 days. Since my present workload gives me only an hour or so each day to work on it, it may be a couple weeks before I can invite sincere criticism. Until then, feel free to add yourself as a watcher on the project, and I'll post changes to it as I progress. John
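For anyone unfamiliar with the Git plumbing John refers to, the low-level commands involved look roughly like this (the file name, message, and SHA1s are hypothetical placeholders):

    # store a file's contents as a blob object; prints the blob's SHA1
    git hash-object -w boost/config.hpp
    # build a tree object from an ls-tree-formatted listing on stdin
    git mktree < tree-entries.txt
    # wrap a tree in a commit object, reading the log message from stdin
    echo "r12345 from SVN" | git commit-tree <tree-sha1> -p <parent-sha1>

Shelling out to these once per file or tree is what makes a naive conversion take days; creating the same objects in-process is the optimization being explored.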

On Jan 14, 2011, at 2:43 PM, John Wiegley wrote:
I wanted to chime in here and say hello; and that yes, I'm working on the modularization project with input from both Dave and Eric. I have the beginnings of the migration script up here:
https://github.com/jwiegley/boost-migrate
It's very rough right now, as I'm still exploring the completeness of the Subversion 'dump' format, and how to use Git plumbing to avoid the migration process taking days upon days to complete.
Quick status update: Direct conversion from a Subversion flat-filesystem to an identical Git flat-filesystem now works. It takes 8 GB of RAM and a lot of time to run, but that can be optimized fairly easily.

Next step is to read in a corrected branches.txt file (this is currently generated based on heuristics by the 'branches' subcommand), and then use that information to output a branchified object hierarchy instead of a flat one. This step should be very easy to implement.

After that is reading Eric's manifest.txt file and using the information to produce multiple submodules during the repository conversion process. This step is quite a bit trickier, and will require a few days to get right. John

At Sat, 22 Jan 2011 08:01:04 -0500, John Wiegley wrote:
On Jan 14, 2011, at 2:43 PM, John Wiegley wrote:
I wanted to chime in here and say hello; and that yes, I'm working on the modularization project with input from both Dave and Eric. I have the beginnings of the migration script up here:
https://github.com/jwiegley/boost-migrate
It's very rough right now, as I'm still exploring the completeness of the Subversion 'dump' format, and how to use Git plumbing to avoid the migration process taking days upon days to complete.
Quick status update: Direct conversion from a Subversion flat-filesystem to an identical Git flat-filesystem now works.
So now you have a linear sequence of commits that reflect the state of the entire SVN tree?
It takes 8 GB of RAM and a lot of time to run, but that can be optimized fairly easily.
And incrementalized, by any chance?
Next step is to read in a corrected branches.txt file (this is currently generated based on heuristics by the 'branches' subcommand), and then use that information to output a branchified object hierarchy instead of a flat one. This step should be very easy to implement.
After that is reading Eric's manifest.txt file and using the information to produce multiple submodules during the repository conversion process. This step is quite a bit trickier, and will require a few days to get right.
Sounds like you're making great progress; keep it up! -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams
Quick status update: Direct conversion from a Subversion flat-filesystem to an identical Git flat-filesystem now works.
So now you have a linear sequence of commits that reflect the state of the entire SVN tree?
Exactly. I'm 98% of the way toward a branchified sequence today.
It takes 8 GB of RAM and a lot of time to run, but that can be optimized fairly easily.
And incrementalized, by any chance?
Not yet; still just getting the basics to work. John

At Mon, 24 Jan 2011 12:41:27 -0500, John Wiegley wrote:
Dave Abrahams
writes: Quick status update: Direct conversion from a Subversion flat-filesystem to an identical Git flat-filesystem now works.
So now you have a linear sequence of commits that reflect the state of the entire SVN tree?
Exactly. I'm 98% of the way toward a branchified sequence today.
Awesome!
It takes 8 GB of RAM and a lot of time to run, but that can be optimized fairly easily.
And incrementalized, by any chance?
Not yet; still just getting the basics to work.
Check. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams
Exactly. I'm 98% of the way toward a branchified sequence today.
Awesome!
OK, branchification is working! All that's left is submodulization as part of the same run. This will actually not be very difficult, just time consuming to run. I'll use Eric's manifest.txt file, plus 'git log --follow -C --find-copies-harder' on each element of each submodule, run against the flat history. The man page says this is an O(N^2) operation -- where N is very large in Boost's case -- so I may end up having to do some pruning to keep it from getting out of hand.

Actually, the speed of this script is already too slow, so I'm rewriting it in C++ today, both for the native speed increase (10x so far, for dump-file parsing) and because it lets me use libgit2 (https://github.com/libgit2) to create Git objects directly, rather than shelling out to git-hash-object and git-mktree over a million times. That alone takes over 15 hours to do on my Mac Pro. Don't even ask how long the git gc takes to run! (It's longer.)

If anyone wonders whether my process -- which works for any Subversion repo, btw, not just Boost -- preserves more information than plain git-svn: consider that my branchified Git has just over one million Git objects in it, while the boost-svn repository on ryppl has only 593026 right now. That means over 40% of the repository's objects got dropped on the cutting floor by git-svn's heuristics. John
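For concreteness, the per-path pass John quotes would be run against the flat history once per file, along these lines (the path is hypothetical):

    # trace one file through renames and copies across the whole flat history
    git log --follow -C --find-copies-harder --oneline -- boost/iterator/iterator_facade.hpp

With --find-copies-harder, every unmodified file in each commit is considered as a potential copy source, which is where the O(N^2) cost comes from.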

On Tue, Jan 25, 2011 at 11:36 AM, John Wiegley
Dave Abrahams
writes: Exactly. I'm 98% of the way toward a branchified sequence today.
Awesome!
OK, branchification is working! All that's left is submodulization as part of the same run.
Way cool!
Actually, the speed of this script is already too slow, so I'm rewriting it in C++ today both for the native speed increase (10x so far, for dump-file parsing), and because it lets me use libgit2 (https://github.com/libgit2) to create Git objects directly, rather than shelling out to git-hash-object and git-mktree over a million times. That alone takes over 15 hours to do on my Mac Pro. Don't even ask how long the git gc takes to run! (It's longer).
Interesting. I'd love to see the C++ version too. :)
If anyone wonders whether my process -- which works for any Subversion repo, btw, not just Boost -- preserves more information than plain git-svn: consider that my branchified Git has just over one million Git objects in it, while the boost-svn repository on ryppl has only 593026 right now. That means over 40% of the repository's objects got dropped on the cutting floor by git-svn's heuristics.
Coolness! :D So now I think it's a matter of convincing the other peeps that moving from Subversion to Git is actually a worthwhile effort. ;) -- Dean Michael Berris about.me/deanberris

On 1/25/2011 11:49 AM, Dean Michael Berris wrote:
On Tue, Jan 25, 2011 at 11:36 AM, John Wiegley
wrote: OK, branchification is working! All that's left is submodulization as part of the same run. <snip>
Coolness! :D So now I think it's a matter of convincing the other peeps that moving from Subversion to Git is actually a worthwhile effort. ;)
A lot of work remains --- that is, if it's also our intention to modularize boost and have a functioning cmake build system, too. At least, modularization seems like it would be a good thing to do at the same time. And nobody is working on a bjam system for modularized boost.

<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?

-- Eric Niebler BoostPro Computing http://www.boostpro.com
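For what it's worth, the closest existing tool for that kind of bridge is git-svn, which can mirror an SVN repository and commit back to it (a sketch; the URL is illustrative):

    git svn clone http://svn.boost.org/svn/boost/trunk boost-git   # mirror svn history in git
    git svn dcommit    # push local git commits back to the svn server

Whether that scales to two-way sync against a *modularized* git tree is exactly the open question.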

On 25/01/11 06:05, Eric Niebler wrote:
<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?

I was about to ask the same, as I wanted to make stuff happen in MPL in a git environment and later merge the changes into the mainstream SVN. But if everything can be done without specific hoops, it'll be even better.
Props to all of you guys involved in this effort :)

On 25/01/11 05:05, Eric Niebler wrote:
On 1/25/2011 11:49 AM, Dean Michael Berris wrote:
On Tue, Jan 25, 2011 at 11:36 AM, John Wiegley
wrote: OK, branchification is working! All that's left is submodulization as part of the same run. <snip>
Coolness! :D So now I think it's a matter of convincing the other peeps that moving from Subversion to Git is actually a worthwhile effort. ;)
A lot of work remains ---
Regarding one bit still under construction: sat-solver https://github.com/mloskot/sat-solver and if it's still wanted, I am going to continue porting it to Visual C++ in ~2 weeks. If anyone would like to join the effort, please do! Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org

At Tue, 25 Jan 2011 10:24:01 +0000, Mateusz Loskot wrote:
On 25/01/11 05:05, Eric Niebler wrote:
On 1/25/2011 11:49 AM, Dean Michael Berris wrote:
On Tue, Jan 25, 2011 at 11:36 AM, John Wiegley
wrote: OK, branchification is working! All that's left is submodulization as part of the same run. <snip>
Coolness! :D So now I think it's a matter of convincing the other peeps that moving from Subversion to Git is actually a worthwhile effort. ;)
A lot of work remains ---
Regarding one bit still under construction: sat-solver
https://github.com/mloskot/sat-solver
and if it's still wanted, I am going to continue porting it to Visual C++ in ~2 weeks. If anyone would like to join the effort, please do!
Dude, you rock!! I'm so glad to hear that you're intending to do that. I will be glad to work with you on it. Now we have one person/organization *other than me* taking primary responsibility for each major part of the project:

1. Modularization - John W
2. CMake support - Kitware/Marcus Hanwell (note: one of Kitware's clients is actually paying for their work on this)
3. Metadata and dependency resolution - Mateusz Loskot

which leaves me free to do:

4. Automated testing
5. Project coordination

If anyone wants to take #4 off my hands that'd be awesome :-) -- Dave Abrahams BoostPro Computing http://www.boostpro.com

At Tue, 25 Jan 2011 12:05:46 +0700, Eric Niebler wrote:
<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?
Probably not *pure* insanity, but also perhaps not worth the trouble, IMO. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Tue, Jan 25, 2011 at 12:00 PM, Dave Abrahams
At Tue, 25 Jan 2011 12:05:46 +0700, Eric Niebler wrote:
<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?
Probably not *pure* insanity, but also perhaps not worth the trouble, IMO.
Still, doing a "big bang" conversion to Git all at one time is more than a notion. Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git? --Beman

On 1/27/2011 12:52 PM, Beman Dawes wrote:
On Tue, Jan 25, 2011 at 12:00 PM, Dave Abrahams
wrote: At Tue, 25 Jan 2011 12:05:46 +0700, Eric Niebler wrote:
<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?
Probably not *pure* insanity, but also perhaps not worth the trouble, IMO.
Still, doing a "big bang" conversion to Git all at one time is more than a notion.
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. I fear very much an "emperor's new clothes" situation where everyone is jumping on a bandwagon, because it is the latest thing to do, but no one is bothering to explain why this latest thing has any value to Boost.

Edward --
Edward Diener
I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. [...], but no one is bothering to explain why this latest thing has any value to Boost.
For my own development efforts, I've found Git to be an improvement over Subversion in the following ways:

1. Detached development. The ability to do incremental check-ins without requiring a network connection is a huge win for me.

2. Data backup. If every developer (more, every developer's computer) has a full copy of the history on it, that is more distributed and easier to obtain than making sure you have transaction-perfect replication of your master SVN repository. (Or, at least, it was for me.)

3. Experimentation. In my experience, branching is cheaper and much lighter-weight in Git than in SVN.

I do sympathize with the "harder than svn" complaint; I find it so myself. But having been left out in the cold a few times by having only SVN, I will certainly run my next project with git rather than svn. Also, it's not clear that Boost has the same level of contributor fan-in that is git's truest strength. Regards, Tony

The ability to do incremental check-ins without requiring a network connection is a huge win for me.
This point is really key for me too. I am using git for all my work and svn only for boost. I am currently working on an extension of the type traits libraries and it is a real pain not to be able to do:

- partial commits (= do not commit everything)
- local commits (= commit things but do not bother others, i.e. no need to wait for a perfect solution before committing)
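For concreteness, the two operations look like this in git (the path and message are hypothetical):

    # partial commit: stage only the hunks you choose, leave the rest out
    git add -p boost/type_traits/operators.hpp
    git commit -m "WIP: operator detection, first half"   # purely local
    # nothing reaches anybody else until an explicit 'git push'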
In my experience, branching is cheaper and much lighter-weight in Git than in SVN.
That is very interesting also: branching is very, very easy. For me cvs is 1, svn is 2 and git is 1000. I would love it if boost switched to git. Frédéric

We're having this discussion on the wrong list, IMO. I suggest moving
it to the developers' list.
2011/1/28 Frédéric Bron
The ability to do incremental check-ins without requiring a network connection is a huge win for me.
This point is really key for me too. I am using git for all my work and svn only for boost. I am currently working on an extension of the type traits libraries and it is a real pain not to be able to do:
- partial commits (= do not commit everything) - local commits (= commit things but do not bother others, i.e. no need to wait for a perfect solution before committing)
In my experience, branching is cheaper and much lighter-weight in Git than in SVN.
That is very interesting also: branching is very very easy.
For me cvs is 1, svn is 2 and git is 1000. I would love it if boost switched to git.
Frédéric
-- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
We're having this discussion on the wrong list, IMO. I suggest moving it to the developers' list.
May I suggest that, to keep the signal/noise ratio on the list at an acceptable level, we don't have a general VC shootout discussion. It will never lead to anything. Rather, the parties interested in having Boost switch to any version control system that is not SVN should propose a specific plan, including hosting, administration, adjustment of all scripts, new workflows for Boost maintainers and authors of proposed libraries, etc. It would be seriously not funny if 1000 messages later we find that git is about to solve every problem on earth, but there's nobody to do the *complete* transition and ongoing maintenance. - Volodya -- Vladimir Prus Mentor Graphics +7 (812) 677-68-40

On Fri, Jan 28, 2011 at 21:39, Dave Abrahams
We're having this discussion on the wrong list, IMO. I suggest moving it to the developers' list.
Frankly, I do find it interesting to read what those with a lot of experience--informed opinions--have to say about the matter! Bet I'm not the only one. Best, Dee

On 1/28/2011 7:49 AM, Diederick C. Niehorster wrote:
On Fri, Jan 28, 2011 at 21:39, Dave Abrahams
wrote: We're having this discussion on the wrong list, IMO. I suggest moving it to the developers' list.
Frankly, I do find it interesting to read what those with a lot of experience--informed opinions--have to say about the matter! Bet I'm not the only one.
The developer's list isn't restricted access, is it?

On Fri, Jan 28, 2011 at 8:53 AM, Eric J. Holtman
The developer's list isn't restricted access, is it?
No it is not. If you're interested in this sort of issue, please subscribe there. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 1/28/2011 2:12 AM, Anthony Foiani wrote:
Edward --
Edward Diener
writes: I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. [...], but no one is bothering to explain why this latest thing has any value to Boost.
For my own development efforts, I've found Git to be an improvement over Subversion in the following ways:
1. Detached development.
The ability to do incremental check-ins without requiring a network connection is a huge win for me.
What do you mean by "incremental checkins"? If I use SVN I can make as many changes locally as I want.
2. Data backup.
If every developer (more, every developer's computer) has a full copy of the history on it, that is more distributed and easier to obtain than making sure you have transaction-perfect replication of your master SVN repository. (Or, at least, it was for me.)
"More distributed" means nothing to me. Someone really needs to justify this distributed development idea with something more than "its distributed so it must be good".
3. Experimentation.
In my experience, branching is cheaper and much lighter-weight in Git than in SVN.
Please explain "cheaper and lighter weight" ? It is all this rhetoric that really bothers me from developers on the Git bandwagon. I would love to see real technical proof.

On 28 Jan 2011, at 13:42, Edward Diener wrote:
On 1/28/2011 2:12 AM, Anthony Foiani wrote:
Edward --
Edward Diener
writes: I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. [...], but no one is bothering to explain why this latest thing has any value to Boost.
For my own development efforts, I've found Git to be an improvement over Subversion in the following ways:
1. Detached development.
The ability to do incremental check-ins without requiring a network connection is a huge win for me.
What do you mean by "incremental checkins"? If I use SVN I can make as many changes locally as I want.
With 'git' you can commit those incremental checkins to your local repository. You can then decide later to either push them all up to the boost repository, merge them into a single commit, or abandon them.
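A sketch of that workflow, assuming the usual origin/master setup:

    git commit -am "checkpoint 1"     # local commit, no network needed
    git commit -am "checkpoint 2"     # another one
    git rebase -i origin/master       # optionally squash them into a single commit
    git push origin master            # publish the result...
    git reset --hard origin/master    # ...or throw the local commits away instead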
3. Experimentation.
In my experience, branching is cheaper and much lighter-weight in Git than in SVN.
Please explain "cheaper and lighter weight" ?
It is all this rhetoric that really bothers me from developers on the Git bandwagon. I would love to see real technical proof.
I'm not sure what you mean by "technical proof", however we switched from svn to git at work. It is very easy to say "apply the commits X, Y and Z from branch A to branch B", or to keep multiple branches in sync. We found this basically impossible to do in svn, where it is necessary to manually keep track of patches which need applying. boost appears to have a similar problem, requiring frequent manual diffs between head and release to find patches which have not been applied yet. Chris
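For the record, that request is nearly literal git, with X, Y, and Z standing for the commit ids:

    git checkout B           # move to the target branch
    git cherry-pick X Y Z    # apply those commits from branch A onto B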

Edward, all, greetings --
Edward Diener
What do you mean by "incremental checkins"? If I use SVN I can make as many changes locally as I want.
Hopefully others have clarified this point for us, but to be perfectly clear: What I mean by "incremental checkins" is that I can make changes on my local machine *and save each change in my local history*. When I push these out to the world at large, I can either keep the history the way it is, or condense multiple "bite-sized" check-ins to a single "plate-sized" feature addition.

E.g., if I make changes A, B, C, and D locally, I would commit *locally* between each change. If it turns out that commit B was incorrect or sub-optimal, I can use my local repo to do the equivalent of:

    revert D
    revert C
    revert B
    apply C
    apply D

As others have pointed out, there is a huge difference between "local change" and "tracking local commits". git and hg both provide the latter feature.
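In actual git commands, dropping B while keeping C and D might look like this (A standing for the commit id of the last good state):

    git rebase -i A   # an editor lists B, C, and D; delete the line for B,
                      # and git replays C and D on top of A

('git revert B' is the non-history-rewriting alternative: it adds a new commit that undoes B.)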
"More distributed" [w.r.t. data backup] means nothing to me. Someone really needs to justify this distributed development idea with something more than "its distributed so it must be good".
[This is not meant in any way to disparage the current boost infrastructure maintainers; when I ask a question, please read it as "how would I fill out a form" and not "I assume they haven't taken care of this".] What's the backup plan for the boost SVN repository? Who maintains it? What is the replication factor? Offsite backups? How often is restore capability verified? Are all backups checksummed? With distributed repositories, every developer has a complete copy of the entire (public) history of the project as well as any local changes they have made. Verification of backup/restore capability is given by the fact that it's done via the exact same operations that are required in everyday development. In both git and hg, all content is implicitly checksummed (by virtue of content being addressed primarily by SHA1, at least in git). (This isn't quite as ballsy as Linus pointing out that he gets away with just uploading tarballs, with his backup being taken care of by the many thousands that download his releases...)
Please explain "cheaper and lighter weight" ?
Please note, this might be from my inexperience, but I've found that the only effective way to work on a "private branch" in SVN is to check out the branch I care to modify in a separate directory. As an example of SVN making things painful, I'm working on a minor fix inside GCC. I had to check out the trunk (1.8GiB on disk, no idea how much over the network). Then I checked out the current release tag, another 1.8GiB of network traffic. Compare with a project that is distributed via mercurial. In this case, I have the trunk and two private branches, each for a different feature. The original checkout of trunk was about as expensive as for SVN; after that, though, I could do a local "clone" to get my private feature-development branches.
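To make the contrast concrete (names and URL are hypothetical; the svn commands assume you are inside a working copy):

    # SVN: the branch lives on the server, and using it means another checkout
    svn copy ^/trunk ^/branches/my-fix -m "start a private branch"
    svn checkout http://example.org/svn/gcc/branches/my-fix
    # hg: a local clone is a complete, independent branch -- no network at all
    hg clone gcc-trunk gcc-my-fix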
It is all this rhetoric that really bothers me from developers on the Git bandwagon. I would love to see real technical proof.
My apologies if my original post came across as propaganda; I was just trying to communicate what I found to be the distinct "in the real world" advantages of using a DVCS (git or hg) over a centralized VCS (SVN). (Granted, I suppose it's possible to have a *local* SVN server that one could use to do much of the same work as indicated above. I have no idea how painful that might be, though, and since the current leading DVCSs already solve the problem for me, I'm disinclined to try to find out.) And while my sympathies are primarily with git, I have (and do) work with projects that use hg, and I find them both vastly more pleasant than svn. I even contributed a trivial doc patch to hg; I found the learning curve, tool usage, and community response all incredibly pleasant. Best regards, Tony

Hi
Edward Diener
It is all this rhetoric that really bothers me from developers on the Git bandwagon. I would love to see real technical proof.
I'm not sure it will be of great help to you or others, but this page is meant to explain to people used to SVN the differences with decentralized source control tools and why they might be worthwhile: http://hginit.com/00.html (it's about mercurial/hg but the decentralized way of doing things is globally the same with git and bazaar). Hope it helps understand the requests. Joël Lamotte.

On Sat, Jan 29, 2011 at 06:42, Klaim
I'm not sure it will be of great help to you or others, but this page is meant to explain to people used to SVN the differences with decentralized source control tools and why it might be worth : http://hginit.com/00.html (it's about mercurial/hg but the decentralized way of doing things is globally the same with git and bazaar).
There's one great line in there that basically sums up everything you need to know: "[A DVCS] separates the act of committing new code from the act of inflicting it on everybody else." Just about everything that's different stems from that distinction. (Pushing and pulling are how you choose who to inflict with your code and by whose code you wish to be inflicted, respectively.) ~ Scott

AMDG On 1/28/2011 11:37 PM, Anthony Foiani wrote:
Please explain "cheaper and lighter weight" ?
Please note, this might be from my inexperience, but I've found that the only effective way to work on a "private branch" in SVN is to check out the branch I care to modify in a separate directory.
You can use svn switch to point your working copy at a different branch. In Christ, Steven Watanabe

Steven, greetings --
On 1/28/2011 11:37 PM, Anthony Foiani wrote:
Please note, this might be from my inexperience, but I've found that the only effective way to work on a "private branch" in SVN is to check out the branch I care to modify in a separate directory.
Steven Watanabe
You can use svn switch to point your working copy at a different branch.
Huh, I should have known/remembered that. Thanks for the tip! (It still seems that I'd want a full clone to work on multiple independent features simultaneously, but at least "cp -a" then "svn switch" on the copy would cut out the second grab over the network. The disk usage cost would be the same as how I use hg and git, too; I suspect that "git stash" would cover some of my use cases, but I am still learning.) Thanks again! Best regards, Tony
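Roughly, the combination described above (the branch name is hypothetical):

    cp -a trunk-wc feature-wc          # duplicate the working copy on local disk
    cd feature-wc
    svn switch ^/branches/my-feature   # repoint it; only the diffs cross the network
    # and the git analogue for parking half-done work:
    git stash       # shelve uncommitted changes
    git stash pop   # bring them back later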

Edward Diener
On 1/27/2011 12:52 PM, Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. I fear very much an "emperor's new clothes" situation where everyone is jumping on a bandwagon, because it is the latest thing to do, but no one is bothering to explain why this latest thing has any value to Boost.
Indeed. Also, why git rather than another DVCS such as Mercurial or bazaar? Personally, I find Mercurial much easier to use than git, and it has the same major advantages (which are essentially common to all DVCS systems).

Also, Mercurial works better on Windows than git does in my experience --- the git port for Windows is relatively recent, whereas Mercurial has supported Windows for a while. Since many of the boost developers use Windows I would have thought this was an important consideration. I haven't any personal experience of bazaar, so don't know how it fares in this regard.

The chief advantage of a DVCS over subversion is that you can do local development with full version control (including history) whilst offline, and then push/pull when online. Also, you can do incremental local commits, so you have the advantage of VC, without pushing unfinished changes to the main repository. Branching and merging tends to be easier too.

Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

On Fri, Jan 28, 2011 at 4:24 PM, Anthony Williams
Edward Diener
writes: On 1/27/2011 12:52 PM, Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. I fear very much an "emperor's new clothes" situation where everyone is jumping on a bandwagon, because it is the latest thing to do, but no one is bothering to explain why this latest thing has any value to Boost.
Indeed. Also, why git rather than another DVCS such as Mercurial or bazaar? Personally, I find Mercurial much easier to use than git, and it has the same major advantages (which are essentially common to all DVCS systems).
I have to be honest here and say up front that I have no idea what the features of mercurial are, so I have some questions with it in particular:

1. Does it allow for integrating GnuPG signatures in the commit messages/history? The popular way for certifying that something is "official" or "is signed off on by <insert maintainer here>" is through GnuPG PKI. This is what makes the Linux kernel dev organization more like a self-organizing matter.

2. Does it allow for compacting and local compression of assets? Git has a rich set of tools for compressing and dealing with local repositories. It also has a very efficient way of preserving objects across branches and what not.

3. Does mercurial work in "email" mode? Git has a way of submitting patches via email -- and have the same email read-in by git and parsed as an actual "merge". This is convenient for discussing patches in the mailing list and preserving the original message/discussion. This gives people a chance to publicly review the changes and import the same changeset from the same email message.

4. How does mercurial deal with forks? In Git a repository is automatically a fork of the source repository. I don't know whether every mercurial repo is the same as a Git repo though -- meaning whether the same repository can be exposed to a number of protocols and dealt with like any other Git repo (push/pull/merge/compact, etc.)
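For reference, Git's side of question 1 looks like this (the tag name is hypothetical; signing uses your default GnuPG key):

    git tag -s boost-1.46.0 -m "Boost 1.46.0"   # create a GPG-signed tag
    git tag -v boost-1.46.0                     # verify the signature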
Also, Mercurial works better on Windows than git does in my experience --- the git port for Windows is relatively recent, whereas Mercurial has supported Windows for a while. Since many of the boost developers use Windows I would have thought this was an important consideration. I haven't any personal experience of bazaar, so don't know how it fares in this regard.
I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?
The chief advantage of a DVCS over subversion is that you can do local development with full version control (including history) whilst offline, and then push/pull when online. Also, you can do incremental local commits, so you have the advantage of VC, without pushing unfinished changes to the main repository. Branching and merging tends to be easier too.
+1 -- Dean Michael Berris about.me/deanberris

On 28.01.2011 10:06, Dean Michael Berris wrote:
I have to be honest here and say up front that I have no idea what the features of mercurial are, so I have some questions with it in particular:

2. Does it allow for compacting and local compression of assets? Git has a rich set of tools for compressing and dealing with local repositories. It also has a very efficient way of preserving objects across branches and what not.

Mercurial has a completely different storage format from Git.

3. Does mercurial work in "email" mode? Git has a way of submitting patches via email -- and have the same email read-in by git and parsed as an actual "merge". This is convenient for discussing patches in the mailing list and preserving the original message/discussion. This gives people a chance to publicly review the changes and import the same changeset from the same email message.

Yes, Mercurial can format changesets as emails.

4. How does mercurial deal with forks? In Git a repository is automatically a fork of the source repository. I don't know whether every mercurial repo is the same as a Git repo though -- meaning whether the same repository can be exposed to a number of protocols and dealt with like any other Git repo (push/pull/merge/compact, etc.)

Hg and Git deal with forks pretty much the same way. There are some minor differences in the handling of anonymous branching within a single clone (i.e. what happens when you are not on the most recent commit and do a commit yourself), I believe. Hg actually has a plug-in that lets it push and pull to/from a Git server.
Also, Mercurial works better on Windows than git does in my experience --- the git port for Windows is relatively recent, whereas Mercurial has supported Windows for a while. Since many of the boost developers use Windows I would have thought this was an important consideration. I haven't any personal experience of bazaar, so don't know how it fares in this regard.
I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?

I don't think Git currently has any integration plug-ins like TortoiseHg (Explorer) or VisualHg (Visual Studio).
The thing I like most about Git over Mercurial is the extensive history rewriting capability (git rebase -i). Wonderful for cleaning up my local commit mess before pushing. Sebastian

I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?
I don't think Git currently has any integration plug-ins like TortoiseHg (Explorer) or VisualHg (Visual Studio).
There is a TortoiseGit that relies on MsysGit. The biggest worry with Git on Windows is that you don't have access to it from the command line; you have to launch git bash for this (and then you're screwed if you want to change disks). So TortoiseGit is the only hope for Git on Windows at the moment. Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher

On 28/01/11 10:57, Matthieu Brucher wrote:
I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?
I don't think Git currently has any integration plug-ins like TortoiseHg (Explorer) or VisualHg (Visual Studio).
There is a TortoiseGit that relies on MsysGit. The biggest worry with Git on Windows is that you don't have access to it from the command line; you have to launch git bash for this (and then you're screwed if you want to change disks)
That is not true. I never use the git bash. I add ${GITINSTALL}\bin to my PATH and voilà. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org

On Fri, Jan 28, 2011 at 8:57 AM, Matthieu Brucher
I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?
I don't think Git currently has any integration plug-ins like TortoiseHg (Explorer) or VisualHg (Visual Studio).
There is a TortoiseGit that relies on MsysGit. The biggest worry with Git on Windows is that you don't have access to it from the command line; you have to launch git bash for this (and then you're screwed if you want to change disks). So TortoiseGit is the only hope for Git on Windows at the moment.
I use it on the command line in Windows all the time.
Regards, -- Felipe Magno de Almeida

On Fri, Jan 28, 2011 at 18:33, Sebastian Redl
On 28.01.2011 10:06, Dean Michael Berris wrote:
4. How does mercurial deal with forks? In Git a repository is automatically a fork of the source repository. I don't know whether every mercurial repo is the same as a Git repo though -- meaning whether the same repository can be exposed to a number of protocols and dealt with like any other Git repo (push/pull/merge/compact, etc.)
Hg and Git deal with forks pretty much the same way. There are some minor differences in the handling of anonymous branching within a single clone (i.e. what happens when you are not on the most recent commit and do a commit yourself), I believe. Hg actually has a plug-in that lets it push and pull to/from a Git server.
I think this is a very important point for this discussion. Both Hg and bzr come with a plugin that lets them work with a git branch/server. Hence, as the current work with git is well underway and git is very popular out there, we might as well stick with it. Users who feel more comfortable using hg or bzr can, however, do so with no obstacle (assuming these plugins work well; I haven't tried advanced usage cases myself).
The thing I like most about Git over Mercurial is the extensive history rewriting capability (git rebase -i). Wonderful for cleaning up my local commit mess before pushing.
Yup, bzr can do such things too (also useful if you accidentally committed files in some revision that you'd like to get rid of in all revisions, though I believe that was a different command). Best, Dee

Dean Michael Berris
On Fri, Jan 28, 2011 at 4:24 PM, Anthony Williams
wrote: Edward Diener
writes: On 1/27/2011 12:52 PM, Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. I fear very much an "emperor's new clothes" situation where everyone is jumping on a bandwagon, because it is the latest thing to do, but no one is bothering to explain why this latest thing has any value to Boost.
Indeed. Also, why git rather than another DVCS such as Mercurial or bazaar? Personally, I find Mercurial much easier to use than git, and it has the same major advantages (which are essentially common to all DVCS systems).
I have to be honest here and say up front that I have no idea what the features of mercurial are, so I have some questions with it in particular:
For a quick summary of the similarities and differences, see http://stackoverflow.com/questions/1598759/git-and-mercurial-compare-and-con...
1. Does it allow for integrating GnuPG signatures in the commit messages/history? The popular way for certifying that something is "official" or "is signed off on by <insert maintainer here>" is through GnuPG PKI. This is what makes the Linux kernel dev organization more like a self-organizing matter.
Yes. See http://mercurial.selenic.com/wiki/GpgExtension
2. Does it allow for compacting and local compression of assets? Git has a rich set of tools for compressing and dealing with local repositories. It also has a very efficient way of preserving objects across branches and what not.
Mercurial does compress the repository. How it compares with git, I don't know.
3. Does mercurial work in "email" mode? Git has a way of submitting patches via email -- and have the same email read-in by git and parsed as an actual "merge". This is convenient for discussing patches in the mailing list and preserving the original message/discussion. This gives people a chance to publicly review the changes and import the same changeset from the same email message.
From Mercurial, you can export patches to a text file containing the diffs and a few headers, and import that text file into another repo, where it preserves the commit message. Is that the sort of thing you meant?
4. How does mercurial deal with forks? In Git a repository is automatically a fork of the source repository. I don't know whether every mercurial repo is the same as a Git repo though -- meaning whether the same repository can be exposed to a number of protocols and dealt with like any other Git repo (push/pull/merge/compact, etc.)
Your local repository can push/pull from any remote repository, and you can set up a default remote repo for "hg push" and "hg pull" without a repository path. I don't know the full set of protocol options; I use local and http access.
Also, Mercurial works better on Windows than git does in my experience --- the git port for Windows is relatively recent, whereas Mercurial has supported Windows for a while. Since many of the boost developers use Windows I would have thought this was an important consideration. I haven't any personal experience of bazaar, so don't know how it fares in this regard.
I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?
The old port was cygwin based, and was a real pain. I tried using msysgit and had a few problems, but it was an early version. It might be much better now. OTOH, Mercurial has always "just worked" for me, on both Windows and Linux. Like I said above, my personal opinion is that mercurial is easier to use. YMMV. I also know people who are big fans of bazaar, but I've never used it myself. Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

On Fri, Jan 28, 2011 at 6:34 PM, Anthony Williams
Dean Michael Berris
writes: I have to be honest here and say up front that I have no idea what the features of mercurial are, so I have some questions with it in particular:
For a quick summary of the similarities and differences, see http://stackoverflow.com/questions/1598759/git-and-mercurial-compare-and-con...
Thanks for the link -- that was a pretty long accepted answer. :)
1. Does it allow for integrating GnuPG signatures in the commit messages/history? The popular way for certifying that something is "official" or "is signed off on by <insert maintainer here>" is through GnuPG PKI. This is what makes the Linux kernel dev organization more like a self-organizing matter.
Okay.
2. Does it allow for compacting and local compression of assets? Git has a rich set of tools for compressing and dealing with local repositories. It also has a very efficient way of preserving objects across branches and what not.
Mercurial does compress the repository. How it compares with git, I don't know.
Okay.
3. Does mercurial work in "email" mode? Git has a way of submitting patches via email -- and have the same email read-in by git and parsed as an actual "merge". This is convenient for discussing patches in the mailing list and preserving the original message/discussion. This gives people a chance to publicly review the changes and import the same changeset from the same email message.
From Mercurial, you can export patches to a text file containing the diffs and a few headers, and import that text file into another repo, where it preserves the commit message. Is that the sort of thing you meant?
Well, not really -- git has git-format-patch, which actually crafts an appropriately encoded email message. Git also has support for importing patches from a mail message directly.
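The round trip, for the curious (file names are hypothetical, and git send-email needs SMTP settings in your config):

    git format-patch origin/master   # one mail-formatted patch file per commit
    git send-email 0001-*.patch      # send the series to a mailing list
    git am incoming.mbox             # receiving end: apply the series, keeping
                                     # author, date, and message intact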
4. How does mercurial deal with forks? In Git a repository is automatically a fork of the source repository. I don't know whether every mercurial repo is the same as a Git repo though -- meaning whether the same repository can be exposed to a number of protocols and dealt with like any other Git repo (push/pull/merge/compact, etc.)
Your local repository can push/pull from any remote repository, and you can set up a default remote repo for "hg push" and "hg pull" without a repository path. I don't know the full set of protocol options; I use local and http access.
Okay, but I think the thing I was asking was whether the same two repositories share the same history information?
I've used Msysgit for the most part, and it works very well -- actually, works the same in Linux as it does in Windows. Are we talking about the same Windows port of Git?
The old port was cygwin based, and was a real pain. I tried using msysgit and had a few problems, but it was an early version. It might be much better now. OTOH, Mercurial has always "just worked" for me, on both Windows and Linux.
Ok.
Like I said above, my personal opinion is that mercurial is easier to use. YMMV. I also know people who a big fans of bazaar, but I've never used it myself.
I agree. However since hg and git can work with each other, I don't see why using either one would be a big problem as both have a pretty similar model looking at it from the outside. I'd love to hear from someone who uses bzr though. Thanks again Anthony! -- Dean Michael Berris about.me/deanberris

Dean Michael Berris
On Fri, Jan 28, 2011 at 6:34 PM, Anthony Williams
wrote: Dean Michael Berris
writes: 4. How does mercurial deal with forks? In Git a repository is automatically a fork of the source repository. I don't know whether every mercurial repo is the same as a Git repo though -- meaning whether the same repository can be exposed to a number of protocols and dealt with like any other Git repo (push/pull/merge/compact, etc.)
Your local repository can push/pull from any remote repository, and you can set up a default remote repo for "hg push" and "hg pull" without a repository path. I don't know the full set of protocol options; I use local and http access.
Okay, but I think the thing I was asking was whether the same two repositories share the same history information?
Yes. A Mercurial clone is a full copy of the source repo, including all history.
However since hg and git can work with each other, I don't see why using either one would be a big problem as both have a pretty similar model looking at it from the outside.
That's true. I might try out the hg-git extension (http://mercurial.selenic.com/wiki/HgGit) Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

Dean Michael Berris wrote:
From Mercurial, you can export patches to a text file containing the diffs and a few headers, and import that text file into another repo, where it preserves the commit message. Is that the sort of thing you meant?
Well, not really -- git has git-format-patch that actually crafts an appropriately encoded email message.
And you can even push those messages to the Drafts folder of your IMAP email server. In practice, though, it's not like using attachments is too painful, and for most practical cases the time you need to work with the patch on both ends is far greater than the time spent attaching a file and then saving an attachment. And 'git am' is actually a strong candidate for the worst command in git. You're gonna love those .rej files and how 'git mergetool' does not work when 'git am' fails. So, I don't think git scores any points on this particular item. - Volodya -- Vladimir Prus Mentor Graphics +7 (812) 677-68-40

On 1/28/2011 3:24 AM, Anthony Williams wrote:
Edward Diener
writes: On 1/27/2011 12:52 PM, Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I hope such a discussion entails a very strong justification of why Git is better than Subversion. I still do not buy it, and only find Git more complicated and harder to use than Subversion with little advantage. I fear very much an "emperor's new clothes" situation where everyone is jumping on a bandwagon, because it is the latest thing to do, but no one is bothering to explain why this latest thing has any value to Boost.
Indeed. Also, why git rather than another DVCS such as Mercurial or bazaar? Personally, I find Mercurial much easier to use than git, and it has the same major advantages (which are essentially common to all DVCS systems).
Also, Mercurial works better on Windows than git does in my experience --- the git port for Windows is relatively recent, whereas Mercurial has supported Windows for a while. Since many of the boost developers use Windows I would have thought this was an important consideration. I haven't any personal experience of bazaar, so don't know how it fares in this regard.
The chief advantage of a DVCS over subversion is that you can do local development with full version control (including history) whilst offline, and then push/pull when online. Also, you can do incremental local commits, so you have the advantage of VC, without pushing unfinished changes to the main repository. Branching and merging tends to be easier too.
I do not follow why these are advantages. I can make any changes locally for files using SVN without having to have a connection to the SVN server. Your phrase "incremental local commits" sounds like more Git rhetoric to me. How does this differ from just changing files locally under SVN?

On 1/28/2011 7:46 AM, Edward Diener wrote:
I do not follow why these are advantages. I can make any changes locally for files using SVN without having to have a connection to the SVN server. Your phrase "incremental local commits" sounds like more Git rhetoric to me. How does this differ from just changing files locally under SVN?
You can check in while you work. Which means you don't have to worry about "breaking the build", or anything like that. Write some code, test it, seems to work, check it in. Come back after lunch, discover it's fubar, revert. Lather, rinse, repeat. I still use SVN, but I occasionally think about switching.
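In git, that loop is literally (the message is hypothetical):

    git commit -am "seems to work"   # local checkpoint before lunch
    # after lunch: it's fubar
    git reset --hard HEAD~1          # drop the checkpoint; back to the previous state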

On January 28, 2011 8:56 AM, Eric J. Holtman wrote:
On 1/28/2011 7:46 AM, Edward Diener wrote:
I do not follow why these are advantages. I can make any changes locally for files using SVN without having to have a connection to the SVN server. Your phrase "incremental local commits" sounds like more Git rhetoric to me. How does this differ from just changing files locally under SVN?
You can check in while you work. Which means you don't have to worry about "breaking the build", or anything like that.
Write some code, test it, seems to work, check it in. Come back after lunch, discover it's fubar, revert. Lather, rinse, repeat.
This may be adequate IF you're working alone and if you have all the time in the world, but it will become an unmaintainable nightmare when the number of programmers contributing to the project increases significantly and as time pressures grow. Imagine the chaos that would result from this if you had a dozen programmers doing this independently to the same codebase.

And actually, I don't care which version control software is in use, as long as it is used well. Each product has strengths and weaknesses, and thus some will have strong preferences for those products that most closely reflect their own tastes. Unless a given product's developers are blithering idiots, a rational argument can be made for using their product. And then, the final decision is made either by the team, democratically, or autocratically by a project manager or team lead. And if a change is to be made, the onus is on the proponents of the change to first prove the change is wise and then to provide/plan the means of making the change. It is folly to decide to make a change, even if to a demonstrably superior product, if there are insufficient resources (including time) to get it done; especially if the existing setup is working adequately. A practical programmer, while having preferences for certain tools over others, will be flexible enough to work with whatever is in place for a project to which he has been assigned to contribute (in a commercial setting) or to which he wishes to contribute in other cases.

As one responsible for a small development team, and who has to keep junior and intermediate programmers on track, this notion of "write some code, test it, seems to work, check it in; come back after lunch, discover it's fubar, revert; lather, rinse, repeat" scares me. When you're dealing with a commercial app that has half a million lines of code, or more, it is just too easy to break things (there are practical and commercial reasons for rational modularization as results in object oriented programming, as well as proper management/design of compilation units, &c.).

I would prefer a model where there is a highly developed suite of test code (actually, I find it is often best if one begins with the tests first - but sometimes that is not practical): unit tests, integration tests and usability tests; and nothing gets checked in unless the code base plus the new code not only compiles, but the developer can show that with his implemented changes, the system still passes all tests. And note, his new code must come with a test suite of its own, and must pass through a code review before we accept that his tests are adequate and thus before he can run the full test suite. With this model, it happens that new code stresses existing code in initially unexpected ways, revealing previously undetected bugs. But at the same time, it makes it less likely that new code will introduce a significant number of new bugs when it is approved to be committed to the codebase. And this means that while the new code that is approved to be committed will have the same number of bugs per thousand lines of code that most other programmers experience in their code, the number of bugs per thousand lines of code in the codebase can only decrease. And where the version control software in place does not support a given detail of this model (and note, this model can be made to work even with something as primitive as RCS or CVS), we need a manual process to make it work.
In my practice, no member of my team commits anything until we know it works with everything already in the repository - no exceptions. This means that sometimes a programmer will work on an assigned task for days, or even a week, without committing changes to the repository. This actually helps productivity rates, since we waste much less time tracking down bugs that had been introduced weeks or months earlier, and then fixing them along with new code that unwittingly depended on code that was broken. Until I learned this, I occasionally saw situations where weeks or months' worth of work had to be discarded because of dependencies within code along with months-old code that had subtle bugs (but then, I have been doing this for 30+ years in a number of different languages, and the commercial practice of software development wasn't then what it is now). Cheers Ted

On 1/28/2011 9:59 AM, Ted Byers wrote:
This may be adequate IF you're working alone and if you have all the time in the world, but it will become an unmaintainable nightmare when the number of programmers contributing to the project increases significantly and as time pressures grow. Imagine the chaos that would result from this if you had a dozen programmers doing this independently to the same codebase.
You proceed from a false assumption. You might have 1000 commits in your local copy. When you're done, and ready to publish, you "push" it as one. No different than SVN.
commercial app that has half a million lines of code, or more, it is just too easy to break things (there are practical and commercial reasons for rational modularization as results in object oriented programming, as well as proper management/design of compilation units, &c.).
Again, no one sees your changes until you're ready. It's there to help the individual. How often have you been in the middle of a large change to quite a few files, then said "oh, this sucks", and reverted it. Then, a half hour later, you decide "Oh, wait, 100 of those lines (out of the 1000 you just threw away) might be useful". With SVN, you're hosed. With git, you're not.
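To make that concrete, here is roughly what the recovery looks like in git; the branch name, file path, and commit hash below are invented for illustration:

$ git checkout -b big-change        # private branch for the experiment
$ git commit -a -m "first cut"      # local commits; nobody else sees them
$ git commit -a -m "rework the parser"
$ git reset --hard HEAD~2           # "oh, this sucks" -- throw it all away
  ... half an hour later ...
$ git reflog                        # the discarded commits are still reachable
$ git checkout 1a2b3c4 -- src/parser.c   # recover just the useful file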

From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Eric J. Holtman
Sent: January-28-11 11:07 AM To: boost-users@lists.boost.org Subject: Re: [Boost-users] What's happened to Ryppl?
On 1/28/2011 9:59 AM, Ted Byers wrote:
This may be adequate IF you're working alone and if you have all the time in the world, but it will become an unmaintainable nightmare when the number of programmers contributing to the project increases significantly and as time pressures grow. Imagine the chaos that would result from this if you had a dozen programmers doing this
independently to the same codebase.
You proceed from a false assumption. You might have 1000 commits in your local copy. When you're done, and ready to publish, you "push" it as one.
No different than SVN.
So, no pressing need to change from SVN.
commercial app that has half a million lines of code, or more, it is just too easy to break things (there are practical and commercial reasons for rational modularization as results in object oriented programming, as well as proper management/design of compilation units, &c.).
Again, no one sees your changes until you're ready. It's there to help the individual. How often have you been in the middle of a large change to quite a few files, then said "oh, this sucks", and reverted it. Then, a half hour later, you decide "Oh, wait, 100 of those lines (out of the 1000 you just threw away) might be useful".
With SVN, you're hosed. With git, you're not.
Really? Anyone with any experience has faced this sort of thing countless times before, and even in the absence of software that makes handling this easy, has developed methods (often manual) to deal with it. Only a complete novice will not have figured this out and thus be so dependent on his software that he'd be "hosed" if he uses the 'wrong' software. But then, in a commercial setting, part of the role of the senior programmers/team leads/&c. is to teach their juniors so that they are flexible enough to cope with these situations regardless of the supporting software used.

What I encourage among my team is to never throw anything away, unless it is demonstrably wrong. So, even with something as primitive as RCS, I/we would not be "hosed". If git provides help with this, fine, but it is not essential. My practice has been to tell my team members they can use whatever tools they wish on their own machine. Some like Emacs, others vi, &c. Some like commenting out code that appears problematic while others include version info in temporary backups. I encourage them to work in whatever way they find conducive to being as productive as practicable, as long as the code they produce works.

The actual decision making process used can be rather complex, and is not simply a matter of comparing product feature lists in most cases. There is first an analysis of what development model is required. Then, we need an examination of features of available products and the extent to which each supports the development model selected. If the product to be developed is new, then one selects the product that best meets perceived needs (of course, this involves putting each of the products to the test to verify that each does what it says it does - it would be irresponsible to fail to put each option through its paces). If the project involves extending or refactoring an existing product, there is then an examination of the current state of that product and the software used to support it, along with everything that would be necessary to migrate it to use one or more new tools. And then there is the question of the implications for anyone else who works with the code. I would be quite annoyed if someone on some other team made a decision that forced me and my team to change some of the software or development processes we use because they adopted software that is incompatible with what we had been using.

I have seen situations where one member of a team hyped one product (generally a commercial library), spending months developing new code using it, despite being repeatedly told of one or more deficiencies in it. His argument was that the product was the best on the market (with some support from the trade literature), and that he just needed a bit more time to figure out how to address the apparent deficiencies. After months had been spent (some would say wasted), he told us that there was no way to work around the deficiencies in the product and that we had to just live with them. I found a simpler product that DID meet all our needs, but it cost a few months to refactor the code to use that simpler, more limited product in place of the alleged 'best' product that did not meet our needs, despite the fact that other aspects of the 'best' product worked fine and that it had a huge number of features for which we had no need.
The point, with an existing product, is that it represents a whole suite of decisions that had been made about design, and the various tools to be used to support it, and any decision to replace one tool (or anything else) carries costs that must be carefully evaluated and compared with the alleged benefits.

To illustrate, in comparing software to manage task lists and meeting schedules, one has a wide range of products to examine, from Google Calendar through open source products like SugarCRM, to the various commercial CRM products. At the one extreme, Google Calendar is like a bicycle, with SugarCRM being like a 10 year old Chevy, and the commercial products more like a late model Mercedes. For some, Google Calendar is sufficient, and works well. For others (perhaps most) something like SugarCRM would be appropriate, and for others, with greater needs and deep pockets, one of the commercial offerings may be preferred. But, if you already have extensive data in SugarCRM, and you learn of a commercial offering that better meets your needs, migrating all your data from the one to the other will not be trivial, and may in fact make switching counterproductive, making hiring a PHP programmer to extend SugarCRM instead a more rational option.

Please understand, I am not arguing the merits of git with you. Rather, I am pointing out you haven't made the case that a change to use it instead of SVN is either warranted or feasible or practicable. I have a number of products in SVN, and if I am to be convinced to use any other version control software to manage their code instead of SVN, I'd need to be presented with an analysis not only of what each option (RCS, CVS, SVN, git, MKS, &c.) offers, but proof that each works as described, that the benefits of making a switch outweigh the costs, and a viable plan for making the switch. And all of this would have to be supported by well documented experience. An argument comprised simply of a claim that product 'X' is the best available because it does 'Y' does not even come close. Cheers Ted

On 1/28/2011 12:12 PM, Ted Byers wrote:
Please understand, I am not arguing the merits of git with you. Rather, I am pointing out you haven't made the case that a change to use it instead of SVN is either warranted or feasible or practicable.
I never made that case. I use SVN. I was just pointing out that one of your objections was a straw man.

At Fri, 28 Jan 2011 13:12:06 -0500, Ted Byers wrote:
Please understand, I am not arguing the merits of git with you. Rather, I am pointing out you haven't made the case that a change to use it instead of SVN is either warranted or feasible or practicable. I have a number of products in SVN, and if I am to be convinced to use any other version control software to manage their code instead of SVN, I'd need to be presented with an analysis not only of what each option (RCS, CVS, SVN, git, MKS, &c.) offers, but proof that each works as described, that the benefits of making a switch outweigh the costs, and a viable plan for making the switch. And all of this would have to be supported by well documented experience. An argument comprised simply of a claim that product 'X' is the best available because it does 'Y' does not even come close.
This is all way off-topic for the boost-users list. I'm not a moderator here, but if I were, I'd be asking you to take it elsewhere. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Ted, all: greetings --
"Ted Byers"
Really? Anyone with any experience has faced this sort of thing countless times before, and even in the absence of software that makes handling this easy, has developed methods (often manual) to deal with it.
This is a key observation. Having said that, would you rather have generation after generation of "complete novice" programmers have to climb this wall -- each in their own way, with their own tools, and their own assumptions -- or would you prefer to switch to a tool that solves the problem for you (and them)? You're completely correct that it's quite trivial to generate local, offline snapshots of an SVN tree. Here's mine:

$ cat snapshot-dir.sh
#!/bin/bash
#
# snapshot-dir.sh
#
# This is mostly useful when we can't get to a version control server,
# but still want to capture a particular edit state. Switching to git
# (which is local by design) is probably really the right answer, but
# for now...

ts=$(date +'%Y%m%d%H%M%S')

for d in $*
do
    echo "=== $d ==="
    tar --create \
        --gzip \
        --verbose \
        --file "saved-snaps/$d-$ts.tar.gz" \
        --exclude ".svn" \
        "$d"
done

But you know what? Integrating those local snapshots, once I was back on the network, was quite a bit more effort than I'd like. The difference between a home-brew snapshot script (and manual merging) and a tool that is designed to support this style of work is huge. Would you rather your programmers spend time writing tools like this, or would you prefer they utilize toolkits that give them the features they need up-front?
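For contrast, a git-based version of the same offline snapshotting might look like this; no server and no tarballs, and the later integration is an ordinary merge. This is a sketch, not a prescription:

$ git init                          # one-time: make the working tree a repository
$ git add .
$ git commit -m "snapshot: before refactoring"
  ... hack hack hack, still offline ...
$ git commit -a -m "snapshot: half-converted to new API"
$ git log --oneline                 # every snapshot is diffable and revertable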
But then, in a commercial setting, part of the role of the senior programmers/team leads/&c. is to teach their juniors so that they are flexible enough to cope with these situations regardless of the supporting software used.
Another role of the "senior programmers", however, is to step back and review their *process* occasionally. That's a big part of what this thread is about; the contributors to boost are by no means "novice" programmers.
What I encourage among my team is to never throw anything away, unless it is demonstrably wrong. So, even with something as primitive as RCS, I/we would not be "hosed". If git provides help with this, fine, but it is not essential.
How do they "never throw anything away"? How are those changes tracked, and do they have contemporaneous comments that would make suitable commit logs? Put another way, how many possibly-useful changes are sitting in your developers' working directories, under names like "third-try" and "oops"?
Please understand, I am not arguing the merits of git with you. Rather, I am pointing out you haven't made the case that a change to use it instead of SVN is either warranted or feasible or practicable.
Totally agreed. In this case, I would venture that there are some non-arguable facts:

1. Boost development is decentralized. This seems obvious on the surface. Multiple companies, programmers, countries, and even timezones.
2. Boost release coordination is centralized. This is a good thing! There is one focus for naming, QA, praise, and complaints.
3. Any DVCS can trivially emulate any given centralized VCS.
4. The DVCSs under discussion (hg and git) have both been proven workable (and, in general, superior to centralized VCSs and simple email patch-passing) by some very large, very popular, and many-contributor projects.
I have a number of products in SVN, and if I am to be convinced to use any other version control software to manage their code instead of SVN, I'd need to be presented with an analysis not only of what each option (RCS, CVS, SVN, git, MKS, &c.) offers, but proof that each works as described, that the benefits of making a switch outweigh the costs, and a viable plan for making the switch.
This is a laudable goal, but proper "proof" consists of running both scenarios in parallel (providing a control group). Boost being a volunteer effort, I posit that it is impossible to "prove" that a particular change in methodology yields a specified cost/benefit ratio. Lacking rigorous proof, we can instead look to the experiences of similar projects. Of those projects that are large and decentralized, it seems that the choice to use a DVCS has been largely accepted. I would suggest following their lead. Best regards, Tony

As one responsible for a small development team, and who has to keep junior and intermediate programmers on track, this notion of "Write some code, test it, seems to work, check it in. Come back after lunch, discover it's fubar, revert. Lather, rinse, repeat" scares me.
You check in locally; nothing breaks until the changes are pulled into the main code base.

I would prefer a model where there is a highly developed suite of test code (actually, I find it is often best if one begins with the tests first - but sometimes that is not practical): unit tests, integration tests and usability tests; and nothing gets checked in unless the code base plus the new code not only compiles, but the developer can show that, with his implemented changes, the system still passes all tests. And note, his new code must come with a test suite of its own, and must pass through a code review before we accept that his tests are adequate and thus before he can run the full test suite. With this model, it happens that new code stresses existing code in initially unexpected ways, revealing previously undetected bugs. But at the same time, it makes it less likely that new code will introduce a significant number of new bugs when it is approved to be committed to the codebase. And this means that while the new code that is approved to be committed will have the same number of bugs per thousand lines of code that most other programmers experience in their code, the number of bugs per thousand lines of code in the codebase as a whole can only decrease. And where the version control software in place does not support a given detail of this model (and note, this model can be made to work even with something as primitive as RCS or CVS), we need a manual process to make it work. In my practice, no member of my team commits anything until we know it works with everything already in the repository - no exceptions. This means that sometimes a programmer will work on an assigned task for days, or even a week, without committing changes to the repository.
I agree, except for the last phrase. This is exactly what a DVCS shines at, BUT you can commit regularly. Why is this important? To have small increments, which help when bisecting for bugs. Of course, you don't commit to the main repository each time; of course, it has to be reviewed; of course, it has to pass all the tests... but in addition, you get the history, and I was saved many times by this feature (and of course, it has to compile each time, because without this, you can't bisect ;)). Cheers, Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher

On Fri, Jan 28, 2011 at 11:59 PM, Ted Byers
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Eric J. Holtman Sent: January-28-11 8:56 AM To: boost-users@lists.boost.org Subject: Re: [Boost-users] What's happened to Ryppl?
You can check in while you work. Which means you don't have to worry about "breaking the build", or anything like that.
Write some code, test it, seems to work, check it in. Come back after lunch, discover it's fubar, revert. Lather, rinse repeat.
This may be adequate IF you're working alone and if you have all the time in the world, but it will become an unmaintainable nightmare when the number of programmers contributing to the project increases significantly and as time pressures grow. Imagine the chaos that would result from this if you had a dozen programmers doing this independently to the same codebase.
One project to look at: Linux. Time pressure? Couple of weeks to merge upstream. Programmers contributing to the project? Thousands. Programmers doing this independently on the same codebase? Absolutely. 'nuff said. [snip tl;dr] -- Dean Michael Berris about.me/deanberris

On Fri, Jan 28, 2011 at 07:59, Ted Byers
I would prefer a model where there is a highly developed suite of test code (actually, I find it is often best if one begins with the tests first - but sometimes that is not practical): unit tests, integration tests and usability tests; and nothing gets checked in unless the code base plus the new code not only compiles, but the developer can show that, with his implemented changes, the system still passes all tests. [...] In my practice, no member of my team commits anything until we know it works with everything already in the repository - no exceptions. This means that sometimes a programmer will work on an assigned task for days, or even a week, without committing changes to the repository.
In such an environment, a DVCS is *more* useful, not less. Your process overloads commits with an extra meaning of validation. With a DVCS, team members commit as often as is helpful for their personal work, without affecting others. Then the validation can be run in parallel, and only pushed to the anointed repo once the validation passes. It avoids the downtime, imposed by requiring code review or CITs, that occurs when you can't check things in because they haven't passed the process, yet you can't keep working on things either, since there's no nice way to commit something other than your working copy. (In git, for example, you've checked in the changes, and all you need to send to others is your box's name and the hash of the commit -- be it to the code review alias, the automated test suite runner, or the person responsible for the official build.) I've spent time trying to work with another dev on a feature at a company with a strict "code review before checkin" policy, and the lack of source control -- since you can't check anything in anywhere -- makes it a terrible experience.
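A sketch of that hand-off in git, for concreteness; the URL and branch name are placeholders, not a real service:

$ git commit -a -m "feature X: ready for review"
$ git push git://mybox.example.com/repo.git feature-x     # publish without touching the anointed repo
$ git request-pull origin/master git://mybox.example.com/repo.git
  # prints a summary the reviewer or test runner can pull from directly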

On 28.01.2011 14:46, Edward Diener wrote:
I do not follow why these are advantages. I can make any changes locally for files using SVN without having to have a connection to the SVN server. Your phrase "incremental local commits" sounds like more Git rhetoric to me. How does this differ from just changing files locally under SVN ?
There is a difference between changing a file and committing a change. One is just changed file data. The other is a record of that change in the VCS. DVCSs allow you to do commits locally, without having a connection to some central server. Sebastian

Edward Diener
On 1/28/2011 3:24 AM, Anthony Williams wrote:
The chief advantage of a DVCS over subversion is that you can do local development with full version control (including history) whilst offline, and then push/pull when online. Also, you can do incremental local commits, so you have the advantage of VC, without pushing unfinished changes to the main repository. Branching and merging tends to be easier too.
I do not follow why these are advantages. I can make any changes locally for files using SVN without having to have a connection to the SVN server. Your phrase "incremental local commits" sounds like more Git rhetoric to me. How does this differ from just changing files locally under SVN ?
By "incremental local commits", I meant that I can make a small change and commit it to the VCS locally whilst offline. I can then make another and another and another, rollback some changes and make some more, and commit that, and so forth. Then, later, I can upload the whole bunch of commits (with the log messages I made at the time) to the remote server. e.g. I'm working on a new, complex feature that impacts a lot of stuff. I can write a test, make it pass, and check in the change locally. When I'm working well I can make such commits every few minutes. With a remote server this can be slow and painful. Also, it doesn't matter if my changes leave partially complete features that won't integrate well, as no-one else can use it. From an SVN perspective, it's like having a private branch, and only merging to trunk at carefully chosen points. However, DVCSs tend to have much better handling of branching and merging than SVN, so merging that private branch to trunk when someone else has merged their changes in the mean time is much less of an ordeal. I can also rollback locally to an older revision whilst offline, do diffs and merges between branches whilst offline, and then push changes to the remote server later when online. The lack of a need for a remote connection for such things can make them considerably faster. Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

Beman Dawes wrote:
On Tue, Jan 25, 2011 at 12:00 PM, Dave Abrahams
wrote: At Tue, 25 Jan 2011 12:05:46 +0700, Eric Niebler wrote:
<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?
Probably not *pure* insanity, but also perhaps not worth the trouble, IMO.
Still, doing a "big bang" conversion to Git all at one time is more than a notion.
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
To me, this illustrates a fundamental problem. If the issue of modularization were addressed, there would be no requirement that all libraries use the same version control system. That is, change to a different version control system would occur one library at a time. Same can be said for the build system. The only coupling really required between libraries is:

a) namespace coordination
b) directory structure - to some extent at least at the top levels
c) quality standards
   i) testing
   ii) platform coverage
   iii) documentation requirements

If coupling is required somewhere else, it's an error that is holding us back. Robert Ramey
--Beman

On Thu, Jan 27, 2011 at 1:26 PM, Robert Ramey
Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
To me, this illustrates a fundamental problem. If the issue of modularization were addressed, there would be no requirement that all libraries use the same version control system. That is, change to a different version control system would occur one library at a time.
Same can be said for the build system.
In principle, true. In practice, we need some consistency across boost or it will become hard for the community to contribute, and especially hard for people (including release managers) to step in and fix things, or even assemble a distribution. This is to say nothing of automated testing.
The only coupling really required between libraries is
a) namespace coordination
b) directory structure - to some extent at least at the top levels
c) quality standards
   i) testing
introduces a build system dependency.
   ii) platform coverage
   iii) documentation requirements
If coupling is required somewhere else, it's an error that is holding us back.
I don't believe such an error exists here. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
On Thu, Jan 27, 2011 at 1:26 PM, Robert Ramey
wrote: Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
To me, this illustrates a fundamental problem. If the issue of modularization were addressed, there would be no requirement that all libraries use the same version control system. That is, change to a different version control system would occur one library at a time.
Same can be said for the build system.
In principle, true. In practice, we need some consistency across boost or it will become hard for the community to contribute, and especially hard for people (including release managers) to step in and fix things, or even assemble a distribution. This is to say nothing of automated testing.
The only coupling really required between libraries is
a) namespace coordination
b) directory structure - to some extent at least at the top levels
c) quality standards
d) testing
introduces a build system dependency.
A particular library will depend upon at least one build/test system. But that doesn't imply that all libraries have to depend on the same one. In fact, we already have the situation that many (all?) libraries can be built/tested with bjam or CTest. The "boost test" would be just the union of the test procedures implemented for each library. That is:

foreach(library L)
    lib/L/test/test.bat // or lib/L/test.sh

The current approach implements the view of Boost as a particular set of libraries ONLY built/tested/distributed as a whole. My view is that this is not scaling well and can never do so. Each library should be "buildable, testable, and distributable" on its own. The "official boost distribution" would be just the union of all the certified boost libraries. Of course anyone could make his own subdistribution if he wants to. Already we have a step in this direction with bcp (distribute one library and all its pre-requisites).

I envision the future of boost as looking more like sourceforge, but with all libraries meeting the boost requirements. I see boost as spending more time on reviews and less time on testing, packaging, etc. I see "packaging/distribution" as being handled by anyone who wants to create any subset of the boost libraries. Finally, I see the testing as being done by each user, to get wider coverage. I see the centralized functions being limited to:

a) reviews/certification
b) accumulation of testing results
c) coordination/maintenance of standards (a-d above)
d) promotion of developer practices compatible with the above (licenses, etc).

Suppose such an environment existed today. The whole issue of moving to git wouldn't be an issue. Each library author could use whichever system he preferred. Movement to git could proceed on a library-by-library basis if/when other developers were convinced it was an improvement. It would be one less thing to spend time on.
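Robert's per-library test union could be realized with a few lines of shell. The layout below (lib/<name>/test/test.sh) is his hypothetical one, not an existing Boost convention:

#!/bin/sh
# "boost test" as the union of each library's own test entry point.
status=0
for L in lib/*
do
    if [ -x "$L/test/test.sh" ]
    then
        ( cd "$L/test" && ./test.sh ) || { echo "FAILED: $L"; status=1; }
    fi
done
exit $status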
   ii) platform coverage
   iii) documentation requirements
If coupling is required somewhere else, it's an error that is holding us back.
I don't believe such an error exists here.
Robert Ramey

At Fri, 28 Jan 2011 08:45:48 -0800, Robert Ramey wrote:
Dave Abrahams wrote:
On Thu, Jan 27, 2011 at 1:26 PM, Robert Ramey
wrote: Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
To me, this illustrates a fundamental problem. If the issue of modularization were addressed, there would be no requirement that all libraries use the same version control system. That is, change to a different version control system would occur one library at a time.
Same can be said for the build system.
In principle, true. In practice, we need some consistency across boost or it will become hard for the community to contribute, and especially hard for people (including release managers) to step in and fix things, or even assemble a distribution. This is to say nothing of automated testing.
The only coupling really required between libraries is
a) namespace coordination
b) directory structure - to some extent at least at the top levels
c) quality standards
d) testing
introduces a build system dependency.
A particular library will depend upon at least one build/test system. But that doesn't imply that all libraries have to depend on the same one.
Again, true in principle, but IMO not workable in practice, for the same reasons I just cited.
In fact, we already have the situation that many (all?) libraries can be built/tested with bjam or CTest.
The "boost test" would be just the union of the test procedure implemented for each library. That is
foreach(library L) lib/L/test/test.bat // or /lib/L/test.sh
The current approach implements the view of Boost as a particular set of libraries ONLY built/tested/distributed as a whole.

My view is that this is not scaling well and can never do so.
+1

Still, that doesn't mean we're going to be more nimble and scalable if there's no standardization of tools across Boost. Quite the contrary, IMO. I can imagine all kinds of problems coming up that are simply ruled out by using the same tools.
Each library should be "buildable, testable, and distributable" on it's own.
Except that there are interdependencies among some of the libraries. How many build tools should you need in order to install Boost.Serialization?
I see the centralized functions being limited to:

a) reviews/certification
b) accumulation of testing results
c) coordination/maintenance of standards (a-d above)
d) promotion of developer practices compatible with the above (licenses, etc).
Suppose such an environment existed today. The whole issue of moving to git wouldn't be an issue. Each library author could use whichever system he preferred. Movement to git could proceed on a library-by-library basis if/when other developers were convinced it was an improvement. It would be one less thing to spend time on.
Or, it could be one more thing to spend time on. Standardization != coordination, and while coordination can slow things down, standardization brings efficiencies. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

The current approach implements the view of Boost as a particular set of libraries ONLY built/tested/distributed as a whole.

My view is that this is not scaling well and can never do so.
+1
Still, that doesn't mean we're going to be more nimble and scalable if there's no standardization of tools across Boost. Quite the contrary, IMO. I can imagine all kinds of problems coming up that are simply ruled out by using the same tools.
+1 from me, we must IMO have standardized tools - whatever we decide those are - otherwise what you're proposing is the complete fragmentation of Boost into something even more unmanageable than now.

I still haven't heard from the git proponents, what's wrong with using git-svn to manage a local - i.e. distributed - git repository, and then periodically pushing changes to SVN. In other words working with git just as you normally would, except for having to type "git svn" from time to time? This isn't a rhetorical question BTW, I've never used either git or git-svn, so I clearly don't know what I'm missing ;-)

John.

PS, just looked at the git website, and it appears that us Windows users are restricted to either Cygwin or MSys builds? If so that appears to be a major drawback IMO.... OK I see there's a TortoiseGit, but it looks distinctly immature at first glance, and still depends on MSys (i.e. no easy integrated install)?
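For reference, the git-svn round trip John is asking about usually looks like this; the SVN URL is a placeholder, and this is the generic git-svn pattern rather than Boost-specific advice:

$ git svn clone https://svn.example.org/project/trunk project
$ cd project
  ... edit and "git commit" locally as often as you like ...
$ git svn rebase      # fetch new SVN revisions, replay local commits on top
$ git svn dcommit     # send each local commit to SVN as its own revision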

On Sat, Jan 29, 2011 at 2:46 AM, John Maddock
I still haven't heard from the git proponents, what's wrong with using git-svn to manage a local - i.e. distributed - git repository, and then periodically pushing changes to SVN. In other words working with git just as you normally would, except for having to type "git svn" from time to time? This isn't a rhetorical question BTW, I've never used either git or git-svn, so I clearly don't know what I'm missing ;-)
This doesn't change the Boost central repo, which is actually one of the reasons why the current process doesn't scale well.

The idea really (partially hashed out here, still a work in progress: https://svn.boost.org/trac/boost/wiki/DistributedDevelopmentProcess), at least when I first brought it up, is that we should be able to get multiple distributions, allow the independent but coordinated development of individual libraries, allow contributors to get into the game easier, and rely on an organic web of trust to allow for self-organization of sub-communities and a larger Boost community.

Git is part of that idea mostly because the barrier to entry for potential contributors is 0. Anybody can absolutely clone the git repository, get development going locally, adding their contributions and submitting pull requests easily. The pull requests can go to maintainers, co-maintainers, the mailing list at large, or someone who's already a contributor to shepherd changes in. This allows all the work to happen in a distributed manner, with release management largely a matter of packaging publicly published versions of libraries that are tested to work well together in a single distribution.

I just cannot imagine how this would be done with anything other than git, which integrates the web of trust, organic fan-out growth of the self-organizing community, and a rich set of tools and practices supporting it. Of course it's not just the git thing, it's also a workflow thing, and the distributed workflow along with the distributed version control system go hand-in-hand.
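The contributor's side of that workflow is only a handful of commands; the URLs and branch name here are invented for illustration:

$ git clone git://github.com/example/somelib.git    # no permission needed to start
$ cd somelib
$ git checkout -b fix-issue-42
$ git commit -a -m "fix issue 42"
$ git push git@github.com:contributor/somelib.git fix-issue-42
  # then send the maintainer (or the list) a pull request for fix-issue-42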
John.
PS, just looked at the git website, and it appears that us Windows users are restricted to either Cygwin or MSys builds? If so that appears to be a major drawback IMO.... OK I see there's a TortoiseGit, but it looks distinctly immature at first glance, and still depends on MSys (i.e. no easy integrated install)?
MSysGit is the best Git I've seen on Windows so far. I used it extensively on the command-line and had 0 problems working with it using the tutorials for Git on Linux. YMMV though. HTH -- Dean Michael Berris about.me/deanberris

On 1/28/2011 1:46 PM, John Maddock wrote:
The current approach implements the view of Boost as a particular set of libraries ONLY built/tested/distributed as a whole.

My view is that this is not scaling well and can never do so.
+1
Still, that doesn't mean we're going to be more nimble and scalable if there's no standardization of tools across Boost. Quite the contrary, IMO. I can imagine all kinds of problems coming up that are simply ruled out by using the same tools.
+1 from me, we must IMO have standardized tools - whatever we decide those are - otherwise what you're proposing is the complete fragmentation of Boost into something even more unmanageable than now.
I still haven't heard from the git proponents, what's wrong with using git-svn to manage a local - i.e. distributed - git repository, and then periodically pushing changes to SVN. In other words working with git just as you normally would, except for having to type "git svn" from time to time? This isn't a rhetorical question BTW, I've never used either git or git-svn, so I clearly don't know what I'm missing ;-)
The arguments of Git's superiority as a distributed VCS over SVN's centralized VCS do not convince me either. I understand them, but I wonder if the switch from SVN to Git is worth it just so end-users can make their own changes to a local Git repository and then push their entire repository to a centralized location some time later, as opposed to SVN users committing their changes to a centralized SVN repository periodically. I just do not see the big deal in the difference.

I do not see Boost's possible need to become less centralized and go from a monolithic distribution to possible individual distributions as dependent on using a distributed repository model versus a centralized repository model. I believe many other issues are much more important, as brought up by Robert Ramey and others. I would much rather Boost have a discussion of those other issues than focus on Git versus SVN, which I think of as just another red herring.
John.
PS, just looked at the git website, and it appears that us Windows users are restricted to either Cygwin or MSys builds? If so that appears to be a major drawback IMO.... OK I see there's a TortoiseGit, but it looks distinctly immature at first glance, and still depends on MSys (i.e. no easy integrated install)?
I have not looked at what it takes to build it from source, but installing Tortoise Git on Windows is pretty easy from a binary download. The documentation is not as good as Tortoise SVN's and leaves much to be desired. There is a mailing list/Gmane NG for questions etc.

At Fri, 28 Jan 2011 15:10:56 -0500, Edward Diener wrote:
I would much rather Boost have a discussion of those other issues than focus on Git versus SVN, which I think of as just another red herring.
Hi Edward, Sounds like a good topic. Why don't you start that discussion over on the Boost developers' list? -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Sat, Jan 29, 2011 at 4:10 AM, Edward Diener
On 1/28/2011 1:46 PM, John Maddock wrote:
I still haven't heard from the git proponents, what's wrong with using git-svn to manage a local - i.e. distributed - git repository, and then periodically pushing changes to SVN. In other words working with git just as you normally would, except for having to type "git svn" from time to time? This isn't a rhetorical question BTW, I've never used either git or git-svn, so I clearly don't know what I'm missing ;-)
The arguments of Git's superiority as a distributed VCS over SVN's centralized VCS do not convince me either. I understand them, but I wonder if the switch from SVN to Git is worth it just so end-users can make their own changes to a local Git repository and then push their entire repository to a centralized location some time later, as opposed to SVN users committing their changes to a centralized SVN repository periodically. I just do not see the big deal in the difference.
I think you're looking at it as a purely tool vs tool comparison which doesn't amount to much. Consider then what the workflow a distributed version control system enables and you might see the difference clearer. Consider a library being worked on by N different people concurrently. Each one can work on exactly the same code locally, making their changes locally. Then say someone pushes their changes to the "canonical" repository. Each person can then pull these changes locally, stabilizing their own local repository, and fixing things until it's stable. You can keep doing this every time without any one of these N people waiting on anybody to "finish". Now then imagine that there's only one person who has push capabilities/rights to that "canonical" repository and that person's called a maintainer. All the N-1 people then ask this maintainer to pull changes in or merge patches submitted by them. If the maintainer is willing and capable, that's fine and dandy changes get merged. Now consider when maintainer is unwilling or incapable, what happens to the changes these N-1 people make? Simple, they publish their repository somewhere accessible and all the N-2 people can congregate around that repository instead. MIA maintainer out of the way, release managers can choose to pull from someone else's published version of the library. Easy as pie. Explain to me now then how you will enable this kind of workflow with a centralized SCM.
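In git terms, "congregating around a different repository" is nothing more than pointing at another remote; the names and URLs below are invented:

$ git remote add canonical git://example.org/official/lib.git
$ git remote add fork git://example.org/active-fork/lib.git   # where development moved
$ git fetch fork
$ git merge fork/master      # follow the de facto line of development instead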
I do not see Boost's possible need to become less centralized and go from a monolithic distribution to possible individual distributions as dependent on using a distributed repository model versus a centralized repository model. I believe many other issues are much more important, as brought up by Robert Ramey and others.
How about that try above?
I would much rather Boost have a discussion of those other issues than focus on Git versus SVN, which I think of as just another red herring.
How about the workflow, is that something you'd like to see discussed as well? -- Dean Michael Berris about.me/deanberris

Dean Michael Berris wrote:
Consider a library being worked on by N different people concurrently. Each one can work on exactly the same code locally, making their changes locally. Then say someone pushes their changes to the "canonical" repository. Each person can then pull these changes locally, stabilizing their own local repository, and fixing things until it's stable. You can keep doing this every time without any one of these N people waiting on anybody to "finish". Now then imagine that there's only one person who has push capabilities/rights to that "canonical" repository and that person's called a maintainer.
All the N-1 people then ask this maintainer to pull changes in or merge patches submitted by them. If the maintainer is willing and capable, that's fine and dandy changes get merged. Now consider when maintainer is unwilling or incapable, what happens to the changes these N-1 people make? Simple, they publish their repository somewhere accessible and all the N-2 people can congregate around that repository instead. MIA maintainer out of the way, release managers can choose to pull from someone else's published version of the library. Easy as pie.
Explain to me now then how you will enable this kind of workflow with a centralized SCM.
Private branches existed in all SCMs since, like, forever. As repeatedly mentioned before, everything you are talking about above is a process matter, not a tool matter. - Volodya -- Vladimir Prus Mentor Graphics +7 (812) 677-68-40

I think you're looking at it as a purely tool vs tool comparison which doesn't amount to much. Consider then what the workflow a distributed version control system enables and you might see the difference clearer.
Consider a library being worked on by N different people concurrently. Each one can work on exactly the same code locally, making their changes locally. Then say someone pushes their changes to the "canonical" repository. Each person can then pull these changes locally, stabilizing their own local repository, and fixing things until it's stable. You can keep doing this every time without any one of these N people waiting on anybody to "finish".
That's exactly what we do now with SVN.
Now then imagine that there's only one person who has push capabilities/rights to that "canonical" repository and that person's called a maintainer.
All the N-1 people then ask this maintainer to pull changes in or merge patches submitted by them. If the maintainer is willing and capable, that's fine and dandy changes get merged. Now consider when maintainer is unwilling or incapable, what happens to the changes these N-1 people make? Simple, they publish their repository somewhere accessible and all the N-2 people can congregate around that repository instead. MIA maintainer out of the way, release managers can choose to pull from someone else's published version of the library. Easy as pie.
OK, if forking is a good thing then I can see how that helps. Question: what's to stop you from right now, building a better and greater version of library X in the sandbox, and then asking the Boost community to consider that the new Trunk and you the new maintainer? Different tool, same process. I still think there are pros and cons though:

* As I see it Git encourages developers to keep their changes local for longer and then merge when stable. That's cool, and I can see some advantages especially for developers wanting to get involved, but I predict more work for maintainers of the canonical repo trying to figure out how to resolve all those conflicts. Obviously with SVN we still get conflicts - for example Paul and I often step on each other's toes editing the Math lib docs - but these issues tend to crop up sooner rather than later, which at least makes the issue manageable to some level.

* I happen to like the fact that SVN stores things *not on my hard drive*, it means I just don't have to worry about what happens if my laptop goes belly up, gets lost, stolen, dropped, or heaven forbid "coffeed". On the other hand the "instant" commits and version history from a local copy would be nice...

Regards, John.

On Sat, Jan 29, 2011 at 6:08 PM, John Maddock
I think you're looking at it as a purely tool vs tool comparison which doesn't amount to much. Consider then what the workflow a distributed version control system enables and you might see the difference clearer.
Consider a library being worked on by N different people concurrently. Each one can work on exactly the same code locally, making their changes locally. Then say someone pushes their changes to the "canonical" repository. Each person can then pull these changes locally, stabilizing their own local repository, and fixing things until it's stable. You can keep doing this every time without any one of these N people waiting on anybody to "finish".
That's exactly what we do now with SVN.
Something was missing in translation there: making changes means committing locally. That's not what we do now with SVN. See, when you're using git, committing locally and making changes are equivalent. It's something you don't think about as something different -- as opposed to how you think about it in SVN, where "making changes" is not equivalent to "committing". Therefore, where you see git users say 'making changes', that really means 'committing changes locally'.
Now then imagine that there's only one person who has push capabilities/rights to that "canonical" repository and that person's called a maintainer.
All the N-1 people then ask this maintainer to pull changes in or merge patches submitted by them. If the maintainer is willing and capable, that's fine and dandy changes get merged. Now consider when maintainer is unwilling or incapable, what happens to the changes these N-1 people make? Simple, they publish their repository somewhere accessible and all the N-2 people can congregate around that repository instead. MIA maintainer out of the way, release managers can choose to pull from someone else's published version of the library. Easy as pie.
OK, if forking is a good thing then I can see how that helps.
Is there any question that forking is a good thing? I thought that was kinda assumed with open source development. ;-)
Question: what's to stop you from right now, building a better and greater version of library X in the sandbox, and then asking the Boost community to consider that the new Trunk and you the new maintainer? Different tool, same process.
Because doing that requires permission to get sandbox access. And because doing that involves many more steps than just clicking a 'fork' button on a web UI on something like github. And also because doing that means that you have to work with a single repository that has potentially other people clobbering it. :)
I still think there are pros and cons though:
* As I see it Git encourages developers to keep their changes local for longer and then merge when stable. That's cool, and I can see some advantages especially for developers wanting to get involved, but I predict more work for maintainers of the canonical repo trying to figure out how to resolve all those conflicts.
What gives the impression that resolving conflicts is hard on git? It's easily one of the easiest things to do with git along with branching. And because branching is so light-weight in git (meaning you don't have to pull the branch every time you're switching between branches on your local repo), conflict resolution and feature-development isolation are part of the daily work that comes with software development on Git. And having multiple maintainers maintaining a single "canonical" git repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
Obviously with SVN we still get conflicts - for example Paul and I often step on each other's toes editing the Math lib docs - but these issues tend to crop up sooner rather than later, which at least makes the issue manageable to some level.
See, imagine how that would scale if you added 2 more people working on the same library with SVN. Just updating every time you need to commit anything is a pain, with the potentially huge conflicts you get precisely because you can't commit changes more granularly locally in your repository. Note that there's no notion of a "working copy", because your local repository is what you work on directly. The "pull-merge-push" workflow is so simple with git that it's largely not something you *ever* have to deal with in any special manner. It's just part of everyday development with git. A suggestion: maybe someone ought to run a workshop or a tutorial IRL on how the git workflow looks. I think there are tons of videos out there already, along with countless books written on the subject.
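For the record, the daily pull-merge-push cycle Dean mentions amounts to the following, assuming the default remote and branch names:

$ git pull origin master     # fetch everyone else's published commits and merge them
  ... resolve any conflicts locally, at your own pace ...
$ git commit -a              # conclude the merge once the tree is sane
$ git push origin master     # publish when, and only when, things are stable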
* I happen to like the fact that SVN stores things *not on my hard drive*, it means I just don't have to worry about what happens if my laptop goes belly up, gets lost, stolen, dropped, or heaven forbid "coffeed". On the other hand the "instant" commits and version history from a local copy would be nice...
See, git does the same thing if you're using github as a publicly accessible repo. You can duplicate the same to gitorious. You can even put it on sourceforge for good measure. Synchronizing each one is scriptable and is not rocket science. The fact that this is even possible with git is something that gives it much more appeal for disaster recovery.

With SVN, if you're working on something locally (not committed yet) and your hard-drive goes belly up, I don't see why it's better than if you were working on git and have local commits and your hard-drive goes belly up. With SVN though, the question is what happens when your server gets wiped out: what value is the data on your hard-drive then? How do you reconstitute the history of the project from what you do have with you? With git the risk mitigation options are a lot more accessible and largely trivial. With SVN, not so much. HTH -- Dean Michael Berris about.me/deanberris
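The multi-mirror synchronization Dean calls scriptable is indeed a couple of lines; the remote names and URLs here are placeholders:

$ git remote add github    git@github.com:me/lib.git
$ git remote add gitorious git@gitorious.org:me/lib.git
$ for r in github gitorious; do git push "$r" master; done    # push the same history everywhere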

On Sat, Jan 29, 2011 at 6:08 PM, John Maddock
* I happen to like the fact that SVN stores things *not on my hard drive*, it means I just don't have to worry about what happens if my laptop goes belly up, gets lost, stolen, dropped, or heaven forbid "coffeed". On the other hand the "instant" commits and version history from a local copy would be nice...
Sorry, but I don't like that actually. Servers (here the boost repository) are also likely to crash/burn/whatever (oh, and yes, there are backups, but... I've seen backups destroyed before). So here is what happens with git: your laptop dies, the main repository dies? That's not really important, as thousands of users all over the world have copies of this repo. That's much safer than just one central server. (Actually, that could be compared to thousands of backups.)

AMDG On 1/29/2011 3:02 AM, Dean Michael Berris wrote:
On Sat, Jan 29, 2011 at 6:08 PM, John Maddock
wrote: * As I see it Git encourages developers to keep their changes local for longer and then merge when stable. That's cool, and I can see some advantages especially for developers wanting to get involved, but I predict more work for maintainers of the canonical repo trying to figure out how to resolve all those conflicts.
What gives the impression that resolving conflicts is hard on git?
Nothing, except that I do not trust any automated tool, no matter how smart it is, to do the merge correctly without manual review of every change. The tool has no knowledge of the semantics of what it's merging.
It's easily one of the easiest things to do with git along with branching. And because branching is so light-weight in git (meaning you don't have to pull the branch every time you're switching between branches on your local repo), conflict resolution and feature-development isolation are part of the daily work that comes with software development on Git.
And having multiple maintainers maintaining a single "canonical" git repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
What does this have to do with whether the repository is centralized or distributed? In Christ, Steven Watanabe

And having multiple maintainers maintaining a single "canonical" git
repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
What does this have to do with whether the repository is centralized or distributed?
Merging is natural for a decentralized tool; it's not the case for a centralized one. The best proof is Subversion, which until a short while ago could not handle merge history correctly.
Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher

On Sat, Jan 29, 2011 at 3:33 PM, Steven Watanabe
And having multiple maintainers maintaining a single "canonical" git repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
What does this have to do with whether the repository is centralized or distributed?
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus. There's no reason anyone should take my word for this, but I didn't really "get it" about DVCSes until I actually tried using Git for a while. Something about it changes the user experience drastically in ways that are simply not obvious until you've gotten used to it. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

AMDG On 1/29/2011 1:50 PM, Dave Abrahams wrote:
On Sat, Jan 29, 2011 at 3:33 PM, Steven Watanabe
wrote: And having multiple maintainers maintaining a single "canonical" git repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
What does this have to do with whether the repository is centralized or distributed?
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus.
I don't understand. Quickly try doing what?
There's no reason anyone should take my word for this, but I didn't really "get it" about DVCSes until I actually tried using Git for a while. Something about it changes the user experience drastically in ways that are simply not obvious until you've gotten used to it.
In Christ, Steven Watanabe

On Sun, Jan 30, 2011 at 6:22 AM, Steven Watanabe
AMDG
On 1/29/2011 1:50 PM, Dave Abrahams wrote:
On Sat, Jan 29, 2011 at 3:33 PM, Steven Watanabe
wrote: And having multiple maintainers maintaining a single "canonical" git repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
What does this have to do with whether the repository is centralized or distributed?
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus.
I don't understand. Quickly try doing what?
The merge.
There's no reason anyone should take my word for this, but I didn't really "get it" about DVCSes until I actually tried using Git for a while. Something about it changes the user experience drastically in ways that are simply not obvious until you've gotten used to it.
+1 to Dave's statement above. -- Dean Michael Berris about.me/deanberris

AMDG On 1/29/2011 11:36 PM, Dean Michael Berris wrote:
On Sun, Jan 30, 2011 at 6:22 AM, Steven Watanabe
wrote: On 1/29/2011 1:50 PM, Dave Abrahams wrote:
On Sat, Jan 29, 2011 at 3:33 PM, Steven Watanabe
wrote: And having multiple maintainers maintaining a single "canonical" git repo is the sweetest thing ever. Merging changes from many different sources into a single "master" is actually *fun* as opposed to painful with a centralized VCS.
What does this have to do with whether the repository is centralized or distributed?
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus.
I don't understand. Quickly try doing what?
The merge.
That's what I thought, but it didn't make sense to me.
a) Merges in svn are always done locally first. It doesn't change the repository until you commit.
b) Why would I want to try it several different ways? I always know exactly what I want to merge before I start.
c) Even if I were merging by trial and error, I still don't understand what makes a distributed system so much better. It doesn't seem like it should matter.
In Christ, Steven Watanabe

On Sun, Jan 30, 2011 at 11:49 PM, Steven Watanabe
AMDG
On 1/29/2011 11:36 PM, Dean Michael Berris wrote:
On Sun, Jan 30, 2011 at 6:22 AM, Steven Watanabe
wrote: On 1/29/2011 1:50 PM, Dave Abrahams wrote:
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus.
I don't understand. Quickly try doing what?
The merge.
That's what I thought, but it didn't make sense to me.
a) Merges in svn are always done locally first. It doesn't change the repository until you commit.
That's the same in git, except locally it's a repository too. So that means you can back out individual commits that cause conflicts, choose which ones you actually want to commit locally -- largely because the local copy is a repository you can re-order commits, glob together multiple commits into a single commit, edit the history to make it nice and clean and manageable (i.e. globbing together small related changes into a single commit). All these things you cannot do with subversion because there's only ever one repository version.
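A minimal sketch of the history editing described above, assuming a local branch with a handful of unpushed commits:

    $ git rebase -i HEAD~5    # open the last five commits in an editor:
                              # reorder the "pick" lines to re-order commits,
                              # change "pick" to "squash" to glob commits together,
                              # or delete a line to back a commit out entirely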
b) Why would I want to try it several different ways? I always know exactly what I want to merge before I start.
Which is also the point with git -- because you can choose which changesets exactly you want to take from where into your local repository. The fact that you *can* do this is a life saver for multi-developer projects -- and because it's easy it's something you largely don't have to avoid doing.
c) Even if I were merging by trial and error, I still don't understand what makes a distributed system so much better. It doesn't seem like it should matter.
Because in a distributed system, you can have multiple sources to choose from and many different ways of globbing things together. I don't know if you follow how the Linux development model works, but the short of it is that it won't work if they had a single repo that everybody (as in 1000s of developers) touched. Even in an environment where you had just 2 developers, by not having to synchronize everything you're lowering the chance of friction -- and when friction does occur, the "mess" only happens on the local repository, which you can fix locally, and then have the changes reflected in a different "canonical" repository. For that matter, which repository is the "canonical" repository is largely a matter of policy. In the Linux case it's the Linus repository that is the "de facto, canonical" repository. If Linus' repo is gone or suddenly the community stops trusting him and his published repo, then the community can congregate around a different repo. -- Dean Michael Berris about.me/deanberris

AMDG On 1/30/2011 9:05 AM, Dean Michael Berris wrote:
On Sun, Jan 30, 2011 at 11:49 PM, Steven Watanabe
wrote: a) Merges in svn are always done locally first. It doesn't change the repository until you commit.
That's the same in git, except locally it's a repository too. So that means you can back out individual commits that cause conflicts, choose which ones you actually want to commit locally
Okay, so this is just an instance of using local commits--which as far as I can tell is the only actual advantage of a DVCS. I can understand someone wanting this although it would probably not affect me personally, since,
a) If I'm merging a lot of changes at once, it's only because I'm synching 2 branches. If I were trying to combine independent changes, each one would get a separate commit anyway.
b) If I want to merge something and I get conflicts, I'm probably going to resolve them instead of reverting the changeset.
-- largely because the local copy is a repository you can re-order commits, glob together multiple commits into a single commit, edit the history to make it nice and clean and manageable (i.e. globbing together small related changes into a single commit).
In svn, one commit is still one commit, no matter how many changes you merge to create it. The notion of editing the history to clean it up is not relevant.
b) Why would I want to try it several different ways? I always know exactly what I want to merge before I start.
Which is also the point with git -- because you can choose which changesets exactly you want to take from where into your local repository. The fact that you *can* do this is a life saver for multi-developer projects -- and because it's easy it's something you largely don't have to avoid doing.
This doesn't answer the question I asked.
c) Even if I were merging by trial and error, I still don't understand what makes a distributed system so much better. It doesn't seem like it should matter.
Because in a distributed system, you can have multiple sources to choose from and many different ways of globbing things together.
So, what I'm hearing is the fact that you have more things to merge makes merging easier. But that can't be what you mean, because it's obviously nonsense. Come again?
I don't know if you follow how the Linux development model works, but the short of it is that it won't work if they had a single repo that everybody (as in 1000s of developers) touched. Even in an environment where you had just 2 developers, by not having to synchronize everything you're lowering the chance of friction -- and when friction does occur, the "mess" only happens on the local repository, which you can fix locally, and then have the changes reflected in a different "canonical" repository.
Have you ever heard of branches? Subversion does support them, you know. In Christ, Steven Watanabe

On Sun, Jan 30, 2011 at 15:38, Steven Watanabe
Okay, so this is just an instance of using local commits--which as far as I can tell is the only actual advantage of a DVCS.
I agree -- everything different about a DVCS is a consequence of allowing local commits. The best way I've seen of phrasing that fundamental difference: "[A DVCS] separates the act of committing new code from the act of inflicting it on everybody else." ~ http://hginit.com/00.html Similarly, the differences between git and all the other DVCSs seem to come from one of two decisions: 1) Store trees, not changes, and 2) All parents in a merge are equal. ~ Scott

On 1/30/2011 7:16 PM, Scott McMurray wrote:
On Sun, Jan 30, 2011 at 15:38, Steven Watanabe
wrote: Okay, so this is just an instance of using local commits--which as far as I can tell is the only actual advantage of a DVCS.
I agree -- everything different about a DVCS is a consequence of allowing local commits. The best way I've seen of phrasing that fundamental difference:
"[A DVCS] separates the act of committing new code from the act of inflicting it on everybody else." ~http://hginit.com/00.html
But of course when you eventually merge all your local commits, which Git encourages you to keep to yourself until you are good and ready to combine them with other people's repositories and local commits, it all works flawlessly and easily without any question of conflicts, and the resulting code is just perfect, because Git is so good.

On 01/30/2011 08:13 PM, Edward Diener wrote:
On 1/30/2011 7:16 PM, Scott McMurray wrote:
On Sun, Jan 30, 2011 at 15:38, Steven Watanabe
wrote: Okay, so this is just an instance of using local commits--which as far as I can tell is the only actual advantage of a DVCS.
I agree -- everything different about a DVCS is a consequence of allowing local commits. The best way I've seen of phrasing that fundamental difference:
"[A DVCS] separates the act of committing new code from the act of inflicting it on everybody else." ~http://hginit.com/00.html
Hi! Idle comment from the peanut gallery here. If git is so good, and it is so easy to maintain and merge between branches, why not fork boost into a pure git canonical repo (not the ridiculously complicated setup(s) done previously) and then maintain patches in a single dedicated svn branch for the svn dead-enders, to be applied whenever that makes sense? DVCS merging competence is supposed to be bidirectional. (In the biz 32 years now, cvs user for 20+, svn since the beginning, git for 3+) Russell (submerging again, but not subversioning anymore... thank you git-svn)

On Mon, Jan 31, 2011 at 7:38 AM, Steven Watanabe
AMDG
On 1/30/2011 9:05 AM, Dean Michael Berris wrote:
On Sun, Jan 30, 2011 at 11:49 PM, Steven Watanabe
wrote: a) Merges in svn are always done locally first. It doesn't change the repository until you commit.
That's the same in git, except locally it's a repository too. So that means you can back out individual commits that cause conflicts, choose which ones you actually want to commit locally
Okay, so this is just an instance of using local commits--which as far as I can tell is the only actual advantage of a DVCS.
Sure, if you choose to look at it that way and ignore all the other good things DVCSes bring to the table.
I can understand someone wanting this although it would probably not affect me personally, since, a) If I'm merging a lot of changes at once, it's only because I'm synching 2 branches. If I were trying to combine independent changes, each one would get a separate commit anyway.
With subversion, each commit is a different revision number right? Therefore that means there's only one state of the entire repository including private branches, etc. What then happens when you have two people trying to merge from two different branches into one branch? Can you do this incrementally? How would you track the single repository's state? How do you avoid clobbering each other's incremental merges? Remember you're assuming that you're the only one trying to do the merge on the same code in a single repository. Consider the case where you have more than just you merging from different branches into the same branch. In git, merging N remote-tracking branches into a single branch is possible with a single command on a local repo -- if you really wanted to do it that way. Of course you already stated that you don't want automated tools so if you *really* wanted to inspect the merge one commit at a time you can actually do it interactively as well.
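A sketch of the single-command ("octopus") merge mentioned above; the branch names are hypothetical:

    $ git fetch origin                                  # update remote-tracking branches
    $ git checkout master
    $ git merge origin/featA origin/featB origin/featC  # merge all three at once
    # or, to inspect the changes one commit at a time instead:
    $ git cherry-pick <commit>                          # one commit per invocation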
b) If I want to merge something and I get conflicts, I'm probably going to resolve them instead of reverting the changeset.
Sure, in that case there's largely no problem whether you're using git or subversion. But in git with a multi-developer project, you're basically only touching your own repo most of the time, and synchronizing a canonical repo is mostly a matter of policy (who does it, when, etc.). In the context of merging, this means you can fix the merge locally and then push to the canonical repo if you have the rights to do it, so that others can pull from that again and continue on with their work (at their own pace). With subversion what happens is everyone absolutely has to be on the same page all the time and that's a problem.
-- largely because the local copy is a repository you can re-order commits, glob together multiple commits into a single commit, edit the history to make it nice and clean and manageable (i.e. globbing together small related changes into a single commit).
In svn, one commit is still one commit, no matter how many changes you merge to create it. The notion of editing the history to clean it up is not relevant.
Why is it not relevant? In subversion, one commit can only happen if your working copy's version is up to date with the repo's version of the same checked-out branch. In git, because you have a local repo, your commits are basically just on your repo -- if something changes upstream and you want to get up to date, you pull and merge the stuff locally. This means you can still commit changes without clobbering others' work (or clobbering it only locally) and then 1) if you're the maintainer, push to the canonical publicly accessible repo, or 2) if you're not the maintainer, ask the maintainer to pull your changes in via a merge that the maintainer does for you. The maintainer can then do the adjustments on the history of the repo -- things like consolidating commits, etc. -- which largely is really what maintainers do, only with git it's just a lot easier. Of course I realize that's a matter of taste and paradigm, so I think YMMV depending on whether you can wrap your head around it or not.
b) Why would I want to try it several different ways? I always know exactly what I want to merge before I start.
Which is also the point with git -- because you can choose which changesets exactly you want to take from where into your local repository. The fact that you *can* do this is a life saver for multi-developer projects -- and because it's easy it's something you largely don't have to avoid doing.
This doesn't answer the question I asked.
Of course you're looking at the whole thing with centralized VCS in mind. Consider the case that you have multiple remote branches you can pull from. If you're the maintainer and you want to basically consolidate the effort of multiple developers working on different parts of the same system, then you can do this piece-meal.
For example, you, Dave Abrahams, and I are working on some extensions to MPL. Let's just say for the sake of example.
I can have published changes up on my github fork of the MPL library, and Dave would be the maintainer, and you would have your published changes up on your github fork as well. Now let's say I'm not done yet with what I'm working on but the changes are available already from my fork. Let's say you tell Dave "hey Dave, I'm done, here's a pull request". Dave can then basically do a number of things:
1.) Just merge in what you've done because you're already finished and there's a pull request waiting. He does this on his local repo first to run tests locally -- once he's done with that he can push the changes to the canonical repo.
2.) Pull in my (not yet complete) changes first before he tries to merge your stuff in to see if there's something that I've touched that could potentially break what you've done. In this case Dave can notify you to pull the changes I've already made and see if you can work it out to get things fixed again. Or he can notify me and say "hey fix this!".
3.) Ask me to pull your stuff and ask me to finish up what I'm doing so that I can send a pull request that actually already incorporates your changes when I'm done.
... ad infinitum.
With subversion, there's no way for something like this to happen with little friction. First we can't be working on the same code anyway because every time we try to commit we could be stomping on each other's changes and be spending our time just cursing subversion as we wait for the network traffic and spend most of our time just trying to merge changes when all we want to do is commit our changes so that we can record progress. Second we're going to have to use branches and have "rebasing" done manually anyway just so that we can all stay synchronized all the time -- which is sometimes largely unnecessary until it's time to actually integrate changes. I can list more but this reply is already taking longer than I expected so I'll stop it short there.
c) Even if I were merging by trial and error, I still don't understand what makes a distributed system so much better. It doesn't seem like it should matter.
Because in a distributed system, you can have multiple sources to choose from and many different ways of globbing things together.
So, what I'm hearing is the fact that you have more things to merge makes merging easier. But that can't be what you mean, because it's obviously nonsense. Come again?
Yes, that's exactly what I mean. Because merging is easy with git and is largely an automated process anyway, merging changes from multiple sources when integrating for example to do a "feature freeze" and "stabilization" by the release engineering group is actually made *fun* and easier than if you had to merge every time you had to commit in an actively changing codebase.
I don't know if you follow how the Linux development model works, but the short of it is that it won't work if they had a single repo that everybody (as in 1000s of developers) touched. Even in an environment where you had just 2 developers, by not having to synchronize everything you're lowering the chance of friction -- and when friction does occur, the "mess" only happens on the local repository, which you can fix locally, and then have the changes reflected in a different "canonical" repository.
Have you ever heard of branches? Subversion does support them, you know.
And have you tried merging in changes from N different branches into your private branch in Subversion to get the latest from other developers working on the same code? Because I have done this with git and it's *trivial*. Also are you really suggesting that Linux development would work with thousands of developers using subversion to do branches? Do you expect anybody to get anything done in that situation? And no that's not a rhetorical question. HTH -- Dean Michael Berris about.me/deanberris

AMDG On 1/30/2011 4:35 PM, Dean Michael Berris wrote:
On Mon, Jan 31, 2011 at 7:38 AM, Steven Watanabe
wrote: I can understand someone wanting this although it would probably not affect me personally, since, a) If I'm merging a lot of changes at once, it's only because I'm synching 2 branches. If I were trying to combine independent changes, each one would get a separate commit anyway.
With subversion, each commit is a different revision number right? Therefore that means there's only one state of the entire repository including private branches, etc.
Yes, but I don't see how that's relevant. How can the repository be in more than one state? Now if only we had quantum repositories...
What then happens when you have two people trying to merge from two different branches into one branch? Can you do this incrementally?
What do you mean by that? I can merge any subset of the changes, so I can split it up if I want to, or I can merge everything at once.
How would you track the single repository's state?
Each commit is guaranteed to be atomic.
How do you avoid clobbering each other's incremental merges?
If the merges touch the same files, the second person's commit will fail. This is a good thing because /someone/ has to resolve the conflict. Updating and retrying the commit will work if the tool can handle the merge automatically. (I personally always re-run the tests after updating, to make sure that I've tested what will be the new state of the branch even if there were no merge conflicts.)
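A sketch of the update-and-retry sequence Steven describes; the paths and messages are hypothetical:

    $ svn merge ^/branches/feature .
    $ svn commit -m "merge feature into trunk"
    # fails: the working copy is out of date
    $ svn update
    # re-run the tests, re-review the diff, then:
    $ svn commit -m "merge feature into trunk"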
Remember you're assuming that you're the only one trying to do the merge on the same code in a single repository. Consider the case where you have more than just you merging from different branches into the same branch.
In git, merging N remote-tracking branches into a single branch is possible with a single command on a local repo -- if you really wanted to do it that way.
This would require N svn commands. (Of course if I did it a lot I could script it. It really isn't a big deal.)
Of course you already stated that you don't want automated tools so if you *really* wanted to inspect the merge one commit at a time you can actually do it interactively as well.
I didn't say that I didn't want automated tools. I said that I didn't trust them. With svn that means that, before I commit, I always a) run all the relevant tests and b) review the full diff. This is regardless of whether I'm committing new changes or merging from somewhere else.
b) If I want to merge something and I get conflicts, I'm probably going to resolve them instead of reverting the changeset.
Sure, in that case there's largely no problem whether you're using git or subversion. But in git with a multi-developer project, you're basically only touching your own repo most of the time, and synchronizing a canonical repo is mostly a matter of policy (who does it, when, etc.). In the context of merging, this means you can fix the merge locally and then push to the canonical repo if you have the rights to do it, so that others can pull from that again and continue on with their work (at their own pace).
With subversion what happens is everyone absolutely has to be on the same page all the time and that's a problem.
It isn't a problem unless you're editing the same piece of code in parallel. If you find that you're stomping on each other's changes a lot
a) The situation is best avoided to begin with. The version control tool can only help you so much, no matter how cool it is. No tool is ever going to be able to resolve true merge conflicts for you.
b) Working in branches will buy you about as much as using a DVCS as far as putting off resolving conflicts is concerned.
Honestly, if you assume the worst case, and don't use the tool intelligently, you're bound to get in trouble. I'm sure that I could invent cases where I get myself in trouble (mis-)using git that work fine with svn.
The maintainer can then do the adjustments on the history of the repo -- things like consolidating commits, etc. -- which largely is really what maintainers do,
Is it? I personally don't want to spend a lot of time dealing with version control--and I don't. The vast majority of my time is spent writing code or reviewing patches or running tests. All of which are largely unaffected by the version control tool.
only with git it's just a lot easier.
It isn't just easier with git, it's basically impossible with svn. In svn, the history is strictly append only. (Of course, some including me see this as a good thing...)
b) Why would I want to try it several different ways? I always know exactly what I want to merge before I start.
Which is also the point with git -- because you can choose which changesets exactly you want to take from where into your local repository. The fact that you *can* do this is a life saver for multi-developer projects -- and because it's easy it's something you largely don't have to avoid doing.
This doesn't answer the question I asked.
Of course you're looking at the whole thing with centralized VCS in mind. Consider the case that you have multiple remote branches you can pull from. If you're the maintainer and you want to basically consolidate the effort of multiple developers working on different parts of the same system, then you can do this piece-meal.
For example, you, Dave Abrahams, and I are working on some extensions to MPL. Let's just say for the sake of example.
I can have published changes up on my github fork of the MPL library, and Dave would be the maintainer, and you would have your published changes up on your github fork as well. Now let's say I'm not done yet with what I'm working on but the changes are available already from my fork. Let's say you tell Dave "hey Dave, I'm done, here's a pull request". Dave can then basically do a number of things:
1.) Just merge in what you've done because you're already finished and there's a pull request waiting. He does this on his local repo first to run tests locally -- once he's done with that he can push the changes to the canonical repo.
2.) Pull in my (not yet complete) changes first before he tries to merge your stuff in to see if there's something that I've touched that could potentially break what you've done. In this case Dave can notify you to pull the changes I've already made and see if you can work it out to get things fixed again. Or he can notify me and say "hey fix this!".
3.) Ask me to pull your stuff and ask me to finish up what I'm doing so that I can send a pull request that actually already incorporates your changes when I'm done.
... ad infinitum.
4.) Dave isn't paying attention, so nothing happens. A couple years later, after we've both moved on to other things, he notices my changes and decides that they're good and merges them. ...More time passes... He sees your changes and they look reasonable, so he tries to merge them. He gets a merge conflict and then notifies you asking you to update your feature. You are no longer following Boost development, so the changes get dropped on the floor. ...A few more years go by... Another developer finds that he needs your stuff. He resolves the conflicts with the current version and the changes eventually go into the official version. This is something like how things seem to work in practice now, and I don't see how using a different tool is going to change it.
With subversion, there's no way for something like this to happen with little friction.
Why not? Replace "github fork" with "branch" and subversion supports everything that you've described.
First we can't be working on the same code anyway because every time we try to commit we could be stomping on each other's changes and be spending our time just cursing subversion as we wait for the network traffic and spend most of our time just trying to merge changes when all we want to do is commit our changes so that we can record progress. Second we're going to have to use branches and have "rebasing" done manually anyway just so that we can all stay synchronized all the time --
What do you mean by "rebasing"? Subversion has no such concept. If you want to stay synchronized constantly, you can. If you want to ignore everyone else's changes, you can. If you want to synchronize periodically, you can. If you want to take specific changes, you can. What's the problem?
c) Even if I were merging by trial and error, I still don't understand what makes a distributed system so much better. It doesn't seem like it should matter.
Because in a distributed system, you can have multiple sources to choose from and many different ways of globbing things together.
So, what I'm hearing is the fact that you have more things to merge makes merging easier. But that can't be what you mean, because it's obviously nonsense. Come again?
Yes, that's exactly what I mean.
Apparently not, since your answer flips around what I said.
Because merging is easy with git and is largely an automated process anyway,
If you will recall, the question I started out with is: "What about a distributed version control system makes merging easier?" That question remains unanswered. The best I've gotten is "git's automated merge is smart," but it seems to me that this is orthogonal to the fact that git is a DVCS.
merging changes from multiple sources when integrating for example to do a "feature freeze" and "stabilization" by the release engineering group is actually made *fun* and easier than if you had to merge every time you had to commit in an actively changing codebase.
I've never run into this issue.
a) Boost code in general isn't changing that fast.
b) My commits are generally "medium-sized." i.e. Each commit is a single unit that I consider ready to publish to the world. For smaller units, I've found that my memory and my editor's undo are good enough. Now, please don't tell me that I'm thinking like a centralized VCS user. I know I am, and I don't see a problem with it, when I'm using a centralized VCS.
c) There's nothing stopping you from using a branch to avoid this problem. If you're unwilling to use the means that the tool provides to solve your issue, then the problem is not with the tool.
I don't know if you follow how the Linux development model works, but the short of it is that it won't work if they had a single repo that everybody (as in 1000s of developers) touched. Even in an environment where you had just 2 developers, by not having to synchronize everything you're lowering the chance of friction -- and when friction does occur, the "mess" only happens on the local repository, which you can fix locally, and then have the changes reflected in a different "canonical" repository.
Have you ever heard of branches? Subversion does support them, you know.
And have you tried merging in changes from N different branches into your private branch in Subversion to get the latest from other developers working on the same code? Because I have done this with git and it's *trivial*.
I've never wanted to do this, but unless there are conflicts, it should work just fine. If there are conflicts, you're going to have to resolve them one way or another regardless of the version control tool.
Also are you really suggesting that Linux development would work with thousands of developers using subversion to do branches? Do you expect anybody to get anything done in that situation? And no that's not a rhetorical question.
It might overload the server. That's a legitimate concern. But other than that, I don't see why not. (However, since I have nothing to do with Linux development, I may be totally wrong.) In Christ, Steven Watanabe

On Mon, Jan 31, 2011 at 22:20, Steven Watanabe
If you will recall, the question I started out with is: "What about a distributed version control system makes merging easier?" That question remains unanswered.
Hi! Sorry to interrupt, but I just googled this question (in fact, I used the keywords directly on stackoverflow) and got this question/answer that might (or might not) clarify the "why": http://stackoverflow.com/questions/43995/why-is-branching-and-merging-easier... Basically: "The hassle in CVS/SVN comes from the fact that these systems do not remember the parenthood of changes. In Git and Mercurial, not only can a commit have multiple children, it can also have multiple parents!" A full explanation is given in the accepted answer. There may be other reasons that make merges easier, but I think this answer describes the main differences between CVS/SVN and Mercurial/Git on the merging point. Hope it helps. Joël Lamotte.
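A small illustration of the multiple-parents point; the hashes here are invented:

    $ git log --graph --oneline
    *   a1b2c3d Merge branch 'feature'   (a merge commit: two parents)
    |\
    | * d4e5f6a work on feature
    * | 789abcd work on master
    |/
    * 0123456 common ancestor
    $ git cat-file -p a1b2c3d   # the raw commit object lists two "parent" lines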

On Mon, Jan 31, 2011 at 23:40, Klaim
On Mon, Jan 31, 2011 at 22:20, Steven Watanabe
wrote: If you will recall, the question I started out with is: "What about a distributed version control system makes merging easier?" That question remains unanswered.
Hi!
Sorry to interrupt, but I just googled this question (in fact directly used keywords in stackoverflow) and got this question/answer that might (or might not) clarify the "why" :
http://stackoverflow.com/questions/43995/why-is-branching-and-merging-easier...
This other question/answer might help too, as the question is formulated as: "I often read that Hg (and Git and...) are better at merging than SVN but I have never seen practical examples of where Hg/Git can merge something where SVN fails (or where SVN needs manual intervention). Could you post a few step-by-step lists of branch/modify/commit/...-operations that show where SVN would fail while Hg/Git happily moves on? Practical, not highly exceptional cases please..." Reading it all takes a while, but it might also help in understanding the underlying reasons behind the "easier merge" claim. Joël Lamotte

This other question/answer might help too, as the question is formulated as:
"I often read that Hg (and Git and...) are better at merging than SVN but I have never seen practical examples of where Hg/Git can merge something where SVN fails (or where SVN needs manual intervention). Could you post a few step-by-step lists of branch/modify/commit/...-operations that show where SVN would fail while Hg/Git happily moves on? Practical, not highly exceptional cases please..."
Reading it all takes a while, but it might also help in understanding the underlying reasons behind the "easier merge" claim.
I just forgot the link: http://stackoverflow.com/questions/2475831/merging-hg-git-vs-svn

Steven Watanabe
How do you avoid clobbering each other's incremental merges?
If the merges touch the same files, the second person's commit will fail. This is a good thing because /someone/ has to resolve the conflict. Updating and retrying the commit will work if the tool can handle the merge automatically. (I personally always re-run the tests after updating, to make sure that I've tested what will be the new state of the branch even if there were no merge conflicts.)
Right. This is one area where I get the most value from a DVCS (YMMV).

When someone has done conflicting changes, you can commit your changes locally, so they are kept safe as a coherent whole. Only when you try to push to the shared repository do you get merge conflicts. When you've resolved the merge conflicts, you can then commit a new version locally, and push that to the main repo. The final revision history will now show your changes and the merge as separate entries in the log, and if you mess up the merge it's easy to revert back to your private state and try again.

With subversion, unless you are working on a private branch, if someone else makes conflicting changes before you check your code in, you have to merge their changes into your working directory before you can commit. Unless you save your changes first locally (e.g. in a zip file, or a backup directory), if you mess up the merge you might well lose your local changes too.

Anthony -- Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/ just::thread C++0x thread library http://www.stdthread.co.uk Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976
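A sketch of that workflow in git commands; the remote and branch names are hypothetical:

    $ git commit -am "my change"     # the coherent whole is recorded locally first
    $ git push origin master
    # rejected: the shared repo has commits we don't have yet
    $ git pull origin master         # fetch and merge
    # on conflicts: edit the files, then
    $ git add . && git commit        # record the merge as its own log entry
    $ git push origin master
    # messed up the merge? the private commit is still intact:
    $ git reset --hard ORIG_HEAD     # back to the pre-merge state, try again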

On Tue, Feb 1, 2011 at 5:20 AM, Steven Watanabe
AMDG
On 1/30/2011 4:35 PM, Dean Michael Berris wrote:
With subversion, each commit is a different revision number right? Therefore that means there's only one state of the entire repository including private branches, etc.
Yes, but I don't see how that's relevant. How can the repository be in more than one state? Now if only we had quantum repositories...
That's important because in a DVCS, there's no one repository. Therefore that means each and every repository clone of the original canonical repository out there has its own state. However in Git, every commit has its own identity and knows who its parents are. That means you can apply many different commits from many places, conglomerate them into your local repository, and reflect that tree onto the canonical repo if you're the maintainer. This then allows everyone else to merge this tree into their local repo, and that is how you get the distributed and scalable aspect for multi-developer projects. It's really an important distinction.
What then happens when you have two people trying to merge from two different branches into one branch? Can you do this incrementally?
What do you mean by that? I can merge any subset of the changes, so I can split it up if I want to, or I can merge everything at once.
I mean, let's say we have this tree in subversion:

trunk r1
  |---- your branch
  |---- my branch

If I merge in changes from trunk to my branch, and then you do the same at a slightly later time, and we both try to merge back into trunk, in subversion we would each have to do that in a single commit. With Git, merging the trees of my branch and your branch is a single command, and is largely automatic -- if we share commits from trunk in our branches, Git knows what to do, and a lot of the conflicts are really just the conflicts between changes we both made that need resolving. We do the resolving locally, so we can both keep racing each other to merge, while the canonical repo we're both working on maintains a single tree.
How would you track the single repository's state?
Each commit is guaranteed to be atomic.
And so that means the state I have in my working copy is not a unique state. That means, if I've made a ton of changes locally that I haven't committed yet and 80% of those changes conflict with changes in the central repo, I just say "OH FML"? Note that even if I do have a branch, synchronizing changes back into the source branch will be a PITA.
How do you avoid clobbering each other's incremental merges?
If the merges touch the same files, the second person's commit will fail. This is a good thing because /someone/ has to resolve the conflict. Updating and retrying the commit will work if the tool can handle the merge automatically. (I personally always re-run the tests after updating, to make sure that I've tested what will be the new state of the branch even if there were no merge conflicts.)
We have the same workflow except in Git, I don't have a chance to mess up the canonical repo if I'm not the maintainer. What happens with Git is that if I am a co-maintainer of a library's repo with you, then if I push and the merge that happens upstream is not a "fast-forward" (i.e. no merge conflicts will happen and is likely just an index update) then I have to pull the changes and merge the commits locally -- this is not the case with Subversion as I have to have exactly the same revision number in my working copy for me to be able to commit anything. Note that git works as a series of patches laid out in a DAG and each commit is unique, meaning each commit can be transplanted from branch to branch and the identity is maintained even if you had it in a ton of branches (or repositories).
Remember you're assuming that you're the only one trying to do the merge on the same code in a single repository. Consider the case where you have more than just you merging from different branches into the same branch.
In git, merging N remote-tracking branches into a single branch is possible with a single command on a local repo -- if you really wanted to do it that way.
This would require N svn commands. (Of course if I did it a lot I could script it. It really isn't a big deal.)
Not only that, it would also require your working copy to be sync'ed with the repo's state for the branch for which you want to do the commit. This synchronization is a killer in multi-developer projects touching the same code base.
Of course you already stated that you don't want automated tools so if you *really* wanted to inspect the merge one commit at a time you can actually do it interactively as well.
I didn't say that I didn't want automated tools. I said that I didn't trust them. With svn that means that, before I commit, I always a) run all the relevant tests and b) review the full diff.
And that workflow is very much supported in git as well. You can review full diffs in git too. You can commit locally and test everything locally, and when you're satisfied, ask the maintainer to push the changes up to the publicly accessible repo (or, if you're the maintainer, just push them yourself). Every other developer then synchronizes their own local repo by merging in changes from upstream (the canonical repo) and stabilizing their local repos at their own pace. What's important here is the "at their own pace" part.
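For example, reviewing everything local before asking for a pull (the remote name is hypothetical):

    $ git log -p origin/master..master   # each local commit, with its full patch
    $ git diff origin/master master      # or one combined diff against upstream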
This is regardless of whether I'm committing new changes or merging from somewhere else.
Agreed. And this is precisely the same workflow that is made a lot easier by Git. Not only is it easy, it's crazy fast as well.
With subversion what happens is everyone absolutely has to be on the same page all the time and that's a problem.
It isn't a problem unless you're editing the same piece of code in parallel. If you find that you're stomping on each other's changes a lot
a) The situation is best avoided to begin with. The version control tool can only help you so much, no matter how cool it is. No tool is ever going to be able to resolve true merge conflicts for you.
Right, but in cases where lots of people are touching the same code base, a tool that supports this workflow is the one best suited to the situation. If we want to keep Boost as a "read, but don't touch" open source project where only a handful of people get the privilege of mucking up the single repository, then I guess there's no point in having a discussion on using Git, because subversion is perfect for that. If we're in agreement that we don't want more contributors pitching in to the same codebase, then I withdraw the suggestion to use Git and will use it in my projects instead.
b) Working in branches will buy you about as much as using a DVCS as far as putting off resolving conflicts is concerned.
Honestly, if you assume the worst case, and don't use the tool intelligently, you're bound to get in trouble. I'm sure that I could invent cases where I get myself in trouble (mis-)using git that work fine with svn.
I think there's a fundamental impedance mismatch when you compare developing an open source project with 100 people to having just two or three people touching the same code. In the simplest case, where less than a handful of people are touching the code, heck, tarballs and exchanging patches should work just fine -- you're going to resolve conflicts anyway. But if you have a lot more people doing this then you have a choice: either use a tool that supports thousands of concurrent developers, or use one that supports maybe a few tens of developers.

Branches in git are a different beast from what a branch in subversion looks like. Basically, in subversion you're copying a snapshot of the code and making changes on top of that. When merging, you take commits made in one branch into another, using the repository version as the identifier for changes made in the code. That's alright if you have only one repository and just a few developers touching the code and doing the merging -- now scale that to a hundred people touching the same code with one branch each, and you start seeing how branches alone won't cut it. This is true not only for Boost -- imagine a hundred people working on the containers or algorithms collections at the same time -- especially if it wants to support a lot more contributors than it already has.

In Git, branches themselves are basically sub-trees where each and every sub-branch (a branch from a branch) can be transplanted from one branch to any other branch. And then you have the distributed nature of the beast, wherein your branches can track remote branches -- meaning they will synchronize with the remote branch's tree. So there's no "one state" of the whole project except the one that the developers agree is the canonical repo.

That means anybody can be working on Boost libraries, porting them to a platform that nobody else in the current Boost pool of developers has, and making that stable until such time as they see fit to contribute the changes back upstream -- they didn't need anybody's permission to muck around with Boost, or commit access to the repository, to work on things that matter to them *and keep a record of the changes locally*. The same goes for people who just want to maintain local Boost repositories for their own organizations and would like, for example, to fix all warnings and not have to submit those changes until they're ready later on.
The maintainer can then do the adjustments on the history of the repo -- things like consolidating commits, etc. -- which largely is really what maintainers do,
Is it? I personally don't want to spend a lot of time dealing with version control--and I don't. The vast majority of my time is spent writing code or reviewing patches or running tests. All of which are largely unaffected by the version control tool.
Of course in Boost, what happens is that the maintainers are largely the developers of the project as well -- which is odd for an open source project of the magnitude and importance of Boost. If you don't want to spend a lot of time dealing with version control, then git is precisely the tool you want: where committing or merging changes in the single Boost subversion repository takes a couple of seconds (or maybe a minute), git takes a fraction of that time (an order of magnitude less). Benchmarks abound comparing the performance of git against subversion on these routine operations, showing that git is much more efficient and better at staying out of your way than subversion is.
only with git it's just a lot easier.
It isn't just easier with git, it's basically impossible with svn. In svn, the history is strictly append only. (Of course, some including me see this as a good thing...)
In publicly-accessible Git repositories, preserving history is encouraged so that those who clone from a repo and build upon it see a "truthful" version of the code. But precisely because you can muck around with your local commits before submitting patches upstream, you have the flexibility to do exactly this kind of thing on your local repository. It's just one of those things that changes the workflow, allowing developers to improve things locally *incrementally* and synchronize later, only when necessary.
b) Why would I want to try it several different ways? I always know exactly what I want to merge before I start.
Which is also the point with git -- because you can choose which changesets exactly you want to take from where into your local repository. The fact that you *can* do this is a life saver for multi-developer projects -- and because it's easy it's something you largely don't have to avoid doing.
This doesn't answer the question I asked.
Of course you're looking at the whole thing with centralized VCS in mind. Consider the case that you have multiple remote branches you can pull from. If you're the maintainer and you want to basically consolidate the effort of multiple developers working on different parts of the same system, then you can do this piece-meal.
For example, you, Dave Abrahams, and I are working on some extensions to MPL. Let's just say for the sake of example.
I can have published changes up on my github fork of the MPL library, and Dave would be the maintainer, and you would have your published changes up on your github fork as well. Now let's say I'm not done yet with what I'm working on but the changes are available already from my fork. Let's say you tell Dave "hey Dave, I'm done, here's a pull request". Dave can then basically do a number of things:
1.) Just merge in what you've done because you're already finished and there's a pull request waiting. He does this on his local repo first to run tests locally -- once he's done with that he can push the changes to the canonical repo.
2.) Pull in my (not yet complete) changes first before he tries to merge your stuff in to see if there's something that I've touched that could potentially break what you've done. In this case Dave can notify you to pull the changes I've already made and see if you can work it out to get things fixed again. Or he can notify me and say "hey fix this!".
3.) Ask me to pull your stuff and ask me to finish up what I'm doing so that I can send a pull request that actually already incorporates your changes when I'm done.
... ad infinitum.
4.) Dave isn't paying attention, so nothing happens. A couple years later, after we've both moved on to other things, he notices my changes and decides that they're good and merges them. ...More time passes... He sees your changes and they look reasonable, so he tries to merge them. He gets a merge conflict and then notifies you asking you to update your feature. You are no longer following Boost development, so the changes get dropped on the floor. ...A few more years go by... Another developer finds that he needs your stuff. He resolves the conflicts with the current version and the changes eventually go into the official version.
This is something like how things seem to work in practice now, and I don't see how using a different tool is going to change it.
And this is so easy to fix with git: if Dave the maintainer isn't paying attention, either one of us can ping a release manager or let everybody know "hey, we're trying to consolidate changes here but Dave isn't paying attention!", and someone can pick either one of our repositories as the "canonical" repo for the library. Of course that promotes one of us to be the maintainer -- it's a much more fluid process, and one that is explicitly supported and encouraged by the git workflow. This is the insurance mechanism and "business continuity process" built into distributed version control systems like git, mercurial, bazaar, etc.
With subversion, there's no way for something like this to happen with little friction.
Why not? Replace "github fork" with "branch" and subversion supports everything that you've described.
If you made your subversion repository publicly accessible without requiring users to authenticate before committing changes, that would be true. Otherwise, as it stands at the moment, you need permission even to touch the Boost repository. And this turns a lot of people away from contributing, because the only other route is to submit a patch in Trac -- which is quite honestly painful and time-consuming as heck.
First we can't be working on the same code anyway because every time we try to commit we could be stomping on each other's changes and be spending our time just cursing subversion as we wait for the network traffic and spend most of our time just trying to merge changes when all we want to do is commit our changes so that we can record progress. Second we're going to have to use branches and have "rebasing" done manually anyway just so that we can all stay synchronized all the time --
What do you mean by "rebasing"? Subversion has no such concept. If you want to stay synchronized constantly, you can. If you want to ignore everyone else's changes, you can. If you want to synchronize periodically, you can. If you want to take specific changes, you can. What's the problem?
The concept of rebasing is really simple:
1. I branch from trunk revision 1, and make changes until revision 30.
2. In between r1 and r30 some things change in trunk.
3. I want to bring my branch up to date with the changes that have been made in trunk from r1 to r30, so I 're-base' by pulling the changes from trunk into my branch up to r30.
4. I have to (or subversion has to) remember that I've already merged in changes up to r30, so the next time I do the same operation I don't try to pull in the changes that are already there.
5. When I commit r31, I have effectively rebased my branch to trunk r30.
OTOH with git, we can just be working on our local master tracking the canonical master, and keep making changes willy-nilly locally. When we want to push to the repository, only then would we actually merge in changes. That's supported, sure, and it's no better than the subversion approach. BUT... with git, you and I can work on separate local branches that fork off from master. We can keep making changes in that branch and later, once we're ready to integrate back to master, we do that locally (we might even squash commits from the local branch so that we can submit a single big-ass patch to the upstream maintainer). That doesn't seem enticing at first, but imagine 20 or 100 of us doing that to the same source code and you'll quickly see why the subversion approach isn't going to scale and can potentially hold up our individual progress, not just the progress of the whole project.
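The git equivalent of steps 3-5, as a sketch (branch names hypothetical); git derives step 4 from the commit ancestry instead of making you track revision numbers:

    $ git checkout my-branch
    $ git rebase master   # replay my commits on top of master's current tip
    # run it again after master moves: commits already upstream are skipped
    # automatically, so nothing is applied twice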
c) Even if I were merging by trial and error, I still don't understand what makes a distributed system so much better. It doesn't seem like it should matter.
Because in a distributed system, you can have multiple sources to choose from and many different ways of globbing things together.
So, what I'm hearing is the fact that you have more things to merge makes merging easier. But that can't be what you mean, because it's obviously nonsense. Come again?
Yes, that's exactly what I mean.
Apparently not, since your answer flips around what I said.
I meant that with *git*, because merging is almost as painless as possible most of the time, having more sources to merge from makes it easier. The logic is really simple: if I can pick from more sources to merge in, I can do that all at the same time, and if things fail, back out, exclude a source, and see if things go fine. I can then isolate which things I can merge without having to resolve conflicts manually, push those to the canonical repo, and as a maintainer just tell the other sources "hey, synchronize with the state now and try again". That makes it easier for me as a maintainer: I don't have to manually figure everything out, I can have the other sources deal with it for me if they really want their stuff included. The carrot is that your changes get into Boost; the stick is that your pull request has to apply cleanly.
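A sketch of the back-out-and-exclude loop (the source branches are hypothetical):

    $ git merge srcA srcB srcC   # try all three sources at once
    # conflicts? undo the attempt and drop the offending source:
    $ git reset --merge          # (newer git also spells this: git merge --abort)
    $ git merge srcA srcC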
Because merging is easy with git and is largely an automated process anyway,
If you will recall, the question I started out with is: "What about a distributed version control system makes merging easier?" That question remains unanswered.
I've been saying yes all along here, largely because once you've tried it and been in a project where the distributed thing is actually done, you'd see that having everything be synchronized all the time is just a waste of time.
The best I've gotten is "git's automated merge is smart," but it seems to me that this is orthogonal to the fact that git is a DVCS.
Why? The fact that it's distributed is precisely why merging is so much easier.
merging changes from multiple sources when integrating for example to do a "feature freeze" and "stabilization" by the release engineering group is actually made *fun* and easier than if you had to merge every time you had to commit in an actively changing codebase.
I've never run into this issue. a) Boost code in general isn't changing that fast.
Which I suppose is due to:
1. Lack of active contributors.
2. The process of contributing requires all sorts of permissions and front-loaded work from potential contributors, which means that even before people start contributing they're turned off by the rigidity of the process and the toolset, leading to 1 above.
3. See 1 above.
b) My commits are generally "medium-sized." i.e. Each commit is a single unit that I consider ready to publish to the world. For smaller units, I've found that my memory and my editor's undo are good enough. Now, please don't tell me that I'm thinking like a centralized VCS user. I know I am, and I don't see a problem with it, when I'm using a centralized VCS.
Now, with a DVCS you don't have to rely so much on your memory or your editor's undo limit. That, if I may say so myself, isn't a scalable way of doing it either. There are local branches for that sort of thing, and if you want to submit a singular patch (a squashed merge into a single commit), that's *trivial* to do with git.
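A sketch of the squashed single patch (the branch name is hypothetical):

    $ git checkout master
    $ git merge --squash feature   # stage feature's combined changes, uncommitted
    $ git commit -m "feature X as one self-contained patch"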
c) There's nothing stopping you from using a branch to avoid this problem. If you're unwilling to use the means that the tool provides to solve your issue, then the problem is not with the tool.
But having to branch in a central repo compared to branching a local repo is the difference between night in the jungle and day by the beach, respectively.
Have you ever heard of branches? Subversion does support them, you know.
And have you tried merging in changes from N different branches into your private branch in Subversion to get the latest from other developers working on the same code? Because I have done this with git and it's *trivial*.
I've never wanted to do this, but unless there are conflicts, it should work just fine. If there are conflicts, you're going to have to resolve them one way or another regardless of the version control tool.
Unfortunately, that's not as easy as you make it sound with subversion. Let's take a simple example:

1. I branch off trunk r1
2. Developer B branches off trunk r99
3. Developer C branches off trunk r1000

Now I want to merge changes from Developer B's branch into my branch so that I can try them out. That's fine, because I'll be pulling the changes from r1 to r99. Now take the reverse: Developer C wants to pull from my branch. What happens? Hell breaks loose, because he doesn't have the history in his branch about the state of trunk r1..999. This kind of thing is what I'm saying git makes easy -- because you know the whole history up front, even if two branches were branched off different points in the tree, there's no problem making that merge and replaying your changes on top. Of course the likelihood that you'll see conflicts depends on the parts of the code being touched, but the fact that *it's possible* is just powerful. Now scale the above to 10, 20, 50 developers and you'll see why the centralized model breaks down.
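A sketch of that cross-branch merge in git, with hypothetical remote and branch names; because every clone carries the full history, git finds the common ancestor itself no matter where the branches started:

  # pull Developer B's work into my branch, wherever B branched from
  git remote add dev-b git://example.org/dev-b/project.git
  git fetch dev-b
  git merge dev-b/topic

  # the merge base git computed can be inspected directly
  git merge-base HEAD dev-b/topic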
Also, are you really suggesting that Linux development would work with thousands of developers using subversion to do branches? Do you expect anybody to get anything done in that situation? And no, that's not a rhetorical question.
It might overload the server. That's a legitimate concern. But other than that, I don't see why not. (However, since I have nothing to do with Linux development, I may be totally wrong.)
It's not just that. Nobody would want to be merging anything with subversion that way. And imagine if everyone asked Linus to do the merge for them into his branch. That just isn't a scalable way to go. Oh, and not to mention the administration nightmare of managing thousands of usernames and passwords, worrying about backups, the insane checkouts and switches required, etc. HTH -- Dean Michael Berris about.me/deanberris

AMDG On 2/1/2011 1:27 AM, Dean Michael Berris wrote: > On Tue, Feb 1, 2011 at 5:20 AM, Steven Watanabewrote: >> On 1/30/2011 4:35 PM, Dean Michael Berris wrote: >>> >>> With subversion, each commit is a different revision number right? >>> Therefore that means there's only one state of the entire repository >>> including private branches, etc. >>> >> >> Yes, but I don't see how that's relevant. >> How can the repository be in more than one state? >> Now if only we had quantum repositories... >> > > That's important because in a DVCS, there's no one repository. > Therefore that means each and every repository clone of the original > canonical repository out there has its own state. > Okay. Let me just say this once and for all. The thing that really bugs me when people start talking about how wonderful DVCS's are is that as far as I am concerned, *The physical location of the data does not matter much.* > (along with every other version control tool > since the dawn of time)> > >>> How >>> would you track the single repository's state? >> >> Each commit is guaranteed to be atomic. >> > > And so that means the state I have in my working copy is not a unique > state. That means, if I've made a ton of changes locally that I > haven't committed yet Just don't do that. It sounds to me like you're imposing your DVCS development model on a centralized VCS and trying to keep changes to yourself for a long time. If you don't use the tool correctly, it isn't the tool's fault when things break down. > and 80% of those changes conflict with changes > in the central repo, I just say "OH FML"? Note that even if I do have > a branch, synchronizing changes back into the source branch will be a > PITA. > And it will be no matter what tool you use. There's no way that git can somehow magically make the conflicts go away. >>> How do you avoid >>> clobbering each other's incremental merges? >> >> If the merges touch the same files, the >> second person's commit will fail. This is >> a good thing because /someone/ has to resolve >> the conflict. Updating and retrying the commit >> will work if the tool can handle the merge >> automatically. (I personally always re-run >> the tests after updating, to make sure that >> I've tested what will be the new state of >> the branch even if there were no merge conflicts.). >> > > We have the same workflow except in Git, I don't have a chance to mess > up the canonical repo if I'm not the maintainer. What happens with Git > is that if I am a co-maintainer of a library's repo with you, then if > I push and the merge that happens upstream is not a "fast-forward" > (i.e. no merge conflicts will happen and is likely just an index > update) then I have to pull the changes and merge the commits locally > -- this is not the case with Subversion as I have to have exactly the > same revision number in my working copy for me to be able to commit > anything. To be clear, the global revision number doesn't matter. The requirement is that you have to have the most recent versions of all the files that your commit touches. If this happens, I /want/ to be notified, even if the merge can be done automatically. > Note that git works as a series of patches laid out in a DAG > and each commit is unique, meaning each commit can be transplanted > from branch to branch and the identity is maintained even if you had > it in a ton of branches (or repositories). > >>> Remember you're assuming >>> that you're the only one trying to do the merge on the same code in a >>> single repository. 
Consider the case where you have more than just you >>> merging from different branches into the same branch. >>> >>> In git, merging N remote-tracking branches into a single branch is >>> possible with a single command on a local repo -- if you really wanted >>> to do it that way. >> >> This would require N svn commands. (Of course >> if I did it a lot I could script it. It really >> isn't a big deal.). >> > > Not only that, it would also require your working copy to be sync'ed > with repo's state for the branch for which you want to do the commit. > This synchronization is a killer in multi-developer projects touching > the same code base. > This synchronization isn't necessary and depends on how you manage your branches. If you run into problems here, it's self-inflicted. >>> Of course you already stated that you don't want >>> automated tools so if you *really* wanted to inspect the merge one >>> commit at a time you can actually do it interactively as well. >>> >> >> I didn't say that I didn't want automated tools. >> I said that I didn't trust them. With svn that >> means that, before I commit I always >> a) run all the relevant tests >> b) review the full diff >> > > And that workflow is very much supported in git as well. Of course it is. This is really basic usage of the version control tool. > I think there's fundamental impedance mismatch when you look at the > act of developing an open source project with 100 people as compared > to having just 2 or three people touching the same code. > > In the simplest case scenario of less than a handful of people are > touching the code, heck tarballs and exchanging patches should work > just fine -- you're going to resolve conflicts anyway. But if you have > a lot more people doing this then you have a choice: either use a tool > that supports thousands of concurrent developers, or use one that > supports maybe a few tens of developers. > > Branches in git are a different beast from how a branch in subversion > looks like. Basically in subversion, you're copying a snapshot of the > code and making changes on top of that. When merging you take commits > that are made from one branch into another using the repository > version as the identifier for changes made in the code. That's alright > if you only had one repository and just a few developers touching the > code and doing the merging -- now scale that to a hundred people > touching the same code and having one branch each, then you start > seeing how just branches won't cut it. This is true not only for Boost > -- imagine a hundred people working on the containers or algorithms > collections at the same time -- especially if it wants to support a > lot more contributors than it already has. > Is this ever going to happen? Probably not. Anyway, we aren't even close to hitting the limits of subversion, so your concerns about scalability are pretty much irrelevant at this point in time. > This goes the same for people to just want to > maintain local Boost repositories for their own organizations and > would want for example to fix all warnings and not have to submit > those changes until they're ready later on. > Unless you assume that the entire world is using git, which is not the case now, and will probably never be the case, using git won't help all that much here. The solution to this problem is really simple and well-known: import Boost into your version control tool. 
>>> The maintainer can then do the adjustments on the history of the repo >>> -- things like consolidating commits, etc. -- which largely is really >>> what maintainers do, >> >> Is it? I personally don't want to spend a lot of >> time dealing with version control--and I don't. >> The vast majority of my time is spent writing code >> or reviewing patches or running tests. All of >> which are largely unaffected by the version control >> tool. >> > > Of course in Boost, what happens is maintainers are largely the same > developers of the project as well. Which is odd for an open source > project the magnitude and importance of Boost. > Do you see a problem with this situation? It seems pretty fundamental to the way Boost operates. > If you don't want to spend a lot of time dealing with version control > then git is precisely the tool you want. If you spend a couple of > seconds (or maybe a minute) committing things or merging them to the > single Boost subversion repository, then you can spend a fraction of > that (an order of magnitude less) than the time you would using git. How exactly is saving an order of magnitude on 1% of my development time supposed to help? Come on. It takes me longer to find the right terminal, than it does to commit. > Benchmarks abound comparing performance of git against subversion in > most of these routine operations showing how git is much more > efficient and better at staying out of your way than subversion is. > Of course there are, given all the hype about git. > And this is so easy to fix with git because then if Dave the > maintainer isn't paying attention, either one of us can ping a release > manager or let everybody know that "hey, we're trying to consolidate > changes here but Dave isn't paying attention!" and thus someone can > pick either one of our repositories as the "canonical" repo for the > library. Of course that promotes either one of us to be the maintainer > -- it's a lot more fluid process that is explicitly supported and > encouraged by the git workflow. This is the insurance mechanism and > the "business continuity process" that is built-into the distributed > version control systems like git, mercurial, bazaar, etc. > The bottleneck here is exactly the same as the bottleneck in the current system. - someone has to volunteer to be the new maintainer - someone has to approve this Using git will not change anything essential. It will only change minor details of the process. >>> With subversion, there's no way for something like this to happen with >>> little friction. >> >> Why not? Replace "github fork" with "branch" and >> subversion supports everything that you've described. >> > > If you made your subversion repository publicly accessible without > need for authenticating who the user is to be able to commit changes > then that would be true. Otherwise as it stands at the moment you need > permission to even touch the Boost repository. As far as I know, sandbox access is granted to pretty much anyone who asks. Besides which, if you want to avoid having anything to do with the official Boost, you always have the option of importing Boost into and working off in your own world. > And this part turns a > lot of people away from wanting to contribute I'm sure that it will turn away the git users who think they can't use anything else. Frankly, I don't care. > because the other way > around it is to submit a patch in Trac -- which is quite honestly > painful and time consuming as heck. > ??? Creating a diff is easy. 
Uploading a file is easy. > The concept of rebasing is really simple: > > 1. I branch from trunk revision 1, and make changes until revision 30. > 2. In between r30 and r1 some things change in trunk. > 3. I want to make my branch upto date with the changes that have been > in trunk since r1 to r30 so I 're-base' by pulling the changes from > trunk into my branch up to r30. > 4. I have to (or subversion has to) remember that I've already merged > in changes up to r30 so the next time I do the same operation, I don't > try to pull in the changes that are already there. Okay. Subversion does remember this, so where's the problem? >>>>>> c) Even if I were merging by trial and error, I >>>>>> still don't understand what makes a distributed >>>>>> system so much better. It doesn't seem like it >>>>>> should matter. >>>>>> >>>>> >>>>> Because in a distributed system, you can have multiple sources to >>>>> choose from and many different ways of globbing things together. >>>>> >>>> >>>> So, what I'm hearing is the fact that you >>>> have more things to merge makes merging >>>> easier. But that can't be what you mean, >>>> because it's obviously nonsense. Come again? >>>> >>> >>> Yes, that's exactly what I mean. >> >> Apparently not, since your answer flips around >> what I said. >> > > I meant, with *git* and because merging is almost as painless as > possible most of the time, having more things to merge from makes it > easier. The logic is really simple: if I can pick from more sources of > things to merge in, I can do that all at the same time and if things > fail, back out and exclude a source and see if things go fine. I can > then isolate which things I would merge without having to resolve > manual conflicts, push those to the canonical repo, and as a > maintainer just tell the other sources "hey, synchronize with the > state now and try again". That means it's easier for me as a > maintainer now that I don't have to manually figure everything out, I > can have other sources deal with it for me if they really want to have > their stuff included. > Okay so what you're saying is that if the changes are split up into smaller independent pieces, it's easier to isolate the parts that cause conflicts. Is that correct? This is (finally) something I can make sense of. >>> Because merging is easy with git and >>> is largely an automated process anyway, >> >> If you will recall, the question I started out with >> is: "What about a distributed version control system >> makes merging easier?" That question remains unanswered. > > I've said yes all the while here *Sigh* Are you even reading what I wrote? This wasn't a yes/no question. > and that was largely mostly because > once you've tried it and have been in a project where the distributed > thing is actually done, you'd see that having everything be > synchronized is just a waste of time. > That has nothing to do with synchronization. >> The best I've gotten is "git's automated merge is smart," >> but it seems to me that this is orthogonal to the fact >> that git is a DVCS. >> > > Why? Because it's distributed is precisely why merging is so much easier. > Are we even speaking the same language? Define "distributed." >>> merging changes from multiple >>> sources when integrating for example to do a "feature freeze" and >>> "stabilization" by the release engineering group is actually made >>> *fun* and easier than if you had to merge every time you had to commit >>> in an actively changing codebase. >>> >> >> I've never run into this issue. 
>> a) Boost code in general isn't changing that fast. > > Which I suppose is due to: > > 1. Lack of active contributors. > > 2. The process to contributing requires all sorts of permissions and > front-loaded work on potential contributors which means even before > people want to strart contributing they're turned off by the rigidity > of the process and the toolset leading to 1 above. > > 3. See 1 above. > 4. A lot of the code is fairly mature. There is no need for continuous massive changes. >> b) My commits are generally "medium-sized." i.e. >> Each commit is a single unit that I consider >> ready to publish to the world. For smaller units, >> I've found that my memory and my editor's undo >> are good enough. Now, please don't tell me that >> I'm thinking like a centralized VCS user. I know >> I am, and I don't see a problem with it, when I'm >> using a centralized VCS. > > Now, with a DVCS you don't have to rely on your memory too much or > your editor's undo limit. This also, if I may say so myself, isn't a > scalable way of doing it. > Sure it is. It isn't like the chunks that I work on are going to get bigger as Boost grows. I know my limits and I stay within them. >> c) There's nothing stopping you from using a branch to >> avoid this problem. If you're unwilling to use >> the means that the tool provides to solve your >> issue, then the problem is not with the tool. >> > > But having to branch in a central repo compared to branching a local > repo is the difference between night in the jungle and day by the > beach, respectively. > Which means absolutely nothing to me. I already know that you prefer working with a DVCS. > Unfortunately, that's not as easy as you make it sound with subversion. > > Let's take a simple example: > > 1. I branch off trunk r1 > 2. Developer B branches of trunk r99 > 3. Developer C branches off trunk r1000 > > Now I want to merge changes from Developer B's branch into my branch > so that I can try it out. That's fine because I'll be pulling the > changes from r1 to r99. Now let's take the reverse, Developer C wants > to pull from my branch, what happens? Hell breaks loose because he > doesn't have the history in his branch about the state that was trunk > r1..999. > Huh? svn knows where you branched from. > This kind of thing is what I'm talking about git makes easy -- because > you know the whole history up front, even if two branches were > branched off different points in the tree, there's no problem making > that merge and trying to replay your changes on top of things. Of > course the likelihood that you'll see conflicts is dependent on the > parts of the code that is being touched, but the fact that *it's > possible* is just powerful. > I just tried a modified form of what you described. (Obviously I don't want to write 1000 changesets...) For completeness I made 2 copies of Developer C's branch and merged a) my changes only. b) my entire branch. (Also merging B's changes) I then merged all four resulting branches back to the trunk. (Note that I avoided creating merge conflicts, since that's a separate issue.) Here's a slightly cleaned up transcript. Starting from a blank repository: >svn checkout %MYREPO% working Checked out revision 0. >cd working >svn mkdir trunk branches tags A trunk A branches A tags >svn commit -m "Set up standard repository layout." Adding branches Adding tags Adding trunk Committed revision 1. >notepad trunk\file.txt 1. The quick brown fox jumps over the lazy dog. 2. abcdefghijklmnopqrstuvwxyz. 3. 
If Peter Piper picked a peck of pickled peppers... 4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. 6. Humpty Dumpty sat on a wall. 7. Row, row, row your boat, gently down the stream. >svn add trunk\file.txt A trunk\file.txt >svn commit -m "Add a new file." Adding trunk\file.txt Transmitting file data . Committed revision 2. >perl -i~ -lpe "s/2.*/2. trunk edit\n2.5 trunk add/" trunk\file.txt >svn diff Index: trunk/file.txt =================================================================== --- trunk/file.txt (revision 2) +++ trunk/file.txt (working copy) @@ -1,5 +1,6 @@ 1. The quick brown fox jumps over the lazy dog. -2. abcdefghijklmnopqrstuvwxyz. +2. trunk edit +2.5 trunk add 3. If Peter Piper picked a peck of pickled peppers... 4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. >svn commit -m "Edit the trunk." Sending trunk\file.txt Transmitting file data . Committed revision 3. >perl -i~ -lpe "s/(1.*)/0.5 trunk edit again\n$1/" trunk\file.txt >svn diff Index: trunk/file.txt =================================================================== --- trunk/file.txt (revision 3) +++ trunk/file.txt (working copy) @@ -1,3 +1,4 @@ +0.5 trunk edit again 1. The quick brown fox jumps over the lazy dog. 2. trunk edit 2.5 trunk add >svn commit -m "Edit the trunk again." Sending trunk\file.txt Transmitting file data . Committed revision 4. >svn cp -m "Branch the trunk." %MYREPO%/trunk@2 %MYREPO%/branches/branchA Committed revision 5. >svn cp -m "Branch the trunk." %MYREPO%/trunk@3 %MYREPO%/branches/branchB Committed revision 6. >svn cp -m "Branch the trunk." %MYREPO%/trunk@4 %MYREPO%/branches/branchC Committed revision 7. C:\Users\Steven\Documents\boost\tmp4\working>svn up A branches\branchB A branches\branchB\file.txt A branches\branchC A branches\branchC\file.txt A branches\branchA A branches\branchA\file.txt Updated to revision 7. >perl -i~ -lpe "s/6.*/6. do something./" branches\branchA\file.txt >svn diff Index: branches/branchA/file.txt =================================================================== --- branches/branchA/file.txt (revision 7) +++ branches/branchA/file.txt (working copy) @@ -3,5 +3,5 @@ 3. If Peter Piper picked a peck of pickled peppers... 4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. -6. Humpty Dumpty sat on a wall. +6. do something. 7. Row, row, row your boat, gently down the stream. >svn commit -m "Humpty Dumpty is obsolete." Sending branches\branchA\file.txt Transmitting file data . Committed revision 8. >perl -i~ -lne "print unless /4./" branches\branchB\file.txt >svn diff Index: branches/branchB/file.txt =================================================================== --- branches/branchB/file.txt (revision 7) +++ branches/branchB/file.txt (working copy) @@ -2,7 +2,6 @@ 2. trunk edit 2.5 trunk add 3. If Peter Piper picked a peck of pickled peppers... -4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. 6. Humpty Dumpty sat on a wall. 7. Row, row, row your boat, gently down the stream. >svn commit -m "Remove declaration of independence." Sending branches\branchB\file.txt Transmitting file data . Committed revision 9. >echo 8. 
Developer C's cool stuff >> branches\branchC\file.txt >svn diff Index: branches/branchC/file.txt =================================================================== --- branches/branchC/file.txt (revision 7) +++ branches/branchC/file.txt (working copy) @@ -7,3 +7,4 @@ 5. This is a test of the emergency broadcasting system. 6. Humpty Dumpty sat on a wall. 7. Row, row, row your boat, gently down the stream. +8. Developer C's cool stuff >svn commit -m "My cool changes" Sending branches\branchC\file.txt Transmitting file data . Committed revision 10. >svn cp -m "Duplicate C." %MYREPO%/branches/branchC %MYREPO%/branches/branchC2 Committed revision 11. >svn merge %MYREPO%/branches/branchB branches/branchA --- Merging r3 into 'branches\branchA': U branches\branchA\file.txt --- Merging r4 through r11 into 'branches\branchA': G branches\branchA\file.txt >svn diff Property changes on: branches\branchA ___________________________________________________________________ Added: svn:mergeinfo Merged /trunk:r3 Merged /branches/branchB:r6-11 Index: branches/branchA/file.txt =================================================================== --- branches/branchA/file.txt (revision 8) +++ branches/branchA/file.txt (working copy) @@ -1,7 +1,7 @@ 1. The quick brown fox jumps over the lazy dog. -2. abcdefghijklmnopqrstuvwxyz. +2. trunk edit +2.5 trunk add 3. If Peter Piper picked a peck of pickled peppers... -4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. 6. do something. 7. Row, row, row your boat, gently down the stream. >svn commit -m "merge from B" Sending branches\branchA svn: Commit failed (details follow): svn: Directory '/branches/branchA' is out of date >svn up A branches\branchC2 A branches\branchC2\file.txt Updated to revision 11. >svn commit -m "merge from B" Sending branches\branchA Sending branches\branchA\file.txt Transmitting file data . Committed revision 12. >svn merge %MYREPO%/branches/branchA branches/branchC --- Merging r3 through r12 into 'branches\branchC': U branches\branchC\file.txt U branches\branchC >svn diff Property changes on: branches\branchC ___________________________________________________________________ Added: svn:mergeinfo Merged /branches/branchA:r5-12 Merged /branches/branchB:r6-11 Index: branches/branchC/file.txt =================================================================== --- branches/branchC/file.txt (revision 11) +++ branches/branchC/file.txt (working copy) @@ -3,8 +3,7 @@ 2. trunk edit 2.5 trunk add 3. If Peter Piper picked a peck of pickled peppers... -4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. -6. Humpty Dumpty sat on a wall. +6. do something. 7. Row, row, row your boat, gently down the stream. 8. Developer C's cool stuff >svn commit -m "Merge A's changes" Sending branches\branchC Sending branches\branchC\file.txt Transmitting file data . Committed revision 13. >svn log branches\branchA ------------------------------------------------------------------------ r12 | Steven | 2011-02-01 12:38:06 -0800 (Tue, 01 Feb 2011) | 1 line merge from B ------------------------------------------------------------------------ r8 | Steven | 2011-02-01 12:34:44 -0800 (Tue, 01 Feb 2011) | 1 line Humpty Dumpty is obsolete. ------------------------------------------------------------------------ r5 | Steven | 2011-02-01 12:33:23 -0800 (Tue, 01 Feb 2011) | 1 line Branch the trunk. 
------------------------------------------------------------------------ r2 | Steven | 2011-02-01 12:31:08 -0800 (Tue, 01 Feb 2011) | 1 line Add a new file. ------------------------------------------------------------------------ r1 | Steven | 2011-02-01 12:29:32 -0800 (Tue, 01 Feb 2011) | 1 line Set up standard repository layout. ------------------------------------------------------------------------ >svn merge -c 8 %MYREPO%/branches/branchA branches/branchC2 --- Merging r8 into 'branches\branchC2': U branches\branchC2\file.txt >svn diff Property changes on: branches\branchC2 ___________________________________________________________________ Added: svn:mergeinfo Merged /branches/branchA:r8 Index: branches/branchC2/file.txt =================================================================== --- branches/branchC2/file.txt (revision 11) +++ branches/branchC2/file.txt (working copy) @@ -5,6 +5,6 @@ 3. If Peter Piper picked a peck of pickled peppers... 4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. -6. Humpty Dumpty sat on a wall. +6. do something. 7. Row, row, row your boat, gently down the stream. 8. Developer C's cool stuff >svn commit -m "Merge A's changes." Sending branches\branchC2 Sending branches\branchC2\file.txt Transmitting file data . Committed revision 14. >svn merge %MYREPO%/branches/branchA trunk --- Merging r3 through r14 into 'trunk': U trunk\file.txt U trunk >svn merge %MYREPO%/branches/branchB trunk >svn merge %MYREPO%/branches/branchC trunk --- Merging r5 through r14 into 'trunk': G trunk\file.txt G trunk >svn merge %MYREPO%/branches/branchC2 trunk --- Merging r11 through r14 into 'trunk': G trunk\file.txt G trunk >svn diff Property changes on: trunk ___________________________________________________________________ Added: svn:mergeinfo Merged /branches/branchA:r5-14 Merged /branches/branchB:r6-14 Merged /branches/branchC:r7-14 Merged /branches/branchC2:r11-14 Index: trunk/file.txt =================================================================== --- trunk/file.txt (revision 11) +++ trunk/file.txt (working copy) @@ -3,7 +3,7 @@ 2. trunk edit 2.5 trunk add 3. If Peter Piper picked a peck of pickled peppers... -4. When in the course of human events it becomes necessary... 5. This is a test of the emergency broadcasting system. -6. Humpty Dumpty sat on a wall. +6. do something. 7. Row, row, row your boat, gently down the stream. +8. Developer C's cool stuff >svn commit -m "Merge back to trunk." Sending trunk Sending trunk\file.txt Transmitting file data . Committed revision 15. In Christ, Steven Watanabe

On 2/1/2011 3:11 PM, Steven Watanabe wrote:
Okay. Let me just say this once and for all. The thing that really bugs me when people start talking about how wonderful DVCS's are is that as far as I am concerned, *The physical location of the data does not matter much.*
The thing that I like about a DVCS is that it gives me a way to make more frequent commits without worrying about whether I will break the build for other users. It lets me work on a feature or whatnot and, say, make a commit every time I save a file, even when my source might not compile -- equivalent to the local-history feature offered by IDEs such as Eclipse, NetBeans, or IntelliJ. Then, when I am done, I just push my changes to the main repo. The key difference is that I don't have to explicitly create a new branch for this, and since it can all be done offline, I can do it, say, on my 18-hour flight to Moscow on Saturday...
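A minimal sketch of that workflow, assuming git; the commit messages are illustrative:

  # commit locally as often as you like, even when the tree is broken
  git commit -a -m "wip: refactor parser, does not compile yet"
  git commit -a -m "wip: compiles, three tests still failing"
  git commit -a -m "refactor parser; all tests pass"

  # nothing is visible to anyone else until you decide to publish
  git push origin master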

Steven Watanabe wrote: [SNIP]
Okay. Let me just say this once and for all. The thing that really bugs me when people start talking about how wonderful DVCS's are is that as far as I am concerned, *The physical location of the data does not matter much.* [SNIP] Just don't do that. It sounds to me like you're imposing your DVCS development model on a centralized VCS and trying to keep changes to yourself for a long time. If you don't use the tool correctly, it isn't the tool's fault when things break down. [SNIP and SNIP]
I have been following this discussion, so far silently, for some time now. I see very good descriptions of distributed version control and the way it enables more flexibility. And then I see your words, Steven. And I am sorry to say that it makes me feel sad to read them. People are kind and helpful, trying to explain to you what they mean and why they think it is a good idea to use distributed version control. You get information and experience from people who use it daily and see the advantages. This, in itself, is a good thing -- even if git will never be used for Boost.

However, when I read your words, I get the feeling that you are not listening. Not only that, but you talk to people in an abrupt, unfriendly, even adversarial manner. (Thank you Google for spelling that for me.)

To me it seems that you argue with what you think is an "attack" by "git dogmatics" on SVN and your way of working. I feel that you are arguing with your preconceptions about what the others think, and during that "heroic fight against irrelevance" you manage to quickly write off arguments, and do that with an attitude.

Since it makes me feel bad (and I already spend too much on antidepressants) I cannot stop myself from writing to ask you to please change that tone. Nobody is attacking you. Nobody says your ways or the Boost ways are faulty. Nobody accuses you of anything bad. All that people have said is that (after extensive use of distributed VCS) they find it a more useful tool than centralized VC. It makes *their* work easier, and it enables *them* (and according to them many others) to contribute with more *ease*. And how easy something is is, of course, a matter of perception -- something that is not objective.

I find this DVCS vs. CVCS argument very similar to the assembler vs. any-higher-level-language argument of the '80s, and then the same with C vs. C++ later, etc.

I honestly believe that there is value in the words told about DVCS here. And I see your point as well: why change if it works for now? I just wish you had made those points in a more polite manner. Attila

Attila Feher F
Steven Watanabe wrote: [SNIP]
Okay. Let me just say this once and for all. The thing that really bugs me when people start talking about how wonderful DVCS's are is that as far as I am concerned, *The physical location of the data does not matter much.* [SNIP] Just don't do that. It sounds to me like you're imposing your DVCS development model on a centralized VCS and trying to keep changes to yourself for a long time. If you don't use the tool correctly, it isn't the tool's fault when things break down. [SNIP and SNIP]
<schnipp>
I honestly believe that there is value in the words told about DVCS here. And I see your point as well: why change if it works for now? I just wish you had made those points in a more polite manner.
For what it's worth (and not to imply there's anything wrong with what Attila wrote), I didn't find the quoted passages rude, and I thought Steven made good points. Furthermore, while I still disagree with him, I also understand why Steven feels the way he does. The beauty of Git eluded me until I had actually used it a bit, and I wished someone could explain why it was a big deal in terms that I thought held water logically.

Here is one logically-watertight fact, FWIW: a *lot* of what I do with a VCS involves looking at history, including other branches (merges are an example). When the VCS is Git, those operations are *fast*, because the repo is local, and because the Git developers make performance a top priority. However, I don't expect that alone to change anyone's mind about Git. :-)

Here's something a bit more subjective, but I think also compelling: having to commit (relatively slowly) to a publicly-visible repository is a disincentive to exploratory development and a handicap when your current line of work gets interrupted.

First, there's the fact that your experimental changes go out to the world. One could argue that programmers should simply get over the fear that other people will judge them on the basis of that code, but it's simply human nature not to want to expose anything but one's best work. Moreover, especially if, like me, you prefer to make fine-grained commits, it is slow to negotiate with a server for each little thing you want to do.

So the incentives associated with SVN encourage you to keep changes on your local disk, with no logical separation or comment about their meaning until you're ready to show them. Working on a fix for something and need to handle an emergency somewhere else? Do you check in what you're working on and go back to a clean state to handle that emergency? With SVN I almost never did. With Git it's easy and fast, and doesn't expose incomplete work to the world, so I always do.

I still have a boost svn tree with years-old pending changes in it. I tried to review them a few times to see what I could commit and what I should discard, but I get lost each time, because all I have is a wad of modified files without a record of what I was doing. Git encourages you to keep track of what you were doing by allowing you to snapshot your tree and write little notes about it, and by making it really fast to do so.

So if you wonder why I'm trying to upend Boost's infrastructure and process, it's because too much has become a disincentive to participation (for me and for others, though obviously not everyone). -- Dave Abrahams BoostPro Computing http://www.boostpro.com
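A sketch of the interrupt-and-resume sequence Dave describes, assuming git; the branch names are illustrative:

  # half-finished work sits on a topic branch; shelve it and get a clean tree
  git stash

  # handle the emergency on its own branch off master
  git checkout -b emergency-fix master
  # ... edit, test, commit, push ...

  # come back and pick up exactly where you left off
  git checkout topic
  git stash pop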

Dave Abrahams wrote:
Attila Feher F
wrote: Steven Watanabe wrote: [SNIP]
Okay. Let me just say this once and for all. The thing that really bugs me when people start talking about how wonderful DVCS's are is that as far as I am concerned, *The physical location of the data does not matter much.* [SNIP] Just don't do that. It sounds to me like you're imposing your DVCS development model on a centralized VCS and trying to keep changes to yourself for a long time. If you don't use the tool correctly, it isn't the tool's fault when things break down. [SNIP and SNIP]
<schnipp>
I honestly believe that there is value in the words told about DVCS here. And I see your point as well: why change if it works for now? I just wish you had made those points in a more polite manner.
For what it's worth (and not to imply there's anything wrong with what Attila wrote), I didn't find the quoted passages rude, and I thought Steven made good points. Furthermore, while I still disagree with him, I also understand why Steven feels the way he does. The beauty of Git eluded me until I had actually used it a bit, and I wished someone could explain why it was a big deal in terms that I thought held water logically.
Here is one logically-watertight fact, FWIW:
A *lot* of what I do with a VCS involves looking at history, including other branches (merges are an example). When the VCS is Git, those operations are *fast*, because the repo is local, and because the Git developers make performance a top priority.
However, I don't expect that alone to change anyone's mind about Git. :-)
Here's something a bit more subjective, but I think also compelling: having to commit (relatively slowly) to a publicly-visible repository is a disincentive to exploratory development and a handicap when your current line of work gets interrupted.
First, there's the fact that your experimental changes go out to the world. One could argue that programmers should simply get over the fear that other people will judge them on the basis of that code, but it's simply human nature not to want to expose anything but one's best work. Moreover, especially if, like me, you prefer to make fine-grained commits, it is slow to negotiate with a server for each little thing you want to do.
It's true.
So the incentives associated with SVN encourage you to keep changes on your local disk, with no logical separation or comment about their meaning until you're ready to show them. Working on a fix for something and need to handle an emergency somewhere else? Do you check in what you're working on and go back to a clean state to handle that emergency? With SVN I almost never did. With Git it's easy and fast, and doesn't expose incomplete work to the world, so I always do.
It's also true. It's also true that git has some features that help with that. But it does not mean the Boost repository should be git. I am primarily using git to keep local patches against various projects which use Subversion and CVS. In some cases, it's a substantial convenience. E.g. when a patch review turnaround can be a week, it's nice to put a sequence of patches on a git branch. Note, however, that git is not necessarily the only, or the best, answer. Some folks who are way more effective than I am at dealing with patch series use quilt, and it works just fine despite not having any hype around it. And, if you prefer git, you can use git-svn just fine. This will solve all your issues above. It is true that having Boost use git will make it slightly more convenient to use git on your computer. However, only slightly more. And the downsides are:

- Transition costs
- Ongoing problems for new folks who are not familiar with git
- Quirky ideas about how you should do version control that git appears to enforce on its users
- That cherry-pick deficiency
- And, finally, a serious risk that as soon as Boost switches to git, everybody will get excited, create 100 clones everywhere, and everybody will spend days sending and merging pull requests, while there will be no official version. This is, of course, a process problem, but given our track record of ignoring process problems and focusing on not-too-important discussions, I bet we would not be able to fix this.
So if you wonder why I'm trying to upend Boost's infrastructure and process, it's because too much has become a disincentive to participation (for me and for others, though obviously not everyone).
Why don't you try git-svn, first? - Volodya -- Vladimir Prus Mentor Graphics +7 (812) 677-68-40
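For reference, the basic git-svn round trip Volodya is pointing at looks roughly like this (the URL is illustrative):

  # one-time import of an svn trunk into a local git repo
  git svn clone https://svn.example.org/svn/project/trunk project
  cd project

  # hack on local branches, committing as often as you like, then:
  git svn rebase    # pull in new svn revisions, replaying local work on top
  git svn dcommit   # push finished commits back to the svn server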

At Wed, 02 Feb 2011 12:12:10 +0300, Vladimir Prus wrote:
So if you wonder why I'm trying to upend Boost's infrastructure and process, it's because too much has become a disincentive to participation (for me and for others, though obviously not everyone).
Why don't you try git-svn, first?
Don't worry, I have. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 2/2/11 4:12 AM, in article iib75a$v6a$1@dough.gmane.org, "Vladimir Prus" wrote:
So the incentives associated with SVN encourage you to keep changes on your local disk, with no logical separation or comment about their meaning until you're ready to show them. Working on a fix for something and need to handle an emergency somewhere else? Do you check in what you're working on and go back to a clean state to handle that emergency? With SVN I almost never did. With Git it's easy and fast, and doesn't expose incomplete work to the world, so I always do.
It's also true. It's also true that git has some features that help with that. But, it does not mean Boost repository should be git.
I am primarily using git to keep local patches against various projects which use Subversion and CVS. In some cases, it's a substantial convenience. E.g. when a patch review turnaround can be a week, it's nice to put a sequence of patches on a git branch. Note, however, that git is not necessarily the only, or the best, answer. Some folks who are way more effective than I am at dealing with patch series use quilt, and it works just fine despite not having any hype around it. And, if you prefer git, you can use git-svn just fine. This will solve all your issues above.
It is true that having Boost use git will make it slightly more convenient to use git on your computer. However, only slightly more. And the downsides are:
- Transition costs
- Ongoing problems for new folks who are not familiar with git
- Quirky ideas about how you should do version control that git appears to enforce on its users
- That cherry-pick deficiency
- And, finally, a serious risk that as soon as Boost switches to git, everybody will get excited, create 100 clones everywhere, and everybody will spend days sending and merging pull requests, while there will be no official version. This is, of course, a process problem, but given our track record of ignoring process problems and focusing on not-too-important discussions, I bet we would not be able to fix this.
Been following along silently since the beginning. Just a couple of comments here: There are ongoing problems with new users and SVN. Boost is the ONLY project that I use that uses SVN. Every time I want to hack on boost I have to dig out my notes and get back up to speed on SVN. What is stopping anyone from creating their own SVN server with Boost on it? A simple Google search will point the way to the "Official" boost SVN repository. So neither one of those arguments is really valid in my mind. I don't have enough experience to comment on the other issues that you raise. Mike Jackson

AMDG On 2/1/2011 11:12 PM, Attila Feher F wrote:
I am following this discussion, so far silently, for some time now. I see very good descriptions about distributed version control and the way it enables more flexibility. And then I see your words Steven. And I am sorry to say that it makes me feel sad to read your words. People are kind and helpful, trying to explain to you what they mean, and why they think it is a good idea to use distributed version control. You get information/experience from people who use it daily and see the advantages. This, in itself, is a good thing. Even if git will never be used for Boost.
However when I read your words, I get the feeling that you are not listening. Not only that, but you talk to people in an abrupt, unfriendly, even adversarial manner. (Thank you Google for spelling that for me.)
To me it seems that you argue with what you think is an "attack" by "git dogmatics" on SVN and your way of working. I feel that you are arguing with your preconceptions about what the others think, and during that "heroic fight against irrelevance" you manage to quickly write off arguments, and do that with an attitude.
Since it makes me feel bad (and I already spend too much on antidepressants) I cannot stop myself from writing to ask you to please change that tone. Nobody is attacking you. Nobody says your ways or the Boost ways are faulty. Nobody accuses you of anything bad. All that people have said is that (after extensive use of distributed VCS) they find it a more useful tool than centralized VC. It makes *their* work easier, and it enables *them* (and according to them many others) to contribute with more *ease*. And how easy something is is, of course, a matter of perception -- something that is not objective.
I find this DVCS vs. CVCS argument very similar to the assembler vs. any-higher-level-language argument of the '80s, and then the same with C vs. C++ later, etc.
I honestly believe that there is value in the words told about DVCS here. And I see your point as well: why change if it works for now? I just wish you had made those points in a more polite manner.
FWIW, I do understand that a DVCS has some advantages. I just think that it's being blown way out of proportion. I don't have any particular attachment to svn, although I object to changing it on general principles, since switching will cause a massive disruption for a little while, no matter how well it's managed. In Christ, Steven Watanabe

AMDG On 1/29/2011 1:50 PM, Dave Abrahams wrote:
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus. There's no reason anyone should take my word for this, but I didn't really "get it" about DVCSes until I actually tried using Git for a while. Something about it changes the user experience drastically in ways that are simply not obvious until you've gotten used to it.
I've noticed. Using Git seems to incapacitate people from using any other version control tool. I think it should be banned as a public hazard. In Christ, Steven Watanabe

On Mon, Jan 31, 2011 at 8:01 AM, Steven Watanabe
AMDG
On 1/29/2011 1:50 PM, Dave Abrahams wrote:
The fact that you can quickly try doing it several different ways without affecting the "official repo" is a big plus. There's no reason anyone should take my word for this, but I didn't really "get it" about DVCSes until I actually tried using Git for a while. Something about it changes the user experience drastically in ways that are simply not obvious until you've gotten used to it.
I've noticed. Using Git seems to incapacitate people from using any other version control tool. I think it should be banned as a public hazard.
I OTOH pity the ones stuck with subversion. Learning and leveraging Git effectively is like discovering the world is really round and that there's actually a different and largely better way of working with code. How does that saying go, once you go black... -- Dean Michael Berris about.me/deanberris

Hi,
I'm not a boost contributor (yet) and can maybe be considered a novice, so take my advice knowing that.
On Sat, Jan 29, 2011 at 11:08, John Maddock
* I happen to like the fact that SVN stores things *not on my hard drive*, it means I just don't have to worry about what happens if my laptop goes belly up, gets lost, stolen, dropped, or heaven forbid "coffeed". On the other hand the "instant" commits and version history from a local copy would be nice...
I have the same kind of concerns. I've only started using Mercurial (hg) in the middle of last year and I'm not an expert, but the decentralized way of doing things changes my point of view a lot. (I used SVN before.) The decentralized nature of the repositories moves the organization responsibility from the tool (as in SVN) to the user (as in any DVCS). What I mean is that as every developer has a full repository, or several clones, the communication of changes between repositories is driven only by how the contributors/team organize themselves. On this specific point -- having a non-local clone of your work -- here is what I'm doing for all my projects now:

1. On my laptop, which I use in transit, I have one repository of my project. You could say it's my "trunk", coming from the SVN world.

2. On the same laptop, I have several clones of repo 1. In each, I work on experimental features that I often just delete after some tests. If the feature is good enough, I "push" the changes (the commits that have been made in that local repo) into repo 1. That means I sometimes have to merge first, as I still work in repo 1 for the main features, so I kind of work like a big team, alone. Anyway, I can already fork for myself with decentralized repositories, which lets me experiment more easily.

3. I have another repo on my home desktop. That's a clone of 1 with some other work that requires several computer screens, so I work on those features mainly on my desktop. In fact I switch from laptop to desktop often, and often make sure both are synchronised. I don't know how it's done in git, but hg (using TortoiseHg or not) can set up an HTTP server for you that will listen for pull and push requests. So it's super easy to transfer changes between my computers, from 1 or 2 or 3, to any one of them. In fact, I've started to build a kind of hierarchical relationship between my repos (which becomes natural once you understand the potential of "cloning"/"forking"). So I have two computers as various backups.

4. I don't trust my hardware very much, and sometimes I need changes from my desktop on my laptop but I'm too far from it, like in another city. So I also keep another repo on my online server (an ubuntu box that I rent mostly for websites and an SVN repo, nothing special). To this repo I push changes from laptop and desktop that are not experimental. By the way, hg works over ssh, so it's incredible to see that you just have to clone a repo to make it available online via ssh (if it's not public, or with more security if it is). You don't have to launch any server (other than the one managing ssh). My online server is kind of a more secure backup that is available from anywhere.

5. I also have other repos on my servers for personal work that is experimental but takes a lot of time. So I still have an online, available backup.

6. I have some friends who want to work with me on my project. I've set up clones for them on my private server, one for each user. They can work however they want on their local computers and simply push their changes to their dedicated repository on my server. When they think they have made work that is usable and complete, and it is all pushed to their repo, they mail me so I can review their changes and comment, and if all is fine I pull the changes into the "truth" repo that merges all the team effort. That truth repo is in fact a clone of mine that I call the "team" repo. But anyway, I can call it what I like; it's just an organisation matter -- like setting up the "graph" of your team :)

7. In fact, some of my projects are open source (not worth showing here, and it's not the point), so I have additional repositories on bitbucket.org and Google Code hosting. That way, anyone can act like a teammate from my private projects. It's just that the repository is available publicly with read-only access. To "write", people have to ask me to review their code; then, if I think it's worthwhile and follows my standards for the project, I pull the changes and add a line somewhere about who contributed what. For open source projects, the public repos are my "trust" repos, where I pull all work that is valid and finished. Note that bitbucket and github allow you, with "one click" as pointed out by Dean, to clone a repo. That allows you to work on your version for a time, pull public changes while you still work on your features, then ask the maintainer of the truth repo to take your changes. Or not.

8. I don't trust all hosting services, so I've set up scripts that pull all public and private changes into separate backup repos.

9. I haven't even started to talk about branching. Branching is kept in the repository's history, while forking is not. I have branches for features that are required, not experimental, but take a long time to implement.

So the worry about the safety of backups disappears, because it's easier to set up hosting for any DVCS than for SVN (there is no need for a server to be listening). Some other points in my description might help to explain why decentralisation changes everything when coming from SVN. Someone (Dean?) made this analogy some time ago: SVN/CVS are like mutexes over a container, while DVCSs are like lock-free containers. That's, I think, the most accurate analogy for the differences between the two. I had not used a DVCS for years because I thought it was not worth it in a team environment, "because" of the decentralized nature; in fact I tried it only because it seemed good for one-person projects, in my mind. I must say I was totally wrong; it's the exact opposite. That said, I'm just a "junior", so again take my experience as such. Joël Lamotte
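A minimal sketch of the laptop/desktop synchronisation described in points 3 and 4 above, assuming Mercurial; host names and paths are illustrative:

  # on the laptop: serve the repo over HTTP for the local network
  hg serve --port 8000

  # on the desktop: clone from the laptop, or pull later updates
  hg clone http://laptop.local:8000/ project
  hg pull http://laptop.local:8000/

  # push finished work to the backup box over ssh; no daemon needed there
  hg push ssh://user@myserver.example.org/repos/project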

[This reply does not advocate moving boost to git.] On Friday 28 January 2011 10:46:17 John Maddock wrote:
I still haven't heard from the git proponents, what's wrong with using git-svn to manage a local - i.e. distributed - git repository, and then periodically pushing changes to SVN. In other words working with git just as you normally would, except for having to type "git svn" from time to time? This isn't a rhetorical question BTW, I've never used either git or git-svn, so I clearly don't know what I'm missing ;-)
The strength of git (and other DVCSs) comes from the ability to create essentially unlimited local branches and share/merge said branches with local branches of other parties. git-svn does not work with this model if you want the ability to push changes into SVN.

The fundamental problem with git-svn
------------------------------------

One cannot clone a git repository (created with git-svn) and then use the resulting repository to contribute to the original SVN repository without the use of hacks. Two points from the git-svn manual:

1. git clone does not clone branches under the refs/remotes/ hierarchy or any git svn metadata, or config. So repositories created and managed using git svn should use rsync for cloning, if cloning is to be done at all.

2. For the sake of simplicity and interoperating with a less-capable system (SVN), it is recommended that all git svn users clone, fetch and dcommit directly from the SVN server, and avoid all git clone/pull/merge/push operations between git repositories and branches.

The two points above essentially make git-svn worthless for use as a *distributed* VCS, since cloned repositories are neither equally capable nor can they be used to follow the develop/push/pull methodology of DVCSs. The strength of DVCSs is directly related to (a somewhat convoluted, but very useful, definition of) zero barrier to entry: any changes you make in your clone are equal in terms of publishability with those in anyone else's clone; git-svn cannot handle this model because SVN cannot handle it. Hope this helps. Regards, Ravi
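Concretely, the workflow the manual recommends looks roughly like this; each contributor talks to the SVN server directly and shares work as patches rather than by git push/pull (URLs and branch names are illustrative):

  # every contributor bootstraps from SVN, not from someone else's git-svn clone
  git svn clone https://svn.example.org/repo/trunk work
  cd work
  git checkout -b topic    # develop on a local topic branch

  # share the work as mailable patches instead of pushing the repo around
  git format-patch git-svn..topic

  # a reviewer applies them with 'git am' and pushes to SVN with 'git svn dcommit'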

Dave Abrahams wrote:
Except that there are interdependencies among some of the libraries. How many build tools should you need in order to install Boost.Serialization?
Currently boost serialization needs only one tool to build/test: bjam. The rest of the dependencies are header-only. I don't know, but boost serialization might be buildable/testable with CTest. I don't think that's a huge hill -- certainly not bigger than the current one. (Aside: starting in 1.46, testing the serialization library also depends on the filesystem library, which is also compiled.)
I see the centralized functions being limited to:
a) reviews/certification
b) accumulation of testing results
c) coordination/maintenance of standards (a-d above)
d) promotion of developer practices compatible with the above (licenses, etc.)
Suppose such an environment existed today. The whole issue of moving to git wouldn't be an issue. Each library author could use which ever system he preferred. Movement to git could proceed on a library by library basis if/when other developers were convinced it was an improvement. It would be one less thing to spend time on.
Or, it could be one more thing to spend time on.
Standardization != coordination, and while coordination can slow things down, standardization brings efficiencies.
It is unbelievably painful for me to say this, but I think we're in agreement here. I'm proposing that we "standardize" things like namespaces, directory structure, testability requirements, documentation requirements, platform support requirements, etc., BUT that we try to move away from having to coordinate so closely - e.g. making a giant release on a specific date. Robert Ramey

On Fri, 28 Jan 2011 13:35:19 -0800, Robert Ramey
I'm proposing that we "standardize" things like namespaces, directory structure, testability requirements, documentation requirements, platform support requirements, etc.
+1 As a mere user of boost, I believe such client-facing standardization will go a long way in easing the adoption and learning of boost libraries. Mostafa

On Fri, Jan 28, 2011 at 4:35 PM, Robert Ramey wrote:
Standardization != coordination, and while coordination can slow things down, standardization brings efficiencies.
It is unbelievably painful for me to say this, but I think we're in agreement here.
Yeah, it's actually giving me a headache and a nasty case of bursitis.
I'm proposing that we "standardize" things like namespaces, directory structure, testability requirements, documentation requirements, platform support requirements, etc.
and, I'm suggesting, tools such as VCSes, build/test infrastructure, etc.
BUT that we try to move away from having to coordinate so closely - e.g. making a giant release on a specific date.
i-think-i-may-have-slipped-a-disc-too-ly y'rs -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
On Fri, Jan 28, 2011 at 4:35 PM, Robert Ramey wrote:
In the interest of moving toward a better modularization of boost and decoupling of libraries, I would like to make a suggestion: can we change library testing so that each library is tested against the current release branch of all the other libraries?

When I test on my own machine, I don't test against the trunk; I test against the current release tree. Another way of saying this: on my current machine I have my directory tree set to the boost release branch, and ONLY the serialization library directories are set to the trunk. So when I test on my local machine I KNOW that when I merge into the release branch I won't have any unexpected problems.

a) I don't think this would be a huge change.
b) It would better reflect what the user does.
c) It would isolate each library from the others, so that errors in the trunk (wild west) in one library don't impact the development of other libraries.
d) It would promote decoupling of libraries.
e) As an intermediate step, not all testers would have to make the change - some could use the old script.

It would be a small, not too difficult step on the path we think we want to travel. Robert Ramey
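For the curious, this mixed working copy can be set up mechanically with svn switch; a sketch, with URLs and paths that may differ from the actual Boost repository layout:

    # the working copy tracks the release branch...
    svn checkout http://svn.boost.org/svn/boost/branches/release boost
    cd boost
    # ...except the serialization library, which is switched to the trunk
    svn switch http://svn.boost.org/svn/boost/trunk/libs/serialization libs/serialization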

Robert, great questions. Could you post them to the boost developers' list? They really don't belong here. Thanks.
On Sat, Jan 29, 2011 at 12:57 PM, Robert Ramey wrote:
Can we change library testing so that each library is tested against the current release branch of all the other libraries? [...]
-- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Thu, Jan 27, 2011 at 5:52 PM, Beman Dawes wrote:
Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I'm already a convert to DVCSs, so in principle migration to Git seems like a good thing to me, but on a practical level I find I can't run through the Ryppl 'Getting Started' guide, because our corporate internet gateways block or restrict the git protocol. Dropping back to http only works for the first step, presumably because the submodule download reverts to the git protocol. I recognise, of course, that this is a local issue for me, but I imagine I will not be alone. - Rob.
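If the gateway blocks only the git:// protocol, one common workaround is to tell git to rewrite such URLs to HTTP before fetching, which also covers submodules; a sketch, assuming the repositories live on GitHub:

    # rewrite git:// URLs to http:// for all fetches, including submodules
    git config --global url."http://github.com/".insteadOf git://github.com/
    # then retry the failing step
    git submodule update --init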

On 28/01/11 07:58, Robert Jones wrote:
On Thu, Jan 27, 2011 at 5:52 PM, Beman Dawes wrote: Independent of modularization, ryppl, or anything else, is it time to start a discussion on the main list about moving to Git?
I'm already a convert to DVCS's, so in principle migration to Git seems like a good thing to me, but on a practical level I find I can't run through the Ryppl 'Getting Started' guide because our corporate internet gateways block or restrict the git protocol. Dropping back to http only works for the first step, presumably because the submodule download reverts to the git protocol.
I recognise, of course, that this is a local issue for me, but I imagine I will not be alone.
You are not alone Robert. I think HTTPS *and* at least read-only access through HTTP is a must. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org

On Fri, Jan 28, 2011 at 5:24 PM, Mateusz Loskot wrote:
You are not alone Robert. I think HTTPS *and* at least read-only access through HTTP is a must.
Git does support both -- if it's on Github, you get it for free. For "pushing" stuff to other people's repository, there's a way to send the changesets as email -- git-am I believe is the term to Google. :) HTH -- Dean Michael Berris about.me/deanberris

On Fri, Jan 28, 2011 at 5:26 PM, Dean Michael Berris wrote:
For "pushing" stuff to other people's repository, there's a way to send the changesets as email -- git-am I believe is the term to Google. :)
Sorry, git-am is for applying changesets from a mailbox/email; git format-patch is the way to format patches as emails. :) -- Dean Michael Berris about.me/deanberris
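Concretely, the email-based exchange Dean describes works roughly like this (file name illustrative):

    # contributor: turn the last three commits into one mailable patch file
    git format-patch -3 --stdout > my-changes.patch
    # (or use git send-email to mail the patches directly)
    # maintainer: apply the emailed patches, preserving authorship
    git am < my-changes.patch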

Eric Niebler wrote:

A lot of work remains --- that is, if it's also our intention to modularize boost and have a functioning cmake build system, too. At least, modularization seems like it would be a good thing to do at the same time. And nobody is working on a bjam system for modularized boost.

*** I don't believe there is any reason to couple modularization of boost to any particular build system. I use bjam to build and test the serialization library on my local machine: I just set the current directory to libs/serialization/test and run bjam with some switches. This builds/updates the prerequisites of the serialization library, builds the serialization library, then builds and runs the tests (and, in my case, builds a table of test results, since I use library_status.sh). I would expect that I could do the same with CTest.

The key issue is that the build system permit the building of just one "module" (and its necessary prerequisites). Bjam (and hopefully CTest) does this now. Building "all" of boost is just the building of each module. Building some alternative "distribution" is just the building of each of the component modules (and their prerequisites). There isn't even any reason why each module has to use the same build system.

<idle speculation> Is it feasible to have both git and svn development going on simultaneously? Two-way synchronization from non-modularized svn boost to modularized git boost? Is that pure insanity?

*** By the same token, a "modularized" boost needn't require that all modules use the same source control system. Ideally, the build for each module would check out/update the local copy of the module according to the "configuration file" (...v2 or ctest.?). Once the procedure for building a module is moved into the module, rather than invoked "from the top", modularization can proceed incrementally.

Robert Ramey
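For anyone who has not tried it, the per-module build Robert describes amounts to the following (toolset and variant illustrative):

    # from the root of a boost checkout
    cd libs/serialization/test
    # builds the library's compiled prerequisites, the library itself,
    # then builds and runs its tests
    bjam toolset=gcc variant=debug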

[Redirecting replies to the boost developers' list; we should have been there nearly from the beginning. Anyone who wants to see the earlier parts of the thread should look to http://groups.google.com/group/boostusers/browse_thread/thread/6d0d01eb3cac4...] Hi Robert, Could you please use standard quoting? I am having trouble separating the parts you wrote below from what Eric wrote. Thanks, Dave At Tue, 25 Jan 2011 09:06:19 -0800, Robert Ramey wrote:
[...]
-- Dave Abrahams BoostPro Computing http://www.boostpro.com

At Mon, 24 Jan 2011 22:36:02 -0500, John Wiegley wrote:
Dave Abrahams writes:
Exactly. I'm 98% of the way toward a branchified sequence today.
Awesome!
OK, branchification is working! All that's left is submodulization as part of the same run.
What do you mean by "submodulization?" I don't think we'll end up using Git submodules much in the end, but I can imagine why you'd want to do that now.
This will actually not be very difficult, just time consuming to run. I'll use Eric's manifest.txt file, plus 'git log --follow -C --find-copies-harder' on each element of each submodule, run against the flat history. The man page says this is an O(N^2) operation -- where N is very large in Boost's case -- so I may end up having to do some pruning to keep it from getting out of hand.
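A sketch of that extraction loop, assuming a two-column manifest mapping each file to its module (the real manifest format may differ):

    # for each file assigned to a module, walk the flat history
    # following renames (--follow) and copies (-C --find-copies-harder)
    while read module path; do
        git log --follow -C --find-copies-harder --oneline -- "$path" \
            > "histories/$module.log"
    done < manifest.txt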
Actually, this script is already too slow, so I'm rewriting it in C++ today, both for the native speed increase (10x so far, for dump-file parsing) and because it lets me use libgit2 (https://github.com/libgit2) to create Git objects directly, rather than shelling out to git-hash-object and git-mktree over a million times. That alone takes over 15 hours on my Mac Pro. Don't even ask how long the git gc takes to run! (It's longer.)
If anyone wonders whether my process -- which works for any Subversion repo, btw, not just Boost -- preserves more information than plain git-svn: consider that my branchified Git has just over one million Git objects in it, while the boost-svn repository on ryppl has only 593026 right now. That means over 40% of the repository's objects were dropped on the cutting-room floor by git-svn's heuristics.
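Anyone who wants to compare object counts on their own clones can read them off directly:

    # counts loose and packed objects in the current repository
    git count-objects -v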
Yay, John! :-) -- Dave Abrahams BoostPro Computing http://www.boostpro.com
participants (33)
- Anthony Foiani
- Anthony Williams
- Attila Feher F
- Beman Dawes
- Christopher Jefferson
- Dave Abrahams
- Dean Michael Berris
- Diederick C. Niehorster
- Edward Diener
- Eric J. Holtman
- Eric Niebler
- Felipe Magno de Almeida
- Frédéric Bron
- Joel Falcou
- Joel.Falcou@lri.fr
- John Maddock
- John Wiegley
- Klaim
- Mateusz Loskot
- Mathieu -
- Matthieu Brucher
- Michael Jackson
- Mostafa
- Raindog
- Ravi
- Robert Jones
- Robert Ramey
- Russell L. Carter
- Scott McMurray
- Sebastian Redl
- Steven Watanabe
- Ted Byers
- Vladimir Prus