On 15 Oct 2013 at 17:52, Beman Dawes wrote:
It's the hideous complexity of doing a perfect conversion, not specific issues.
We are not trying to do a perfect conversion. We are trying to do a reasonably good conversion and then move on.
We want files to appear in the right places, we want to retain history at least for trunk and branches/release, and we want a few other things. But mostly we want to move on and put the conversion behind us. Remember that we will be doing nothing that degrades the svn repo. It will be available for years and years into the future.
Neither Dave nor Daniel appears to think that anything less than a perfect conversion will be acceptable to the Boost community. Everyone here knows how loony I think a perfect conversion is without Dave and Daniel being paid at least the going contracting rate for all those free hours by those who demand a perfect conversion.
One way I have not previously mentioned here is rebuilding the existing SVN repo to rectify unfortunate historical commits which make perfect git conversion very hard. I know Dave is opposed to this solution, and it certainly launches yet another new tool for converting SVN to cleaner-SVN. But it might be less work overall.
Less work than what? I can't see how adding an additional step to build a new svn repo and then converting that could possibly be less work than finishing off the work that has already been done, and getting on with the actual conversion.
Beman it isn't as easy as that. In git you can't modify any past commit without completely invalidating every single commit thereafter. If after launch someone finds that Boost2Git corrupted some file committed back in 2002, to fix that commit would require rebuilding *every single commit* from 2002 until now.

I *absolutely* agree that this is crazy, and in my opinion we should draw a line under how far back the git conversion should go and flatten history before that. Let's say three years: remember, every single commit in the past three years will need checking out and a full build and set of unit tests run, with any differences walked through by human eyes [1].

[1]: Why not just compare the checkouts for equivalence which is much quicker? Because git's checkout and svn's checkout can never be equivalent [2], so Boost2Git employs heuristics to have git checkout something which ought to build and pass unit tests.

[2]: Why aren't svn and git checkouts equivalent? Because some files - and we don't know which ones exactly - in svn are incorrectly committed [3], and "just happen" to work in svn but won't in git. Boost2Git fixes some of these up during conversion, hence the loss of bitwise equivalence in the hope of gaining *semantic* equivalence.

[3]: This is what I meant by rebuilding the svn repo: if we rebuilt it fixing up all the files which were incorrectly committed in the first place, this solves the mismatch between Boost svn and git.
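To illustrate why repairing one old commit forces every later commit to be rebuilt, here is a toy sketch in Python. It is not how git actually computes commit IDs (real commit objects also hash the tree, author, committer and message); the functions commit_id and build_history are hypothetical names made up for this illustration. The only property it models is that each ID depends on its parent's ID.

```python
import hashlib


def commit_id(parent_id: str, content: str) -> str:
    """Toy commit ID: SHA-1 over the parent ID plus this commit's content.

    Assumption for illustration only; real git hashes a fuller commit object.
    """
    return hashlib.sha1(f"parent {parent_id}\n{content}".encode()).hexdigest()


def build_history(contents):
    """Return the chain of toy commit IDs for a list of commit contents."""
    ids = []
    parent = ""  # the first commit has no parent
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


original = build_history(["2002: add file", "2008: fix bug", "2013: release"])

# Repair the oldest commit: every later ID changes too.
fixed_oldest = build_history(
    ["2002: add file (repaired)", "2008: fix bug", "2013: release"]
)
changed_from_oldest = [a != b for a, b in zip(original, fixed_oldest)]

# Repair only the middle commit: the older commit keeps its ID.
fixed_middle = build_history(
    ["2002: add file", "2008: fix bug (repaired)", "2013: release"]
)
changed_from_middle = [a != b for a, b in zip(original, fixed_middle)]

print(changed_from_oldest)  # [True, True, True]
print(changed_from_middle)  # [False, True, True]
```

In this model only commits at or after the repaired one get new IDs, which is why how far back the conversion reaches determines how much history a later fix would invalidate.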
<snip> ... three possible approaches ... </snip>
All I'm seeing is additional complexity and few if any benefits.
It is one thing to fix any remaining glitches, but quite another to re-engineer the whole process because there is fear it might have hidden problems.
I think that's already happened before, as Boost2Git is at least the second major attempt at this. And last time I looked (Dave and Daniel can confirm) the present Boost2Git output has known unknown hidden problems. Until someone does a commit-by-commit automated bisection to find out how reliable history is (I'm hazarding it will work until the SVN pre-commit filters which exclude bad commits were added), we actually don't know.

Niall

-- 
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/