
On 29 Nov 2013 at 16:26, Beman Dawes wrote:
When we talk about renormalisation, we're talking about the procedure described in the gitattributes man page.
Unfortunately that procedure would corrupt many files within Boost.
I've tested, on Windows and Linux, without apparent problems. The most files modified (152) are on master. Here is the list:
I looked through that list - it doesn't seem to contain anything from Sandbox? My original Boost2Git EOL conversion patch was developed against trunk only i.e. not including sandbox. That conversion, to my knowledge, was completely lossless, even when going right back to the beginning. The problem was Sandbox really. Dave spotted that my patch fatal exited when Sandbox was included, and it was an assertion check for an "impossible" condition. Things got worse the more I worked around problems. I eventually concluded we would have to accept data loss if we were converting all of history, and upcalled the decision to Dave and Daniel. One option I did posit was to convert the last three years of history, and flatten everything before that. We could have done a perfect conversion of the last three years easily enough, and it was my personally preferred option. I think we're beyond that now though.
Off the top of my head, you'll need to watch for the following (this list is incomplete): [snip] None of the modified files fit in that category.
I checked tools/inspect/wrong_line_ends_test.cpp to see why it wasn't normalized, and the reason was simple. It had already been normalized! That isn't worrisome - inspect is a tool that will need tuning for git anyhow.
The problematic files were mostly in Sandbox. I found little issue with trunk. I think because stuff in trunk had to pass peer review, people didn't do unwise commits.
.pdf is in .gitattributes, so no problem. I checked a couple of .pdf files to be sure, and adobe reader opens them without problems.
Yes, I made sure .pdf was in gitattributes specifically to deal with that problem. I still think a test which sweeps the first 8Kb of all files in Boost which don't have extensions in .gitattributes is a very good idea.
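For what it's worth, here is a rough sketch in Python of the kind of sweep I mean. Treat it as an illustration under assumptions rather than a finished tool: the function names are made up for this example, it only understands simple `*.ext` patterns in .gitattributes, and it treats "a NUL byte in the first 8Kb" as the sign of a binary file, which is my guess at a reasonable heuristic and not something we have agreed on.

```python
import os
import re

SWEEP_BYTES = 8 * 1024  # "first 8Kb" of each file, as suggested above


def covered_extensions(gitattributes_text):
    """Collect extensions named by simple '*.ext' patterns.

    Assumption: any other pattern form (paths, character classes,
    bare '*') is ignored by this sketch.
    """
    exts = set()
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern = line.split()[0]
        match = re.fullmatch(r"\*(\.[A-Za-z0-9_+-]+)", pattern)
        if match:
            exts.add(match.group(1).lower())
    return exts


def looks_binary(head):
    """Assumed heuristic: a NUL byte in the sampled prefix means binary."""
    return b"\0" in head


def sweep(root, exts):
    """Return files under root that look binary but whose extension
    is not covered by .gitattributes."""
    suspects = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into the repository data
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in exts:
                continue  # already handled explicitly in .gitattributes
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                head = f.read(SWEEP_BYTES)
            if looks_binary(head):
                suspects.append(path)
    return sorted(suspects)
```

You would feed it the contents of the top-level .gitattributes and the root of a checkout, then eyeball whatever it reports. Anything listed is a file that looks binary but would be left to automatic text detection, which is where I would go looking for silent corruption first.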
We never dealt with these issues during conversion, and there are probably more we don't know about yet. This is why I said Boost is not ready to do the transition - plus too few want to do the manual labour involved in achieving a "perfect" conversion and just want "someone else" to do the tedious work for them.
I'm sure there are problems we don't know about. But we have done enough testing to know that vast numbers of files were converted correctly, that passing tests on both trunk and branches/release still pass, and that the small number of minor problems we have found are not even close to being showstoppers.
To delay further just because of FUD will be harmful.
Caution is not FUD. My biggest single worry has always been lack of testing of the validity of the conversion. I personally hate doing anything irreversible which has not been tested to destruction. The decision is out of my hands of course. And I have always seen your point about getting on with it Beman, it's just I am personally much more cautious in this type of situation (equally I am far less cautious than many on this list in many other situations). Also, it's entirely possible much more testing has been done than I am aware of - after all I am not on the steering committee. If that is so, my caution is unwarranted and I have been talking out of my ass.
Thanks for your list of possible problem areas. It gave me some additional things to look for.
You're welcome.

Niall

--
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/