Re: [boost] [conversion] Isolating the phantom file changes problem

1 Dec 2013

On 29 Nov 2013 at 16:26, Beman Dawes wrote:
...
...
...
...
When we talk about renormalisation, we're talking about the procedure
described in the gitattributes man page.
Unfortunately that procedure would corrupt many files within Boost.
I've tested on Windows and Linux without apparent problems. The largest
number of modified files (152) is on master. Here is the list:
I looked through that list - it doesn't seem to contain anything from 
Sandbox?

My original Boost2Git EOL conversion patch was developed against 
trunk only i.e. not including sandbox. That conversion, to my 
knowledge, was completely lossless, even when going right back to the 
beginning.

The problem was really Sandbox. Dave spotted that my patch exited with 
a fatal error when Sandbox was included: it tripped an assertion check 
for an "impossible" condition. Things got worse the more I worked around 
problems. I eventually concluded we would have to accept data loss if 
we were converting all of history, and upcalled the decision to Dave 
and Daniel.

One option I did posit was to convert the last three years of 
history, and flatten everything before that. We could have done a 
perfect conversion of the last three years easily enough, and it was 
my personally preferred option. I think we're beyond that now though.
...
...
Off the top of my head, you'll need to watch for the following (this
list is incomplete):
[snip]
None of the modified files fit in that category.
I checked tools/inspect/wrong_line_ends_test.cpp to see why it wasn't
normalized, and the reason was simple. It had already been normalized! That
isn't worrisome - inspect is a tool that will need tuning for git anyhow.
The problematic files were mostly in Sandbox. I found little issue 
with trunk. I think because stuff in trunk had to pass peer review, 
people didn't do unwise commits.
...
.pdf is in .gitattributes, so no problem. I checked a couple of .pdf files
to be sure, and adobe reader opens them without problems.
Yes, I made sure .pdf was in gitattributes specifically to deal with 
that problem. I still think a test which sweeps the first 8 KB of all 
files in Boost whose extensions are not listed in .gitattributes is a 
very good idea.
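
A minimal sketch of what such a sweep might look like, in Python. 
Everything named here is an assumption for illustration only: the 
helper names (covered_extensions, looks_binary, sweep) are 
hypothetical, only simple "*.ext" patterns in .gitattributes are 
recognised, and "looks binary" is reduced to "contains a NUL byte in 
the first 8 KB", which is a rough heuristic and not necessarily what 
git itself does.

```python
import os

SWEEP_BYTES = 8 * 1024  # only the first 8 KB of each file is examined


def covered_extensions(gitattributes_text):
    """Return extensions named by simple '*.ext' patterns.

    Assumption: any other pattern form (paths, character classes,
    bare names) is ignored by this sketch.
    """
    exts = set()
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern = line.split()[0]
        if pattern.startswith("*.") and "/" not in pattern:
            exts.add(pattern[1:].lower())  # "*.pdf" -> ".pdf"
    return exts


def looks_binary(head):
    """Crude heuristic (an assumption): a NUL byte means binary."""
    return b"\0" in head


def sweep(root, exts):
    """List files under root whose extension is not covered by exts
    and whose first 8 KB looks binary."""
    suspects = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into the repository
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in exts:
                continue  # extension already handled by .gitattributes
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                head = f.read(SWEEP_BYTES)
            if looks_binary(head):
                suspects.append(os.path.relpath(path, root))
    return sorted(suspects)
```

Any path it reports would be a candidate for an explicit entry in 
.gitattributes before EOL normalisation is allowed to touch it.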
...
...
We never dealt with these issues during conversion, and there are
probably more we don't know about yet. This is why I said Boost is
not ready to do the transition - plus too few want to do the manual
labour involved in achieving a "perfect" conversion and just want
"someone else" to do the tedious work for them.
I'm sure there are problems we don't know about. But we have done enough
testing to know that vast numbers of files were converted correctly, that
passing tests on both trunk and branches/release still pass, and that the
small number of minor problems we have found are not even close to
being showstoppers.
To delay further just because of FUD will be harmful.
Caution is not FUD. My biggest single worry has always been lack of 
testing of the validity of the conversion. I personally hate doing 
anything irreversible which has not been tested to destruction.

The decision is out of my hands of course. And I have always seen 
your point about getting on with it Beman, it's just I am personally 
much more cautious in this type of situation (equally I am far less 
cautious than many on this list in many other situations).

Also, it's entirely possible much more testing has been done than I 
am aware of - after all I am not on the steering committee. If that is 
so, my caution is unwarranted and I have been talking out of my ass.
...
Thanks for your list of possible problem areas. It gave me some
additional things to look for.
You're welcome.

Niall

-- 
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/

Niall Douglas