
On Fri, Nov 29, 2013 at 12:26 PM, Niall Douglas wrote:
On 29 Nov 2013 at 8:14, Beman Dawes wrote:
When we talk about renormalisation, we're talking about the procedure described in the gitattributes man page.
Unfortunately that procedure would corrupt many files within Boost.
I've tested on Windows and Linux without apparent problems. The largest number of modified files (152) is on master. Here is the list:
+1
There is also a nice discussion at
https://help.github.com/articles/dealing-with-line-endings#re-normalizing-a-...
I've forked the boost super repo and am testing the procedure now.
Off the top of my head, you'll need to watch for the following (this list is incomplete):
* Files with text file extensions not in ASCII or UTF-8. If you use simple EOL renormalisation with UTF-16 text for example, you'll corrupt that text.
None of the modified files fit in that category.
* Text files with intentionally mixed EOLs. You'll need to change their extension to something other than .txt (best), or add special exceptions to .gitattributes (brittle, I wouldn't recommend this option).
None of the modified files fit in that category. I checked tools/inspect/wrong_line_ends_test.cpp to see why it wasn't normalized, and the reason was simple. It had already been normalized! That isn't worrisome - inspect is a tool that will need tuning for git anyhow.
* Scan the first 8 KB of every file whose extension is marked as neither text nor binary in .gitattributes for zeros. If you don't find a zero, git will assume it is text and EOL normalise it. Unfortunately some binary file types such as PDF don't have zeros in their first 8 KB, so that would be very bad.
.pdf is in .gitattributes, so no problem. I checked a couple of .pdf files to be sure, and adobe reader opens them without problems. All of the modified files have extensions that are in .gitattributes, by the way.
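The three checks in the list above could be sketched roughly as follows. This is only an illustration of what the thread describes, not what git itself does; git's real text detection is more involved. The function names, the warning labels, the `has_explicit_attr` flag (meaning "this file's extension is already marked text or binary in .gitattributes") and the 8 KB window are all assumptions made for the sketch.

```python
import codecs

# Assumption: mirrors the "first 8 KB" figure quoted in the thread.
SNIFF_BYTES = 8 * 1024

# Byte-order marks that indicate UTF-16 or UTF-32 content.
WIDE_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def has_wide_bom(data: bytes) -> bool:
    """True if the data starts with a UTF-16 or UTF-32 byte-order mark."""
    return data.startswith(WIDE_BOMS)


def eol_styles(data: bytes) -> set:
    """Return the set of line-ending styles present: 'CRLF', 'LF', 'CR'."""
    crlf = data.count(b"\r\n")
    styles = set()
    if crlf:
        styles.add("CRLF")
    if data.count(b"\n") > crlf:   # LF not preceded by CR
        styles.add("LF")
    if data.count(b"\r") > crlf:   # CR not followed by LF
        styles.add("CR")
    return styles


def audit(data: bytes, has_explicit_attr: bool) -> list:
    """Hypothetical helper: list reasons a file deserves a manual look
    before EOL renormalisation."""
    warnings = []
    if has_wide_bom(data):
        warnings.append("utf16-or-utf32-bom")
    if len(eol_styles(data)) > 1:
        warnings.append("mixed-eol")
    # No explicit text/binary attribute and no zero byte in the sniffed
    # window: per the thread, such a file would be treated as text.
    if not has_explicit_attr and b"\0" not in data[:SNIFF_BYTES]:
        warnings.append("unmarked-no-nul")
    return warnings
```

To use it, read each candidate file in binary mode and pass the bytes to `audit`. An empty result means none of the three conditions was hit. It does not prove the file is safe to normalise, and UTF-16 text without a BOM is not detected at all.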
We never dealt with these issues during conversion, and there are probably more we don't know about yet. This is why I said Boost is not ready to do the transition - plus too few want to do the manual labour involved in achieving a "perfect" conversion and just want "someone else" to do the tedious work for them.
I'm sure there are problems we don't know about. But we have done enough testing to know that vast numbers of files were converted correctly, that passing tests on both trunk and branches/release still pass, and that the small number of minor problems we have found are not even close to being showstoppers. To delay further just because of FUD will be harmful. Thanks for your list of possible problem areas. It gave me some additional things to look for.
--Beman