[modular boost] non-linked headers

On Windows after the .\b2 headers step, there are these directories under the modular-boost/boost directory which are not symbolic links: detail, graph, numeric, pending. I can understand 'detail', but should not the others have been created as symbolic links to their appropriate libraries' header files?

On 12/02/2013 10:45 PM, Edward Diener wrote:
On Windows after the .\b2 headers step, there are these directories under the modular-boost/boost directory which are not symbolic links:
detail
It is because they have more than one source directory, so a symbolic link will not do what is needed. I do not think Windows is changing anything unless you are on a version not supporting symbolic links.

$ find libs -name detail | grep include/boost/detail
libs/optional/include/boost/detail
libs/detail/include/boost/detail
libs/thread/include/boost/detail
libs/config/include/boost/detail
libs/smart_ptr/include/boost/detail
libs/utility/include/boost/detail
libs/conversion/include/boost/detail
libs/graph/include/boost/detail
libs/dynamic_bitset/include/boost/detail
graph
$ find libs -name graph | grep include/boost/graph
libs/graph_parallel/include/boost/graph
libs/graph/include/boost/graph

$ diff boost/graph libs/graph/include/boost/graph/
Only in boost/graph: accounting.hpp
Only in boost/graph: distributed
Only in boost/graph: parallel
numeric
$ find libs -name numeric | grep include/boost/numeric
libs/numeric/interval/include/boost/numeric
libs/numeric/conversion/include/boost/numeric
libs/numeric/ublas/include/boost/numeric
libs/numeric/odeint/include/boost/numeric

numeric has multiple sub-libraries, e.g. ublas:

$ diff boost/numeric libs/numeric/ublas
Only in libs/numeric/ublas: bench1
Only in libs/numeric/ublas: bench2
Only in libs/numeric/ublas: bench3
Only in libs/numeric/ublas: bench4
Only in libs/numeric/ublas: bench5
Only in boost/numeric: conversion
Only in libs/numeric/ublas: doc
Only in libs/numeric/ublas: .git
Only in libs/numeric/ublas: .gitattributes
Only in libs/numeric/ublas: include
Only in libs/numeric/ublas: index.html
Only in boost/numeric: interval
Only in boost/numeric: interval.hpp
Only in boost/numeric: odeint
Only in boost/numeric: odeint.hpp
Only in libs/numeric/ublas: test
Only in boost/numeric: ublas
pending
$ find libs -name pending
libs/regex/include/boost/regex/pending
libs/disjoint_sets/include/boost/pending
libs/detail/include/boost/pending
libs/iterator/include/boost/pending
libs/graph_parallel/include/boost/pending
libs/graph/include/boost/pending
libs/dynamic_bitset/include/boost/pending
I can understand 'detail' but should not the others have been created as symbolic links to their appropriate libraries header files ?
HTH -- Bjørn

On 12/02/2013 11:29 PM, Bjørn Roald wrote:
$ diff boost/numeric libs/numeric/ublas
Wrong command, I was thinking:

$ diff boost/numeric libs/numeric/ublas/include/boost/numeric
Only in boost/numeric: conversion
Only in boost/numeric: interval
Only in boost/numeric: interval.hpp
Only in boost/numeric: odeint
Only in boost/numeric: odeint.hpp

-- Bjørn

On 12/2/2013 5:29 PM, Bjørn Roald wrote:
On 12/02/2013 10:45 PM, Edward Diener wrote:
On Windows after the .\b2 headers step, there are these directories under the modular-boost/boost directory which are not symbolic links:
detail
It is because they have more than one source directory, so a symbolic link will not do what is needed.
Symbolic links can be created to individual files also.
I do not think windows is changing anything unless you are on a version not supporting symbolic links.
$ find libs -name detail | grep include/boost/detail
libs/optional/include/boost/detail
libs/detail/include/boost/detail
libs/thread/include/boost/detail
libs/config/include/boost/detail
libs/smart_ptr/include/boost/detail
libs/utility/include/boost/detail
libs/conversion/include/boost/detail
libs/graph/include/boost/detail
libs/dynamic_bitset/include/boost/detail
Just looking at detail we are evidently creating duplicating files rather than creating symbolic links to files. For instance in libs/detail, which is a submodule, we have allocator_utilities.hpp. In boost/detail we have allocator_utilities.hpp. Let us suppose that the libs/detail/allocator_utilities.hpp gets updated in git. When this happens, how is the boost/detail/allocator_utilities.hpp supposed to be updated if it is an entirely separate file ?

On 12/03/2013 12:39 AM, Edward Diener wrote:
On 12/2/2013 5:29 PM, Bjørn Roald wrote:
On 12/02/2013 10:45 PM, Edward Diener wrote:
On Windows after the .\b2 headers step, there are these directories under the modular-boost/boost directory which are not symbolic links:
detail
It is because they have more than one source directory, so a symbolic link will not do what is needed.
Symbolic links can be created to individual files also.
Yes, but b2 headers creates hard links rather than symbolic links for individual files on platforms supporting hardlinks. If you have no simple way of checking whether you got hardlinks, simply try to modify one to see if you have a link or a copy. Windows has had hardlinks since at least XP, so if it does not work for you it is a bug, I think. b2 headers will fall back to a copy if that is the only thing supported, I think, or that is what it should do.
I do not think windows is changing anything unless you are on a version not supporting symbolic links.
$ find libs -name detail | grep include/boost/detail
libs/optional/include/boost/detail
libs/detail/include/boost/detail
libs/thread/include/boost/detail
libs/config/include/boost/detail
libs/smart_ptr/include/boost/detail
libs/utility/include/boost/detail
libs/conversion/include/boost/detail
libs/graph/include/boost/detail
libs/dynamic_bitset/include/boost/detail
Just looking at detail we are evidently creating duplicating files rather than creating symbolic links to files.
For instance in libs/detail, which is a submodule, we have allocator_utilities.hpp. In boost/detail we have allocator_utilities.hpp. Let us suppose that the libs/detail/allocator_utilities.hpp gets updated in git. When this happens, how is the boost/detail/allocator_utilities.hpp supposed to be updated if it is an entirely separate file ?
It won't get updated. And that is a major pain for any platforms that don't support links. But links should work on any "normal" development host computer these days. Look for hard links; on my Linux box:

$ cd boost/detail
$ touch bamse
$ ls -al
total 344
drwxrwsr-x 2 bjorn users  4096 Dec 3 01:03 ./
drwxrwsr-x 6 bjorn users 12288 Dec 2 23:08 ../
-rw-rw-r-- 2 bjorn users  2960 Dec 2 21:19 algorithm.hpp
-rw-rw-r-- 2 bjorn users  5216 Dec 2 21:18 allocator_utilities.hpp
-rw-rw-r-- 2 bjorn users   618 Dec 2 21:24 atomic_count.hpp
-rw-rw-r-- 2 bjorn users   644 Dec 2 21:25 atomic_redef_macros.hpp
-rw-rw-r-- 2 bjorn users   913 Dec 2 21:25 atomic_undef_macros.hpp
-rw-rw-r-- 1 bjorn users     0 Dec 3 01:03 bamse
-rw-rw-r-- 2 bjorn users  4414 Dec 2 21:18 basic_pointerbuf.hpp
-rw-rw-r-- 2 bjorn users  6358 Dec 2 21:18 binary_search.hpp
           ^ reference count

Note that the reference count is 2 for all but "bamse".
-- Bjørn
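The same link-count check can be scripted. This is only a minimal sketch, assuming a POSIX-like host, a filesystem that supports hard links, and Python 3; the file names are hypothetical and are not taken from the Boost tree.

```python
import os
import tempfile


def link_count(path):
    """Return how many directory entries (hard links) refer to the file."""
    return os.stat(path).st_nlink


def demo_link_count():
    """Create a hard-linked pair plus one plain file and report their link counts."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.hpp")  # hypothetical header
        linked = os.path.join(d, "linked.hpp")  # second name for the same file
        plain = os.path.join(d, "bamse")        # ordinary file with a single name
        for path in (source, plain):
            with open(path, "w") as f:
                f.write("// header\n")
        os.link(source, linked)                 # create the hard link
        return link_count(source), link_count(linked), link_count(plain)
```

Calling demo_link_count() returns the counts for the two linked names and the plain file, which corresponds to the second column of the ls -al listing.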

On 3 December 2013 00:08, Bjørn Roald
yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.

On 12/03/2013 01:11 AM, Daniel James wrote:
On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
yes that would be bad. Is this behavior documented somewhere? -- Bjørn

On 12/03/2013 01:17 AM, Bjørn Roald wrote:
On 12/03/2013 01:11 AM, Daniel James wrote:
On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
yes that would be bad.
Is this behavior documented somewhere?
not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind? I am surprised if there is a problem with this, but if it is we should re-consider the use of hardlinks. -- Bjørn

On 3 December 2013 00:32, Bjørn Roald
not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link. Also, soft links tell you where the file came from, and make it explicit that it's in multiple locations.

On 12/03/2013 01:48 AM, Daniel James wrote:
On 3 December 2013 00:32, Bjørn Roald
wrote: not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link.
In that case we should use symlinks for sure. The problem here is that the dependency in b2 for the link should catch this in the build, but not if the date stamp moves in the wrong direction. IBM clearmake solves this in ClearCase views, but we do not have that build tool. Using filetime "greater than" to detect dependency changes is a fundamentally broken hack used by almost all build tools. As far as I remember, symlinks to files are not supported on Windows prior to Vista; how much of a concern should that be? I guess copies are annoying for XP hosts, but not as devious as I see hardlinks could be. -- Bjørn
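To illustrate the point about "greater than" timestamp checks, here is a small sketch (not b2's actual dependency logic, which is not shown in this thread). It assumes Python 3 and simulates a source file whose content changed while its date stamp moved backwards; all names are hypothetical.

```python
import hashlib
import os
import tempfile


def stale_by_mtime(source, target):
    """Timestamp rule: the target is stale only if the source is newer."""
    return os.stat(source).st_mtime > os.stat(target).st_mtime


def stale_by_content(source, target):
    """Content rule: the target is stale whenever the bytes differ."""
    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    return digest(source) != digest(target)


def demo_backdated_source():
    """Return (stale_by_mtime, stale_by_content) for a changed but backdated source."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.hpp")
        target = os.path.join(d, "target.hpp")
        with open(target, "w") as f:
            f.write("// new version\n")
        with open(source, "w") as f:
            f.write("// old version\n")
        # Simulate the date stamp moving in the wrong direction:
        # the source differs from the target but carries an earlier timestamp.
        target_mtime = os.stat(target).st_mtime
        os.utime(source, (target_mtime - 3600, target_mtime - 3600))
        return stale_by_mtime(source, target), stale_by_content(source, target)
```

In this setup the timestamp rule reports the target as up to date even though the contents differ, while the content comparison notices the change.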

On 12/2/2013 8:06 PM, Bjørn Roald wrote:
On 12/03/2013 01:48 AM, Daniel James wrote:
On 3 December 2013 00:32, Bjørn Roald
wrote: not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link.
In that case we should use symlinks for sure. The problem here is that the dependency in b2 for the link should catch this in the build, but not if the date stamp moves in the wrong direction. IBM clearmake solves this in ClearCase views, but we do not have that build tool. Using filetime "greater than" to detect dependency changes is a fundamentally broken hack used by almost all build tools.
As far as I remember, symlinks to files are not supported on Windows prior to Vista; how much of a concern should that be? I guess copies are annoying for XP hosts, but not as devious as I see hardlinks could be.
I do not use XP any longer but there is a driver for symbolic links under Windows XP. See http://schinagl.priv.at/nt/hardlinkshellext/hardlinkshellext.html#symbolicli....

On 12/2/2013 9:02 PM, Edward Diener wrote:
On 12/2/2013 8:06 PM, Bjørn Roald wrote:
On 12/03/2013 01:48 AM, Daniel James wrote:
On 3 December 2013 00:32, Bjørn Roald
wrote: not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link.
In that case we should use symlinks for sure. The problem here is that the dependency in b2 for the link should catch this in the build, but not if the date stamp moves in the wrong direction. IBM clearmake solves this in ClearCase views, but we do not have that build tool. Using filetime "greater than" to detect dependency changes is a fundamentally broken hack used by almost all build tools.
As far as I remember, symlinks to files are not supported on Windows prior to Vista; how much of a concern should that be? I guess copies are annoying for XP hosts, but not as devious as I see hardlinks could be.
I do not use XP any longer but there is a driver for symbolic links under Windows XP. See http://schinagl.priv.at/nt/hardlinkshellext/hardlinkshellext.html#symbolicli....
Unfortunately it seems that there would still be no command-line support for symbolic links under Windows XP even with the driver mentioned.

On 3/12/2013 16:42, Quoth Edward Diener:
On 12/2/2013 9:02 PM, Edward Diener wrote:
I do not use XP any longer but there is a driver for symbolic links under Windows XP. See http://schinagl.priv.at/nt/hardlinkshellext/hardlinkshellext.html#symbolicli....
Unfortunately it seems that there would still be no command-line support for symbolic links under Windows XP even with the driver mentioned.
According to what I can see that includes a version of ln [1]. Also there are some other third-party programs that can create links too. (Granted neither of these are especially convenient for use by Boost.) [1] http://schinagl.priv.at/nt/ln/ln.html

On 12/2/2013 11:13 PM, Gavin Lambert wrote:
On 3/12/2013 16:42, Quoth Edward Diener:
On 12/2/2013 9:02 PM, Edward Diener wrote:
I do not use XP any longer but there is a driver for symbolic links under Windows XP. See http://schinagl.priv.at/nt/hardlinkshellext/hardlinkshellext.html#symbolicli....
Unfortunately it seems that there would still be no command-line support for symbolic links under Windows XP even with the driver mentioned.
According to what I can see that includes a version of ln [1]. Also there are some other third-party programs that can create links too.
(Granted neither of these are especially convenient for use by Boost.)
Good find. I cannot imagine that there are many developers who want to use Modular Boost to get the source being still on Windows XP ( or Windows 2000 ) but we could still publish this information and try to provide a solution for them. If we know that hard links will not work with git changes when generating some of the headers, we should at least offer an alternative for OSs without native symbolic link support.

AMDG On 12/02/2013 05:06 PM, Bjørn Roald wrote:
On 12/03/2013 01:48 AM, Daniel James wrote:
On 3 December 2013 00:32, Bjørn Roald
wrote: not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link.
In that case we should use symlinks for sure. The problem here is that the dependency in b2 for the link should catch this in the build, but not if the date stamp moves in the wrong direction. IBM clearmake solves this in ClearCase views, but we do not have that build tool. Using filetime "greater than" to detect dependency changes is a fundamentally broken hack used by almost all build tools.
I thought that I wrote this to try symlinks first. Anyway, if everyone agrees that the correct behavior is to create a symlink, then it's really easy to change. Go to link.jam, find the rule do-file-link, and switch the order that it checks hard links and symlinks.
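The ordering question can be pictured with a short sketch. This is not the code of link.jam or its do-file-link rule, which is not reproduced in this thread; it is a hypothetical Python 3 helper that tries a symlink first, then a hard link, then a plain copy, which is the preference order being proposed.

```python
import os
import shutil
import tempfile


def link_or_copy(source, target):
    """Try a symlink, then a hard link, then a copy; return the method that worked."""
    attempts = (
        ("symlink", lambda: os.symlink(os.path.abspath(source), target)),
        ("hardlink", lambda: os.link(source, target)),
        ("copy", lambda: shutil.copy2(source, target)),
    )
    for name, attempt in attempts:
        try:
            attempt()
            return name
        except (OSError, NotImplementedError):
            continue  # this method is unavailable here; try the next one
    raise OSError("could not link or copy " + source)


def demo_link_or_copy():
    """Link a hypothetical header and return (method used, content seen via target)."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.hpp")
        target = os.path.join(d, "target.hpp")
        with open(source, "w") as f:
            f.write("// header\n")
        method = link_or_copy(source, target)
        with open(target) as f:
            return method, f.read()
```

Swapping the first two entries of the attempts tuple gives the hard-link-first behaviour that the thread is arguing against.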
As far as I remember, symlinks to files are not supported on Windows prior to Vista; how much of a concern should that be? I guess copies are annoying for XP hosts, but not as devious as I see hardlinks could be.
A copy has no advantages over a hard link.
- If the source is overwritten in-place, then the hard link is still correct, and there is no problem.
- If the source is replaced, then the hard link is left pointing to the original file, and the state is essentially the same as if we had made a copy.

In Christ,
Steven Watanabe
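The two cases above can be checked with a small experiment. This is a sketch assuming Python 3 and a filesystem with hard-link support; the file names and the ".tmp" suffix are hypothetical.

```python
import os
import tempfile


def same_file_after(update):
    """Hard-link a file, apply `update` to the source name, then report
    (both names still share one file, content seen through the link)."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.hpp")
        linked = os.path.join(d, "linked.hpp")
        with open(source, "w") as f:
            f.write("// v1\n")
        os.link(source, linked)
        update(source)
        with open(linked) as f:
            linked_text = f.read()
        return os.path.samefile(source, linked), linked_text


def overwrite_in_place(path):
    # Opening for writing truncates and rewrites the existing file (same inode).
    with open(path, "w") as f:
        f.write("// v2\n")


def replace_via_rename(path):
    # Write a new file beside the original, then rename it over the original name.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("// v2\n")
    os.replace(tmp, path)
```

same_file_after(overwrite_in_place) keeps the two names on one file with the new content, while same_file_after(replace_via_rename) leaves the link holding the old content, much as a copy would.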

On 12/03/2013 04:09 AM, Steven Watanabe wrote:
AMDG
On 12/02/2013 05:06 PM, Bjørn Roald wrote:
On 12/03/2013 01:48 AM, Daniel James wrote:
On 3 December 2013 00:32, Bjørn Roald
wrote: not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link.
In that case we should use symlinks for sure. The problem here is that the dependency in b2 for the link should catch this in the build, but not if the date stamp moves in the wrong direction. IBM clearmake solves this in ClearCase views, but we do not have that build tool. Using filetime "greater than" to detect dependency changes is a fundamentally broken hack used by almost all build tools.
I thought that I wrote this to try symlinks first. Anyway, if everyone agrees that the correct behavior is to create a symlink, then it's really easy to change.
You did. I think I deliberately changed the preference to hardlinks in my patch, probably a bad move. I don't remember exactly, but I think I only got copies before the patch, and that's why I made changes to the rule in the first place. I should have left the preference for symlinks the way it was.
Go to link.jam, find the rule do-file-link, and switch the order that it checks hard links and symlinks.
Sure, but not today on my part, need to get some sleep.
As far as I remember, symlinks to files are not supported on Windows prior to Vista; how much of a concern should that be? I guess copies are annoying for XP hosts, but not as devious as I see hardlinks could be.
A copy has no advantages over a hard link.
- If the source is overwritten in-place, then the hard link is still correct, and there is no problem.
- If the source is replaced, then the hard link is left pointing to the original file, and the state is essentially the same as if we had made a copy.
+1 -- Bjørn

On 3 December 2013 01:06, Bjørn Roald
On 12/03/2013 01:48 AM, Daniel James wrote:
Git for a start. If you check out a different version of a header it will break the link.
In that case we should use symlinks for sure. The problem here is that the dependency in b2 for the link should catch this in the build, but not if the date stamp moves in the wrong direction. IBM clearmake solves this in ClearCase views, but we do not have that build tool. Using filetime "greater than" to detect dependency changes is a fundamentally broken hack used by almost all build tools.
True, but even if we used a build system that acted differently, we can't assume that everyone will use it.
As far as I remember, symlinks to files are not supported on Windows prior to Vista; how much of a concern should that be? I guess copies are annoying for XP hosts, but not as devious as I see hardlinks could be.
I was mainly worried about unix variants. I think it's fair to have a lesser experience on XP, it's pretty much obsolete now (and I write that as someone who's still on XP).

On 12/2/2013 7:48 PM, Daniel James wrote:
On 3 December 2013 00:32, Bjørn Roald
wrote: not sure I understood exactly what you refer to, but I did just test command line concatenation, emacs, gedit, vim. All of them seem to change the file as I expected. What programs do you have in mind?
Git for a start. If you check out a different version of a header it will break the link.
Also, soft links tell you where the file came from, and make it explicit that it's in multiple locations.
If git breaks this, then this use of hardlinks instead of symbolic links for individual files needs to change, and Modular Boost should go back to using symbolic links for files as well. Needless to say, that is also consistent with the symbolic links to directories and is apparent in Explorer on Windows.

On 3 December 2013 00:17, Bjørn Roald
On 12/03/2013 01:11 AM, Daniel James wrote:
On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
yes that would be bad.
Is this behavior documented somewhere?
A bit of testing showed that "most" probably wasn't true, but it definitely varies. But here's someone complaining about emacs breaking hard links: http://blog.hartwork.org/?p=25 'sed -i' also breaks hard links. I think it's actually configurable in both emacs and vim, which is worse, because you can't know. The problem is that it varies so it's unpredictable. Soft links are consistent.

On Mon, Dec 2, 2013 at 7:11 PM, Daniel James
On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
I'm missing something. Could you give an example of what you mean by "Most programs don't change files in place"? Hard links ensure that a change is always seen by both the entries because there is only one underlying file. I ran into that with Visual Studio when I tried symlinks and found that as a result Visual Studio failed to realize when a dependency had changed. --Beman

On 3/12/2013 15:02, Quoth Beman Dawes:
I'm missing something. Could you give an example of what you mean by "Most programs don't change files in place"? Hard links ensure that a change is always seen by both the entries because there is only one underlying file. I ran into that with Visual Studio when I tried symlinks and found that as a result Visual Studio failed to realize when a dependency had changed.
I don't know about "most", but it's not uncommon for editors designed to work on very large files (or in defense against antivirus scanners, or their own bugs) to write out a temporary file either elsewhere or beside the target file, and then delete the original and rename the copy to match the original. This minimises the window in which the file does not contain valid contents, and is typically as close as you can get in a non-transactional filesystem to an atomic file replacement, since delete and renames are near-instant compared to writes. This helps to defend against data loss and against other software that is monitoring the directory for files with particular filename patterns, and often improves performance when editing very large files. I'm not entirely sure how this interacts with symlinks -- whether it'll just work or whether the editor would need to be symlink-aware.
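On the open question of how write-then-rename interacts with symlinks, the following sketch may help. It assumes Python 3 on a POSIX-like host; atomic_write, its follow_symlink parameter and the ".tmp" suffix are hypothetical and do not describe any particular editor.

```python
import os
import tempfile


def atomic_write(path, text, follow_symlink=False):
    """Write `text` to a temporary file beside the target, then rename it into place."""
    if follow_symlink:
        path = os.path.realpath(path)  # operate on the link's target instead
    tmp = path + ".tmp"                # hypothetical temp-file naming
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)              # rename over the existing name


def demo_symlink_interaction(follow_symlink):
    """Return (link is still a symlink, content of the real target) after the write."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.hpp")
        link = os.path.join(d, "link.hpp")
        with open(target, "w") as f:
            f.write("// v1\n")
        os.symlink(target, link)
        atomic_write(link, "// v2\n", follow_symlink)
        with open(target) as f:
            return os.path.islink(link), f.read()
```

With follow_symlink=False the rename replaces the symlink itself with a regular file and the real target keeps its old content; resolving the link first updates the target and leaves the symlink intact. So a tool using this pattern does need to be symlink-aware.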

On 12/03/2013 03:02 AM, Beman Dawes wrote:
On Mon, Dec 2, 2013 at 7:11 PM, Daniel James
wrote: On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
I'm missing something. Could you give an example of what you mean by "Most programs don't change files in place"? Hard links ensure that a change is always seen by both the entries because there is only one underlying file. I ran into that with Visual Studio when I tried symlinks and found that as a result Visual Studio failed to realize when a dependency had changed.
As you surely know, names of hardlinks are reference counted handles to a shared file, kind of like shared_ptrs for shared memory. I think the problem is that the perceived shared state could be broken the same way the perceived shared state of shared_ptrs can if you replace the pointer rather than writing to the shared memory:

shared_ptr<int> pa(new int('a'));
shared_ptr<int> another_pa = pa;
shared_ptr<int> pb(new int('b'));
pa = pb;
// *pa == 'b'          OK that is what we want, sort of
// *another_pa != 'b'  but - ouch

git could e.g. be deleting the old file in the working directory before checking out the new version, leaving the header link in limbo - ouch. I am not sure how this actually works with git though, but some web hits indicate such issues with git in the past. Symbolic links would likely not have a similar problem, because they are a reference to the name of the shared file, not the underlying storage.
-- Bjørn
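The difference between referring to the storage and referring to the name can be seen side by side. A minimal sketch, assuming Python 3 and a POSIX-like host with both link types available; it simulates a tool that deletes and re-creates a file and makes no claim about what git does internally.

```python
import os
import tempfile


def read(path):
    with open(path) as f:
        return f.read()


def demo_delete_and_recreate():
    """Return (content via hard link, content via symlink) after the source
    is deleted and written again under the same name."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.hpp")
        hard = os.path.join(d, "hard.hpp")
        soft = os.path.join(d, "soft.hpp")
        with open(source, "w") as f:
            f.write("// v1\n")
        os.link(source, hard)     # second name for the same storage
        os.symlink(source, soft)  # reference to the name source.hpp
        os.remove(source)         # the tool removes the old file ...
        with open(source, "w") as f:
            f.write("// v2\n")    # ... and creates a new one with the same name
        return read(hard), read(soft)
```

The hard link keeps serving the old content, like another_pa above, while the symlink follows the name and sees the new file.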

On 12/2/2013 9:02 PM, Beman Dawes wrote:
On Mon, Dec 2, 2013 at 7:11 PM, Daniel James
wrote: On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
I'm missing something. Could you give an example of what you mean by "Most programs don't change files in place"? Hard links ensure that a change is always seen by both the entries because there is only one underlying file. I ran into that with Visual Studio when I tried symlinks and found that as a result Visual Studio failed to realize when a dependency had changed.
Let us suppose there is a particular file AAA somewhere with a hard link reference count of 2, meaning that there are 2 hard links to the file from different places in the file system. A tool wants to replace this file's data and in doing so it attempts to delete it and then recreate it with the same name. The 'delete' would reduce the reference count to 1. What happens then when the file gets 'created' with the same name? By create we can also consider 'copy' or 'move' to that same name:

1) The file create fails
2) The file create actually changes the file which still exists and increases the hard link reference count to 2

If it does 1) as I suspect, then using the hardlink fails with such a product. If, as another poster has ascertained, 'git' uses the 'delete' and 'create' method to change the file to a different version, then 'git' will fail. The same goes for any tool that changes the file by deleting it and recreating it, if 1) is how things work with hardlinks. OTOH if 2) is the case then there are no problems.

On 3/12/2013 16:34, Quoth Edward Diener:
A tool wants to replace this file's data and in doing so it attempts to delete it and then recreate it with the same name. The 'delete' would reduce the reference count to 1. What happens then when the file gets 'created' with the same name ? By create we can also consider 'copy' or 'move' to that same name:
1) The file create fails
2) The file create actually changes the file which still exists and increases the hard link reference count to 2
If it does 1) as I suspect, then using the hardlink fails with such a product. If, as another poster has ascertained, 'git' uses the 'delete' and 'create' method to change the file to a different version, then 'git' will fail. The same goes for any tool that changes the file by deleting it and recreating it, if 1) is how things work with hardlinks. OTOH if 2) is the case then there are no problems.
No, that's not how it works. If the file is deleted, the reference count of the *other* copy of the file will go down to 1, and in the local directory there will be no file. If a file with the same name is then created (whether via write, copy, or move), this will succeed but result in an independent file that is not associated with the "other" file at all; changing one will not affect the other. In order to "fix" this, something would have to go through and explicitly recreate the links, possibly inspecting each of the copies to determine which one is "better". (Or be configured to always steamroll one way or the other.)
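A repair pass of the kind described could look roughly like this. It is a sketch only, assuming Python 3 and hard-link support; relink and the ".relink" suffix are hypothetical, and the policy is the simple "always steamroll" one in which the source wins.

```python
import os
import tempfile


def relink(source, target):
    """Re-create `target` as a hard link to `source` if they no longer share a file.

    The source always wins. Returns True if a repair was made.
    """
    if os.path.exists(target) and os.path.samefile(source, target):
        return False          # still linked, nothing to do
    tmp = target + ".relink"  # hypothetical temporary name
    os.link(source, tmp)      # create the new link under a temporary name
    os.replace(tmp, target)   # then swap it into place
    return True


def demo_relink():
    """Break a hard link by delete-and-recreate, repair it, and return
    (first call repaired, second call repaired, content seen via target)."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.hpp")
        target = os.path.join(d, "target.hpp")
        with open(source, "w") as f:
            f.write("// v1\n")
        os.link(source, target)
        os.remove(source)     # a tool deletes and re-creates the source
        with open(source, "w") as f:
            f.write("// v2\n")
        repaired = relink(source, target)
        with open(target) as f:
            return repaired, relink(source, target), f.read()
```

Deciding which copy is "better" instead of always preferring the source would need extra policy, for example comparing contents or timestamps, which this sketch leaves out.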

On 12/2/2013 10:47 PM, Gavin Lambert wrote:
On 3/12/2013 16:34, Quoth Edward Diener:
A tool wants to replace this file's data and in doing so it attempts to delete it and then recreate it with the same name. The 'delete' would reduce the reference count to 1. What happens then when the file gets 'created' with the same name ? By create we can also consider 'copy' or 'move' to that same name:
1) The file create fails
2) The file create actually changes the file which still exists and increases the hard link reference count to 2
If it does 1) as I suspect, then using the hardlink fails with such a product. If, as another poster has ascertained, 'git' uses the 'delete' and 'create' method to change the file to a different version, then 'git' will fail. The same goes for any tool that changes the file by deleting it and recreating it, if 1) is how things work with hardlinks. OTOH if 2) is the case then there are no problems.
No, that's not how it works.
If the file is deleted, the reference count of the *other* copy of the file will go down to 1, and in the local directory there will be no file.
If a file with the same name is then created (whether via write, copy, or move), this will succeed but result in an independent file that is not associated with the "other" file at all; changing one will not affect the other.
That sounds right. It was my confused thinking about what a file entry in a file system is that posited those ridiculous other possibilities. That still proves that in the face of any tools, which recreate a file when the file is updated, using a hard link as opposed to a symbolic link is the wrong thing to do if you want the hard links to refer to the same file data.
In order to "fix" this, something would have to go through and explicitly recreate the links, possibly inspecting each of the copies to determine which one is "better". (Or be configured to always steamroll one way or the other.)
IMO way too much work, unless an OS offers no symbolic links as an alternative to hard links.

On 3 December 2013 02:02, Beman Dawes
On Mon, Dec 2, 2013 at 7:11 PM, Daniel James
wrote: On 3 December 2013 00:08, Bjørn Roald
wrote: yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
I'm missing something. Could you give an example of what you mean by "Most programs don't change files in place"? Hard links ensure that a change is always seen by both the entries because there is only one underlying file. I ran into that with Visual Studio when I tried symlinks and found that as a result Visual Studio failed to realize when a dependency had changed.
I corrected myself a bit later: it isn't "most" programs, but it is many, especially things that involve unix shell scripting or want to update files atomically. I was also talking about using Linux and OS X; I don't know anything about hard links on Windows.

The problem is that different programs do different things. There's no consensus on what the semantics of hard links are: sometimes they are used as a cheap copy, sometimes to link two directory entries. And most programmers don't even think about this when writing software, so they usually stumble into one unknowingly (or even both). If you have a system you completely control you can ensure one or the other, but once your software is exposed to the world the other behaviour easily sneaks in. And because it's subtle, people often don't realise what's happening.

Git deliberately breaks hard links, so that you can create a fast local clone using hardlinks and then check out another branch without disrupting the original. Someone proposed making that an option, and it was rejected: http://thread.gmane.org/gmane.comp.version-control.git/97974

For a demonstration, take a clean checkout of master and do this:

./b2 headers
cd libs/detail
git checkout develop -- include/boost/detail/iterator.hpp
cd ../..
diff boost/detail/iterator.hpp libs/detail/include/boost/detail/iterator.hpp

This can be fixed by running 'b2 headers' again, but this isn't something that should be relied on, since people use other build tools. I also wonder if there's a possibility that 'b2 headers' will choose the wrong file.

Soft links also have other advantages: they make it clear which module the header file came from, and it's more immediately obvious that the file is linked, especially in graphical tools.

On Tue, Dec 3, 2013 at 3:52 AM, Daniel James wrote:
On 3 December 2013 02:02, Beman Dawes wrote:
On Mon, Dec 2, 2013 at 7:11 PM, Daniel James wrote:
On 3 December 2013 00:08, Bjørn Roald wrote:
yes, but b2 headers create hard links
It really should use soft links. Most programs don't change files in place, so as soon as such a change is made the two entries will be pointing to different inodes. Which defeats the purpose, since they should always be the same.
I'm missing something. Could you give an example of what you mean by "Most programs don't change files in place"? Hard links ensure that a change is always seen by both the entries because there is only one underlying file. I ran into that with Visual Studio when I tried symlinks and found that as a result Visual Studio failed to realize when a dependency had changed.
I corrected myself a bit later: it isn't "most" programs, but it is many, especially things that involve Unix shell scripting or want to update files atomically. I was also talking about Linux and OS X; I don't know anything about hard links on Windows. The problem is that different programs do different things. There's no consensus on what the semantics of hard links are - sometimes they are used as a cheap copy, sometimes to link two directory entries. And most programmers don't even think about this when writing software, so they usually stumble into one unknowingly (or even both). If you have a system you completely control you can ensure one or the other, but once your software is exposed to the world the other behaviour easily sneaks in. And because it's subtle, people often don't realise what's happening.
Git deliberately breaks hard links, so that you can create a fast local clone using hardlinks and then checkout another branch without disrupting the original. Someone proposed making that an option, and it was rejected:
http://thread.gmane.org/gmane.comp.version-control.git/97974
For a demonstration, take a clean checkout of master and do this:
./b2 headers
cd libs/detail
git checkout develop -- include/boost/detail/iterator.hpp
cd ../..
diff boost/detail/iterator.hpp libs/detail/include/boost/detail/iterator.hpp
This can be fixed by running 'b2 headers' again, but this isn't something that should be relied on - people use other build tools. I also wonder if there's a possibility that 'b2 headers' will choose the wrong file.
Soft links also have other advantages, they make it clear which module the header file came from, and it's more immediately obvious that file is linked - especially in graphical tools.
I have no objection to changing to symlinks as the default behavior. Whichever approach is used, I'm concerned about those developers who use IDE's or build systems that will fail to notice changes with that approach. At the least, we need to provide some documentation, and mention by name any widely used IDE's or build systems that will have trouble. It would also be helpful to suggest workarounds. --Beman

On 4/12/2013 02:24, Quoth Beman Dawes:
This can be fixed by running 'b2 headers' again, but this isn't something that should be relied on - people use other build tools. I also wonder if there's a possibility that 'b2 headers' will choose the wrong file.
Soft links also have other advantages, they make it clear which module the header file came from, and it's more immediately obvious that file is linked - especially in graphical tools.
I have no objection to changing to symlinks as the default behavior.
Would symlinks even solve that issue? I could be wrong, but I would think that a tool or editor that followed the delete-move pattern would break symlinks just as easily as hardlinks, unless they explicitly followed the symlink to the "real" file and used that path in the delete-move. (And at least on Windows, most programs are not symlink-aware, partly because links are fairly rare.) Although it's true that symlinks are more obvious as such, so that might possibly encourage people to edit the "real" file instead of the linked version. Would it be possible to link only at the whole-directory level, rather than individual files? I think behaviour is much more stable and desirable that way (and it should fix the make-timestamp issue too). To support this, the top-level headers could be owned by the main repository, and (where not already) just be a thin redirect into the "real" headers in the library's linked subdir. It would mean a bit of restructuring for some of the older libraries, but that's probably a good thing overall for modularisation.

AMDG On 12/03/2013 02:22 PM, Gavin Lambert wrote:
Would it be possible to link only at the whole-directory level, rather than individual files?
Yes. In fact, the current implementation does this wherever it's possible to do so.
I think behaviour is much more stable and desirable that way (and it should fix the make-timestamp issue too). To support this, the top-level headers could be owned by the main repository, and (where not already) just be a thin redirect into the "real" headers in the library's linked subdir.
It would mean a bit of restructuring for some of the older libraries, but that's probably a good thing overall for modularisation.
In Christ, Steven Watanabe

On 4/12/2013 11:46, Quoth Steven Watanabe:
On 12/03/2013 02:22 PM, Gavin Lambert wrote:
Would it be possible to link only at the whole-directory level, rather than individual files?
Yes. In fact, the current implementation does this wherever it's possible to do so.
I was suggesting not merely "where possible" but "exclusively", rearranging files or adding non-link redirects as required to make it work.
I think behaviour is much more stable and desirable that way (and it should fix the make-timestamp issue too). To support this, the top-level headers could be owned by the main repository, and (where not already) just be a thin redirect into the "real" headers in the library's linked subdir.
It would mean a bit of restructuring for some of the older libraries, but that's probably a good thing overall for modularisation.

On 12/3/2013 5:22 PM, Gavin Lambert wrote:
On 4/12/2013 02:24, Quoth Beman Dawes:
This can be fixed by running 'b2 headers' again, but this isn't something that should be relied on - people use other build tools. I also wonder if there's a possibility that 'b2 headers' will choose the wrong file.
Soft links also have other advantages, they make it clear which module the header file came from, and it's more immediately obvious that file is linked - especially in graphical tools.
I have no objection to changing to symlinks as the default behavior.
Would symlinks even solve that issue? I could be wrong, but I would think that a tool or editor that followed the delete-move pattern would break symlinks just as easily as hardlinks, unless they explicitly followed the symlink to the "real" file and used that path in the delete-move. (And at least on Windows, most programs are not symlink-aware, partly because links are fairly rare.)
The delete-move pattern does not break symlinks. Since a symlink is only a pointer to another full file name, as long as the name remains the same after the delete-move the symlink still works correctly.
Although it's true that symlinks are more obvious as such, so that might possibly encourage people to edit the "real" file instead of the linked version.
The grammar above is confusing. If you edit the file by specifying the symlink location you are editing the "real" file.
Would it be possible to link only at the whole-directory level, rather than individual files? I think behaviour is much more stable and desirable that way (and it should fix the make-timestamp issue too). To support this, the top-level headers could be owned by the main repository, and (where not already) just be a thin redirect into the "real" headers in the library's linked subdir.
It would mean a bit of restructuring for some of the older libraries, but that's probably a good thing overall for modularisation.
Certainly this could be done, but I see no necessity for it if symlinks are used, other than a possibly better design of some header file locations. Let's get Modular Boost working correctly for programmers now by replacing the "b2 headers" command with one that uses symlinks rather than hardlinks. Then if we want to consider header file placement we can do that as a further chore in the future.

On 4/12/2013 16:32, Quoth Edward Diener:
Would symlinks even solve that issue? I could be wrong, but I would think that a tool or editor that followed the delete-move pattern would break symlinks just as easily as hardlinks, unless they explicitly followed the symlink to the "real" file and used that path in the delete-move. (And at least on Windows, most programs are not symlink-aware, partly because links are fairly rare.)
The delete-move pattern does not break symlinks. Since a symlink is only a pointer to another full file name, as long as the name remains the same after the delete-move the symlink still works correctly.
I was thinking of edits in the location where the symlinks appear to be, not where they point to. Possibly this isn't the scenario that people are most worried about though.
Although it's true that symlinks are more obvious as such, so that might possibly encourage people to edit the "real" file instead of the linked version.
The grammar above is confusing. If you edit the file by specifying the symlink location you are editing the "real" file.
AFAIK, if a symlink-unaware editor/script issues an "rm path-user-entered" (or equivalent API as part of a semi-atomic replace) then that will delete the symlink if that happens to be what is actually at the path-user-entered. It will not follow the symlink and delete the "real" file, and so they'll be left with duplicate unlinked files again. It's only safe if the path-user-entered is the "real" (target) path, not the symlink (or if the editor itself or something between the user and the editor follows the symlink first). And hardlinks are not safe either way around. That's what I was referring to above.
Let's get Modular Boost working correctly for programmers now by replacing the "b2 headers" command with one that uses symlinks rather than hardlinks. Then if we want to consider header file placement we can do that as a further chore in the future.
Definitely. Making "b2 headers" create symlinks is still probably the best option, as that will defend against git updates and *most* user edits. It won't help if users are editing the files directly in boost/ (using something that uses the delete-move pattern) but that should be the less common case and there probably isn't much that can be done about that short of retraining.
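The two cases distinguished above (saving through the "real" path versus saving through the symlink path) can be sketched as follows. This is an illustration under assumptions, not a description of any particular editor: `delete_move_save` is a hypothetical model of a symlink-unaware save, and the sketch assumes a platform where `os.symlink` works and rename replaces the directory entry itself.

```python
import os
import tempfile


def delete_move_save(path, text):
    """Model of a symlink-unaware save: write a sibling temp file, rename over `path`."""
    tmp = path + ".new"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)  # replaces whatever entry sits at `path`, link or not


def demo():
    with tempfile.TemporaryDirectory() as root:
        real = os.path.join(root, "real.hpp")  # header under libs/
        link = os.path.join(root, "link.hpp")  # symlink under boost/
        with open(real, "w") as f:
            f.write("// original\n")
        os.symlink(real, link)

        # Case 1: the save goes through the real path; the symlink keeps working.
        delete_move_save(real, "// edited via real path\n")
        still_link = os.path.islink(link)
        with open(link) as f:
            seen_via_link = f.read()

        # Case 2: the save goes through the link path; the symlink itself is replaced.
        delete_move_save(link, "// edited via link path\n")
        link_survived = os.path.islink(link)
        with open(real) as f:
            real_text = f.read()
        return still_link, seen_via_link, link_survived, real_text


if __name__ == "__main__":
    print(demo())
```

In the second case the entry under boost/ ends up as an ordinary file holding the edit, and the real header is left unchanged, so the two copies have diverged again.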

On 12/04/2013 05:05 AM, Gavin Lambert wrote:
On 4/12/2013 16:32, Quoth Edward Diener:
Would symlinks even solve that issue? I could be wrong, but I would think that a tool or editor that followed the delete-move pattern would break symlinks just as easily as hardlinks, unless they explicitly followed the symlink to the "real" file and used that path in the delete-move. (And at least on Windows, most programs are not symlink-aware, partly because links are fairly rare.)
The delete-move pattern does not break symlinks. Since a symlink is only a pointer to another full file name, as long as the name remains the same after the delete-move the symlink still works correctly.
I was thinking of edits in the location where the symlinks appear to be, not where they point to. Possibly this isn't the scenario that people are most worried about though.
Although it's true that symlinks are more obvious as such, so that might possibly encourage people to edit the "real" file instead of the linked version.
The grammar above is confusing. If you edit the file by specifying the symlink location you are editing the "real" file.
AFAIK, if a symlink-unaware editor/script issues an "rm path-user-entered" (or equivalent API as part of a semi-atomic replace) then that will delete the symlink if that happens to be what is actually at the path-user-entered. It will not follow the symlink and delete the "real" file, and so they'll be left with duplicate unlinked files again. It's only safe if the path-user-entered is the "real" (target) path, not the symlink (or if the editor itself or something between the user and the editor follows the symlink first). And hardlinks are not safe either way around. That's what I was referring to above.
Neither solution is fail-safe for all scenarios. Developers need to be aware that header changes *shall* go in the libs/*/include/boost/... structure. The links are for the convenience of build tools, debuggers, etc., and for the accidental edit you make when your IDE takes you there. I would be surprised if the current solution creates many issues when used this way. However, I believe using symlinks is more robust, so we should change the current order of preference in b2 headers from:
symlink dir, hardlink file, symlink file, copy
to prefer:
symlink dir, symlink file, hardlink file, copy
Let's get Modular Boost working correctly for programmers now by replacing the "b2 headers" command with one that uses symlinks rather than hardlinks. Then if we want to consider header file placement we can do that as a further chore in the future.
+1
Definitely. Making "b2 headers" create symlinks is still probably the best option, as that will defend against git updates and *most* user edits. It won't help if users are editing the files directly in boost/ (using something that uses the delete-move pattern) but that should be the less common case and there probably isn't much that can be done about that short of retraining.
I don't see much chance for a text editor to do the wrong thing here. If it did, you will get the same effect as with a copy of the header -- the first clean build would overwrite your changes. -- Bjørn
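The preferred order proposed above (symlink dir, symlink file, hardlink file, copy) could be sketched roughly as below. This is not how b2 implements `headers`; `link_header` and `link_header_dir` are hypothetical names, and the sketch ignores nested subdirectories and name clashes between source directories.

```python
import os
import shutil
import tempfile


def link_header(src, dst):
    """Expose file `src` at `dst`, trying symlink, then hard link, then copy."""
    attempts = (
        ("symlink file", lambda s, d: os.symlink(os.path.abspath(s), d)),
        ("hardlink file", os.link),
        ("copy", shutil.copy2),
    )
    for name, make in attempts:
        try:
            make(src, dst)
            return name
        except (OSError, NotImplementedError):
            continue
    raise OSError("could not link or copy %s to %s" % (src, dst))


def link_header_dir(sources, dst):
    """One source directory: try a directory symlink. Several: link file by file."""
    if len(sources) == 1:
        try:
            os.symlink(os.path.abspath(sources[0]), dst, target_is_directory=True)
            return "symlink dir"
        except (OSError, NotImplementedError):
            pass  # fall through to per-file links
    os.makedirs(dst, exist_ok=True)
    methods = set()
    for src_dir in sources:
        for name in sorted(os.listdir(src_dir)):
            src = os.path.join(src_dir, name)
            if os.path.isfile(src):
                methods.add(link_header(src, os.path.join(dst, name)))
    return ", ".join(sorted(methods))
```

Swapping the first two entries of `attempts` would give the order currently described for b2 headers, which makes the proposed change a matter of preference order rather than new mechanism.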

On 12/4/2013 1:30 AM, Bjørn Roald wrote:
On 12/04/2013 05:05 AM, Gavin Lambert wrote:
On 4/12/2013 16:32, Quoth Edward Diener:
Would symlinks even solve that issue? I could be wrong, but I would think that a tool or editor that followed the delete-move pattern would break symlinks just as easily as hardlinks, unless they explicitly followed the symlink to the "real" file and used that path in the delete-move. (And at least on Windows, most programs are not symlink-aware, partly because links are fairly rare.)
The delete-move pattern does not break symlinks. Since a symlink is only a pointer to another full file name, as long as the name remains the same after the delete-move the symlink still works correctly.
I was thinking of edits in the location where the symlinks appear to be, not where they point to. Possibly this isn't the scenario that people are most worried about though.
Although it's true that symlinks are more obvious as such, so that might possibly encourage people to edit the "real" file instead of the linked version.
The grammar above is confusing. If you edit the file by specifying the symlink location you are editing the "real" file.
AFAIK, if a symlink-unaware editor/script issues an "rm path-user-entered" (or equivalent API as part of a semi-atomic replace) then that will delete the symlink if that happens to be what is actually at the path-user-entered. It will not follow the symlink and delete the "real" file, and so they'll be left with duplicate unlinked files again. It's only safe if the path-user-entered is the "real" (target) path, not the symlink (or if the editor itself or something between the user and the editor follows the symlink first). And hardlinks are not safe either way around. That's what I was referring to above.
Neither solution is fail-safe for all scenarios. Developers need to be aware that header changes *shall* go in the libs/*/include/boost/... structure. The links are for the convenience of build tools, debuggers, etc., and for the accidental edit you make when your IDE takes you there.
+1
I would be surprised if current solution creates many issues used this way. However I believe using symlinks is more robust, so we should change current order of preference in b2 headers from:
symlink dir, hardlink file, symlink file, copy
to prefer:
symlink dir, symlink file, hardlink file, copy
+1
Let's get Modular Boost working correctly for programmers now by replacing the "b2 headers" command with one that uses symlinks rather than hardlinks. Then if we want to consider header file placement we can do that as a further chore in the future.
+1
Definitely. Making "b2 headers" create symlinks is still probably the best option, as that will defend against git updates and *most* user edits. It won't help if users are editing the files directly in boost/ (using something that uses the delete-move pattern) but that should be the less common case and there probably isn't much that can be done about that short of retraining.
Users just have to be told that the headers under boost/ are not to be changed directly as these are all just links to headers in different libs/. I do not think that is much of an imposition on users of Boost. Once the headers in boost/ are all symbolic links to headers under libs/, it is easy enough in the various OSs to find where the "real" file resides if an end-user wants to make an experimental change.
I don't see much chance for a text editor to do the wrong thing here. If it did, you will get the same effect as with a copy of the header -- the first clean build would overwrite your changes.

On 12/2/2013 7:08 PM, Bjørn Roald wrote:
On 12/03/2013 12:39 AM, Edward Diener wrote:
On 12/2/2013 5:29 PM, Bjørn Roald wrote:
On 12/02/2013 10:45 PM, Edward Diener wrote:
On Windows after the .\b2 headers step, there are these directories under the modular-boost/boost directory which are not symbolic links:
detail
It is because they have more than one source directory, so a symbolic link will not do what is needed.
Symbolic links can be created to individual files also.
Yes, but b2 headers creates hard links rather than symbolic links for individual files on platforms supporting hardlinks. If you have no simple way of checking whether you got hardlinks, simply try to modify one to see if you have a link or a copy. Windows has had hardlinks since at least XP, so if it does not work for you it is a bug, I think.
b2 headers will fall back to copy if that is the only thing supported I think, or that is what it should do.
I do not think windows is changing anything unless you are on a version not supporting symbolic links.
$ find libs -name detail | grep include/boost/detail
libs/optional/include/boost/detail
libs/detail/include/boost/detail
libs/thread/include/boost/detail
libs/config/include/boost/detail
libs/smart_ptr/include/boost/detail
libs/utility/include/boost/detail
libs/conversion/include/boost/detail
libs/graph/include/boost/detail
libs/dynamic_bitset/include/boost/detail
Just looking at detail we are evidently duplicating files rather than creating symbolic links to files.
For instance in libs/detail, which is a submodule, we have allocator_utilities.hpp. In boost/detail we have allocator_utilities.hpp. Let us suppose that the libs/detail/allocator_utilities.hpp gets updated in git. When this happens, how is the boost/detail/allocator_utilities.hpp supposed to be updated if it is an entirely separate file ?
It won't get updated. And that is a major pain for any platforms that don't support links. But links should work on any "normal" development host computer these days.
I see now that the boost/detail/allocator_utilities.hpp is a hard link. Unfortunately Windows doesn't normally indicate hard links either in Explorer or via the 'dir' command. One must use a shell extension or the 'fsutil hardlink list xxx' command to get that information. I assume then that git never deletes a file and recreates it with the same name when the file is updated, else the hardlink would no longer be pointing to the same file and the purpose of having a hardlink, as opposed to a symbolic link, would be defeated.
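Where neither Explorer nor 'dir' shows the difference, the link count reported by the filesystem can be inspected from a script. The following is a small sketch, assuming a platform on which Python's `os.lstat` reports `st_nlink` for regular files; `link_kind` is a hypothetical helper meant for files only, since directories often have a link count above one for other reasons.

```python
import os
import tempfile


def link_kind(path):
    """Classify a file entry as 'symlink', 'hardlink' (link count > 1), or 'plain'."""
    if os.path.islink(path):
        return "symlink"
    if os.lstat(path).st_nlink > 1:
        return "hardlink"
    return "plain"


def demo():
    with tempfile.TemporaryDirectory() as root:
        paths = {n: os.path.join(root, n)
                 for n in ("real.hpp", "hard.hpp", "soft.hpp", "alone.hpp")}
        for n in ("real.hpp", "alone.hpp"):
            with open(paths[n], "w") as f:
                f.write("// header\n")
        os.link(paths["real.hpp"], paths["hard.hpp"])
        os.symlink(paths["real.hpp"], paths["soft.hpp"])
        return {n: link_kind(p) for n, p in paths.items()}


if __name__ == "__main__":
    print(demo())
```

Both names of a hard-linked file report the same kind, because neither directory entry is more "original" than the other; `os.path.samefile` can then confirm whether two specific paths share one underlying file.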

On 12/03/2013 02:01 AM, Edward Diener wrote:
On 12/2/2013 7:08 PM, Bjørn Roald wrote:
On 12/03/2013 12:39 AM, Edward Diener wrote:
I assume then that git never deletes a file and recreates it with the same name when the file is updated, else the hardlink would no longer be pointing to the same file and the purpose of having a hardlink, as opposed to a symbolic link, would be defeated.
Right, that is what we need to figure out - I don't know, but thinking about it I would not be surprised if the use of hardlinks here is a very bad idea, as it may be too much to expect tools like git to always copy everything in place if the target is a link. -- Bjørn

On 12/2/2013 5:29 PM, Bjørn Roald wrote:
On 12/02/2013 10:45 PM, Edward Diener wrote:
On Windows after the .\b2 headers step, there are these directories under the modular-boost/boost directory which are not symbolic links:
detail
It is because they have more than one source directory, so a symbolic link will not do what is needed. I do not think windows is changing anything unless you are on a version not supporting symbolic links.
$ find libs -name detail | grep include/boost/detail
libs/optional/include/boost/detail
libs/detail/include/boost/detail
libs/thread/include/boost/detail
libs/config/include/boost/detail
libs/smart_ptr/include/boost/detail
libs/utility/include/boost/detail
libs/conversion/include/boost/detail
libs/graph/include/boost/detail
libs/dynamic_bitset/include/boost/detail
Let me rephrase: Just looking at detail we are evidently duplicating files rather than creating symbolic links to files. For instance in libs/detail, which is a submodule, we have libs/detail/include/boost/detail/allocator_utilities.hpp. In boost/detail we have allocator_utilities.hpp. They are separate files but with the same text. Let us suppose that the libs/detail/include/boost/detail/allocator_utilities.hpp gets updated in git. When this happens, how is the boost/detail/allocator_utilities.hpp supposed to be updated if it is an entirely separate file ?
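The find/grep check quoted above, which shows why a directory such as detail cannot be a single symbolic link, can be approximated in a script. This is a sketch only: `header_sources` is a hypothetical helper, and it is slightly stricter than the grep because it matches only paths that end in include/boost/<name>.

```python
import os
import tempfile


def header_sources(libs_root, name):
    """Return every directory under `libs_root` whose path ends in include/boost/<name>."""
    suffix = os.sep + os.path.join("include", "boost", name)
    found = []
    for dirpath, dirnames, _ in os.walk(libs_root):
        for d in dirnames:
            full = os.path.join(dirpath, d)
            if full.endswith(suffix):
                found.append(full)
    return sorted(found)


def needs_file_links(libs_root, name):
    """True when more than one library contributes to boost/<name>."""
    return len(header_sources(libs_root, name)) > 1


def demo():
    with tempfile.TemporaryDirectory() as root:
        libs = os.path.join(root, "libs")
        # A toy layout with the same shape as the listings in this thread.
        for lib, sub in (("graph", "graph"), ("graph_parallel", "graph"),
                         ("graph", "detail"), ("any", "any")):
            os.makedirs(os.path.join(libs, lib, "include", "boost", sub))
        return {n: needs_file_links(libs, n) for n in ("graph", "detail", "any")}


if __name__ == "__main__":
    print(demo())
```

A directory with exactly one source can be linked as a whole, whereas one with several sources has to be a real directory populated with per-file links or copies.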
participants (6)
- Beman Dawes
- Bjørn Roald
- Daniel James
- Edward Diener
- Gavin Lambert
- Steven Watanabe