
On Wed, Nov 4, 2009 at 9:08 PM, David Abrahams <dave@boostpro.com> wrote:
> I'm hoping Troy can give us a public report on the modularization effort. Troy?
I suppose a picture is worth a thousand words:

  http://sodium.resophonic.com/boost/boost-dependencies.jpg

Please stop reading now and look at that. Look for cycles and dependencies that don't seem to make sense. Realize that if you took the dependencies of test binaries into account you'd just add more edges. I don't know how many more.

This graph was generated by graphviz from dependency information encoded in Boost.CMake's 'module.cmake' files, located in each library's source directory. Boost.CMake has the ability to reorganize headers so that instead of being held in one directory, they are held in ~70 (IIRC).

First I'll explain my take on the tough bits about reorganizing boost's directory structure, then I'll explain why I think this is a bad idea. As I recently wrote offlist, regarding what happens to include paths when each of ~100 libraries has its own include/ directory:
Here's my back-of-the-envelope sketch of the discussion as I last recall it.
The hard use case is the library that is dependent on all the rest of boost. Here, adding a bunch of -I flags to the compile line and calculating some dependencies isn't a solution: no sane person puts 100 header directories on one compile line. It is hard on the eyes and just doesn't scale. Eventually you're going to run out of command-line buffer space somewhere.
CMake's 'modularize' target moves headers out of toplevel boost/ as follows (this was uncontroversial):
  libs/
    python/
      src/ ...
      doc/ ...
      test/ ...
      examples/ ...
      include/
        boost/          <---- moved files
          python.hpp
          python/
and for each library p_i in a project P's dependencies, -I$BOOST_ROOT/libs/p_i/include was added to P's compile line.
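To make the scaling concern concrete, here is a minimal Python sketch of how such a compile line grows. The function name include_flags, the /opt/boost root, and the lib0..lib99 dependency names are made up for illustration; this is not how Boost.CMake itself computes the flags.

```python
import posixpath

def include_flags(boost_root, deps):
    """Return one -I flag per dependency, following the layout shown above."""
    return ["-I" + posixpath.join(boost_root, "libs", lib, "include")
            for lib in deps]

# Hypothetical project P that depends on 100 libraries.
deps = ["lib%d" % i for i in range(100)]
flags = include_flags("/opt/boost", deps)

# The include flags alone already add thousands of characters to the line.
print(len(flags), "flags,", len(" ".join(flags)), "characters")
```

With three or four dependencies this is perfectly readable; with a hundred it is the wall of -I flags described above.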
So with 100 dependencies, how do you present all of these things to the compiler such that you've got a reasonable compile line? .rsp files may do it on Windows. Elsewhere, you're going to need to somehow construct a single header directory on disk. Maybe the mechanism is built into source control, maybe it isn't. Possibilities:
- symlinks (not on Windows you don't)
- hardlinks created on checkout by version control
- hardlinks generated by script
- one generated directory full of forwarding headers (Qt has done this for years. They check these files in. I implemented a python script to do this for boost at some point.)
- svn:externals (I used to advocate this, not any more)
- git submodules (I don't advocate this either)
My view: I would prefer to follow Qt's method as it is simple, does not rely on version control tricks, and can do sanity checking for duplicate files and conventions about where one is allowed to put things. I believe I would distribute the generated header directory in release tarballs but not check it in. Developers would have to learn to regenerate the main header directory when adding/removing headers.
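For illustration, a forwarding-header generator along those lines might look roughly like the Python sketch below. This is not the script mentioned above; the function name, the argument names, and the rule that every header must live under boost/ are assumptions made for the example.

```python
import os

def generate_forwarding_headers(libs_dir, out_dir):
    """Write one forwarding header into out_dir for each header found in
    libs_dir/<lib>/include. Returns {relative header path: owning library}."""
    seen = {}
    for lib in sorted(os.listdir(libs_dir)):
        inc = os.path.join(libs_dir, lib, "include")
        if not os.path.isdir(inc):
            continue
        for dirpath, _dirs, files in os.walk(inc):
            for name in sorted(files):
                src = os.path.join(dirpath, name)
                rel = os.path.relpath(src, inc)
                # Convention check (assumed): everything lives under boost/.
                if rel.split(os.sep)[0] != "boost":
                    raise ValueError("%s: %s is outside boost/" % (lib, rel))
                # Sanity check: no two libraries may provide the same header.
                if rel in seen:
                    raise ValueError("%s provided by both %s and %s"
                                     % (rel, seen[rel], lib))
                seen[rel] = lib
                dst = os.path.join(out_dir, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                # Forward to the real header via a relative path.
                target = os.path.relpath(src, os.path.dirname(dst))
                target = target.replace(os.sep, "/")
                with open(dst, "w") as f:
                    f.write('#include "%s"\n' % target)
    return seen
```

Calling generate_forwarding_headers("libs", "include") would leave a single include/ directory to put on the compile line, and a developer would rerun it after adding or removing headers.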
But since then my view has changed. Now why this is a bad idea:

Shuffling headers around just makes the problem worse. It won't remove any edges from that graph. Take the parameter -> python dependency, for instance. Parameter is dependent on python because it uses python's internal version of referent_storage (basically just aligned_storage) in only one place. Python's version of aligned_storage dates from 2002, presumably before boost/aligned_storage.hpp came along. This is easy to fix: just point them both at the toplevel aligned_storage, and you're done, and an edge disappears from that graph.

There are probably hundreds more cases like this. Remove those edges. Then you would start seeing what modularity might look like. As when detangling a large knot: gently tug at the loose bits and see if you can unravel it, being careful not to make it worse. Shuffling headers around and making source control more complicated, on the other hand, won't remove any edges from that graph.

So I would:

- Write a script to generate that graph from the header files themselves.
- Declare a moratorium on new libraries.
- Start removing edges.

-t
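The first item in that list could be sketched roughly as follows in Python. The sketch assumes all headers sit under a single toplevel boost/ directory and guesses the owning library from the first path component (boost/python/... and boost/python.hpp both map to "python"). That heuristic and all function names are assumptions for illustration, and a real script would need a proper header-to-library mapping.

```python
import os
import re
from collections import defaultdict

# Matches #include <boost/...> and #include "boost/..." lines.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]boost/([^>"]+)[>"]',
                        re.MULTILINE)

def owner(rel_path):
    """Guess the owning library from a path relative to boost/ (heuristic)."""
    first = rel_path.split("/")[0]
    return first[:-4] if first.endswith(".hpp") else first

def dependency_edges(boost_dir):
    """Scan every file under boost_dir; return {library: set of libraries it includes}."""
    edges = defaultdict(set)
    for dirpath, _dirs, files in os.walk(boost_dir):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, boost_dir).replace(os.sep, "/")
            src = owner(rel)
            with open(full, encoding="utf-8", errors="replace") as f:
                text = f.read()
            for included in INCLUDE_RE.findall(text):
                dst = owner(included)
                if dst != src:  # ignore includes within the same library
                    edges[src].add(dst)
    return edges

def to_dot(edges):
    """Render the edges as a graphviz digraph."""
    lines = ["digraph boost {"]
    for src in sorted(edges):
        for dst in sorted(edges[src]):
            lines.append('  "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)
```

Writing to_dot(dependency_edges("boost")) to a file and running it through graphviz's dot tool would give a picture comparable to the one linked at the top, and rerunning it after each fix would show whether an edge actually went away.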