Date: Fri, 18 Oct 2013 00:24:57 +0200
From: steveire@gmail.com
To: boost@lists.boost.org
Subject: [boost] [modularization] Modularizing Boost (modularization)
* Phase 0 - Remove dead weight by bumping compiler feature requirements.
* Phase 1 - Move some files around so that the modularized repos form a mostly directed graph.
* Phase 2 - Form some kind of 'boost core library' or 'boost feature normalization library' from the guts of existing libraries like type_traits, static_assert, config, mpl and utilities.
* Phase 3 - Try to port the mpl to variadic templates so that the dependency on Boost.PP is not needed when variadic templates are available.
I'm still catching up on all of the replies, but I like the work overall and it seems to be going in the right direction. One minor concern I have with the steps you outline below is that they seem to be optimizing for fewer edges in the dependency graph rather than for the right dependency structure, which is a subtle distinction. For example, moving the exception classes used by all of Boost out of the exception library doesn't really seem like a win overall, since having files grouped logically is the entire point of a module.
I think the biggest question to ask is which libraries are expected to be at the core of Boost and used by all other libraries. Those would effectively create a layering structure, where libraries belong to layers and can only depend on libraries at their own level or below. The goal is to create as many levels as possible by moving libraries up in level whenever they have no more incoming links at their current level.
I also think the search should initially focus on dependency links that make no sense, like proto -> spirit (I think that was the one Eric noticed), before moving on to dependencies that make sense but whose files are perhaps in the wrong location. Lots of the changes below seem to be of the form "move utility code from library X to some library that is more heavily used than X and will therefore reduce the graph".
Is there a library that is core enough that it can be used by every other library and yet uses no other libraries? If that doesn't exist, should it?
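To make the layering idea concrete, here is a minimal sketch of one way levels could be assigned, assuming the dependency graph is already acyclic. Each library gets a level one higher than its deepest dependency, which also yields the largest number of levels. The `deps` map and the function name `assign_levels` are made up for illustration and are not measured Boost data.

```python
# Hypothetical dependency map (library -> libraries it uses).
# These edges are illustrative only, not the real Boost graph.
deps = {
    "config": [],
    "static_assert": ["config"],
    "type_traits": ["config", "static_assert"],
    "iterator": ["type_traits", "static_assert"],
}


def assign_levels(deps):
    """Return {library: level}, where level 0 means 'uses no other library'.

    A library sits one level above its deepest dependency, so it only
    ever depends on lower levels. Raises ValueError if a cycle is found,
    because a cyclic graph cannot be layered this way.
    """
    levels = {}
    visiting = set()

    def level_of(lib):
        if lib in levels:
            return levels[lib]
        if lib in visiting:
            raise ValueError(f"dependency cycle through {lib}")
        visiting.add(lib)
        level = 1 + max((level_of(d) for d in deps.get(lib, [])), default=-1)
        visiting.discard(lib)
        levels[lib] = level
        return level

    for lib in deps:
        level_of(lib)
    return levels


levels = assign_levels(deps)
for lib, level in sorted(levels.items(), key=lambda item: item[1]):
    print(level, lib)
```

In this toy input "config" ends up at level 0, which is the kind of "used by everything, uses nothing" library asked about above. The function refuses cyclic input on purpose: the cycles have to be broken before any layering can be computed.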
If the dependencies between repositories are analysed, the result is this graph:
http://steveire.com/boost/graph_all.dot http://steveire.com/boost/graph_all_small.png
There are 104 nodes and 1159 edges. Each edge is a real dependency which exists between repos, and which will exist between boost modularized packages. They should not be considered optional.
If we remove any nodes which are not strongly connected, we are left with a new graph of nodes which are all strongly connected:
http://steveire.com/boost/graph_strong.dot http://steveire.com/boost/graph_strong_small.png
There are 68 nodes and 675 edges.
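For readers who want to reproduce this kind of reduction, the sketch below shows one possible way to do it. It assumes that "not strongly connected" means "not part of any dependency cycle", i.e. the node sits in a strongly connected component of size one. It uses Tarjan's algorithm, which is my choice here and not necessarily how the graphs above were produced. The `toy` graph is hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps node -> iterable of successors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return components


def cyclic_core(graph):
    """Keep only nodes that share a component with at least one other node."""
    keep = set()
    for component in strongly_connected_components(graph):
        if len(component) > 1:
            keep |= component
    return {
        node: sorted(s for s in graph.get(node, ()) if s in keep)
        for node in graph
        if node in keep
    }


# Hypothetical toy graph, not the real Boost data.
toy = {
    "utility": ["iterator", "config"],
    "iterator": ["utility", "type_traits"],
    "type_traits": ["config"],
    "config": [],
}
core = cyclic_core(toy)
print(len(core), "nodes,", sum(len(s) for s in core.values()), "edges")
```

The recursive form is fine for a graph of about a hundred repos; a much larger graph would need an iterative version to stay within Python's recursion limit.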
Move enable_if from utility to type_traits:
This change makes sense, since it's in type_traits in the standard.
There are now 670 edges and 68 nodes.
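The effect of a move like this on the edge count can be modelled roughly as follows. This is a sketch under the assumption that a repo-level edge exists whenever a header in one repo includes a header owned by another repo. The header names, the include lists and both function names are invented for illustration; they are not the real Boost include relationships.

```python
def repo_edges(owner, includes):
    """Derive repo-level edges from header-level includes.

    owner:    {header: repo that ships it}
    includes: {header: [headers it includes]}
    """
    edges = set()
    for header, included in includes.items():
        for inc in included:
            src, dst = owner[header], owner[inc]
            if src != dst:
                edges.add((src, dst))
    return edges


def move_header(owner, header, new_repo):
    """Return a copy of owner with one header reassigned to new_repo."""
    moved = dict(owner)
    moved[header] = new_repo
    return moved


# Hypothetical headers and includes, for illustration only.
owner = {
    "utility/enable_if.hpp": "utility",
    "utility/other.hpp": "utility",
    "type_traits/a.hpp": "type_traits",
    "iterator/b.hpp": "iterator",
}
includes = {
    "type_traits/a.hpp": ["utility/enable_if.hpp"],
    "iterator/b.hpp": ["utility/enable_if.hpp", "type_traits/a.hpp"],
    "utility/other.hpp": ["iterator/b.hpp"],
}

before = repo_edges(owner, includes)
after = repo_edges(move_header(owner, "utility/enable_if.hpp", "type_traits"), includes)
print(len(before), "edges before,", len(after), "edges after")
```

Note that a model like this only counts edges. It says nothing about whether the header logically belongs in its new home, which is the distinction this reply keeps coming back to.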
Next, move boost/detail/workaround.hpp to the config library/repo:
This also makes sense, and it doesn't even have to move locations in the final layout (as mentioned in a follow up mail).
There are now 661 edges and 68 nodes.
Next, move boost/limits.hpp to the config library/repo:
This move makes sense.
There are now 653 edges and 68 nodes.
Next, move boost/detail/iterator.hpp and boost/iterator/iterator_traits.hpp to the type_traits library/repo:
Wouldn't it make more sense for iterator and iterator_traits to be in the iterator library? I can imagine moving lots of useful utilities to the 'detail' library simply because that's what it's for, but the type_traits library shouldn't be used to contain iterator-related code that has nothing to do with type_traits.
There are now 650 edges and 68 nodes.
Next, move boost/iterator.hpp from iterator to the utility library/repo:
Again, I don't see how this change makes sense even if it does make the graph 'better'.
There are now 649 edges and 68 nodes. utility no longer depends on iterator.
Move boost/version.hpp to config:
This makes sense. Though it seems like you are suggesting that config is more widely used than detail, which seems strange given detail's goal as being the melting pot of various little things without a home. It's not obvious to me why moving something from one to the other really makes a big difference, but version code should be in a library that has no dependencies, like the library I suggested above.
There are now 512 edges and 64 nodes. The config, integer, io and static_assert libraries are no longer part of the mesh.
Next, move boost/pointee.hpp from iterator to detail:
This makes sense, since this has nothing to do with iterators (or at least, it has as much to do with them as it does with smart pointers). However, it doesn't help to move this unless iterator_traits is also moved, which I don't think is the right idea. It's also not obvious why std::iterator_traits isn't sufficient for this use case, but that's a side issue.
Move exception/detail/attribute_noreturn.hpp into config:
This seems fine, since it has nothing to do with exceptions and everything to do with compiler-specific config. I also think that libraries shouldn't use 'detail' code from other libraries. If it's useful more generally, it makes more sense to put it in a shared library like 'detail' that has all the shared code.
Move boost/throw_exception.hpp and exception.hpp into utility:
I agree with the person who said that exception related files should stay in the exception library.
There are now 485 edges and 64 nodes. Several libraries form an inner mesh:
If we treat them as one element for now, we get a new graph from the rest
There are now 138 edges and 39 nodes.
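Treating a group of libraries as one element amounts to collapsing those nodes into a single node and dropping the edges inside the group. A minimal sketch, assuming the graph is held as a map from each repo to the repos it depends on; the `toy` graph, the name "inner_mesh" and the function name `collapse` are hypothetical:

```python
def collapse(graph, group, name):
    """Return a new graph in which every node in group becomes the node name.

    Edges inside the group disappear; edges crossing the group boundary
    are redirected to or from the merged node. Self-loops are dropped.
    """
    group = set(group)

    def rename(node):
        return name if node in group else node

    merged = {}
    for node, successors in graph.items():
        src = rename(node)
        targets = merged.setdefault(src, set())
        for succ in successors:
            dst = rename(succ)
            merged.setdefault(dst, set())
            if dst != src:
                targets.add(dst)
    return merged


# Hypothetical toy graph, not the real Boost data.
toy = {
    "mpl": ["type_traits", "config"],
    "type_traits": ["mpl", "config"],
    "graph": ["mpl", "property_map"],
    "property_map": ["type_traits"],
    "config": [],
}
collapsed = collapse(toy, {"mpl", "type_traits"}, "inner_mesh")
print(len(collapsed), "nodes,", sum(len(s) for s in collapsed.values()), "edges")
```

Because parallel edges into the merged node are stored in a set, they count once, which is one reason the edge count can drop so sharply after a step like this.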
Move parts of {vector_,}property_map into graph_parallel:
This makes no sense to me. Why would property_map code belong in the graph library? They seem to be solving completely different problems, and my naive assumption would be that they have no dependencies on each other, though if that were the case, we probably wouldn't have a dependency graph that looks like spaghetti. Anyway, perhaps you could explain this in more detail?
There are now 95 edges and 33 nodes.
The remaining problematic edges are:
conversion->range
conversion->math
range->algorithm
math->multiprecision
concept_check->parameter
Assuming they can be broken by moving some files around (I believe they can be), we end up with a small graph of strongly connected components:
http://steveire.com/boost/graph_after_remaining.dot http://steveire.com/boost/graph_after_remaining.png
There are now 18 edges and 11 nodes.
Looking at the entire graph again, we get this:
http://steveire.com/boost/graph_final.dot http://steveire.com/boost/graph_final_small.png
Obviously, this is not perfect, but it is a beginning, and it is mostly now a directed graph.
Any comments?
Thanks again for doing all this research. It's definitely useful and in the right direction.