On 10/25/2013 07:35 AM, Ahmed Charles wrote:
Date: Fri, 18 Oct 2013 00:24:57 +0200 From: steveire@gmail.com To: boost@lists.boost.org Subject: [boost] [modularization] Modularizing Boost (modularization)
* Phase 0 - remove dead weight by bumping compiler feature requirements
* Phase 1 - move some files around so that the modularized repos form a mostly directed graph
* Phase 2 - Form some kind of 'boost core library' or 'boost feature normalization library' from the guts of existing libraries like type_traits, static_assert, config, mpl and utilities.
* Phase 3 - Try to port the mpl to variadic templates so that the dependency on Boost.PP is not needed when variadic templates are available.
I'm still catching up on all of the replies, and I like the work overall and it seems to be going in the right direction.
Thanks for the detailed analysis.
One minor concern I have with the steps you outline below is that they seem to be optimizing for reducing edges in the dependency graph rather than creating the right dependency structure, which is a subtle distinction.
Yes, they are distinct. Creating the right dependency structure is more what I had in mind as part of phase 2.
For example, moving the exception classes used by all of boost out of the exception library doesn't really seem like a win overall, since having files grouped logically is the entire point of a module.
I guess 'logically' might depend on point of view. BOOST_THROW_EXCEPTION might not throw an exception, depending on the capabilities of the compiler, among other things. So, from my point of view, it can be seen as 'feature normalization' and belong in some core location with other parts of boost which do 'normalization' of features of toolkits/compilers (think most of Config and static_assert, type_traits etc, and someday boost.any perhaps). The main reason I moved it out in my analysis was because the exception library also has other dependencies. Of course, another option would be to split it into two libraries to isolate those dependencies. Looking at the dependencies again now, even that may be unnecessary depending on what happens with the detail and utility repos.
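To make the 'feature normalization' point concrete, here is a minimal sketch of a throw helper whose behaviour depends on whether the compiler has exceptions enabled. The names sketch::throw_exception, SKETCH_THROW and checked_divide are hypothetical and exist only for illustration; this is not the actual Boost implementation, and the detection macros assume a GCC/Clang or MSVC style toolchain.

```cpp
#include <cstdlib>
#include <iostream>
#include <stdexcept>

// Hypothetical names throughout: sketch::throw_exception and SKETCH_THROW are
// illustrations only, not Boost's actual implementation.
namespace sketch {

#if defined(__cpp_exceptions) || defined(_CPPUNWIND)
// Exceptions are available: forward to a plain throw.
template <class E>
[[noreturn]] void throw_exception(const E& e) {
    throw e;
}
#else
// Exceptions are disabled (e.g. -fno-exceptions): report and abort instead.
template <class E>
[[noreturn]] void throw_exception(const E& e) {
    std::cerr << "fatal: " << e.what() << '\n';
    std::abort();
}
#endif

}  // namespace sketch

#define SKETCH_THROW(e) ::sketch::throw_exception(e)

// Example caller: the call site is identical in both configurations.
inline int checked_divide(int a, int b) {
    if (b == 0) {
        SKETCH_THROW(std::invalid_argument("division by zero"));
    }
    return a / b;
}
```

The caller never names the exception machinery directly, which is why a helper like this can be viewed as normalizing a compiler feature rather than as part of an exception library.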
I think the biggest question to ask is what libraries are expected to be at the core of boost and used by all other libraries and effectively create a layering structure, where libraries belong to layers and can only depend on libraries at its level or below. The goal is to create as many levels as possible by moving libraries up in level whenever they have no more incoming links at their current level.
That's similar to what we are doing in KDE. The kdelibs libraries in KDE4 are very interdependent.

http://quickgit.kde.org/?p=kdelibs.git&a=tree&hb=master

3 years ago I wrote "More interdependence between libraries is easier, but makes independent reuse much harder. We just have to find out if it's worth it."

http://thread.gmane.org/gmane.comp.kde.devel.core/67458/focus=67520

and created a plan of 'tiers':

http://techbase.kde.org/index.php?title=Projects/KDELibsModifications&oldid=55633

It took a lot of work and moving stuff around to 'clean up' kdelibs:

http://community.kde.org/Frameworks/Epics/kdelibs_cleanups

reduce duplication with Qt provisions (note that there is duplication of some classes/concepts between some boost libraries):

http://community.kde.org/Frameworks/Epics/Reduce_class_duplication

and to do actual splitting:

http://community.kde.org/Frameworks/Epics/Splitting_kdelibs

Today we have four split out tiers and a staging directory (libraries not yet ready to be moved to a tier) to manage and layer dependencies:

http://quickgit.kde.org/?p=kdelibs.git&a=tree&hb=frameworks

and we'll soon be splitting that repo into multiple smaller git repos and making releases of modular 'KDE Frameworks 5'. Note that we are doing repo splitting after code modularization, in contrast to what boost is doing, and we'll use git grafts where we want the history from the unmodularized repo in the future.

So, yes, what boost (should be) trying to do regarding modularization sounds familiar to me. There are indeed several lines of work to do with subtle or not-so subtle distinctions between them.
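One way to read the layering (or 'tiers') idea is as a level assignment over the dependency graph: a library with no dependencies sits at level 0, and every other library sits one level above its highest dependency. The sketch below is only an illustration under that assumption; DependencyGraph, level_of and assign_levels are hypothetical names, and it stricter than the rule quoted above in that it never places a library on the same level as something it depends on.

```cpp
#include <algorithm>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Maps each library to the libraries it depends on directly.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Returns the level of `lib`: 0 if it has no dependencies, otherwise one more
// than the highest level among its dependencies. Throws on a cycle, because a
// cyclic graph cannot be layered this way.
inline int level_of(const DependencyGraph& graph, const std::string& lib,
                    std::map<std::string, int>& levels,
                    std::map<std::string, bool>& in_progress) {
    auto known = levels.find(lib);
    if (known != levels.end()) return known->second;
    if (in_progress[lib]) throw std::runtime_error("dependency cycle at " + lib);
    in_progress[lib] = true;

    int level = 0;
    auto node = graph.find(lib);
    if (node != graph.end()) {
        for (const std::string& dep : node->second) {
            level = std::max(level, level_of(graph, dep, levels, in_progress) + 1);
        }
    }
    in_progress[lib] = false;
    levels[lib] = level;
    return level;
}

// Assigns a level to every library that appears as a key in the graph.
inline std::map<std::string, int> assign_levels(const DependencyGraph& graph) {
    std::map<std::string, int> levels;
    std::map<std::string, bool> in_progress;
    for (const auto& entry : graph) {
        level_of(graph, entry.first, levels, in_progress);
    }
    return levels;
}
```

Fed a toy graph in which static_assert depends on config and type_traits depends on both, this would place them at levels 0, 1 and 2. A cycle makes the assignment impossible, which is one reason the nonsensical edges have to be removed first.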
I also think the search should initially be focused on dependency links which make no sense, like proto -> spirit (I think that was the one Eric noticed) before moving on to dependencies which make sense but perhaps the files are in the wrong location.
Yes. I am not familiar enough with boost though to know what makes no sense. I would point anyone at the dot files I linked in my original mail if you want to find other edges which should be removed.
Lots of the changes below seem to be of the form "move utility code from library X to some library that is more heavily used than X and will therefore reduce the graph". Is there a library that is core enough that it can be used by every other library and yet uses no other libraries? If that doesn't exist, should it?
It doesn't currently exist. I proposed creating it as part of phase 2.
There are now 653 edges and 68 nodes.
Next, move boost/detail/iterator.hpp and boost/iterator/iterator_traits.hpp to the type_traits library/repo:
Wouldn't it make more sense for iterator and iterator_traits to be in the iterator library? I can imagine moving lots of useful utilities to the 'detail' library simply because that's what it's for, but the type_traits library shouldn't be used to contain iterator-related code that has nothing to do with type_traits.
The motivation here is similar to what I wrote above about exceptions. Look at the dependencies of the iterator library in graph_all.dot to see how it contributes to the graph. Note also that boost/detail/iterator.hpp actually only contains traits helpers. So both files are about iterator traits. Moving them is appropriate in my point of view because they are related to traits, and because most users of those files do not need most of the iterator library or its dependencies.
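As a rough illustration of code that needs only iterator traits and nothing else from an iterator library, consider the sketch below. It uses std::iterator_traits from the standard library as a stand-in for the Boost headers under discussion, and sum_range is a hypothetical function invented for the example.

```cpp
#include <iterator>
#include <vector>

// Sums a range using only the iterator's traits; no iterator adaptors or
// facades are required. std::iterator_traits stands in here for the Boost
// headers discussed above.
template <class Iterator>
typename std::iterator_traits<Iterator>::value_type
sum_range(Iterator first, Iterator last) {
    typename std::iterator_traits<Iterator>::value_type total{};
    for (; first != last; ++first) total += *first;
    return total;
}
```

A user in this position only looks up value_type, so pulling in the rest of an iterator library and its dependencies would add graph edges without adding functionality.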
This makes sense. Though it seems like you are suggesting that config is more widely used than detail, which seems strange given detail's goal as being the melting pot of various little things without a home. It's not obvious to me why moving something from one to the other really makes a big difference, but version code should be in a library that has no dependencies, like the library I suggested above.
Config also has no dependencies (as of https://svn.boost.org/trac/boost/changeset/85274 ). But as you wrote, and as I wrote in the OP, there may be sense in creating a core library/repo of which config is a part.
There are now 485 edges and 64 nodes. Several libraries form an inner mesh:
If we treat them as one element for now, we get a new graph from the rest
There are now 138 edges and 39 nodes
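Treating a group of libraries as one element amounts to contracting those nodes into a single node and then recounting. The following is a minimal sketch of that step, not the script used to produce the numbers above; Edge, EdgeSet, collapse and node_count are hypothetical names.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

using Edge = std::pair<std::string, std::string>;  // (from, to)
using EdgeSet = std::set<Edge>;

// Replaces every node in `group` with the single node `merged_name`, dropping
// self-loops and duplicate edges that the merge creates.
inline EdgeSet collapse(const EdgeSet& edges, const std::set<std::string>& group,
                        const std::string& merged_name) {
    auto rename = [&](const std::string& n) {
        return group.count(n) ? merged_name : n;
    };
    EdgeSet result;
    for (const Edge& e : edges) {
        std::string from = rename(e.first);
        std::string to = rename(e.second);
        if (from != to) result.insert({from, to});
    }
    return result;
}

// Counts the distinct nodes that appear at either end of any edge.
inline std::size_t node_count(const EdgeSet& edges) {
    std::set<std::string> nodes;
    for (const Edge& e : edges) {
        nodes.insert(e.first);
        nodes.insert(e.second);
    }
    return nodes.size();
}
```

Because edges inside the group become self-loops and are dropped, and parallel edges into or out of the group merge, both the edge count and the node count fall after the contraction.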
Move parts of {vector_,}property_map into graph_parallel:
This makes no sense to me. Why would property_map code belong in the graph library? They seem to be solving completely different problems and my naive assumption would be that they have no dependencies on each other, though if that were the case, we probably wouldn't have a dependency graph that looks like spaghetti. Anyway, perhaps you could explain this in more detail?
I think Edward and Jeremiah have made progress on this issue, though I admit I didn't follow the details closely, so I can't say more: http://thread.gmane.org/gmane.comp.lib.boost.devel/245078/focus=245276
Thanks again for doing all this research. It's definitely useful and in the right direction.
And thanks for the feedback! Steve.