On Tuesday 16 September 2014 09:42:25 Robert Ramey wrote:
I think the notion of "dependency" is richer than can be captured in this sort of graph. So it can't be understood in terms of this graph alone. I've written about this in the past - maybe my post was lost due to Google forum issues. For anyone who's interested, here it is again.
Consider another simple case - date time/serialization.hpp
Most date/time users don't use this - but a few do. Is serialization a prerequisite for date/time? Which users are we talking about? One can't win here. If you distribute serialization with every use of date/time, you're distributing too much. If you don't, you'll be failing to ship functionality which some users need. What is the solution here - make two libraries out of date/time? Or what?
The solution will be to separate the dependency on Serialization into an optional component. This can be a header or a git submodule or a sublib in DateTime or something else. Which form it takes depends on a number of aspects, including maintenance convenience, access control, distribution and deployment infrastructure. I agree that many of these aspects are not defined at the moment, but from the perspective of maintenance, access permissions and modularization effort a sublib looks most feasible to me.
Suppose I have a simple application A which uses the text_archive and only serializable types defined within the application itself. It should be clear that I can ship that application without shipping any of the libraries or code in ../serialization/variant.hpp etc..., xml_archive etc... So one can say that A is not dependent upon anything other than the serialization library. So, at least for this application, the dependency graph referred to above is not a good indicator of what I have to ship with my app. In fact, it's misleading.
In an ideal world you could distribute your application with a subset of Boost selected on a per-header basis. But I think this task is not realistic at the current stage - mostly because it's difficult to correctly discover all possible dependencies on a per-header basis. At this point the most reasonable level of dependency tracking is per-library or per-sub-library. It is not optimal in that it can add dependencies you don't actually need, but it's certainly better than the monolithic Boost. Returning to your example, the application will pull Serialization and everything it depends on, unless you extract the optional bits to sublibs or make them optional otherwise.
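To make the difference between the two granularities concrete, here is a minimal Python sketch that computes what an application would pull in under per-library and per-sublib tracking. The graph edges and the sublib names (such as `serialization.text_archive`) are made up for illustration and are not real Boost dependency data.

```python
from collections import deque


def closure(graph, roots):
    """Return every node reachable from roots (roots included)."""
    seen = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # Nodes without an entry in the graph have no dependencies.
        queue.extend(graph.get(node, ()))
    return seen


# Hypothetical edges, for illustration only.
# Per-library tracking: Serialization is one node with all its dependencies.
per_library = {
    "app": ["serialization"],
    "serialization": ["core", "spirit", "variant", "smart_ptr"],
    "spirit": ["mpl"],
}

# Per-sublib tracking: the optional parts are separate nodes.
per_sublib = {
    "app": ["serialization.text_archive"],
    "serialization.text_archive": ["serialization.core"],
    "serialization.xml_archive": ["serialization.core", "spirit"],
    "serialization.variant": ["serialization.core", "variant"],
    "serialization.shared_ptr": ["serialization.core", "smart_ptr"],
    "serialization.core": ["core"],
    "spirit": ["mpl"],
}

coarse = closure(per_library, ["app"]) - {"app"}
fine = closure(per_sublib, ["app"]) - {"app"}

print("per-library:", sorted(coarse))
print("per-sublib: ", sorted(fine))
```

With these made-up edges the per-library closure drags in everything Serialization touches, while the per-sublib closure contains only the text archive, the Serialization core and what the core itself needs.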
A little reflection reveals why this is so. The graph is generated by considering what it takes to build the serialization DLL and/or LIB which includes all the archive classes and perhaps a bunch more stuff.
So the graph tells us something, but what?
The serialization library has several classes of components:
a) library core - implements code common to all serialization/archives
b) particular archive implementations, xml_archive, ... - dependencies according to the particular archive type being used or built
c) serialization of other library components - e.g. shared_ptr - which depends on shared_ptr itself.
These are probably the best candidates for separating from the core.
d) the test suite - which depends on all the archives being tested - which is the boost build default usage
e) examples - will depend only on a small part of the serialization library.
Tests and examples typically use more components than the library itself (at least, most tests need some testing library or infrastructure). For this reason I consider them a special kind of sublib, in the sense that they are optional, and you would have to explicitly install them so that their dependencies are pulled. When you only need the library itself, you don't have to install the dependencies of its tests and examples.
Now if you wanted to make a series of graphs like:
a) particular archives text_archive, ...
b) serialization for each included type, e.g. variant
c) all tests, or each subset per archive
d) examples
e) other libraries such as date/time which use the serialization library in some of their applications and tests but not in others.
You'd have something more accurate - but alas - more complex to interpret and hence less useful.
The reports Peter publishes show the dependencies of library headers, which are our main concern now and are enough to work with at the current stage. A more accurate report would also include the dependencies needed to build a library from sources (i.e. the dependencies of src/*). The dependencies of tests and examples are not an issue now, but they will be when we have a deployment tool. But if we can track dependencies between libraries, I don't see a problem doing the same for tests and examples.
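As a rough illustration of how a header-level report can be folded into library-level dependencies, here is a small Python sketch. It is not how boostdep works internally; it simply assumes that the first path component after `boost/` names the owning library, which real Boost layouts do not always follow. The header names and contents in `sample_headers` are invented for the example.

```python
import re

# Matches '#include <boost/...>' and '#include "boost/..."' lines.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]boost/([^>"]+)[>"]', re.MULTILINE)


def module_of(header_path):
    """Map a path relative to boost/ to a library name.

    Assumption: the first path component is the library, so
    'date_time/date.hpp' -> 'date_time' and 'config.hpp' -> 'config'.
    """
    first = header_path.split("/", 1)[0]
    return first[:-4] if first.endswith(".hpp") else first


def header_dependencies(headers):
    """Return {library: set of other libraries its headers include}."""
    deps = {}
    for name, text in headers.items():
        owner = module_of(name)
        for match in INCLUDE_RE.finditer(text):
            target = module_of(match.group(1))
            if target != owner:  # ignore includes within the same library
                deps.setdefault(owner, set()).add(target)
    return deps


# Invented header contents, keyed by path relative to boost/.
sample_headers = {
    "date_time/date.hpp": (
        "#include <boost/config.hpp>\n"
        "#include <boost/date_time/year.hpp>\n"
    ),
    "date_time/serialize.hpp": (
        "#include <boost/serialization/split_free.hpp>\n"
        '#include "boost/date_time/date.hpp"\n'
    ),
}

report = header_dependencies(sample_headers)
for library, targets in sorted(report.items()):
    print(library, "->", ", ".join(sorted(targets)))
```

The same scan could be pointed at the sources, tests or examples of a library to get separate reports for each, which is the extension discussed above.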
If I had more time, I might be able to make this argument more coherent and tighter. Sorry about that.
But the real questions are:
a) What do we want modularization to accomplish, and is this a feasible goal?
Being able to download and install a subset of Boost.
b) Do we want to obsolete the original concept of equivalency between module and developer responsibility?
I don't think we're doing this. At least, not so far.
c) Do we want to support deployment of a Boost subset? I think we do.
I think so too.
d) How should such a subset be defined - via BCP or some Boost.Build dependency?
The instrumental question is important, and there's no definitive answer yet. Mostly because there are no prototypes, so there's nothing to choose from. I remember only one proposal that was discussed on this list, and it wasn't Boost.Build. Currently, boostdep is used to track dependencies and generate reports, but there's no modularized deployment tool.
e) How fine-grained should such a dependency be measured? Does importing one header make the whole other library a prerequisite, or just that header and the associated *.cpp?
At this point, at the library/sublib level. I don't think header level is feasible right now, but it may be in the future. All of the above is my opinion and understanding, of course.