Stephen Kelly-2 wrote
The graph is showing public module dependencies. I think that's understood.
Not by me. The definition of "module dependency" is unclear to me. I presume it's defined by the situation where, to build one thing, one has to build other things. So if you start out with thing "A", that implies building/including some stuff from other libraries, and so on inductively until one defines a closed set. I could buy this. But the problem is when thing A is a module. Does building A refer to building the library, running tests, building the examples, or building one app? Clearly if I'm building something which includes test_archive I have a different set of dependent "modules" than if I'm building something that includes xml_archive. I'm questioning the whole concept of "module dependency". To me it's ill-defined and actually not definable outside of a more specific context. Hence it can't be used to determine how a large body of code should be (re)factored. We need something more precise - which has yet to be articulated.
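To make the "closed set" idea above concrete, here is a minimal sketch of a transitive closure over a dependency graph. The target names and edges are invented for illustration and are not taken from the real Boost build. The point is only that the closure depends on which target inside a module one starts from.

```python
from collections import deque

# Hypothetical target-level dependencies (illustrative only, not real Boost data).
DEPS = {
    "serialization/lib": ["config", "core"],
    "serialization/xml_archive": ["serialization/lib", "spirit"],
    "serialization/test_archive": ["serialization/lib", "test"],
    "spirit": ["core"],
    "test": ["core"],
    "core": ["config"],
    "config": [],
}


def closure(start, deps):
    """Return every target reachable from `start`, excluding `start` itself."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dep in deps.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    seen.discard(start)  # only matters if the graph contains a cycle back to start
    return seen


# Two starting points inside the same "module" give two different closed sets.
xml_closure = closure("serialization/xml_archive", DEPS)
test_closure = closure("serialization/test_archive", DEPS)
print(sorted(xml_closure - test_closure))
print(sorted(test_closure - xml_closure))
```

Under these assumed edges the two closures differ only in whether they pull in the hypothetical "spirit" or "test" node, which is the ambiguity being described.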
Consider another simple case - date time/serialization.hpp
Most date/time users don't use this - but a few do. Is serialization a prerequisite for date/time? Which users are we talking about? One can't win here. If you distribute serialization with every use of date/time you're distributing too much. If you don't, you'll be failing to ship functionality which some users need. What is the solution here - make two libraries out of date/time? Or what?
The solution is to make serialization low-cost to depend on, so that depending on it is not a problem. That is exactly what I am recommending. The current problem with serialization is that it is expensive in terms of needless dependencies. My recommendation does a lot to solve that for serialization.
I'm reluctant to propose specific courses of action too soon. It's almost for sure I will get it at least partially wrong. I'm going to overcome my reluctance to address this specific case as an example and see where it goes.
A user includes the date-time library in his code. He is dependent on a set of Boost headers which don't include Boost serialization. He can build his app without including and/or linking serialization. He's happy about this. But he has to install the whole serialization module, which under current rules means he installs spirit and a whole lot of other stuff. He's unhappy about this. Damn! The date-time library refers to serialization even though I don't use it, and it means that my date-time DLL is a lot larger than it has to be. This is really annoying to me. Also, when someone mucks with the serialization library it might keep serialization from building, which might keep my app from building even though I don't use even one line of code from it! Very annoying.
Since most of the problem is xml_archive->spirit - we can "fix" this by moving the xml_archive to ?. This will "solve" the problem above. Of course this comes at the expense of everyone who wants to ship serialization with support for all of the archive classes in the package. They will now have to link with some module other than serialization, which is pretty non-obvious. So the net improvement in utility of Boost libraries is not likely to be positive.
The "correct" solution to the above is for date-time to build two modules: date-time and date-time-serialization. Now the original app user above has only what he wants and is not dependent upon Boost serialization. Yet other users of the serialization library have what they want - serialization all in one place. To summarize - the right thing to extract is the serialization of date-time to a separate module. This kind of module has been referred to as a "helper module" (or something like that, I don't remember).
Its place in a "module dependency" graph is unclear. This means that the author of date-time has to refactor somewhat to create two modules. Add support for auto-linking and this is not quite as easy as it would first appear. I know this as I have addressed this within the serialization library itself. I did not want users to have to import the whole wide character code when they weren't going to need it. Hence I created serialization.dll for all the common code and wserialization.dll which includes code specific to wide character functionality. wserialization.dll calls into serialization.dll for core functionality. So we have the case where applications which don't use wide character functionality don't have to pay for it. And those that do get this functionality without having to do anything special - auto-link is fully implemented. Note that this refactoring/modularity is not at all visible in the "module dependency" graph. Nevertheless, I think this approach and result are consistent with your goal of "minimizing dependencies" (don't forget I don't think this phrase is well defined).
At this point there would be a couple of things that would be possible.
a) require/encourage authors of "library helper" (bad term!) modules to build them as separate DLLs/LIBs.
b) divide the serialization (again) so that rather than wserialization and serialization it would be four modules: serialization, serialization_with_xml_archive, wserialization, wserialization_with_xml_archive. And of course don't forget to support auto-link.
Note that while either of these options would address the "problem" faced by the user(s) above, the current "module dependency" graph would be the same in all cases. That is, this graph cannot be used to distinguish those cases where a problem exists from those where it doesn't. The graph is interesting, but can't be used to make any real decisions. Also note that neither of these options would require any changes to git module organization.
Only Boost Build scripts and module source code would change. So it's my view that the current focus on "Modularization" is somewhat misguided. It needs to be considered in terms of what Boost policy should be toward importing other Boost modules, granularity of modules, implementation of auto-linking - things like that. And deciding these things will take a level of consideration and effort that we haven't yet been able to muster. Perhaps your advocacy will provide the necessary sense of urgency to do this.
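The claim that the module-level graph looks the same however a module is split into DLLs can be sketched as follows. This is only an illustration under assumed data: the layouts below, the treatment of spirit as a linkable target, and the helper name `module_graph` are all hypothetical.

```python
# Each entry maps a build target to (owning git module, targets it uses).
# Both layouts are invented for illustration; they are not real Boost build data.
LAYOUT_TWO_TARGETS = {
    "serialization": ("serialization", ["spirit"]),
    "wserialization": ("serialization", ["serialization", "spirit"]),
    "spirit": ("spirit", []),
}

LAYOUT_FOUR_TARGETS = {
    "serialization": ("serialization", []),
    "serialization_with_xml_archive": ("serialization", ["serialization", "spirit"]),
    "wserialization": ("serialization", ["serialization"]),
    "wserialization_with_xml_archive": ("serialization", ["wserialization", "spirit"]),
    "spirit": ("spirit", []),
}


def module_graph(layout):
    """Collapse target-level edges into module-level edges (owner, used module)."""
    edges = set()
    for owner, uses in layout.values():
        for used in uses:
            used_module = layout[used][0]
            if used_module != owner:
                edges.add((owner, used_module))
    return edges


print(module_graph(LAYOUT_TWO_TARGETS))
print(module_graph(LAYOUT_FOUR_TARGETS))
```

In this sketch both layouts collapse to the same single module-level edge, even though in the four-target layout the plain `serialization` target no longer uses spirit at all.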
So the graph tells us something, but what?
Module/package dependencies.
So - the degree of "modularization" cannot be determined or illustrated or measured by examining the graph above.
Disputed.
LOL - and what does that mean? Of course this is the source of our disagreement. To you it seems clear what it means, to me it's undefined. It will take a while to reconcile this.
So, taken to its logical conclusion, extracting xml_archive would lead to extracting other components as well.
Nope. No one has suggested that. Extracting xml_archive isolates the spirit dependency. There is no similar motivation to extract other parts. I looked a little bit into splitting all of the archive parts away from the serialization part, but that still ties all the rest of the archive parts needlessly to spirit.
What I recommend isolates the cost of spirit to the code that uses it.
There could be reason to try to split the rest of the archive stuff from serialization, but I didn't look into that, so I'm not recommending it.
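As a rough illustration of what "isolating the cost of spirit" means at the module level, the sketch below lists which modules reach spirit before and after a hypothetical extraction. The edges, the module name `xml_archive`, and the helper `dependents_of` are assumptions made for this example, not the actual Boost graph.

```python
def dependents_of(target, deps):
    """Return every module that reaches `target`, directly or transitively."""
    result = set()
    changed = True
    while changed:
        changed = False
        for module, requires in deps.items():
            if module == target or module in result:
                continue
            if any(r == target or r in result for r in requires):
                result.add(module)
                changed = True
    return result


# Hypothetical module-level edges before the extraction.
BEFORE = {
    "date_time": ["serialization"],
    "serialization": ["spirit"],
    "spirit": [],
}

# Hypothetical edges after xml_archive becomes its own module.
AFTER = {
    "date_time": ["serialization"],
    "serialization": [],
    "xml_archive": ["serialization", "spirit"],
    "spirit": [],
}

print(sorted(dependents_of("spirit", BEFORE)))
print(sorted(dependents_of("spirit", AFTER)))
```

With these assumed edges, date_time and serialization both reach spirit before the split, and only the hypothetical xml_archive module does afterwards.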
I think the problem is more fundamental than just moving around a few libraries/sublibraries. To me the current "problem" is an incidental side effect of the lack of implementation of certain policies that we have failed to define. So this "piecemeal" approach will lead to unnecessary complexity and not really fix much. If we keep going down this road there will always be something to (re)factor.
But the real questions are: a) what do we want modularization to accomplish and is this a feasible goal.
This is where you are providing a lot of bad stop-energy. Were not these questions answered years ago?
Tell me this: Why did boost migrate away from svn to 100 fractured (not modularized!) git repos?
c) Do we want to support deployment of boost subset? I think we do.
This question was answered years ago.
Why did boost migrate away from svn to 100 fractured (not modularized!) git repos?
My basic point is that these questions have to be addressed before the notion of decoupling can be carried much further.
Insisting that they are not already answered is not helpful.
Oh no!!! The reason we're having this problem is that we've never really thought about it. Before modularized Boost, there wasn't much we could do about it. Now we're looking at using modularized Boost to permit Boost to be made a lot bigger; this in turn raises the issue of deployment subsets, and for the first time we're starting to look seriously at this. Up until now it was just an occasional grumbling. You're suggesting I'm against doing anything. That's not true. I'm against doing the wrong thing. These are not the same. You're also suggesting that I don't think there is a problem. That's also not true. But I don't buy the argument "something needs to be done, this is something, therefore we must do this".
b) created as a separate library module
This is the proposal.
I'm still not quite getting what you mean by creating a separate module. Do you mean something similar to what I mentioned above as serialization_xml_archive... This wouldn't affect the "module dependency" graph but it might accidentally address the "subset deployment" issue. Do you mean creating a separate module at the git level? This would make the "module dependency" graph look more like what I think you want it to look like. But I'm not convinced it would actually address the issue of users importing code that they don't actually use - I'd have to think about this. Or do you mean something else entirely? My real point is that I believe it's premature to start investing in "minimizing module dependencies" before really considering what it is we want to achieve and the alternatives for achieving it. I believe my arguments support the proposition that this is not an unreasonable request.
Robert Ramey
--
View this message in context: http://boost.2283326.n4.nabble.com/modularization-Extract-xml-archive-from-s...
Sent from the Boost - Dev mailing list archive at Nabble.com.