A set of individual libraries vs. One big library

Hello,

I was wondering if there was any interest in making the boost libraries more ``independent''?

Boost contains many libraries with very different application domains. In most cases you'll want to use only one/two libraries in your project (without having to carry the whole boost code base around). But:

- The files in the boost tarball are organized as if boost was a single big library: all the headers in `/boost', all the source files together, etc.
- The header file dependencies are unclear... it makes it very difficult to ``extract'' just the library you need from the tarball (it is not enough to copy the directory whose name matches the library you want).

It would be nice to handle this in a way similar to how ``packages'' are handled on modern Linux systems:

- Each library (header+source+doc+test/..) in one directory/package (instead of having all the headers from all the libraries in the same dir, all the tests in the same directory, all the documentation in another, etc.)
- A clear/automated way to handle dependencies between packages. This would help spot the ``common services'' (like boost.config) that so many libraries rely on.

I think having the tarball organized in terms of individual libraries (instead of a single big one) should be the default, as it makes more sense from a user perspective.

What do you think?

Regards,
- Anis

On Mon, Jan 19, 2009 at 8:03 AM, Anis Benyelloul <anis.benyelloul@gmail.com> wrote:
Hello,
I was wondering if there was any interest in making the boost libraries more ``independent''?
Boost contains many libraries with very different application domains. In most cases you'll want to use only one/two libraries in your project
Let's say that somehow you manage to separate Boost into a bunch of libraries. Consider a situation where Boost library A uses only a single header file from another (large) Boost library B. Doesn't your original reasoning apply? Shouldn't that single header file be a separate library (since -- obviously -- it makes sense on its own)?
You could go for a system where a C++ cpp/header pair is the only logical unit. Such a unit would contain the minimum declarations and definitions that can possibly exist on their own.
But this is pretty much how the header-only libraries in Boost are organized. :) For example, if you need to use shared_ptr.hpp you just #include it; if you need weak_ptr.hpp, you don't care if it's part of the Boost smart pointer "library", you just #include it when needed.
From this point of view, the solution is a tool that can automatically extract the minimum subset of Boost that you need. The reason why this is not practical is that you can't mix and match units from different Boost versions. Therefore, the only way Boost can be distributed is as a single giant release.
Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode
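
The extraction tool described above amounts to a transitive closure over the include graph. A minimal C++ sketch of that idea, assuming a hypothetical in-memory graph (the `IncludeGraph` type and the `minimal_subset` function are illustrative names, not part of Boost or of any Boost tool):

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical include graph: header name -> headers it includes directly.
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Returns every header reachable from `roots` (roots included), i.e. the
// smallest set of files that has to be copied together.
std::set<std::string> minimal_subset(const IncludeGraph& graph,
                                     const std::vector<std::string>& roots) {
    std::set<std::string> needed;
    std::queue<std::string> pending;

    for (const auto& root : roots) {
        if (needed.insert(root).second) pending.push(root);
    }

    // Breadth-first walk; each header is queued at most once.
    while (!pending.empty()) {
        const std::string current = pending.front();
        pending.pop();

        const auto it = graph.find(current);
        if (it == graph.end()) continue;  // header with no recorded includes

        for (const auto& dep : it->second) {
            if (needed.insert(dep).second) pending.push(dep);
        }
    }
    return needed;
}
```

The sketch only answers "which files", not "which versions": it says nothing about whether the extracted headers come from the same release, which is the objection raised above.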

On Mon, Jan 19, 2009 at 11:38 AM, Emil Dotchevski <emildotchevski@gmail.com> wrote:
On Mon, Jan 19, 2009 at 8:03 AM, Anis Benyelloul <anis.benyelloul@gmail.com> wrote:
Hello,
I was wondering if there was any interest in making the boost libraries more ``independent''?
Boost contains many libraries with very different application domains. In most cases you'll want to use only one/two libraries in your project
Let's say that somehow you manage to separate Boost into a bunch of libraries. Consider a situation where Boost library A uses only a single header file from another (large) Boost library B. Doesn't your original reasoning apply? Shouldn't that single header file be a separate library (since -- obviously -- it makes sense on its own)?
You could go for a system where a C++ cpp/header pair is the only logical unit. Such a unit would contain the minimum declarations and definitions that can possibly exist on their own.
But this is pretty much how the header-only libraries in Boost are organized. :) For example, if you need to use shared_ptr.hpp you just #include it; if you need weak_ptr.hpp, you don't care if it's part of the Boost smart pointer "library", you just #include it when needed.
From this point of view, the solution is a tool that can automatically extract the minimum subset of Boost that you need. The reason why this is not practical is that you can't mix and match units from different Boost versions. Therefore, the only way Boost can be distributed is as a single giant release.
Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
However, if you only need a part of boost for a project and you wish to include it with your source distribution, use the boost tool "bcp". It can be given some files to look at, or some names, and it copies out of your boost directory into a custom directory everything that is needed for those files, and nothing else. Then just put that directory somewhere in your build tree and use it like a normal header include directory in your build scripts/solutions. It has been very useful for me recently.

Emil Dotchevski wrote:
On Mon, Jan 19, 2009 at 8:03 AM, Anis Benyelloul <anis.benyelloul@gmail.com> wrote:
Hello,
I was wondering if there was any interest in making the boost libraries more ``independent''?
There definitely is !
Boost contains many libraries with very different application domains. In most cases you'll want to use only one/two libraries in your project
Let's say that somehow you manage to separate Boost into a bunch of libraries. Consider a situation where Boost library A uses only a single header file from another (large) Boost library B. Doesn't your original reasoning apply? Shouldn't that single header file be a separate library (since -- obviously -- it makes sense on its own)?
You could go for a system where a C++ cpp/header pair is the only logical unit. Such a unit would contain the minimum declarations and definitions that can possibly exist on their own.
This seems to me a rather academic question, in contrast to the original request. Boost is still shipped as a single package on most (if not all) platforms, and providing some help to packagers to be able to split it up would be extremely useful. Whether individual libraries are header-only or not isn't all that relevant in this context.
If I'm building software that requires some parts of boost, yet I'm not packaging it myself but instead rely on external packages, I'm quite concerned about what other implied prerequisites this may drag in.
Any work in this direction would be very much appreciated.
Thanks, Stefan
-- ...ich hab' noch einen Koffer in Berlin...

On Mon, Jan 19, 2009 at 11:13 AM, Stefan Seefeld <seefeld@sympatico.ca> wrote:
Emil Dotchevski wrote: This seems to me a rather academic question, in contrast to the original request. Boost is still shipped as a single package on most (if not all) platforms, and providing some help to packagers to be able to split it up would be extremely useful.
How would it be extremely useful? Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

Emil Dotchevski wrote:
On Mon, Jan 19, 2009 at 11:13 AM, Stefan Seefeld <seefeld@sympatico.ca> wrote:
Emil Dotchevski wrote: This seems to me a rather academic question, in contrast to the original request. Boost is still shipped as a single package on most (if not all) platforms, and providing some help to packagers to be able to split it up would be extremely useful.
How would it be extremely useful?
Permit me to answer that.
a) First of all - it takes a LOT less time to build and test only the libraries that one actually uses rather than the whole set.
b) In practice, it's common to find a couple of anomalies in one or another of the libraries, and this takes a disproportionate amount of time to track down and get things built. If one is building/testing only what he is actually going to use, this situation is less likely to waste a lot of time. And each build/test attempt takes less time as well.
d) When some such anomaly occurs, one has to spend the time to determine whether or not it's a problem for the current application, and this takes time as well.
c) As boost grows - this problem gets bigger - even for someone using only a couple of libraries. That is, the current system doesn't scale.
Robert Ramey
Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

On Mon, Jan 19, 2009 at 2:03 PM, Robert Ramey <ramey@rrsd.com> wrote:
Emil Dotchevski wrote:
On Mon, Jan 19, 2009 at 11:13 AM, Stefan Seefeld <seefeld@sympatico.ca> wrote:
Emil Dotchevski wrote: This seems to me a rather academic question, in contrast to the original request. Boost is still shipped as a single package on most (if not all) platforms, and providing some help to packagers to be able to split it up would be extremely useful.
How would it be extremely useful?
Permit me to answer that.
a) First of all - it takes a LOT less time to build and test only the libraries that one actually uses rather than the whole set.
Less time than downloading the (entire) pre-built Boost distribution, that's already been tested?
b) In practice, it's common to find a couple of anomalies in one or another of the libraries, and this takes a disproportionate amount of time to track down and get things built.
Your best bet to avoid anomalies is to not migrate to another version of Boost. Your second-best bet is to not mix and match headers from different Boost releases: you want to download and use a single Boost distribution.
c) As boost grows - this problem gets bigger - even for someone using only a couple of libraries. That is, the current system doesn't scale.
I consider myself a prime example of someone using only a few Boost libraries. Still, the easiest and most reliable strategy for me is to download the entire Boost. The only better alternative would be if someone tests and distributes a Boost release that includes just the stuff I personally need and nothing else. But you can't convince me that I'm better off doing this packaging myself -- even if it's automated -- compared to just downloading an entire Boost release. Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

Emil Dotchevski wrote:
How would it be extremely useful?
Permit me to answer that.
a) First of all - it takes a LOT less time to build and test only the libraries that one actually uses rather than the whole set.
Less time than downloading the (entire) pre-built Boost distribution, that's already been tested?
It sounds as if you never attempted to use boost in a professional product that needs full-scale testing or even certification. There is much more to QA than running unit tests.
I consider myself a prime example of someone using only a few Boost libraries. Still, the easiest and most reliable strategy for me is to download the entire Boost.
Right. Isn't that sad ? (And I won't even start to rant about ABI / API backward-compatible ('minor') releases again, as this topic has been beaten almost to death numerous times in the past. How much easier that would be with a more modular approach...) Regards, Stefan -- ...ich hab' noch einen Koffer in Berlin...

On Mon, Jan 19, 2009 at 2:41 PM, Stefan Seefeld <seefeld@sympatico.ca> wrote:
How would it be extremely useful? Permit me to answer that. a) First of all - it takes a LOT less time to build and test only the libraries that one actually uses rather than the whole set.
Less time than downloading the (entire) pre-built Boost distribution, that's already been tested?
It sounds as if you never attempted to use boost in a professional product that needs full-scale testing or even certification.
There is much more to QA than running unit tests.
My background is in games. Our testing procedures wouldn't change based on whether any library we use was downloaded in its entirety or not. We test the product as a whole, not its components. (Regardless, I'm reasonably certain that Robert meant testing the Boost libraries, not a particular product that uses them, since we've discussed this before.)
I consider myself a prime example of someone using only a few Boost libraries. Still, the easiest and most reliable strategy for me is to download the entire Boost.
Right. Isn't that sad ?
The question is, is there an alternative approach?
(And I won't even start to rant about ABI / API backward-compatible ('minor') releases again, as this topic has been beaten almost to death numerous times in the past. How much easier that would be with a more modular approach...)
Right, describe this modular approach in more detail. What is your definition of a module? Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

Emil Dotchevski wrote:
On Mon, Jan 19, 2009 at 8:03 AM, Anis Benyelloul <anis.benyelloul@gmail.com> wrote:
From this point of view, the solution is a tool that can automatically extract the minimum subset of Boost that you need. The reason why this is not practical is that you can't mix and match units from different Boost versions.
Actually I don't see any need for any special tool of this sort. Suppose I have a simple application.
#include "boost/archive/xml_archive.hpp" ...
Which drags in only the headers it actually uses. If one doesn't have one of those headers installed, then the user either has to get it, make his own or use a different library. Likewise, precompiled libraries are included only to the extent they are used. Finally, modules within libraries are included only to the extent they are used.
And this currently works pretty well. If you set your directory to the library you're interested in, move to the corresponding test directory, and invoke bjam with the appropriate switches (a whole other topic!), you will:
a) build all the required libraries - if any - and no other libraries
b) build and run all the tests in the library you're interested in.
There are only a couple of things lacking here.
a) The libraries will be built in an inconvenient place, so you'll have to move them to some central place "by hand".
b) The "test" isn't recursive. That is, if I invoke the test/build for some library A which uses library B, the libraries for A and B are built, but tests are only run for library A.
Therefore, the only way Boost can be distributed is as a single giant release.
This point is true only because boost libraries are:
a) too tightly coupled.
b) some library authors are too cavalier about changing the library interface.
Even this could be addressed by establishing some convention such as
#define BOOST_SERIALIZATION_LIBRARY_VERSION 137
in each header of the library. Then each library which requires it could specify
#include <boost/archive/xml_archive.hpp>
// specify requirements for other libraries.
BOOST_REQUIRES(BOOST_EXCEPTION, 130, 136)
Probably each library would have to have a "manifest" with all the above macros.
Of course, to actually get there from here would probably not be worth the effort. But this is not due to any fundamental obstacle, but rather due to common boost development practices which I don't expect to change in my lifetime.
Robert Ramey
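
The version-requirement convention proposed above could be sketched roughly as follows. Neither `BOOST_REQUIRES` nor the `*_LIBRARY_VERSION` macros exist in Boost; the helper `version_in_range` and the version number 133 are assumptions made up for illustration, and the sketch relies on C++11 `static_assert`.

```cpp
// Hypothetical sketch of the proposed convention; none of these names
// are part of Boost.

// A library would publish its interface version in each of its headers.
// The value 133 is made up for this example.
#define BOOST_EXCEPTION_LIBRARY_VERSION 133

// True when `version` lies in the inclusive range [min_version, max_version].
constexpr bool version_in_range(int version, int min_version, int max_version) {
    return version >= min_version && version <= max_version;
}

// Pastes the library prefix onto _LIBRARY_VERSION and rejects the build
// at compile time when the published version is outside the range.
#define BOOST_REQUIRES(lib, min_version, max_version)                     \
    static_assert(version_in_range(lib##_LIBRARY_VERSION, (min_version),  \
                                   (max_version)),                        \
                  #lib " version is outside the supported range")

// A dependent library states its requirement once, at namespace scope.
BOOST_REQUIRES(BOOST_EXCEPTION, 130, 136);
```

In this sketch, changing the published version to 137 would make the `BOOST_REQUIRES` line fail to compile, which is the kind of early, explicit failure a per-library manifest would be meant to provide.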

Robert Ramey a écrit :
Actually I don't see any need for any special tool of this sort. Suppose I have a simple application.
Sure : http://www.boost.org/doc/libs/1_37_0/tools/bcp/bcp.html -- ___________________________________________ Joel Falcou - Assistant Professor PARALL Team - LRI - Universite Paris Sud XI Tel : (+33)1 69 15 66 35

On Mon, Jan 19, 2009 at 12:16 PM, Robert Ramey <ramey@rrsd.com> wrote:
Emil Dotchevski wrote:
On Mon, Jan 19, 2009 at 8:03 AM, Anis Benyelloul <anis.benyelloul@gmail.com> wrote:
From this point of view, the solution is a tool that can automatically extract the minimum subset of Boost that you need. The reason why this is not practical is that you can't mix and match units from different Boost versions.
Actually I don't see any need for any special tool of this sort. Suppose I have a simple application.
#include "boost/archive/xml_archive.hpp" ...
Which drags in only the headers it actually uses.. If one doesn't have one of those headers installed, then the user either has to get it, make his own or use a different library.
That's only enough to make the #include succeed. :) If you want everything to work, your best bet is to include the headers from a single Boost release.
Therefore, the only way Boost can be distributed is as a single giant release.
This point is true only because boost libraries are:
a) too tightly coupled. b) some library authors are too cavalier about changing the library interface.
How tightly coupled Boost is is a separate topic, IMO. As long as you have two libraries that include the same header file, you can't reliably distribute the two libraries in separate packages (unless you separate the shared header in its own package.)
As far as changing library interfaces, you _can_ break user code without changing the interface of a library. For example, didn't changes in shared_ptr implementation details break the serialization library at some point?
What distributing a single giant release gives us is a single point of reference for everything. You can distribute your own code and simply state that it needs Boost version 1.37. What is the alternative?
Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode
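
The shared-header problem can be made concrete by listing which headers are used by more than one library. A small C++ sketch, assuming a hypothetical map from library name to the headers it includes (`LibraryHeaders`, `HeaderUsers`, and `shared_headers` are illustrative names, and the input would have to come from some dependency scan that is not shown here):

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical input: library name -> headers that library includes.
using LibraryHeaders = std::map<std::string, std::set<std::string>>;
// Output: header name -> libraries that include it.
using HeaderUsers = std::map<std::string, std::vector<std::string>>;

// Returns only the headers used by two or more libraries. Each of these
// would need a package of its own before its users could ship separately.
HeaderUsers shared_headers(const LibraryHeaders& libs) {
    HeaderUsers users;
    for (const auto& lib : libs) {
        for (const auto& header : lib.second) {
            users[header].push_back(lib.first);
        }
    }
    // Drop headers that belong to a single library.
    for (auto it = users.begin(); it != users.end();) {
        if (it->second.size() < 2) {
            it = users.erase(it);
        } else {
            ++it;
        }
    }
    return users;
}
```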

on Mon Jan 19 2009, Anis Benyelloul <anis.benyelloul-AT-gmail.com> wrote:
Hello,
I was wondering if there was any interest in making the boost libraries more ``independent''?
There is already work underway: https://svn.boost.org/trac/boost/wiki/CMakeModularizeLibrary -- Dave Abrahams BoostPro Computing http://www.boostpro.com

I had never seen this before. I applaud the effort. I also recognise that this will take quite a while. But still, it's good to see things moving in what I see is the right direction.
Robert Ramey
David Abrahams wrote:
on Mon Jan 19 2009, Anis Benyelloul <anis.benyelloul-AT-gmail.com> wrote:
Hello,
I was wondering if there was any interest in making the boost libraries more ``independent''?
There is already work underway: https://svn.boost.org/trac/boost/wiki/CMakeModularizeLibrary

David Abrahams wrote:
on Mon Jan 19 2009, Anis Benyelloul <anis.benyelloul-AT-gmail.com> wrote:
Hello,
I was wondering if there was any interest in making the boost libraries more ``independent''?
There is already work underway: https://svn.boost.org/trac/boost/wiki/CMakeModularizeLibrary
This is certainly great to hear. But in what way is this tied to one particular build system ? Thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin...

David Abrahams <dave <at> boostpro.com> writes:
There is already work underway: https://svn.boost.org/trac/boost/wiki/CMakeModularizeLibrary
Awesome... The part talking about having each library (source+tests+headers+doc+manifest_describing_dependencies) in its own dedicated directory is exactly what I meant. PS. But the page talks about using CMake (i.e. replacing bjam???) Aren't we mixing two problems? (the build system used vs. the way libraries are organized)

Anis Benyelloul wrote:
PS. But the page talks about using CMake (i.e replacing bjam???) Aren't we mixing two problems? (the build system used vs. the way libraries are organized)
I guess CMake is not as powerful as bjam, and therefore requires the libraries to be more modularized. But this is not really the fault of CMake, since it only generates the build-environments for a variety of existing build tools (like Visual Studio, KDevelop, eclipse, "classic" unix make, ...), so it somehow has to limit the allowed non-modularity to what the "weakest" of the supported build-environments supports. Regards, Thomas

on Tue Jan 20 2009, "Thomas Klimpel" <Thomas.Klimpel-AT-synopsys.com> wrote:
Anis Benyelloul wrote:
PS. But the page talks about using CMake (i.e replacing bjam???) Aren't we mixing two problems? (the build system used vs. the way libraries are organized)
I guess CMake is not as powerful as bjam, and therefore requires the libraries to be more modularized.
Not really. It's just much, much better that way... as it is for bjam. -- Dave Abrahams BoostPro Computing http://boostpro.com

Thomas Klimpel wrote:
Anis Benyelloul wrote:
PS. But the page talks about using CMake (i.e replacing bjam???) Aren't we mixing two problems? (the build system used vs. the way libraries are organized)
I guess CMake is not as powerful as bjam, and therefore requires the libraries to be more modularized. But this is not really the fault of CMake, since it only generates the build-environments for a variety of existing build tools (like Visual Studio, KDevelop, eclipse, "classic" unix make, ...), so it somehow has to limit the allowed non-modularity to what the "weakest" of the supported build-environments supports.
Actually, boost-cmake can build boost 'modularized' or as a monolith, and can do the modularization without the use of any external tools (other than cmake itself). This is independent of what build tool (unix make, VS, etc) is used. All that is required is to move some directories around and tweak include paths accordingly. IIRC the process of putting in the modularization code revealed a few interesting circular dependencies as well. Given that it is pretty easy to reconfigure the build environment to handle a modularized boost (this shouldn't be hard in bjam either), I agree that the choice of build tool and this 'modularization' are mostly orthogonal issues (the hard part is to comb the knots out of the tree of intermodule dependencies). -t
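
Finding circular dependencies like the ones mentioned above is a standard depth-first search over the module graph. A C++ sketch, assuming a hypothetical `ModuleGraph` that maps each module to the modules it depends on (the names are illustrative, and this is one common technique rather than anything taken from boost-cmake):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical module graph: module name -> modules it depends on.
using ModuleGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first visit. Reaching a module that is still InProgress means the
// current dependency chain loops back on itself.
bool visit(const ModuleGraph& graph, const std::string& module,
           std::map<std::string, Mark>& marks) {
    Mark& mark = marks[module];  // new entries start as Mark::Unvisited
    if (mark == Mark::InProgress) return true;
    if (mark == Mark::Done) return false;

    mark = Mark::InProgress;
    const auto it = graph.find(module);
    if (it != graph.end()) {
        for (const auto& dep : it->second) {
            if (visit(graph, dep, marks)) return true;
        }
    }
    mark = Mark::Done;  // std::map references stay valid across insertions
    return false;
}

// True if any chain of dependencies in `graph` forms a cycle.
bool has_cycle(const ModuleGraph& graph) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : graph) {
        if (visit(graph, entry.first, marks)) return true;
    }
    return false;
}
```

A check like this only reports that a knot exists; deciding which edge to cut is the hard part described above.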

on Tue Jan 20 2009, Anis Benyelloul <anis.benyelloul-AT-gmail.com> wrote:
David Abrahams <dave <at> boostpro.com> writes:
There is already work underway: https://svn.boost.org/trac/boost/wiki/CMakeModularizeLibrary
Awesome... The part talking about having each library (source+tests+headers+doc+manifest_describing_dependencies) in its own dedicated directory is exactly what I meant.
PS. But the page talks about using CMake (i.e replacing bjam???) Aren't we mixing two problems? (the build system used vs. the way libraries are organized)
Yes, but it's just a matter of necessity driving progress. The reorganization described there is something we have to do in order to get the CMake system to act as we'd like. -- Dave Abrahams BoostPro Computing http://boostpro.com

Anis Benyelloul wrote:
- The header file dependencies are unclear... it makes it very difficult to ``extract'' just the library you need from the tarball (it is not enough to copy the directory whose name matches the library you want)
The bcp tool, that has existed for a while, does just that.

Mathias Gaunard wrote:
Anis Benyelloul wrote:
- The header file dependencies are unclear... it makes it very difficult to ``extract'' just the library you need from the tarball (it is not enough to copy the directory whose name matches the library you want)
The bcp tool, that has existed for a while, does just that.
Is bcp a name that is supposed to mean something? The documentation page doesn't seem to suggest any rationale for the name bcp. Doesn't matter too much, just curious. -- Michael Marcin

on Wed Jan 21 2009, Michael Marcin <mike.marcin-AT-gmail.com> wrote:
Mathias Gaunard wrote:
Anis Benyelloul wrote:
- The header file dependencies are unclear... it makes it very difficult to ``extract'' just the library you need from the tarball (it is not enough to copy the directory whose name matches the library you want)
The bcp tool, that has existed for a while, does just that.
Is bcp a name that is supposed to mean something? The documentation page doesn't seem to suggest any rationale for the name bcp. Doesn't matter too much, just curious.
I always assumed "b => Boost" and "cp => Unix cp" (copy) -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Is bcp a name that is supposed to mean something? The documentation page doesn't seem to suggest any rationale for the name bcp. Doesn't matter too much, just curious.
I always assumed "b => Boost" and "cp => Unix cp" (copy)
That was the intention, made sense to me at the time anyway ;-) John.

John Maddock wrote:
Is bcp a name that is supposed to mean something? The documentation page doesn't seem to suggest any rationale for the name bcp. Doesn't matter too much, just curious.
I always assumed "b => Boost" and "cp => Unix cp" (copy)
That was the intention, made sense to me at the time anyway ;-)
Ah I'm a windows guy so I didn't get it. It makes sense now thanks :). -- Michael Marcin
participants (12)
- Anis Benyelloul
- David Abrahams
- Emil Dotchevski
- Joel Falcou
- John Maddock
- Mathias Gaunard
- Michael Marcin
- OvermindDL1
- Robert Ramey
- Stefan Seefeld
- Thomas Klimpel
- troy d. straszheim