Library dependencies and inter-library code reuse

26 Mar 2010

There's been a lot of chatter on the list recently about how to "grow" boost.  I
wanted to add my own idea to the pot.  Given the scope of what I describe,
I don't expect everyone to be on board with it, because TBH it's a lot of
work.  I'm just throwing it out there, though, because I think that if you
set aside what would be involved in actually making it happen, it has many
highly desirable properties.  Comments would be welcome.

So anyway, one of the recurring questions that pops up over and over and
over is that of whether or not a library should reinvent the wheel, or use
some data structure / class from some other library.

On one side of the fence, there are people who say that it's often ok to
reinvent the wheel inside library A if it avoids introducing a dependency on
boost library B.  Generally, the problem is that introducing a dependency on
B
     a) increases compile time, because often only an extremely small
component of library B is needed, but the header-onlyness of most libraries
causes inclusion explosion, and
     b) introduces logistical problems when library B is not "permitted" for
use in a certain organization.  This is becoming increasingly common as
certain organizations begin whitelisting boost libraries that are allowed in
their codebase.

On the other side of the fence, there are people who say that reinventing the
wheel is almost always bad.  The reasons for this are obvious and I think
everyone here is well aware of the issues, so I won't repeat them.  Note
that I'm not taking sides here, just pointing out that the argument against
reinventing the wheel is well understood.

Is there a way to solve *both* of these problems at the same time?  I think
there is.   Consider the following system:

1) Library maintenance is more flexible, and not necessarily limited to only
a single maintainer or a whitelisted group of maintainers.  There is another
thread on this topic already, so rather than debating this particular point
here, it's probably better suited in the other thread.  For the purposes of
this post, just assume that there is *some* system, with completely
unspecified parameters, where there is *some* way for people to easily get
their own code into an existing library that they are not necessarily the
author / maintainer of.  At the very least, *other library authors* would be
given a "fast track" to submitting changes to other libraries, whatever
"fast track" means.

2) All libraries would be required to support both pimpl'ed and header-only
implementations.  I realize this is sort of a huge undertaking and requires
retrofitting tons of libraries, but I think the advantages are worth it.
The users end up winning in the long run, and nobody ever has to have this
debate again about whether header-only libraries are good or bad.  If
you want header-only libraries, use #include <boost/someheader.hpp>.  If you
want pimpl'ed libraries, use #include <boostp/someheader.hpp>.  Pimpl'ed
headers only include other pimpl'ed headers.  This is also what makes the
system I'm describing possible.
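
To make the pimpl'ed half of 2) a bit more concrete, here is a minimal
single-file sketch of the idiom.  Everything in it is made up for
illustration: the namespace sketch, the class greeter, and the idea that the
first part would live in a <boostp/...> header while the second part is
compiled once into the library.  It also uses std::unique_ptr, which is an
assumption about the available standard library and not something any
existing boost library is claimed to do.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- Part 1: what a hypothetical pimpl'ed header would expose ----
// Only a forward declaration of the implementation is visible here, so
// including this part drags in no other library's headers.
namespace sketch {

class greeter {
public:
    explicit greeter(std::string name);
    ~greeter();                              // defined where impl is complete
    greeter(greeter&&) noexcept;
    greeter& operator=(greeter&&) noexcept;

    std::string greet() const;

private:
    struct impl;                             // incomplete type in the header
    std::unique_ptr<impl> p_;
};

} // namespace sketch

// ---- Part 2: what would be compiled once into the library binary ----
// Heavy dependencies would be included only here, out of the user's sight.
namespace sketch {

struct greeter::impl {
    std::string name;
};

greeter::greeter(std::string name) : p_(new impl{std::move(name)}) {}
greeter::~greeter() = default;
greeter::greeter(greeter&&) noexcept = default;
greeter& greeter::operator=(greeter&&) noexcept = default;

std::string greeter::greet() const { return "hello, " + p_->name; }

} // namespace sketch
```

A header-only variant of the same class would simply put Part 2 in the
header as inline definitions, which is the trade-off between compile time
and link-time dependencies that the two include roots would let users pick.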

3) The boost build process is changed to allow additional options that
either: a) automatically select all libraries for external visibility, b)
select a specific list of libraries for external visibility, or c) select a
specific list of libraries to disable external visibility.  During the
build process, what is currently known as the "boost" directory (e.g. the
boost folder in #include <boost/shared_ptr.hpp>) gets a different name.
Perhaps it's changed to something like <boost/internal/shared_ptr.hpp>.
This happens EVERYWHERE.  All boost library header and source files would
need to be updated to include from boost/internal.  For those libraries that
were selected for external visibility, a symlink is created under the boost
directory that points into the internal directory.  On Windows a directory
junction is used.  Note that directory junctions are supported since
Windows 2000, so I don't think this is an issue.  It's very easy to create
them, just that people rarely do.
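
The three build options in 3) boil down to a small selection rule, which can
be sketched as below.  This is only an illustration under my own
assumptions: the names visibility_mode, externally_visible, link_path and
link_target are hypothetical, nothing like them exists in the boost build
system, and the boost/<lib> to boost/internal/<lib> layout is just one
reading of the scheme described above.

```cpp
#include <cassert>
#include <set>
#include <string>

namespace sketch {

// Hypothetical names for options a), b) and c) of the build process.
enum class visibility_mode {
    all,                // a) every library is externally visible
    only_listed,        // b) only the listed libraries are visible
    all_except_listed   // c) every library except the listed ones
};

// Returns the libraries that should get a public symlink (or junction).
std::set<std::string> externally_visible(const std::set<std::string>& available,
                                         visibility_mode mode,
                                         const std::set<std::string>& listed) {
    std::set<std::string> out;
    for (const std::string& lib : available) {
        const bool in_list = listed.count(lib) != 0;
        const bool visible =
            mode == visibility_mode::all ||
            (mode == visibility_mode::only_listed && in_list) ||
            (mode == visibility_mode::all_except_listed && !in_list);
        if (visible) out.insert(lib);
    }
    return out;
}

// Assumed layout: the public name is a link, the real files live in internal.
std::string link_path(const std::string& lib)   { return "boost/" + lib; }
std::string link_target(const std::string& lib) { return "boost/internal/" + lib; }

} // namespace sketch
```

A build step would then walk the returned set and create one link per
library from link_path(lib) to link_target(lib), using whatever mechanism
the platform offers.  Libraries left out of the set stay reachable only
through boost/internal, which is what lets other boost libraries keep using
them.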

The end result of this is the following:

1) Users notice minimal impact.  The only difference, in fact, is that if
they want pimpled implementations they #include <boostp/...> instead of
#include <boost/...>.  This means the current behavior continues to be the
default.  After they build boost, the build process handles all the
logistics.  Everything is already in the correct structure to "just work",
all symlinks have been created as described in 3) above, and the user only
needs to set their include path to the top level boost directory.

2) All boost libraries can use all other boost libraries with ease,
regardless of whether or not an organization's policy allows use of those
libraries in user code, because internally all libraries include from
boost/internal.

3) Despite the fact that all these additional dependencies that a user isn't
technically allowed to be using due to his organization's policy are indeed
being used, compile times can be kept sane because the user has access to
pimpled libraries, and pimpled libraries only depend on other pimpled
libraries, which is possible since all libraries are required to support
pimpled versions.

So there we go.  Does this work, and if not why not?   Even if we agree it's
a huge undertaking, is it worth it?  And if not, why not?

Zach

Zachary Turner
