Library dependencies and inter-library code reuse

There's been a lot of chatter on the list recently about how to "grow" Boost, and I wanted to add my own idea to the pot. Due to the scope of what I describe, I don't think it's reasonable to expect everyone to be on board with it because, TBH, it's a lot of work. I'm just throwing it out there, and I think that if you forget about what would be involved in actually making it happen, it has many highly desirable properties. Comments would be welcome.

One of the questions that comes up over and over is whether a library should reinvent the wheel or use some data structure / class from another library.

On one side of the fence, there are people who say that it's often OK to reinvent the wheel inside library A if it avoids introducing a dependency on Boost library B. Generally, the problem is that introducing a dependency on B a) increases compile time, because often only an extremely small component of library B is needed, but the header-onlyness of most libraries causes inclusion explosion, and b) introduces logistical problems when library B is not "permitted" for use in a certain organization. This is becoming increasingly common as organizations begin whitelisting the Boost libraries that are allowed in their codebase.

On the other side of the fence, there are people who say that reinventing the wheel is almost always bad. The reasons for this are obvious and I think everyone here is well aware of the issues, so I won't repeat them. Note that I'm not taking sides here, just pointing out that the argument against reinventing the wheel is well understood.

Is there a way to solve *both* of these problems at the same time? I think there is. Consider the following system:

1) Library maintenance is more flexible, and not necessarily limited to a single maintainer or a whitelisted group of maintainers. There is another thread on this topic already, so this particular point is better debated there. For the purposes of this post, just assume that there is *some* system, with completely unspecified parameters, where there is *some* way for people to easily get their own code into an existing library that they are not necessarily the author / maintainer of. At the very least, *other library authors* would be given a "fast track" to submitting changes to other libraries, whatever "fast track" means.

2) All libraries would be required to support both pimpl'ed and header-only implementations. I realize this is a huge undertaking and requires retrofitting tons of libraries, but I think the advantages are worth it. The users end up winning in the long run, and nobody ever has to have the debate again about whether header-only libraries are good or bad. If you want header-only libraries, use #include <boost/someheader.hpp>. If you want pimpl'ed libraries, use #include <boostp/someheader.hpp>. Pimpl'ed headers only include other pimpl'ed headers. It also makes the system I'm describing possible.

3) The Boost build process is changed to allow additional options that either:
   a) automatically select all libraries for external visibility,
   b) select a specific list of libraries for external visibility, or
   c) select a specific list of libraries to disable external visibility.
During the build process, what is currently known as the "boost" directory (e.g. the boost folder in #include <boost/shared_ptr.hpp>) gets a different name. Perhaps it's changed to something like <boost/internal/shared_ptr.hpp>. This happens EVERYWHERE. All Boost library header and source files would need to be updated to include from boost/internal. For those libraries that were selected for external visibility, a symlink is created under the boost directory pointing into the internal directory. On Windows a directory junction is used. Note that directory junctions have been supported since Windows 2000, so I don't think this is an issue. They are very easy to create; people just rarely do.

The end result of this is the following:

1) Users notice minimal impact. The only difference, in fact, is that if they want pimpl'ed implementations they #include <boostp/...> instead of #include <boost/...>. This means the current behavior continues to be the default. After they build Boost, the build process handles all the logistics. Everything is already in the correct structure to "just work", all symlinks have been created as described in 3) above, and the user only needs to set their include path to the top-level boost directory.

2) All Boost libraries can use all other Boost libraries with ease, regardless of whether an organization's policy allows use of those libraries in user code, because internally all libraries include from boost/internal.

3) Even though libraries now pull in additional dependencies that the user isn't technically allowed to use under his organization's policy, compile times can be kept sane because the user has access to pimpl'ed libraries, and pimpl'ed libraries only depend on other pimpl'ed libraries, which is possible since all libraries are required to support pimpl'ed versions.

So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?

Zach
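For readers unfamiliar with the term, here is a minimal sketch of what a pimpl'ed interface could look like. It is an illustration only, not part of the proposal and not any existing Boost header: the namespace `demo` and the class `word_counter` are hypothetical, both halves are shown in one file for brevity, and C++11 `std::unique_ptr` is assumed (code from the time of this thread would more likely have used `boost::scoped_ptr`).

```cpp
// ---- what would be the pimpl'ed public header ----
// Only lightweight headers are needed by users of the class.
#include <cstddef>
#include <memory>
#include <string>

namespace demo {  // hypothetical namespace

class word_counter {  // hypothetical class
public:
    word_counter();
    ~word_counter();  // defined below, where impl is a complete type
    word_counter(const word_counter&) = delete;
    word_counter& operator=(const word_counter&) = delete;

    void add(const std::string& word);
    std::size_t count(const std::string& word) const;

private:
    struct impl;                  // forward declaration only
    std::unique_ptr<impl> impl_;  // users never see the members of impl
};

}  // namespace demo

// ---- what would be the separately compiled source file ----
// The heavier dependency is included here, not in the public header.
#include <map>

namespace demo {

struct word_counter::impl {
    std::map<std::string, std::size_t> counts;
};

word_counter::word_counter() : impl_(new impl) {}
word_counter::~word_counter() = default;

void word_counter::add(const std::string& word) { ++impl_->counts[word]; }

std::size_t word_counter::count(const std::string& word) const {
    std::map<std::string, std::size_t>::const_iterator it =
        impl_->counts.find(word);
    return it == impl_->counts.end() ? 0 : it->second;
}

}  // namespace demo
```

A caller would construct a `demo::word_counter`, call `add("boost")` twice, and get 2 back from `count("boost")`. The point of the split is that a translation unit using the class never parses `<map>`; the cost is a heap allocation and an extra indirection per object.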

On Thu, Mar 25, 2010 at 7:30 PM, Zachary Turner <divisortheory@gmail.com> wrote:
So anyway, one of the recurring questions that pops up over and over and over is that of whether or not a library should reinvent the wheel, or use some data structure / class from some other library.
Things are much more complicated. Take for example something as basic as BOOST_STATIC_ASSERT: it is more complex than it could be because it supports lots of broken compilers and this injects a lot of headers into a client library. That is pure overhead for users of that library if the library doesn't support the broken compilers anyway. In that case, a simple typedef int a_must_be_less_than_42[a<42]; is preferable IMO. Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode
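As an illustration of the kind of dependency-free check described above, here is a small sketch. It uses a common variant of the array-typedef trick in which a false condition yields a bound of -1; with a bare `[a<42]` a false condition yields a zero-length array, which the standard forbids but some compilers accept as an extension. The value of `a` and the macro name `DEMO_STATIC_ASSERT` are assumptions made for the example.

```cpp
// A compile-time check that includes no headers at all.
// `a` is an illustrative constant; the typedef name doubles as the message.
const int a = 7;

// The bound is 1 when the condition holds and -1 otherwise. A negative
// array bound is ill-formed, so a false condition stops compilation.
typedef int a_must_be_less_than_42[(a < 42) ? 1 : -1];

// Macro form for reuse (hypothetical name, not a Boost macro).
#define DEMO_STATIC_ASSERT(cond, name) typedef int name[(cond) ? 1 : -1]

DEMO_STATIC_ASSERT(sizeof(long) >= sizeof(int), long_not_narrower_than_int);
// DEMO_STATIC_ASSERT(a > 42, a_must_exceed_42);  // would fail to compile
```

The trade-off is the one raised in the message: the diagnostic is cruder than what a full static-assert facility produces, but the client library pulls in nothing.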

AMDG Zachary Turner wrote:
So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?
Even if it were a good idea, it isn't going to happen. Nothing that requires that much effort is ever going to happen around here. If we did have that kind of manpower, I think there are many higher priorities. In Christ, Steven Watanabe

On Thu, Mar 25, 2010 at 10:43 PM, Steven Watanabe <watanabesj@gmail.com> wrote:
AMDG
Zachary Turner wrote:
So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?
Even if it were a good idea, it isn't going to happen. Nothing that requires that much effort is ever going to happen around here. If we did have that kind of manpower, I think there are many higher priorities.
Surely we can't adopt that stance forever, can we? It's not difficult to imagine a scenario down the line where Boost has hundreds of independent libraries. This won't scale. It *can't* scale. But at the same time, it really doesn't make sense for everyone to continue reinventing wheels in every single new library that gets added to Boost. It defeats the whole purpose of having a generic library in the first place, and makes the exact problem that everyone complains about (slow compile times) even worse!

We can propose alternative solutions, but ultimately I don't see any way around something that really does require that much effort. Is it easier to start embracing that effort now, when Boost has a manageable (albeit still large) number of libraries, or do we wait a couple of years and realize that it really is too late at that point?

In order for Boost to live on, I think there necessarily must be a system such that a) only selected libraries are exposed to the user, and b) libraries can freely reuse other libraries without any noticeable implications for the user.

If the reuse problem is not solved, maintenance grows ever more complicated as time passes, because a fix for a bug in library A's implementation of foo is not propagated to the implementations of foo in libraries B, C, D, and E. This compounds as more libraries are added. What could have been fixed by 1 person needs to be fixed by 5 different people.

If the selective external visibility problem is not solved, Boost's sandbox approach to adding more and more libraries will alienate more and more people, to the point that people really do just stop using it altogether.

I see both of these outcomes as disastrous, and I can't think of anything that should be higher priority than making sure they don't happen.

I'm certainly open to alternatives, and I see some of them being discussed independently in other threads on the list in recent days, but I think any reasonable solution must be designed around both problems at the same time.

At Thu, 25 Mar 2010 22:58:45 -0500, Zachary Turner wrote:
On Thu, Mar 25, 2010 at 10:43 PM, Steven Watanabe <watanabesj@gmail.com> wrote:
AMDG
Zachary Turner wrote:
So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?
Even if it were a good idea, it isn't going to happen. Nothing that requires that much effort is ever going to happen around here. If we did have that kind of manpower, I think there are many higher priorities.
Surely we can't adopt that stance forever can we? It's not difficult to imagine a scenario down the line where Boost has hundreds of independent libraries. This won't scale. It *cant* scale. But at the same time, it really doesn't make sense for everyone to continue reinventing wheels in every single new library that gets added to boost. It defeats the whole purpose of having a generic library in the first place, and makes the exact problem that everyone complains about (slow compile times) even worse!
Untangling (and minimizing) intra-library dependencies is certainly doable—the untangling part has already been done (http://gitorious.org/boost)—but you proposed something more radical… and probably impossible when you consider the pimpl/header-only requirement. A pimpl-based type traits library? -- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com

Unbelievably Awesome. On Fri, Mar 26, 2010 at 1:11 AM, David Abrahams <dave@boostpro.com> wrote:
At Thu, 25 Mar 2010 22:58:45 -0500, Zachary Turner wrote:
On Thu, Mar 25, 2010 at 10:43 PM, Steven Watanabe <watanabesj@gmail.com> wrote:
AMDG
Zachary Turner wrote:
So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?
Even if it were a good idea, it isn't going to happen. Nothing that requires that much effort is ever going to happen around here. If we did have that kind of manpower, I think there are many higher priorities.
Surely we can't adopt that stance forever can we? It's not difficult to imagine a scenario down the line where Boost has hundreds of independent libraries. This won't scale. It *cant* scale. But at the same time, it really doesn't make sense for everyone to continue reinventing wheels in every single new library that gets added to boost. It defeats the whole purpose of having a generic library in the first place, and makes the exact problem that everyone complains about (slow compile times) even worse!
Untangling (and minimizing) intra-library dependencies is certainly doable—the untangling part has already been done (http://gitorious.org/boost)—but you proposed something more radical… and probably impossible when you consider the pimpl/header-only requirement. A pimpl-based type traits library?
-- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Tom Brinkman-4 wrote:
Unbelievably Awesome.
On Fri, Mar 26, 2010 at 1:11 AM, David Abrahams <dave@boostpro.com> wrote:
At Thu, 25 Mar 2010 22:58:45 -0500, Zachary Turner wrote:
On Thu, Mar 25, 2010 at 10:43 PM, Steven Watanabe <watanabesj@gmail.com> wrote:
AMDG
Zachary Turner wrote:
So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?
Even if it were a good idea, it isn't going to happen. Nothing that requires that much effort is ever going to happen around here. If we did have that kind of manpower, I think there are many higher priorities.
Surely we can't adopt that stance forever can we? It's not difficult to imagine a scenario down the line where Boost has hundreds of independent libraries. This won't scale. It *cant* scale. But at the same time, it really doesn't make sense for everyone to continue reinventing wheels in every single new library that gets added to boost. It defeats the whole purpose of having a generic library in the first place, and makes the exact problem that everyone complains about (slow compile times) even worse!
Untangling (and minimizing) intra-library dependencies is certainly doable—the untangling part has already been done (http://gitorious.org/boost)—but you proposed something more radical… and probably impossible when you consider the pimpl/header-only requirement. A pimpl-based type traits library?
-- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com
Please don't top post. Vicente

At Fri, 26 Mar 2010 03:45:51 -0700 (PDT), Vicente Botet Escriba wrote: <snip whole thread>
Please don't top post.
Vicente
Please don't overquote. just-for-symmetry-ly y'rs, -- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com

On 26 March 2010 08:11, David Abrahams <dave@boostpro.com> wrote:
Untangling (and minimizing) intra-library dependencies is certainly doable—the untangling part has already been done (http://gitorious.org/boost)—but you proposed something more radical… and probably impossible when you consider the pimpl/header-only requirement. A pimpl-based type traits library?
I don't think it's a good idea to divide boost up based on the current directory structure. We really should sort out the structure of boost before splitting it up, not after. Daniel

At Fri, 26 Mar 2010 10:30:51 +0000, Daniel James wrote:
On 26 March 2010 08:11, David Abrahams <dave@boostpro.com> wrote:
Untangling (and minimizing) intra-library dependencies is certainly doable—the untangling part has already been done (http://gitorious.org/boost)—but you proposed something more radical… and probably impossible when you consider the pimpl/header-only requirement. A pimpl-based type traits library?
I don't think it's a good idea to divide boost up based on the current directory structure. We really should sort out the structure of boost before splitting it up, not after.
Nah, then you have to do the whole thing at once and you're bottlenecked until it happens, and it will never happen. Get the libraries separated and working with reasonable structure, and sorting out other structural issues becomes something each maintainer can take care of. Part of the reason things don't progress as well as they could around here is that every consequential job begins to look like it involves someone being willing to touch everything. -- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com

On 26 March 2010 18:00, David Abrahams <dave@boostpro.com> wrote:
Nah, then you have to do the whole thing at once and you're bottlenecked until it happens, and it will never happen. Get the libraries separated and working with reasonable structure, and sorting out other structural issues becomes something each maintainer can take care of. Part of the reason things don't progress as well as they could around here is that every consequential job begins to look like it involves someone being willing to touch everything.
Well, you should at least ask beforehand. We're going to have to move around a lot of files. Ironically, that suggests that changing version control systems would be a bad idea as subversion is the best suited for tracking that kind of thing. Daniel

On Fri, Mar 26, 2010 at 3:11 AM, David Abrahams <dave@boostpro.com> wrote:
At Thu, 25 Mar 2010 22:58:45 -0500, Zachary Turner wrote:
On Thu, Mar 25, 2010 at 10:43 PM, Steven Watanabe <watanabesj@gmail.com> wrote:
AMDG
Zachary Turner wrote:
So there we go. Does this work, and if not why not? Even if we agree it's a huge undertaking, is it worth it? And if not, why not?
Even if it were a good idea, it isn't going to happen. Nothing that requires that much effort is ever going to happen around here. If we did have that kind of manpower, I think there are many higher priorities.
Surely we can't adopt that stance forever can we? It's not difficult to imagine a scenario down the line where Boost has hundreds of independent libraries. This won't scale. It *cant* scale. But at the same time, it really doesn't make sense for everyone to continue reinventing wheels in every single new library that gets added to boost. It defeats the whole purpose of having a generic library in the first place, and makes the exact problem that everyone complains about (slow compile times) even worse!
Untangling (and minimizing) intra-library dependencies is certainly doable—the untangling part has already been done (http://gitorious.org/boost)—but you proposed something more radical… and probably impossible when you consider the pimpl/header-only requirement. A pimpl-based type traits library?
Ok, sure. In some cases it's not possible. I guess I didn't explicitly say that because I thought it was kind of obvious, but you're right, I should have clarified, or at least not emphasized *every* library. But the point was that if it's possible, it should be done. I think it's possible in large part for almost every library that generates some runtime code, i.e. that is not strictly a metaprogramming library. Most libraries use some element of metaprogramming, but oftentimes those uses are internal details, and the metaprogramming-related headers can be included only from CPP files.

Like you said, minimizing inter-library dependencies isn't that impractical. My point was that I don't think minimizing inter-library dependencies is necessarily a worthy goal. We should *encourage* libraries to use other libraries, while mitigating the negative effects of these dependencies from the user's point of view.

Zach

At Fri, 26 Mar 2010 08:30:49 -0500, Zachary Turner wrote:
Ok, sure. In some cases it's not possible. I guess I didn't explicitly say that because I thought it was kind of obvious but you're right, I should have clarified, or at least not emphasized *every* library. But the point was that if it's possible, it should be done. I think that it's possible in large part for almost every library which is not strictly a metaprogramming library, or which generates some runtime code. Most libraries use some element of metaprogramming, but oftentimes they are internal details and the metaprogramming-related headers can be included only from CPP files.
Like you said, minimizing inter-library dependencies isn't that impractical. My point was that I don't think minimizing inter-library dependencies is necessarily a worthy goal. We should *encourage* libraries to use other libraries, while mitigating the negative effects of these dependencies from the user's point of view.
I simply don't have time to argue that point right now, but I disagree about the worthiness of the goal. Every dependency should be an informed and conscious decision on the part of a library developer. -- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com

Steven Watanabe:
AMDG
David Abrahams wrote:
Every dependency should be an informed and conscious decision on the part of a library developer.
Humph. It's not as though #includes mysteriously appear in your code when you're looking the other way.
They do, as part of bug fixes made by others and as indirect dependencies. This goes unnoticed because the test infrastructure always has the whole of Boost available. One needs to actively fight dependencies, or they do creep in.

Peter Dimov wrote:
Steven Watanabe:
AMDG
David Abrahams wrote:
Every dependency should be an informed and conscious decision on the part of a library developer.
Humph. It's not as though #includes mysteriously appear in your code when you're looking the other way.
They do, as part of bug fixes made by others and as indirect dependencies. This goes unnoticed because the test infrastructure always has the whole of Boost available. One needs to actively fight dependencies, or they do creep in.
They also do thanks to bad coding choices. I just had a job applicant doing a coding test today who opened his source file with a couple of dozen lines of includes and defines that he referred to as his "standard header" that he uses for everything. Unfortunately, only about 3 lines of it had anything to do with the coding problem he was supposed to be solving. One more way is by removing dependencies in the code. If a change or improvement eliminates the existing references to a header that lead to it being included, it is not always easy to know that and remove the include directive. Especially if this removal happens incrementally over a few changes. John

At Fri, 26 Mar 2010 20:30:25 +0200, Peter Dimov wrote:
Steven Watanabe:
AMDG
David Abrahams wrote:
Every dependency should be an informed and conscious decision on the part of a library developer.
Humph. It's not as though #includes mysteriously appear in your code when you're looking the other way.
They do, as part of bug fixes made by others and as indirect dependencies. This goes unnoticed because the test infrastructure always has the whole of Boost available. One needs to actively fight dependencies, or they do creep in.
They also appear indirectly. If A depends on B and B acquires a new dependency, A now has a new dependency too. -- Dave Abrahams Meet me at BoostCon: http://www.boostcon.com BoostPro Computing http://www.boostpro.com
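The indirect case can be made concrete with a small sketch that computes the full dependency set of a library from its direct dependencies. This is an illustration only: the type `dep_graph`, the function `all_dependencies`, and the library names A, B, and C are assumptions, not an existing Boost tool.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Direct dependencies: library name -> libraries it includes directly.
typedef std::map<std::string, std::set<std::string> > dep_graph;

// Returns every library reachable from `lib`, i.e. its direct and indirect
// dependencies, using an iterative depth-first walk. If the graph has a
// cycle through `lib`, `lib` itself appears in the result.
std::set<std::string> all_dependencies(const dep_graph& g,
                                       const std::string& lib) {
    std::set<std::string> seen;
    std::vector<std::string> stack(1, lib);
    while (!stack.empty()) {
        const std::string cur = stack.back();
        stack.pop_back();
        dep_graph::const_iterator it = g.find(cur);
        if (it == g.end()) continue;  // no recorded dependencies
        for (std::set<std::string>::const_iterator d = it->second.begin();
             d != it->second.end(); ++d) {
            // Visit each dependency once, the first time it is reached.
            if (seen.insert(*d).second) stack.push_back(*d);
        }
    }
    return seen;
}
```

With a graph in which A depends only on B, `all_dependencies(g, "A")` holds one entry. Once B gains a dependency on C, the same call returns both B and C, although nothing in A changed.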
participants (9)
- Daniel James
- David Abrahams
- Emil Dotchevski
- John Phillips
- Peter Dimov
- Steven Watanabe
- Tom Brinkman
- Vicente Botet Escriba
- Zachary Turner