[containers] Clarifications on incomplete type support?
Hi all, I've been using Boost.Containers for some time now. As for many, the primary (and only) reason for using them (over STL ones) is that they support incomplete value types. Since the beginning, I have always assumed that support for incomplete value types meant that I could instantiate the container type and any indirect type, most notably, **iterators**. I cannot stress enough how important it is for data structure designs to be able to instantiate the iterators of containers of incomplete value types! And, in general, there is no inherent reason why, in general, incomplete type support could not extend to iterators as well. This is even more true in light of the SCARY iterator requirements (which boost containers purport to support, which cannot possibly be true given the compilation errors I have seen). So, my main question is this: Is the following code supposed to work when using Boost.Containers? boost::container::some_container<IncompleteType>::iterator it; The documentation only states that "all Boost.Containers containers [..] are designed to support incomplete types". This needs to be clarified when it comes to iterators, at least. If the above code is not guaranteed to work, then may I request that it be added to the requirements? Because in my view, not having iterator support for incomplete types sort of defeats the whole purpose of supporting incomplete types, except for the most trivial use-cases (like those "recursive containers" examples). And moreover, without this support, workarounds are so major that they won't benefit from the container's support for incomplete types any more (and therefore, might as well use the standard ones instead).
From experience, I know that most Boost.Container containers had working iterators for incomplete value types up to and including version 1.54 (and it's a kind of natural consequence of supporting incomplete types for the container itself). But recently, changes to the boost::container::list class have made it impossible to use of boost::container::list iterators for incomplete types (due to some convoluted new iterator scheme that I can barely understand myself). If iterators are supposed to be included in the incomplete type support, then this is a bug in the list container (and btw, boost::unordered_set also has the same problem), and I would also recommend adding this to the container unit-tests.
Thanks, Mikael Persson
El 21/09/2014 23:19, Mikael Persson escribió:
I cannot stress enough how important it is for data structure designs to be able to instantiate the iterators of containers of incomplete value types! And, in general, there is no inherent reason why, in general, incomplete type support could not extend to iterators as well. This is even more true in light of the SCARY iterator requirements (which boost containers purport to support, which cannot possibly be true given the compilation errors I have seen).
So, my main question is this: Is the following code supposed to work when using Boost.Containers?
boost::container::some_container<IncompleteType>::iterator it;
There was no requirement for that. Only for the container itself.
The documentation only states that "all Boost.Containers containers [..] are designed to support incomplete types". This needs to be clarified when it comes to iterators, at least.
Sure.
If the above code is not guaranteed to work, then may I request that it be added to the requirements? Because in my view, not having iterator support for incomplete types sort of defeats the whole purpose of supporting incomplete types, except for the most trivial use-cases (like those "recursive containers" examples). And moreover, without this support, workarounds are so major that they won't benefit from the container's support for incomplete types any more (and therefore, might as well use the standard ones instead).
I didn't have that need in the past, I used containers of incomplete to define recursive containers or like PIMPL-like idiom. If defining iterator type is desired, I see no problem in supporting that. The standard proposal: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3890.html does not mention iterators.
From experience, I know that most Boost.Container containers had working iterators for incomplete value types up to and including version 1.54 (and it's a kind of natural consequence of supporting incomplete types for the container itself). But recently, changes to the boost::container::list class have made it impossible to use of boost::container::list iterators for incomplete types (due to some convoluted new iterator scheme that I can barely understand myself). If iterators are supposed to be included in the incomplete type support, then this is a bug in the list container (and btw, boost::unordered_set also has the same problem), and I would also recommend adding this to the container unit-tests.
I've seen your pull request, thanks. I will review it and try to add support for iterators to incomplete types. And yes, containers of incomplete types (including iterators) should be better tested in the unit-tests. Best, Ion
Thanks for the reply.
I didn't have that need in the past, I used containers of incomplete to define recursive containers or like PIMPL-like idiom.
My intended application is part of a complete re-write of the
adjacency_list class template of the Boost.Graph library, and especially,
it's new little cousin, a linked-tree implementation (which are both
working wonderfully on version 1.54). Using the guarantees on incomplete
types from Boost.Container, it allows for a great number of simplifications
(less indirections) under the hood, unlike the existing adjacency_list
implementation which relies on STL containers and is forced to make many
contortions (including void* style type-erasure) to avoid instantiating
containers of incomplete types.
What my problem basically boils down to is being able to have a recursive
data structure with "back-pointers" (as iterators). Like this:
struct vertex;
struct edge {
T value;
containerA<vertex>::iterator target_vertex;
}
struct vertex {
T value;
containerB<edge> out_edges;
containerB<edge>::iterator in_edge;
}
I hope you can see how crucial it is to have this support, and how it is an
integral part of supporting recursive data structures (where such
"back-pointers" are very commonly needed or used). Without incomplete type
support for iterators, the options to solve this problem are:
1. assume that "container<int>::iterator" has the same layout as any
"container<T>::iterator" and simply do reinterpret-casts on the iterators,
which is obvious dirty and unsafe, but works in practice.
2. do "container
El 22/09/2014 19:34, Mikael Persson wrote:
[snip]
What my problem basically boils down to is being able to have a recursive data structure with "back-pointers" (as iterators). Like this:
Thanks for the example. Back-pointers seem to be a very useful feature.
I've merged your pull request, added the guarantee to instantiate
iterators to containers of incomplete types and updated the recursive
container example. I don't know if more explicit guarantees should be
given both for containers and iterators. For example, this compiles:
#include
Have you found any need to support member functions of containers of incomplete types?
I don't think there is any need for that. What really matters is the memory layouts, i.e., to be able to form sizeof(container<T>) and sizeof(container<T>::iterator) and so on for other iterators. I really haven't encountered nor am I able to think of any context where you would need to write code that uses members of a container of incomplete types (or of an iterator thereof). It's very easy to move code around or to correctly place the instantiation points of the code to where all types are complete, it's really the memory layout that is nearly impossible to form safely if you have to work around an incomplete type problem. Just like having pointers to incomplete types doesn't allow you to really do anything with them (e.g., pointer arithmetic) until the type's completed, but what matters is that you can **form** pointer variables. It would be tedious and tricky to guarantee that specific members will work for incomplete types, and it might significantly restrict the implementation. Thanks for merging my commit. BTW, while testing my new BGL code against the current branch of Boost.Container, I also discovered that many of my unit-tests trigger infinite memory-leaking loops (when sets or lists are used as containers), which was not the case against version 1.54 of Boost.Container. It will take some time to track down where this is coming from, but if it comes from Boost.Container, you'll hear from me again about fixing those bugs ;) (N.B.: it's very possible that the errors are on my side of things, not yours, but I need to investigate it first). Mikael.
El 22/09/2014 23:57, Mikael Persson escribió:
It would be tedious and tricky to guarantee that specific members will work for incomplete types, and it might significantly restrict the implementation.
I've found that PIMPL-like idioms might require at least some operations (like the default constructor) to be incomplete type friendly. That would make possible to inline default constructors of types with containers of incomplete types. unique_ptr and shared_ptr also provide this guarantee.
BTW, while testing my new BGL code against the current branch of Boost.Container, I also discovered that many of my unit-tests trigger infinite memory-leaking loops (when sets or lists are used as containers), which was not the case against version 1.54 of Boost.Container. It will take some time to track down where this is coming from, but if it comes from Boost.Container, you'll hear from me again about fixing those bugs ;) (N.B.: it's very possible that the errors are on my side of things, not yours, but I need to investigate it first).
Boost.Container has suffered a lot of changes recently, and it might surely be a Boost.Container bug. Best, Ion
I've found that PIMPL-like idioms might require at least some operations (like the default constructor) to be incomplete type friendly.
Yes, that sounds reasonable. It could be reasonable to guarantee that default constructors (and assignment, maybe) can be possible for incomplete types. I guess in the same way as a pointer to an incomplete type can provide some basic operations (create a pointer, init to NULL, copy the pointer, etc.). It could make sense to just transpose whatever guarantees the standard gives for pointers of incomplete type to the container and iterator types as well. But I don't think it should or could go much beyond that.
Boost.Container has suffered a lot of changes recently, and it might surely be a Boost.Container bug.
Actually, I investigated a bit, and the problem stems from a rather embarrassing oversight on my part. I was using default-constructed iterators as a kind of universal "null iterator". This means that my code was riddled with undefined behavior as a consequence of that. It just happened to work on Boost.Container containers prior to the SCARY iterator changes. So, I was going to ask if it could be considered to add some guarantee that default-constructed (value-initialized) iterators have a well-defined "null" value (similar to default-constructed standard stream-iterators). Then, I noticed that proposal n3644 proposes exactly that, and it was accepted to C++14. See proposal "Null Forward Iterators": http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3644.pdf Therefore, this needs to be added to Boost.Container containers as well for compliance with C++14. I would be happy to do it myself, but I'm afraid that the whole under-the-hood iterator architecture used by Boost.Container and Boost.Intrusive is too much for me to wrap my mind around just for a purpose of adding a well-defined default-constructor. It should be a very simple fix for someone who already knows the code-base well. Right? Thanks, Mikael.
Hi Ion, I noticed that you fixed the issue comprehensively, replacing the quick fix that I had made for it. Thank you very much for being so quick and responsive with these issues. I would nominate you for the "best boost maintainer award", if one existed! My own code now works without any issue, thanks to your implementation of n3644 compliance in Boost.Container/Intrusive. In the process of implementing my own fix for n3644, I had a chance to take a cross-section look into the inner workings of Boost.Container/Intrusive. I would just like to make a general observation about it. I find that there is a lot of stuff under there. I understand that a robust library like Container or Intrusive cannot be as simple under-the-hood as a naive re-implementation of STL containers (i.e., a basic naive re-implementation of std::vector might take a couple of hundred lines of code, while a more production-quality one will take significantly more code than that). However, I think it's important to keep compilation times in mind. When I originally switched my BGL code from STL containers to Boost.Container, I noticed a pretty significant (almost crippling) increase in compilation times and memory-consumption. I understand that it can be neat to have, for example, a single iterator class template to implement iterators for all node-based containers (list, slist, map, set, etc.), but it requires heavy template meta-programming to make that happen without too much run-time overhead, leading to an increase in compilation time and memory. I'm not sure that this trade-off is really to the advantage of the users. I've learned to favour simple mechanics under-the-hood even if it implies a certain amount of code repetition, instead of the fancier or more elegant TMP mechanics, which I have regretted using in the past due to compilation times, memory and bloated diagnostics and debugging. That's just an opinion I thought I should share with you, and something for you to keep in mind for future coding iterations on the library (suggestion: create some comparative compilation time/memory benchmarks between STL implementations and the Boost.Container implementation, and try to aim to remain comparable to them). Thanks, Mikael.
El 24/09/2014 21:04, Mikael Persson escribió:
In the process of implementing my own fix for n3644, I had a chance to take a cross-section look into the inner workings of Boost.Container/Intrusive. I would just like to make a general observation about it. I find that there is a lot of stuff under there. I understand that a robust library like Container or Intrusive cannot be as simple under-the-hood as a naive re-implementation of STL containers (i.e., a basic naive re-implementation of std::vector might take a couple of hundred lines of code, while a more production-quality one will take significantly more code than that). However, I think it's important to keep compilation times in mind. When I originally switched my BGL code from STL containers to Boost.Container, I noticed a pretty significant (almost crippling) increase in compilation times and memory-consumption. I understand that it can be neat to have, for example, a single iterator class template to implement iterators for all node-based containers (list, slist, map, set, etc.), but it requires heavy template meta-programming to make that happen without too much run-time overhead, leading to an increase in compilation time and memory. I'm not sure that this trade-off is really to the advantage of the users. I've learned to favour simple mechanics under-the-hood even if it implies a certain amount of code repetition, instead of the fancier or more elegant TMP mechanics, which I have regretted using in the past due to compilation times, memory and bloated diagnostics and debugging. That's just an opinion
I'm a bit worried about compilation times and memory usage . However I haven't diagnosed where the problem lies and I have no much time and knowledge to investigate it. Any help is welcome. Do you notice this overhead in C++03 or C++0x compilers, in both modes? In c++03 compilers the preprocessor is also used to generate overloads of many functions (allocator_traits, emplace, etc.) and that might be also a problem. In any case without measurements, we don't know where we waste the compilation time. I don't know if measuring this overhead reliably is possible. Boost.Intrusive is also a bit heavy in metaprogramming, but at least vector should not rely on that. Which containers do you use in BGL?
I thought I should share with you, and something for you to keep in mind for future coding iterations on the library (suggestion: create some comparative compilation time/memory benchmarks between STL implementations and the Boost.Container implementation, and try to aim to remain comparable to them).
The first step I was planning is reducing header dependencies for some classes. In any case you suggestion sounds reasonable and I'm very interested in having more feedback from Boost.Container users. Best, Ion
Do you notice this overhead in C++03 or C++0x compilers, in both modes?
Can't say really, the move from STL containers to Boost.Containers was a while ago (approaching 2 years, maybe), so I don't really remember, but I would assume the problem is as much of a problem in either case. In my experience the pre-processor tricks that were common in C++03, which were replaced by the enhancements of C++11, are not really much worse (or better) than their C++11 equivalent. This is because what consumes a lot of time for compilation is the instantiations of templates (not the amount of code or the amount of overloads), and whether multiple templates are generated with the pre-processor or whether they are generated with variadic templates or some combination of specializations and Sfinae-style tricks, the real problems are with the combinatorial explosion of template instantiations. Some (possibly obvious) tips: Limit the depth of TMP recursive meta-evaluations. Never use more than the bare minimum of template arguments in the meta-functions or other helper templates. Things like Sfinae can be a significant problem for compilation times, because they require context-aware complex template instantiations to evaluate each overload, prefer to use tag-dispatching or regular overloads. Variadic templates are also really dangerous for compilation times. It is very easy to create deep recursions and O(N^2) template instantiations in cases where the C++03-style of template would have N templates of O(N) instantiation complexity, which is much better.
However I haven't diagnosed where the problem lies and I have no much time and knowledge to investigate it.
Yeah. I hear you. This is a really hard problem to tackle, and it's never very obvious what causes the most issues.
In any case without measurements, we don't know where we waste the compilation time. I don't know if measuring this overhead reliably is possible.
It's sort of possible. Some time ago I did some crude tests like that on a project and it was very tedious. I basically enabled compilation profiling on GCC and created small cpp files that contained explicit template instantiations of selective pieces of the library's code. It's crude and tedious, you start with high-level (end-user) templates to see which are worse, and work your way through the template instantiations that they trigger until you find the templates that seem to be the worse offenders, and then try to understand way (e.g., find the complexity (O(N)..) of the number of template instantiations.. it's really tedious and time-consuming, but it can pay off sometimes. A similar trick is to comment out certain features (e.g., certain functions or parts of the container) and see how compilation times change (it only needs to compile, it does not need to be able run, so you can comment out a lot of stuff without having to replace them with dummy code or stubs). But recently, there has been some work on a template instantiation profiler (time and memory) and a template instantiation debugger, which is called Templight (an instrumentation of Clang). The problem is that it is still in a beta stage. I'm actually the one who picked up the code from the original authors (who seem to have abandoned it) and enhanced it as well as creating an interactive (GDB-style) debugger for stepping through template instantiations in the same way you can step through code execution on a regular run-time debugger. However, this project has been on my back-burner for a while because every time I hack away at the Clang code-base I get very frustrated and discouraged, because it is such a huge mess and a horrible code-base to work with. Anyway, in theory, you can use Templight on profile the time and memory consumed by the compiler for each instantiation templates, just like a run-time profiler, but for template instantiations instead of function calls. I've been trying to bump up the priority of that item ("finish templight coding") on my to-do list, and hope to get to it some time in the next few weeks. Part of being able to get to that to-do item is wrapping up my work on those major additions to the BGL (which is another side-project that has been lingering for far too long).
Which containers do you use in BGL?
I use most of them. The main chunk of code is that I entirely re-implemented the adjacency_list class with Boost.Container on the back-end. As you probably know, the adj-list class provides options for choosing the type of containers used under-the-hood, and so, I implemented it for every sensible combination of vector, list, set, multiset, unordered_set, unordered_multiset, and "pool" (which is a vector container, but leaves "holes" in it upon removal of elements to avoid invalidating indices / iterators to other elements). My implementation of adj-list is much leaner than the existing one, but it takes quite a bit more time to compile it (at least, so it seems, I haven't measured it), and so, I now suspect it's coming from the Boost containers. But I cannot really test against STL containers any more, because I rely on incomplete type guarantees and n3644 iterators ;) neither of which are implemented in standard STL releases. But I agree that the vector container is probably not a problem, because I use it elsewhere, and it compiles fast (this is just my subjective impression, of course, it's hard to measure). Cheers, Mikael.
participants (2)
-
Ion Gaztañaga
-
Mikael Persson