[review] Review of PolyCollection starts today (May 3rd)

Hi everyone, The formal review of Joaquín M. López Muñoz's PolyCollection library starts today. I'd like to encourage your participation as the proposed library is small and focused, and reviewers don't need to be domain experts to appreciate the potential usefulness of the library and propose improvements. PolyCollection implements fast containers of polymorphic objects. Typically, polymorphic objects cannot be stored directly in regular containers and need be accessed through an indirection pointer, which introduces performance problems related to CPU caching and branch prediction. Boost.PolyCollection implements a novel data structure that is able to contiguously store polymorphic objects without such indirection, thus providing a value-semantics user interface and better performance. Three polymorphic collections are provided: * boost::base_collection * boost::function_collection * boost::any_collection dealing respectively with classic base/derived or OOP polymorphism, function wrapping in the spirit of std::function and so-called duck typing as implemented by Boost.TypeErasure. The library can be found here: Incubator: http://blincubator.com/bi_library/polycollection/?gform_post_id=1643 Github: https://github.com/joaquintides/poly_collection and the documentation here: http://rawgit.com/joaquintides/poly_collection/website/doc/html/index.html Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With what compiler? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? And most importantly: - Do you think the library should be accepted as a Boost library? For more information about Boost Formal Review Process, see: http://www.boost.org/community/reviews.html Waiting your reviews! Ion

On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
I could not understand from the documentation what it is I am supposed to be inserting into a function_collection. The tutorial did not explain this to me and the reference's technical explanation on this eluded me. As usual the tutorial-reference form of documentation befuddles me whereas a simple explanation of the main topics of a library would probably make it easy for me to understand a library.

El 06/05/2017 a las 5:53, Edward Diener via Boost escribió:
You can insert into a boost::function_collection<Signature> any entity
that a
std::function<Signature> object could be constructed from. For instance,
boost::function_collection

On 5/6/2017 5:07 AM, Joaquin M López Muñoz via Boost wrote:
The explanation you cite with the link above is a tutorial, not an explanation. Furthermore, perhaps for the sake of syntactical brevity, you cite an example in which a lambda function returns another lambda function !!! This may be exceedingly clever but it can hardly be a mainstream explanation of your functionality which most programmers would easily understand as how a function collection "works". Your examples above are a much better and more mainstream example of how the function_collection works, and even the short explanation you have given me should have been in the documentation. It is also hard to understand what the advantage of a poly collection of these objects entail over a more normal C++ collection ( vector, array etc. ) of std::function<Signature> objects, since the latter object type already represents a generalized callable construct in C++. Some of your performance tests suggest there is little or no advantage, so a discussion of this would have been welcome in the documentation.
Joaquín M López Muñoz

El 06/05/2017 a las 16:49, Edward Diener via Boost escribió:
Will think about how to make the intro part more accessible along the lines you suggest.
Performance is actually the *only* reason for using Boost.PolyCollection
as opposed to say
std:.vector

Ion Gaztañaga Via Boost wrote:
The library looks very well and I think it'd be a valuable contribution
to Boost.
But I'm wondering, would it make sense to expand the scope of the
library? Besides storing polymorphic data of the same type based on
run-time info to improve caching there are situations when data might be
divided into hot and cold data based on compile-time info. E.g. a
"vector" of tuples could internally store tuple of vectors and provide
interface allowing to access tuples more or less transparently. Storing
hot and cold data in different but contigeous parts of memory would also
improve caching. Since the principle is similar, would it have sense to
have both types of collections in one library? Not necessarily now, but
if this was a good idea then the name of the library would have to be
changed into something more general.
I noticed that in the documentation a word "container" is used WRT poly
collections. AFAIU this is not correct because poly_collection doesn't
meet the requirements of C++ container as defined in the standard, i.e.
allocator_traits

El 08/05/2017 a las 15:48, Adam Wulkiewicz via Boost escribió:
Thank you!
IMHO this is an entirely different beast that happens to share the same underlying data structure as Boost.PolyCollection, but the apropriate interface would look different: no type registration, no type restitution, typed local_iterator's should have to go, etc. I think this would be best served by another, unrelated library.
Throughout the docs the word "container" is used in a non-standardese sense, as you correctly point out. On the reference, however, the concept PolymorphicCollection is defined as a refinement of PolymorphicContainer (http://tinyurl.com/m9yh3hq ), which is explicitly defined as resembling, but not the same as a standard Container. The part about 23.2.1.3 is not listed among the differences, and I should include it as well (thanks!). BTW, the standard itself is rather lousy with respect to the Container concept. For instance, [forwardlist.overview]/1 says "A forward_list is a container that supports forward iterators..." only to state in the next parapragh that "A forward_list satisfies all of the requirements of a container (Table 96), except that the size() member function is not provided and operator== has linear complexity." So, in the end, forward_list is *not* a container :-) Joaquín M López Muñoz

On 9/05/2017 01:48, Adam Wulkiewicz wrote:
On that note, do you know of a more-comprehensible-than-standardese guide on how allocator_traits is supposed to be used for node-based custom container types? I've tried to figure it out a couple times but I'm sure I'm missing important points.

Gavin Lambert Via Boost wrote:
No, sorry, I'm not aware of any literature regarding it, though it doesn't mean it doesn't exist. I don't consider myself an expert in this area. I'd suggest to check out several STL implementations and Boost.Container's code for hints. Adam

Gavin Lambert wrote:
What do you want to know? 1. The storage for the node should be allocated with the Allocator's 'allocate()', for an allocator instance 'a' whose value_type is the containers node type (i.e. 'a' can be achieved by rebinding an existing allocator instance in the normal way). 2. The node object itself would be constructed in the normal way - e.g. placement new to construct the node object. 3. The containers' value_type object within the node must be constructed by allocator_traits<U>::construct(u, p, args...) where u of type U is a rebound copy of the allocator such that the allocator's value_type is the container's value_type. 4. Similar to 3, for destruction of the container's value_type object should be destroyed by allocator_traits<U>::destroy(u, p) 5. Similar to 2, the node object is destroyed in the normal way - e.g. invoking the destructor of the node type. Glen

On 13/05/2017 12:50, Glen Fernandes wrote:
Yeah, that's what it sounded like it said. But those rules seem very peculiar to me. Why wouldn't you use allocator_traits<node>::construct to construct the node type as well instead of using placement new directly? That way you only ever need one allocator instance, that for the node type. Why is it sensible to call allocate with a different type than construct, if the reason to not call construct is to avoid leaking the node type externally? In particular, these rules seem to basically require that the node type be trivially-constructible, because you can't legally call the node constructor and then call the value constructor where the value is a field of the node; you'd have to construct the node, destroy the value, then re-construct the value, which seems well over the line into crazypants territory. It all seems very arbitrary to me.

On Sun, May 14, 2017 at 7:02 PM, Gavin Lambert wrote:
No. You could also design the node type to contain a
std::aligned_storage_t

On 15/05/2017 11:12, Glen Fernandes wrote:
Apologies for the delayed response, but I was distracted by other things.
If the node actually contains aligned_storage rather than T, doesn't
that then require reinterpret_casting everywhere (since everything will
expect a T*, not an aligned_storage_t<>*)? Isn't that usually
considered a bad thing?
It also seems really error-prone, since
allocator_traits::construct/destroy accept pointers of all types, not
just their parameterised type, and then act based on the received type.
So given:
struct node
{
node* next;
std::aligned_storage_t

El 15/06/2017 a las 10:17, Gavin Lambert via Boost escribió:
It's not a POD, but it has a trvial default ctor. My reading of 3.8 Object lifetime [basic.life] implies that you can omit construction and begin using a just-allocated object when its default ctor is trivial. Maybe some language lawyer can shed some light here. Joaquín M López Muñoz

On Thu, Jun 15, 2017 at 4:17 AM, Gavin Lambert via Boost wrote:
Yes., and...
It also seems really error-prone,
Yes. :) i.e. It is easy to make the mistake you mentioned, among others. The model isn't nice (but it is what we have, until someone designs a less terrible one). Glen

Hi, I am not a Boost author, but I think I am able to give a review of PolyCollection, so here it goes.
I only looked at the interface and the description of how it works, both is great. There is nothing I would do differently. Concerning the interface, I expected the collections to implement the std::set interface, which they mostly do. Of course, in addition they have the extra interface to do type-specific stuff. The std::set methods count, find, equal_range, lower_bound, upper_bound are missing, but the library has equivalent free functions, which I like better anyway.
- What is your evaluation of the implementation?
I didn't look at the code. The concept as described in the documentation made a lot of sense.
- What is your evaluation of the documentation?
I really liked the graphics which explain how objects are stored internally. The quality of the documentation is very high. Everything I looked at was very professional and made sense. Some specifics on the tutorial section: I understood boost::base_collection right away. This part is perfect from my POV. boost::function_collection was not so clear to me. It is nice to keep the narrative of game development going, but like someone said before, I think it would help to give a simpler example. I felt that the benefit of using boost::function_collection was not so clearly presented as for boost::base_collection. The function pointers are all the same size, so how exactly does the optimisation enter? I assume it is not about memory allocation here, but about sorting equal function signatures into homogeneous blocks to improve branch prediction and locality, but I am not sure. The description of boost::any_collection was clear again.
- What is your evaluation of the potential usefulness of the library?
The library optimises the use of collections of polymorphic objects. I can see many applications. The optimisation feels so obvious now after reading the documentation that I wonder why such a library was not added sooner. It is a good thing that Boost provides a standard implementation of this optimisation so that developers don't have to reinvent it in their domain.
- Did you try to use the library? With what compiler? Did you have any problems?
I tried to compile the tests on a Mac with Apple-clang 8.0.0. It worked after I manually specified "cxxflags=-std=c++11" in the b2 call. By the way, it would be great if b2 used the highest standard that is available by default.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I studied the documentation, mostly the tutorial. I also studied the benchmarks with interest, and I glanced at the reference. I tried to compile the tests. All in all, I guess I spend 3 hours.
- Are you knowledgeable about the problem domain?
Yes, as it is not very specific...
- Do you think the library should be accepted as a Boost library?
Yes. Best regards, Hans

El 10/05/2017 a las 17:56, Hans Dembinski via Boost escribió:
Hi,
I am not a Boost author, but I think I am able to give a review of PolyCollection, so here it goes.
Thanks for your review!
I have to think over the boost::function_collection given that it does't
look as clear as
I expected... As you correctly guess my aim was to have a single
narrative, but seems
like that's not working so well.
Regarding the benefits of boost::function_collection<Signature> with
respect to
std::vector
It's weird because the Jamfile does have this flag already: https://github.com/joaquintides/poly_collection/blob/master/test/Jamfile.v2 I guess the snapshot provided from the Incubator might be oldish and not include that... I can't check right now because blincubator.com is not responding :-(
Thank you! Joaquín M López Muñoz

Hi Joaquin,
perhaps you can keep the narrative and just simplify the example a bit? I have nothing against the consistent narrative, it is a nice touch.
Regarding the benefits of boost::function_collection<Signature> with respect to […]
Ok, I think I got it now, thanks :). As far as I am concerned, just adding this explanation to the tutorial would be enough.
It seems I got the right version, I checked out from github, not the incubator. My version of the Jamfile has these lines project : requirements <toolset>gcc:<cxxflags>-std=c++11 <toolset>clang:<cxxflags>-std=c++11 ; I am not a Boost.Build expert, but that looks ok to me. Nevertheless, when I run just ./b2 in the "tests" folder, it tries to compile but fails. When I run ./b2 cxxflags=-std=c++11, the tests compile fine. Best regards, Hans

Hi Joaquín (and Ion),
I took a look at this a few months ago, and my view is positive today as
it was then. The code looks clean, the documentation good, and the tests
look thorough.
Summary: The library provides containers that offer great data locality
and which stores static type information. The net result is that
increased performance can be had, sometimes significantly. To achieve
this, objects are stored in per-type segments. Small buffer
optimizations may be advantages for vector<any> or
vector
- What is your evaluation of the design?
It's excellent.
- What is your evaluation of the implementation?
Very high quality, as usual.
- What is your evaluation of the documentation?
Also high quality.
- What is your evaluation of the potential usefulness of the library?
Very high.
- Did you try to use the library? With what compiler? Did you have any problems?
No/No/No.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
About half a day.
- Are you knowledgeable about the problem domain?
Somewhat.
And most importantly:
- Do you think the library should be accepted as a Boost library?
Yes.
*Specific comments*
A. why is subaddress() returning void* instead of T* ?
B. is the void* stuff like this
virtual range emplace( const_base_iterator p,void
(*emplf)(void*,void*),void* arg)
{
return range_from(s.emplace(iterator_from(p),emplf,arg));
}
needed ? That is, is there no way to use real ans specific types?
C. Does value_holder<T>::data() need to return void*
D. Implementation of equality: do we need to be set like semantics
(two-pass)? Why don't you just delegate to the segments after checking
the size? I don't think the documentation specifies what the
implementation does.
E. should the test check (perhaps via static_assert) the conditions
under which move-construction and move-assignment is noexcept? (sorry if
this is already done). Specifically, if std::unordered_map has noexcept
for these, then you can guarantee the same for base_collection etc.
F. I understand the layout to be roughly
std::unordered_map

El 10/05/2017 a las 22:33, Thorsten Ottosen via Boost escribió:
Technically, this is doable and, at first blush, simple, since all element copying and assignent is centralized in a single wrapper class: https://github.com/joaquintides/poly_collection/blob/master/include/boost/po... Another possibility is to use a copying traits class to let users customize this point, along the custom RTTI thing proposed by Ion. This looks to me even cleaner and more powerful. I have no strong objection against including ths feature in whatever form, but don't particularly like it either --in the end, elements in a base_collection must be copyable (strictly speaking, moveable) and this will force users to change their code by writing the necessary ctors, so requiring an extra layer of boilerplate (declaring friends etc.) is just more hassle.
Yes, this is trivial and looks useful. At the risk of bikeshedding, c.segment<warrior>() might be a better name. Unfortunately there's no std naming practice here to follow on, unless someone proves me wrong.
hanna::overload is already here to the rescue: https://lists.boost.org/Archives/boost/2017/03/233001.php
Of all your proposals this is the one I have most concerns about: element contiguity is such a key property that basically the whole library implicitly depends on it, code- and designwise. Policying this aspect away looks to me, in the absence of cogent use cases, an exercise in overdesign and a lot of work.
Thank you!
*Specific comments*
A. why is subaddress() returning void* instead of T* ?
Because only void* is needed: subaddress() is used in poly_collection/detail/segment.hpp as a client of the type-erased virtual interface defined in https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
This is connected to A, and in fact gives me the opportunity to comment on one of the more beautiful (IMHO) pieces of the lib :-) The implementation of a typical emplace(args...) function just forwards the args to the corresponding allocator construction call. The problem with Boost.PolyCollection is that this construction happens inside a class implementing a virtual interface (the one mentioned before), and this interface can't have a member function template able to accept(args...). We have a type erasure wall, if you wish. So the workaround is to define a type-erased callback (the emplf bit) that is invoked from the virtual class implementation and provided by the frontend of poly_collection, where the types of (args...) haven't been erased. Take a look at https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
C. Does value_holder<T>::data() need to return void*
It could return T*, but that's not needed, as data() is used for placement new.
We can't blindly delegate to the segments (if I'm getting your question right) because the order and associated types of the segments in x and y can differ: we need to first match them up, hence the two passes.
I can't guarantee noexcept because whether an exception is thrown or not depends on the nature of the actual types in the container, which is runtime info.
You're right, but the diagram is merely explanatory and I wouldn't want to complicate it to no avail. In fact, the segments are pointed to from what looks like a small vector of three elements (light blue), which is also not a correct depiction of a std::unordered_map internal structure. In the end, all these indirections don't affect performance in any significant way (workload is in the segments proper).
G. tutorial: consider using override where appropriate
Noted, will do.
There's an assertion: take a look at https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
I. why emplace_pos/hint but not insert_hint/pos ?
This is insert(it,x). As for the naming, we have emplace(args...) emplace_hint(it,args...) emplace_pos(local_it,args...) insert(x) insert(it,x) insert(local_it,x) which follows std conventions except for "emplace_pos". I had to made that name up because, in the standard, fixed position emplace only happens in sequence containers where the name is simply "emplace", but I can't use "emplace" as emplace(it,args...) would collide with emplace(args...) (which sequence containers don't have).
J. why ==,!= but not <,>,>=, <=
Followed the interface of std::unordered_set. As segment order is unspecified, less-than comparison seems to make no sense.
Loops over all elements, doing type restitution on those marked. Noted, will clarify.
L. It could be good to have a performance test of erase in the middle/front.
I can do that. Interpretation is going to be tricky, as the compared data structures differ wildly wrt erasure.
M. using the identifier "concept" may require change in the near future?
Good observation! But, is "concept" not going to be a context-dependent keyword à la "final"?
O. I wouldn't mind seeing separate graphs for 0-300 elements.
I can extend the graphs to cover that part, but I don't see the point of using this library for a handful of elements,don't you think?
P. clarify test are without use of reserve
Noted, will do.
There are normal size() and empty() member functions, they're described in the reference.
Actually, std::is_polymorphic_v is enforced on the reference but not anywhere in the code. I guess I preferred to be conservative here. As for working with type restitution alone, I think this scenario would be best served by a potential variant_collection, which was discussed at https://lists.boost.org/Archives/boost/2017/03/233001.php
Sorry if I am not getting your question, but is this not the line labeled "shuffled_ptr_vector" in the plot?
That's it. Great work :-).
What a thorough review! Thank you, Joaquín M López Muñoz

Den 11-05-2017 kl. 09:53 skrev Joaquin M López Muñoz via Boost:
Yeah, I guess I would need to see how client code looks like to form any opinion. Notice that classes in hierarchies often have a non-public copy-constructor for use in cloning. Identity management is just not like ints, and sometimes having classes that are neither (publicly) copyable or movable happens.
Yes, but it is likely that such types are used outside of the containers as well and therefore I think it is important to consider.
I like it.
Awesome!
I'm just thinking out loud here. So one option would be to require that the container is contiguous (flat_set fits this). I was under the impression that you rely heavily on having random access iterators on a segment, but not on actual contiguity of the elements in a segment. We would probably also need some way to get the segment container, say, auto& vector = c.segment_container<warrior>(); so why would anyone want to use a flat_set internally. Well, fast lookup. The alternative today is to copy the pointers to flat_set/vector and use that. Why would anyone want to use batch_deque (http://erenon.hu/double_ended/double_ended/examples.html#double_ended.exampl... Well, non-copyable, non-movable types. For example, when I need to load a Bayesian network in my work, it maps exactly over to a set of nodes that each must be loaded once and then kept in memory. Having the no-move guarantee ensures that pointers to the nodes can be shared freely without any worry that the object reference is not stable. Anyway, if its a lot of work, I wouldn't put it high on the wish list.
Well, the documentation states: "a==b evaluates to true iff for each non-empty segment of subojects of type U in a there is a segment of U in b with the same size and equal elements in the same order, and viceversa." ^^^^^^^^^^^^^^^^^^ So the question is we want the current O(n^2) == or if it should do as the docs say. Since we don't have set semantics anyway, I would prefer segments to match exactly.
I don't get it. We are moving a hash_map of std::type_index keys and std::unique_ptr<segment> values. Neither of those should throw. And we hopefully guarantee reference stability on moving a base_collection (that is, we don't move individual elements in the container)? I don't think the C++ standard requires a noexcept move for std::unordered, but suspect many implementations provide this guarantee (we are basically swapping a handle). So I'm saying that we can guarantee noexcept if the unordered map guarantees it. Perhaps using boost::unordered is better than using the standard container. This reminds me, it would be good with a small note of reference stability guarantees. AFAICT, it's swap/move/changing an unrelated segment.
Ok. Hm. So maybe I don't quite get we have c.insert(c.begin<juggernaut>(),juggernaut{10}); // compile time error c.insert(c.begin(typeid(warrior)),juggernaut{11}); // undefined behavior presumably because we may not know the type, so we say c.insert( c.begin(typeid( obj )), std::move(obj) ); ? Does it need to be an iterator? Perhaps c.insert( segment_position{0}, std::move(obj) ); could work, guaranteeing that no undefined behavior can happen. Otherwise, we should seriously consider throwing instead.
Right. One would have to copy the std::type_index keys to a local buffer, sort it, and use that as the order. But it is hard to imagine a use-case.
Not sure. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4377.pdf specifies two keywords.
Why not? The elements may be relatively few, but fat. This fits exactly my use case.
Sure, but in the section called Polymorphic collections they sometime appear as only segment specific, whereas in the declaration of the class, they show all members.
Yes, sort of, but I'm saying that very often one wants to copy pointers
from objects in base_collection into std::vector

El 11/05/2017 a las 18:18, Thorsten Ottosen escribió:
Sorry, I'm not getting you. The current implementation enforces exact segment match. Could you maybe elaborate/reword your question?
My bad, I misread your question. Of course move construction and move assignment do not throw or copy elements etc., but I didn't mark them as noexcept following the no-noexcept signature of std unordered associative containers. I don't know why the std has not made those noexcept.
This is implicitly guaranteed by [container.requirements.general]/9, as a polymorphic collection is a quasi-model of Container, as stated in the reference.
https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
The difference is that in the former, the type of the iterator is associated to warrior (I assume you meant "c.begin<warrior>()") at compile time, whereas in the former this info is dynamic and an assertion is the best we can get.
I don't think throwing is the right approach here, as we are penalizing users who do the right thing and don't mix values with iterators improperly.
Noted. Will extend the plots, then.
Sorry again but I don't get it yet. I don't quite understand which particular data structure you'd like me to compare base_collection against. Joaquín M López Muñoz

Den 15-05-2017 kl. 00:14 skrev Joaquin M López Muñoz via Boost:
My bad. Your code does what the docs say. You have this // TODO: can we do better than two passes? By that, do you mean that the first pass is the size() comparison?
E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry
I guess some vendors wanted freedom, and then gave users the near impossible task of tracking no-except/except impacts through their code bases. Now, you use =default for all the move operations, so I think they are required to be deduced to be noexcept exactly when they can be. Though a static assertion in the tests can track any fault in that logic. If one ever puts base_collection's into a container, you are in for a huge penalty when the container operation falls back on copying.
So take the processing test. You already compare against a shuffled
ptr_vector<T>. So far so good. Now take a base_collection<T> as is also
done. Then take the address of every element in the base collection and
put them into a vector

El 15/05/2017 a las 16:13, Thorsten Ottosen via Boost escribió:
OK, now I get it. Yes. the size() comparison does a first pass of both containers' internal std::unordered_maps, and I wondered whether we can come up with something smarter that avoids that.
Fallback copying cannot occur to the best of my knowledge: [container.requirements.general]/Notes: "Those entries marked “(Note A)” or “(Note B)” have linear complexity for array and have constant complexity for all other standard containers." As for noexcept in std unordered associative containers, I don't have a clue why move construction is not marked conditionally noexcept the way move assignement is... given this messy state I'd prefer to rely on =default before committing to any noexcept guarantees.
I did that with this testing shuffled_base_collection class template:
template<typename Base>
struct shuffled_base_collection
{
template<typename T>
void insert(const T& x)
{
bc.insert(x);
}
template<typename F>
void for_each(F f)
{
std::for_each(v.begin(),v.end(),[&](Base* x){f(*x);});
}
void prepare_for_for_each()
{
std::for_each(bc.begin(),bc.end(),[&](Base& x){v.push_back(&x);});
std::shuffle(v.begin(),v.end(),std::mt19937(1));
}
private:
boost::base_collection<Base> bc;
std::vector

Den 15-05-2017 kl. 19:01 skrev Joaquin M López Muñoz via Boost:
Hm. Yeah, I guess we have at least three options: a) current version b) omit check to "global" size() and let segment comparison deal with that c) store the size in poly_collection I think c) is not that attractive due to all the code to provide consistency. Whether a) is better than b) I guess will depend on the on the actual runtime data. I speculate that your current version is the best.
That is true, but as soon you hit code that calls std::move_if_noexcept, http://en.cppreference.com/w/cpp/utility/move_if_noexcept those move operations that are not marked/deduced noexcept will not be in effect.
Sounds reasonable. As I stated, it appears to me that you will take advantage of noexcept if it exists in the standard container.
Thanks for quick update. It wasn't clear to me if your time includes the time to insert and prepare_for_for_each. If so, it would be fair to call v.reserve( bc.size() ); in prepare_for_for_each. Anyway, it is a bit surprising. Perhaps modern allocators are good at allocating same size objects closely and without much overhead ... -Thorsten

El 15/05/2017 a las 19:46, Thorsten Ottosen via Boost escribió:
I think this variation of a) might perform better (code untested):
template
Timing does not include insert() or prepare_for_for_each().
Anyway, it is a bit surprising. Perhaps modern allocators are good at allocating same size objects closely and without much overhead ...
I think the similarity in performance between shuffled ptr_vector and shuffled base_collectin goes the other way around: once sequentiality is destroyed, it doesn't matter much whether elements lie relativley close to each other in main memory. Joaquín M López Muñoz

Den 15-05-2017 kl. 21:21 skrev Joaquin M López Muñoz via Boost:
El 15/05/2017 a las 19:46, Thorsten Ottosen via Boost escribió:
Yeah, could be.
Ok. I guess (and there is no hurry) it will also be interesting to see for 0-1000 elements. I know its nice to have a graph that grows as a function of n, but I think the best thing would be make each data point be based on the same number of iterations. So I would use an exponential growth for n, n = 8, 16, 32 ... max_n and then run each loop x times, x being max_n / n. kind regards Thorsten

El 16/05/2017 a las 11:11, Thorsten Ottosen via Boost escribió:
In my todo list.
Not sure why this is better than having homogeneous units all across the
plot
(namely nanoseconds/element). In any case the testing utilities sort of
do the
loop repetition the way you suggest, at least for small values of n:
measure_start=high_resolution_clock::now();
do{
res=f();
++runs;
t2=high_resolution_clock::now();
}while(t2-measure_start

El 16/05/2017 a las 16:20, Thorsten Ottosen via Boost escribió:
My typo, it should read as you say:
template
Or do you need sx and sy? But also, once all the segments have been compared in the loop, you know the x.size() == y.size() ...
Not so: consider the case where y has a segment (let's call it seg) which is not present in x; in this case the for loop would pass succesfully and seg wouldn't have been hit, and we'd thus erroneously return true. That's why we need the extra check on size. In other words, we can't determine equality by only going through x's segments. Joaquín M López Muñoz

On 15/05/2017 19:46, Thorsten Ottosen via Boost wrote:
It might be related to sentinel nodes. They are used in lists and ordered associative containers from Dikum STL from early versions. Sentinel nodes can't be transferred if source container invariants must be preserved. Reviewing Dinkum STL code it seems that std::unordered_xxx uses std::list internally, so sentinel nodes are allocated when move constructing an unordered_xxx and thus move constructor can't be noexcept. Ion

On Tue, May 2, 2017 at 4:46 PM, Ion Gaztañaga via Boost
The documentation looks beautiful and I was excited by the positive review comments so I thought I would have a look. Unfortunately, I ran into considerable difficulty trying to actually build the example and poke around. There are no instructions on how to build that I could find in the repository or the documentation. First I tried cloning the repository and building: $ cd ~/src $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2 Produced this output: C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:111: in load-parent from module project error: Could not find parent for project at '.' error: Did not find Jamfile.jam or Jamroot.jam in any parent directory. C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:464: in initialize from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:306: in load-jamfile from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:64: in load from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:145: in project.find from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src\build-system.jam:535: in load from module build-system C:\Users\vinnie\lib\boost_1_64_0\tools\build\src/kernel\modules.jam:295: in import from module modules C:\Users\vinnie\lib\boost_1_64_0\tools\build\src/kernel/bootstrap.jam:139: in boost-build from module C:\Users\vinnie\lib\boost_1_64_0\boost-build.jam:17: in module scope from module Now I remember some jibber jabber about putting projects in the live Boost directory (a practice which I am not fond of). In the interest of moving forward I tried it. $ cd ~/lib/boost_1_64_0 $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2 It got a little farther this time but still does not build: Performing configuration checks - 32-bit : yes (cached) - arm : no (cached) - mips1 : no (cached) - power : no (cached) - sparc : no (cached) - x86 : yes (cached) ...patience... ...found 1033 targets... ...updating 23 targets... compile-c-c++ ..\..\bin.v2\poly_collection\example\msvc-14.1\debug\threading-multi\algorithms.obj algorithms.cpp algorithms.cpp(12): fatal error C1083: Cannot open include file: 'boost/poly_collection/algorithm.hpp': No such file or directory Did I mention I am using Microsoft Visual Studio on Windows? Full disclosure, I have had this problem twice before when trying to review libraries. In the past I have simply given up. But I want to contribute something back since my own library will be reviewed soon. Regards

On 5/10/17 2:46 PM, Edward Diener via Boost wrote: algorithms.cpp
This is a common situation. The boost directory structure doesn't play nice with new libraries which are not already part of boost. When I put the safe numerics library into the incubator, I used path names for includes so that one would not have to do the above procedure for the new library. This permitted one to build and run the tests and examples, but of course created consternation of another sort. Robert Ramey

AMDG On 05/10/2017 03:46 PM, Edward Diener via Boost wrote:
Right. It needs to go under libs/ because that's where it will ultimately be put if it is accepted.
2) run 'b2 headers' from 'your_boost_tree_root' after you clone
This is supposed to happen automatically, and it does work for me with boost develop. In Christ, Steven Watanabe

On Wed, May 10, 2017 at 2:46 PM, Edward Diener via Boost
1) Clone beneath the 'your_boost_tree_root/libs' directory 2) run 'b2 headers' from 'your_boost_tree_root' after you clone
Okay, this worked and I was off to the races! - What is your evaluation of the design? It seems pretty straightforward. Given an explanation of the problem the design of the collection feels natural and right. Most of the container interface is standard boilerplate and so was immediately familiar. - What is your evaluation of the implementation? I have not looked at the implementation, only the public interfaces. - What is your evaluation of the documentation? I was able to find everything I looked for. The diagrams look great and really helped me understand the nature of the problem and how the library solves it. The author makes container performance claims which are then backed up with benchmarks across a variety of compilers and systems. My preference is for the style of documentation where exposition precedes each member declaration (e.g. http://www.boost.org/doc/libs/1_64_0/doc/html/boost_asio/reference/io_servic...) but the presented style is also workable. - What is your evaluation of the potential usefulness of the library? These containers look quite useful, and I can think of at least a couple of instances where I have rolled my own inferior solutions to accomplish the same result. - Did you try to use the library? With what compiler? Did you have any problems? I had some trouble getting going using the Microsoft Visual C++ toolchain on Windows, but this had to do more with how the repository needs to be cloned into a subtree of a boost library installation rather than the proposed library itself. I built the example using the compiler version 14.10.25017. Then I added my own code to one of the example sources to get a feel for how the library worked. I had no trouble at all writing correct code. My attempts to abuse the library APIs were met with resistance from the compiler in the form of compilation errors. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? I spent about an hour on it. I exercised mostly the boost::base_collection container in my own program. I expect the containers to work similarly. I have not used boost::any so I can't comment on the any_collection. I focused on the "interesting" member functions, the ones which operate differently from their standard container counterparts. I assume the non-interesting functions act with predictable results. - Are you knowledgeable about the problem domain? Yes. I have implemented many containers with conforming allocator support of all varieties. I have used Boost.Intrusive heavily to create complex containers and I have dabbled with some Boost.MultiIndex. - Do you think the library should be accepted as a Boost library? This library is tightly focused on solving a specific, worthwhile problem and does so in a way that feels natural and "boosty." So YES. - Questions Iterators returned by begin<T>() cannot be compared to iterators returned by end(typeid(T)). Is this an oversight or a consequence of the design? It seems to me they should be comparable. - Other comments: Iterators returned by begin<T>() produce a long, ugly compile error when compared against iterators returned by begin<U>() or end<U>(). I think the library could do better by producing a static_assert with a helpful message. Its not a showstopper but a suggestion. I echo the comments from other reviewers, a range<T>() function to allow range-for iteration of a segment should be added at the earliest convenience. I wouldn't hold up acceptance for it. Thank you to the author for putting together a polished library!

El 11/05/2017 a las 4:46, Vinnie Falco via Boost escribió:
Thank you!
You need to do an explicit cast (either direction) first, so it follows comparison can't be done directly. I preferred to require expilcit conversion between the two types of local iterator as it seemed to me safer from the user's code point of view.
Will study the issue. I'm not sure I can improve the dagnostics but will take a look. Thank you for your review, Joaquín M López Muñoz

On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
This is my review of PolyCollection,
- What is your evaluation of the design?
I have no problem with the design as it succinctly covers the three areas to which poly collections apply. These areas should be enough for using any types with polycollections.
- What is your evaluation of the implementation?
I did not look at it.
- What is your evaluation of the documentation?
I have mentioned previously that I think the documentation should be more specific about the types being used for the boost::function_collection. While an example is good, as in the case of the tutorial, and while a reference is good, as in the case of the documentation for the insert member function, I found the former to be too singular and special an example while I found the latter to be highly technical and therefore hard to understand in general terms. A better addition is simply to document in an explanation of ideas and concepts exactly what types the boost::function_collection entails. I assume it is any type which can be passed to std::function, but I do not know if it can be a std::function object itself. Saying it is "Function wrapping in the spirit of std::function" does not explain it adequately to me. Also since std::function ( or boost::function if you will ) represents any callable it would be nice to understand why the programmer would want to use boost::function_collection versus a collection of std::function objects.
- What is your evaluation of the potential usefulness of the library?
I see the library as being useful as an optimization over present techniques in both speed and size terms. In effect it offers a choice for end-users over normal C++ collections.
- Did you try to use the library? With what compiler? Did you have any problems?
I tried it with gcc-7.1 in c++11 mode and it worked well.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Mostly a slow reading and an attempt to understand why the library should be used and how to use it.
- Are you knowledgeable about the problem domain?
Somewhat knowledgeable, but not an expert in the type of optimizations which polycollection use.
And most importantly:
- Do you think the library should be accepted as a Boost library?
I think the library should be accepted based on the fact that it offers an alternative to normal collections which can prove valuable to end-users, and that it is a polished library with few surprises. With that said I think the library should document better the advantages and practical situations that make polycollection superior to normal C++ collections, else an end-user will be puzzled about for what situations it will be useful.

El 14/05/2017 a las 22:33, Edward Diener via Boost escribió:
Thank you for your contribution!
Points taken, as we have already discussed them in the past days. I'll try to improve docs so as to make these aspects clearer. (When you insert a std::function<Signature> object x, into a function_collection<Signature>, it is x that gets copied/stored rather than its underlying callable entity. This is so because std::function lacks functionalty to recover the wrapped callable entity without statically knowing its type, observe std::function::target does not type erase: http://en.cppreference.com/w/cpp/utility/functional/function/target . Incidentally, this is the main reason why function_collection<Signature>::value type is *not* a std::function, but an internall lookalike with the required extra functionality.)
Thank you, Joaquín M López Muñoz

Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review:
- What is your evaluation of the design?
The overall design seems sound. The library identifies a clear need and seems to provide a carefully crafted solution.
- What is your evaluation of the implementation?
I did not look into the implementation in detail.
- What is your evaluation of the documentation?
The documentation seemed amazingly clear, even though I am not a game developer and do not relate daily to warriors and goblins. The common thread in the documentation and the clarity of progression of the tutorial was great. I did not review the reference documentation, though. In part, it seems unnecessary because the basic concepts seemed so well thought out and clearly described. I appreciate that there may be subtleties to bits of the API, but I will leave it to those more familiar with the details of implementation to improve.
- What is your evaluation of the potential usefulness of the library?
I feel this library would be very useful and can see using it in my own code. For example, right now I am working with a hierarchy of objects that must be stored in a container; there are lots of objects but relatively few classes in the hierarchy. This seems perfect for the base_collection container.
- Did you try to use the library? With what compiler? Did you have any problems?
I did use the library. In particular, I compiled all of the examples and built a small program with a class hierarchy of my own to test my understanding of the base_collection container. This was done with Apple LLVM version 8.0.0 (clang-800.0.42.1). The only problem was that the following test cases needed -std=c++14 (rather than c++11) because std::make_unique is not defined for c++11 on that compiler: algorithms, basic_function, and segmented_structure.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I have read through most of the documentation (basically all of the tutorial) and have been following much of the discussion during the review.
- Are you knowledgeable about the problem domain?
I am not particularly knowledgeable about writing optimized collections such as this, but I am knowledgeable from a user perspective on using them and on some of the optimizations that would be useful.
- Do you think the library should be accepted as a Boost library?
Yes, I feel this would be a very useful addition to Boost and would likely use it in my code. Cheers, Brook

El 17/05/2017 a las 17:12, Brook Milligan via Boost escribió:
Neither do I. I suspect real game developers will smile at the naïveté of the example, though.
Thanks for noticing this, will fix in the corresponding Jamvile.v2. By the way, algorithms.cpp need C++14 not because of make_unique (which is not used there) but for generic lambdas such as: https://github.com/joaquintides/poly_collection/blob/master/example/algorith... FWIW, algorithms is already marked as needing C++14: https://github.com/joaquintides/poly_collection/blob/master/example/Jamfile.... so I guess you didn't use b2 for building for went directly with the compiler command line, right?
Thank you! Joaquín M López Muñoz

Yes, I forgot the details at the time, but this reminds me that generic lambdas were exactly the problem.
Yes, that is correct. I personally, don’t use b2 for building anything except boost itself. It seems too far afield from “normal” build environments to be useful in my own work. I was not really pointing this out as a problem with the library, but rather that I thought there was some mention of it being a c++11 library and then I found I need to use c++14 at some points. I suppose that is entirely due to the example code, though, and not the library itself. Perhaps a mention of the need for c++14 in some examples? None of this is a big deal, though. The bottom line is it seems to be a great piece of work. Cheers, Brook

El 17/05/2017 a las 21:37, Brook Milligan escribió:
Correct, the library itself is strictly C++11. I decided to use C++14 at the examples when convenient so as not to divert readers' attention with workarounds etc; this is particularly the case with generic lambdas.
Perhaps a mention of the need for c++14 in some examples?
Yes, I can do that. Joaquín M López Muñoz

AMDG
I vote to ACCEPT PolyCollection into Boost.
Details:
General:
Missing .gitattributes
index.html:
No comments.
an_efficient_polymorphic_data_st.html:
"interspersedly"
This word sounds a bit awkward to me.
"the information is typically available at compile
time in the point of insertion in the vector"
I would say "at the point" instead of "in the point"
tutorial.html:
"Imagine we are developing a role game in C++"
Do you mean "role playing game"?
For choosing which type to insert, std::discrete_distribution
is slightly more appropriate than std::uniform_real_distribution.
"...by the user code --we will see more about this later"
This should be an em-dash.
"Insertion mechanisms are rather..."
-> "The insertion mechanisms are rather..."
"to explicitly forbid"
split infinitive
"...it is possible to only address the elements of a given
segment ..."
Address here is a bit confusing as "address" has a specific
meaning in programming. Access is probably a better term
to use.
reference.html:
"...in the same order, and viceversa."
s/viceversa/vice versa/
"cc.empty(index)...
Throws: If the indicated type is not registered. "
Does this really make sense? I think it would be
more useful to treat non-registered segments as
empty. Throwing here is somewhat inconsistent
with operator== which considers non-registered
segments to be equivalent to empty segments.
"cc.max_size()
REVIEW THIS DESIGN DECISION"
"the minimum return value of max_size for each of the segments of the
collection"
I think that the fundamental invariant of
max_size should be that size() <= max_size().
Both max_size and capacity are a bit strange.
My inclination would be to remove capacity
entirely (or rename it) and change max_size to
mean the maximum possible size of the whole container.
Class template base_collection
"subobject(x) = static_cast
is_acceptible.hpp, is_callable.hpp, is_constructible.hpp, is_final.hpp, is_nothrow_eq_comparable.hpp, iterator_impl.hpp, iterator_traits.hpp, packed_segment.hpp: No comments. poly_collection.hpp: If poly_collection is defined in namespace detail, that means that your internal functions can be found by ADL for user defined types that mention poly_collection. (I always avoid putting types that are part of the public interface in namespace detail to avoid this problem.) Test case in test_adl2.cpp line 1109: "// TODO: can we do better than two passes?" This comment seems obsolete. segment.hpp, segment_backend.hpp, split_segment.hpp, stride_iterator.hpp, type_restitution.hpp, value_holder.hpp No comments. In Christ, Steven Watanabe

El 18/05/2017 a las 2:59, Steven Watanabe via Boost escribió:
AMDG
I vote to ACCEPT PolyCollection into Boost.
Thank you Steven! Sorry it took me so long to answer your post, got a busy weekend and yours is not a review one can deal with in a hurry.
Noted. By the way, I couldn't find this as a requirement in http://www.boost.org/development/requirements.html
Any less awkward synonim? For other readers' convenience, the statement in which the word appears is: "This mechanism is rendered mostly useless when derived1, derived2 and derived3 elements are traversed interspersedly without a definite pattern."
You're right, noted.
Yep, will change that.
For choosing which type to insert, std::discrete_distribution is slightly more appropriate than std::uniform_real_distribution.
Your're right, will switch to std::discrete_distribution.
"...by the user code --we will see more about this later" This should be an em-dash.
Noted.
"Insertion mechanisms are rather..." -> "The insertion mechanisms are rather..."
Not being a native English speaker (nor Russian either, who typically have a hard time with definite articles :-) ), "Insertion mechanisms" looks perfectly OK to me, but I can change that anyway.
"to explicitly forbid" split infinitive
At the risk of starting a style war, it is my understanding that split infinitives are okay when used wth caution. Here, the alternative: "to forbid explicitly copying at" seems (to me) less clear as it might be parsed as referring to some sort of explicit copying... By the way, there are other instances of split infinitive in this section, for instance "to only address" (mentioned below).
Will change that.
Noted. It is "viceversa" in Spanish, hence the mistake.
Non-registered == empty would lead to stranger situations, IMHO: we can't hold the invariant that empty(index)==(begin(index)==end(index)) as begin/end(index) cannot return anything sensible for non-registered segments. Similarly, max_size, capacity etc. need to forward to the underlying segment to return something, unless we make up the return values for non-registered segments.
This is a design area I'm definitely not happy with. Please follow this thread for a related discussion: https://lists.boost.org/Archives/boost/2016/11/231784.php I'm leaning towards dropping (global) max_size/capacity, as you and Thorsten suggest, till we find sensible semantics for them.
Ummm.. Hadn't thought about virtual inheritance. At first blush, seems
like virtual
inheritance would be automatically supported if only we fix subaddress as
you suggest below, don't you think?
(the "static_cast
Noted. Please enlighten me: is it OK to qualify the offending call as detail::segment_split, or should I fully qualify as ::boost::poly_collection::detail::segment_split?
Right, will change it.
You're absolutely right, will fix it.
Noted again.
I don't see the need as any_iterator is a detail class not directly exposed to user code.
Right, in fact now that move assignment is provided (didn't notice that) we should check this rather than copy assignment.
That'd be a boon.
Noted, will change it.
That's the intention. If I can't get the wrapped object I'll store the any objects at least.
I'm not seeing this. First of all, there's a bug in
https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
s/any_cast
Absolutely, will change it. See also my comment above re
"static_cast
Got it. I guess I only need to s/return r(std::forward<Args>(args)...)/return static_cast<R>(r(std::forward<Args>(args)...)) right?
Do you know of some good quality logarithmic implementation that I can rip?
I see. What's best? a) moving poly_collection to boost::poly_collection::poly_collection_detail b) moving poly_collection to boost::poly_collection::detail::poly_collection_detail
line 1109: "// TODO: can we do better than two passes?" This comment seems obsolete.
There's an implicit second pass in https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
Thank you for your very valuable review! I'm bringing back home lots of useful suggestions. Best regards, Joaquín M López Muñoz

On 23/05/2017 05:21, Joaquin M López Muñoz wrote:
"interleaved", perhaps?
FWIW, as a native English (or dialect thereof) speaker, the original version sounds better to me, and not just because of the ambiguity of the alternative. As far as I am aware, objecting to split infinitives is out of fashion, and was never all that popular to begin with.

AMDG On 05/22/2017 11:21 AM, Joaquin M López Muñoz via Boost wrote:
The only reason I notice is that I can't open any of the files in notepad...
I think the sentence should be reordered. The problem, for me, is the use of -ly with a past participle, which affects most direct synonyms as well. "...useless for interspersed traversal of derived1..." would be one possibility.
Since you're talking about a specific set of insertion mechanisms, I think it should have an article.
It's not ambiguous. "Explicitly" modifies "copying." I.e., this isn't a correct rewrite of the sentence. It would be either "explicitly to forbid copying" (which is correct and unambiguous, but sounds pedantic) or "to forbid copying explicitly"
In this case "to address only" works fine. Although it changes "only" to modify "the elements" instead of "to address", this doesn't affect the overall meaning.
The iterators are not dereferenceable, so they don't actually need to point to anything. capacity is obviously 0. max_size... yeah, I can see that that would be a problem.
I don't think that there are sensible semantics for capacity. Expected semantics of capacity: a) size() <= capacity() b) if N <= (capacity() - size()) then N elements can be inserted without reallocation. c) capacity() - size() is proportional to the amount of wasted memory. d) reserve(N) has a post-condition that capacity() >= N In particular, I think there's no way to satisfy both (b) and (c) simultaneously. On a related note, reserve(size() + N) probably has unexpected behavior.
I think so. If there's anywhere else that you use a downcast (perhaps stride_iterator.hpp:83), it would also need dynamic_cast. If this causes any noticeable performance cost, I don't think it's worthwhile. Who uses virtual inheritance outside of iostreams?
Any namespace is sufficient to suppress ADL. Putting parentheses around the function name also works. (Personally, I always fully qualify everything in headers, but that gets very tedious.)
This compiles because relaxed implies typeid_.
That makes it too permissive although it's probably okay since you also construct a std::function.
I typically use something like this:
template<bool IsBaseCase>
struct make_int_seq_impl;
template<>
struct make_int_seq_impl<false>
{
template
I don't think it matters. Either one should work. In Christ, Steven Watanabe
participants (13)
-
Adam Wulkiewicz
-
Brook Milligan
-
Edward Diener
-
Gavin Lambert
-
Glen Fernandes
-
Hans Dembinski
-
Ion Gaztañaga
-
Joaquin M López Muñoz
-
Robert Ramey
-
Steven Watanabe
-
Thorsten Ottosen
-
Vicente J. Botet Escriba
-
Vinnie Falco