[review] Review of PolyCollection starts today (May 3rd)
Hi everyone, The formal review of Joaquín M. López Muñoz's PolyCollection library starts today. I'd like to encourage your participation as the proposed library is small and focused, and reviewers don't need to be domain experts to appreciate the potential usefulness of the library and propose improvements. PolyCollection implements fast containers of polymorphic objects. Typically, polymorphic objects cannot be stored directly in regular containers and need be accessed through an indirection pointer, which introduces performance problems related to CPU caching and branch prediction. Boost.PolyCollection implements a novel data structure that is able to contiguously store polymorphic objects without such indirection, thus providing a value-semantics user interface and better performance. Three polymorphic collections are provided: * boost::base_collection * boost::function_collection * boost::any_collection dealing respectively with classic base/derived or OOP polymorphism, function wrapping in the spirit of std::function and so-called duck typing as implemented by Boost.TypeErasure. The library can be found here: Incubator: http://blincubator.com/bi_library/polycollection/?gform_post_id=1643 Github: https://github.com/joaquintides/poly_collection and the documentation here: http://rawgit.com/joaquintides/poly_collection/website/doc/html/index.html Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With what compiler? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? And most importantly: - Do you think the library should be accepted as a Boost library? For more information about Boost Formal Review Process, see: http://www.boost.org/community/reviews.html Waiting your reviews! Ion
I know a lot of boosters are willing to write a review of this great library, so please don't be shy! Best, Ion On 03/05/2017 1:46, Ion Gaztañaga via Boost wrote:
Hi everyone,
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today.
I'd like to encourage your participation as the proposed library is small and focused, and reviewers don't need to be domain experts to appreciate the potential usefulness of the library and propose improvements.
PolyCollection implements fast containers of polymorphic objects. Typically, polymorphic objects cannot be stored directly in regular containers and need be accessed through an indirection pointer, which introduces performance problems related to CPU caching and branch prediction. Boost.PolyCollection implements a novel data structure that is able to contiguously store polymorphic objects without such indirection, thus providing a value-semantics user interface and better performance. Three polymorphic collections are provided:
* boost::base_collection * boost::function_collection * boost::any_collection
dealing respectively with classic base/derived or OOP polymorphism, function wrapping in the spirit of std::function and so-called duck typing as implemented by Boost.TypeErasure.
The library can be found here:
Incubator: http://blincubator.com/bi_library/polycollection/?gform_post_id=1643
Github: https://github.com/joaquintides/poly_collection
and the documentation here:
http://rawgit.com/joaquintides/poly_collection/website/doc/html/index.html
Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review:
- What is your evaluation of the design?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? With what compiler? Did you have any problems?
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
- Are you knowledgeable about the problem domain?
And most importantly:
- Do you think the library should be accepted as a Boost library?
For more information about Boost Formal Review Process, see: http://www.boost.org/community/reviews.html
Waiting your reviews!
Ion
Le 05/05/2017 à 22:10, Ion Gaztañaga via Boost a écrit :
I know a lot of boosters are willing to write a review of this great library, so please don't be shy! I've started to rad the documentation. The library is so good (as usual from JMLM) that I've not yet added a single issue to my possible review. This kind of boring ;-)
Hope I will have/take the time to send a review soon. Best, Vicente
On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
Hi everyone,
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today.
I'd like to encourage your participation as the proposed library is small and focused, and reviewers don't need to be domain experts to appreciate the potential usefulness of the library and propose improvements.
PolyCollection implements fast containers of polymorphic objects. Typically, polymorphic objects cannot be stored directly in regular containers and need be accessed through an indirection pointer, which introduces performance problems related to CPU caching and branch prediction. Boost.PolyCollection implements a novel data structure that is able to contiguously store polymorphic objects without such indirection, thus providing a value-semantics user interface and better performance. Three polymorphic collections are provided:
* boost::base_collection * boost::function_collection
I could not understand from the documentation what it is I am supposed to be inserting into a function_collection. The tutorial did not explain this to me and the reference's technical explanation on this eluded me. As usual the tutorial-reference form of documentation befuddles me whereas a simple explanation of the main topics of a library would probably make it easy for me to understand a library.
* boost::any_collection
dealing respectively with classic base/derived or OOP polymorphism, function wrapping in the spirit of std::function and so-called duck typing as implemented by Boost.TypeErasure.
The library can be found here:
Incubator: http://blincubator.com/bi_library/polycollection/?gform_post_id=1643
Github: https://github.com/joaquintides/poly_collection
and the documentation here:
http://rawgit.com/joaquintides/poly_collection/website/doc/html/index.html
Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review:
- What is your evaluation of the design?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? With what compiler? Did you have any problems?
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
- Are you knowledgeable about the problem domain?
And most importantly:
- Do you think the library should be accepted as a Boost library?
For more information about Boost Formal Review Process, see: http://www.boost.org/community/reviews.html
Waiting your reviews!
El 06/05/2017 a las 5:53, Edward Diener via Boost escribió:
On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
Hi everyone,
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today. [...]
* boost::function_collection
I could not understand from the documentation what it is I am supposed to be inserting into a function_collection. The tutorial did not explain this to me and the reference's technical explanation on this eluded me.
You can insert into a boost::function_collection<Signature> any entity that a std::function<Signature> object could be constructed from. For instance, boost::function_collection<double(int)> holds callable entities that can be invoked with a (convertible to) int argument and return a (convertible to) double result: double f(int n){return (double)n;} struct g { double operator()(int n)const{return (double)2*n;} }; boost::function_collection<double(int)> c; c.insert(&f); c.insert(g{}); double factor=3; c.insert([=](int n){return factor*n;}); for(const auto& x:c)std::cout<<x(1)<<" "; // prints 3 1 2 This is explained in http://tinyurl.com/kddv9nz . Didn't you find this section sufficiently explanatory? Joaquín M López Muñoz
On 5/6/2017 5:07 AM, Joaquin M López Muñoz via Boost wrote:
El 06/05/2017 a las 5:53, Edward Diener via Boost escribió:
On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
Hi everyone,
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today. [...]
* boost::function_collection
I could not understand from the documentation what it is I am supposed to be inserting into a function_collection. The tutorial did not explain this to me and the reference's technical explanation on this eluded me.
You can insert into a boost::function_collection<Signature> any entity that a std::function<Signature> object could be constructed from. For instance, boost::function_collection<double(int)> holds callable entities that can be invoked with a (convertible to) int argument and return a (convertible to) double result:
double f(int n){return (double)n;}
struct g { double operator()(int n)const{return (double)2*n;} };
boost::function_collection<double(int)> c; c.insert(&f); c.insert(g{}); double factor=3; c.insert([=](int n){return factor*n;});
for(const auto& x:c)std::cout<<x(1)<<" "; // prints 3 1 2
This is explained in http://tinyurl.com/kddv9nz . Didn't you find this section sufficiently explanatory?
The explanation you cite with the link above is a tutorial, not an explanation. Furthermore, perhaps for the sake of syntactical brevity, you cite an example in which a lambda function returns another lambda function !!! This may be exceedingly clever but it can hardly be a mainstream explanation of your functionality which most programmers would easily understand as how a function collection "works". Your examples above are a much better and more mainstream example of how the function_collection works, and even the short explanation you have given me should have been in the documentation. It is also hard to understand what the advantage of a poly collection of these objects entail over a more normal C++ collection ( vector, array etc. ) of std::function<Signature> objects, since the latter object type already represents a generalized callable construct in C++. Some of your performance tests suggest there is little or no advantage, so a discussion of this would have been welcome in the documentation.
Joaquín M López Muñoz
El 06/05/2017 a las 16:49, Edward Diener via Boost escribió:
The explanation you cite with the link above is a tutorial, not an explanation. Furthermore, perhaps for the sake of syntactical brevity, you cite an example in which a lambda function returns another lambda function !!! This may be exceedingly clever but it can hardly be a mainstream explanation of your functionality which most programmers would easily understand as how a function collection "works". Your examples above are a much better and more mainstream example of how the function_collection works, and even the short explanation you have given me should have been in the documentation.
Will think about how to make the intro part more accessible along the lines you suggest.
It is also hard to understand what the advantage of a poly collection of these objects entail over a more normal C++ collection ( vector, array etc. ) of std::function<Signature> objects, since the latter object type already represents a generalized callable construct in C++. Some of your performance tests suggest there is little or no advantage, so a discussion of this would have been welcome in the documentation.
Performance is actually the *only* reason for using Boost.PolyCollection as opposed to say std:.vector<std::function<Signature>>. Perfomance tests are not very conclusive about insertion times (VS favors Boost.PolyCollection, GCC and Clang std:.vector<std::function<>>), but processing times, which are the crux of the matter, definitely show the point of the library. Admittedly this section should be added a comments section where the results are discussed --otherwise they might be hard to interpret. Joaquín M López Muñoz
Ion Gaztañaga Via Boost wrote:
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today.
I'd like to encourage your participation as the proposed library is small and focused, and reviewers don't need to be domain experts to appreciate the potential usefulness of the library and propose improvements.
PolyCollection implements fast containers of polymorphic objects. Typically, polymorphic objects cannot be stored directly in regular containers and need be accessed through an indirection pointer, which introduces performance problems related to CPU caching and branch prediction. Boost.PolyCollection implements a novel data structure that is able to contiguously store polymorphic objects without such indirection, thus providing a value-semantics user interface and better performance. Three polymorphic collections are provided:
The library looks very well and I think it'd be a valuable contribution to Boost. But I'm wondering, would it make sense to expand the scope of the library? Besides storing polymorphic data of the same type based on run-time info to improve caching there are situations when data might be divided into hot and cold data based on compile-time info. E.g. a "vector" of tuples could internally store tuple of vectors and provide interface allowing to access tuples more or less transparently. Storing hot and cold data in different but contigeous parts of memory would also improve caching. Since the principle is similar, would it have sense to have both types of collections in one library? Not necessarily now, but if this was a good idea then the name of the library would have to be changed into something more general. I noticed that in the documentation a word "container" is used WRT poly collections. AFAIU this is not correct because poly_collection doesn't meet the requirements of C++ container as defined in the standard, i.e. allocator_traits<allocator_type>::construct should be called only for the element type (23.2.1.3) but poly_collection contains std::unordered_map and packed_segment contains std::vector which constructs objects of their own value types. Also Models call operator new in make function. If that's correct and I'm not missing anything then I suggest using the word "collection" consistently in the docs. Regards, Adam
El 08/05/2017 a las 15:48, Adam Wulkiewicz via Boost escribió:
Ion Gaztañaga Via Boost wrote:
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today. [...] The library looks very well and I think it'd be a valuable contribution to Boost.
Thank you!
But I'm wondering, would it make sense to expand the scope of the library? Besides storing polymorphic data of the same type based on run-time info to improve caching there are situations when data might be divided into hot and cold data based on compile-time info. E.g. a "vector" of tuples could internally store tuple of vectors and provide interface allowing to access tuples more or less transparently. Storing hot and cold data in different but contigeous parts of memory would also improve caching. Since the principle is similar, would it have sense to have both types of collections in one library? Not necessarily now, but if this was a good idea then the name of the library would have to be changed into something more general.
IMHO this is an entirely different beast that happens to share the same underlying data structure as Boost.PolyCollection, but the apropriate interface would look different: no type registration, no type restitution, typed local_iterator's should have to go, etc. I think this would be best served by another, unrelated library.
I noticed that in the documentation a word "container" is used WRT poly collections. AFAIU this is not correct because poly_collection doesn't meet the requirements of C++ container as defined in the standard, i.e. allocator_traits<allocator_type>::construct should be called only for the element type (23.2.1.3) but poly_collection contains std::unordered_map and packed_segment contains std::vector which constructs objects of their own value types. Also Models call operator new in make function. If that's correct and I'm not missing anything then I suggest using the word "collection" consistently in the docs.
Throughout the docs the word "container" is used in a non-standardese sense, as you correctly point out. On the reference, however, the concept PolymorphicCollection is defined as a refinement of PolymorphicContainer (http://tinyurl.com/m9yh3hq ), which is explicitly defined as resembling, but not the same as a standard Container. The part about 23.2.1.3 is not listed among the differences, and I should include it as well (thanks!). BTW, the standard itself is rather lousy with respect to the Container concept. For instance, [forwardlist.overview]/1 says "A forward_list is a container that supports forward iterators..." only to state in the next parapragh that "A forward_list satisfies all of the requirements of a container (Table 96), except that the size() member function is not provided and operator== has linear complexity." So, in the end, forward_list is *not* a container :-) Joaquín M López Muñoz
On 9/05/2017 01:48, Adam Wulkiewicz wrote:
I noticed that in the documentation a word "container" is used WRT poly collections. AFAIU this is not correct because poly_collection doesn't meet the requirements of C++ container as defined in the standard, i.e. allocator_traits<allocator_type>::construct should be called only for the element type (23.2.1.3)
On that note, do you know of a more-comprehensible-than-standardese guide on how allocator_traits is supposed to be used for node-based custom container types? I've tried to figure it out a couple times but I'm sure I'm missing important points.
Gavin Lambert Via Boost wrote:
On 9/05/2017 01:48, Adam Wulkiewicz wrote:
I noticed that in the documentation a word "container" is used WRT poly collections. AFAIU this is not correct because poly_collection doesn't meet the requirements of C++ container as defined in the standard, i.e. allocator_traits<allocator_type>::construct should be called only for the element type (23.2.1.3)
On that note, do you know of a more-comprehensible-than-standardese guide on how allocator_traits is supposed to be used for node-based custom container types? I've tried to figure it out a couple times but I'm sure I'm missing important points.
No, sorry, I'm not aware of any literature regarding it, though it doesn't mean it doesn't exist. I don't consider myself an expert in this area. I'd suggest to check out several STL implementations and Boost.Container's code for hints. Adam
Gavin Lambert wrote:
On that note, do you know of a more-comprehensible-than-standardese guide on how allocator_traits is supposed to be used for node-based custom container types? I've tried to figure it out a couple times but I'm sure I'm missing important points.
What do you want to know? 1. The storage for the node should be allocated with the Allocator's 'allocate()', for an allocator instance 'a' whose value_type is the containers node type (i.e. 'a' can be achieved by rebinding an existing allocator instance in the normal way). 2. The node object itself would be constructed in the normal way - e.g. placement new to construct the node object. 3. The containers' value_type object within the node must be constructed by allocator_traits<U>::construct(u, p, args...) where u of type U is a rebound copy of the allocator such that the allocator's value_type is the container's value_type. 4. Similar to 3, for destruction of the container's value_type object should be destroyed by allocator_traits<U>::destroy(u, p) 5. Similar to 2, the node object is destroyed in the normal way - e.g. invoking the destructor of the node type. Glen
On 13/05/2017 12:50, Glen Fernandes wrote:
What do you want to know?
1. The storage for the node should be allocated with the Allocator's 'allocate()', for an allocator instance 'a' whose value_type is the containers node type (i.e. 'a' can be achieved by rebinding an existing allocator instance in the normal way).
2. The node object itself would be constructed in the normal way - e.g. placement new to construct the node object.
3. The containers' value_type object within the node must be constructed by allocator_traits<U>::construct(u, p, args...) where u of type U is a rebound copy of the allocator such that the allocator's value_type is the container's value_type.
4. Similar to 3, for destruction of the container's value_type object should be destroyed by allocator_traits<U>::destroy(u, p)
5. Similar to 2, the node object is destroyed in the normal way - e.g. invoking the destructor of the node type.
Yeah, that's what it sounded like it said. But those rules seem very peculiar to me. Why wouldn't you use allocator_traits<node>::construct to construct the node type as well instead of using placement new directly? That way you only ever need one allocator instance, that for the node type. Why is it sensible to call allocate with a different type than construct, if the reason to not call construct is to avoid leaking the node type externally? In particular, these rules seem to basically require that the node type be trivially-constructible, because you can't legally call the node constructor and then call the value constructor where the value is a field of the node; you'd have to construct the node, destroy the value, then re-construct the value, which seems well over the line into crazypants territory. It all seems very arbitrary to me.
On Sun, May 14, 2017 at 7:02 PM, Gavin Lambert wrote:
In particular, these rules seem to basically require that the node type be trivially-constructible, because you can't legally call the node constructor and then call the value constructor where the value is a field of the node; you'd have to construct the node, destroy the value, then re-construct the value, which seems well over the line into crazypants territory.
It all seems very arbitrary to me.
No. You could also design the node type to contain a std::aligned_storage_t<sizeof(value_type), alignof(value_type)> object (instead of containing a value_type object). That way after you construct the Node type with ::new(x) Node(args...); you an use std::allocator_traits<U>::construct(u, address, args...) with the address of that aligned storage. Glen
On 15/05/2017 11:12, Glen Fernandes wrote:
On Sun, May 14, 2017 at 7:02 PM, Gavin Lambert wrote:
In particular, these rules seem to basically require that the node type be trivially-constructible, because you can't legally call the node constructor and then call the value constructor where the value is a field of the node; you'd have to construct the node, destroy the value, then re-construct the value, which seems well over the line into crazypants territory.
It all seems very arbitrary to me.
No. You could also design the node type to contain a std::aligned_storage_t<sizeof(value_type), alignof(value_type)> object (instead of containing a value_type object).
That way after you construct the Node type with ::new(x) Node(args...); you an use std::allocator_traits<U>::construct(u, address, args...) with the address of that aligned storage.
Apologies for the delayed response, but I was distracted by other things. If the node actually contains aligned_storage rather than T, doesn't that then require reinterpret_casting everywhere (since everything will expect a T*, not an aligned_storage_t<>*)? Isn't that usually considered a bad thing? It also seems really error-prone, since allocator_traits::construct/destroy accept pointers of all types, not just their parameterised type, and then act based on the received type. So given: struct node { node* next; std::aligned_storage_t<sizeof(T), alignof(T)> storage; }; using alloc_traits = std::allocator_traits<T>; using node_traits = typename alloc_traits:: template rebind_traits<node>; alloc_traits::allocator_type m_alloc; node_traits::allocator_type m_node_alloc; // m_node_alloc(m_alloc) This code looks correct at a glance, but is horribly horribly wrong: node* n = node_traits::allocate(m_node_alloc, 1); ::new(n) node({ nullptr }); // not allowed to call construct? alloc_traits::construct(m_alloc, std::addressof(n->storage)); // ... alloc_traits::destroy(m_alloc, std::addressof(n->storage)); n->~node(); // not allowed to call destroy? node_traits::destroy(m_node_alloc, n, 1); (The double-construction of storage isn't the wrong part -- we're relying on it being a trivially-constructible type.) This compiles, and it default-constructs and then destroys a T inside node, right? No, since storage is the wrong type it ends up calling the storage-POD constructor and destructor, not T's. Despite alloc_traits supposedly being the traits for T. Why is the allocator even templated on this type when it doesn't actually use it? To make that code actually behave as expected, you need to reinterpret_cast<T&>(n->storage). And you need to remember to do that everywhere, and the compiler won't help you because it will happily compile with the wrong behaviour if you forget. (Only the allocator parameter is type-checked, not the address parameter.) Am I missing something? This seems a bit broken. Granted it's easy enough to make this safe once you're aware of it, but it seems like a way-too-easy trap to fall into. eg. this would be a "safe" node: struct node { explicit node(node* n = nullptr) : next(n) {} node* next; T& value() { return reinterpret_cast<T&>(storage); } private: std::aligned_storage_t<sizeof(T), alignof(T)> storage; }; But of course now it's not a POD. One other thing that the lax type-checking does allow is this: node* n = node_traits::allocate(m_node_alloc, 1); ::new(n) node(); // not allowed to call construct? node_traits::construct(m_node_alloc, std::addressof(n->value())); // ... node_traits::destroy(m_node_alloc, std::addressof(n->value())); n->~node(); // not allowed to call destroy? node_traits::destroy(m_node_alloc, n, 1); (Specifically not actually using m_alloc or alloc_traits any more.) Which seems like it should be illegal (despite being convenient, since it means storing only one allocator), although a StackOverflow thread claims that it's intended behaviour.
El 15/06/2017 a las 10:17, Gavin Lambert via Boost escribió:
[...]
Granted it's easy enough to make this safe once you're aware of it, but it seems like a way-too-easy trap to fall into. eg. this would be a "safe" node:
struct node { explicit node(node* n = nullptr) : next(n) {}
node* next; T& value() { return reinterpret_cast<T&>(storage); }
private: std::aligned_storage_t<sizeof(T), alignof(T)> storage; };
But of course now it's not a POD.
It's not a POD, but it has a trvial default ctor. My reading of 3.8 Object lifetime [basic.life] implies that you can omit construction and begin using a just-allocated object when its default ctor is trivial. Maybe some language lawyer can shed some light here. Joaquín M López Muñoz
On Thu, Jun 15, 2017 at 4:17 AM, Gavin Lambert via Boost wrote:
If the node actually contains aligned_storage rather than T, doesn't that then require reinterpret_casting everywhere (since everything will expect a T*, not an aligned_storage_t<>*)? Isn't that usually considered a bad thing?
Yes., and...
It also seems really error-prone,
Yes. :) i.e. It is easy to make the mistake you mentioned, among others. The model isn't nice (but it is what we have, until someone designs a less terrible one). Glen
Hi, I am not a Boost author, but I think I am able to give a review of PolyCollection, so here it goes.
On 03 May 2017, at 01:46, Ion Gaztañaga via Boost <boost@lists.boost.org> wrote:
Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review:
- What is your evaluation of the design?
I only looked at the interface and the description of how it works, both is great. There is nothing I would do differently. Concerning the interface, I expected the collections to implement the std::set interface, which they mostly do. Of course, in addition they have the extra interface to do type-specific stuff. The std::set methods count, find, equal_range, lower_bound, upper_bound are missing, but the library has equivalent free functions, which I like better anyway.
- What is your evaluation of the implementation?
I didn't look at the code. The concept as described in the documentation made a lot of sense.
- What is your evaluation of the documentation?
I really liked the graphics which explain how objects are stored internally. The quality of the documentation is very high. Everything I looked at was very professional and made sense. Some specifics on the tutorial section: I understood boost::base_collection right away. This part is perfect from my POV. boost::function_collection was not so clear to me. It is nice to keep the narrative of game development going, but like someone said before, I think it would help to give a simpler example. I felt that the benefit of using boost::function_collection was not so clearly presented as for boost::base_collection. The function pointers are all the same size, so how exactly does the optimisation enter? I assume it is not about memory allocation here, but about sorting equal function signatures into homogeneous blocks to improve branch prediction and locality, but I am not sure. The description of boost::any_collection was clear again.
- What is your evaluation of the potential usefulness of the library?
The library optimises the use of collections of polymorphic objects. I can see many applications. The optimisation feels so obvious now after reading the documentation that I wonder why such a library was not added sooner. It is a good thing that Boost provides a standard implementation of this optimisation so that developers don't have to reinvent it in their domain.
- Did you try to use the library? With what compiler? Did you have any problems?
I tried to compile the tests on a Mac with Apple-clang 8.0.0. It worked after I manually specified "cxxflags=-std=c++11" in the b2 call. By the way, it would be great if b2 used the highest standard that is available by default.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I studied the documentation, mostly the tutorial. I also studied the benchmarks with interest, and I glanced at the reference. I tried to compile the tests. All in all, I guess I spend 3 hours.
- Are you knowledgeable about the problem domain?
Yes, as it is not very specific...
- Do you think the library should be accepted as a Boost library?
Yes. Best regards, Hans
El 10/05/2017 a las 17:56, Hans Dembinski via Boost escribió:
Hi,
I am not a Boost author, but I think I am able to give a review of PolyCollection, so here it goes.
Thanks for your review!
[...]
- What is your evaluation of the documentation? [...]
boost::function_collection was not so clear to me. It is nice to keep the narrative of game development going, but like someone said before, I think it would help to give a simpler example. I felt that the benefit of using boost::function_collection was not so clearly presented as for boost::base_collection. The function pointers are all the same size, so how exactly does the optimisation enter? I assume it is not about memory allocation here, but about sorting equal function signatures into homogeneous blocks to improve branch prediction and locality, but I am not sure.
I have to think over the boost::function_collection given that it does't look as clear as I expected... As you correctly guess my aim was to have a single narrative, but seems like that's not working so well. Regarding the benefits of boost::function_collection<Signature> with respect to std::vector<std::function<Signature>>, memory allocation improves as well: except where small object optimization applies, each std::function object stores a pointer to a heap-allocated copy of the wrapped callable entity passed by the user; boost::base_collection, on the contrary, stores same-type callable entities contiguously. (If you are really curious, a segment of a boost::function_collection is composed of *two* vectors, one for the callable entities proper and another storing std::function-like objects pointing to the former. Think of it as a std::vector<std::function<Signature>> constructed with such a smart memory allocator that (same-type) wrapped callable entities all happen to lie sequentially in memory.)
[...]
- Did you try to use the library? With what compiler? Did you have any problems? I tried to compile the tests on a Mac with Apple-clang 8.0.0. It worked after I manually specified "cxxflags=-std=c++11" in the b2 call. By the way, it would be great if b2 used the highest standard that is available by default.
It's weird because the Jamfile does have this flag already: https://github.com/joaquintides/poly_collection/blob/master/test/Jamfile.v2 I guess the snapshot provided from the Incubator might be oldish and not include that... I can't check right now because blincubator.com is not responding :-(
[...]
- Do you think the library should be accepted as a Boost library? Yes.
Thank you! Joaquín M López Muñoz
Hi Joaquin,
On 10 May 2017, at 19:21, Joaquin M López Muñoz via Boost <boost@lists.boost.org> wrote:
I have to think over the boost::function_collection given that it does't look as clear as I expected... As you correctly guess my aim was to have a single narrative, but seems like that's not working so well.
perhaps you can keep the narrative and just simplify the example a bit? I have nothing against the consistent narrative, it is a nice touch.
Regarding the benefits of boost::function_collection<Signature> with respect to […]
Ok, I think I got it now, thanks :). As far as I am concerned, just adding this explanation to the tutorial would be enough.
- Did you try to use the library? With what compiler? Did you have any problems? I tried to compile the tests on a Mac with Apple-clang 8.0.0. It worked after I manually specified "cxxflags=-std=c++11" in the b2 call. By the way, it would be great if b2 used the highest standard that is available by default.
It's weird because the Jamfile does have this flag already:
https://github.com/joaquintides/poly_collection/blob/master/test/Jamfile.v2
I guess the snapshot provided from the Incubator might be oldish and not include that... I can't check right now because blincubator.com is not responding :-(
It seems I got the right version, I checked out from github, not the incubator. My version of the Jamfile has these lines project : requirements <toolset>gcc:<cxxflags>-std=c++11 <toolset>clang:<cxxflags>-std=c++11 ; I am not a Boost.Build expert, but that looks ok to me. Nevertheless, when I run just ./b2 in the "tests" folder, it tries to compile but fails. When I run ./b2 cxxflags=-std=c++11, the tests compile fine. Best regards, Hans
El 11/05/2017 a las 9:46, Hans Dembinski escribió:
I tried to compile the tests on a Mac with Apple-clang 8.0.0. It worked after I manually specified "cxxflags=-std=c++11" in the b2 call. By the way, it would be great if b2 used the highest standard that is available by default.
It's weird because the Jamfile does have this flag already:
https://github.com/joaquintides/poly_collection/blob/master/test/Jamfile.v2
I guess the snapshot provided from the Incubator might be oldish and not include that... I can't check right now because blincubator.com is not responding :-( It seems I got the right version, I checked out from github, not the incubator. My version of the Jamfile has these lines
project : requirements <toolset>gcc:<cxxflags>-std=c++11 <toolset>clang:<cxxflags>-std=c++11 ;
I am not a Boost.Build expert, but that looks ok to me. Nevertheless, when I run just ./b2 in the "tests" folder, it tries to compile but fails. When I run ./b2 cxxflags=-std=c++11, the tests compile fine.
I'm no Boost.Build expert either. A shot in the dark: maybe Apple-clang has a different toolset than "regular" clang? Joaquín M López Muñoz
Hi Joaquín (and Ion), I took a look at this a few months ago, and my view is positive today as it was then. The code looks clean, the documentation good, and the tests look thorough. Summary: The library provides containers that offer great data locality and which stores static type information. The net result is that increased performance can be had, sometimes significantly. To achieve this, objects are stored in per-type segments. Small buffer optimizations may be advantages for vector<any> or vector<function<...>>, but this library goes beyond that. The price you pay is that iterators are forward only (so no sorting) and in-the-middle erase could be slower (in some cases). I would mostly be using base_collection in my work, but I can definitely see the need for the other two collections. To get the optimization offered by this library you would have to do something like vector<Derived1> vector<Derived2> ... vector<DerivedN> and then manually loop over each container each time you wanted to process each element. This quickly becomes tedious. With this library I can have a single container in my program and get approximately the same speed. So overall my impression is very positive. This is great work. I do have a few things that I think needs careful considerations. A. OO programming and copyability of classes is not something that everybody is going to like. The normal thing is to derive from boost::noncopyable to get rid of a bunch of undesired behaviors. Would it be possible for the library to pick up a private or protected copy-constructor when needed, possible via a friend declaration? I think many uses of this library will also store pointers to the objects in other collections, e.g. vector<const base*>, and as members so being able to have your hierarchy non-copyable while allowing copyability within the container is important IMO. B. I miss range versions of the various iteration patterns so I can use them directly with a range-based for. E.g. instead of for( auto i = c.begin<warrior>(); i != c.end<warrior>(); ++ i ) I should be able to say for( auto& : c.range<warrior>() ); C. I think we talked about an make_overload() function to devirtualize uses of base_collection. Do we have such a beast in boost already? If not, it might be a worth providing it with this library. D. Inside the segments you store std::vector. I could easily imagine the need to store something else, say a flat_set or a container suitable for non-movable, non-conpyable types (say a variant of deque). Therefore I would love to have a second defaulted argument set to std::vector. Now this is a template, template parameter which complicate things ... so in a sense a policy with a nested template alias could do the trick. I also have other minor comments, but those given below.
- What is your evaluation of the design?
It's excellent.
- What is your evaluation of the implementation?
Very high quality, as usual.
- What is your evaluation of the documentation?
Also high quality.
- What is your evaluation of the potential usefulness of the library?
Very high.
- Did you try to use the library? With what compiler? Did you have any problems?
No/No/No.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
About half a day.
- Are you knowledgeable about the problem domain?
Somewhat.
And most importantly:
- Do you think the library should be accepted as a Boost library?
Yes. *Specific comments* A. why is subaddress() returning void* instead of T* ? B. is the void* stuff like this virtual range emplace( const_base_iterator p,void (*emplf)(void*,void*),void* arg) { return range_from(s.emplace(iterator_from(p),emplf,arg)); } needed ? That is, is there no way to use real ans specific types? C. Does value_holder<T>::data() need to return void* D. Implementation of equality: do we need to be set like semantics (two-pass)? Why don't you just delegate to the segments after checking the size? I don't think the documentation specifies what the implementation does. E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry if this is already done). Specifically, if std::unordered_map has noexcept for these, then you can guarantee the same for base_collection etc. F. I understand the layout to be roughly std::unordered_map<type_index,std::unique_ptr<segment>> ... this does not quite match diagram on page 2 of documentation, that is, you are missing an indirection. G. tutorial: consider using override where appropriate H. local iterator and undefined behavior?: does the code have an assertion at least? Can we not get rid of undefined behavior of c.insert(c.begin(typeid(warrior)),juggernaut{11}); ? That would surely be nice. I. why emplace_pos/hint but not insert_hint/pos ? J. why ==,!= but not <,>,>=, <= K. clarify that boost::poly_collection::for_each <warrior,juggernaut,goblin,std::string,window>( //restituted types c.begin(),c.end(),[&](const auto& x) loops over all elements or only the specific. L. It could be good to have a performance test of erase in the middle/front. M. using the identifier "concept" may require change in the near future? O. I wouldn't mind seeing separate graphs for 0-300 elements. P. clarify test are without use of reserve Q. is there no normal container size()/empty() ? docs say e.g. cc.size(index) cc.size<U>() but then later when we see the full base_collection interface they are mentioned. R. dumb question: is there any thing in your code that requires std::is_polymorphic_v<Base> for base_collection ? If not, then perhaps one should be able to avoid that and work completely with type restitution ? In some sense, you could create a hierarchy where classes a memcopyable and without a single virtual function. S. The test are testing best possible scenario; a more realistic test would be to copy pointers to an std::vector<base*>, shuffle it and run update on that. That's it. Great work :-). kind regards Thorsten -- Best regards, Thorsten Jørgen Ottosen, Ph.D. Director of Research +45 23308797 Dezide (www.dezide.com)
El 10/05/2017 a las 22:33, Thorsten Ottosen via Boost escribió:
Hi Joaquín (and Ion),
[...]
A. OO programming and copyability of classes is not something that everybody is going to like. The normal thing is to derive from boost::noncopyable to get rid of a bunch of undesired behaviors. Would it be possible for the library to pick up a private or protected copy-constructor when needed, possible via a friend declaration? I think many uses of this library will also store pointers to the objects in other collections, e.g. vector<const base*>, and as members so being able to have your hierarchy non-copyable while allowing copyability within the container is important IMO.
Technically, this is doable and, at first blush, simple, since all element copying and assignent is centralized in a single wrapper class: https://github.com/joaquintides/poly_collection/blob/master/include/boost/po... Another possibility is to use a copying traits class to let users customize this point, along the custom RTTI thing proposed by Ion. This looks to me even cleaner and more powerful. I have no strong objection against including ths feature in whatever form, but don't particularly like it either --in the end, elements in a base_collection must be copyable (strictly speaking, moveable) and this will force users to change their code by writing the necessary ctors, so requiring an extra layer of boilerplate (declaring friends etc.) is just more hassle.
B. I miss range versions of the various iteration patterns so I can use them directly with a range-based for. E.g. instead of
for( auto i = c.begin<warrior>(); i != c.end<warrior>(); ++ i )
I should be able to say
for( auto& : c.range<warrior>() );
Yes, this is trivial and looks useful. At the risk of bikeshedding, c.segment<warrior>() might be a better name. Unfortunately there's no std naming practice here to follow on, unless someone proves me wrong.
C. I think we talked about an make_overload() function to devirtualize uses of base_collection. Do we have such a beast in boost already? If not, it might be a worth providing it with this library.
hanna::overload is already here to the rescue: https://lists.boost.org/Archives/boost/2017/03/233001.php
D. Inside the segments you store std::vector. I could easily imagine the need to store something else, say a flat_set or a container suitable for non-movable, non-conpyable types (say a variant of deque). Therefore I would love to have a second defaulted argument set to std::vector. Now this is a template, template parameter which complicate things ... so in a sense a policy with a nested template alias could do the trick.
Of all your proposals this is the one I have most concerns about: element contiguity is such a key property that basically the whole library implicitly depends on it, code- and designwise. Policying this aspect away looks to me, in the absence of cogent use cases, an exercise in overdesign and a lot of work.
[...]
- Do you think the library should be accepted as a Boost library?
Yes.
Thank you!
*Specific comments*
A. why is subaddress() returning void* instead of T* ?
Because only void* is needed: subaddress() is used in poly_collection/detail/segment.hpp as a client of the type-erased virtual interface defined in https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
B. is the void* stuff like this
virtual range emplace( const_base_iterator p,void (*emplf)(void*,void*),void* arg) { return range_from(s.emplace(iterator_from(p),emplf,arg)); }
needed ? That is, is there no way to use real ans specific types?
This is connected to A, and in fact gives me the opportunity to comment on one of the more beautiful (IMHO) pieces of the lib :-) The implementation of a typical emplace(args...) function just forwards the args to the corresponding allocator construction call. The problem with Boost.PolyCollection is that this construction happens inside a class implementing a virtual interface (the one mentioned before), and this interface can't have a member function template able to accept(args...). We have a type erasure wall, if you wish. So the workaround is to define a type-erased callback (the emplf bit) that is invoked from the virtual class implementation and provided by the frontend of poly_collection, where the types of (args...) haven't been erased. Take a look at https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
C. Does value_holder<T>::data() need to return void*
It could return T*, but that's not needed, as data() is used for placement new.
D. Implementation of equality: do we need to be set like semantics (two-pass)? Why don't you just delegate to the segments after checking the size? I don't think the documentation specifies what the implementation does.
We can't blindly delegate to the segments (if I'm getting your question right) because the order and associated types of the segments in x and y can differ: we need to first match them up, hence the two passes.
E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry if this is already done). Specifically, if std::unordered_map has noexcept for these, then you can guarantee the same for base_collection etc.
I can't guarantee noexcept because whether an exception is thrown or not depends on the nature of the actual types in the container, which is runtime info.
F. I understand the layout to be roughly std::unordered_map<type_index,std::unique_ptr<segment>> ... this does not quite match diagram on page 2 of documentation, that is, you are missing an indirection.
You're right, but the diagram is merely explanatory and I wouldn't want to complicate it to no avail. In fact, the segments are pointed to from what looks like a small vector of three elements (light blue), which is also not a correct depiction of a std::unordered_map internal structure. In the end, all these indirections don't affect performance in any significant way (workload is in the segments proper).
G. tutorial: consider using override where appropriate
Noted, will do.
H. local iterator and undefined behavior?: does the code have an assertion at least? Can we not get rid of undefined behavior of
c.insert(c.begin(typeid(warrior)),juggernaut{11});
? That would surely be nice.
There's an assertion: take a look at https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
I. why emplace_pos/hint but not insert_hint/pos ?
This is insert(it,x). As for the naming, we have emplace(args...) emplace_hint(it,args...) emplace_pos(local_it,args...) insert(x) insert(it,x) insert(local_it,x) which follows std conventions except for "emplace_pos". I had to made that name up because, in the standard, fixed position emplace only happens in sequence containers where the name is simply "emplace", but I can't use "emplace" as emplace(it,args...) would collide with emplace(args...) (which sequence containers don't have).
J. why ==,!= but not <,>,>=, <=
Followed the interface of std::unordered_set. As segment order is unspecified, less-than comparison seems to make no sense.
K. clarify that
boost::poly_collection::for_each <warrior,juggernaut,goblin,std::string,window>( //restituted types c.begin(),c.end(),[&](const auto& x)
loops over all elements or only the specific.
Loops over all elements, doing type restitution on those marked. Noted, will clarify.
L. It could be good to have a performance test of erase in the middle/front.
I can do that. Interpretation is going to be tricky, as the compared data structures differ wildly wrt erasure.
M. using the identifier "concept" may require change in the near future?
Good observation! But, is "concept" not going to be a context-dependent keyword à la "final"?
O. I wouldn't mind seeing separate graphs for 0-300 elements.
I can extend the graphs to cover that part, but I don't see the point of using this library for a handful of elements,don't you think?
P. clarify test are without use of reserve
Noted, will do.
Q. is there no normal container size()/empty() ? docs say e.g.
cc.size(index) cc.size<U>()
but then later when we see the full base_collection interface they are mentioned.
There are normal size() and empty() member functions, they're described in the reference.
R. dumb question: is there any thing in your code that requires std::is_polymorphic_v<Base> for base_collection ? If not, then perhaps one should be able to avoid that and work completely with type restitution ? In some sense, you could create a hierarchy where classes a memcopyable and without a single virtual function.
Actually, std::is_polymorphic_v is enforced on the reference but not anywhere in the code. I guess I preferred to be conservative here. As for working with type restitution alone, I think this scenario would be best served by a potential variant_collection, which was discussed at https://lists.boost.org/Archives/boost/2017/03/233001.php
S. The test are testing best possible scenario; a more realistic test would be to copy pointers to an std::vector<base*>, shuffle it and run update on that.
Sorry if I am not getting your question, but is this not the line labeled "shuffled_ptr_vector" in the plot?
That's it. Great work :-).
What a thorough review! Thank you, Joaquín M López Muñoz
Den 11-05-2017 kl. 09:53 skrev Joaquin M López Muñoz via Boost:
El 10/05/2017 a las 22:33, Thorsten Ottosen via Boost escribió:
Hi Joaquín (and Ion),
[...]
A. OO programming and copyability
Technically, this is doable and, at first blush, simple, since all element copying and assignent is centralized in a single wrapper class:
Another possibility is to use a copying traits class to let users customize this point, along the custom RTTI thing proposed by Ion. This looks to me even cleaner and more powerful.
Yeah, I guess I would need to see how client code looks like to form any opinion. Notice that classes in hierarchies often have a non-public copy-constructor for use in cloning. Identity management is just not like ints, and sometimes having classes that are neither (publicly) copyable or movable happens.
in the end, elements in a base_collection must be copyable (strictly speaking, moveable) and this will force users to change their code
Yes, but it is likely that such types are used outside of the containers as well and therefore I think it is important to consider.
Yes, this is trivial and looks useful. At the risk of bikeshedding, c.segment<warrior>() might be a better name.
I like it.
C. I think we talked about an make_overload() function to devirtualize uses of base_collection. Do we have such a beast in boost already? If not, it might be a worth providing it with this library.
hanna::overload is already here to the rescue:
Awesome!
D. Inside the segments you store std::vector. I could easily imagine the need to store something else, say a flat_set or a container suitable for non-movable, non-conpyable types (say a variant of deque). Therefore I would love to have a second defaulted argument set to std::vector. Now this is a template, template parameter which complicate things ... so in a sense a policy with a nested template alias could do the trick.
Of all your proposals this is the one I have most concerns about: element contiguity is such a key property that basically the whole library implicitly depends on it, code- and designwise. Policying this aspect away looks to me, in the absence of cogent use cases, an exercise in overdesign and a lot of work.
I'm just thinking out loud here. So one option would be to require that the container is contiguous (flat_set fits this). I was under the impression that you rely heavily on having random access iterators on a segment, but not on actual contiguity of the elements in a segment. We would probably also need some way to get the segment container, say, auto& vector = c.segment_container<warrior>(); so why would anyone want to use a flat_set internally. Well, fast lookup. The alternative today is to copy the pointers to flat_set/vector and use that. Why would anyone want to use batch_deque (http://erenon.hu/double_ended/double_ended/examples.html#double_ended.exampl... Well, non-copyable, non-movable types. For example, when I need to load a Bayesian network in my work, it maps exactly over to a set of nodes that each must be loaded once and then kept in memory. Having the no-move guarantee ensures that pointers to the nodes can be shared freely without any worry that the object reference is not stable. Anyway, if its a lot of work, I wouldn't put it high on the wish list.
D. Implementation of equality: do we need to be set like semantics (two-pass)? Why don't you just delegate to the segments after checking the size? I don't think the documentation specifies what the implementation does.
We can't blindly delegate to the segments (if I'm getting your question right) because the order and associated types of the segments in x and y can differ: we need to first match them up, hence the two passes.
Well, the documentation states: "a==b evaluates to true iff for each non-empty segment of subojects of type U in a there is a segment of U in b with the same size and equal elements in the same order, and viceversa." ^^^^^^^^^^^^^^^^^^ So the question is we want the current O(n^2) == or if it should do as the docs say. Since we don't have set semantics anyway, I would prefer segments to match exactly.
E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry if this is already done). Specifically, if std::unordered_map has noexcept for these, then you can guarantee the same for base_collection etc.
I can't guarantee noexcept because whether an exception is thrown or not depends on the nature of the actual types in the container, which is runtime info.
I don't get it. We are moving a hash_map of std::type_index keys and std::unique_ptr<segment> values. Neither of those should throw. And we hopefully guarantee reference stability on moving a base_collection (that is, we don't move individual elements in the container)? I don't think the C++ standard requires a noexcept move for std::unordered, but suspect many implementations provide this guarantee (we are basically swapping a handle). So I'm saying that we can guarantee noexcept if the unordered map guarantees it. Perhaps using boost::unordered is better than using the standard container. This reminds me, it would be good with a small note of reference stability guarantees. AFAICT, it's swap/move/changing an unrelated segment.
H. local iterator and undefined behavior?: does the code have an assertion at least? Can we not get rid of undefined behavior of
c.insert(c.begin(typeid(warrior)),juggernaut{11});
? That would surely be nice.
There's an assertion: take a look at
https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
Ok. Hm. So maybe I don't quite get we have c.insert(c.begin<juggernaut>(),juggernaut{10}); // compile time error c.insert(c.begin(typeid(warrior)),juggernaut{11}); // undefined behavior presumably because we may not know the type, so we say c.insert( c.begin(typeid( obj )), std::move(obj) ); ? Does it need to be an iterator? Perhaps c.insert( segment_position{0}, std::move(obj) ); could work, guaranteeing that no undefined behavior can happen. Otherwise, we should seriously consider throwing instead.
J. why ==,!= but not <,>,>=, <=
Followed the interface of std::unordered_set. As segment order is unspecified, less-than comparison seems to make no sense.
Right. One would have to copy the std::type_index keys to a local buffer, sort it, and use that as the order. But it is hard to imagine a use-case.
M. using the identifier "concept" may require change in the near future?
Good observation! But, is "concept" not going to be a context-dependent keyword à la "final"?
Not sure. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4377.pdf specifies two keywords.
O. I wouldn't mind seeing separate graphs for 0-300 elements.
I can extend the graphs to cover that part, but I don't see the point of using this library for a handful of elements,don't you think?
Why not? The elements may be relatively few, but fat. This fits exactly my use case.
Q. is there no normal container size()/empty() ? docs say e.g.
cc.size(index) cc.size<U>()
but then later when we see the full base_collection interface they are mentioned.
There are normal size() and empty() member functions, they're described in the reference.
Sure, but in the section called Polymorphic collections they sometime appear as only segment specific, whereas in the declaration of the class, they show all members.
S. The test are testing best possible scenario; a more realistic test would be to copy pointers to an std::vector<base*>, shuffle it and run update on that.
Sorry if I am not getting your question, but is this not the line labeled "shuffled_ptr_vector" in the plot?
Yes, sort of, but I'm saying that very often one wants to copy pointers from objects in base_collection into std::vector<base*> to rearrange the elements somehow (say, your game entities need to be sorted wrt to distance from the avatar). I still expects base_collection + std::vector<base*> + shuffle to perform somewhat better than ptr_vector + shuffle, but it would be interesting to see. kind regards Thorsten
El 11/05/2017 a las 18:18, Thorsten Ottosen escribió:
Den 11-05-2017 kl. 09:53 skrev Joaquin M López Muñoz via Boost:
El 10/05/2017 a las 22:33, Thorsten Ottosen via Boost escribió:
D. Implementation of equality: do we need to be set like semantics (two-pass)? Why don't you just delegate to the segments after checking the size? I don't think the documentation specifies what the implementation does.
We can't blindly delegate to the segments (if I'm getting your question right) because the order and associated types of the segments in x and y can differ: we need to first match them up, hence the two passes.
Well, the documentation states:
"a==b evaluates to true iff for each non-empty segment of subojects of type U in a there is a segment of U in b with the same size and equal elements in the same order, and viceversa." ^^^^^^^^^^^^^^^^^^
So the question is we want the current O(n^2) == or if it should do as the docs say. Since we don't have set semantics anyway, I would prefer segments to match exactly.
Sorry, I'm not getting you. The current implementation enforces exact segment match. Could you maybe elaborate/reword your question?
E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry if this is already done). Specifically, if std::unordered_map has noexcept for these, then you can guarantee the same for base_collection etc.
I can't guarantee noexcept because whether an exception is thrown or not depends on the nature of the actual types in the container, which is runtime info.
I don't get it. [...]
My bad, I misread your question. Of course move construction and move assignment do not throw or copy elements etc., but I didn't mark them as noexcept following the no-noexcept signature of std unordered associative containers. I don't know why the std has not made those noexcept.
This reminds me, it would be good with a small note of reference stability guarantees. AFAICT, it's swap/move/changing an unrelated segment.
This is implicitly guaranteed by [container.requirements.general]/9, as a polymorphic collection is a quasi-model of Container, as stated in the reference.
H. local iterator and undefined behavior?: does the code have an assertion at least? Can we not get rid of undefined behavior of
c.insert(c.begin(typeid(warrior)),juggernaut{11});
? That would surely be nice.
There's an assertion: take a look at
https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
Ok. Hm. So maybe I don't quite get we have
c.insert(c.begin<juggernaut>(),juggernaut{10}); // compile time error c.insert(c.begin(typeid(warrior)),juggernaut{11}); // undefined behavior
The difference is that in the former, the type of the iterator is associated to warrior (I assume you meant "c.begin<warrior>()") at compile time, whereas in the former this info is dynamic and an assertion is the best we can get.
presumably because we may not know the type, so we say
c.insert( c.begin(typeid( obj )), std::move(obj) );
? Does it need to be an iterator? Perhaps
c.insert( segment_position{0}, std::move(obj) );
could work, guaranteeing that no undefined behavior can happen. Otherwise, we should seriously consider throwing instead.
I don't think throwing is the right approach here, as we are penalizing users who do the right thing and don't mix values with iterators improperly.
O. I wouldn't mind seeing separate graphs for 0-300 elements.
I can extend the graphs to cover that part, but I don't see the point of using this library for a handful of elements,don't you think?
Why not? The elements may be relatively few, but fat. This fits exactly my use case.
Noted. Will extend the plots, then.
S. The test are testing best possible scenario; a more realistic test would be to copy pointers to an std::vector<base*>, shuffle it and run update on that.
Sorry if I am not getting your question, but is this not the line labeled "shuffled_ptr_vector" in the plot?
Yes, sort of, but I'm saying that very often one wants to copy pointers from objects in base_collection into std::vector<base*> to rearrange the elements somehow (say, your game entities need to be sorted wrt to distance from the avatar). I still expects base_collection + std::vector<base*> + shuffle to perform somewhat better than ptr_vector + shuffle, but it would be interesting to see.
Sorry again but I don't get it yet. I don't quite understand which particular data structure you'd like me to compare base_collection against. Joaquín M López Muñoz
Den 15-05-2017 kl. 00:14 skrev Joaquin M López Muñoz via Boost:
El 11/05/2017 a las 18:18, Thorsten Ottosen escribió:
Den 11-05-2017 kl. 09:53 skrev Joaquin M López Muñoz via Boost:
El 10/05/2017 a las 22:33, Thorsten Ottosen via Boost escribió:
D. Implementation of equality: do we need to be set like semantics
[snip]
Sorry, I'm not getting you. The current implementation enforces exact segment match. Could you maybe elaborate/reword your question?
My bad. Your code does what the docs say. You have this // TODO: can we do better than two passes? By that, do you mean that the first pass is the size() comparison?
E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry
I don't get it. [...]
My bad, I misread your question. Of course move construction and move assignment do not throw or copy elements etc., but I didn't mark them as noexcept following the no-noexcept signature of std unordered associative containers. I don't know why the std has not made those noexcept.
I guess some vendors wanted freedom, and then gave users the near impossible task of tracking no-except/except impacts through their code bases. Now, you use =default for all the move operations, so I think they are required to be deduced to be noexcept exactly when they can be. Though a static assertion in the tests can track any fault in that logic. If one ever puts base_collection's into a container, you are in for a huge penalty when the container operation falls back on copying.
S. The test are testing best possible scenario; a more realistic test would be to copy pointers to an std::vector<base*>, shuffle it and run update on that.
Sorry if I am not getting your question, but is this not the line labeled "shuffled_ptr_vector" in the plot?
Yes, sort of, but I'm saying that very often one wants to copy pointers from objects in base_collection into std::vector<base*> to rearrange the elements somehow (say, your game entities need to be sorted wrt to distance from the avatar). I still expects base_collection + std::vector<base*> + shuffle to perform somewhat better than ptr_vector + shuffle, but it would be interesting to see.
Sorry again but I don't get it yet. I don't quite understand which particular data structure you'd like me to compare base_collection against.
So take the processing test. You already compare against a shuffled ptr_vector<T>. So far so good. Now take a base_collection<T> as is also done. Then take the address of every element in the base collection and put them into a vector<T*>. Now shuffle this vector<T*>'s content like you did for ptr_vector. Then compare the processing time over a ptr_vector with the processing over the vector<T*>. I claim this is a slightly more realistic test because one often wants to impose an order of some or all of the elements in the base_collection<T> before processing/iterating them somehow. So you won't always access the items in the order induced by the base_collection<T>. That's why I said that the current test is a best-case scenario, and there is no need to change that test. But I think it would be useful to see how much better your library is when access is not segment-sequential even though storage is. kind regards Thorsten
El 15/05/2017 a las 16:13, Thorsten Ottosen via Boost escribió:
D. Implementation of equality: do we need to be set like semantics
My bad. Your code does what the docs say. You have this
// TODO: can we do better than two passes?
By that, do you mean that the first pass is the size() comparison?
OK, now I get it. Yes. the size() comparison does a first pass of both containers' internal std::unordered_maps, and I wondered whether we can come up with something smarter that avoids that.
E. should the test check (perhaps via static_assert) the conditions under which move-construction and move-assignment is noexcept? (sorry
I don't get it. [...]
My bad, I misread your question. Of course move construction and move assignment do not throw or copy elements etc., but I didn't mark them as noexcept following the no-noexcept signature of std unordered associative containers. I don't know why the std has not made those noexcept.
I guess some vendors wanted freedom, and then gave users the near impossible task of tracking no-except/except impacts through their code bases. Now, you use =default for all the move operations, so I think they are required to be deduced to be noexcept exactly when they can be. Though a static assertion in the tests can track any fault in that logic.
If one ever puts base_collection's into a container, you are in for a huge penalty when the container operation falls back on copying.
Fallback copying cannot occur to the best of my knowledge: [container.requirements.general]/Notes: "Those entries marked “(Note A)” or “(Note B)” have linear complexity for array and have constant complexity for all other standard containers." As for noexcept in std unordered associative containers, I don't have a clue why move construction is not marked conditionally noexcept the way move assignement is... given this messy state I'd prefer to rely on =default before committing to any noexcept guarantees.
S. The test are testing best possible scenario; a more realistic test would be to copy pointers to an std::vector<base*>, shuffle it and run update on that. [...] So take the processing test. You already compare against a shuffled ptr_vector<T>. So far so good. Now take a base_collection<T> as is also done. Then take the address of every element in the base
collection and put them into a vector<T*>. Now shuffle this vector<T*>'s content like you did for ptr_vector. Then compare the processing time over a ptr_vector with the processing over the vector<T*>.
I claim this is a slightly more realistic test because one often wants to impose an order of some or all of the elements in the base_collection<T> before processing/iterating them somehow. So you won't always access the items in the order induced by the base_collection<T>. That's why I said that the current test is a best-case scenario, and there is no need to change that test. But I think it would be useful to see how much better your library is when access is not segment-sequential even though storage is.
I did that with this testing shuffled_base_collection class template: template<typename Base> struct shuffled_base_collection { template<typename T> void insert(const T& x) { bc.insert(x); } template<typename F> void for_each(F f) { std::for_each(v.begin(),v.end(),[&](Base* x){f(*x);}); } void prepare_for_for_each() { std::for_each(bc.begin(),bc.end(),[&](Base& x){v.push_back(&x);}); std::shuffle(v.begin(),v.end(),std::mt19937(1)); } private: boost::base_collection<Base> bc; std::vector<Base*> v; }; (Complete performance program attached for your experimentation). Results for Visual C++ 2015 in 32-bit (x86) release mode, Windows 7 64-bit, Intel Core i5-2520M @2.5GHz, shuffled_base_collection is the fourth column: for_each: n;ptr_vector;sorted ptr_vector;shuffled ptr_vector;shuffled base_collection;base_collection;base_collection (poly::for_each);base_collection (restituted poly::for_each) 1000;68.6908;28.0432;86.3083;94.0958;37.7935;30.107;29.0816 2000;66.4786;27.1926;91.2408;96.0413;35.2004;27.8589;26.583 3100;65.5093;26.8698;89.2557;93.8447;34.4803;27.2731;26.1223 4310;64.4793;27.1499;89.499;98.0149;35.1578;28.2978;26.5429 5640;64.266;26.9044;91.033;98.202;34.2633;27.501;25.8809 7105;64.0913;26.8572;91.0014;97.3875;33.7854;27.4859;26.5453 8715;63.7974;27.3087;91.9985;98.399;34.2015;27.2596;25.599 10485;65.3485;26.5905;94.7418;97.5851;33.7521;27.6211;26.5611 12430;64.7723;27.3904;96.0209;99.1571;34.043;27.2985;26.1444 14575;64.2452;27.1067;96.5124;97.7042;34.5654;27.4023;25.8673 16930;65.1636;27.3421;98.1957;98.6361;34.0196;27.3581;25.9065 19520;64.7248;27.515;99.7822;99.5556;33.8953;27.5236;26.1173 22370;65.5374;27.8343;101.01;98.5953;33.8068;27.7628;27.0117 25505;65.0315;27.0511;103.225;101.638;34.2136;27.0208;26.2058 28955;64.1178;27.5933;103.772;105.797;33.9232;27.4988;26.231 32745;64.029;28.1792;103.421;106.625;33.8092;27.3831;25.794 36915;64.8421;28.1159;104.303;107.454;34.0542;27.405;25.7482 41505;65.1041;28.1572;105.835;108.924;34.07;27.1529;25.7295 46550;65.1797;27.8241;107.201;111.251;34.6232;27.6993;25.7345 52100;64.4235;28.0879;106.881;111.354;33.8314;27.0135;25.7227 58205;64.6465;27.9511;107.758;110.244;34.211;27.1424;25.8198 64920;65.1627;27.8584;110.97;111.413;34.7061;27.5941;26.0383 72305;65.5236;28.4969;111.552;112.634;34.0255;27.8747;25.7921 80430;66.1873;28.6039;115.026;114.442;34.3514;28.2707;25.6547 89365;66.3345;29.7042;119.888;114.368;33.976;27.5066;26.405 99195;67.5842;29.8768;132.691;118.178;33.8057;27.5137;26.3577 110005;67.502;33.2295;137.329;119.368;33.9743;28.261;25.8561 121900;67.8779;32.8886;144.397;119.939;34.0421;27.5768;26.1762 134980;73.2615;35.4748;215.428;150.062;35.1048;27.9387;26.3445 149370;69.5914;39.4097;166.049;132.759;35.3813;28.1609;26.5035 165195;70.0795;36.9757;176.128;125.824;35.4784;28.0991;27.1848 182605;70.678;39.4236;184.899;132.561;34.7531;27.9232;26.7449 201755;69.804;41.916;189.486;147.688;34.5916;28.4368;26.4968 222815;70.9049;43.3167;197.706;151.125;35.4652;28.6605;27.8671 245985;70.03;53.1854;201.855;164.847;35.7549;28.4426;27.1877 271470;69.6931;46.2775;209.332;179.249;36.2631;29.0661;28.4608 299505;70.9044;47.3569;212.374;194.289;36.024;29.2358;27.8824 330340;70.5898;47.0813;211.55;198.904;37.2955;29.1048;28.3034 364260;69.3568;48.0286;216.136;213.392;36.4719;29.6492;28.947 401570;70.0784;55.6783;217.553;216.374;36.6151;29.8494;28.6781 442610;70.2452;55.8915;222.709;222.249;36.4878;29.7408;28.192 487755;70.0358;48.7904;222.94;229.27;36.3237;30.3672;28.6129 537415;70.3103;50.0021;221.074;231.909;37.1915;30.0791;28.471 592040;70.1439;55.2571;223.077;234.858;37.6925;29.8299;28.4722 652125;70.8862;51.2554;230.277;239.505;37.3451;30.2505;28.1771 718220;70.4251;49.5872;233.219;239.863;36.5727;30.2672;28.6706 790920;69.7149;55.3772;230.926;242.922;37.2833;30.2138;28.9888 870895;69.4214;51.0372;235.172;251.45;36.9112;30.0106;29.2397 958865;70.4314;51.3184;235.628;252.243;37.3236;30.2469;28.4398 1055630;70.0297;52.1747;240.768;253.405;36.8073;30.0849;28.7678 1162075;71.0985;50.314;247.533;259.274;37.3846;30.3551;28.422 1279160;71.061;55.4783;243.327;257.581;37.0938;29.5389;29.1207 1407955;70.1601;50.6091;247.387;264.074;37.0902;29.8258;28.5657 1549630;70.4848;50.497;246.17;264.641;36.97;30.0207;28.4441 1705470;69.8107;50.983;247.042;261.625;36.5883;30.2686;28.5624 1876895;69.9785;51.5235;252.602;266.711;36.9832;29.9705;29.0799 2065465;69.4235;51.4422;256.153;265.141;37.3256;29.9859;29.0338 2272885;69.8385;51.4589;250.552;266.472;37.3823;30.3282;28.4389 2501050;69.9755;55.2984;255.818;272.43;36.9822;29.7997;28.9513 2752030;69.9469;54.2768;256.759;266.995;37.3053;30.2714;28.5217 3028110;70.3192;50.6052;261.753;272.133;37.5515;30.1139;28.9316 3331795;70.5005;49.9084;266.412;277.529;37.3315;30.3557;29.0938 3665850;70.0603;56.1319;271.833;276.466;37.199;29.6312;29.383 4033310;70.3551;51.7834;270.162;276.077;37.2656;29.772;28.5666 4437515;70.1824;50.7757;278.203;278.845;36.9247;29.4452;28.7805 4882140;70.5786;49.2731;281.148;283.854;37.1183;30.1079;28.9781 5371225;70.0242;49.7885;285.602;287.699;37.0402;30.4146;28.411 5909220;69.9687;51.3314;284.474;282.396;36.4418;29.6589;28.3406 6501010;70.3126;55.6867;297.482;284.977;36.9394;30.0999;28.9472 7151985;69.8237;50.2656;295.983;286.696;36.7772;29.8695;28.7694 7868050;69.6268;55.6824;303.074;291.302;37.4062;30.572;29.7347 8655725;69.5738;50.8151;310.437;300.738;37.2501;30.0555;28.7516 9522170;69.8031;56.4064;317.517;301.382;36.8001;30.8586;28.9918 10475255;69.6152;50.142;331.536;302.847;37.3644;30.9883;29.0113 So, shuffled ptr_vector and shuffled base_collection seem to perform equally bad; I can't see any significant pattern here, which leads me to conclude shuffling destroys any cache friendliness elements might have due to the fact they come from a boost::base_collection. Joaquín M López Muñoz
Den 15-05-2017 kl. 19:01 skrev Joaquin M López Muñoz via Boost:
El 15/05/2017 a las 16:13, Thorsten Ottosen via Boost escribió:
D. Implementation of equality: do we need to be set like semantics
My bad. Your code does what the docs say. You have this
// TODO: can we do better than two passes?
By that, do you mean that the first pass is the size() comparison?
OK, now I get it. Yes. the size() comparison does a first pass of both containers' internal std::unordered_maps, and I wondered whether we can come up with something smarter that avoids that.
Hm. Yeah, I guess we have at least three options: a) current version b) omit check to "global" size() and let segment comparison deal with that c) store the size in poly_collection I think c) is not that attractive due to all the code to provide consistency. Whether a) is better than b) I guess will depend on the on the actual runtime data. I speculate that your current version is the best.
If one ever puts base_collection's into a container, you are in for a huge penalty when the container operation falls back on copying.
Fallback copying cannot occur to the best of my knowledge:
[container.requirements.general]/Notes:
"Those entries marked “(Note A)” or “(Note B)” have linear complexity for array and have constant complexity for all other standard containers."
That is true, but as soon you hit code that calls std::move_if_noexcept, http://en.cppreference.com/w/cpp/utility/move_if_noexcept those move operations that are not marked/deduced noexcept will not be in effect.
As for noexcept in std unordered associative containers, I don't have a clue why move construction is not marked conditionally noexcept the way move assignement is... given this messy state I'd prefer to rely on =default before committing to any noexcept guarantees.
Sounds reasonable. As I stated, it appears to me that you will take advantage of noexcept if it exists in the standard container.
I did that with this testing shuffled_base_collection class template:
template<typename Base> struct shuffled_base_collection { template<typename T> void insert(const T& x) { bc.insert(x); }
template<typename F> void for_each(F f) { std::for_each(v.begin(),v.end(),[&](Base* x){f(*x);}); }
void prepare_for_for_each() { std::for_each(bc.begin(),bc.end(),[&](Base& x){v.push_back(&x);}); std::shuffle(v.begin(),v.end(),std::mt19937(1)); }
private: boost::base_collection<Base> bc; std::vector<Base*> v; };
(Complete performance program attached for your experimentation). Results for Visual C++ 2015 in 32-bit (x86) release mode, Windows 7 64-bit, Intel Core i5-2520M @2.5GHz, shuffled_base_collection is the fourth column:
So, shuffled ptr_vector and shuffled base_collection seem to perform equally bad; I can't see any significant pattern here, which leads me to conclude shuffling destroys any cache friendliness elements might have due to the fact they come from a boost::base_collection.
Thanks for quick update. It wasn't clear to me if your time includes the time to insert and prepare_for_for_each. If so, it would be fair to call v.reserve( bc.size() ); in prepare_for_for_each. Anyway, it is a bit surprising. Perhaps modern allocators are good at allocating same size objects closely and without much overhead ... -Thorsten
El 15/05/2017 a las 19:46, Thorsten Ottosen via Boost escribió:
Den 15-05-2017 kl. 19:01 skrev Joaquin M López Muñoz via Boost:
El 15/05/2017 a las 16:13, Thorsten Ottosen via Boost escribió:
> D. Implementation of equality: do we need to be set like semantics [...] OK, now I get it. Yes. the size() comparison does a first pass of both containers' internal std::unordered_maps, and I wondered whether we can come up with something smarter that avoids that.
Hm. Yeah, I guess we have at least three options:
a) current version
b) omit check to "global" size() and let segment comparison deal with that
c) store the size in poly_collection
I think c) is not that attractive due to all the code to provide consistency. Whether a) is better than b) I guess will depend on the on the actual runtime data. I speculate that your current version is the best.
I think this variation of a) might perform better (code untested): template<typename Model,typename Allocator> bool operator==( const poly_collection<Model,Allocator>& x, const poly_collection<Model,Allocator>& y) { typename poly_collection<Model,Allocator>::size_type s=0; const auto &mapx=x.map,&mapy=y.map; for(const auto& p:mapx){ s+=p.second.size(); auto it=mapy.find(p.first); if(it==mapy.end()?!p.second.empty():p.second!=it->second)return false; } if(s!=mapy.size())return false; return true; }
[...]
So, shuffled ptr_vector and shuffled base_collection seem to perform equally bad; I can't see any significant pattern here, which leads me to conclude shuffling destroys any cache friendliness elements might have due to the fact they come from a boost::base_collection.
Thanks for quick update. It wasn't clear to me if your time includes the time to insert and prepare_for_for_each. If so, it would be fair to call v.reserve( bc.size() ); in prepare_for_for_each.
Timing does not include insert() or prepare_for_for_each().
Anyway, it is a bit surprising. Perhaps modern allocators are good at allocating same size objects closely and without much overhead ...
I think the similarity in performance between shuffled ptr_vector and shuffled base_collectin goes the other way around: once sequentiality is destroyed, it doesn't matter much whether elements lie relativley close to each other in main memory. Joaquín M López Muñoz
Den 15-05-2017 kl. 21:21 skrev Joaquin M López Muñoz via Boost:
El 15/05/2017 a las 19:46, Thorsten Ottosen via Boost escribió:
I think this variation of a) might perform better (code untested):
template<typename Model,typename Allocator> bool operator==( const poly_collection<Model,Allocator>& x, const poly_collection<Model,Allocator>& y) { typename poly_collection<Model,Allocator>::size_type s=0; const auto &mapx=x.map,&mapy=y.map; for(const auto& p:mapx){ s+=p.second.size(); auto it=mapy.find(p.first); if(it==mapy.end()?!p.second.empty():p.second!=it->second)return false; } if(s!=mapy.size())return false; return true; }
Yeah, could be.
Anyway, it is a bit surprising. Perhaps modern allocators are good at allocating same size objects closely and without much overhead ...
I think the similarity in performance between shuffled ptr_vector and shuffled base_collectin goes the other way around: once sequentiality is destroyed, it doesn't matter much whether elements lie relativley close to each other in main memory.
Ok. I guess (and there is no hurry) it will also be interesting to see for 0-1000 elements. I know its nice to have a graph that grows as a function of n, but I think the best thing would be make each data point be based on the same number of iterations. So I would use an exponential growth for n, n = 8, 16, 32 ... max_n and then run each loop x times, x being max_n / n. kind regards Thorsten
El 16/05/2017 a las 11:11, Thorsten Ottosen via Boost escribió:
Den 15-05-2017 kl. 21:21 skrev Joaquin M López Muñoz via Boost:
I think the similarity in performance between shuffled ptr_vector and shuffled base_collectin goes the other way around: once sequentiality is destroyed, it doesn't matter much whether elements lie relativley close to each other in main memory.
Ok. I guess (and there is no hurry) it will also be interesting to see for 0-1000 elements.
In my todo list.
I know its nice to have a graph that grows as a function of n, but I think the best thing would be make each data point be based on the same number of iterations. So I would use an exponential growth for n, n = 8, 16, 32 ... max_n and then run each loop x times, x being max_n / n.
Not sure why this is better than having homogeneous units all across the plot (namely nanoseconds/element). In any case the testing utilities sort of do the loop repetition the way you suggest, at least for small values of n: measure_start=high_resolution_clock::now(); do{ res=f(); ++runs; t2=high_resolution_clock::now(); }while(t2-measure_start<min_time_per_trial); trials[i]=duration_cast<duration<double>>(t2-measure_start).count()/runs; This runs the loop as many times as needed to at least occupy a min_time_per_trial slot (200 ms) so as to avoid measuring individual executions that are too small for the high_resolution_clock granularity. Then, of course, resulting times are normalized back to ns/element. Joaquín M López Muñoz
Den 15-05-2017 kl. 21:21 skrev Joaquin M López Muñoz via Boost:
I think this variation of a) might perform better (code untested):
template<typename Model,typename Allocator> bool operator==( const poly_collection<Model,Allocator>& x, const poly_collection<Model,Allocator>& y) { typename poly_collection<Model,Allocator>::size_type s=0; const auto &mapx=x.map,&mapy=y.map; for(const auto& p:mapx){ s+=p.second.size(); auto it=mapy.find(p.first); if(it==mapy.end()?!p.second.empty():p.second!=it->second)return false; } if(s!=mapy.size())return false; return true; }
I not sure about this version. mapy.size() should be y.size() ?!? Or do you need sx and sy? But also, once all the segments have been compared in the loop, you know the x.size() == y.size() ... -Thorsten
El 16/05/2017 a las 16:20, Thorsten Ottosen via Boost escribió:
Den 15-05-2017 kl. 21:21 skrev Joaquin M López Muñoz via Boost:
I think this variation of a) might perform better (code untested):
template<typename Model,typename Allocator> bool operator==( const poly_collection<Model,Allocator>& x, const poly_collection<Model,Allocator>& y) { typename poly_collection<Model,Allocator>::size_type s=0; const auto &mapx=x.map,&mapy=y.map; for(const auto& p:mapx){ s+=p.second.size(); auto it=mapy.find(p.first); if(it==mapy.end()?!p.second.empty():p.second!=it->second)return false; } if(s!=mapy.size())return false; return true; }
I not sure about this version. mapy.size() should be y.size() ?!?
My typo, it should read as you say: template<typename Model,typename Allocator> bool operator==( const poly_collection<Model,Allocator>& x, const poly_collection<Model,Allocator>& y) { typename poly_collection<Model,Allocator>::size_type s=0; const auto &mapx=x.map,&mapy=y.map; for(const auto& p:mapx){ s+=p.second.size(); auto it=mapy.find(p.first); if(it==mapy.end()?!p.second.empty():p.second!=it->second)return false; } if(s!=y.size())return false; return true; }
Or do you need sx and sy? But also, once all the segments have been compared in the loop, you know the x.size() == y.size() ...
Not so: consider the case where y has a segment (let's call it seg) which is not present in x; in this case the for loop would pass succesfully and seg wouldn't have been hit, and we'd thus erroneously return true. That's why we need the extra check on size. In other words, we can't determine equality by only going through x's segments. Joaquín M López Muñoz
Den 16-05-2017 kl. 19:41 skrev Joaquin M López Muñoz via Boost:
El 16/05/2017 a las 16:20, Thorsten Ottosen via Boost escribió:
Or do you need sx and sy? But also, once all the segments have been compared in the loop, you know the x.size() == y.size() ...
Not so: consider the case where y has a segment (let's call it seg) which is not present in x; in this case the for loop would pass succesfully and seg wouldn't have been hit, and we'd thus erroneously return true. That's why we need the extra check on size. In other words, we can't determine equality by only going through x's segments.
right.
On 15/05/2017 19:46, Thorsten Ottosen via Boost wrote:
As for noexcept in std unordered associative containers, I don't have a clue why move construction is not marked conditionally noexcept the way move assignement is... given this messy state I'd prefer to rely on =default before committing to any noexcept guarantees.
Sounds reasonable. As I stated, it appears to me that you will take advantage of noexcept if it exists in the standard container.
It might be related to sentinel nodes. They are used in lists and ordered associative containers from Dikum STL from early versions. Sentinel nodes can't be transferred if source container invariants must be preserved. Reviewing Dinkum STL code it seems that std::unordered_xxx uses std::list internally, so sentinel nodes are allocated when move constructing an unordered_xxx and thus move constructor can't be noexcept. Ion
Den 15-05-2017 kl. 21:29 skrev Ion Gaztañaga via Boost:
Reviewing Dinkum STL code it seems that std::unordered_xxx uses std::list internally, so sentinel nodes are allocated when move constructing an unordered_xxx and thus move constructor can't be noexcept.
yikes -T
On Tue, May 2, 2017 at 4:46 PM, Ion Gaztañaga via Boost <boost@lists.boost.org> wrote:
The library can be found here: ... https://github.com/joaquintides/poly_collection
The documentation looks beautiful and I was excited by the positive review comments so I thought I would have a look. Unfortunately, I ran into considerable difficulty trying to actually build the example and poke around. There are no instructions on how to build that I could find in the repository or the documentation. First I tried cloning the repository and building: $ cd ~/src $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2 Produced this output: C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:111: in load-parent from module project error: Could not find parent for project at '.' error: Did not find Jamfile.jam or Jamroot.jam in any parent directory. C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:464: in initialize from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:306: in load-jamfile from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:64: in load from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:145: in project.find from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src\build-system.jam:535: in load from module build-system C:\Users\vinnie\lib\boost_1_64_0\tools\build\src/kernel\modules.jam:295: in import from module modules C:\Users\vinnie\lib\boost_1_64_0\tools\build\src/kernel/bootstrap.jam:139: in boost-build from module C:\Users\vinnie\lib\boost_1_64_0\boost-build.jam:17: in module scope from module Now I remember some jibber jabber about putting projects in the live Boost directory (a practice which I am not fond of). In the interest of moving forward I tried it. $ cd ~/lib/boost_1_64_0 $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2 It got a little farther this time but still does not build: Performing configuration checks - 32-bit : yes (cached) - arm : no (cached) - mips1 : no (cached) - power : no (cached) - sparc : no (cached) - x86 : yes (cached) ...patience... ...found 1033 targets... ...updating 23 targets... compile-c-c++ ..\..\bin.v2\poly_collection\example\msvc-14.1\debug\threading-multi\algorithms.obj algorithms.cpp algorithms.cpp(12): fatal error C1083: Cannot open include file: 'boost/poly_collection/algorithm.hpp': No such file or directory Did I mention I am using Microsoft Visual Studio on Windows? Full disclosure, I have had this problem twice before when trying to review libraries. In the past I have simply given up. But I want to contribute something back since my own library will be reviewed soon. Regards
On 5/10/2017 5:01 PM, Vinnie Falco via Boost wrote:
On Tue, May 2, 2017 at 4:46 PM, Ion Gaztañaga via Boost <boost@lists.boost.org> wrote:
The library can be found here: ... https://github.com/joaquintides/poly_collection
The documentation looks beautiful and I was excited by the positive review comments so I thought I would have a look. Unfortunately, I ran into considerable difficulty trying to actually build the example and poke around. There are no instructions on how to build that I could find in the repository or the documentation.
First I tried cloning the repository and building:
$ cd ~/src $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2
Produced this output: C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:111: in load-parent from module project error: Could not find parent for project at '.' error: Did not find Jamfile.jam or Jamroot.jam in any parent directory. C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:464: in initialize from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:306: in load-jamfile from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:64: in load from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src/build\project.jam:145: in project.find from module project C:/Users/vinnie/lib/boost_1_64_0/tools/build/src\build-system.jam:535: in load from module build-system C:\Users\vinnie\lib\boost_1_64_0\tools\build\src/kernel\modules.jam:295: in import from module modules C:\Users\vinnie\lib\boost_1_64_0\tools\build\src/kernel/bootstrap.jam:139: in boost-build from module C:\Users\vinnie\lib\boost_1_64_0\boost-build.jam:17: in module scope from module
Now I remember some jibber jabber about putting projects in the live Boost directory (a practice which I am not fond of). In the interest of moving forward I tried it.
$ cd ~/lib/boost_1_64_0 $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2
It got a little farther this time but still does not build:
Performing configuration checks
- 32-bit : yes (cached) - arm : no (cached) - mips1 : no (cached) - power : no (cached) - sparc : no (cached) - x86 : yes (cached) ...patience... ...found 1033 targets... ...updating 23 targets... compile-c-c++ ..\..\bin.v2\poly_collection\example\msvc-14.1\debug\threading-multi\algorithms.obj algorithms.cpp algorithms.cpp(12): fatal error C1083: Cannot open include file: 'boost/poly_collection/algorithm.hpp': No such file or directory
1) Clone beneath the 'your_boost_tree_root/libs' directory 2) run 'b2 headers' from 'your_boost_tree_root' after you clone
Did I mention I am using Microsoft Visual Studio on Windows?
Full disclosure, I have had this problem twice before when trying to review libraries. In the past I have simply given up. But I want to contribute something back since my own library will be reviewed soon.
Regards
On 5/10/17 2:46 PM, Edward Diener via Boost wrote: algorithms.cpp
algorithms.cpp(12): fatal error C1083: Cannot open include file: 'boost/poly_collection/algorithm.hpp': No such file or directory
1) Clone beneath the 'your_boost_tree_root/libs' directory 2) run 'b2 headers' from 'your_boost_tree_root' after you clone
Did I mention I am using Microsoft Visual Studio on Windows?
Full disclosure, I have had this problem twice before when trying to review libraries. In the past I have simply given up. But I want to contribute something back since my own library will be reviewed soon.
This is a common situation. The boost directory structure doesn't play nice with new libraries which are not already part of boost. When I put the safe numerics library into the incubator, I used path names for includes so that one would not have to do the above procedure for the new library. This permitted one to build and run the tests and examples, but of course created consternation of another sort. Robert Ramey
AMDG On 05/10/2017 03:46 PM, Edward Diener via Boost wrote:
On 5/10/2017 5:01 PM, Vinnie Falco via Boost wrote:
Now I remember some jibber jabber about putting projects in the live Boost directory (a practice which I am not fond of). In the interest of moving forward I tried it.
$ cd ~/lib/boost_1_64_0 $ git clone git@github.com:joaquintides/poly_collection $ cd poly_collection/example $ b2
<snip> algorithms.cpp algorithms.cpp(12): fatal error C1083: Cannot open include file: 'boost/poly_collection/algorithm.hpp': No such file or directory
1) Clone beneath the 'your_boost_tree_root/libs' directory
Right. It needs to go under libs/ because that's where it will ultimately be put if it is accepted.
2) run 'b2 headers' from 'your_boost_tree_root' after you clone
This is supposed to happen automatically, and it does work for me with boost develop. In Christ, Steven Watanabe
On Wed, May 10, 2017 at 2:46 PM, Edward Diener via Boost <boost@lists.boost.org> wrote:
1) Clone beneath the 'your_boost_tree_root/libs' directory 2) run 'b2 headers' from 'your_boost_tree_root' after you clone
Okay, this worked and I was off to the races! - What is your evaluation of the design? It seems pretty straightforward. Given an explanation of the problem the design of the collection feels natural and right. Most of the container interface is standard boilerplate and so was immediately familiar. - What is your evaluation of the implementation? I have not looked at the implementation, only the public interfaces. - What is your evaluation of the documentation? I was able to find everything I looked for. The diagrams look great and really helped me understand the nature of the problem and how the library solves it. The author makes container performance claims which are then backed up with benchmarks across a variety of compilers and systems. My preference is for the style of documentation where exposition precedes each member declaration (e.g. http://www.boost.org/doc/libs/1_64_0/doc/html/boost_asio/reference/io_servic...) but the presented style is also workable. - What is your evaluation of the potential usefulness of the library? These containers look quite useful, and I can think of at least a couple of instances where I have rolled my own inferior solutions to accomplish the same result. - Did you try to use the library? With what compiler? Did you have any problems? I had some trouble getting going using the Microsoft Visual C++ toolchain on Windows, but this had to do more with how the repository needs to be cloned into a subtree of a boost library installation rather than the proposed library itself. I built the example using the compiler version 14.10.25017. Then I added my own code to one of the example sources to get a feel for how the library worked. I had no trouble at all writing correct code. My attempts to abuse the library APIs were met with resistance from the compiler in the form of compilation errors. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? I spent about an hour on it. I exercised mostly the boost::base_collection container in my own program. I expect the containers to work similarly. I have not used boost::any so I can't comment on the any_collection. I focused on the "interesting" member functions, the ones which operate differently from their standard container counterparts. I assume the non-interesting functions act with predictable results. - Are you knowledgeable about the problem domain? Yes. I have implemented many containers with conforming allocator support of all varieties. I have used Boost.Intrusive heavily to create complex containers and I have dabbled with some Boost.MultiIndex. - Do you think the library should be accepted as a Boost library? This library is tightly focused on solving a specific, worthwhile problem and does so in a way that feels natural and "boosty." So YES. - Questions Iterators returned by begin<T>() cannot be compared to iterators returned by end(typeid(T)). Is this an oversight or a consequence of the design? It seems to me they should be comparable. - Other comments: Iterators returned by begin<T>() produce a long, ugly compile error when compared against iterators returned by begin<U>() or end<U>(). I think the library could do better by producing a static_assert with a helpful message. Its not a showstopper but a suggestion. I echo the comments from other reviewers, a range<T>() function to allow range-for iteration of a segment should be added at the earliest convenience. I wouldn't hold up acceptance for it. Thank you to the author for putting together a polished library!
El 11/05/2017 a las 4:46, Vinnie Falco via Boost escribió:
[...] - Do you think the library should be accepted as a Boost library?
This library is tightly focused on solving a specific, worthwhile problem and does so in a way that feels natural and "boosty." So YES.
Thank you!
- Questions
Iterators returned by begin<T>() cannot be compared to iterators returned by end(typeid(T)). Is this an oversight or a consequence of the design? It seems to me they should be comparable.
You need to do an explicit cast (either direction) first, so it follows comparison can't be done directly. I preferred to require expilcit conversion between the two types of local iterator as it seemed to me safer from the user's code point of view.
- Other comments:
Iterators returned by begin<T>() produce a long, ugly compile error when compared against iterators returned by begin<U>() or end<U>(). I think the library could do better by producing a static_assert with a helpful message. Its not a showstopper but a suggestion.
Will study the issue. I'm not sure I can improve the dagnostics but will take a look. Thank you for your review, Joaquín M López Muñoz
On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
Hi everyone,
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today.
I'd like to encourage your participation as the proposed library is small and focused, and reviewers don't need to be domain experts to appreciate the potential usefulness of the library and propose improvements.
PolyCollection implements fast containers of polymorphic objects. Typically, polymorphic objects cannot be stored directly in regular containers and need be accessed through an indirection pointer, which introduces performance problems related to CPU caching and branch prediction. Boost.PolyCollection implements a novel data structure that is able to contiguously store polymorphic objects without such indirection, thus providing a value-semantics user interface and better performance. Three polymorphic collections are provided:
* boost::base_collection * boost::function_collection * boost::any_collection
dealing respectively with classic base/derived or OOP polymorphism, function wrapping in the spirit of std::function and so-called duck typing as implemented by Boost.TypeErasure.
The library can be found here:
Incubator: http://blincubator.com/bi_library/polycollection/?gform_post_id=1643
Github: https://github.com/joaquintides/poly_collection
and the documentation here:
http://rawgit.com/joaquintides/poly_collection/website/doc/html/index.html
Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review:
This is my review of PolyCollection,
- What is your evaluation of the design?
I have no problem with the design as it succinctly covers the three areas to which poly collections apply. These areas should be enough for using any types with polycollections.
- What is your evaluation of the implementation?
I did not look at it.
- What is your evaluation of the documentation?
I have mentioned previously that I think the documentation should be more specific about the types being used for the boost::function_collection. While an example is good, as in the case of the tutorial, and while a reference is good, as in the case of the documentation for the insert member function, I found the former to be too singular and special an example while I found the latter to be highly technical and therefore hard to understand in general terms. A better addition is simply to document in an explanation of ideas and concepts exactly what types the boost::function_collection entails. I assume it is any type which can be passed to std::function, but I do not know if it can be a std::function object itself. Saying it is "Function wrapping in the spirit of std::function" does not explain it adequately to me. Also since std::function ( or boost::function if you will ) represents any callable it would be nice to understand why the programmer would want to use boost::function_collection versus a collection of std::function objects.
- What is your evaluation of the potential usefulness of the library?
I see the library as being useful as an optimization over present techniques in both speed and size terms. In effect it offers a choice for end-users over normal C++ collections.
- Did you try to use the library? With what compiler? Did you have any problems?
I tried it with gcc-7.1 in c++11 mode and it worked well.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Mostly a slow reading and an attempt to understand why the library should be used and how to use it.
- Are you knowledgeable about the problem domain?
Somewhat knowledgeable, but not an expert in the type of optimizations which polycollection use.
And most importantly:
- Do you think the library should be accepted as a Boost library?
I think the library should be accepted based on the fact that it offers an alternative to normal collections which can prove valuable to end-users, and that it is a polished library with few surprises. With that said I think the library should document better the advantages and practical situations that make polycollection superior to normal C++ collections, else an end-user will be puzzled about for what situations it will be useful.
For more information about Boost Formal Review Process, see: http://www.boost.org/community/reviews.html
Waiting your reviews!
Ion
El 14/05/2017 a las 22:33, Edward Diener via Boost escribió:
On 5/2/2017 7:46 PM, Ion Gaztañaga via Boost wrote:
Hi everyone,
The formal review of Joaquín M. López Muñoz's PolyCollection library starts today. [...]
This is my review of PolyCollection,
Thank you for your contribution!
[...]
- What is your evaluation of the documentation?
I have mentioned previously that I think the documentation should be more specific about the types being used for the boost::function_collection. While an example is good, as in the case of the tutorial, and while a reference is good, as in the case of the documentation for the insert member function, I found the former to be too singular and special an example while I found the latter to be highly technical and therefore hard to understand in general terms. A better addition is simply to document in an explanation of ideas and concepts exactly what types the boost::function_collection entails. I assume it is any type which can be passed to std::function, but I do not know if it can be a std::function object itself. Saying it is "Function wrapping in the spirit of std::function" does not explain it adequately to me. Also since std::function ( or boost::function if you will ) represents any callable it would be nice to understand why the programmer would want to use boost::function_collection versus a collection of std::function objects.
Points taken, as we have already discussed them in the past days. I'll try to improve docs so as to make these aspects clearer. (When you insert a std::function<Signature> object x, into a function_collection<Signature>, it is x that gets copied/stored rather than its underlying callable entity. This is so because std::function lacks functionalty to recover the wrapped callable entity without statically knowing its type, observe std::function::target does not type erase: http://en.cppreference.com/w/cpp/utility/functional/function/target . Incidentally, this is the main reason why function_collection<Signature>::value type is *not* a std::function, but an internall lookalike with the required extra functionality.)
[...]
And most importantly:
- Do you think the library should be accepted as a Boost library?
I think the library should be accepted based on the fact that it offers an alternative to normal collections which can prove valuable to end-users, and that it is a polished library with few surprises. With that said I think the library should document better the advantages and practical situations that make polycollection superior to normal C++ collections, else an end-user will be puzzled about for what situations it will be useful.
Thank you, Joaquín M López Muñoz
Please post your comments and review to the boost mailing list (preferably), the Boost Library Incubator. or privately to the Review Manager (to me ;-). Here are some questions you might want to answer in your review:
- What is your evaluation of the design?
The overall design seems sound. The library identifies a clear need and seems to provide a carefully crafted solution.
- What is your evaluation of the implementation?
I did not look into the implementation in detail.
- What is your evaluation of the documentation?
The documentation seemed amazingly clear, even though I am not a game developer and do not relate daily to warriors and goblins. The common thread in the documentation and the clarity of progression of the tutorial was great. I did not review the reference documentation, though. In part, it seems unnecessary because the basic concepts seemed so well thought out and clearly described. I appreciate that there may be subtleties to bits of the API, but I will leave it to those more familiar with the details of implementation to improve.
- What is your evaluation of the potential usefulness of the library?
I feel this library would be very useful and can see using it in my own code. For example, right now I am working with a hierarchy of objects that must be stored in a container; there are lots of objects but relatively few classes in the hierarchy. This seems perfect for the base_collection container.
- Did you try to use the library? With what compiler? Did you have any problems?
I did use the library. In particular, I compiled all of the examples and built a small program with a class hierarchy of my own to test my understanding of the base_collection container. This was done with Apple LLVM version 8.0.0 (clang-800.0.42.1). The only problem was that the following test cases needed -std=c++14 (rather than c++11) because std::make_unique is not defined for c++11 on that compiler: algorithms, basic_function, and segmented_structure.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I have read through most of the documentation (basically all of the tutorial) and have been following much of the discussion during the review.
- Are you knowledgeable about the problem domain?
I am not particularly knowledgeable about writing optimized collections such as this, but I am knowledgeable from a user perspective on using them and on some of the optimizations that would be useful.
- Do you think the library should be accepted as a Boost library?
Yes, I feel this would be a very useful addition to Boost and would likely use it in my code. Cheers, Brook
El 17/05/2017 a las 17:12, Brook Milligan via Boost escribió:
[...]
- What is your evaluation of the documentation? The documentation seemed amazingly clear, even though I am not a game developer and do not relate daily to warriors and goblins. [...]
Neither do I. I suspect real game developers will smile at the naïveté of the example, though.
[...] - Did you try to use the library? With what compiler? Did you have any problems? I did use the library. In particular, I compiled all of the examples and built a small program with a class hierarchy of my own to test my understanding of the base_collection container. This was done with Apple LLVM version 8.0.0 (clang-800.0.42.1). The only problem was that the following test cases needed -std=c++14 (rather than c++11) because std::make_unique is not defined for c++11 on that compiler: algorithms, basic_function, and segmented_structure.
Thanks for noticing this, will fix in the corresponding Jamvile.v2. By the way, algorithms.cpp need C++14 not because of make_unique (which is not used there) but for generic lambdas such as: https://github.com/joaquintides/poly_collection/blob/master/example/algorith... FWIW, algorithms is already marked as needing C++14: https://github.com/joaquintides/poly_collection/blob/master/example/Jamfile.... so I guess you didn't use b2 for building for went directly with the compiler command line, right?
[...]
- Do you think the library should be accepted as a Boost library? Yes, I feel this would be a very useful addition to Boost and would likely use it in my code.
Thank you! Joaquín M López Muñoz
On May 17, 2017, at 12:12 PM, Joaquin M López Muñoz via Boost <boost@lists.boost.org> wrote:
Thanks for noticing this, will fix in the corresponding Jamvile.v2. By the way, algorithms.cpp need C++14 not because of make_unique (which is not used there) but for generic lambdas such as:
https://github.com/joaquintides/poly_collection/blob/master/example/algorith...
Yes, I forgot the details at the time, but this reminds me that generic lambdas were exactly the problem.
FWIW, algorithms is already marked as needing C++14:
https://github.com/joaquintides/poly_collection/blob/master/example/Jamfile....
so I guess you didn't use b2 for building for went directly with the compiler command line, right?
Yes, that is correct. I personally, don’t use b2 for building anything except boost itself. It seems too far afield from “normal” build environments to be useful in my own work. I was not really pointing this out as a problem with the library, but rather that I thought there was some mention of it being a c++11 library and then I found I need to use c++14 at some points. I suppose that is entirely due to the example code, though, and not the library itself. Perhaps a mention of the need for c++14 in some examples? None of this is a big deal, though. The bottom line is it seems to be a great piece of work. Cheers, Brook
El 17/05/2017 a las 21:37, Brook Milligan escribió:
[...] I was not really pointing this out as a problem with the library, but rather that I thought there was some mention of it being a c++11 library and then I found I need to use c++14 at some points. I suppose that is entirely due to the example code, though, and not the library itself.
Correct, the library itself is strictly C++11. I decided to use C++14 at the examples when convenient so as not to divert readers' attention with workarounds etc; this is particularly the case with generic lambdas.
Perhaps a mention of the need for c++14 in some examples?
Yes, I can do that. Joaquín M López Muñoz
AMDG I vote to ACCEPT PolyCollection into Boost. Details: General: Missing .gitattributes index.html: No comments. an_efficient_polymorphic_data_st.html: "interspersedly" This word sounds a bit awkward to me. "the information is typically available at compile time in the point of insertion in the vector" I would say "at the point" instead of "in the point" tutorial.html: "Imagine we are developing a role game in C++" Do you mean "role playing game"? For choosing which type to insert, std::discrete_distribution is slightly more appropriate than std::uniform_real_distribution. "...by the user code --we will see more about this later" This should be an em-dash. "Insertion mechanisms are rather..." -> "The insertion mechanisms are rather..." "to explicitly forbid" split infinitive "...it is possible to only address the elements of a given segment ..." Address here is a bit confusing as "address" has a specific meaning in programming. Access is probably a better term to use. reference.html: "...in the same order, and viceversa." s/viceversa/vice versa/ "cc.empty(index)... Throws: If the indicated type is not registered. " Does this really make sense? I think it would be more useful to treat non-registered segments as empty. Throwing here is somewhat inconsistent with operator== which considers non-registered segments to be equivalent to empty segments. "cc.max_size() REVIEW THIS DESIGN DECISION" "the minimum return value of max_size for each of the segments of the collection" I think that the fundamental invariant of max_size should be that size() <= max_size(). Both max_size and capacity are a bit strange. My inclination would be to remove capacity entirely (or rename it) and change max_size to mean the maximum possible size of the whole container. Class template base_collection "subobject(x) = static_cast<Derived&>(x) with typeid(x)==typeid(Derived)" static_cast excludes virtual inhertance, which I presume is unsupported. algorithm.hpp: 61: segment_split: ADL (Contrived test case in test_adl1.cpp) any_collection.hpp: "See http://www.boost.org/libs/any_collection for library home page." s/any_collection/poly_collection/ operator!= should be defined in the same way as operator==. Overload resolution is subtly different for a non-template friend function defined inline vs. a regular free function template. This is especially problematic because any_collection_fwd.hpp declares a different operator== which is not implemented. any_collection_fwd.hpp: http://www.boost.org/libs/any_collection again. base_collection/function_collection have identical issues. exception.hpp: No comments. detail/any_iterator.hpp: line 74: explicit operator Concrete*()const noexcept This really should have an assert that the type is correct. detail/any_model.hpp: line 53: !type_erasure::is_subconcept<type_erasure::assignable<>,Concept2>::value Boost.TypeErasure was recently extended to support move assignment (assignable<_self, _self&&>). Should this also be handled here? For the record, I'm planning to fix the constructors and assignment operators of any so that is_constructible and is_assignable, etc. work, but it's quite annoying to implement. line 66, 94, 112, 123: type_erasure::is_subconcept<type_erasure::typeid_<>,Concept>::value This isn't quite correct. You need to check typeid_<remove_cv_t<remove_reference_t<T>>> Also it looks like this code is trying to allow an any that doesn't use typeid_ to be stored in a segment of anys (at least that's the impression that I get from the definition of subobject(x) for any_collection in reference.html). Trying to do this will error out in split_segment.hpp:303 in build_index when constructing the value_type. auto_iterator.hpp: No comments. base_model.hpp: line 45: static void* subaddress(T& x){return boost::addressof(x);} This assumes that the address of the base is the address of the whole object which is wrong and can fail in the case of multiple inheritance. You need to use dynamic_cast<void*>. (Failing test case included: test_multiple_inheritance.cpp) callable_wrapper.hpp: I'm somewhat curious about the rationale for using callable_wrapper instead of std::function w/ std::ref. Ah, I see: - data() is needed by function_model::subaddress - std::function can waste a lot of memory, depending on the implementation. line 83: return r(std::forward<Args>(args)...); This fails to handle R = void correctly. The return value of the function should be ignored. (Failing test case included: test_void_return.cpp) callable_wrapper_iterator.hpp: function_model.hpp: No comments. functional.hpp: No comments. integer_sequence.hpp: This can't possibly be the best implementation. make_index_sequence<N> and make_index_sequence<M> with N != M generate N+M totally independent instantiations of make_int_seq_impl.
From the source: "...but make_index_sequence is not hard to implement *(if efficiency is not a concern)*" [emphasis added]
is_acceptible.hpp, is_callable.hpp, is_constructible.hpp, is_final.hpp, is_nothrow_eq_comparable.hpp, iterator_impl.hpp, iterator_traits.hpp, packed_segment.hpp: No comments. poly_collection.hpp: If poly_collection is defined in namespace detail, that means that your internal functions can be found by ADL for user defined types that mention poly_collection. (I always avoid putting types that are part of the public interface in namespace detail to avoid this problem.) Test case in test_adl2.cpp line 1109: "// TODO: can we do better than two passes?" This comment seems obsolete. segment.hpp, segment_backend.hpp, split_segment.hpp, stride_iterator.hpp, type_restitution.hpp, value_holder.hpp No comments. In Christ, Steven Watanabe
El 18/05/2017 a las 2:59, Steven Watanabe via Boost escribió:
AMDG
I vote to ACCEPT PolyCollection into Boost.
Thank you Steven! Sorry it took me so long to answer your post, got a busy weekend and yours is not a review one can deal with in a hurry.
Details:
General:
Missing .gitattributes
Noted. By the way, I couldn't find this as a requirement in http://www.boost.org/development/requirements.html
[...]
an_efficient_polymorphic_data_st.html:
"interspersedly" This word sounds a bit awkward to me.
Any less awkward synonim? For other readers' convenience, the statement in which the word appears is: "This mechanism is rendered mostly useless when derived1, derived2 and derived3 elements are traversed interspersedly without a definite pattern."
"the information is typically available at compile time in the point of insertion in the vector" I would say "at the point" instead of "in the point"
You're right, noted.
tutorial.html:
"Imagine we are developing a role game in C++" Do you mean "role playing game"?
Yep, will change that.
For choosing which type to insert, std::discrete_distribution is slightly more appropriate than std::uniform_real_distribution.
Your're right, will switch to std::discrete_distribution.
"...by the user code --we will see more about this later" This should be an em-dash.
Noted.
"Insertion mechanisms are rather..." -> "The insertion mechanisms are rather..."
Not being a native English speaker (nor Russian either, who typically have a hard time with definite articles :-) ), "Insertion mechanisms" looks perfectly OK to me, but I can change that anyway.
"to explicitly forbid" split infinitive
At the risk of starting a style war, it is my understanding that split infinitives are okay when used wth caution. Here, the alternative: "to forbid explicitly copying at" seems (to me) less clear as it might be parsed as referring to some sort of explicit copying... By the way, there are other instances of split infinitive in this section, for instance "to only address" (mentioned below).
"...it is possible to only address the elements of a given segment ..." Address here is a bit confusing as "address" has a specific meaning in programming. Access is probably a better term to use.
Will change that.
reference.html:
"...in the same order, and viceversa." s/viceversa/vice versa/
Noted. It is "viceversa" in Spanish, hence the mistake.
"cc.empty(index)... Throws: If the indicated type is not registered. " Does this really make sense? I think it would be more useful to treat non-registered segments as empty. Throwing here is somewhat inconsistent with operator== which considers non-registered segments to be equivalent to empty segments.
Non-registered == empty would lead to stranger situations, IMHO: we can't hold the invariant that empty(index)==(begin(index)==end(index)) as begin/end(index) cannot return anything sensible for non-registered segments. Similarly, max_size, capacity etc. need to forward to the underlying segment to return something, unless we make up the return values for non-registered segments.
"cc.max_size() REVIEW THIS DESIGN DECISION" "the minimum return value of max_size for each of the segments of the collection" I think that the fundamental invariant of max_size should be that size() <= max_size(). Both max_size and capacity are a bit strange. My inclination would be to remove capacity entirely (or rename it) and change max_size to mean the maximum possible size of the whole container.
This is a design area I'm definitely not happy with. Please follow this thread for a related discussion: https://lists.boost.org/Archives/boost/2016/11/231784.php I'm leaning towards dropping (global) max_size/capacity, as you and Thorsten suggest, till we find sensible semantics for them.
Class template base_collection "subobject(x) = static_cast<Derived&>(x) with typeid(x)==typeid(Derived)" static_cast excludes virtual inhertance, which I presume is unsupported.
Ummm.. Hadn't thought about virtual inheritance. At first blush, seems like virtual inheritance would be automatically supported if only we fix subaddress as you suggest below, don't you think? (the "static_cast<Derived&>(x)" you refer to is only shown for explanatory purposes and does not appear anywhere in the actual code, as you probably guessed).
algorithm.hpp:
61: segment_split: ADL (Contrived test case in test_adl1.cpp)
Noted. Please enlighten me: is it OK to qualify the offending call as detail::segment_split, or should I fully qualify as ::boost::poly_collection::detail::segment_split?
any_collection.hpp:
"See http://www.boost.org/libs/any_collection for library home page." s/any_collection/poly_collection/
Right, will change it.
operator!= should be defined in the same way as operator==. Overload resolution is subtly different for a non-template friend function defined inline vs. a regular free function template. This is especially problematic because any_collection_fwd.hpp declares a different operator== which is not implemented.
You're absolutely right, will fix it.
any_collection_fwd.hpp:
http://www.boost.org/libs/any_collection again.
base_collection/function_collection have identical issues.
Noted again.
[...]
detail/any_iterator.hpp:
line 74: explicit operator Concrete*()const noexcept This really should have an assert that the type is correct.
I don't see the need as any_iterator is a detail class not directly exposed to user code.
detail/any_model.hpp:
line 53: !type_erasure::is_subconcept<type_erasure::assignable<>,Concept2>::value Boost.TypeErasure was recently extended to support move assignment (assignable<_self, _self&&>). Should this also be handled here?
Right, in fact now that move assignment is provided (didn't notice that) we should check this rather than copy assignment.
For the record, I'm planning to fix the constructors and assignment operators of any so that is_constructible and is_assignable, etc. work, but it's quite annoying to implement.
That'd be a boon.
line 66, 94, 112, 123: type_erasure::is_subconcept<type_erasure::typeid_<>,Concept>::value This isn't quite correct. You need to check typeid_<remove_cv_t<remove_reference_t<T>>>
Noted, will change it.
Also it looks like this code is trying to allow an any that doesn't use typeid_ to be stored in a segment of anys (at least that's the impression that I get from the definition of subobject(x) for any_collection in reference.html).
That's the intention. If I can't get the wrapped object I'll store the any objects at least.
Trying to do this will error out in split_segment.hpp:303 in build_index when constructing the value_type.
I'm not seeing this. First of all, there's a bug in https://github.com/joaquintides/poly_collection/blob/master/include/boost/po... s/any_cast<void*>/any_cast<const void*>. Once you do that, the following compiles fine: #include <boost/poly_collection/any_collection.hpp> int main() { using concept=boost::mpl::vector< boost::type_erasure::copy_constructible<>, boost::type_erasure::relaxed >; boost::any_collection<concept> c; auto a=boost::type_erasure::any<concept>{0}; c.insert(a); }
[...]
base_model.hpp:
line 45: static void* subaddress(T& x){return boost::addressof(x);} This assumes that the address of the base is the address of the whole object which is wrong and can fail in the case of multiple inheritance. You need to use dynamic_cast<void*>. (Failing test case included: test_multiple_inheritance.cpp)
Absolutely, will change it. See also my comment above re "static_cast<Derived&>(x)".
callable_wrapper.hpp:
[...]
line 83: return r(std::forward<Args>(args)...); This fails to handle R = void correctly. The return value of the function should be ignored. (Failing test case included: test_void_return.cpp)
Got it. I guess I only need to s/return r(std::forward<Args>(args)...)/return static_cast<R>(r(std::forward<Args>(args)...)) right?
[...]
integer_sequence.hpp:
This can't possibly be the best implementation. make_index_sequence<N> and make_index_sequence<M> with N != M generate N+M totally independent instantiations of make_int_seq_impl.
From the source: "...but make_index_sequence is not hard to implement *(if efficiency is not a concern)*" [emphasis added]
Do you know of some good quality logarithmic implementation that I can rip?
[...]
poly_collection.hpp:
If poly_collection is defined in namespace detail, that means that your internal functions can be found by ADL for user defined types that mention poly_collection. (I always avoid putting types that are part of the public interface in namespace detail to avoid this problem.) Test case in test_adl2.cpp
I see. What's best? a) moving poly_collection to boost::poly_collection::poly_collection_detail b) moving poly_collection to boost::poly_collection::detail::poly_collection_detail
line 1109: "// TODO: can we do better than two passes?" This comment seems obsolete.
There's an implicit second pass in https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
[...]
In Christ, Steven Watanabe
Thank you for your very valuable review! I'm bringing back home lots of useful suggestions. Best regards, Joaquín M López Muñoz
On 23/05/2017 05:21, Joaquin M López Muñoz wrote:
"interspersedly" This word sounds a bit awkward to me.
Any less awkward synonim? For other readers' convenience, the statement in which the word appears is:
"This mechanism is rendered mostly useless when derived1, derived2 and derived3 elements are traversed interspersedly without a definite pattern."
"interleaved", perhaps?
"to explicitly forbid" split infinitive
At the risk of starting a style war, it is my understanding that split infinitives are okay when used wth caution. Here, the alternative:
"to forbid explicitly copying at"
seems (to me) less clear as it might be parsed as referring to some sort of explicit copying...
FWIW, as a native English (or dialect thereof) speaker, the original version sounds better to me, and not just because of the ambiguity of the alternative. As far as I am aware, objecting to split infinitives is out of fashion, and was never all that popular to begin with.
AMDG On 05/22/2017 11:21 AM, Joaquin M López Muñoz via Boost wrote:
El 18/05/2017 a las 2:59, Steven Watanabe via Boost escribió:
I vote to ACCEPT PolyCollection into Boost.
Thank you Steven! Sorry it took me so long to answer your post, got a busy weekend and yours is not a review one can deal with in a hurry.
Details:
General:
Missing .gitattributes
Noted. By the way, I couldn't find this as a requirement in http://www.boost.org/development/requirements.html
The only reason I notice is that I can't open any of the files in notepad...
[...]
an_efficient_polymorphic_data_st.html:
"interspersedly" This word sounds a bit awkward to me.
Any less awkward synonim? For other readers' convenience, the statement in which the word appears is:
"This mechanism is rendered mostly useless when derived1, derived2 and derived3 elements are traversed interspersedly without a definite pattern."
I think the sentence should be reordered. The problem, for me, is the use of -ly with a past participle, which affects most direct synonyms as well. "...useless for interspersed traversal of derived1..." would be one possibility.
"Insertion mechanisms are rather..." -> "The insertion mechanisms are rather..."
Not being a native English speaker (nor Russian either, who typically have a hard time with definite articles :-) ), "Insertion mechanisms" looks perfectly OK to me, but I can change that anyway.
Since you're talking about a specific set of insertion mechanisms, I think it should have an article.
"to explicitly forbid" split infinitive
At the risk of starting a style war, it is my understanding that split infinitives are okay when used wth caution. Here, the alternative:
"to forbid explicitly copying at"
seems (to me) less clear as it might be parsed as referring to some sort of explicit copying...
It's not ambiguous. "Explicitly" modifies "copying." I.e., this isn't a correct rewrite of the sentence. It would be either "explicitly to forbid copying" (which is correct and unambiguous, but sounds pedantic) or "to forbid copying explicitly"
By the way, there are other instances of split infinitive in this section, for instance "to only address" (mentioned below).
In this case "to address only" works fine. Although it changes "only" to modify "the elements" instead of "to address", this doesn't affect the overall meaning.
"...it is possible to only address the elements of a given segment ..." Address here is a bit confusing as "address" has a specific meaning in programming. Access is probably a better term to use.
Will change that.
"cc.empty(index)... Throws: If the indicated type is not registered. " Does this really make sense? I think it would be more useful to treat non-registered segments as empty. Throwing here is somewhat inconsistent with operator== which considers non-registered segments to be equivalent to empty segments.
Non-registered == empty would lead to stranger situations, IMHO: we can't hold the invariant that empty(index)==(begin(index)==end(index)) as begin/end(index) cannot return anything sensible for non-registered segments. Similarly, max_size, capacity etc. need to forward to the underlying segment to return something, unless we make up the return values for non-registered segments.
The iterators are not dereferenceable, so they don't actually need to point to anything. capacity is obviously 0. max_size... yeah, I can see that that would be a problem.
"cc.max_size() REVIEW THIS DESIGN DECISION" "the minimum return value of max_size for each of the segments of the collection" I think that the fundamental invariant of max_size should be that size() <= max_size(). Both max_size and capacity are a bit strange. My inclination would be to remove capacity entirely (or rename it) and change max_size to mean the maximum possible size of the whole container.
This is a design area I'm definitely not happy with. Please follow this thread for a related discussion:
https://lists.boost.org/Archives/boost/2016/11/231784.php
I'm leaning towards dropping (global) max_size/capacity, as you and Thorsten suggest, till we find sensible semantics for them.
I don't think that there are sensible semantics for capacity. Expected semantics of capacity: a) size() <= capacity() b) if N <= (capacity() - size()) then N elements can be inserted without reallocation. c) capacity() - size() is proportional to the amount of wasted memory. d) reserve(N) has a post-condition that capacity() >= N In particular, I think there's no way to satisfy both (b) and (c) simultaneously. On a related note, reserve(size() + N) probably has unexpected behavior.
Class template base_collection "subobject(x) = static_cast<Derived&>(x) with typeid(x)==typeid(Derived)" static_cast excludes virtual inhertance, which I presume is unsupported.
Ummm.. Hadn't thought about virtual inheritance. At first blush, seems like virtual inheritance would be automatically supported if only we fix subaddress as you suggest below, don't you think?
I think so. If there's anywhere else that you use a downcast (perhaps stride_iterator.hpp:83), it would also need dynamic_cast. If this causes any noticeable performance cost, I don't think it's worthwhile. Who uses virtual inheritance outside of iostreams?
(the "static_cast<Derived&>(x)" you refer to is only shown for explanatory purposes and does not appear anywhere in the actual code, as you probably guessed).
algorithm.hpp:
61: segment_split: ADL (Contrived test case in test_adl1.cpp)
Noted. Please enlighten me: is it OK to qualify the offending call as detail::segment_split, or should I fully qualify as ::boost::poly_collection::detail::segment_split?
Any namespace is sufficient to suppress ADL. Putting parentheses around the function name also works. (Personally, I always fully qualify everything in headers, but that gets very tedious.)
Also it looks like this code is trying to allow an any that doesn't use typeid_ to be stored in a segment of anys (at least that's the impression that I get from the definition of subobject(x) for any_collection in reference.html).
That's the intention. If I can't get the wrapped object I'll store the any objects at least.
Trying to do this will error out in split_segment.hpp:303 in build_index when constructing the value_type.
I'm not seeing this. First of all, there's a bug in
https://github.com/joaquintides/poly_collection/blob/master/include/boost/po...
s/any_cast<void*>/any_cast<const void*>. Once you do that, the following compiles fine:
<snip> using concept=boost::mpl::vector< boost::type_erasure::copy_constructible<>, boost::type_erasure::relaxed >; <snip>
This compiles because relaxed implies typeid_.
callable_wrapper.hpp:
[...]
line 83: return r(std::forward<Args>(args)...); This fails to handle R = void correctly. The return value of the function should be ignored. (Failing test case included: test_void_return.cpp)
Got it. I guess I only need to
s/return r(std::forward<Args>(args)...)/return static_cast<R>(r(std::forward<Args>(args)...))
right?
That makes it too permissive although it's probably okay since you also construct a std::function.
[...]
integer_sequence.hpp:
This can't possibly be the best implementation. make_index_sequence<N> and make_index_sequence<M> with N != M generate N+M totally independent instantiations of make_int_seq_impl.
From the source: "...but make_index_sequence is not hard to implement *(if efficiency is not a concern)*" [emphasis added]
Do you know of some good quality logarithmic implementation that I can rip?
I typically use something like this: template<bool IsBaseCase> struct make_int_seq_impl; template<> struct make_int_seq_impl<false> { template<class T, T N> using apply = typename next_integer_sequence<make_int_seq_impl<(N-1 == 0)>::template apply<T, N-1>>::type; }; template<> struct make_int_seq_impl<true> { template<class T, T N> using apply = integer_sequence<T>; }; which is optimized for memoization rather than for the cost of a single instantiation. Actually, looking at the way you're using it, it probably doesn't matter at all.
[...]
poly_collection.hpp:
If poly_collection is defined in namespace detail, that means that your internal functions can be found by ADL for user defined types that mention poly_collection. (I always avoid putting types that are part of the public interface in namespace detail to avoid this problem.) Test case in test_adl2.cpp
I see. What's best?
a) moving poly_collection to boost::poly_collection::poly_collection_detail b) moving poly_collection to boost::poly_collection::detail::poly_collection_detail
I don't think it matters. Either one should work. In Christ, Steven Watanabe
participants (13)
-
Adam Wulkiewicz
-
Brook Milligan
-
Edward Diener
-
Gavin Lambert
-
Glen Fernandes
-
Hans Dembinski
-
Ion Gaztañaga
-
Joaquin M López Muñoz
-
Robert Ramey
-
Steven Watanabe
-
Thorsten Ottosen
-
Vicente J. Botet Escriba
-
Vinnie Falco