
Hi Joaquín, Some additional comments about the design: ///////////////////////// // Memory indirections ///////////////////////// If I understand code correctly, the internal structure of a poly collection is approximately: 1) poly_collection stores an unordered_map<type_index, segment_type> 2) segment_type stores a unique_ptr<segment_backend> and segment_backend_is an abstract class, so roughly equivalent to unique_ptr<packed_segment/split_segment> (both packed_segment/split_segment derive from segment_backend. 3) packed_segment/split_segment stores a vector of concrete types. So we need 3 memory hops to navigate between the "this" pointer of a PolyCollection to a pointer to a concrete type: poly_collection -> unordered node -> segment in unique_ptr -> vector value. I think the data structure could be optimized to save one indirection. segment_type could directly hold the class derived from segment_backend as packed_segment/split_segment will need the same size and alignment requirements regardless of the ConcreteType they hold (as long as they use std::vector). This could improve insertion times and other operations. This optimization could also make easier to fix the following section (allocator propagation). *Main question 1*: Do you find think PolyCollection should implement this optimization? ///////////////////////// // Allocator propagation ///////////////////////// As mentioned in Adam Wulkiewicz's review PolyCollection currently does not propagate the allocator from the constructor argument of poly_collection to the concrete type. From a practical point of view, the key requirement is that all intermediate memory until the concrete type must be allocated using the same root allocator passed when a poly_collection was constructed, and that allocator must be propagated also to the concrete type. I think this should also be the case of all memory used for auxiliary data like the index vector of split_sequence. Clause 23.2.1.3 was mentioned although I don't think this clause was designed with compound containers in mind, as IMHO all auxiliary data, including metadata that needs dynamic memory should receive the allocator and should be constructed using allocator_traits or a container that uses allocator_traits (yes, even the non-value parts of a node). Otherwise, the original intention of the new standard allocator model (use a single source of memory and propagate it transparently) wouldn't be feasible. I think that a unique memory source for all internal structures and the correct propagation of the allocator using allocator_traits is a very interesting feature for PolyCollection. Even Bjarne Stroustrup commented on this when Joaquín's article was published in isocpp ( https://isocpp.org/blog/2014/05/fast-polymorphic-collections-joaquin-m-lopez...) "One thing caught my eye: The elements could be allocated using stack allocators, rather than a general new. That would make the technique applicable to hard real-time systems where general dynamic memory is banned" *Main question 2*: Do you think PolyCollection should support stateful allocator using the scoped allocator model? Some specific notes that might show places where the scoped allocator model has problems.: - PolyCollection uses operator new to allocate the packed/split_segment implementation stored in unique_ptr and also when the segment must be copied. If unique_ptr is avoided and the concrete segment is in-place constructed inside "segment" then all memory will be allocated from the original allocator. - I think "segment" is missing an extended copy and move constructor taking an additional allocator. It also would need an internal allocator_type typedef so that uses_allocator_v is true and unordered_map calls these new constructors when needed. These new constructors should be propagated to split_segment/packed_segment so that these call the extended constructors of internal vectors. - The concrete type wrappers (e.g. value_holder) also would need to support allocators with extended copy/move constructors. - The "emplace" operation uses a placement new to wrapped in a lambda which is type-erased with a captureless lambda. It looks to me that instead of delegating the work into the value_holder, the type erasure arguments should be treated by the segment so that internal vectors propagate the allocator when needed. A test to make sure everything is correctly propagated: -> Create an instance of monotonic_buffer_resource passing a big enough static buffer. -> Create a base_collection< Base, std::polymorphic_allocator<Base> > passing the address of the monotonic_buffer_resource -> Insert types Derived1 and Derived2 in the container, both of which hold a vector<int, polymorphic_allocator<int> > (for Derived1 and Derived2 are uses_allocator_v is true and they propagate the memory resource to the vector). -> After filling base_collection, no heap allocation is made and all objects are constructed in the static buffer. Best, Ion