Hi Joaquín,
Some additional comments about the design:
/////////////////////////
// Memory indirections
/////////////////////////
If I understand code correctly, the internal structure of a poly
collection is approximately:
1) poly_collection stores an unordered_map
2) segment_type stores a unique_ptr and
segment_backend_is an abstract class, so roughly equivalent to
unique_ptr (both
packed_segment/split_segment derive from segment_backend.
3) packed_segment/split_segment stores a vector of concrete types.
So we need 3 memory hops to navigate between the "this" pointer of a
PolyCollection to a pointer to a concrete type: poly_collection ->
unordered node -> segment in unique_ptr -> vector value.
I think the data structure could be optimized to save one indirection.
segment_type could directly hold the class derived from segment_backend
as packed_segment/split_segment will need the same size and alignment
requirements regardless of the ConcreteType they hold (as long as they
use std::vector). This could improve insertion times and other
operations. This optimization could also make easier to fix the
following section (allocator propagation).
*Main question 1*: Do you find think PolyCollection should implement
this optimization?
/////////////////////////
// Allocator propagation
/////////////////////////
As mentioned in Adam Wulkiewicz's review PolyCollection currently does
not propagate the allocator from the constructor argument of
poly_collection to the concrete type. From a practical point of view,
the key requirement is that all intermediate memory until the concrete
type must be allocated using the same root allocator passed when a
poly_collection was constructed, and that allocator must be propagated
also to the concrete type.
I think this should also be the case of all memory used for auxiliary
data like the index vector of split_sequence. Clause 23.2.1.3 was
mentioned although I don't think this clause was designed with compound
containers in mind, as IMHO all auxiliary data, including metadata that
needs dynamic memory should receive the allocator and should be
constructed using allocator_traits or a container that uses
allocator_traits (yes, even the non-value parts of a node). Otherwise,
the original intention of the new standard allocator model (use a single
source of memory and propagate it transparently) wouldn't be feasible.
I think that a unique memory source for all internal structures and the
correct propagation of the allocator using allocator_traits is a very
interesting feature for PolyCollection. Even Bjarne Stroustrup commented
on this when Joaquín's article was published in isocpp (
https://isocpp.org/blog/2014/05/fast-polymorphic-collections-joaquin-m-lopez...)
"One thing caught my eye: The elements could be allocated using stack
allocators, rather than a general new. That would make the technique
applicable to hard real-time systems where general dynamic memory is banned"
*Main question 2*: Do you think PolyCollection should support stateful
allocator using the scoped allocator model?
Some specific notes that might show places where the scoped allocator
model has problems.:
- PolyCollection uses operator new to allocate the packed/split_segment
implementation stored in unique_ptr and also when the segment must be
copied. If unique_ptr is avoided and the concrete segment is in-place
constructed inside "segment" then all memory will be allocated from the
original allocator.
- I think "segment" is missing an extended copy and move constructor
taking an additional allocator. It also would need an internal
allocator_type typedef so that uses_allocator_v is true and
unordered_map calls these new constructors when needed. These new
constructors should be propagated to split_segment/packed_segment so
that these call the extended constructors of internal vectors.
- The concrete type wrappers (e.g. value_holder) also would need to
support allocators with extended copy/move constructors.
- The "emplace" operation uses a placement new to wrapped in a lambda
which is type-erased with a captureless lambda. It looks to me that
instead of delegating the work into the value_holder, the type erasure
arguments should be treated by the segment so that internal vectors
propagate the allocator when needed.
A test to make sure everything is correctly propagated:
-> Create an instance of monotonic_buffer_resource passing a big enough
static buffer.
-> Create a base_collection< Base, std::polymorphic_allocator<Base> >
passing the address of the monotonic_buffer_resource
-> Insert types Derived1 and Derived2 in the container, both of which
hold a vector (for Derived1 and
Derived2 are uses_allocator_v is true and they propagate the memory
resource to the vector).
-> After filling base_collection, no heap allocation is made and all
objects are constructed in the static buffer.
Best,
Ion