[boost] [poly_collection] Memory allocation and allocator propagation

14 May 2017

      Hi Joaquín,

Some additional comments about the design:

/////////////////////////
//  Memory indirections
/////////////////////////

If I understand code correctly, the internal structure of a poly 
collection is approximately:

1) poly_collection stores an unordered_map<type_index, segment_type>

2) segment_type stores a unique_ptr<segment_backend> and 
segment_backend_is an abstract class, so roughly equivalent to 
unique_ptr<packed_segment/split_segment> (both 
packed_segment/split_segment derive from segment_backend.

3) packed_segment/split_segment stores a vector of concrete types.

So we need 3 memory hops to navigate between the "this" pointer of a 
PolyCollection to a pointer to a concrete type: poly_collection -> 
unordered node -> segment in unique_ptr -> vector value.

I think the data structure could be optimized to save one indirection. 
segment_type could directly hold the class derived from segment_backend 
as packed_segment/split_segment will need the same size and alignment 
requirements regardless of the ConcreteType they hold (as long as they 
use std::vector). This could improve insertion times and other 
operations. This optimization could also make easier to fix the 
following section (allocator propagation).

*Main question 1*: Do you find think PolyCollection should implement 
this optimization?

/////////////////////////
//  Allocator propagation
/////////////////////////

As mentioned in Adam Wulkiewicz's review PolyCollection currently does 
not propagate the allocator from the constructor argument of 
poly_collection to the concrete type. From a practical point of view, 
the key requirement is that all intermediate memory until the concrete 
type must be allocated using the same root allocator passed when a 
poly_collection was constructed, and that allocator must be propagated 
also to the concrete type.

I think this should also be the case of all memory used for auxiliary 
data like the index vector of split_sequence. Clause 23.2.1.3 was 
mentioned although I don't think this clause was designed with compound 
containers in mind, as IMHO all auxiliary data, including metadata that 
needs dynamic memory should receive the allocator and should be 
constructed using allocator_traits or a container that uses 
allocator_traits (yes, even the non-value parts of a node). Otherwise, 
the original intention of the new standard allocator model (use a single 
source of memory and propagate it transparently) wouldn't be feasible.

I think that a unique memory source for all internal structures and the 
correct propagation of the allocator using allocator_traits is a very 
interesting feature for PolyCollection. Even Bjarne Stroustrup commented 
on this when Joaquín's article was published in isocpp (
https://isocpp.org/blog/2014/05/fast-polymorphic-collections-joaquin-m-lopez...)

"One thing caught my eye: The elements could be allocated using stack 
allocators, rather than a general new. That would make the technique 
applicable to hard real-time systems where general dynamic memory is banned"

*Main question 2*: Do you think PolyCollection should support stateful 
allocator using the scoped allocator model?

Some specific notes that might show places where the scoped allocator 
model has problems.:

- PolyCollection uses operator new to allocate the packed/split_segment 
implementation stored in unique_ptr and also when the segment must be 
copied. If unique_ptr is avoided and the concrete segment is in-place 
constructed inside "segment" then all memory will be allocated from the 
original allocator.

- I think "segment" is missing an extended copy and move constructor 
taking an additional allocator. It also would need an internal 
allocator_type typedef so that uses_allocator_v is true and 
unordered_map calls these new constructors when needed. These new 
constructors should be propagated to split_segment/packed_segment so 
that these call the extended constructors of internal vectors.

- The concrete type wrappers (e.g. value_holder) also would need to 
support allocators with extended copy/move constructors.

- The "emplace" operation uses a placement new to wrapped in a lambda 
which is type-erased with a captureless lambda. It looks to me that 
instead of delegating the work into the value_holder, the type erasure 
arguments should be treated by the segment so that internal vectors 
propagate the allocator when needed.

A test to make sure everything is correctly propagated:

-> Create an instance of monotonic_buffer_resource passing a big enough 
static buffer.

-> Create a base_collection< Base, std::polymorphic_allocator<Base> > 
passing the address of the monotonic_buffer_resource

-> Insert types Derived1 and Derived2 in the container, both of which 
hold a vector<int, polymorphic_allocator<int> > (for Derived1 and 
Derived2 are uses_allocator_v is true and they propagate the memory 
resource to the vector).

-> After filling base_collection, no heap allocation is made and all 
objects are constructed in the static buffer.

Best,

Ion

[boost] [poly_collection] Memory allocation and allocator propagation

Ion Gaztañaga