
Hi David, Thanks for helping to bring the problem of alignment to my (admittedly somewhat dim at times) attention. I have changed inline_storage.h: /// allocate storage. ignore hint. void *allocate(size_t num_bytes, void const * = 0) { // ensure we return a pointer on an aligned boundary size_t extra = num_bytes & 15; // todo: policy-based on the alignment is overkill? size_t required = num_bytes + extra; char *ptr = &buffer[cursor]; cursor += required; return ptr + extra; } Then I re-ran your test. I also added another test, which does exactly the same thing, but using std::vector, rather than monotonic::vector: Incrementing ord = 2.89 ps per iteration Incrementing ord = 2.86 ps per iteration STD: Incrementing ord = 3.1 ps per iteration STD: Incrementing ord = 3.1 ps per iteration My other tests dont change much: monotonic: 0.001 std: 0.016 monotonic: 0.01 std: 0.95 I am using VS2008, full optimisation. Note that your test case is ~8% faster when implemented using monotonic::vector over using std::vector. This is because the monotonic vector is using the stack for storage, where the loop controls and other fields are also. I will upload the change to boost/monotonic/inline_storage.h within hours, with this and more tests. Regards and Thanks, Christian. On Fri, Jun 12, 2009 at 4:58 PM, David Bergman < David.Bergman@bergmangupta.com> wrote:
On Jun 11, 2009, at 8:26 PM, Christian Schladetsch wrote:
Hi Luke,
Please compare performance of your container to std containers with a good
memory pooling allocator and include system settings with your benchmarks so that we know what you are measuring. Your benchmarks mean nothing to me because I don't know what allocator you are comparing to, but suspect you are comparing to the system allocator, which is clearly a straw man argument.
I am directly comparing monotonic::allocator to std::allocator, so there is no straw-man.
I agree that eventually I'll have to compare against as many other possible arrangements as possible, stopping when any of them fails to provide a performance benefit for the proposed system.
The benchmarks I posted, while hardly exhaustive, do indeed show a performance increase of 10-100x over the standard schemes. At least this was a proof-of-concept over and above my hand-waving.
I showed this for both stack- and heap-based storage, for the same algorithm implemented using the same containers, which no other changes or custom data-structures.
NOTE: I do get that preserving space - on stack or heap - is better than incremental allocations (or reallocations.), and your "monotonic" allocator could definitely be a good one if one is willing to bet on a fairly small maximum size.
But. if we skip this reserve+monotonic vs unreserved+standard comparison:
What we have then is for your allocator simply creates crashing code for a lot of structures or classes on certain platforms, whereas it yields worse performance (for my simple examples of misaligned structures, I got between 10% and 20% slower) for the lenient (in this aspect...) Intel processor family.
/David
I agree that more testing should be done, and I'll do so with a better
system analysis. I'll post more results and an updated library with full constructor forwarding and alignment. It may well be that a more extensive performance analysis shows that my proposed allocator is a wash.
Regards,
Luke
Cheers, Christian. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost