Michael, the latest revision of the code aborts when it reaches the SSEopt_vec method, as detailed here:
https://github.com/cppljevans/soa/blob/master/soa_compare.benchmark.cpp#L953
Would you have some idea what's causing that?

On 10/28/2016 09:40 PM, Michael Marcin wrote:
On 10/28/2016 9:09 AM, Larry Evans wrote:
On 10/26/2016 10:00 AM, Michael Marcin wrote: [snip]
The sse emitter test used an aligned_allocator to guarantee 16 byte alignment for the std::vector data.
template< typename T > using sse_vector = vector
>;

I assume that vector<T>'s data is allocated with new, and new, IIUC, guarantees maximum alignment; hence, boost::alignment::is_aligned(std::vector<T>::data(),16) should always be true. What am I missing?
Yes, operator new isn't going to guarantee 16-byte alignment on any platform I'm aware of.
From n4606 [snip] Detailed reference to standard Docs correcting my assumptions.
Thanks for the [snipped] detailed reference to standard Docs correcting my assumptions :) It must have taken a lot of detailed reading!
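For the archive, here is a minimal sketch of the point above: the default allocator only has to satisfy fundamental alignment, so a vector that must start on a 16-byte boundary needs an allocator that asks for it explicitly. This is not the aligned_allocator used in the benchmark; the names aligned_alloc_sketch, sse_vector_sketch and is_aligned_to are hypothetical, and the sketch assumes a C++17 compiler (for std::align_val_t).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical minimal allocator: every allocation is aligned to Align bytes.
// Assumes C++17 aligned operator new/delete.
template <typename T, std::size_t Align>
struct aligned_alloc_sketch {
    using value_type = T;
    static_assert((Align & (Align - 1)) == 0, "Align must be a power of two");

    aligned_alloc_sketch() = default;
    template <typename U>
    aligned_alloc_sketch(const aligned_alloc_sketch<U, Align>&) noexcept {}

    // Explicit rebind is needed because Align is a non-type template parameter.
    template <typename U>
    struct rebind { using other = aligned_alloc_sketch<U, Align>; };

    T* allocate(std::size_t n) {
        return static_cast<T*>(
            ::operator new(n * sizeof(T), std::align_val_t(Align)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t(Align));
    }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(const aligned_alloc_sketch<T, A>&,
                const aligned_alloc_sketch<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const aligned_alloc_sketch<T, A>&,
                const aligned_alloc_sketch<U, A>&) noexcept { return false; }

// Hypothetical stand-in for the sse_vector alias discussed in the thread.
template <typename T>
using sse_vector_sketch = std::vector<T, aligned_alloc_sketch<T, 16>>;

// True when p is a multiple of align (align must be non-zero).
inline bool is_aligned_to(const void* p, std::size_t align) {
    return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}
```

With this, is_aligned_to(v.data(), 16) holds for any non-empty sse_vector_sketch<float> v, whereas for a plain std::vector<float> it may or may not hold depending on the platform's allocator.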
BTW, see the new push:
https://github.com/cppljevans/soa/commit/ea28ff814c8c1ea879fa5ac2453c12fc382...
I've *not tested* it in:
https://github.com/cppljevans/soa/blob/master/soa_compare.benchmark.cpp
but I think it should work.
Since that post, I've actually tested with the benchmark, and it ran. That run was done with the previous revision, the one with:
https://github.com/cppljevans/soa/commit/02791e47080bf51ac3a467ed89e7cb91abf...
Strangely, that revision ran the SSEopt_vec method OK, in contrast to the current revision.
IOW, instead of:
  sse_vector<float> position_x;
  sse_vector<float> position_y;
  sse_vector<float> position_z;
  sse_vector<float> velocity_x;
  sse_vector<float> velocity_y;
  sse_vector<float> velocity_z;
  sse_vector<float> acceleration_x;
  sse_vector<float> acceleration_y;
  sse_vector<float> acceleration_z;
  vector<float2> size;
  vector<float4> color;
  sse_vector<float> energy;
  vector<char> alive;
I think you could use:
  soa_block
  < type_align ,// position_x;
    type_align ,// position_y;
    type_align ,// position_z;
    type_align ,// velocity_x;
    type_align ,// velocity_y;
    type_align ,// velocity_z;
    type_align ,// acceleration_x;
    type_align ,// acceleration_y;
    type_align ,// acceleration_z;
    float2_t,// size;
    float4_t,// color;
    type_align ,// energy;
    char// alive;
  > particles;

where type_align is found here:
https://github.com/cppljevans/soa/blob/master/vec_offsets.hpp#L14
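To illustrate the general idea of one block holding several arrays with per-field alignment, here is a rough sketch of how the byte offset of each array could be computed. It is not the code in vec_offsets.hpp; field_layout, round_up and block_offsets are hypothetical names, and the offsets are relative to the start of the block, which itself would have to be allocated with the largest alignment requested.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical description of one field: element size and required alignment.
struct field_layout {
    std::size_t size;
    std::size_t align;
};

// Round offset up to the next multiple of align (align must be non-zero).
constexpr std::size_t round_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Returns N+1 entries: the byte offset of each of the N arrays, followed by
// the total number of bytes the block needs for `count` elements per field.
template <std::size_t N>
std::array<std::size_t, N + 1>
block_offsets(const std::array<field_layout, N>& fields, std::size_t count) {
    std::array<std::size_t, N + 1> offsets{};
    std::size_t cursor = 0;
    for (std::size_t i = 0; i < N; ++i) {
        cursor = round_up(cursor, fields[i].align);  // pad up to the field's alignment
        offsets[i] = cursor;                         // this array starts here
        cursor += fields[i].size * count;            // skip past its elements
    }
    offsets[N] = cursor;
    return offsets;
}
```

For example, with three elements per field, a 16-byte-aligned float array, then a char array, then another 16-byte-aligned float array, the arrays start at offsets 0, 12 and 16, and the block needs 28 bytes in total.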
That's quite similar to what I have in one of my potential implementations.
Would you post that somewhere? I'd be curious about how it differs.

-regards,
Larry