
On 10/31/2016 03:09 AM, Michael Marcin wrote:
On 10/30/2016 7:45 AM, Larry Evans wrote:
Would you post that somewhere? I'd be curious about how it differs.
My code isn't very complete but since everyone else is sharing I'll post what I've got.
FWIW, I had to go back pretty far, and then still make changes, to get a version of your soa_compare.benchmark.cpp that compiled on Windows with VS2015.
particle_count=1,000,000 minimum duration=2.85542
comparative performance table:
method    rel_duration
________  ______________
Aos       3.08266
SoA       1.51109
Flat      1.50358
StdArray  1.91317
Block     1.60447
SSE       1.35994
SSE_opt   1.18056
SSE_goon  1
Press any key to continue . . .
Thanks Michael. I found it interesting. However, I was still getting the 'double free' error message; hence, I tried valgrind. It showed a problem in the alive update loop. When the code was changed to:

uint64_t *block_ptr = alive.data();
auto e_ptr = energy.data();
for ( size_t i = 0; i < n; ) {
#define REVISED_CODE
#ifdef REVISED_CODE
    auto e_i = e_ptr + i;
#endif
    uint64_t block = 0;
    do {
#ifndef REVISED_CODE
        //this code causes valgrind to show errors.
        auto e_i = e_ptr + i;
#endif
        _mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));
        block |= uint64_t ( _mm_movemask_ps( _mm_cmple_ps( _mm_load_ps( e_i ), zero ))) << (i % bits_per_uint64_t) ;
        i += 4;
    } while ( i % bits_per_uint64_t != 0 );
    *block_ptr++ = block;
}

valgrind reported no errors; however, when !defined(REVISED_CODE), valgrind reported:

valgrind --tool=memcheck /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937== Memcheck, a memory error detector
==7937== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7937== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==7937== Command: /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937==
COMPILE_OPTIM=0
particle_count=1,000
frames=1,000
{run_test=SSEopt_vec
==7937== Invalid read of size 16
==7937==    at 0x403D6B: emitter_t<(method_enum)6>::update() (soa_compare.benchmark.cpp:962)
==7937==    by 0x403094: run_result_t run_test<emitter_t<(method_enum)6>
(unsigned long, unsigned long) (soa_compare.benchmark
My soa_compare.benchmark.cpp:962 line is:

_mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));

-regards,
Larry
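
One possible reading of the "Invalid read of size 16" report, offered only as a sketch: if energy holds exactly n floats and bits_per_uint64_t is 64 (both are assumptions, neither is confirmed above), then with particle_count=1,000 the inner do/while keeps stepping until i reaches a multiple of 64, i.e. 1,024, so the final block would load past the end of the array. The scalar C++ below shows the same block-packing loop with an added i < n check. update_energy_bits and bits_per_block are hypothetical names, not code from soa_compare.benchmark.cpp, and plain float arithmetic stands in for the SSE intrinsics.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from soa_compare.benchmark.cpp): subtracts t from
// every energy value and packs one "energy <= 0" bit per particle into 64-bit
// blocks, without touching elements at or beyond energy.size().
std::vector<std::uint64_t> update_energy_bits(std::vector<float>& energy, float t)
{
    constexpr std::size_t bits_per_block = 64; // assumed value of bits_per_uint64_t
    const std::size_t n = energy.size();

    std::vector<std::uint64_t> blocks;
    blocks.reserve((n + bits_per_block - 1) / bits_per_block);

    float* e_ptr = energy.data();
    for (std::size_t i = 0; i < n; ) {
        std::uint64_t block = 0;
        do {
            float* e_i = e_ptr + i;      // recomputed for every element
            *e_i -= t;                   // scalar stand-in for _mm_sub_ps
            if (*e_i <= 0.0f) {
                block |= std::uint64_t(1) << (i % bits_per_block);
            }
            ++i;
        } while (i < n && i % bits_per_block != 0); // extra i < n guard stops the overrun
        blocks.push_back(block);
    }
    return blocks;
}
```

With 70 particles this yields two blocks, the second only partly filled. An SSE variant would need the same guard, or an energy buffer padded up to a multiple of the block size.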