Re: [boost] interest in structure of arrays container?

31 Oct 2016

      On 10/31/2016 9:14 AM, Larry Evans wrote:
...
However, I was still getting the 'double free' error message; hence,
I tried valgrind.  It showed a problem in the alive update loop.
When the code was changed to:
        uint64_t *block_ptr = alive.data();
        auto e_ptr = energy.data();
        for ( size_t i = 0; i < n; ) {
          #define REVISED_CODE
          #ifdef REVISED_CODE
            auto e_i = e_ptr + i;
          #endif
            uint64_t block = 0;
            do {
              #ifndef REVISED_CODE
                //this code causes valgrind to show errors.
                auto e_i = e_ptr + i;
              #endif
                _mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));
                block |=
                  uint64_t
                  ( _mm_movemask_ps( _mm_cmple_ps( _mm_load_ps( e_i ), zero )))
                  << (i % bits_per_uint64_t)
                  ;
                i += 4;
            } while ( i % bits_per_uint64_t != 0 );
            *block_ptr++ = block;
        }
valgrind reported no errors; however, when !defined(REVISED_CODE),
valgrind reported:
valgrind --tool=memcheck /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937== Memcheck, a memory error detector
==7937== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7937== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==7937== Command: /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937==
COMPILE_OPTIM=0
particle_count=1,000

particle_count=1,000 is not a multiple of 64; the optimized energy/alive
loop processes 64 particles at a time. I haven't bothered to analyze
what the code will do in this case, but memory corruption is likely.

The code to handle a tail (if particle_count % 64 != 0) isn't difficult
to add, but it is explicitly left out. One of the things you'll often do
in a system such as this is fit the data to optimize the algorithm. In
the case of a particle system, plus or minus 0 to 63 particles is
generally unnoticeable.
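
For reference, a tail handler could look roughly like the sketch below.
This is not code from the benchmark: update_tail and its parameters are
hypothetical names, and it assumes the bitmask array has one extra
uint64_t for the partial block. It mirrors the SIMD loop's logic in
scalar form: subtract t, then set bit (i % 64) when the energy is <= 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the original benchmark).
// Processes only the particles after the last complete block of 64.
// Assumes alive_blocks has room for (n / 64) + 1 words.
inline void update_tail(float* energy, std::uint64_t* alive_blocks,
                        std::size_t n, float t)
{
    assert(energy != nullptr && alive_blocks != nullptr);

    constexpr std::size_t bits_per_block = 64;
    const std::size_t full_blocks = n / bits_per_block;
    const std::size_t first = full_blocks * bits_per_block;
    if (first == n)
        return;  // n is a multiple of 64, nothing left over

    std::uint64_t block = 0;
    for (std::size_t i = first; i < n; ++i) {
        energy[i] -= t;
        // Same test as _mm_cmple_ps( e, zero ) in the SIMD loop.
        if (energy[i] <= 0.0f)
            block |= std::uint64_t(1) << (i % bits_per_block);
    }
    alive_blocks[full_blocks] = block;
}
```

The main loop would then need to stop at the last complete block
(i < n - n % 64) and call this once afterwards.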

You can address the problem however you like, but the simplest solution
would be to change your small particle count to 16 * 64 = 1024.
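
If you'd rather not hard-code the count, rounding it up to the block
size is a one-liner. A minimal sketch, assuming a block size of 64;
round_up_to_block is a made-up name, not something in the benchmark:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round count up to the next multiple of block_size.
// A count that is already a multiple is returned unchanged.
inline std::size_t round_up_to_block(std::size_t count,
                                     std::size_t block_size = 64)
{
    assert(block_size != 0);
    return (count + block_size - 1) / block_size * block_size;
}
```

With this, round_up_to_block(1000) gives 1024, so every array holds a
whole number of 64-particle blocks and the loop never runs past the end.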

Michael Marcin