On 10/16/2016 1:59 AM, degski wrote:
You state that the example is a toy example. To me the example shows that iterating over a vector of smaller objects (pods) is faster than iterating over a vector of larger objects, duh.
Iterating over a vector of smaller objects being faster than iterating over a vector of larger objects is the core of the SoA approach.
The real use case might be more interesting, maybe you can describe it.
A common real use case is a particle system in a game. A very simple particle could look like: A real particle will often have many more fields. struct particle_t { float3 position; float3 velocity; float3 acceleration; float2 size; float4 color; float energy; bool alive; }; Every Tick you need to update all of the particles. void update() { size_t n = position.size(); for ( size_t i = 0; i < n; ++i ) { auto & v = velocity[i]; v = v + (acceleration[i] * dt); v.y += gravity * dt; position[i] = position[i] + (v * dt); energy[i] -= dt; if ( energy[i] <= 0 ) { alive[i] = false; } } } Note that update() doesn't read or write several fields, these fields are only used for rendering. You can get a decent win purely through improved cache utilization by keeping this cold data out of the way. You can then think about going one step further and building a simd vectorized update loop, which is strictly impossible with an Array of Structures approach. There are two obvious ways to go about vectorizing this update. 1. You can store position/velocity/acceleration as float4 instead of float3 and operate on xyz for a single particle in one instruction. 2. You can store each component in its own array and operate on 4 particles at once. This is still a toy example but it's closer to something real. http://codepad.org/r9u82dOh AoS in 6.54421 seconds SoA in 5.91915 seconds SoA SSE in 3.58603 seconds Press any key to continue . . . Note: there is still room for improvement in the SSE example