
On 7/19/07, Malte Clasen <news@copro.org> wrote:
iain.denniston@blueyonder.co.uk wrote: I'm not sure whether this would be the right level for architecture-specific optimizations. Having a layer that provides access to SIMD instructions might smooth out issues between SSE and AltiVec, but programming Cell or perhaps CUDA is afaik a bit more involved. I would start with larger operations such as FFT, SVD or "multiply a set of vectors with matrix A". Given a portable C++-only implementation, you could then provide architecture-specific implementations of these building blocks.
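To make the "building block" idea concrete, here is a rough sketch of how the "multiply a set of vectors with matrix A" operation might look: a portable C++ reference version behind a single entry point that an architecture-specific version could later replace. All names here (`Vec4`, `Mat4`, `transform_portable`, `transform`, and the `HYPOTHETICAL_HAVE_SSE_TRANSFORM` macro) are invented for illustration and are not part of any existing library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // row-major: A[row][col]

// Portable reference implementation: out[i] = A * in[i] for every vector.
inline void transform_portable(const Mat4& A,
                               const std::vector<Vec4>& in,
                               std::vector<Vec4>& out) {
    assert(&in != &out);  // in-place use would overwrite inputs mid-product
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        for (std::size_t r = 0; r < 4; ++r) {
            float sum = 0.0f;
            for (std::size_t c = 0; c < 4; ++c) sum += A[r][c] * in[i][c];
            out[i][r] = sum;
        }
    }
}

// Single entry point. A platform build could define the (hypothetical)
// macro below and supply its own transform_sse(); callers never change.
inline void transform(const Mat4& A,
                      const std::vector<Vec4>& in,
                      std::vector<Vec4>& out) {
#if defined(HYPOTHETICAL_HAVE_SSE_TRANSFORM)
    transform_sse(A, in, out);       // assumed to be provided elsewhere
#else
    transform_portable(A, in, out);  // default: plain C++
#endif
}
```

Working on a whole set of vectors per call, rather than one vector at a time, is what leaves room for a SIMD or Cell version to batch and stream the data as it sees fit.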
For game engines, the one you want to do first is Runge-Kutta. That'll handle the big weight in physical simulations. Then collision detection methods. Later in the year I'll have some performance numbers to indicate where a normal desktop CPU is spending its cycles in a game engine, if anyone wants to prioritize their work. If you can get these parallelizable over 6 SPEs (e.g. what you find in a Linux-loaded COTS PS3), 3 PPC cores (Xbox 360) or 2 R5900s (PSP/PS2), you're in fantastic shape.

--
H. Lally Singh
Ph.D. Candidate, Computer Science
Virginia Tech
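
For reference, a minimal sketch of the kind of Runge-Kutta step discussed above: classical fourth-order RK for a single 1-D body whose acceleration depends only on its current state. The `State`, `Deriv`, `evaluate` and `rk4_step` names are made up for this example, and a real engine would use vector-valued state and probably a time-dependent force model.

```cpp
#include <cassert>

struct State { double x; double v; };    // position, velocity
struct Deriv { double dx; double dv; };  // d(position)/dt, d(velocity)/dt

// Evaluate the derivative at the state advanced by dt along slope d.
template <class Accel>
Deriv evaluate(const State& s, double dt, const Deriv& d, Accel accel) {
    State t{ s.x + d.dx * dt, s.v + d.dv * dt };
    return Deriv{ t.v, accel(t) };
}

// One classical RK4 step: four slope samples, weighted 1:2:2:1.
template <class Accel>
State rk4_step(const State& s, double dt, Accel accel) {
    assert(dt > 0.0);
    const Deriv k1 = evaluate(s, 0.0,      Deriv{0.0, 0.0}, accel);
    const Deriv k2 = evaluate(s, dt * 0.5, k1, accel);
    const Deriv k3 = evaluate(s, dt * 0.5, k2, accel);
    const Deriv k4 = evaluate(s, dt,       k3, accel);
    const double dx = (k1.dx + 2.0 * (k2.dx + k3.dx) + k4.dx) / 6.0;
    const double dv = (k1.dv + 2.0 * (k2.dv + k3.dv) + k4.dv) / 6.0;
    return State{ s.x + dx * dt, s.v + dv * dt };
}
```

As long as the acceleration function reads only the body's own state (or a snapshot of the others taken before the step), each body's step is independent of the rest, so an array of bodies can be split into chunks and handed to separate cores or SPEs.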