
On 10/17/07, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
I'm using a library called DirectFB, which is a relatively thin wrapper around 2D hardware graphics acceleration. Operations like rectangular copies and fills are hugely faster than letting the CPU do the work.
Isn't this only possible on hardware with Frame Buffer Object support, or does DirectX not have the same limitations as OpenGL? In OpenGL, if the OpenGL context is not visible, the contents of the frame buffer are undefined, so one has to use Frame Buffer Objects or PBuffers to do what you are talking about. Definitely not a show stopper (we do similar things for calculating Line of Sight where I work), but something to keep in mind. This is traveling down the route of needing an accelerated graphics library in Boost. Think of all that your request entails: querying graphics hardware for capabilities, setting up extensions (if OpenGL), FBO construction, rendering to target, etc. Your request requires nearly a full graphics pipeline if you want to do it correctly and extensibly.
x86 hardware also has the MMX and SSE SIMD instruction set extensions which can give significant speedups, e.g. processing the 3 8-bit channels of an RGB pixel (or potentially 2 or 4 pixels) in one instruction.
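As a rough illustration of the SIMD point, here is a minimal sketch that brightens 8-bit channels 16 at a time with SSE2 and falls back to a scalar loop for the tail (or on non-x86 builds). The function name `brighten` and the choice of a saturating add are assumptions made for the example; nothing here is GIL's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper: add `amount` to every 8-bit channel, clamping at 255.
void brighten(std::uint8_t* channels, std::size_t count, std::uint8_t amount) {
    std::size_t i = 0;
#if defined(__SSE2__)
    // Broadcast the increment into all 16 byte lanes.
    const __m128i delta = _mm_set1_epi8(static_cast<char>(amount));
    for (; i + 16 <= count; i += 16) {
        // Unaligned load, so the buffer needs no special alignment.
        __m128i px = _mm_loadu_si128(
            reinterpret_cast<const __m128i*>(channels + i));
        // One instruction performs 16 unsigned saturating byte additions.
        px = _mm_adds_epu8(px, delta);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(channels + i), px);
    }
#endif
    // Scalar path: handles the tail, or everything when SSE2 is unavailable.
    for (; i < count; ++i) {
        unsigned sum = static_cast<unsigned>(channels[i]) + amount;
        channels[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}
```

Sixteen bytes cover four RGBA pixels or five and a third RGB pixels, so for packed RGB the loop simply treats the buffer as a flat run of channels.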
This seems like a good extension to GIL.
So I have been wondering about how a graphics library that's not hardware-specific, like GIL or Anti-Grain or FreeType, can best be combined with hardware-specific features. For example, if I want to draw a rectangle with curved corners, then ideally I'd use the graphics hardware to draw the body of the rectangle and then do the details of the corners from the CPU, using MMX instructions to process whole pixels at a time. What sort of architecture would allow the maximum exploitation of available speedups, with the simplest possible interface?
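One possible shape for such an architecture, sketched under assumptions: a small backend interface with a bulk `fill_rect` (which a real backend might hand to the graphics hardware) and a per-pixel `set_pixel` for CPU detail work. `Backend`, `SoftwareBackend`, and `fill_rounded_rect` are hypothetical names for this example, not part of GIL, Anti-Grain, or DirectFB, and the software backend exists only so the sketch runs anywhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical backend interface; a hardware backend would override
// fill_rect with an accelerated call.
struct Backend {
    virtual ~Backend() = default;
    virtual void fill_rect(int x, int y, int w, int h, std::uint32_t colour) = 0;
    virtual void set_pixel(int x, int y, std::uint32_t colour) = 0;
};

// Software stand-in that also counts how many bulk fills were issued.
struct SoftwareBackend : Backend {
    int width, height, bulk_fills = 0;
    std::vector<std::uint32_t> pixels;
    SoftwareBackend(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}
    void fill_rect(int x, int y, int w, int h, std::uint32_t c) override {
        ++bulk_fills;
        for (int j = y; j < y + h; ++j)
            for (int i = x; i < x + w; ++i) pixels[j * width + i] = c;
    }
    void set_pixel(int x, int y, std::uint32_t c) override {
        pixels[y * width + x] = c;
    }
};

// Assumes 2*r <= w, 2*r <= h, and that the rectangle lies inside the surface.
void fill_rounded_rect(Backend& b, int x, int y, int w, int h, int r,
                       std::uint32_t c) {
    // Body: three axis-aligned rectangles go through the bulk path.
    b.fill_rect(x, y + r, w, h - 2 * r, c);          // middle band, full width
    b.fill_rect(x + r, y, w - 2 * r, r, c);          // top band
    b.fill_rect(x + r, y + h - r, w - 2 * r, r, c);  // bottom band
    // Corners: test each pixel centre against a quarter disc of radius r.
    for (int dy = 0; dy < r; ++dy) {
        for (int dx = 0; dx < r; ++dx) {
            // Doubled offsets from the pixel centre to the arc centre (r, r).
            int ex = 2 * (r - dx) - 1, ey = 2 * (r - dy) - 1;
            if (ex * ex + ey * ey > 4 * r * r) continue;  // outside the arc
            b.set_pixel(x + dx, y + dy, c);                  // top-left
            b.set_pixel(x + w - 1 - dx, y + dy, c);          // top-right
            b.set_pixel(x + dx, y + h - 1 - dy, c);          // bottom-left
            b.set_pixel(x + w - 1 - dx, y + h - 1 - dy, c);  // bottom-right
        }
    }
}
```

The caller sees a single `fill_rounded_rect` call; whether the body goes to hardware is decided entirely by which `Backend` is passed in.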
At the top level the interface wouldn't change too much, but the amount of code inside the library would be pretty large (for the reasons given above). The last thing to mention is that creating and destroying the rendering context will probably be one of the biggest bottlenecks. For instance, if you had 100 rectangles that you wanted to do this to, you wouldn't want the underlying call to create_curved_corners to create an OpenGL context every time and then destroy it on function exit. I'm not sure how to get around that without exposing implementation details to the user.

--Michael Fawcett
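One possible way around the context-lifetime problem, offered only as a sketch: have the user hold an opaque object that creates the context lazily on first use and keeps it until the object is destroyed. `RenderContext` and `Renderer` are hypothetical names, and `RenderContext` is a stand-in that merely counts constructions rather than touching OpenGL.

```cpp
#include <memory>

// Stand-in for an expensive rendering context (e.g. an OpenGL context).
// It only counts how many times it has been constructed.
struct RenderContext {
    static int created;
    RenderContext() { ++created; }
};
int RenderContext::created = 0;

// Hypothetical opaque handle: the user keeps one Renderer around, and the
// context behind it is created on first use and reused afterwards.
class Renderer {
public:
    void create_curved_corners(int /*x*/, int /*y*/, int /*w*/, int /*h*/) {
        RenderContext& ctx = context();  // constructed at most once
        (void)ctx;                       // real drawing would use ctx here
        ++calls_;
    }
    int calls() const { return calls_; }

private:
    RenderContext& context() {
        if (!ctx_) ctx_.reset(new RenderContext);
        return *ctx_;
    }
    std::unique_ptr<RenderContext> ctx_;  // destroyed along with the Renderer
    int calls_ = 0;
};
```

With this shape, 100 calls to `create_curved_corners` on the same `Renderer` construct one context. The user still has to keep the `Renderer` alive, so the lifetime is exposed, but what sits behind it is not.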