
Does #ifdef around #pragma omp actually work? At least #define PARALLEL #pragma omp and then using PARALLEL does not work.
I suspect the above clashes with how #pragma is defined or implemented; I can't recall ever successfully using a macro that expands to any directive. But that could be because I haven't tried it in the last twenty years :-) Guarding the pragma itself, though:

    #if PARALLEL
    #pragma omp ...
    #endif
    For ( Sagans of rows of data )
        do something_expensive()

should "work", but not in a way that is likely to lead to a good general-purpose library interface.
The compression algorithms (zip, ...) (part of the streams library?) would be a very good candidate. I once tested a parallel bzip2 algorithm and it scales really well.
Compression and crypto (and linear algebra, and image manipulation) are already available as threaded, micro-optimized, runtime-selected routines in the performance libraries from Intel and AMD; probably everybody else too. Wrapping them in boost interfaces with fallback implementations would be nice; it would encourage use of the best available implementations by programs with portability requirements.

For my purposes, the ability to set up a "pipeline" like, say, serialization -> compression -> file i/o without having to code both sides of each relationship would be nice. It would also provide useful parallelism at a point in the application where, in many cases, the user is waiting for the program; and, if generic enough, it could do so without requiring multithreading of the core algorithms of each of the filters.

I wrote a hard-realtime system for Schlumberger back in the dark ages that had a multithreaded, typed data stream engine at its core. It was in C but designed with object-oriented concepts and would convert to a template library (+ compiled back end) very nicely. The coding discipline required to implement a source, sink, or filter was fairly rigid, but the result was that the initialization script could paste together very elaborate, very high performance signal processing pipelines just by listing the names and parameters of the participants. We'd have to get permission from Schlumberger if we wanted to reuse any of my work products: anybody have a contact there?

Note that, for programs working on very large datasets, any non-streaming interface can become a problem because it precludes efficient use of the storage hierarchy. If I must stream my data into an array before I pass it to the compression routine, I've already lost: no matter how fast the routine is, it can't make up for the memory traffic I've already wasted compared with cache-to-cache producer/consumer transfers.
Have a look at: http://developer.amd.com/TechnicalArticles/Articles/Pages/CrazyFastDataSharing.aspx

A good producer / consumer pattern implementation framework may already exist in boost or elsewhere, but if it does I haven't stumbled across it yet.