I haven't followed this whole thread, but I seem to recall that HDF5 supports MPI-based parallel I/O via Parallel HDF5. http://www.hdfgroup.org/HDF5/PHDF5/ Or does that not solve your requirements?
Alas, Parallel HDF5 != concurrent file access. As I understand it, Parallel HDF5 means a set of cooperating MPI processes writing to the same file in a coordinated way, whereas I need one process to write and other, unrelated processes to monitor/display the data.
Could you maybe use a raw memory-mapped file instead, and convert it to HDF5 off-line?
Well, technically yes, but for robustness reasons I want to decouple the HDF5 logging from the shared-memory logging. I'm very happy with the file format's storage efficiency and robustness, and I haven't had to worry about file corruption (though oddly enough, the "official" HDF5 editor from the HDF5 maintainers has corrupted a few logs when I added some attributes after the fact). So I'd like to maintain two independent paths: the HDF5 file as a (possibly) permanent record, and my shared-memory structure, which could conceivably become corrupt if I hit one of those impossible-to-reproduce bugs -- but in that case I don't care, since I still have the log file.

I'm also dealing with a very wide range of storage situations. Most of the data will be consecutive packets that are written to the file and left there, but in some cases I may actually delete portions of previously written data that have been deemed discardable, to make room for a long test run -- so it's more complicated than a vector that grows with time, or a circular buffer. I've defined structures within the HDF5 file that handle this fine; in shared memory I was going to do essentially the same thing and keep a boost::interprocess::list<> or map<> of moderately sized data chunks (64K-256K) that I can keep or discard.

But back to the topic at hand -- let me restate my problem. Suppose you have N processes, where each process i = 0, 1, ..., N-1 will need a pool of related memory with a maximum usage of sz[i] bytes. This size sz[i] is not known beforehand, but it is guaranteed to be less than some maximum M, and it has an expected value m that is much smaller than M.

From a programmer's standpoint, the simplest approach would be to reserve a single shared memory segment per process and ask Boost.Interprocess to make the segment size equal to M. But if I do that, my resource usage in the page file (or on disk, if I use a memory-mapped file) is N*M, which is much more than I actually need. (I tracked down the source of this: windows_shared_memory pre-commits space in the page file equal to the requested size.)

So what's a reasonable way to architect shared memory use to support this kind of demand? I suppose I could use a vector of shared memory segments, starting with something like 256KB and adding segments as usage grows (see the sketch at the end of this message). It just seems like a pain to have to maintain separate memory segments and remember which items live where.

Just for numbers: I may occasionally have a log that needs to be in the 512MB range (though most of the time it will be in the 50-500K range, occasionally several megabytes), and I can have 4-6 of these going at once (though usually just one or two). On my own computer I increased my maximum swap file size from 3GB to 7GB (so the hard limit is somewhat adjustable), though the change didn't take effect until I restarted the PC. I'm going to be running my programs on several computers, and it seems silly to have to go to that extent on each one.
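Concretely, here is a minimal sketch of the "vector of segments" idea I mean (the GrowingShmPool class, the "_seg_" naming scheme, and the 256 KB segment size are just placeholders for illustration, not tested code): the pool creates one small managed segment up front and only creates another one when an allocation no longer fits, so the committed page-file space tracks actual usage instead of the worst-case M per process.

// Sketch only: grow a family of small managed segments on demand instead of
// pre-committing one segment of size M per process.
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/exceptions.hpp>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

namespace bip = boost::interprocess;

class GrowingShmPool {
public:
    GrowingShmPool(std::string base_name, std::size_t segment_size)
        : base_name_(std::move(base_name)), segment_size_(segment_size)
    {
        add_segment();                    // start with a single small segment
    }

    ~GrowingShmPool() {
        // The owning (writer) process removes the named segments when done.
        for (std::size_t i = 0; i < segments_.size(); ++i)
            bip::shared_memory_object::remove(segment_name(i).c_str());
    }

    // Allocate one data chunk; grow the pool by one segment if the current
    // segment is full. 'bytes' must be somewhat smaller than segment_size_
    // because the segment manager itself uses part of each segment.
    void* allocate_chunk(std::size_t bytes) {
        try {
            return segments_.back()->allocate(bytes);
        } catch (const bip::bad_alloc&) {
            add_segment();                               // commit another segment
            return segments_.back()->allocate(bytes);    // rethrows if still too big
        }
    }

private:
    std::string segment_name(std::size_t i) const {
        return base_name_ + "_seg_" + std::to_string(i);
    }

    void add_segment() {
        segments_.push_back(std::make_unique<bip::managed_shared_memory>(
            bip::open_or_create, segment_name(segments_.size()).c_str(),
            segment_size_));
    }

    std::string base_name_;
    std::size_t segment_size_;
    std::vector<std::unique_ptr<bip::managed_shared_memory>> segments_;
};

int main() {
    GrowingShmPool pool("mylog", 256 * 1024);        // 256 KB per segment
    void* chunk = pool.allocate_chunk(64 * 1024);    // one 64 KB data chunk
    (void)chunk;                                     // fill with log data here
}

The annoying part is still the bookkeeping: a reader process would open the same named segments with open_only, so the index of which chunk lives in which segment has to be stored somewhere the readers can find it (e.g. a small fixed-size control segment) -- which is exactly the kind of thing I'd rather not have to maintain by hand.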