boost::interprocess shared memory performance

27 Dec 2008

      This is my first experience with using shared memory for anything more  
than trivial IPC. Thanks again to Ion Gaztañaga for getting the  
library working on FreeBSD. I'm developing initially on OS-X 10.5.6  
with boost_trunk, but the eventual target platform is FreeBSD--I'm  
just an old mac programmer used to cushy development tools.

I originally used managed_mapped_file, and got everything working, but  
I was disappointed by performance. Profiling with shark, I found that  
the application was spending a lot of time in msync, which according  
to man(2) is used to synchronize mapped memory with the filesystem. It  
makes sense to me that the interprocess containers I was creating  
could have a lot of file I/O overhead, and managed_mapped_file was  
probably not a good choice.

So I rewrote the code to use managed_shared_memory instead of  
managed_mapped_file, thinking that it would eliminate the file I/O and  
therefore be faster. However, I am surprised that it is not much  
faster, and when I profile with Shark on os-x I see that it is still  
spending a lot of time in msync, specifically whenever a  
managed_shared_memory object is destroyed.
(in boost::interprocess::mapped_region::flush(unsigned long, unsigned  
long),  which is called within basic_managed_shared_memory's destructor)

Does managed_shared_memory really need to call msync?

I see that I should optimize my code to cache the  
managed_shared_memory objects so that fewer create/deletes are  
necessary, but this is still going to happen fairly frequently and I  
wonder if this expensive msync call is necessary.

In the tradition of coder forums everywhere, someone will probably ask  
what I'm trying to accomplish and whether there may be a better way.  
Suggestions welcome.
I'm writing a little cgi driven database utility that queries data  
stored in a filesystem directory using a simple query language. I  
would like to keep indexes of the data to speed query resolution. The  
utility is old-school cgi, so all its resources (such as indexes) have  
to be instantiated into memory each time the cgi process is started. I  
could write indexes to files, but then I incur a de/serialization  
overhead that is expensive. My intention was to keep the indexes as  
ready-to-use interprocess::maps in shared memory, to be used by all  
invocations of the cgi. It works, but the performance of the shared  
memory is poor enough that I'm not getting much increase over just  
doing a brute force search through the datafiles.

All suggestions appreciated!

Andy

Andy Wiese

Ion Gaztañaga

Zeljko Vrba

Andy Wiese

Zeljko Vrba

Lothar Werzinger

tags

participants (4)