[interprocess] Fault recovery in managed mapped file

I have found that managed mapped file can get stuck in a spinlock if the file is not closed and fully flushed to disk. For example, if power is pulled from the computer while a segment is open and before the first page in the segment has been committed. In this case it is common for a journaled filesystem to preserve the fact that the file was created, but to lose its contents, so the file now appears zeroed out. I have observed this behaviour on Linux systems running ext4, for example.

This is the loop where we get stuck, in managed_open_or_create_impl::priv_open_or_create:

   while(value == InitializingSegment || value == UninitializedSegment){
      detail::thread_yield();
      value = detail::atomic_read32(patomic_word);
   }

At this point I have opened the file in open_only mode, and *patomic_word is 0 (UninitializedSegment). The code appears to be waiting for some other process or thread to initialize the segment, but in fact there is no such process.

Some possible solutions:

If, at this point, we have opened a file rather than created it, why wait for an UninitializedSegment to change state? If the segment is uninitialized here, simply throw an error. Make it the caller's responsibility to ensure the segment is created and initialized before it is opened in read-only mode.

Perhaps, if you want to allow multiple processes to open and create simultaneously without any additional synchronization mechanism, you could accomplish that by adding a count of open mappings into the shared segment. If the reference count is 1 at this point, don't attempt this spinlock, because the state of the file is never going to change; instead, throw if *patomic_word != InitializedSegment.

-- KEVIN ARUNSKI

On 17/11/2010 15:14, Kevin Arunski wrote:
> I have found that managed mapped file can get stuck in a spinlock if the file is not closed and fully flushed to disk. For example, if power is pulled from the computer while a segment is open and before the first page in the segment has been committed. In this case it is common for a journaled filesystem to preserve the fact that the file was created, but to lose its contents, so the file now appears zeroed out. I have observed this behaviour on Linux systems running ext4, for example.
>
> Some possible solutions:
>
> If, at this point, we have opened a file rather than created it, why wait for an UninitializedSegment to change state? If the segment is uninitialized here, simply throw an error. Make it the caller's responsibility to ensure the segment is created and initialized before it is opened in read-only mode.

The reason is to support simultaneous open and create, as you indicate below.

> Perhaps, if you want to allow multiple processes to open and create simultaneously without any additional synchronization mechanism, you could accomplish that by adding a count of open mappings into the shared segment. If the reference count is 1 at this point, don't attempt this spinlock, because the state of the file is never going to change; instead, throw if *patomic_word != InitializedSegment.

A count does not work, because if a process dies you end up with a wrong count. If you need to commit the first page to guard against power failures, call flush() just after creating the managed segment. In any case, trying to use a mapped file after a hard shutdown admits no sensible recovery: you don't know which parts of the file the OS has committed, and the internal data structures might be completely corrupted.

Best,
Ion

On Nov 18, 2010, at 3:07 PM, Ion Gaztañaga wrote:
> On 17/11/2010 15:14, Kevin Arunski wrote:
>> I have found that managed mapped file can get stuck in a spinlock if the file is not closed and fully flushed to disk. For example, if power is pulled from the computer while a segment is open and before the first page in the segment has been committed. In this case it is common for a journaled filesystem to preserve the fact that the file was created, but to lose its contents, so the file now appears zeroed out. I have observed this behaviour on Linux systems running ext4, for example.
>>
>> Some possible solutions:
>>
>> If, at this point, we have opened a file rather than created it, why wait for an UninitializedSegment to change state? If the segment is uninitialized here, simply throw an error. Make it the caller's responsibility to ensure the segment is created and initialized before it is opened in read-only mode.
>
> The reason is to support simultaneous open and create, as you indicate below.
>
>> Perhaps, if you want to allow multiple processes to open and create simultaneously without any additional synchronization mechanism, you could accomplish that by adding a count of open mappings into the shared segment. If the reference count is 1 at this point, don't attempt this spinlock, because the state of the file is never going to change; instead, throw if *patomic_word != InitializedSegment.
>
> A count does not work, because if a process dies you end up with a wrong count. If you need to commit the first page to guard against power failures, call flush() just after creating the managed segment.
Understood. I have been using flush() to commit managed file segments, and indeed that works fine. The problem comes when the crash occurs between opening the file and calling flush(). I could move the flush() earlier in the process to reduce the chance of this situation, but even if that allows the open to proceed, how much can I tell about the file, given that changes were made after the flush? If, for example, I wanted to set a dirty flag within the segment itself, wouldn't I run the risk that the allocation structures within the segment are corrupt, leaving me unable to find the offset of my flag?
> Anyway, trying to use a mapped file after a hard shutdown has no sensible recovery; you don't know which parts of the file the OS has committed, and the internal data structure might be absolutely corrupted.
Indeed, I do not want to use the corrupted file at all, but I have no way to tell whether the file is corrupted or OK. If I try to open the segment read-only and examine it, I get stuck in the loop with no way to detect the failure. This is the problem I am seeking a solution to. From looking at the code it appears that if, for whatever reason, the first 32 bits of the file are 0 and the file is opened read-only, then I am stuck.

I was able to solve the issue for my purposes with this change:

   diff -r boostb/interprocess/detail/managed_open_or_create_impl.hpp boosta/interprocess/detail/managed_open_or_create_impl.hpp
   353c353,358
   <    while(value == InitializingSegment || value == UninitializedSegment){
   ---
   >    if (value == UninitializedSegment) {
   >       throw interprocess_exception(error_info(corrupted_error));
   >    }
   >    while(value == InitializingSegment){
But, as you can see, if the user intends to use simultaneous open-and-create as a synchronization mechanism, it will fail. This is OK for me because I already have synchronization elsewhere in my code that prevents that scenario. Perhaps, rather than spinning indefinitely, there could be a timeout or some other limit on how long the open function will wait for the file to become initialized? I assume from the fact that you chose a spinlock that you didn't intend for the user to wait indefinitely.

KEVIN ARUNSKI
participants (2)
- Ion Gaztañaga
- Kevin Arunski