
On Mar 19, 2010, at 11:31 AM, Brett Gmoser wrote:
I have a couple of questions about crash protection, which is very important in this case:

1. I am using only file_lock, because it is guaranteed to be released if the process crashes (if I understand correctly). However, file_lock has its limitations. If I use a named_recursive_mutex instead, is there any way to clear the mutex if the process terminates while holding the mutex?

2. If the mapped file gets corrupted, e.g. if a process gets killed while writing the file, I need to be able to detect that case and rebuild the file, but the file is certain death to any process that touches it, so validating it seems messy. Any suggestions how to detect a bad file?
Andy
Yeah, I think our problems are pretty similar. Unfortunately I haven't gotten much input on it. I posted it before and didn't get any responses. I posted it again because I had seen that Ion was active here recently.
The only input I can offer you (I'm not much of an expert with the Interprocess library) is that I believe that named_recursive_mutex also has filesystem persistence. It creates files in /tmp/boost_interprocess/XXXXX/name, where XXXXX seems to be a hash of the machine's boot time.
Referring to Ion's last reply: I think I will stick with file_lock for now, which I can wrap in my own classes to make an adequate recursive mutex. I have been using the flock-style utilities on OS X and FreeBSD for some time now, and they seem to be dependable. I find myself wondering under what conditions I would ever be comfortable using another lock type. I suppose if I had strict control over, and confidence in, all processes, which is certainly not my current case.

As for data corruption, I was also thinking along the lines of Ion's suggestion:
This depends on your application, and it's similar to when a thread corrupts a data structure used by another thread. You could serialize them and mark the start and end of a modification, so that a thread could check if a previous process/thread has finished the modification. This, obviously, requires collaborative processes.
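A minimal sketch of the start/end marking idea, assuming the marker lives in a small header at the front of the shared region and that every modification is already serialized by an exclusive lock. The names ModificationMarker, begin_modification, end_modification and previous_modification_incomplete are made up for illustration and are not part of Boost.Interprocess.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical header stored at the start of the shared region.
// started != finished means some modification began but never completed.
struct ModificationMarker {
    std::atomic<std::uint64_t> started{0};
    std::atomic<std::uint64_t> finished{0};
};

// Call while holding the exclusive lock, before touching the data.
inline void begin_modification(ModificationMarker& m) {
    m.started.fetch_add(1);
    // Full fence so the data writes that follow are not reordered
    // ahead of the "started" update.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Call while holding the exclusive lock, after the data is consistent again.
inline void end_modification(ModificationMarker& m) {
    // Release store: earlier data writes are visible before "finished" moves.
    m.finished.store(m.started.load(std::memory_order_relaxed),
                     std::memory_order_release);
}

// Call after acquiring the lock, before trusting the data.
inline bool previous_modification_incomplete(const ModificationMarker& m) {
    return m.started.load(std::memory_order_acquire) !=
           m.finished.load(std::memory_order_acquire);
}
```

This only detects a modifier that died between the two calls. It says nothing about which parts of the data are damaged, and it assumes every process follows the same protocol.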
Accepting the constraint that I am trying to protect only against a file writer leaving a corrupt file by terminating abruptly, and given that file writers always have exclusive, serialized access through a file_lock on a lock file: the writer will write a token into the lock file at the start of a write and clear that token upon completion. If either a reader or a writer acquires the file_lock and finds a token already written into the lock file, I assume that a writer failed to complete and the mapped file is suspect.

Seems like a simple and obvious pattern, which I'm surprised I've never used before. I'm happy for suggestions if I've missed something.

Brett, it sounds like our applications are quite similar. Feel free to email me directly if you want to kibitz.

Andy
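The lock-file token pattern described above could look roughly like the following. This is a sketch under assumptions, not a tested design: it uses POSIX flock() as a dependency-free stand-in for boost::interprocess::file_lock, and LockFileGuard, kToken and the single-byte token format are invented for the example.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <stdexcept>
#include <string>

// Hypothetical one-byte token meaning "a write is in progress".
constexpr char kToken = 'W';

// Holds an exclusive flock() on the lock file for its lifetime.
// The kernel drops the lock when the descriptor is closed, which
// also happens if the process dies.
class LockFileGuard {
public:
    explicit LockFileGuard(const std::string& path)
        : fd_(::open(path.c_str(), O_RDWR | O_CREAT, 0644)) {
        if (fd_ < 0) throw std::runtime_error("open failed: " + path);
        if (::flock(fd_, LOCK_EX) != 0) {
            ::close(fd_);
            throw std::runtime_error("flock failed: " + path);
        }
    }
    ~LockFileGuard() { ::close(fd_); }
    LockFileGuard(const LockFileGuard&) = delete;
    LockFileGuard& operator=(const LockFileGuard&) = delete;

    // True if an earlier writer left its token behind, i.e. it
    // never reached clear_token(). The mapped file is then suspect.
    bool token_present() const {
        char c = 0;
        return ::pread(fd_, &c, 1, 0) == 1 && c == kToken;
    }

    // Writer calls this before modifying the mapped file.
    void write_token() {
        if (::pwrite(fd_, &kToken, 1, 0) != 1)
            throw std::runtime_error("pwrite failed");
        ::fsync(fd_);
    }

    // Writer calls this after the mapped file is consistent again.
    void clear_token() {
        if (::ftruncate(fd_, 0) != 0)
            throw std::runtime_error("ftruncate failed");
        ::fsync(fd_);
    }

private:
    int fd_;
};
```

A writer would construct the guard, check token_present() and rebuild the mapped file if it returns true, then call write_token(), modify the file, and call clear_token(). A reader would only construct the guard and check token_present(). Note that flock() locks belong to the open file description, so two guards on the same path in one process would block each other.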