[Interprocess] [file_mapping] is there any way to avoid writing a modified file to disc?

Hi, I'm using file_mapping/mapped_region with Boost 1.45. My idea was to load the memory mapped file in read/write mode and then decrypt the file content in-place (to avoid copying the entire file content). However, this will save a decrypted file :-( Is there any way to avoid the file being saved after in-memory manipulation? -Thorsten

On Thu, Dec 8, 2011 at 12:19 AM, Thorsten Ottosen wrote:
Hi,
I'm using file_mapping/mapped_region with Boost 1.45. My idea was to load the memory mapped file in read/write mode and then decrypt the file content in-place (to avoid copying the entire file content). However, this will save a decrypted file :-(
Is there any way to avoid the file being saved after in-memory manipulation?
In POSIX there's a way to say that a read/write region can be declared private -- changes to it would not be propagated to the underlying file. I don't see a way from the docs of providing the MAP_PRIVATE flag to the call to mmap. Not sure if there's a similar concept in Windows. Cheers -- Dean Michael Berris http://goo.gl/CKCJX

On Thu, Dec 8, 2011 at 12:29 AM, Dean Michael Berris wrote:
On Thu, Dec 8, 2011 at 12:19 AM, Thorsten Ottosen wrote:
Hi,
I'm using file_mapping/mapped_region with Boost 1.45. My idea was to load the memory mapped file in read/write mode and then decrypt the file content in-place (to avoid copying the entire file content). However, this will save a decrypted file :-(
Is there any way to avoid the file being saved after in-memory manipulation?
In POSIX there's a way to say that a read/write region can be declared private -- changes to it would not be propagated to the underlying file. I don't see a way from the docs of providing the MAP_PRIVATE flag to the call to mmap.
Not sure if there's a similar concept in Windows.
Actually, I just checked the docs: it seems you want the 'copy_on_write' mode which combines PROT_WRITE|PROT_READ and MAP_PRIVATE. This allows you to make changes in place to the memory region without having the changes reflected to the original file. HTH Cheers -- Dean Michael Berris http://goo.gl/CKCJX
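For reference, the POSIX behaviour described above can be sketched directly with mmap. This is only a minimal sketch, assuming a POSIX system: the helper names write_file, read_file and xor_through_private_mapping are made up for illustration, and the XOR stands in for a real decryption routine.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

// Hypothetical helper: (re)create `path` with the given contents.
void write_file(const std::string& path, const std::string& contents) {
    std::ofstream f(path.c_str(), std::ios::binary | std::ios::trunc);
    f.write(contents.data(), static_cast<std::streamsize>(contents.size()));
}

// Hypothetical helper: read the whole file back from disk.
std::string read_file(const std::string& path) {
    std::ifstream f(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(f),
                       std::istreambuf_iterator<char>());
}

// Maps `path` with PROT_READ|PROT_WRITE and MAP_PRIVATE, XORs every byte
// in the mapping with `key` (a stand-in for decryption), and returns the
// modified bytes. Returns an empty string on any error or for an empty file.
std::string xor_through_private_mapping(const std::string& path,
                                        unsigned char key) {
    const int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) return std::string();

    struct stat st;
    if (::fstat(fd, &st) != 0 || st.st_size == 0) {
        ::close(fd);
        return std::string();
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);

    // A private mapping only needs read access to the file itself.
    void* p = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return std::string();

    unsigned char* bytes = static_cast<unsigned char*>(p);
    for (std::size_t i = 0; i < size; ++i) bytes[i] ^= key;

    std::string result(reinterpret_cast<const char*>(bytes), size);
    ::munmap(p, size);  // modified pages are discarded, not written back
    return result;
}
```

Called on a file containing "secret" with key 0x20, the function returns "SECRET" while the file on disk still reads "secret".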

On 07-12-2011 14:44, Dean Michael Berris wrote:
Is there any way to avoid the file being saved after in-memory manipulation?
In POSIX there's a way to say that a read/write region can be declared private -- changes to it would not be propagated to the underlying file. I don't see a way from the docs of providing the MAP_PRIVATE flag to the call to mmap.
Not sure if there's a similar concept in Windows.
Actually, I just checked the docs: it seems you want the 'copy_on_write' mode which combines PROT_WRITE|PROT_READ and MAP_PRIVATE. This allows you to make changes in place to the memory region without having the changes reflected to the original file.
Thanks. I overlooked that functionality. Anyway, I thought I had to use managed_mapped_file with mapped_region just like for file_mapping, but this doesn't work (compile error due to missing function). Using managed_mapped_file alone segfaults. Maybe this is something that is fixed in a later version than 1.45? -Thorsten

On 07/12/2011 14:44, Dean Michael Berris wrote:
Actually, I just checked the docs: it seems you want the 'copy_on_write' mode which combines PROT_WRITE|PROT_READ and MAP_PRIVATE. This allows you to make changes in place to the memory region without having the changes reflected to the original file.
Yes, copy_on_write should work. When a page is modified the OS creates a new page for you, but no other process sees your changes (well, I guess they can end up in the page file). Best, Ion

On 07-12-2011 16:13, Ion Gaztañaga wrote:
On 07/12/2011 14:44, Dean Michael Berris wrote:
Actually, I just checked the docs: it seems you want the 'copy_on_write' mode which combines PROT_WRITE|PROT_READ and MAP_PRIVATE. This allows you to make changes in place to the memory region without having the changes reflected to the original file.
Yes, copy_on_write should work. When a page is modified the OS creates a new page for you, but no other process sees your changes (well, I guess they can end up in the page file).
Well, at least with 1.45 I cannot get it to work.
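I also wonder whether it's actually any faster, since I need to change every single byte in the memory region (I'm decrypting).
My understanding of these OS features is weak, but my mental model of it is that
//Create a file mapping
file_mapping m_file( file.c_str(), boost::interprocess::read_only );
//Map the whole file with read-write permissions in this process
mapped_region region( m_file, boost::interprocess::read_only );
//Get the address of the mapped region
const char* addr = static_cast<const char*>(region.get_address());
const size_t size = static_cast<int>(region.get_size());
makes the whole file available as one big memory segment.
It seems to me that the OS must do something to write changed data to disc (that is, there is an implicit flush() in one of the destructors). Hence I assumed that it would be possible to avoid this flush, leaving the file intact.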

On 07/12/2011 16:21, Thorsten Ottosen wrote:
On 07-12-2011 16:13, Ion Gaztañaga wrote:
On 07/12/2011 14:44, Dean Michael Berris wrote:
Actually, I just checked the docs: it seems you want the 'copy_on_write' mode which combines PROT_WRITE|PROT_READ and MAP_PRIVATE. This allows you to make changes in place to the memory region without having the changes reflected to the original file.
Yes, copy_on_write should work. When a page is modified the OS creates a new page for you, but no other process sees your changes (well, I guess they can end up in the page file).
Well, at least with 1.45 I cannot get it to work.
Can you send me an example? (Are you using managed_mapped_file or mapped_region?) Maybe the bug is still there.
I also wonder whether it's actually any faster, since I need to change every single byte in the memory region (I'm decrypting).
My understanding of these OS features is weak, but my mental model of it is that
//Create a file mapping
file_mapping m_file( file.c_str(), boost::interprocess::read_only );
//Map the whole file with read-write permissions in this process
mapped_region region( m_file, boost::interprocess::read_only );
//Get the address of the mapped region
const char* addr = static_cast<const char*>(region.get_address());
const size_t size = static_cast<int>(region.get_size());
makes the whole file available as one big memory segment.
Yes, but in this case modifying the data should segfault (you are mapping the memory as read-only, so trying to write to it will trigger a fault in the MMU). You can do this (at least this works in the latest version):
#include
It seems to me that the OS must do something to write changed data to disc (that is, there is an implicit flush() in one of the destructors). Hence I assumed that it would be possible to avoid this flush, leaving the file intact.
In theory, when mapping something copy on write, the OS will discard modified pages. The original file/shared memory should never be modified. Ion

I also wonder whether it's actually any faster, since I need to change every single byte in the memory region (I'm decrypting).
My understanding of these OS features is weak, but my mental model of it is that ...
I'm by no means an expert on this either, but my understanding is that using memory mapped files is faster, especially when you need random access to the file, and it allows you to have a logical view of the file greater than what you could normally fit into memory. Even sequential read access may be quicker, as it should prevent any double buffering. In the normal case, I would expect that memory writes would be slower, since it has to either write the data to the file or cache it for later writes.
However, I think for what you're doing you may be better off reading the file into normal memory and manipulating it there. Even with copy on write, you're essentially forcing the OS (through a write-protected memory exception) to copy the file into memory anyway.
-- Bill
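The "read it into normal memory" alternative could look roughly like the following. It is a sketch only: load_file and xor_in_place are hypothetical names, and the XOR again stands in for the actual decryption.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read the whole file into an ordinary heap buffer.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> load_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Placeholder for decryption: modifies the buffer in place. Nothing here
// touches the file, so the data on disk stays as it was.
void xor_in_place(std::vector<unsigned char>& buffer, unsigned char key) {
    for (unsigned char& byte : buffer) byte ^= key;
}
```

Since every byte gets modified anyway, this makes the single copy explicit up front, whereas a copy-on-write mapping would copy each page at the first write to it.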

On 07-12-2011 19:37, Bill Buklis wrote:
I also wonder whether it's actually any faster, since I need to change every single byte in the memory region (I'm decrypting).
writes. However, I think for what you're doing you may be better off reading the file into normal memory and manipulating it there. Even with copy on write, you're essentially forcing the OS (through a write protected memory exception) to copy the file into memory anyway.
Ok. Thanks! That's what I'm doing now. -Thorsten

On 07-12-2011 17:43, Ion Gaztañaga wrote:
On 07/12/2011 16:21, Thorsten Ottosen wrote:
On 07-12-2011 16:13, Ion Gaztañaga wrote:
On 07/12/2011 14:44, Dean Michael Berris wrote:
Actually, I just checked the docs: it seems you want the 'copy_on_write' mode which combines PROT_WRITE|PROT_READ and MAP_PRIVATE. This allows you to make changes in place to the memory region without having the changes reflected to the original file.
Yes, copy_on_write should work. When a page is modified the OS creates a new page for you, but no other process sees your changes (well, I guess they can end up in the page file).
Well, at least with 1.45 I cannot get it to work.
Can you send me an example? (Are you using managed_mapped_file or mapped_region?) Maybe the bug is still there.
Well, the problem was that I used managed_mapped_file as I would use file_mapping. I did not, at first, realize that I would have to use managed_mapped_file alone. So when I used managed_mapped_file alone with copy_on_write, I got a segfault. I assumed that managed_mapped_file automatically mapped the whole file, like mapped_region can do.
I also wonder whether it's actually any faster, since I need to change every single byte in the memory region (I'm decrypting).
My understanding of these OS features is weak, but my mental model of it is that
//Create a file mapping
file_mapping m_file( file.c_str(), boost::interprocess::read_only );
//Map the whole file with read-write permissions in this process
mapped_region region( m_file, boost::interprocess::read_only );
//Get the address of the mapped region
const char* addr = static_cast<const char*>(region.get_address());
const size_t size = static_cast<int>(region.get_size());
makes the whole file available as one big memory segment.
Yes, but in this case modifying the data should segfault (you are mapping the memory as read-only, trying to write it will make your MMU act). You can do this (at least this works in the latest version):
Yes, so then I changed the mode to read_write, and everything worked, except my data was modified on disc.
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <cstddef> //std::size_t
#include <cassert>
#include <fstream>
#include <cstring> //memset
#include <cstdio> //remove
static char zeros [256];
int main()
{
   using namespace boost::interprocess;
   //Create a file with zeros
   std::remove("myfile");
   {
      std::ofstream f("myfile", std::ios_base::binary);
      f.write(&zeros[0], sizeof(zeros));
   }
   {
      file_mapping fmapping("myfile", read_only);
      mapped_region mregion(fmapping, copy_on_write);
Ok, I did not realize that I could use copy_on_write with mapped_region. I think that is not clear from the docs. Anyway, since I need to modify every single byte, I fail to see how this is faster. I should make a speed comparison, but I don't have time right now.
It seems to me that the OS must do something to write changed data to disc (that is, there is an implicit flush() in one of the destructors). Hence I assumed that it would be possible to avoid this flush, leaving the file intact.
In theory, when mapping something copy on write, the OS will discard modified pages. The original file/shared memory should never be modified.
Ok. Do you think it's faster than copying the whole mapped_region into a vector and doing the modification there? -Thorsten
participants (4)
- Bill Buklis
- Dean Michael Berris
- Ion Gaztañaga
- Thorsten Ottosen