
Hello everyone,

for quite some time I've been working on a portable mmap/virtual memory library and I've finally found enough time to bring it out of the 'my internal little tool' state into something presentable: https://github.com/psiha/mmap (C++14 currently).
I would now kindly ask fellow devs for opinions (on the good, the bad, the ugly and the future;)
First let me answer the basic questions (i.e. on motivation and scope)...

Q&A:

(1) Why? Considering Boost already offers two related solutions
* Interprocess: http://www.boost.org/doc/libs/release/doc/html/interprocess/sharedmemorybetw...
* Iostreams: http://www.boost.org/doc/libs/release/libs/iostreams/doc/classes/mapped_file...
why do we want a new one?

Even if the two mentioned solutions were adequate, the problem domain would still merit a separate, dedicated library merely considering its complexity (which will become apparent in later points). (I suspect Ion would agree here considering he actually authored a related standardization proposal:)

(2) Why a completely new library instead of a repackaging of existing functionality?

I'm not satisfied with (a) the API semantics/design, (b) the API 'power'/library capabilities and (c) the implementation efficiency/overhead.

a) Insistence on POSIX semantics: that shared memory objects have to be persistent (kernel lifetime) and resizable while not everyone wants or needs this (https://svn.boost.org/trac/boost/ticket/4827). This has implications both:
- in the interface (requiring manual cleanup guards, additional platform specific shm types like windows_shared_memory)
- and in the overhead of the library (emulating shm with mapped files on major platforms that do not offer full POSIX compliance: Windows, OSX, iOS and Android)

MMAP solves this with one class template 'named_memory' and policies: https://github.com/psiha/mmap/blob/master/include/boost/mmap/mappable_object...
so the user can choose: named_memory<scoped, resizable>, named_memory<persistent, fixed> or any other combination and the library chooses the best implementation. On POSIX, scoped semantics are achieved with a SEM_UNDO-enabled SysV semaphore acting as a kernel/system-global reference counter (which is then also used for automatic cleanup of shm zombies left by killed/crashed processes). On Windows the user can choose the Win32 backend (file emulation) or the NativeNT (not yet finished/committed) backend (with native persistence and resizability).

b)
- lack of a meta layer (e.g. so one can ask is_mappable<FILE *> or is_mappable<boost::filesystem::path>, is_mappable<HANDLE> etc...)
- lack of related utility functions, for example:
-- map_read_only_file( path ), which will open the file for reading, query its size, create a read only file mapping and map/return a read only view of the file
-- guarded_operation( view, operation, error_handler ) - execute operation wrapped in SEH (Windows)/scoped signal handler (POSIX) guards that will catch access violations (e.g. when mmapping network files) and gracefully execute error_handler (with the faulting address as the parameter) https://github.com/psiha/mmap/blob/master/include/boost/mmap/mapped_view/gua...
- in MMAP the mapped_view object (which should probably be renamed to just 'view' considering the enclosing mmap namespace) is just a RAII wrapper around an iterator_range, thereby providing the standard begin/end(), front/back(), operator[], etc. interface (as opposed to a get_data(), get_size() like interface)
- general (not specific to mapping) virtual memory functionality (to be discussed in a separate point, MMAP also completely lacks this currently)
- MMAP offers a comprehensive 'flags' system (e.g.
https://github.com/psiha/mmap/blob/master/include/boost/mmap/flags/flags.hpp) for everything from object-level and system-level access privileges, over object parent-child inheritance to system life-time and access-pattern optimisation hints. Quite some time was invested in this area to produce a normalized interface that works for all objects (files, mappings...) while producing (near)zero adjustment codegen. Flags are 'packed'/grouped in structs (e.g. struct access_privileges with object_access, child_access and system_access members) with public members so that, after flags are created with a factory function (a portable API), the flags can be further tweaked for a specific platform (e.g. adding some FreeBSD specific mmap hint which isn't covered by the portable/documented MMAP API)

c) It is my view (if not a "self evident truth";D) that libraries that merely wrap existing low level functionality (such as OS or CRT APIs) should allow you to write code that is safe, portable and looks reasonably nice while at the same time incurring (near)zero overhead (i.e. with a reasonably intelligent compiler, producing codegen that looks nearly the same as it would had one used the underlying API directly) - and the existing solutions fail that. In the two related tickets I went into more detail on this so I'll avoid spamming this post by repeating those objections and analysis.
Rather I'll present a trivial example that demonstrates what wannabeboost::mmap currently produces with a decent compiler, https://gist.github.com/psiha/c0823fefc01fa3b39662:

    #define BOOST_MMAP_HEADER_ONLY
    #include <boost/mmap/mapped_view/mapped_view.hpp>
    #include <boost/mmap/mappable_objects/file/utility.hpp>

    int main( int /*argc*/, char * /*argv*/[] ) noexcept
    {
        auto maybe_foo_view( boost::mmap::map_read_only_file( "foo" )() );
        if ( !maybe_foo_view )
            return static_cast<int>( maybe_foo_view.error() );
        if ( maybe_foo_view->empty() )
            return -1;
        return (*maybe_foo_view)[ 0 ];
    }

with Xcode 7.2.1 Clang -O3 build for x64 produces https://gist.github.com/psiha/f92a7b8a93c5ce1736ae (notice how the error handling is also correctly detected as such and placed at the end of the function, after the main return...)

Besides not using shared_ptr pimpls or saving paths in std::strings as a 'nice to have' (YAGNI!:), part of the way this codegen is achieved is through the use of (also wannabe) Boost.Err (https://github.com/psiha/err) which makes it possible to avoid EH (in such simple/'localised' examples). It is also the reason for the somewhat awkward (optional<>-like) syntax:
- map_read_only_file() returns a mmap::fallible_result<mmap::mapped_view> (an alias for err::fallible_result<mmap::mapped_view, mmap::error>, an "rvalue-only" type)
- which is converted/'saved'/'pinned' into a result_or_error (the maybe_foo_view variable) with the additional operator() call
- the !maybe_foo_view checks whether the call succeeded or the returned object contains an error
- if error return the error (errno) code
- else check if the view/file is empty (optional<>-like syntax for accessing the contained object through the -> and * operators)
- else return the value of the first character in the file.
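For readers without the library at hand, the control flow in the list above can be approximated with a small standalone type. This is only a sketch and not the Boost.Err API: the names result_or_error_sketch and first_byte_or_error, the use of a plain int as an errno-style code, and std::string standing in for a mapped view are all assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a result_or_error-like type (NOT the Boost.Err API).
// It holds either a value or an errno-style error code; for simplicity T must be
// default constructible.
template <typename T>
class result_or_error_sketch
{
public:
    static result_or_error_sketch success( T value ) { return result_or_error_sketch( std::move( value ), 0, true ); }
    static result_or_error_sketch failure( int const code ) { return result_or_error_sketch( T(), code, false ); }

    // Enables the 'if ( !maybe_view )' check.
    explicit operator bool() const noexcept { return succeeded_; }

    int error() const noexcept { assert( !succeeded_ ); return error_; }

    // optional<>-like access to the contained object.
    T const & operator* () const noexcept { assert( succeeded_ ); return value_; }
    T const * operator->() const noexcept { assert( succeeded_ ); return &value_; }

private:
    result_or_error_sketch( T value, int const code, bool const succeeded )
        : value_( std::move( value ) ), error_( code ), succeeded_( succeeded ) {}

    T    value_;
    int  error_;
    bool succeeded_;
};

// Mirrors the three branches of the example: error code, -1 for empty, else the first byte.
int first_byte_or_error( result_or_error_sketch<std::string> const & maybe_view )
{
    if ( !maybe_view )
        return maybe_view.error();
    if ( maybe_view->empty() )
        return -1;
    return static_cast<int>( (*maybe_view)[ 0 ] );
}
```

Calling first_byte_or_error() with a failure( code ) result yields the code, with an empty string yields -1, and otherwise yields the first character, all without any exception handling on the caller's side.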
Boost.Err was recently discussed on this list so I'll skip most of that now; let me just say that it also supports the classic EH coding style (it auto adapts, no need for reconfiguration, macros or anything like that), i.e. the above code can be rewritten as:

    int main( int /*argc*/, char * /*argv*/[] ) noexcept
    {
        using namespace boost::mmap;
        try
        {
            read_only_mapped_view const foo_view( map_read_only_file( "foo" ) );
            if ( foo_view.empty() )
                return -1;
            return foo_view[ 0 ];
        }
        catch ( std::runtime_error const & )
        {
            return error::get();
        }
    }

...now let me reverse the Q&A direction ;)

3. Library scope: currently the library covers the topic of memory mapping (of filesystem objects and virtual memory), however I think that 'the final' library should cover all reasonably portable aspects of virtual memory (and be called something like vm with mmap as a nested namespace), tackling thingies such as:
- process working set
- portable low memory event handling, madvise, memfd...
- prefetching, locking/unlocking virtual memory to/from physical memory (e.g. for realtime sensitive data)
- allocators capable of contiguous resizing (for implementing realloc or vector.resize() that does no copying or moving, simply maps new pages at the end of the current allocation)
- https://fgiesen.wordpress.com/2012/07/21/the-magic-ring-buffer
... does anyone have an objection/different approach to this?

4. Windows offers/has/uses the concept of an intermediate "mapping" object (i.e. you don't create a mapped view of a file, rather you create a 'mapping' of a file and then a view of the mapping). This of course complicates things but also gives more power, e.g. you can have a r/w file open and create a mapping (or several mappings) of it that only covers a part of its size and has stricter (e.g. read only at object/process level) or wider access privileges (e.g. on the system level, only the parent process/user can access the file but the mapping is accessible by everyone).
Boost.MMAP retains this distinction in its API (mapping vs mapped_view classes)...comments/thoughts on this please?

5. The above (Windows specific) 'mapping' concept makes it possible to create "named file mappings" (so that the object gets a system global name, like a shared memory object) which another process can then open by name, erasing the difference between file mappings and vm mappings for client processes (kind of like the interface vs hidden implementation distinction). MMAP uses this on Windows for file-backed/'emulated' shared memory (e.g. that which needs persistence and/or resizability) - it is created named so that client processes can open and access it as 'normal'/native Windows shared memory. It might actually be possible to make this at least partially portable to POSIX systems that use virtual filesystems for implementing shared memory (e.g. Linux which uses /dev/shm) where we could symlink the file to the shm filesystem directory...(unfortunately I have no way of testing this as I have no Linux machine set up, I develop only for Windows, OSX, iOS and Android)...does this make any sense/would it be worth the hassle (i.e. ease any real world problems)?

6. The security part of the 'flags system' models the POSIX API: for named (system level) created objects you specify the permissions for the 'user', 'group' and 'world'. 'Behind the scenes', on Windows, 'user' maps to the user that created the process, 'group' maps to all groups that the 'user' belongs to and 'world' maps to the Everyone group..? When not using the predefined privilege/permission levels [e.g.
process_default, unrestricted, nix_default (644)] things are currently pretty verbose here:

    namespace flags = boost::mmap::flags;
    using ap = flags::access_privileges;
    auto const default_privileges
    (
        ap::system::user ( ap::all  ) |
        ap::system::group( ap::read ) |
        ap::system::world( ap::read )
    );

I have a prototype implementation that shortens this with user defined literals, so that one can write something like "rwxr-xr--"_perm...All in all, this thing looks like it merits a separate library in its own right...

7. Windows has another quirk up its sleeve: global vs local session objects - if you want to create a shared memory object visible across terminal sessions (this includes a server process running as a service, thereby in session 0, and a client process created by a logged on user, or/also using the NativeNT backend/API to create a native persistent shared memory object which requires that it be created in the global session '0') you have to prefix its name with "Global\" (and have admin privileges)... I'm still pondering how to model this in the API or whether it should be automatically handled at all, e.g. whether to deduce if the Global\ prefix should be automatically added (based on the process privileges, type and access privileges of the shm object being created etc.) or leave it up to the user to know "if I'm running on Windows and want to do this and this I have to use the Global\ prefix"...

There's more but the opening post is already too long so I'll stop...for now ;)

--
"What Huxley teaches is that in the age of advanced technology, spiritual devastation is more likely to come from an enemy with a smiling face than from one whose countenance exudes suspicion and hate." Neil Postman
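The "rwxr-xr--"_perm idea from point 6 can be sketched in plain C++14 as follows. The prototype mentioned in the post is not shown anywhere in the thread, so everything here is an assumption: the parse_permissions helper is hypothetical, and producing POSIX-style octal mode bits (rather than the library's own flag types) is chosen only to keep the example self-contained.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical sketch (not the author's prototype): parse exactly nine
// characters of the form "rwxr-xr--" into POSIX-style permission bits.
constexpr unsigned parse_permissions( char const * const text, std::size_t const length )
{
    if ( length != 9 )
        throw std::invalid_argument( "expected exactly 9 permission characters" );
    unsigned mode = 0;
    for ( std::size_t i = 0; i < 9; ++i )
    {
        char const expected = "rwx"[ i % 3 ]; // position decides which letter is legal
        char const actual   = text[ i ];
        if ( actual != expected && actual != '-' )
            throw std::invalid_argument( "unexpected permission character" );
        mode = ( mode << 1 ) | ( actual == expected ? 1u : 0u );
    }
    return mode;
}

constexpr unsigned operator""_perm( char const * const text, std::size_t const length )
{
    return parse_permissions( text, length );
}

// Malformed literals used in constant expressions are rejected at compile time.
static_assert( "rwxr-xr--"_perm == 0754, "user: all, group: read+execute, world: read" );
static_assert( "rw-r--r--"_perm == 0644, "matches the nix_default (644) level" );
```

Because the literal operator is constexpr, a typo such as "rwxr-xr-z"_perm in a constant expression fails to compile instead of failing at run time, which is one argument for the literal over a runtime string parser.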

Domagoj Saric <dsaritz <at> gmail.com> writes:
> Hello everyone,
>
> for quite some time I've been working on a portable mmap/virtual memory
> library and I've finally found enough time to bring it out of the 'my
> internal little tool' state into something presentable:
> https://github.com/psiha/mmap (C++14 currently).
> I would now kindly ask fellow devs for opinions (on the good, the bad,
> the ugly and the future;)
> First let me answer the basic questions (i.e. on motivation and scope)...

I'm interested in the general idea of controlling virtual memory explicitly and portably.

From a quick look at the code, it seems to deal with the following aspects:
- map a file (not sure if anonymous is well supported)
- set protection/access privileges
- set mapping sharing properties

I'd like to have access to the following extended functionality:
- map at a fixed address
- actually allocate/free the pages
- map the same hardware page to multiple places in virtual memory without a file

Hi Domagoj and Mathias.
> for quite some time I've been working on a portable mmap/virtual memory
I have been using Boost.Interprocess for handling large genetic data files in the proposed Boost.Genetics. It is intended for sharing memory images between processes, but works just as well as a portable mmap wrapper, including as a block allocator.

The only shame is that it is not 100% header-based and so needs binaries for different platforms.

mmap is the only sensible way of working with large in-memory datasets as it can exceed the swap file size.

Andy.

On 4.3.2016. 1:18, Andy Thomason wrote:
> Hi Domagoj and Mathias.
>> for quite some time I've been working on a portable mmap/virtual memory
> I have been using Boost.Interprocess for handling large genetic data files in the proposed Boost.Genetics, it is intended for sharing memory images between processes, but works just as well as a portable mmap wrapper including as a block allocator.
And the "portable mmap wrapper" and "block allocator" are IMO two separate things and belong in separate libraries, as mmap-ing is not essentially an IPC thing. This is one of the challenges in making a VM/MMAP library - finding the clear API boundary where the library should stop and Interprocess should then build upon it.
> The only shame is that it is not 100% header-based and so needs binaries for different platforms.
But Interprocess is a header only library?
> mmap is the only sensible way of working with large in-memory datasets as it can exceed the swap file size.
Not sure what you mean by 'it can exceed swap file size'. A mapping can be larger than the available RAM+swap only if you use overcommit and/or 'uncommitted'/'unreserved' mappings (and thus risk AV/SIGSEGV crashes). From your description I gather that you need/use something like a scratch disk/file(s) (which can be useful on systems with a dedicated but 'not big enough' swap partition). IMO this is not something client code should worry about, i.e. it should be abstracted by the VM library: you would specify ('reserve') how much scratch space you need and the library would see if it can use the paging file as a scratch file (which is what it essentially is) or look for a disk/partition with more space (and it would choose appropriate mapping and file creation flags/system hints optimal for scratch storage)...

And this brings me to another point not covered in the opening post:

9. Resizable views (currently a todo item)
Besides the already mentioned resizable ('ftruncatable') mapping objects, we can also have resizable views of those mappings. For example, one might wish to 'walk' a file (possibly in a random fashion) in chunks (e.g. files that contain 'ready to use data', such as uncompressed audio or video files). The problem is what API would be an 'overall best' for such use cases. mapped_view objects are currently thin RAII wrappers around an iterator_range and iterator_ranges have the handy advance_begin() and advance_end() member functions (which are hidden by the non resizable mapped_view class). These however move in 'element'-sized chunks (bytes for char views) which does not seem handy or efficient for mapped_views: you'd usually want to either resize the view or advance in page-size or mapped_view.size() chunks.
Perhaps I can, for the future resizable_mapped_view class, retain the advance_begin/end() interface and add
- advance() (which does a simultaneous begin/end advance)
- increment/decrement operators which hop in chunks equal to the current size of the view
- a seek() function that can do an absolute jump and resizing in one call.

An additional question is should resizable_mapped_views hold references to their parent mapping objects and resize them as needed (or fail if one tries to grow them beyond the size of the mapped file/shared memory object)?
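To make the proposed interface easier to discuss, here is a rough sketch of how advance(), the increment operator and seek() might behave. It is not library code: the class name resizable_view_sketch is hypothetical, a plain byte buffer stands in for the parent mapping so no actual remapping happens, and the 'fail rather than grow the parent' behaviour is just one of the two options raised in the question above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the discussed interface; a byte buffer plays the role
// of the parent mapping. Invariant: offset_ + size_ <= parent_size_.
class resizable_view_sketch
{
public:
    resizable_view_sketch( char const * const parent, std::size_t const parent_size, std::size_t const size )
        : parent_( parent ), parent_size_( parent_size ), offset_( 0 ), size_( size )
    { assert( size <= parent_size ); }

    char const * begin () const noexcept { return parent_ + offset_; }
    char const * end   () const noexcept { return begin() + size_; }
    std::size_t  size  () const noexcept { return size_; }
    std::size_t  offset() const noexcept { return offset_; }

    // Absolute jump and resize in one call; fails (leaving the view untouched)
    // if the requested window does not fit inside the parent.
    bool seek( std::size_t const new_offset, std::size_t const new_size ) noexcept
    {
        if ( new_offset > parent_size_ || new_size > parent_size_ - new_offset )
            return false;
        offset_ = new_offset;
        size_   = new_size;
        return true;
    }

    // Simultaneous begin/end advance: the window slides, its size is unchanged.
    bool advance( std::size_t const distance ) noexcept
    {
        if ( distance > parent_size_ - offset_ )
            return false;
        return seek( offset_ + distance, size_ );
    }

    // Hop forward by one view-sized chunk; the final chunk is truncated to
    // whatever remains of the parent (possibly zero bytes).
    resizable_view_sketch & operator++() noexcept
    {
        std::size_t const new_offset = offset_ + size_; // within bounds by the invariant
        std::size_t const remaining  = parent_size_ - new_offset;
        offset_ = new_offset;
        if ( size_ > remaining )
            size_ = remaining;
        return *this;
    }

private:
    char const * parent_;
    std::size_t  parent_size_;
    std::size_t  offset_;
    std::size_t  size_;
};
```

A real implementation would additionally have to respect the platform's mapping granularity, since mmap() requires page-aligned file offsets and MapViewOfFile() requires offsets aligned to the allocation granularity, so byte-precise windows like the ones above would need an aligned mapping underneath.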

On 06/03/2016 22:19, Domagoj Saric wrote:
> On 4.3.2016. 1:18, Andy Thomason wrote:
>> Hi Domagoj and Mathias.
>>> for quite some time I've been working on a portable mmap/virtual memory
>> I have been using Boost.Interprocess for handling large genetic data files in the proposed Boost.Genetics, it is intended for sharing memory images between processes, but works just as well as a portable mmap wrapper including as a block allocator.
> And the "portable mmap wrapper" and "block allocator" are IMO two separate things and belong in separate libraries, as mmap-ing is not essentially an IPC thing. This is one of the challenges in making a VM/MMAP library - find out the clear API boundary where the library should stop and Interprocess should then build upon.
>> The only shame is that it is not 100% header-based and so needs binaries for different platforms.
> But Interprocess is a header only library?
Yes. It depends on DateTime but no library function is needed so you can define BOOST_DATE_TIME_NO_LIB to avoid automatic linking. See:
http://www.boost.org/doc/libs/1_60_0/doc/html/interprocess.html#interprocess...

Best,
Ion

On 3.3.2016. 18:11, Mathias Gaunard wrote:
> I'm interested in the general idea of controlling virtual memory explicitly and portably.
That was point 3 in the opening post.
> From a quick look at the code, it seems to deal with the following aspects:
> - map a file (not sure if anonymous is well supported)
Thanks for taking a look, however I'm not sure what you mean by anonymous. Non-anonymous (i.e. named) file mappings exist natively only on Windows (point 5 from the opening post). However if you meant anonymous memory mappings then yes, I did not cover those in the opening post. So let's 'name it' point 8:

8. Anonymous shared memory - memory shared between a parent and its child processes.
This one is still a todo item, mostly because I'm not yet sure how to model the discrepancy between the Windows and POSIX models. Both Windows and POSIX support inherited handles/descriptors for child processes (created with CreateProcess()/execve()), POSIX however also supports forking along with the corresponding MAP_ANONYMOUS mmap flag (i.e. anonymous virtual memory 'views').
So, one way is what Interprocess does now: use inherited handles on Windows and MAP_ANONYMOUS on POSIX - but this does not cover the case where someone wants handle/descriptor inheritance on POSIX systems too, and it also entails different ways of communicating the anonymous mapping 'handle'/location between the processes on Windows vs POSIX systems.
Another way would be for anonymous_shared_memory to use the inherited handle/descriptor approach and to define a separate POSIX-only class anonymous_shared_memory_view used for the fork scenario. Unfortunately, to me this only seems a less ugly approach than the one above because, if the most common use case in crossplatform code is to use CreateProcess() on Windows and fork() on POSIX, it will still cause #ifdefs in user code...
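For reference, the POSIX fork scenario described in point 8 looks roughly like this when written directly against the system API. This is a POSIX-only sketch, not MMAP or Interprocess code; the function name share_int_with_forked_child is hypothetical and error handling is reduced to returning -1.

```cpp
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON // older BSD-derived systems only provide MAP_ANON
#endif

// The parent creates an anonymous shared mapping, the forked child writes into
// it and the parent reads the value back after the child has exited.
// Returns the value seen by the parent, or -1 on failure.
int share_int_with_forked_child( int const value_written_by_child )
{
    void * const memory = ::mmap( nullptr, sizeof( int ), PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0 );
    if ( memory == MAP_FAILED )
        return -1;
    int * const shared = static_cast<int *>( memory );
    *shared = 0;

    pid_t const child = ::fork();
    if ( child == -1 )
    {
        ::munmap( memory, sizeof( int ) );
        return -1;
    }
    if ( child == 0 )
    {
        // Child: the mapping is inherited at the same address, no handle passing needed.
        *shared = value_written_by_child;
        ::_exit( 0 );
    }

    // Parent: wait for the child so that its write has certainly happened.
    int status = 0;
    int const seen = ( ::waitpid( child, &status, 0 ) == child ) ? *shared : -1;
    ::munmap( memory, sizeof( int ) );
    return seen;
}
```

Note that nothing here has a name or a descriptor that could be handed to an unrelated process, which is exactly why this model does not map onto the CreateProcess() plus inherited handle approach on Windows.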
> I'd like to have access to the following extended functionality:
> - map at a fixed address
Still a todo item (again, still figuring out how to add the parameter to the API without exploding the number of overloads and/or forcing the user to specify it everywhere).
> - actually allocate/free the pages
You mean like direct virtual memory management (i.e. the Win32 VirtualAlloc* API)?
> - map the same hardware page to multiple places in virtual memory without a file
You mean something like the 'magic ring buffer' example from point 3 from the opening post? Arbitrary remapping of just any address (even if page aligned) is AFAIK not possible on any of the major systems. I.e. you can map the same mappable object (shared memory or a file) at multiple locations (subject to 'allocation granularity') but you cannot for example make a 'virtual copy' of a std::vector (i.e. of memory allocated with malloc as opposed to memory allocated with mmap)...
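The 'same mappable object at multiple locations' case can be demonstrated with the raw POSIX API. The following is a POSIX-only sketch under stated assumptions: the function name two_views_alias_same_pages and the temporary file name pattern are hypothetical, and a temporary file is used as the mappable object simply because it needs no extra libraries.

```cpp
#include <cstddef>
#include <stdlib.h>   // mkstemp
#include <sys/mman.h>
#include <unistd.h>

// Maps one page of the same file-backed object twice and checks that a write
// through the first view is visible through the second one.
bool two_views_alias_same_pages()
{
    char path[] = "/tmp/mmap_alias_sketch_XXXXXX"; // hypothetical temporary file name
    int const fd = ::mkstemp( path );
    if ( fd == -1 )
        return false;
    ::unlink( path ); // the open descriptor keeps the object alive

    bool aliased = false;
    long const page = ::sysconf( _SC_PAGESIZE );
    if ( page > 0 && ::ftruncate( fd, page ) == 0 )
    {
        std::size_t const size = static_cast<std::size_t>( page );
        void * const first  = ::mmap( nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );
        void * const second = ::mmap( nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );
        if ( first != MAP_FAILED && second != MAP_FAILED )
        {
            static_cast<char *>( first )[ 0 ] = 'x';
            // Different virtual addresses, same underlying page.
            aliased = ( first != second ) && ( static_cast<char const *>( second )[ 0 ] == 'x' );
        }
        if ( first  != MAP_FAILED ) ::munmap( first , size );
        if ( second != MAP_FAILED ) ::munmap( second, size );
    }
    ::close( fd );
    return aliased;
}
```

The magic ring buffer goes one step further by placing the two views back to back in the address space, which requires mapping at a fixed address and is not shown here.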
participants (4)
- Andy Thomason
- Domagoj Saric
- Ion Gaztañaga
- Mathias Gaunard