boost.interprocess dynamic segment size?

Hello, in my application I am forced to create a shared memory segment whose size can change at runtime. Is it possible to implement such a shared memory segment using the boost.interprocess library? I do not want to pre-allocate a big shared memory segment, as in most cases the segment is small (about 100 bytes); only in some very rare cases does it get very big (5 MB or even more). Thank you

hgdhfdghsa@Safe-mail.net wrote:
Hello,
in my application I am forced to create a shared memory segment whose size can change at runtime. Is it possible to implement such a shared memory segment using the boost.interprocess library? I do not want to pre-allocate a big shared memory segment, as in most cases the segment is small (about 100 bytes); only in some very rare cases does it get very big (5 MB or even more).
Thank you
Sorry, but this is an issue that no one has solved yet ;-) The only thing you can do is to grow it offline (that is, with no processes attached to it). See "Growing managed segments" in the documentation. Growing shared memory dynamically is well beyond the scope of a library: it would need the help of the OS to automatically increase the memory mappings of all processes atomically. Even with the OS, I don't think it's possible without the collaboration of all processes. It's on the to-do list, but I don't see how I could implement it. Regards, Ion
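For reference, the offline growing mentioned above looks roughly like this (a minimal sketch; the segment name "MySegment" is made up, and grow() must run while no process has the segment mapped):

    #include <boost/interprocess/managed_shared_memory.hpp>

    int main()
    {
        using namespace boost::interprocess;

        // Grow offline: must run while NO process is attached to the segment.
        // Adds 5 MB to the existing segment "MySegment".
        managed_shared_memory::grow("MySegment", 5 * 1024 * 1024);

        // Afterwards, processes can reattach and see the larger segment.
        managed_shared_memory segment(open_only, "MySegment");
        return 0;
    }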

On Fri, Apr 11, 2008 at 10:10:28PM +0200, Ion Gaztañaga wrote:
it's possible without the collaboration of all processes. It's on the to-do list, but I don't see how I could implement it.
POSIX shared memory segments can be grown: enlarge the underlying file with ftruncate(), then mmap() the new chunk with the MAP_FIXED flag at the end of the existing mapping; this may naturally fail, in which case the application is out of luck. (It may help if the initial mapping's starting address is chosen smartly in a platform-specific way.)

As for growing it atomically, two things are important to note:

1) there's always the first process that wants to enlarge it
2) the other processes first must get some knowledge about segment growth -- this does not happen magically, but is somehow transmitted by the process that has grown the segment

2) relies on the assumption that a correct program can't know about address X until it has malloc()'d it. With multiple processes this assumption is extended: a process can't know about address X until _some_ process has malloc()'d it _and_ communicated it to others. (This communication may be limited to just writing some pointer to the new memory in an already mapped SHM area.) OTOH, I can't think of a scenario where this assumption doesn't hold (except for HW programming / embedded systems, where there's often some "higher force" which hands you out absolute addresses).

So, for N processes, have in the initial mapping:

- a barrier initialized with a count of N
- a process-shared mutex

When a process wants to grow the segment (this is written in the context of a SHM malloc() -- allocating memory and returning chunks is written with this in mind):

- lock the mutex [this prevents other processes from concurrently trying to grow the segment]
- try to malloc() some memory _again_ [this serializes memory allocation in case of shortage -- another process might have already enlarged the segment, so this allocation may succeed; when first thinking about this problem long ago, I wished for a useful return status from pthread_mutex_lock() that would indicate whether the mutex was acquired with or without contention]
- if this (repeated) malloc() succeeded, unlock the mutex and exit
- otherwise: grow the segment and fix the current process's mappings
- malloc() memory and save the pointer to return to the app [this allocation must succeed because the segment has just been grown, and we're executing in a critical section; the allocation algorithm must be smart enough NOT to satisfy concurrent mallocs(), not protected by this mutex, from the largest available chunk if there are other free chunks]
- signal other processes to fix THEIR memory mappings
- unlock the mutex
- wait on a barrier [when this returns, all of the processes will have updated their mappings]
- return the new memory chunk to the user; now it's safe to "leak" out data about the newly mapped memory to everybody else

Signaling other processes must be done asynchronously and might be done in at least two ways:

- POSIX message queues with event notification (thread/signal)
- storing the (address, length) of the new mapping in the initial SHM chunk and sending a signal to all of the other processes

Signal processing routine [asynchronously invoked, maybe even while a SHM malloc() was executing]:

- get the (address, length) sent by the "first" process, and fix one's own mapping
- on failure -> bad luck; design some error handling
- on success -> wait on the barrier
- exit the signal handler

As you already mentioned, this requires significant cooperation, but since it is written from the perspective of a SHM malloc() routine, it can be hidden from the program in a library.
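A rough sketch of the grower's side (assumptions: shm_try_alloc() and notify_other_processes() are placeholders for the allocator and the signaling mechanism, the mutex and barrier are initialized with PTHREAD_PROCESS_SHARED attributes, and error handling is omitted):

    #include <pthread.h>
    #include <sys/mman.h>
    #include <unistd.h>

    struct shm_header {               /* lives at offset 0 of the initial mapping */
        pthread_mutex_t grow_mutex;   /* PTHREAD_PROCESS_SHARED */
        pthread_barrier_t barrier;    /* process-shared, count == N */
        size_t size;                  /* current segment size */
    };

    void *shm_try_alloc(size_t n);                        /* placeholder allocator */
    void notify_other_processes(void *addr, size_t len);  /* placeholder signaling */

    void *shm_malloc(int fd, struct shm_header *hdr, char *base, size_t n)
    {
        void *p = shm_try_alloc(n);
        if (p)
            return p;

        pthread_mutex_lock(&hdr->grow_mutex);
        p = shm_try_alloc(n);   /* another process may have grown the segment */
        if (p) {
            pthread_mutex_unlock(&hdr->grow_mutex);
            return p;
        }

        size_t old_size = hdr->size;
        size_t grow_by  = 1 << 20;          /* the growth policy is arbitrary here */

        ftruncate(fd, (off_t)(old_size + grow_by));   /* enlarge the backing file */

        /* Map the new chunk contiguously after the existing mapping. Note that
           MAP_FIXED silently replaces whatever is mapped in that range, so a
           real implementation must first verify the range is free (or probe
           without MAP_FIXED and check the address mmap() returns). */
        mmap(base + old_size, grow_by, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, fd, (off_t)old_size);
        hdr->size = old_size + grow_by;

        p = shm_try_alloc(n);                          /* must succeed now */
        notify_other_processes(base + old_size, grow_by);
        pthread_mutex_unlock(&hdr->grow_mutex);

        /* returns only after all N processes have updated their mappings */
        pthread_barrier_wait(&hdr->barrier);
        return p;
    }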

Dear members, I'm currently developing some code to automatically deploy code onto various parallel architectures and stumbled upon the following design problem. I need to be able to send an arbitrarily complex data structure (or an array of data structures) through some parallel communication services. As my tool aims for simplicity of use, I use a programming model in which all transferred data are marshalled into some serializable form. Using Boost.MPI allowed me to use Boost.Serialization to take care of this problem, solving it for the classic MIMD cluster platform. I managed to adapt this to multi-core/multi-processor machines by using a concurrent queue to simulate a message passing interface between threads. My next problem is to port this tool to the Cell processor. On one hand, I got all the DMA mumbo-jumbo cleared out into a tight template library that makes them look like MPI calls. Now my question is the following: considering I can't use the STL or any compiled Boost library on the Cell SPEs due to memory constraints, what is the best way to use or reuse boost::mpi or boost::serialization to build a function that takes a serializable class and marshals it into a raw byte array, so I can pass it to my DMA transfer functions? Are there any internals of either boost::mpi or serialization that can be accessed and used? Thanks in advance.

-- Joel FALCOU
Research Engineer @ Institut d'Electronique Fondamentale
Université PARIS SUD XI, France
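One common approach is to serialize into a growable byte buffer through Boost.Iostreams -- a sketch of the idea (note this still depends on STL containers and the compiled serialization library, so it would have to run on the PPE side rather than on the SPEs):

    #include <vector>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/iostreams/device/back_inserter.hpp>
    #include <boost/iostreams/stream.hpp>

    // Marshal any Boost.Serialization-enabled type into a raw byte array
    // that can then be handed to a DMA transfer routine.
    template <typename T>
    std::vector<char> marshal(const T &value)
    {
        std::vector<char> buffer;
        boost::iostreams::stream<
            boost::iostreams::back_insert_device<std::vector<char> > >
            os(buffer);
        {
            boost::archive::binary_oarchive ar(os);
            ar << value;
        }              // the archive flushes when it goes out of scope
        os.flush();
        return buffer; // &buffer[0] is the raw byte array, buffer.size() bytes
    }

This only addresses the marshalling half; unmarshalling on the SPE side under the memory constraints you describe is a separate problem.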

Zeljko Vrba wrote:
On Fri, Apr 11, 2008 at 10:10:28PM +0200, Ion Gaztañaga wrote:
it's possible without the collaboration of all processes. It's on the to-do list, but I don't see how I could implement it.
POSIX shared memory segments can be grown: enlarge the underlying file with ftruncate(), then mmap() the new chunk with the MAP_FIXED flag at the end of the existing mapping; this may naturally fail, in which case the application is out of luck. (It may help if the initial mapping's starting address is chosen smartly in a platform-specific way.)
What should we do if that mmap() fails?
As for growing it atomically, two things are important to note:
1) there's always the first process that wants to enlarge it
2) the other processes first must get some knowledge about segment growth -- this does not happen magically, but is somehow transmitted by the process that has grown the segment
Ok. One process takes the lock, tries to introduce new elements, so it grows the segment and adds more elements to a shared memory list. Then it unlocks the mutex. New elements allocated in the list can be inserted at the front of the list. Other processes then lock the mutex, traverse the list and crash. How can we stop all processes in order to notify them that the mapping should be increased? I can guess that we could catch SIGSEGV, check a shared segment-information block to see whether new mappings have been added, try to map the new regions, and retry the access. But this is easier to say than to write.
2) relies on the assumption that a correct program can't know about address X until it has malloc()'d it. With multiple processes this assumption is extended: a process can't know about address X until _some_ process has malloc()'d it _and_ communicated it to others.
With malloc, all threads see the newly mapped memory atomically because they share a single address space: once one thread succeeds in mapping it, all threads have succeeded. Doing this with shared memory is a lot more difficult.
So, for N processes, have in the initial mapping: [...]
I think I can find some weaknesses in this scheme, but I'll try to think a bit more about it. Regards, Ion

On Sat, Apr 12, 2008 at 10:48:01PM +0200, Ion Gaztañaga wrote:
Ok. One process takes the lock, tries to introduce new elements, so it grows the segment and adds more elements to a shared memory list.
Where did the memory for the new element come from?
Then it unlocks the mutex. New elements allocated in the list can be inserted at the front of the list. Other processes then lock the mutex, traverse the list and crash. How can we stop all processes in order to notify
The one process that allocates the memory for the new list element in its shm_malloc() will do (roughly, some steps omitted from the previous post):

- take the dedicated mutex
- grow and map the segment in its own private address space
- allocate the chunk of memory again (this must succeed now!)
- asynchronously interrupt other processes
- unlock the mutex
- wait on the barrier
- return the pointer to the application

The application will not get the pointer to the new memory chunk, which will be inserted into the list, before all other processes have waited on the barrier, i.e. have mapped the new SHM portion of memory. Whether the element is inserted at the beginning or the end of the list is irrelevant -- no insertion will take place until the mapping has been performed by everybody.
With malloc all threads have atomically mapped that memory in their address space because they shared the address space. Once one threads successes doing it, all threads have succeed. Doing this with shared memory is a lot more difficult.
The mapping does not have to be atomic; the provided user-space primitives are sufficient to make it appear atomic as long as assumption 2) holds.

Zeljko Vrba wrote:
The one process that allocates the memory for the new list element in its shm_malloc() will do (roughly, some steps omitted from the previous post):
- take the dedicated mutex
- grow and map the segment in its own private address space
- allocate the chunk of memory again (this must succeed now!)
- asynchronously interrupt other processes
Ok. You need to register somewhere all the processes attached to one particular segment. This imposes some reliability problems, because a process might crash while doing something other than using the segment.

An asynchronous notification via signal does not carry enough context (the sigval from sigqueue() only stores an int or a void*) to indicate which growable shared memory segment the other processes should remap. This would require generating a unique id from the shared memory name, and a process can have more than one growable shared memory segment.

And if that does not discourage you from implementing this: there is not much you can do inside a signal handler. You can see the list of async-signal-safe functions here:

http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_04.html#t...

This means that you can't call mmap() from a signal handler: you can't remap memory asynchronously according to POSIX. It's possible that some OSs support it.

If remapping is possible, a more correct and robust mechanism could be catching SIGSEGV in processes that have not updated their memory mappings and doing the remapping from a global growable-segment list stored in a singleton (this has problems when using dlls). Less interprocess communication means more reliability. Regards, Ion

On Sun, Apr 13, 2008 at 10:46:42AM +0200, Ion Gaztañaga wrote:
Ok. You need to register somewhere all the processes attached to one particular segment. This imposes some reliability problems, because a process might crash while doing something other than using the segment.
Yes. I used a separate "bootstrap [shm] segment" to hold all global bookkeeping data. Actually, for this particular purpose, you don't even need it: you can use the SHM segment itself to hold a list of processes attached to it. (Unfortunately, there's no POSIX API to get the list of processes attached to a particular SHM segment -- most probably because such information is volatile and potentially already worthless by the time you get to use it.)

Regarding reliability: a process can crash at any time for any cause; introducing error-free SHM grow code (however it's implemented) will not make the program crash more or less frequently or introduce some new failure mode. What _can_ happen, though, is that a process crashes and remains registered as having the segment attached. The same problem occurs when handling SIGSEGV to do the remapping. Since a dead process may be replaced by a random process with the same PID, sending an asynchronous notification to that process may do unpredictable things -- most likely, terminate it [the default action for most signals].

So the reason for handling SIGSEGV and other fatal signals would NOT be to remap segments, BUT to deregister the process from the SHM manager before terminating it. This is again only a half-solution, because the process may be terminated by other signals that it doesn't handle, and most definitely by SIGKILL, which can't be caught.

A potential solution would be to have all cooperating processes share a common parent controller: when a process dies, it remains in the zombie state, and since the parent is coded NOT to call wait() and not to exit until all children have exited (SIGCHLD), this prevents the reuse of PIDs. The parent controller can then deregister the dead process from its SHM segments, and finally wait() for it after everything has been cleaned up; a sketch follows below. Where (at which level of complexity) to stop depends on the needs, but the solution _can_ be made very reliable and portable. (An extremely simple solution that doesn't require a controller process and won't kill random processes: just run the cooperating processes under a dedicated user ID.)
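A rough sketch of that controller loop (deregister_from_shm() is a placeholder for the SHM bookkeeping hook):

    #include <signal.h>
    #include <sys/wait.h>

    void deregister_from_shm(pid_t pid);    /* placeholder bookkeeping hook */

    void controller_loop(void)
    {
        /* ... fork the cooperating children first ... */
        for (;;) {
            siginfo_t si;
            /* WNOWAIT reports a dead child but leaves it a zombie, so its
               PID stays reserved while we clean up after it */
            if (waitid(P_ALL, 0, &si, WEXITED | WNOWAIT) == -1)
                break;                       /* no children left */
            deregister_from_shm(si.si_pid);  /* remove it from the registry */
            waitid(P_PID, (id_t)si.si_pid, &si, WEXITED);  /* now really reap */
        }
    }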
An asynchronous notification via signal does not carry enough context (the sigval from sigqueue() only stores an int or a void*) to indicate which
That would be enough with a bootstrap segment that contains a list of all SHM segments managed by Boost: then you just send the offset/pointer into this segment.
And if that does not discourage you from implementing this: there is not much you can do inside a signal handler. You can see the list here:
http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_04.html#t...
This means that you can't call mmap() from a signal handler: you can't remap memory asynchronously according to POSIX. It's possible that some OSs support it.
Good point. You still have two choices:

1. You ignored the possibility of sending a message through a POSIX message queue with SIGEV_THREAD notification (see mq_notify()).
2. Have a dedicated signal plus a dedicated thread in each process to catch it (see sigwait()). [All other threads shall block this signal.]
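A sketch of choice 2 (remap_from_registry() is a placeholder that reads the (address, length) pairs from the bootstrap segment and mmap()s them; the use of SIGRTMIN is an arbitrary choice):

    #include <pthread.h>
    #include <signal.h>

    void remap_from_registry(void);     /* placeholder: mmap()s the new chunks */
    pthread_barrier_t *grow_barrier;    /* lives in the bootstrap segment */

    static void *remap_thread(void *arg)
    {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGRTMIN);      /* the dedicated grow signal */
        for (;;) {
            int sig;
            sigwait(&set, &sig);        /* ordinary thread context, so calling
                                           mmap() here is perfectly legal */
            remap_from_registry();      /* fix this process's mappings */
            pthread_barrier_wait(grow_barrier);  /* report completion */
        }
        return 0;
    }

    /* At startup, before spawning any other threads:
           sigset_t set;
           sigemptyset(&set);
           sigaddset(&set, SIGRTMIN);
           pthread_sigmask(SIG_BLOCK, &set, 0);  // every thread inherits this
           pthread_t tid;
           pthread_create(&tid, 0, remap_thread, 0);                          */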
If remapping is possible, a more correct and robust mechanism could be
Correct according to which specification?
catching SIGSEGV in processes that have not updated their memory mappings and doing the remapping from a global growable-segment list stored in a singleton (this has problems when using dlls). Less interprocess communication means more reliability.
Far from it: that very same URL says the following: "The behavior of a process is undefined after it returns normally from a signal-catching function for a [XSI] SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(), [RTS] sigqueue(), or raise()." This applies to what you have just proposed. Furthermore, this avenue shall lead you into a mess of platform-specific code: please see GNU libsigsegv. Anyway, a line has to be drawn somewhere: perfection is the worst enemy of good enough. Why should a library ensure its correct operation when the client program breaks its preconditions? It's a tradeoff between being clean and having stronger preconditions (my approach), or relying on undefined behavior with weaker preconditions (trying to compensate for broken programs).

A question: let's say that you have a situation like this:

    [SHM segment][unmapped memory]
                       ^
                       X

A program generates SIGSEGV at address X. How are you going to design a *robust* mechanism that can distinguish the following two cases:

- a true SIGSEGV (access through a corrupt pointer)
- a SIGSEGV that should grow the segment?

Note that there's a race condition: a program might make a true invalid access through a corrupt pointer, but by the time you've looked up the address and found the nearest SHM segment, *another* process might have already grown the segment. Thus, instead of the process being terminated, you will grow the faulting process's SHM mapping and let it go berserk over valid data. Protecting the signal handler with a mutex/semaphore isn't enough: you'd need a way to *atomically* enter the signal handler and acquire a mutex/semaphore.
participants (4)
- hgdhfdghsa@Safe-mail.net
- Ion Gaztañaga
- Joel FALCOU
- Zeljko Vrba