[shmem] error while closing the segment
Hi, I get an error when I try to close the shared memory segment. My
destructor simply calls mysegm.close(). I get the message below, but I can't
interpret it well enough to act on it. It happens only when a child process
that also works in this segment has previously terminated with an error and
a subsequent run then completes without faults. But closing the segment
should be possible at any time, even after errors. When I restart my
application without deleting the segment by hand, it hangs in the
constructor and I can wait forever, without any message.
Thanks a lot, Sascha
boost/shmem/detail/segment_manager.hpp:698: bool
boost::shmem::detail::segment_manager
Hi,

The function has the following assert when trying to destroy a named object by name:

    //Try to find the name in the index
    index_it it = index.find(key_type (name, std::char_traits<CharT>::length(name)));
    //If not found, return false
    if(it == index.end()){
       //This name is not present in the index, wrong pointer or name!
       assert(0);
       return false;
    }

This function can also be called when destroying a named/unique object by pointer, to erase the name entry from the map. That means the code is trying to destroy a named object that does not exist (perhaps previously erased?) in the index that tracks named/unique objects.

I just don't know why this function is called when you call close(). I would need more information. Could you check whether that destructor does anything apart from closing the segment (whether any base or member has object destruction logic), and print the parameter "name" just before the assertion so that I have more clues? Otherwise I'm afraid I can't guess the cause...

Also, could you tell me more about your platform/compiler?

Regards, Ion
Ion Gaztañaga wrote:
I just don't know why this function is called when you call close(). I would need more information. Could you just check if that destructor does anything apart from closing the segment (if any base or member has object destruction logic), and print the parameter "name" just before the assertion so that I can have more clues?
Hi Ion,
yes, the destructor performs two destroy calls before the close call (see
below). But that should not matter: destroyed is destroyed. The call to
close should only delete objects that still exist. I've printed out the
name of the named shared object that triggers the error (it's
"shmDefinition"). If I comment out the two lines that destroy my named
shared objects in the destructor, it works. Hmm, I don't know why it does
not work when I destroy the objects by hand. But it should definitely not
be done twice.
Platform/Compiler: Linux version 2.6.17.1 (root@karo) (gcc version 4.0.4
20060507 (prerelease) (Debian 4.0.3-3))
Here is a short version of the class with con-/destructor:
class Myclass {
public:
    Myclass();
    ~Myclass();
    ...
private:
    ...
    boost::shmem::named_shared_object segment;
    typedef std::pair
Hi, I am using B.MI + shmem. You can do that by mapping the shared memory at a fixed address. My code is still an alpha version, but so far it seems to work. I am waiting for a bug to be solved before proceeding any further. I can post code if somebody wants...
Sascha Lumma wrote:
yes, the destructor performs two destroy calls before the close call (see below). But that should not matter: destroyed is destroyed. The call to close should only delete objects that still exist. I've printed out the name of the named shared object that triggers the error (it's "shmDefinition"). If I comment out the two lines that destroy my named shared objects in the destructor, it works. Hmm, I don't know why it does not work when I destroy the objects by hand. But it should definitely not be done twice.
You are right. But this indicates that you have somehow tried to erase the same object (in this case "shmDefinition") twice (maybe from another process?).

When deleting an object through a pointer, it should assert, because otherwise the whole segment could be corrupted (a debug protection, like when trying to "operator delete" the same pointer twice). However, when destroying by name, a bool is returned that says whether the object has been destroyed or not (the only problem is finding the name in the name-object index). So it should return false in the second case instead of asserting. The only problem is that both the name and the pointer delete operations use the same code, and only one should assert.

Thanks for finding this! Please erase that assertion from your code. I will do the same in Sandbox-CVS while I look for some other solution.

Regards, Ion
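A minimal sketch of the split described in this reply, assuming a plain std::map as the name index. NameIndex, destroy and destroy_known are hypothetical names chosen for illustration, not Shmem's API. Destroying by name reports a missing entry through its return value, while the path that assumes the object exists asserts in debug builds.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the name index of a segment (not Shmem code).
class NameIndex {
public:
    // Registers a named object; returns false if the name is already taken.
    bool create(const std::string& name, int value) {
        return index_.insert(std::make_pair(name, value)).second;
    }

    // Destroy by name: a missing name is reported, not asserted on,
    // so destroying the same name a second time simply returns false.
    bool destroy(const std::string& name) {
        return index_.erase(name) == 1;
    }

    // Pointer-style path: the caller claims the object exists, so a
    // missing entry is a logic error and is caught in debug builds.
    void destroy_known(const std::string& name) {
        const bool erased = destroy(name);
        assert(erased && "object was already destroyed");
        (void)erased;  // silences the unused warning when NDEBUG is set
    }

    bool contains(const std::string& name) const {
        return index_.count(name) != 0;
    }

private:
    std::map<std::string, int> index_;
};
```

With this shape, a destructor that destroys "shmDefinition" by name and then closes the segment would see false on any repeated destroy instead of an abort.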
Hi, if somebody could give me a clue as to how to fix the bug about shmem giving a segmentation fault when dealing with large (>100MB) fixed named segments I would be most grateful. I have already tried to debug it but I am too unfamiliar with the code. Has anybody experienced anything similar? That's the only thing I need to solve to have everything working and move on... Thanks in advance
Hi,
Berenguer Blasi wrote:
if somebody could give me a clue as to how to fix the bug about shmem giving a segmentation fault when dealing with large (>100MB) fixed named segments I would be most grateful.
When are you getting the segmentation fault? When trying to create the segment? When connecting to it?

100 MB seems a big shared memory segment, and all systems have a maximum shared memory segment size. Googling around, I've found the following commands (http://www.unidata.ucar.edu/software/mcidas/2004/users_guide/workstation.htm...):

Solaris: more /usr/sbin/sysdef | grep SHMMAX
Linux: more /proc/sys/kernel/shmmax

For example, I've seen that Mac OS X has 4 MB by default, HP-UX 64 MB and AIX 4.x 256 MB. On Linux I think it's 32 MB. On Windows, no idea. Keep in mind that even if you are below the maximum, your system must have enough free memory to create the segment.

I don't know if you are hitting your system maximum (this should fail when creating or connecting to a shared memory segment), but most systems have a way to change this maximum.

Regards, Ion
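The Linux check above can also be done from code before creating a segment. This is only a sketch: parse_shmmax and fits_under_shmmax are hypothetical helper names, and the default path is the one from the Linux command quoted above, so it applies to Linux only.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Parses the decimal byte count found in /proc/sys/kernel/shmmax.
// Returns false if the text does not start with a number.
bool parse_shmmax(const std::string& text, unsigned long long& bytes) {
    std::istringstream in(text);
    return static_cast<bool>(in >> bytes);
}

// Hypothetical helper: true if a segment of `wanted` bytes is within the
// limit stored at `path`. A missing or unreadable file counts as "no".
bool fits_under_shmmax(unsigned long long wanted,
                       const char* path = "/proc/sys/kernel/shmmax") {
    assert(path != nullptr);
    std::ifstream file(path);
    std::string line;
    if (!std::getline(file, line)) return false;
    unsigned long long limit = 0;
    return parse_shmmax(line, limit) && wanted <= limit;
}
```

Staying under the limit is necessary but not sufficient: as noted above, the system must also have enough free memory for the segment.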
Hi Ion,
I have double-checked that the shm sizes and limits for my Fedora box are OK.
Thanks for the suggestion.
I've done some debugging and found the following:
1- You can create the segment OK.
2- You can put data in it OK. I leave this process running in the
background.
3- I run a second process that reads the data from the shared memory:
3.1- The segment is opened OK (I noticed you don't check the return value
of munmap for errors).
3.2- The problem is at
segment.find<MyShmStringVector>("MyVector") from vectorGet (code below),
which returns a 0x0 pointer.
Why?
Line 630 of seg_manager.hpp sees 'it' as equal to 'index.end()'.
Why?
When the key_type for the index is created, for some mysterious reason the
values of 'name' and name's length are not stored correctly!
So index.find is comparing rubbish against the index, and no match is
found.
Notice you don't have this problem with a non_fixed segment, or if the size
is 66MB instead of 76MB. My shmem limit is 105MB.
I am completely lost!
code:
vectorGet.cpp
#include <vector>
#include
Ion, thank you for your reply.

As you suggested, mmap returns a different address than the one specified, so it behaves as a named_shared_object under the hood instead of a fixed_named_shared_object.

I have used the MAP_FIXED flag for mmap with disastrous results (the man pages already discourage you from using this flag). The computer hung, showed very strange behaviour, etc... It just forces a segment of the size you want at the address you specify, overwriting anything in its way. Lots of fun, as you can foresee...

So as I see it:
- You cannot store raw pointers in shared memory even if you use fixed_named_shared_object. They always have to be offset_ptr, as the segment may not map where you want.
- The documentation under 'Using STL containers and mapping the memory at a fixed address' should reflect this.

A possible workaround, as you suggested, would be to somehow write down the address where the memory was mapped so that other processes know about it. IMHO this is a named_shared_object in the end, so why use fixed_named_shared_object?

I think that, given the inner behaviour of mmap, everything should be all right as long as you are aware that a fixed_named_shared_object is in fact an OCCASIONALLY_fixed_named_shared_object. Notice that the same binary run twice on the same box may map to a different address each time, which may or may not be the one specified! And even if it returns false during creation of the segment, you have already screwed your box by overwriting protected memory areas...

Do you agree or have I got the wrong end of the stick?

(I use Linux Fedora Core 5)
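To illustrate why an offset-based pointer survives a mapping at a different address while a raw pointer does not, here is a simplified sketch. RelPtr, Block and copy_bytes are hypothetical names; this is not Shmem's offset_ptr, only the general idea of storing a distance instead of an absolute address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, simplified relative pointer (not Shmem's offset_ptr).
template <class T>
class RelPtr {
public:
    // Stores where `target` lives relative to this pointer object.
    void set(T* target) {
        if (target == nullptr) { offset_ = 0; return; }
        char* self = reinterpret_cast<char*>(this);
        char* dest = reinterpret_cast<char*>(target);
        assert(dest != self && "offset 0 is reserved for null");
        offset_ = dest - self;
    }

    // Rebuilds the absolute address from the pointer's current location.
    T* get() {
        if (offset_ == 0) return nullptr;
        return reinterpret_cast<T*>(reinterpret_cast<char*>(this) + offset_);
    }

private:
    std::ptrdiff_t offset_ = 0;  // distance in bytes; 0 means null
};

// Example layout: the pointer and its target live in the same block.
struct Block {
    RelPtr<int> ptr;
    int value;
};

// Simulates another process seeing the same bytes at a different address.
inline void copy_bytes(Block& dst, const Block& src) {
    std::memcpy(&dst, &src, sizeof(Block));
}
```

After copy_bytes, the copied RelPtr resolves to the value inside the copy. A raw int* stored in the block would still hold the address inside the original, which is the failure mode when a segment is mapped somewhere other than the requested address.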
Hi,
Berenguer Blasi wrote:
As you suggested, mmap returns a different address than the one specified, so it behaves as a named_shared_object under the hood instead of a fixed_named_shared_object.
I have used the MAP_FIXED flag for mmap with disastrous results (the man pages already discourage you from using this flag). The computer hung, showed very strange behaviour, etc... It just forces a segment of the size you want at the address you specify, overwriting anything in its way. Lots of fun, as you can foresee...
Very interesting. I thought that fixed mapping should always fail if there is no room for it. But I see that OpenGroup's mmap description (http://www.opengroup.org/onlinepubs/000095399/functions/mmap.html) states:

"If a MAP_FIXED request is successful, the mapping established by mmap() replaces any previous mappings for the process' pages in the range [pa,pa+len)."

"If an application requests a mapping that would overlay existing mappings in the process, it might be desirable that an implementation detect this and inform the application. However, the default, portable (not MAP_FIXED) operation does not overlay existing mappings. On the other hand, if the program specifies a fixed address mapping (which requires some implementation knowledge to determine a suitable address, if the function is supported at all), then the program is presumed to be successfully managing its own address space and should be trusted when it asks to map over existing data structures."

"[ENOMEM] MAP_FIXED was specified, and the range [addr,addr+len) exceeds that allowed for the address space of a process; or, if MAP_FIXED was not specified and there is insufficient room in the address space to effect the mapping."

So using MAP_FIXED is really dangerous unless you know exactly how you are managing your own address space. I think that Shmem should not offer that possibility, because it's very dangerous. If there is a user request for this, I will add it as an option.

To solve your issue, we could avoid using MAP_FIXED and pass the address as a suggestion. If the returned address is not the suggested address, I could unmap the segment and return an error. Since we avoid MAP_FIXED, you will run no risk of overwriting the address space of your shared libraries or heap.

I will write a new version ASAP. Meanwhile, try to live with a smaller memory size.

Adeu, Ion
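The proposed fix (pass the address as a hint, then unmap and fail if the kernel chose another place) could look roughly like the sketch below. It is not the Shmem implementation. map_at_or_fail is a hypothetical name, and the sketch assumes POSIX mmap with anonymous shared memory (MAP_ANONYMOUS, as on Linux) to stay self-contained, where a real segment would pass a file descriptor instead.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Maps `len` bytes, passing `wanted` only as a hint (no MAP_FIXED).
// Returns the mapping if the kernel honoured the hint; otherwise unmaps
// whatever it got and returns nullptr. Existing mappings are never replaced.
void* map_at_or_fail(void* wanted, std::size_t len) {
    assert(wanted != nullptr && len > 0);
    void* got = mmap(wanted, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED) return nullptr;
    if (got != wanted) {
        munmap(got, len);  // wrong place: give it back and report failure
        return nullptr;
    }
    return got;
}
```

A caller that gets nullptr can retry with a different address or fall back to position-independent data, without having overwritten the heap or shared libraries.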
# bblasi@jblasi.com / 2006-07-15 09:16:34 +0100:
Ion,
thank you for your reply.
As you suggested, mmap returns a different address than the one specified, so it behaves as a named_shared_object under the hood instead of a fixed_named_shared_object.
I have used the MAP_FIXED flag for mmap with disastrous results (the man pages already discourage you from using this flag). The computer hung, showed very strange behaviour, etc... It just forces a segment of the size you want at the address you specify, overwriting anything in its way. Lots of fun, as you can foresee...
What operating system was that? mmap fails if it cannot satisfy the request.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
-- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
Roman,
The OS is Fedora Core 5.
Ion,
Thanks for your time.
I would still update the documentation so that nobody does what I did: trying to use a multiindex in shared memory, only to find out after all the hard work that it's not a good idea...
Great lib though!
Cheers.
-----Original message-----
From: Roman Neuhauser [mailto:neuhauser@sigpipe.cz]
Sent: Saturday, 15 July 2006 11:55
To: Berenguer Blasi
CC: boost-users@lists.boost.org
Subject: Re: [Boost-users] [shmem] Help with shmem and fixed segments
Berenguer Blasi wrote:
Ion,
Thanks for your time.
I would still update the documentation so that nobody does what I did: trying to use a multiindex in shared memory, only to find out after all the hard work that it's not a good idea...
You need fixed mapping for that. Multiindex + Shmem would be a great combo for the future. Let's hope we can find time to make them compatible. Thanks for your bug reports, Ion
participants (4)
- Berenguer Blasi
- Ion Gaztañaga
- Roman Neuhauser
- Sascha Lumma