
I vote to accept the library into Boost. I have been following shmem development for long time and most of the issues I had were resolved. It covers important but so far neglected area. Potential for tools based on shmem is large. Ion has created extensive (1MB of sources) and powerful tool and I have full belief he will maintain and improve it futher. For practical reasons I do not recommend attempt to split the library into separate parts. What I do not like is described in details bellow. Most of my objections fit into two groups: 1. The documentation is /very/ dense with information and may easily scare a reader. Examples to play with and pictures/diagrams wold lessen the mental toll on reader. 2. (From someone who is not English native.) Currently used names are hard to understand and remember and lack intuitive meaning. Some words like "shared", "named" or "object" are overused I think naming conventions should be discussed as the most important issue of the library since shmem has real chance to establish de-facto standard for C++ IPC and process synchonization. /Pavel __________________________________________________________ 1. index.html: "portable synchronization primitives" ==>> "portable interprocess synchronization primitives" --------------- Need for glossary: the first page mentions several terms that are not really well defined: * dynamic allocation * segment * named allocation * base address (not eveyone may know) These terms should have link to glossary page when they are used for the first time, ala: <a ref="....">dynamic allocation</a> (<a ref="...#...">in glossary</a>) __________________________________________________________ 2. Naming: IMO the naming convention should be reconsidered. It is rather hard to intuitively feel difference between "named_shared_object" and "named_user_object" for example. I am not and English speaker so I don't dare to suggest alternatives but current names are quite confusing to me. Anyway, all naming conventions should be explained in glossary. I have uneasy feeling from these names: * named_shared_object (vague for me) * segment_manager (the word manager is overused) * segment (too much reminds me the x86 segments) * mixing of "object" and "class" together __________________________________________________________ 3. quick_guide.html: as it was suggested a smart pointer should be used in the first example. I am against adding new typedef into shmem, seeing shared_ptr<void, shmem::deleter> gives me immediate clue, while shmem_ptr_whatever is something one needs to look up. ----------------- The code snippets may have link to full source code of an example next to them. Especially for first few snippets this will encourage novices to play and get immediate feedback. -------------- Which reminds me: there should be a page listing all examples with short description what important and useful is in each example. It is fast and cheap way for learning and it is used e.g. in multi-index. ------------------ Code snippet about retrieving named object from shmem (the fourth): the need for std::pair when extracting the object should be explained in a comment. It is unexpected on the first sight. ------------------ The "offset_ptr" word should be linked to reference. Generally, the docs should be interlinked as much as possible. ------------------ An example with offset_ptr AND named object may be added to ensure the first time reader these tools are orthogonal. ------------------- The title: "Creating vectors in shared memory with different base addresses" should be "Shmem mapped to different base adresses? Standard STL containers cannot be used. Solution: shmem's own container set." Long but self-explaining. Medieval publishers titled their books in similar way and it sold well for centuries. ------------------- Just a pedagogical nit. The snippets show: int main() { ... if (!...) { return -1; } Depending on OS the value (-1) would produce strange effects. Use 1. ------------------- Nits for snippet #6: * the name ShmemAllocator should be shmem_allocator_t * alloc_inst should be "my_allocator" or so (the word "inst" means something in Prolog/Mercury) __________________________________________________________ 4. offset_ptr.html:the "offset_1_null_ptr" policy is sufficient for all purposes. Since I cannot think of any use case where an other pointer policy is required I recommend to remove them. This should decrease code complexity and most importantly mental load on the user. __________________________________________________________ 5. limitations.html: more hyperlinks. ------------- In sentence "References will only work if memory is ..." the word "only" should be bold and red. ------------ Perhaps this page should be divided into two parts: * shmem is always mapped to the same address * shmem can be mapped to different addresses and for each cases limitations listed. __________________________________________________________ 6. rationale.html: the text about containers repeats what was already said before. While repeating helps one to remember better it is for the third time on last three pages (quick guide/limitations/rationale) __________________________________________________________ 7. concepts.html: should be named glossary and expanded. Some texts here sound strange: "...memory algorithm is an object..." Picture/diagram would help here. The bolding is overused. I feel uneasy about the term "front-end" but have no suggestion. __________________________________________________________ 8. oswrappers.html: "Base shmem classes" may be better name. a the classes may have name "XYZ_base". The word "basic" is also used in two docs pages later and it is very confusing (to me). --------------- The functions "create_with_func/open_with_func/..." I still didn't got need for these (I am slow) but there should be a leaf page with an example and typical list of situations where this feature is necessary. The code snippet should have link to this "more" leaf page. ---------------- Perhaps the construction/initialisation dichotomy for mapped_file is not needed. The class looks cheap enough to use constructor/desctructor for them. --------------- Huge (> 4 GB) files and mapped_file class: I am not sure what size_t is on current 64 bit systems. To be absolutely, positively, safe I would use fileoff_t everywhere instead of size_t. ------------------- (Absolutely uninformed guess follows.) If process A crashes in the middle of shmem operation (I expect this to happen during development), will all the mutexes/shared memories/etc be destroyed automaticlalyy after process B is closed? Or will reboot be needed to cleanup the system? In second case, could some API be added to destroy any named remnants on explicitly? This may be also useful for self-restarting applications. -------------------- Use of term "process-shared" and "named". It is confusing (for me) and it took me a while to get it. I suggest to add short info on the top of the page describing distinction between these two approaches in tabular form. Should I pick names I would use "XYZ placed inside shmem" and "XYZ identified by name". --------------------- Generally, this page (oswrappers.html) is /extremely/ dense with information. It covers stuff that is taught during whole university semester. At least reader should be warned on the top of page that lot of time is needed to grok it. Links to examples and class diagrams would help here. __________________________________________________________ 9. named_shared_object.html: The title may be: "Named user data places in shmem" or something what forms full sentence. I have trouble parsing the current name. ----------------- Seeing the whole class synopsis is a bit frightening experience. (a blob on two pages). Perhaps a mini-synopsis with just template parameters and few most important functions (or groups of functions) could be shown and only then followed by the full class. Other possibility would be SGI STL like documentation with table and detailed comments bellow the table. ----------------- Title: "Common named shared object classes" ==> "Default specialisation of class doing named shared memory allocation" ;-) I do not like the group "object classes" and do not like that something with "object" in name is a class. I think this should be discussed. ------------------ Links to examples and class diagrams would help here. ------------------ Instead of having boost::shmem::anonymous_instance I would prefere to have function construct_unnamed(...) or construct_anonynous(...) Similarly construct_unique(....) ------------------ The discussion of "index types" feels as something so advanced that it should be moved futher down in the docs. Anyway, current documentation too low on details and actual purpose. Missing is a /table/ comparing each type and where it shines. __________________________________________________________ 9. stl_allocators.html: A very dense page. A table on the top comparing advantages of each allocator type would help. A link to header(s) implementing the allocators should be present so reader could jump there quickly. I guess every allocator type is used in at least one of shmem containers: there could be link to show this as an example. __________________________________________________________ 10. containers_explained.html: An overview of Boost libraries compatible with shmem couled be added (I suspect multi-index is compatible, while say smart pointer generally isn't). ------------- The last code snippet may be better if separated into smaller and focused parts like "all in shmem", "nothing in shmem", "whatever inbetween" and so on. __________________________________________________________ 11. customizing_boost_shmem.html: This should be separated into three parts: algorithms, custom allocators and indexes. ---------------- The previous pages in the docs should have hyperlinks like "see how to build your own XYZ (very advanced)" pointing here. -------------- Generally, interlinking should be used much more. __________________________________________________________ 12. beyond_shared_memory.html: This name suggests super-advanced functionality far beyond needs or abilities of anyone. I think what is presented here are very useful tools that could be used separately from any IPC and synchronisation. Complete examples would be very helpful here, as well as careful wording that doesn't suggest need to use whatever "shared" (in OS sense). ---------- The relevation of memory mapped files feels too late, at this moment. Too many readers won't get that far or will skip the "beyond". The note about file mapping in initial parts of the docs should have interlink pointing here. ---------- I remember Pablo Halpern's proposal for new allocator (on comp.std.c++). One of the features was ability to 'move' objects passed into an container under allocator of this container. At the moment of writing I have only vague recollection but I think it could be possible (via shmem containers): * create complex data structures in specified memory areas (without the need to modify too much of user code). * being able to "(de)serialize" complex data structures. (As I said, these are vague ideas from the past.) __________________________________________________________ 13. streams.html: there could be possibly overlap with Boost.Iostreams library. At this moment I am not able to comment on it more. __________________________________________________________ 14. shmem_smart_ptr.html: A teaser in form of miniexample should be on the top of the page. ------------- For examples I suggest to always use full qualification for boost::shmem::scoped_ptr (and the other one as well). Danger of making mistake is too large. __________________________________________________________ 15. shared_message_queue.html: The mention of named_user_object should have backlink and a comment that this means user-buffer-related tool. ------ There's no explicit info what happen if the buffer-is-too-small error happens: will it read part of message, will it stay in queue or will it be discared? This and info whether message can get fragmented should be added. ------ The double contruction/initialisation steps may be merged into one (no strong opinion, just a feeling). ------ Functions like: how-big-is-the-next-message() or peek() may be added. I guess that people who will use shmem will be the ones who happily use such tricks. ------ The "MyLess" in the example is not used consistently, should be removed for clarity. ------- The names like "mg1", "mg2" should be longer. ----- Information on atomicity of the queue should be told explicitly inside introduction paragraph. --------- I wonder whether and how it would be possible to build 'objects' passing queue above shared_message_queue (subject of necessary limitations). I know such tool can get easily misused but it would be first step to implement Erlang-like message passing framework. __________________________________________________________ 16. architecture.html: Diagrams picturing the levels of the library and interaction between the levels would be handy. ---------- Information how lifetime of OS resources is managed should be here (especially for Posix). __________________________________________________________ 17. performance.html: Examples would be helpful. Examples that show a slow way as well as examples tuned for performance. __________________________________________________________ 18. open_issues.html Namespaces: I do not like idea of more than one namespace as it would only complicate already feature rich tool. If it is possible to implement (transparently to the user) a "type" checking then add it. __________________________________________________________ 19. future_improvements.html Security attributes - IMHO adding them is almost trivial (I know, I know) and they should be added before the library is thrown to public. __________________________________________________________ 20. All shmem exception should have one common parent. Name sem_exception is too short. lock_exception should be deadlock_exception. -------------- Possibly shmem::bad_alloc may inherit from both common parent and std::bad_alloc. This would make sense if release of "normal" memory could have positive effect on availability of "shared" memory. Just an idea. __________________________________________________________ 21. Wishes for additional functonality: * ability to enumerate named from shmem. Useful for debugging/troubleshooting. * debugging support: for example the first byte of shmem should be fixed constants and assert()s should be everywhere to check that user code didn't overwrote this. Possibly the metadata may have CRC stored and /always/ checked against. It may sounds as paranoia but multiprocess troubleshooting is such a pain that any measure that helps to catch a bug is justified. * in "debug" mode the shared memory areas may be surrounded by guard areas (possibly too hard) and memory of an deleted object may be filled with 0xCC. * primitive "transaction like" functionality for shmem. Scenario: - shmem segment exists - I do copy (clone) of the shmem data - many operations with shmem are executed - something may fail but ability to revert into well-defined state (in application sense, not in low level C++ sense) is impossible at this moment - then the stored copy of shmem will be written back into shmem segment, restoring the initial, well defined state. __________________________________________________________ 22. Wishes for docs: * a page named "What I can do with shmem" with one liner description and link to example or detailed docs. This should cover small practical tasks like using process wide mutex, adding data into shared queue etc. * the current documentation should visually distinguish parts showing problems, mistakes and potentially wrong uses of shmem. Picture of a bomb is typically used for this purpose. * possibly a visual separation could be used for "basic" and "advanced" topics or for levels like "implementation details", "basic functionality" and "higher level tools". * the docs may have section on typical (or expected) user mistakes, for example: - "what if I retrieve named object but use wrong type" * the docs should say whether it is possible to somehow block shmem from working properly. E.g. both processes crash and leave a named OS resource hanging out, restarted processes will fail to connect the resource or it will be damaged beoynd repair. To discover such corner cases later in the project may very unpleasant. * Pictures of inheritance diagrams could be relatively easily generated by Doxygen. On many places such pictures would come handy. __________________________________________________________ EOF