Re: [boost] [Review] Shmem (lomg)

9 Feb 2006

      Hi Pavel,
...
I vote to accept the library into Boost.
Good start! ;-)
...
What I do not like is described in details bellow.
Most of my objections fit into two groups:
1. The documentation is /very/ dense
   with information and may easily
   scare a reader.
Examples to play with and pictures/diagrams
   wold lessen the mental toll on reader.
I agree. I will try to add some diagrams.
...
2. (From someone who is not English native.)
Currently used names are hard
   to understand and remember and lack
   intuitive meaning.
Some words like "shared", "named"
   or "object" are overused
They are not my favorite but I couldn't find better ones! I want also to
be a relationship between the front-ends, since all offer similar
functions but each ones operates in a different type of memory
(named_shared_object, named_mfile_object...). Now we have to find a name
to an object that creates shared memory/memory mapped file and allows
allocation of named objects there. I'm open to change it.
...
__________________________________________________________
1. index.html:
"portable synchronization primitives"
==>>
"portable interprocess synchronization primitives"
Ok
...
Need for glossary: the first page mentions
several terms that are not really well defined:
* dynamic allocation
* segment
* named allocation
* base address (not eveyone may know)
Well, maybe it's that I can see it through the eyes of a newbie. I will 
try to define them (as accurate as I can).
...
I have uneasy feeling from these names:
* named_shared_object (vague for me)
* segment_manager (the word manager is overused)
* segment (too much reminds me the x86 segments)
* mixing of "object" and "class" together
Sorry, but I couldn't find better ones. Open to change if someone 
proposes good alternatives.
...
__________________________________________________________
3. quick_guide.html: as it was suggested
   a smart pointer should be used in the first example.
I am against adding new typedef into shmem, seeing
shared_ptr<void, shmem::deleter>
gives me immediate clue, while
shmem_ptr_whatever
is something one needs to look up.
Ok, I wanted to keep example simple, but I don't think a shared_ptr will 
hurt.
...
-----------------
The code snippets may have link to full
source code of an example next to them.
Especially for first few snippets this
will encourage novices to play and get
immediate feedback.
Good idea. Since they taken from real code it's not difficult.
...
Code snippet about retrieving named object
from shmem (the fourth): the need for
std::pair when extracting the object
should be explained in a comment.
Ok.
...
-------------------
The title:
  "Creating vectors in shared memory
  with different base addresses"
should be
"Shmem mapped to different base adresses?
   Standard STL containers cannot be used.
   Solution: shmem's own container set."
Long but self-explaining.
I find your phrase too long, but I will think in a better alternative.
...
-------------------
Just a pedagogical nit. The snippets show:
int main() {
   ...
   if (!...) {
      return -1;
   }
Depending on OS the value (-1)
would produce strange effects. Use 1.
Strange effects? I didn't know that. Which type of effects? I can change 
all return errors to 1.
...
-------------------
Nits for snippet #6:
* the name ShmemAllocator should be shmem_allocator_t
* alloc_inst  should be "my_allocator" or so
  (the word "inst" means something in Prolog/Mercury)
No problem.
...
__________________________________________________________
4. offset_ptr.html:the "offset_1_null_ptr" policy
   is sufficient for all purposes.
Since I cannot think of any use case where
   an other pointer policy is required I recommend
   to remove them. This should decrease code
   complexity and most importantly mental load
   on the user.
I know you were since the beginning against this. But I will try once 
more to resist. ;-)
...
__________________________________________________________
5. limitations.html: more hyperlinks.
  "References will only work if memory is ..."
the word "only" should be bold and red.
Bold will be enough!
...
__________________________________________________________
6. rationale.html: the text about containers repeats
   what was already said before.
While repeating helps one to remember better
   it is for the third time on last three pages
   (quick guide/limitations/rationale)
Right. I will try to say it only once.
...
---------------
The functions "create_with_func/open_with_func/..."
I still didn't got need for these (I am slow)
but there should be a leaf page with an example
and typical list of situations where this feature
is necessary.
This functions allow atomic initialization of objects the shared memory 
when connecting/creating, so that two processes can create AND 
initialize shared memory without race conditions. The alternative would 
be to lock a external named mutex. It is used, for example, in the 
shared_message_queue, if two processes open_or_create the message queue, 
the creator, apart from creating the shared memory segment, will 
initialize queue members safely.

Others have requested similar functionality for named_shared_object, so 
they can create the segment and atomically create some named objects. 
It's a matter of taste I think.
...
---------------
Huge (> 4 GB) files and mapped_file class:
I am not sure what size_t is
on current 64 bit systems.
To be absolutely, positively, safe I would use
fileoff_t everywhere instead of size_t.
I use size_t when referring to memory since size_t is enough to point 
all memory range. When referring to file offsets, I use fileoff_t. I 
think this is safe (if there is any 64 expert here, please correct me if 
I'm wrong).
...
-------------------
(Absolutely uninformed guess follows.)
If process A crashes in the middle
of shmem operation (I expect this
to happen during development), will
all the mutexes/shared memories/etc
be destroyed automaticlalyy after
process B is closed?
Good question. The answer is... it depends. I can't guarantee cleanup 
when a process crashes, because I can't register rollback functions in 
C++. In Windows, the shared memory is automatically released by the OS. 
In Unix the file-like interface does not do this. And this is really 
annoying. I haven't found solution for this (yes I could handle all 
signals including SIGSEGV, but I would let UNIX signals unusable). If 
any POSIX expert can help I would appreciate.

This is, in may opinion, the weakest point of the library. The behavior 
of the the shared memory is correct when all goes well. But when a 
process crashes, I can't do anything.
...
In second case, could some API be added
to destroy any named remnants on explicitly?
I think this is a good idea, so that we can make more robust 
applications. I would need to register in a process-level map (a static 
object perhaps all objects when I create them. However, this 
singleton-like interface can create problems if we create static named 
objects (when we will have a good singleton in C++?)
...
--------------------
Use of term "process-shared" and "named".
It is confusing (for me) and it took
me a while to get it.
I suggest to add short info on the top
of the page describing distinction between
these two approaches in tabular form.
Ok.
...
---------------------
Generally, this page (oswrappers.html)
is /extremely/ dense with information.
It covers stuff that is taught
during whole university semester.
I know, but I don't pretend to explain these concepts. I already suppose 
the programmer knows something about them.
...
At least reader should be warned
on the top of page that lot of time
is needed to grok it.
Ok.
...
__________________________________________________________
9. named_shared_object.html:
Perhaps a mini-synopsis with just
template parameters and few most
important functions (or groups of functions)
could be shown and only then followed by
the full class.
Ok.
...
-----------------
Title:
"Common named shared object classes"
==>
"Default specialisation of class doing named shared memory allocation"
I will try to find an alternative.
...
I do not like the group "object classes" and
do not like that something with "object"
in name is a class.
I think this should be discussed.
I agree. But I run out ideas some time ago.
...
__________________________________________________________
9. stl_allocators.html:
A very dense page. A table on the top comparing
advantages of each allocator type would help.
Ok
...
A link to header(s) implementing the allocators
should be present so reader could jump there
quickly.
Ok
...
__________________________________________________________
10. containers_explained.html:
An overview of Boost libraries compatible
with shmem couled be added (I suspect multi-index
is compatible, while say smart pointer generally isn't).
Sorry, but I'm not aware of any.
...
The last code snippet may be better if
separated into smaller and focused parts
like "all in shmem", "nothing in shmem",
"whatever inbetween" and so on.
Ok
...
__________________________________________________________
12. beyond_shared_memory.html:
This name suggests super-advanced functionality
far beyond needs or abilities of anyone.
Any name suggestion?
...
The relevation of memory mapped files feels
too late, at this moment. Too many readers won't
get that far or will skip the "beyond".
You are right. Maybe I should say it just after named_shared_object 
explanation, since you have basically the same features.
...
__________________________________________________________
13. streams.html: there could be possibly overlap with
    Boost.Iostreams library.
I don't know Boost.Iostream well. Anyway, I think bufferstream and 
vectorstream are very general tools that could replace many scanf/printf 
functions of C++ code alergic to stringstream overhead.
...
__________________________________________________________
14. shmem_smart_ptr.html:
For examples I suggest to always use full qualification
for boost::shmem::scoped_ptr (and the other one as well).
Ok
...
__________________________________________________________
15. shared_message_queue.html:
The mention of named_user_object should have backlink
and a comment that this means user-buffer-related tool.
------
There's no explicit info what happen if
the buffer-is-too-small error happens:
will it read part of message, will it
stay in queue or will it be discared?
It will stay in the queue.
...
This and info whether message can get fragmented
should be added.
Message won't never be fragmented. I will write that.
...
The double contruction/initialisation steps
may be merged into one (no strong opinion,
just a feeling).
I will add it as exception throwing alternative.
...
------
Functions like: how-big-is-the-next-message()
or peek() may be added.
Ok. With "peek" you mean copying the message but without extracting from 
the queue?
...
The "MyLess" in the example is not used consistently,
should be removed for clarity.
The names like "mg1", "mg2" should be longer.
Ok.
...
-----
Information on atomicity of the queue should
be told explicitly inside introduction paragraph.
Ok
...
__________________________________________________________
16. architecture.html:
Diagrams picturing the levels of the library and
interaction between the levels would be handy.
Ok
...
----------
Information how lifetime of OS resources is managed
should be here (especially for Posix).
Ok
...
__________________________________________________________
19. future_improvements.html
Security attributes - IMHO adding them is
almost trivial (I know, I know) and they should
be added before the library is thrown to public.
But how will you unify POSIX/Windows security attributes? I hard issue, 
in my opinion.
...
__________________________________________________________
20. All shmem exception should have one common parent.
Name sem_exception is too short.
lock_exception should be deadlock_exception.
Lock exception is named after boost::thread name.
...
__________________________________________________________
21. Wishes for additional functonality:
* ability to enumerate named from shmem.
  Useful for debugging/troubleshooting.
I could fill (atomically) a vector with info about types. I will look at 
this.
...
* debugging support:
Ok, but this is quite a big task. I would like to implement them in 
future versions of Shmem.
...
* primitive "transaction like" functionality
  for shmem.
Scenario:
   - shmem segment exists
   - I do copy (clone) of the shmem data
   - many operations with shmem are executed
   - something may fail but ability to revert into
     well-defined state (in application sense,
     not in low level C++ sense) is impossible
     at this moment
   - then the stored copy of shmem will be written
     back into shmem segment, restoring the initial,
     well defined state.
Uf. I think this is beyond my knowledge!
...
* the current documentation should visually
  distinguish parts showing problems, mistakes
  and potentially wrong uses of shmem.
Picture of a bomb is typically used
  for this purpose.
Ok.

Thanks for the review! I've left some comments out of the reply, but I 
think you will happy if I address "only" those from above. It's just 
that replying your reviews is a very hard work! ;-)

Regards,

Ion

Re: [boost] [Review] Shmem (lomg)

Ion Gaztañaga