
Hello Ion,

"ION_G_M" wrote:
Thank you, Pavel, for your comments.
My interest was raised because once upon a time I implemented a portable shared-memory message queue used in a high-speed control system. While I was quite proud of it, had a general-purpose, working library existed I wouldn't have wasted as much time as I did back then. My guess is that people would like a very feature-rich (and still easy to use) shared memory module. Maybe the best way to think about it is how it could be used. I see a few common scenarios:

1. As a message queue between producer(s) and consumer(s), carrying typed or untyped messages (with possible filtering of these messages).
2. Application A starts, reads from shared memory, does something, updates it and quits. Some time later application B starts and does something else with the shared memory. The shared memory may actually rest in a file in between.
3. Application 'A' is the main application and it exposes its internals for customization. 'A' would work unchanged without shared memory. Then there are many small applications which can touch this or that part of the internals of 'A'.
4. A quick, dirty and suboptimal form of disk persistence. (Post++ on http://www.garret.ru/~knizhnik/ is vaguely similar; it uses caching.)

__________________________________________
4. Example in 3.2: the "alignment" parameter in segment.create() isn't found in the code.
I can't find the error you mention; I've just downloaded the zip file and 3.2 does not have any example. In 4.2, segment.create is missing a ",".
Oops, I meant the example in 4.1, the line with 8 /*alignment*/);

__________________________________________
5. Example in 4.2:

segment.named_new<MyType>
   ("MyType instance",   /*name of the object*/
    10,                  /*number of elements*/
    false,               /*if previously present, return error*/
    0,                   /*ctor first argument*/
    0);                  /*ctor second argument*/
a. It returns false. Exceptions are used only to indicate memory errors, throwing bad_alloc. You are right, there is info missing here. I will add more documentation to the examples.
If I understand it correctly the function acts "like a constructor". Then there are two ways to report an error:
- a bool return value if something is wrong with the shmem
- an exception from the actual object's constructor
Handling this combination can get messy. [two functions instead of a bool parameter]
b. If you find this approach more useful, I have no problem. I really don't like the boolean parameter, but I wanted to have "find or create" functionality. If you like it, an additional find_or_named_new<>() function to indicate that approach seems clearer to me than a boolean parameter.
My complaint is that in segment.named_new<MyType>("MyType instance", 10, false, 0, 0); the false doesn't give much of a clue about what it means. [syntax with separated arguments]
c. The syntax you propose is better, no doubt. I don't have experience with it; to implement this I suppose named_new<> should return a proxy object with overloaded operator()() functions. Is that right? If you want to help me, I'm open.
Yes. Maybe the technique from the object factory library (in the Files section) written by Robert Geiman could be used here.
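Something along these lines, as a very rough sketch (the names are illustrative and I'm assuming named_new<> has already reserved and registered the memory; this is not the actual Shmem interface):

   #include <new>   // placement new

   // named_new<T>("name") would return this proxy; operator() then runs the
   // constructor with clearly separated arguments, so the call reads
   //    MyType *p = segment.named_new<MyType>("MyType instance")(0, 0);
   template<class T>
   class construct_proxy
   {
      void *m_mem;   // memory already reserved inside the segment
   public:
      explicit construct_proxy(void *mem) : m_mem(mem) {}

      T *operator()()
      {  return new(m_mem) T();  }

      template<class A1>
      T *operator()(const A1 &a1)
      {  return new(m_mem) T(a1);  }

      template<class A1, class A2>
      T *operator()(const A1 &a1, const A2 &a2)
      {  return new(m_mem) T(a1, a2);  }
   };

More overloads (or the object factory technique) would cover longer argument lists.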
If boosters prefer throwing exceptions instead of returning false, no problem here.
Me, yes (explanation above). It may be possible to create an overload bool b = segment.named_new<MyType, std::nothrow>(.....) for when one doesn't like/use exceptions.
A problem I see is that my interface allows creating an array, like new[]. Do you consider this necessary? Would you prefer a different function? Maybe the proxy object should have an operator[] that can be used to specify array allocation?
I would prefer a separate function named_new_array<type>(array_count)(....). It would save one parameter where it is not needed.

__________________________________________
std::pair<MyType *, std::size_t> res = segment.find_named_new<MyType>("MyType instance");

Why do I need the "size"? Doesn't a type always have the same size regardless?
Size contains the number of elements in case you allocate an array. With

segment.named_new<MyType> ("MyType instance", /*name of the object*/ 10 /*number of elements*/, ...

you allocate an array of 10 elements, so it will return 10.
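In other words (with segment and MyType as in the examples above), the caller would do something like:

   std::pair<MyType*, std::size_t> res =
      segment.find_named_new<MyType>("MyType instance");

   // res.first  -> address of the first element (0 if the name is not found)
   // res.second -> number of elements: 10 here, 1 for a single object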
Maybe separate functions could be used:
a) type* = segment.find_named_new<MyType>(....) which would throw if you ask for an array
b) type* = segment.find_named_new<MyType, std::nothrow>(....) which would return 0 if nothing is found OR the data is an array
c) type* = segment.find_named_array_new<MyType>(....) and type* = segment.find_named_array_new<MyType, std::nothrow>(....) with similar behavior
d) bool = segment.is_array<MyType>(...)
Btw, maybe the names could be find_named_object<...>(...) etc.

__________________________________________
6. Could you use namespace shmem_detail or so instead of "detail" to avoid possible clashes?

No problem. I've seen a detail namespace in some projects, so I thought it was not a problem. The detail namespace is inside the boost::shmem namespace, so it shouldn't be necessary. Do you find it necessary even if the detail namespace is really boost::shmem::detail?
My mistake. I had misread it as boost::detail.

__________________________________________
8. offset_ptr.hpp: full_offset_ptr class
b) The flag could be eliminated completely.
The offset indicates the distance between the pointee and the this pointer of the offset_ptr, so m_offset == 0 indicates a pointer pointing to itself. This is quite common in STL containers when empty, since the next pointer in the end node points to the end node itself, resulting in m_offset == 0. Obviously this is different from a null pointer. If I change the meaning of m_offset to the offset from the beginning of the segment, I need the base address stored somewhere (and the base address is different in each process) so that I can convert from offset_ptr<A> to A* using get() or the constructor.
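In sketch form the current scheme is this (simplified, not the real full_offset_ptr code):

   #include <cstddef>

   template<class T>
   class offset_ptr_sketch
   {
      std::ptrdiff_t m_offset;  // distance from this object to the pointee
      bool           m_null;    // the flag under discussion

   public:
      offset_ptr_sketch(T *p = 0)
      {  this->set(p);  }

      void set(T *p)
      {
         m_null   = (p == 0);
         m_offset = m_null ? 0 : (char*)p - (char*)this;
      }

      T *get() const
      {
         // m_offset == 0 legitimately means "points to itself" (e.g. the end
         // node of an empty container), so null needs the separate flag.
         return m_null ? 0 : (T*)((char*)this + m_offset);
      }
   };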
Hmm, maybe something like (ptrdiff_t)-1 (or some symbolic name for such a value) could represent null instead. If one has a lot of pointers it could make a difference (and the pointer could also be passed in one register as a function parameter).

__________________________________________
9. Maybe the protection of the mutex from the shared_ptr lib could be worked around
I don't understand your point.
My misunderstanding. Maybe process-wide mutexes etc. should be pushed into Boost.Thread rather than here.

__________________________________________
12. The simple algorithm to find a fitting memory block may not be adequate for high-performance apps (which are the most likely to use shared memory).
You are right. The default algorithm is space-friendly, which I thought was more important than performance for fixed-size segments. You can write your own algorithm and use it, since shmem_alloc is a typedef of basic_shmem_alloc<default_algo>. If you prefer another algorithm, like segregated lists, I can try to write it, so that the user can choose the allocation algorithm. I've written the pooled allocator because of the default algorithm's slowness.
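For illustration, a deliberately dumb algorithm could look roughly like this; the exact requirements the segment places on the algorithm are not written down yet, so take the interface below as an assumption, not as documentation:

   #include <cstddef>

   // A trivial "bump" algorithm, just to show the shape of a pluggable
   // allocation algorithm: it is constructed over the segment's raw memory
   // and hands out bytes sequentially. A real algorithm (segregated lists,
   // best-fit, ...) would sit behind the same kind of interface.
   class simple_bump_algo
   {
      char        *m_base;
      std::size_t  m_size;
      std::size_t  m_used;
   public:
      simple_bump_algo(void *base, std::size_t size)
         : m_base(static_cast<char*>(base)), m_size(size), m_used(0) {}

      void *allocate(std::size_t nbytes)
      {
         if(m_used + nbytes > m_size)
            return 0;                     // segment exhausted
         void *ret = m_base + m_used;
         m_used += nbytes;
         return ret;
      }

      void deallocate(void *)
      {  /* a bump allocator never reuses freed memory */  }
   };

   // and then, hypothetically:
   // typedef basic_shmem_alloc<simple_bump_algo> my_shmem_alloc;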
Maybe the library could have an interface to plug in a different algorithm (no clue right now how it would look).

__________________________________________
13. Could it be possible to identify named objects with something other than a C string? Say wchar_t*, a number, or another templated type?
Do you think that the key type should be templatized?
Well, this is Boost ;-)
I think that an integer key could speed up searches a lot, but I have to think about which classes should be templatized. When storing other types of strings (for example std::string) I would need to build an allocator for strings in shared memory, and also a std::string, since it's probable that current STLs won't work with the Shmem STL allocator. The key's meaning would also be different: right now I copy the string to shared memory, but with a configurable key type things get more complicated.
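If it were templatized, the shape would probably be something like this (purely hypothetical, nothing of this exists in Shmem today):

   // Hypothetical: the named-object front-end parameterized on the key type.
   // KeyType could be const char*, an integer, or (with more work) a string
   // type that itself lives in shared memory.
   template<class KeyType>
   class basic_named_shared_object_sketch
   {
      // ... map from KeyType to (offset, size) stored inside the segment ...
   };

   // typedef basic_named_shared_object_sketch<const char*>  named_shared_object;
   // typedef basic_named_shared_object_sketch<unsigned int> inamed_shared_object;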
I think it is not critical, though nice to have. Maybe there could be a limitation on the types allowed as keys; const char* and integers should cover 90%.

__________________________________________
a) avoiding shmem-specific containers/mutexes/etc. as much as possible.
I think you can't avoid mutexes if you want to guarantee atomic memory allocation, since I don't have the skills to write a lock-free memory allocator.
This reminds me, the newest STLport beta has a lock-free allocator inside (I know just this, no more details).
Regarding containers, it was not my intention to write them, but I needed some of them to store name-buffer mappings, and a node container to test the pooled allocator on several systems. For now, I have only succeeded in using Shmem STL allocators with a modified Dinkum STL. STLport and libstdc++ suppose allocator::pointer to be a raw pointer, so I can't use their STL containers. I've chosen to make the internal containers public because I find them very useful, but if this is not accepted they can be kept for internal use only and removed from the documentation.
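The root of the problem, in sketch form (the allocator shown is illustrative, not the actual Shmem one):

   #include <cstddef>

   template<class T> class offset_ptr;   // Shmem's relative pointer

   // A shared memory allocator cannot hand out raw T*. The typedefs below
   // are what a conforming container should use, but standard libraries that
   // hard-code T* in their node types simply ignore allocator::pointer.
   template<class T>
   struct shmem_allocator_sketch
   {
      typedef T                    value_type;
      typedef offset_ptr<T>        pointer;        // NOT T*
      typedef offset_ptr<const T>  const_pointer;
      typedef std::size_t          size_type;
      typedef std::ptrdiff_t       difference_type;

      pointer allocate(size_type n);               // returns an offset_ptr
      void    deallocate(pointer p, size_type n);
      // ... rest of the std::allocator interface ...
   };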
Maybe the library could list the offending STLs and instructions on how to fix them manually so they are usable with Shmem. Such a fix should not change the behaviour of applications, right?

__________________________________________
b) ability to "equalize" shared memory features
I would need some help with this because my operating system knowledge is very limited. Mimicking the UNIX way on Windows can be very difficult, I think, unless you use a mapped file. I would need some serious help here.
The solution I used was to have an extra process keeping the shared memory alive on Windows. On Unix a process can periodically check shmem usage and destroy it if needed. I guess the library could just have an interface to plug in a helper process, something like: errno = segment::use_helper_process(exe_filename); Shmem could detect whether such a process already runs and exec() it if it doesn't. The process could shut itself down when it finds it is no longer needed. But I think such a feature isn't very critical and could be done outside the library.

__________________________________________
d) support for "transactions": I would like to
My knowledge of the transaction world is null, so I can't help you with that. I suppose that a shared memory condition variable would be very interesting for notifying events to other processes, but I'm afraid this is work for programmers more skilled than me (people from boost::threads, perhaps?). As far as I know, in Windows it is difficult to implement a shared memory condition variable; pthreads-win32 does not support it and I don't know how cygwin solves this.
What I mean is roughly this (just an idea):

- Transactions with isolation level 1: all changes to shmem are cached in normal memory. When the user queries shmem he will get the cached, not yet committed data (I think it should work because of offset_ptr<>). The effects of transactions from other apps would be visible. When the transaction gets committed, all the data is written into shmem at once. It should be possible to revert it when something goes wrong (e.g. by saving an old copy of the shmem).
- Transactions with isolation level 2: the data from shmem is copied into a buffer when the transaction starts and is used for the duration of the transaction. Otherwise it is the same as above.

Having transactions would add a *very* useful feature to shmem (for scenarios [2] and [3]), a feature quite hard to implement right. It would still be possible for the user to design his own specific transaction mechanism. The Tables library (something like an in-memory relational database, in the Files section), written by Arkadyi Vertelyeb, has transaction capability. Maybe its interface could be reused.

__________________________________________

While I am at it, one can think about more features for shmem (if these things are technically feasible):

- A C language binding (only the most primitive functionality, such as get_data_block/put_data_block). This would allow apps written in other languages to interoperate with shmem (up to a point). Not everyone has a C++ compiler, or is willing to use it, or dares to use Boost.
- The shmem segment could have the following functionality:
  - fixed location / any location in memory
  - fixed size / expandable / expandable and shrinkable
  - keep when not used / do not keep when not used (as suggested above)
  - a function to compact the data inside
  - a function to report how much memory is available / the max free block size
- Something like a debug mode, switched either at runtime or compile time, with sentinel guards, erasing of freed memory, etc. I guess commonly used memory checkers may not work with shared memory.
- create_shmem_from_file(const char* file, unsigned begin_offset, unsigned size)
- get_shmem_OS_handle()
- Events to report that shmem is getting exhausted: segment::report_exhausted(boost::function handler, unsigned high_threshold); possibly a mechanism similar to set_new_handler for new.
- A debug mode where offset_ptr<> contains a pointer to the shmem segment and verifies it doesn't point outside.
- Some data inside shmem could carry with it complete information on how to construct itself (a DLL name plus an identification string for an object factory there). Shmem could read this info, load the DLL and return the result to the application (something like a primitive component system). It would (1) keep most of a class's functionality in just one place and (2) if the data structures change, it could keep applications running without an upgrade.

/Pavel