
On 8/26/07, Tobias Schwinger <tschwinger@isonews2.com> wrote:
Andrey Semashev wrote:
Hello Tobias,
Sunday, August 26, 2007, 1:37:17 PM, you wrote:
Hmm, I'm not sure of the purpose of this project. Is it supposed to pass several tools to Boost under its umbrella via fast-track reviews?
Sort of. It's just an idea, so far.
Its purpose is to avoid lots of fast-track reviews (and reviewing overhead) for utility components by grouping them into a "pseudo library", thus encouraging developers to brush up / factor out useful stuff.
In this particular case the tool will be reviewed during the Boost.FSM review (if that happens at all, since it's not for public use anyway), so including it in X-Files won't reduce the number of reviews. On the other hand, if Boost.FSM is rejected but there is interest in this tool, I would gladly extract it into the X-Files project.
If your library gets accepted, LWCO is accepted as an implementation detail. AFAIK you need at least a fast-track review to make it a public thing (not sure that's what you want, though).
I don't like the idea of important and *exposed* items being 'accepted as an implementation detail'. An init_once isn't an easy thing to write. When people review Boost.FSM, are they going to take a close look at the 'once' implementation? Probably not. They are going to focus on the central items related to the stated purpose of the library.
However, I'd at least very much welcome a test suite for LWCO.
Unless carefully reviewed, all threading code has bugs. Even if tested. It is the nature of threaded code. It is *extremely* hard to test in such a way that all possible cases are tried. I had a bug where we missed a read barrier in a lock-free allocator. With 10-20 testers hammering on the product, the bug appeared about once a month, if you left it running large projects overnight. (And, after narrowing it down to which pointer was wrong it still took hours of staring at about 4 lines of code to figure out what was going on.)
Yes, but consider that this code will be executed only once. The rest of the execution time this mutex is useless.
Consider the deadlock if 'once' is used recursively to initialize different resources...
The mutex is recursive.
Sorry, missed it.
Further, it's quite unintuitive that a trivial initialization might get slowed down by one in another thread that takes a lot of time.
You can call 'pthread_mutex_destroy' once you're done with the mutex to free up any system resources it may have acquired.
I'll think about it. The first thing that comes to mind is that I'd have to count the threads that are blocked on the mutex, since destroying it right away would leave those threads with undefined behavior.
It might be possible to use the flag for the counter...
The fundamental problem arises here: I need to safely create a synchronization object. Non-POSIX APIs don't provide things like PTHREAD_MUTEX_INITIALIZER (or at least I didn't find them in the docs).
I see. Would it be an option to use 'yield' instead of 'sleep'?
The "yield" function is not guaranteed to switch execution context; it may return immediately. If a lower-priority thread has entered the once functor, you may spin for a relatively long time in a "yield" loop instead of just letting the lower-priority thread finish its job.
I figured something like that. Does 'Sleep' guarantee preemption - or does it depend on the argument and the resolution of the system timer?
Under Windows, Sleep(0) only relinquishes the CPU to threads of equal or higher priority. Sleep(1) will relinquish it to any thread.
On some platforms (such as x86) memory access is atomic, so atomic operations are just a waste of time for simple reads and writes such as the 'is_init' and 'set_called' stuff. The point is not only atomic reads and writes but also the memory barriers they perform. Without them, the result of executing the once functor might not be seen by other CPUs.
Then the memory barriers will suffice for x86, correct? As this code is executed on every call, any superfluous bus-locking should be avoided.
Actually, I got the impression that the barriers themselves account for a major part of the performance impact. Besides, not all compilers support barrier intrinsics.
Alternatively, doing an "uncertain read" to check whether we might need initialization before setting up the read barrier might be close enough to optimal.
Bad wording on my side. Substitute "Alternatively" with "Additionally".
Well, that's a tricky point. I'm not an expert in threading issues, but it's not obvious to me whether a memory barrier should take effect regardless of its scope. For example:
void foo(int& x, int& y)
{
    if (x == 0)
    {
        read_memory_barrier();
        y = 10;
        x = 1;
        write_memory_barrier();
    }
    // use y
}
Now, is it guaranteed that those barriers are in effect regardless of the value of x? I think not. Either the compiler may reorder statements in such a way that y is used before the "if" statement, or the CPU may do the same thing, since the barrier instructions may never be executed.
That's not quite what I meant:
// 'initialized' started out false

if (initialized)
{
    // 'initialized' is true for sure, but, without a read barrier,
    // we can't be sure that the objects it guards are seen as
    // initialized by the current processor. I.e.:
    object.foo(); // crash: object not yet seen as initialized
}
else
{
    // we can't know 'initialized' is still false, so let's
    // synchronize and check again
    read_memory_barrier();
    if (initialized)
    {
        // 'initialized' is true
    }
    else
    {
        // 'initialized' is false
    }
}
Now we only have to cross the barrier during (and immediately after) initialization.
Tony