Re: [boost] [filesystem] path thread safety fix impact on POSIX systems

30 Dec 2011

On 30.12.2011 14:48, Beman Dawes wrote:
...
Class path locale initialization has suffered from a data race for
several releases.
See https://svn.boost.org/trac/boost/ticket/6320 for an example of
code that suffers as a result.
The problem was introduced when locale initialization was changed from
namespace scope initialization to function scope initialization. For
Windows and Mac OS X, the fix is simply to change back to namespace
scope initialization.
For non-BSD based POSIX systems such as Linux, the problem is more
complex. These systems need std::locale(""), "the locale-specific
native environment". Considerations:
* std::locale("") will throw if environment variables are configured
incorrectly. For example, setting LANG=foo on my Ubuntu system causes
std::locale("") to throw.
* std::locale("") is only needed if conversions between wide and
narrow character paths occur in the program, so it would be
unfortunate to have programs throw that don't actually do any such
conversion.
* With GCC, std::locale("") at namespace scope will throw before
main() has started! That prevents catching the exception in the user
code, and was what led to moving the initialization to a function
scope static. Initialization as a function scope static also meant
that the exception only occurred if user code actually performed
wide/narrow conversions.
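
For illustration, the behaviour described in the bullets above can be
sketched in standard C++. This is only a minimal sketch, not
Boost.Filesystem code: native_locale_available() and
lazy_native_locale() are hypothetical names, and whether
std::locale("") actually throws depends on the platform and on how the
environment is configured.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper (not part of Boost.Filesystem): reports whether
// the environment's native locale can be constructed at all.
bool native_locale_available()
{
    try {
        std::locale loc("");   // throws std::runtime_error if the
                               // environment names an unknown locale
        (void)loc;
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Function-scope static: the locale is constructed on the first call,
// not before main(), so a caller can wrap that call in try/catch.
// If construction throws, the static stays uninitialized and the next
// call tries again. C++11 and later guarantee that this initialization
// is thread-safe; earlier standards made no such guarantee.
const std::locale& lazy_native_locale()
{
    static std::locale loc("");
    return loc;
}
```

A program that never calls lazy_native_locale() never constructs the
locale, so a misconfigured environment cannot make it throw.
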
I can see two possible fixes:
(A) Use function scope locale initialization, using
boost/detail/lightweight_mutex.hpp to prevent data races.
(B) Use namespace scope locale initialization, defaulting the codecvt
facet to UTF-8 if std::locale("") throws.
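
As a rough sketch of what (B) might look like, assuming nothing about
the real Boost.Filesystem sources: the names below are hypothetical,
and std::locale::classic() is used purely as a stand-in for the
UTF-8 codecvt fallback, which would need a facet that this sketch
does not supply.

```cpp
#include <locale>
#include <stdexcept>

namespace {

// Hypothetical helper: returns the environment's locale, or a fallback
// if the environment is misconfigured. The proposal is to fall back to
// a UTF-8 codecvt facet; std::locale::classic() is only a placeholder.
std::locale make_default_path_locale()
{
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Namespace-scope initialization: runs before main(), and the
// std::runtime_error from std::locale("") is never propagated.
const std::locale default_path_locale = make_default_path_locale();

} // namespace

// Hypothetical accessor: no locking is needed, because the object is
// fully constructed before main() starts and never modified afterwards.
const std::locale& path_locale_with_fallback()
{
    return default_path_locale;
}
```
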
The advantage of (B) is that path always initializes without throwing,
and that's what users seem to expect. The initialization is correct
for all those whose environments are configured correctly, and for
those users who want UTF-8 even if their environments are
misconfigured. The POSIX users who prefer an exception on a
misconfigured environment can always add a std::locale("") at the
start of main().

The problem with solution (B) is IMHO not that it lies, but that it 
/covers up/ a problem. The problem -- misconfiguration -- is still there 
but the user is made unaware of it. That's ungood.

So I would favor (A). The problem with that is then efficiency, or 
perceived inefficiency. But so what.
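
To make the trade-off concrete, here is a minimal sketch of (A),
under stated assumptions: it uses C++11 std::mutex in place of
boost/detail/lightweight_mutex.hpp, and path_locale_locked() is a
hypothetical name, not the actual Boost.Filesystem function.

```cpp
#include <locale>
#include <mutex>

namespace {

std::mutex path_locale_mutex;             // guards path_locale_ptr
std::locale* path_locale_ptr = nullptr;   // created on first use

} // namespace

// Hypothetical accessor: constructs the native locale on first use,
// with a mutex so that concurrent first calls cannot race.
// If std::locale("") throws, the pointer stays null, the exception
// reaches the caller, and a later call will try again.
const std::locale& path_locale_locked()
{
    std::lock_guard<std::mutex> lock(path_locale_mutex);
    if (!path_locale_ptr) {
        // The memory obtained by new is released automatically if the
        // constructor throws. The object is deliberately never deleted.
        path_locale_ptr = new std::locale("");
    }
    return *path_locale_ptr;
}
```

Every call pays for one lock and one unlock, which is presumably the
efficiency cost at issue; the misconfiguration still surfaces as an
exception, but only where user code can catch it.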

I say, go for correctness, and don't fret about the nano-efficiency. It 
could be different if the question was about some new clean thing, then 
it would warrant some redesign (mutable globals in the age of 
multi-processing aren't that bright an idea, really). But for just 
supporting the old unclean stuff -- don't fret about the nano-efficiency.

Cheers,

- Alf

Alf P. Steinbach