[filesystem] path thread safety fix impact on POSIX systems

Class path locale initialization has suffered from a data race for several releases. See https://svn.boost.org/trac/boost/ticket/6320 for an example of code that suffers as a result.

The problem was introduced when locale initialization was changed from namespace scope initialization to function scope initialization.

For Windows and Mac OS X, the fix is simply to change back to namespace scope initialization.

For non-BSD based POSIX systems such as Linux, the problem is more complex. These systems need std::locale(""), "the locale-specific native environment".

Considerations:

* std::locale("") will throw if environment variables are configured incorrectly. For example, setting LANG=foo on my Ubuntu system causes std::locale("") to throw.

* std::locale("") is only needed if conversions between wide and narrow character paths occur in the program, so it would be unfortunate to have programs throw that don't actually do any such conversion.

* With GCC, std::locale("") at namespace scope will throw before main() has started! That prevents catching the exception in user code, and was what led to moving the initialization to a function scope static. Initialization as a function scope static also meant that the exception only occurred if user code actually performed wide-narrow conversions.

I can see two possible fixes:

(A) Use function scope locale initialization, using boost/detail/lightweight_mutex.hpp to prevent data races.

(B) Use namespace scope locale initialization, defaulting the codecvt facet to UTF-8 if std::locale("") throws.

The advantage of (B) is that path always initializes without throwing, and that's what users seem to expect. The initialization is correct for all those whose environments are configured correctly, and for those users who want UTF-8 even if their environments are misconfigured. The POSIX users who prefer an exception on a misconfigured environment can always add a std::locale("") at the start of main().

Comments or opinions?

--Beman
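As a rough illustration of option (B), the fallback could look something like the sketch below. This is not Boost.Filesystem's actual code: the names locale_or_fallback and path_locale_b are hypothetical, and std::locale::classic() stands in for a locale carrying a UTF-8 codecvt facet, which the sketch does not construct.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: try the named locale, fall back instead of throwing.
// Assumption: std::locale::classic() is only a stand-in for "a locale with
// a UTF-8 codecvt facet"; building that facet is outside this sketch.
inline std::locale locale_or_fallback(const char* name)
{
    try {
        return std::locale(name);        // throws std::runtime_error on a bad name
    } catch (const std::runtime_error&) {
        return std::locale::classic();   // fallback, never throws
    }
}

namespace {
    // Namespace scope initialization: runs before main(), but because the
    // helper swallows the exception, nothing can escape at that point.
    const std::locale path_locale_b = locale_or_fallback("");
}
```

With this shape, a misconfigured LANG no longer terminates the program before main(), at the price of silently using the fallback.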

On 30.12.2011 14:48, Beman Dawes wrote:
The problem with solution (B) is IMHO not that it lies, but that it /covers up/ a problem. The problem -- misconfiguration -- is still there but the user is made unaware of it. That's ungood.

So I would favor (A). The problem with that is then efficiency, or perceived inefficiency. But so what. I say, go for correctness, and don't fret about the nano-efficiency.

It could be different if the question was about some new clean thing, then it would warrant some redesign (mutable globals in the age of multi-processing aren't that bright an idea, really). But for just supporting the old unclean stuff -- don't fret about the nano-efficiency.

Cheers,

- Alf

On 12/30/2011 02:48 PM, Beman Dawes wrote:
From what you're saying, function-scope static initialization is only required on Linux. Then you say that the problem is that function-scope static initialization is not thread-safe. But function-scope static initialization is thread-safe with conforming C++11 compilers, and has been thread-safe with GCC and other compilers that follow the Itanium ABI for a fairly long time. So the problem seems to be non-existent.
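For reference, the function-scope static under discussion has roughly the following shape. This is a minimal sketch with a hypothetical accessor name (path_locale_static), not the library's code. It relies on the compiler guarding the static's initialization, which C++11 requires and which GCC does by default.

```cpp
#include <locale>

// Hypothetical accessor. The static is initialized on the first call only.
// With a C++11-conforming compiler, concurrent first calls are serialized
// by the compiler-generated guard.
inline const std::locale& path_locale_static()
{
    // May throw std::runtime_error if the environment is misconfigured.
    // In that case the static is not considered initialized, so the next
    // call attempts the initialization again.
    static const std::locale loc("");
    return loc;
}
```

Because the exception is raised inside a function called from user code, it can be caught there, and programs that never call the accessor never see it.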

On Fri, Dec 30, 2011 at 12:44 PM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
That seems the easiest and most robust solution, particularly if class path continues to throw on locale("") errors.
What about other compilers on Linux, like Intel and Clang? Any clue? --Beman

On Sat, Dec 31, 2011 at 4:00 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
That really shouldn't be necessary. Also, I'm trying to hold down dependencies on other libraries, or anything that is C++11 only. --Beman

On Saturday, December 31, 2011 18:20:54 Beman Dawes wrote:
Call once mechanism is easy to implement on top of pthread_once and atomic_count (or only atomic_count, if we rely on the fact that namespace-scope dynamic initializations are thread-safe). That won't introduce any dependencies except for pthreads, which you depend on already I assume.
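A call-once approach along these lines might be sketched as follows, using only POSIX pthread_once. All names here (locale_once, init_path_locale, path_locale_once) are hypothetical. The sketch catches inside the init routine because letting a C++ exception escape from a pthread_once init routine is not something POSIX defines. Strictly speaking the init routine should have C linkage, but GCC and Clang accept a plain function.

```cpp
#include <pthread.h>
#include <locale>
#include <stdexcept>

namespace {
    pthread_once_t locale_once = PTHREAD_ONCE_INIT;
    const std::locale* cached_locale = 0;   // set once, never freed (intentional)
    bool locale_failed = false;             // remembers a failed initialization

    // Runs exactly once, even if several threads race on the first call.
    void init_path_locale()
    {
        try {
            cached_locale = new std::locale("");
        } catch (...) {
            locale_failed = true;           // do not let the exception escape
        }
    }
}

// Hypothetical accessor: rethrows in the caller's context on failure.
inline const std::locale& path_locale_once()
{
    pthread_once(&locale_once, init_path_locale);
    if (locale_failed)
        throw std::runtime_error("std::locale(\"\") failed; check LANG and LC_* settings");
    return *cached_locale;
}
```

One difference from a function-scope static: here a failure is remembered, so every later call throws again rather than retrying the initialization.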

Participants (4): Alf P. Steinbach, Andrey Semashev, Beman Dawes, Mathias Gaunard