
9 Sep
2006
8:54 p.m.
On 9/3/06, Jeff Garland <jeff@crystalclearsoftware.com> wrote:
> <snip previous discussion>
> > This scheme addresses the performance problems noted in
> > http://www.sgi.com/tech/stl/string_discussion.html for reference
> > counted strings with unshareable state (like the g++ implementation),
> > because now shareable/unshareable state is assigned by the user at
> > compile time, and can be explicitly changed. In this way the user will
> > know what to expect from the performance point of view, and the
> > semantics will be (to my eyes) more clear.
>
> This seems like a reasonable approach for those that want to go the immutable
> route....care to develop it into a full blown proposal?

I developed my ideas into three concrete classes. I uploaded them to the
boost vault as imm_string_and_builder.zip (under Strings - Text Processing).
The tiny url to the zip file is http://tinyurl.com/hznt4 .

I added a test file to compare the copy/access cost of my string
implementations with std::string and const char *.

From my tests, splitting the string abstraction into an immutable string and
a string builder allows a more efficient implementation. My test results are:

1) compiling without threads configured (BOOST_HAS_THREADS undefined),
   copying an immutable string has a small overhead over plain const char *
2) compiling with BOOST_USE_ASM_ATOMIC_H, imm_string and string_builder
   always outperform std::string (I'm using the implementation delivered
   with g++ 4.0.1). The overhead is large w.r.t. const char * (due to the
   high cost of lock operations on Pentium IVs), but passing a const & to
   imm_string is as fast as passing a const char * (not true for
   std::string).
3) using move semantics on string builders achieves the same performance
   as copying immutable strings (see below for the actual numbers)

I'm interested in seeing the test results on other platforms/compilers.
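To make the split concrete, here is a minimal sketch of the general idea. It is not the code from the zip file: the namespace `sketch`, the members `freeze()` and `shares_buffer_with()`, and the use of std::shared_ptr (which postdates this discussion) are all assumptions made for illustration. The immutable class shares one read-only buffer between copies, so a copy only touches a reference count, while the builder owns its buffer exclusively and gives it away by move.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

namespace sketch {  // hypothetical namespace, not the one used in the zip

// Immutable string: the buffer is never modified after construction, so
// copies can share it. Copying only adjusts the shared_ptr reference count.
class imm_string {
public:
    imm_string() : data_(std::make_shared<std::string>()) {}
    explicit imm_string(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    const char* c_str() const { return data_->c_str(); }
    std::size_t size() const { return data_->size(); }

    // Read-only element access; there is no mutable overload.
    char operator[](std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }

    // Hypothetical helper: true when both objects refer to the same buffer.
    bool shares_buffer_with(const imm_string& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<const std::string> data_;
};

// String builder: owns its buffer exclusively, so writes never need to
// check whether the representation is shared.
class string_builder {
public:
    string_builder() = default;
    explicit string_builder(std::string s) : buf_(std::move(s)) {}

    string_builder& append(const std::string& s) {
        buf_ += s;
        return *this;
    }

    char& operator[](std::size_t i) {
        assert(i < buf_.size());
        return buf_[i];
    }

    std::size_t size() const { return buf_.size(); }

    // Hypothetical name: hands the buffer to an immutable string without
    // copying the characters. Only callable on an rvalue builder, which is
    // left empty afterwards.
    imm_string freeze() && {
        imm_string result(std::move(buf_));
        buf_.clear();
        return result;
    }

private:
    std::string buf_;
};

}  // namespace sketch
```

In this sketch a builder would be filled with append() and operator[], then converted with std::move(b).freeze(); every later copy of the resulting imm_string shares the same buffer. Whether a conversion of this kind matches the interface in the uploaded classes is not stated in this message.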
It is interesting to note that with this split of the abstraction into
different classes, the thread safety properties become compatible with
POSIX rules (that was my original goal):

* accessing an object as read only from multiple threads is safe
* even writing to different areas of a string builder (without changing
  its length) is MT-safe, like writing to a preallocated char []

I think that, if the upcoming standard deals with threads, some
thread-safety issues in the standard library will need to be addressed.
I think that for strings, these abstractions will be a useful starting
point.

Corrado

Test results for BOOST_USE_ASM_ATOMIC_H, on a PentiumIV 2.8GHz with HT

Baseline (const char *, without atomic count)
  Took 0.09 s on PKc
Baseline (const char *, with atomic count)
  Took 0.94 s on PKc
Pass by value
  Took 0.96 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 2.45 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 3.48 s on Ss // this is std::string
Pass by reference
  Took 0.09 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 0.08 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 0.13 s on Ss // this is std::string
Modified, Pass by value
  Took 1 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 2.45 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 3.78 s on Ss // this is std::string
Modified, Pass by reference
  Took 0.09 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 0.09 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 0.13 s on Ss // this is std::string
Leaked, Pass by value // Leaked state is peculiar of g++ strings implementations
  Took 1 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 2.45 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 2.56 s on Ss // this is std::string
Leaked, Pass by reference // Leaked state is peculiar of g++ strings implementations
  Took 0.09 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 0.09 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 0.13 s on Ss // this is std::string
Temp String, with copy
  Took 2.49 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 2.49 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
  Took 4.39 s on Ss // this is std::string
Temp String, Move semantics
  Took 1.25 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
  Took 1.03 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE

> Jeff
> _______________________________________________
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
>

--
__________________________________________________________________________
dott. Corrado Zoccolo                    mailto:czoccolo (at) gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------