[review] fixed strings library

Here is my quick review, in the hope of throwing in some fresh thoughts...

- What is your evaluation of the potential usefulness of the library?

Strings with an internal buffer are quite useful, and IMO it's plainly wrong to say they aren't. The "eliminate buffer overrun" motivation is not generic enough for my taste, though. Putting fixed_stringS into STL containers (which in turn could use a custom allocator, e.g. Boost.Pool) can significantly reduce the allocation costs (and would be my no. 1 use case for such a library).

Thinking this use case a bit further, I would also want a cached hash value (invalidated when the string gets modified). AFAICS I would currently have to overload the whole non-const interface to implement this behaviour... Which brings me to:

- What is your evaluation of the design?

The scope of the library should be widened to provide a policy-based, general-purpose string implementation. There are a lot of enhancements that can be made in terms of strings (and memory management) -- where to put the buffer, whether the buffer has a fixed size, and what to do if a string gets larger than some limit are only three aspects of it. Lots of interesting specializations come to mind, such as a variable-sized string type built upon a static buffer plus an optional dynamic buffer, intrusive hashing (as mentioned above), copy-on-write, immutable string handles, an SGI-ropes-style backend, etc. (not that all these features must be implemented by the library, but the extension points for them should exist).

- What is your evaluation of the documentation?

Jumping from the introduction straight to the *_impl stuff (BTW, I don't like these names being part of a public interface) seems like a pretty rough transition... Generally the documentation doesn't read as smoothly as the documentation of other Boost libraries (a subjective impression), but I believe it will once the other reviewers' comments have been addressed, so I won't go into further detail here.

- Did you try to use the library? With what compiler? Did you have any problems?
- What is your evaluation of the implementation?

I haven't used the library yet and haven't read enough of the source to judge the implementation.

- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

A glance, plus reading some of the discussion around the review.

- Are you knowledgeable about the problem domain?

Security? Not really.

- Should the library be accepted as a Boost library?

Generally I welcome components that increase the control over memory access/management. I think the functionality provided by the library is Boost-worthy, but as a policy of a more general solution. I'm confident the author can provide it, so I vote for another review when it's done ( *duck* ;-) ).

Regards, Tobias
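To make the "cached hash value" idea from the review concrete, here is a minimal sketch. It is not the reviewed library's fixed_string: the class name `hashed_fixed_string`, its members, the truncate-on-overflow behaviour, and the choice of FNV-1a as the hash function are all assumptions made for illustration. The point it shows is that every mutating member has to clear the cache flag, which is why bolting this onto an existing string means touching the whole non-const interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: a fixed-capacity string that caches its hash.
// Not the API of the library under review.
template <std::size_t N>
class hashed_fixed_string {
public:
    hashed_fixed_string() : size_(0), hash_valid_(false), hash_(0) { buf_[0] = '\0'; }

    explicit hashed_fixed_string(const char* s) : hashed_fixed_string() { append(s); }

    // Appends as much of s as fits; excess characters are dropped.
    // (Truncation is just one possible overflow policy, assumed here.)
    void append(const char* s) {
        std::size_t len = std::strlen(s);
        const std::size_t room = N - size_;
        if (len > room) len = room;
        std::memcpy(buf_ + size_, s, len);
        size_ += len;
        buf_[size_] = '\0';
        hash_valid_ = false;  // every mutator must invalidate the cache
    }

    void clear() {
        size_ = 0;
        buf_[0] = '\0';
        hash_valid_ = false;
    }

    char operator[](std::size_t i) const {
        assert(i < size_);
        return buf_[i];
    }

    const char* c_str() const { return buf_; }
    std::size_t size() const { return size_; }

    // Computes the hash lazily (64-bit FNV-1a) and reuses it until the
    // next modification.
    std::uint64_t hash() const {
        if (!hash_valid_) {
            std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
            for (std::size_t i = 0; i < size_; ++i) {
                h ^= static_cast<unsigned char>(buf_[i]);
                h *= 1099511628211ULL;                  // FNV prime
            }
            hash_ = h;
            hash_valid_ = true;
        }
        return hash_;
    }

private:
    char buf_[N + 1];  // internal buffer plus terminating '\0'
    std::size_t size_;
    mutable bool hash_valid_;
    mutable std::uint64_t hash_;
};
```

With a policy-based design, the invalidation hook could presumably live in one place instead of being repeated in each mutator.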

Tobias Schwinger wrote:
Putting fixed_stringS into STL containers (which in turn could use a custom allocator, e.g. Boost.Pool) can significantly reduce the allocation costs (and would be my no.1 use case for such a library).
have you tried to benchmark fixed_string compared to std::string with the short string optimization?
is fixed_string really faster?
-Thorsten

Thorsten Ottosen wrote:
Tobias Schwinger wrote:
Putting fixed_stringS into STL containers (which in turn could use a custom allocator, e.g. Boost.Pool) can significantly reduce the allocation costs (and would be my no.1 use case for such a library).
have you tried to benchmark fixed_string compared to std::string with the short string optimization?
is fixed_string really faster?
I haven't done any benchmarking, but plan on doing so for the next review cycle. - Reece
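Short of timing anything, one rough way to see what is at stake is to count heap allocations. The sketch below is not from the thread: `counting_allocator`, `allocation_count`, `counted_string`, and `allocations_for` are hypothetical names, and the code assumes a C++11 standard library whose std::basic_string accepts a minimal allocator. It counts how many allocations a string of a given length triggers, which on an implementation with the short string optimization is typically zero for short strings.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Global allocation counter (hypothetical helper for this sketch).
inline std::size_t& allocation_count() {
    static std::size_t count = 0;
    return count;
}

// Minimal C++11 allocator that counts every allocate() call.
template <class T>
struct counting_allocator {
    using value_type = T;

    counting_allocator() = default;
    template <class U>
    counting_allocator(const counting_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

using counted_string =
    std::basic_string<char, std::char_traits<char>, counting_allocator<char>>;

// Number of allocations made while constructing one string of the given length.
inline std::size_t allocations_for(std::size_t length) {
    allocation_count() = 0;
    {
        counted_string s(length, 'x');
    }
    return allocation_count();
}
```

Comparing `allocations_for(8)` with `allocations_for(1000)` on a given compiler shows whether short strings avoid the heap there. It says nothing about copy cost, so it complements a timing benchmark rather than replacing one.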

"Thorsten Ottosen" <tottosen@dezide.com> wrote in message news:drraqn$pok$1@sea.gmane.org...
have you tried to benchmark fixed_string compared to std::string with the short string optimization?
is fixed_string really faster?
-Thorsten
Thorsten, the biggest problem I have found with libraries that offer this optimization, and I am glad that VC++ 7.1's is one of them, is that the definition of "short" is non-configurable, except by changing a very important header file. Even if a change to that header file is made, there is no guarantee that the compiler actually reads the header file and does not have its contents hardcoded when an include directive for that header file is seen in source code, since the header file is a "standard" (i.e. included with the compiler) header file. For example, in VC++, IIRC, it is set to somewhere in the neighborhood of 15. I would much rather that they upped this value to 31 or 63, or somewhere thereabouts. Michael Goldshteyn
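The threshold Michael describes can usually be observed from user code, even though it cannot be changed. The following is a sketch under assumptions: `default_string_capacity` and `stored_inline` are hypothetical helper names, and the claim that an empty string's capacity equals the inline buffer size holds only on implementations that actually use the short string optimization. The standard guarantees none of this.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Capacity of a default-constructed std::string. On implementations with
// the short string optimization this is typically the number of characters
// that fit in the inline buffer.
inline std::size_t default_string_capacity() {
    return std::string().capacity();
}

// True if the character data of s is stored inside the std::string object
// itself rather than in a separately allocated block.
inline bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    // std::less gives a well-defined total order over unrelated pointers.
    std::less<const char*> before;
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}
```

Printing `default_string_capacity()` on a given toolchain shows its definition of "short", and `stored_inline` can confirm whether a particular string fell under it.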

Michael Goldshteyn wrote:
"Thorsten Ottosen" <tottosen@dezide.com> wrote in message news:drraqn$pok$1@sea.gmane.org...
have you tried to benchmark fixed_string compared to std::string with the short string optimization?
is fixed_string really faster?
-Thorsten
Thorsten, the biggest problem I have found with libraries that offer this optimization, and I am glad that VC++ 7.1's is one of them, is that the definition of "short" is non-configurable, except by changing a very important header file. Even if a change to that header file is made, there is no guarantee that the compiler actually reads the header file and does not have its contents hardcoded when an include directive for that header file is seen in source code, since the header file is a "standard" (i.e. included with the compiler) header file.
For example, in VC++, IIRC, it is set to somewhere in the neighborhood of 15. I would much rather that they upped this value to 31 or 63, or somewhere thereabouts.
maybe std::char_traits could contain such a constant? -Thorsten

"Thorsten Ottosen" <tottosen@dezide.com> wrote in message news:drsf4g$hob$1@sea.gmane.org...
Michael Goldshteyn wrote:
Thorsten, the biggest problem I have found with libraries that offer this optimization, and I am glad that VC++ 7.1's is one of them, is that the definition of "short" is non-configurable, except by changing a very important header file. Even if a change to that header file is made, there is no guarantee that the compiler actually reads the header file and does not have its contents hardcoded when an include directive for that header file is seen in source code, since the header file is a "standard" (i.e. included with the compiler) header file.
For example, in VC++, IIRC, it is set to somewhere in the neighborhood of 15. I would much rather that they upped this value to 31 or 63, or somewhere thereabouts.
maybe std::char_traits could contain such a constant?
-Thorsten
It's hard to say, but whatever the place chosen it should be compatible with similar extensions to other containers: - vector (i.e. how does it scale as the reserve is exceeded?) - deque (i.e. how can the page size be set by the programmer?) - hash_* (i.e. how can an initial hash table size be reserved? how does it grow when the reserve is exceeded?) So, in this light, perhaps char_traits isn't the right place, but I am not sure what is, short of adding another template parameter to specify a growth/initial_size policy. Michael Goldshteyn
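The "another template parameter" option mentioned above might look roughly like the sketch below. None of it is an existing standard or Boost interface: `doubling_growth`, `exact_growth`, `next_capacity`, and `policy_vector` are hypothetical names, and the wrapper simply forwards to std::vector::reserve to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical growth policies: given the current capacity and the capacity
// that is required, return the capacity to reserve next.
struct doubling_growth {
    static std::size_t next_capacity(std::size_t current, std::size_t required) {
        const std::size_t grown = current ? current * 2 : 1;
        return grown < required ? required : grown;
    }
};

struct exact_growth {
    static std::size_t next_capacity(std::size_t /*current*/, std::size_t required) {
        return required;
    }
};

// A thin wrapper whose initial reserve and growth behaviour are chosen by
// the programmer through template parameters.
template <class T, std::size_t InitialCapacity, class Growth = doubling_growth>
class policy_vector {
public:
    policy_vector() { data_.reserve(InitialCapacity); }

    void push_back(const T& value) {
        if (data_.size() == data_.capacity()) {
            // Ask the policy how far to grow once the reserve is exceeded.
            data_.reserve(Growth::next_capacity(data_.capacity(), data_.size() + 1));
        }
        data_.push_back(value);
    }

    std::size_t size() const { return data_.size(); }
    std::size_t capacity() const { return data_.capacity(); }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::vector<T> data_;
};
```

The same pair of parameters (an initial size and a growth policy) could in principle be applied to a string's inline buffer, a deque's page size, or a hash table's bucket count, which is the kind of consistency across containers being asked for.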

Thorsten Ottosen wrote:
Tobias Schwinger wrote:
Putting fixed_stringS into STL containers (which in turn could use a custom allocator, e.g. Boost.Pool) can significantly reduce the allocation costs (and would be my no.1 use case for such a library).
have you tried to benchmark fixed_string compared to std::string with the short string optimization?
IMO it isn't necessary, given that not all implementations of the standard library provide this optimization. Even if they did, it is still easy to come up with a case where the assumptions behind the optimization don't hold (enough copying combined with a fast special-purpose allocator can make a string with SSO perform worse than one without). So the more control, the better; this applies not only to strings but to memory management in general.

Regards, Tobias
participants (4)
- Michael Goldshteyn
- Reece Dunn
- Thorsten Ottosen
- Tobias Schwinger