
Hello all, I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library? Here is the link: http://conststring.sourceforge.net/ -- Maxim Yegorushkin

"Maxim Yegorushkin" <e-maxim@yandex.ru> wrote in message news:opsejcddh2ti5cme@wkcg6rirwp...
Hello all,
I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library?
I am interested, but don't have time to examine it in detail now. I have a few comments and questions: 1. You might mention Java together with Python as a language with immutable built-in strings. 2. I don't understand the following statement: Unlike std::basic_string<>, empty string object references empty c-style string "". 3. What are the semantics of std::string str("abc"); h(boost::cref(str)); Best Regards, Jonathan

Jonathan Turkanis wrote:
[...] 2. I don't understand the following statement:
Unlike std::basic_string<>, empty string object references empty c-style string "".
I think this means that if the string contains a charT*, it will never be null (but rather point to a special empty string object). I don't see why the user would need to know that, unless it implies something special about begin() and end().
3. What are the semantics of
std::string str("abc"); h(boost::cref(str));
Without looking at the code, I would expect that the const_string<> would share a representation with str, thus implying that it had better not outlive str. Dave

"David B. Held" <dheld@codelogicconsulting.com> wrote in message news:ciidcc$4tl$1@sea.gmane.org...
Jonathan Turkanis wrote:
[...] 2. I don't understand the following statement:
Unlike std::basic_string<>, empty string object references empty c-style string "".
I think this means that if the string contains a charT*, it will never be null (but rather point to a special empty string object).
This is what I would have expected. The phrase 'unlike std::basic_string<>' confused me.
I don't see why the user would need to know that, unless it implies something special about begin() and end().
3. What are the semantics of
std::string str("abc"); h(boost::cref(str));
Without looking at the code, I would expect that the const_string<> would share a representation with str, thus implying that it had better not outlive str.
And that str must not be modified during its lifetime, I guess. Jonathan

Jonathan Turkanis <technews@kangaroologic.com> wrote:
Jonathan Turkanis wrote:
[...] 2. I don't understand the following statement:
Unlike std::basic_string<>, empty string object references empty c-style string "".
I think this means that if the string contains a charT*, it will never be null (but rather point to a special empty string object).
That is true. I can't clearly state which benefits it provides, but it seemed to me that it would be more natural for an empty string to be the same as "". []
3. What are the semantics of
std::string str("abc"); h(boost::cref(str));
Without looking at the code, I would expect that the const_string<> would share a representation with str, thus implying that it had better not outlive str.
That is also true. By default the string allocates and copies its argument, but it can be forced not to do so by wrapping the argument in a boost::cref() call.
And that str must not be modified during its lifetime, I guess.
This is what the string is all about - using its interface it is impossible to inadvertently modify the string. -- Maxim Yegorushkin
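The copy-by-default versus share-on-request behaviour discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: tiny_const_string is a hypothetical stand-in, not the library's const_string<>, and std::cref is used in place of boost::cref so the snippet needs nothing beyond the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>  // std::cref, std::reference_wrapper
#include <memory>
#include <string>

// Hypothetical stand-in for const_string<>; NOT the library's implementation.
class tiny_const_string {
public:
    // Default: allocate and copy the argument into an owned, shareable buffer.
    explicit tiny_const_string(const std::string& s)
        : owned_(std::make_shared<const std::string>(s)), view_(owned_.get()) {}

    // Opt-in: reference the caller's string without copying it.
    // The caller must keep the source alive and unmodified.
    explicit tiny_const_string(std::reference_wrapper<const std::string> r)
        : owned_(), view_(&r.get()) {}

    const char* data() const { return view_->data(); }
    std::size_t size() const { return view_->size(); }
    bool owns() const { return static_cast<bool>(owned_); }
    std::string str() const { return *view_; }  // copies the characters out

private:
    std::shared_ptr<const std::string> owned_;  // empty when only referencing
    const std::string* view_;                   // the characters actually seen
};

// Usage sketch: the copy is insulated from later edits of the source,
// while the referencing object observes them.
inline void cref_demo() {
    std::string src("abc");
    tiny_const_string copy(src);
    tiny_const_string shared(std::cref(src));
    src[0] = 'X';
    assert(copy.str() == "abc");
    assert(shared.str() == "Xbc");
}
```

The second half of cref_demo() is exactly the hazard Jonathan raises: once the representation is shared, nothing in the immutable interface can stop the original std::string from being changed or destroyed.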

"Maxim Yegorushkin" <e-maxim@yandex.ru> wrote in message news:opsejsvsjeti5cme@wkcg6rirwp...
Jonathan Turkanis <technews@kangaroologic.com> wrote:
And that str must not be modified during its lifetime, I guess.
This is what the string is all about - using its interface it is impossible to inadvertently modify the string.
I get that part. I was just saying if you construct a const_string which shares the implementation of a std::string, it's the programmer's responsibility not to modify the std::string. Right? Jonathan

Jonathan Turkanis <technews@kangaroologic.com> wrote:
And that str must not be modified during its lifetime, I guess.
This is what the string is all about - using its interface it is impossible to inadvertently modify the string.
I get that part. I was just saying if you construct a const_string which shares the implementation of a std::string, it's the programmer's responsibility not to modify the std::string. Right?
Yes. -- Maxim Yegorushkin

Maxim Yegorushkin wrote:
Hello all,
I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library?
I don't have any time to study your proposal, but I'm interested in such a library. I wish such a thing was in TR1, truth be told. -cd

Maxim Yegorushkin wrote:
I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library?
Here is the link: http://conststring.sourceforge.net/
First, I'll say that yes, I am very interested in a library such as this. Second, let me say that "const_string<>" and "immutable string" don't seem like the right terms to me. The fact that you can call operator?=() and append() on it tells me that there is little that is "const" or "immutable" about it. It seems to me that "cow_string<>" would be more appropriate, but perhaps others have different opinions. Without taking a super-deep look at the code, I will say that it looks fairly clean and appears to appropriately use modern techniques. A small detail is that I would tend to call const_string_storage a policy, since that is basically how it is being used. You could call it a StoragePolicy for the string. Also, types ending in _t tend to be reserved for typedefs of fundamental types. By convention, we usually name template parameters with Capitalized names: charT, Allocator, StoragePolicy, etc. I suppose it's not completely a COW string, since you don't support the full set of mutating operations. However, some more documentation would be nice. In particular, I would like to see guarantees about size and performance as compared to basic_string<>, etc. Dave

David B. Held <dheld@codelogicconsulting.com> wrote:
I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library? Here is the link: http://conststring.sourceforge.net/
First, I'll say that yes, I am very interested in a library such as this. Second, let me say that "const_string<>" and "immutable string" don't seem like the right terms to me. The fact that you can call operator?=() and append() on it tells me that there is little that is "const" or "immutable" about it. It seems to me that "cow_string<>" would be more appropriate, but perhaps others have different opinions.
Well, I just did not have a better idea and called it so. If there is a more suitable name I will happily rename it.
Without taking a super-deep look at the code, I will say that it looks fairly clean and appears to appropriately use modern techniques. A small detail is that I would tend to call const_string_storage a policy, since that is basically how it is being used.
I agree that the const_string_storage is indeed a policy. I did not call it policy because I felt it was very unlikely a user would ever want to implement it herself, since the class essentially provides the core interface and functionality that makes the string constant. I would rather go for named template parameters technique to hide const_string_storage from the user, and it was not done so because I am still not sure about this.
You could call it a StoragePolicy for the string. Also, types ending in _t tend to be reserved for typedefs of fundamental types. By convention, we usually name template parameters with Capitalized names. charT, Allocator, StoragePolicy, etc.
I partially agree. Using something to distinguish type names from variable names is often beneficial and I think a suffix is much better than a prefix. I knew about the boost conventions and yet I've been seeing libraries in boost that do not follow them. Anyway, this is not an issue for me and can be easily changed.
I suppose it's not completely a COW string, since you don't support the full set of mutating operations.
The intent is exactly not to provide mutating operations that change individual characters of the string. The string object itself can be actually changed (just like you can't change the string pointed by char const* p, but you can change p to point to another string).
However, some more documentation would be nice. In particular, I would like to see guarantees about size and performance as compared to basic_string<>, etc.
Yes, that's what I'm thinking about right now. I have very little free time these days to carefully write docs. The text at the link was written from scratch during a free hour I had today to get some feedback from you guys. Thank you Dave and all others for the interest and the precious time spent looking at the stuff. After reading your posts I see that I should elaborate the docs in the first place and then polish it... -- Maxim Yegorushkin
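Maxim's distinction above (the characters cannot be changed, but the string object can be made to refer to different characters, much like a char const* p) might be sketched like this. The class rebindable_string and its members are hypothetical names chosen for illustration; they are not taken from the const_string<> interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical sketch; names are illustrative, not the const_string<> API.
class rebindable_string {
public:
    explicit rebindable_string(const std::string& s)
        : rep_(std::make_shared<const std::string>(s)) {}

    // Read-only element access: there is no overload returning char&,
    // so individual characters cannot be written through this interface.
    char operator[](std::size_t i) const { return (*rep_)[i]; }

    // "Mutation" rebinds this object to a freshly built buffer.
    // Other objects still sharing the old buffer are left untouched.
    rebindable_string& operator+=(const std::string& tail) {
        rep_ = std::make_shared<const std::string>(*rep_ + tail);
        return *this;
    }

    const std::string& str() const { return *rep_; }
    bool shares_with(const rebindable_string& o) const { return rep_ == o.rep_; }

private:
    std::shared_ptr<const std::string> rep_;  // immutable, shared characters
};

// Usage sketch: copying shares the buffer; appending rebinds only one object.
inline void rebind_demo() {
    rebindable_string a("abc");
    rebindable_string b = a;   // cheap copy, same buffer
    assert(a.shares_with(b));
    b += "def";                // b now refers to a new buffer
    assert(a.str() == "abc");
    assert(b.str() == "abcdef");
}
```

Seen this way, operator+= and append() do not contradict immutability of the characters, which may be why neither "const" nor "COW" fits the class perfectly as a name.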

In article <ciieha$70o$1@sea.gmane.org>, "David B. Held" <dheld@codelogicconsulting.com> wrote:
Also, types ending in _t tend to be reserved for typedefs of fundamental types
All identifiers ending with _t are reserved by POSIX. Unless you are a part of a language or library standard that is part of POSIX (which boost isn't), you should stay away from all such identifiers. meeroh

Miro Jurisic <macdev@meeroh.org> wrote:
In article <ciieha$70o$1@sea.gmane.org>, "David B. Held" <dheld@codelogicconsulting.com> wrote:
Also, types ending in _t tend to be reserved for typedefs of fundamental types
All identifiers ending with _t are reserved by POSIX. Unless you are a part of a language or library standard that is part of POSIX (which boost isn't), you should stay away from all such identifiers.
Ok, I'll stick to boost conventions. -- Maxim Yegorushkin

On Sun, 19 Sep 2004 01:33:26 -0400, Miro Jurisic <macdev@meeroh.org> wrote:
In article <ciieha$70o$1@sea.gmane.org>, "David B. Held" <dheld@codelogicconsulting.com> wrote:
Also, types ending in _t tend to be reserved for typedefs of fundamental types
All identifiers ending with _t are reserved by POSIX. Unless you are a part of a language or library standard that is part of POSIX (which boost isn't), you should stay away from all such identifiers.
</lurk> (?) I thought that's what namespaces were for. Is the _t reservation a C equivalent of a reserved namespace? If POSIX defines _t macros I guess namespaces won't help, but if not why do we need to reserve _t names for POSIX use? <lurk> Max Wilson -- Ubi solitudinem faciunt, pacem appellant. They make a desert and call it peace. -Tacitus

Maxim Yegorushkin wrote:
I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library?
Here is the link: http://conststring.sourceforge.net/
Just wanted to mention here, that I've used const_string as a plug-in replacement for std::string for my Wave library and succeeded without any further problems. Moreover, the const_string class gave a performance boost of about 20% if compared to the flex_string and about 100% if compared to the std::string (I can provide the concrete numbers, if somebody is interested). But these numbers are certainly very specific to Wave and its applications. Regards Hartmut

At 04:30 AM 9/19/2004, Hartmut Kaiser wrote:
Maxim Yegorushkin wrote:
I've been working on a small library - an immutable C++ string. Now it is in usable state and has been put in use in a couple of real projects. Although it needs further polishing, I would like to know is there any interest in such a library?
Here is the link: http://conststring.sourceforge.net/
Just wanted to mention here, that I've used const_string as a plug-in replacement for std::string for my Wave library and succeeded without any further problems. Moreover, the const_string class gave a performance boost of about 20% if compared to the flex_string and about 100% if compared to the std::string (I can provide the concrete numbers, if somebody is interested). But these numbers are certainly very specific to Wave and its applications.
100% gets my attention, that's for sure! Could you give a small example of a typical usage or two? I'd like to form an opinion as to how common the usage is likely to be. --Beman

Beman Dawes wrote:
Just wanted to mention here, that I've used const_string as a plug-in replacement for std::string for my Wave library and succeeded without any further problems. Moreover, the const_string class gave a performance boost of about 20% if compared to the flex_string and about 100% if compared to the std::string (I can provide the concrete numbers, if somebody is interested). But these numbers are certainly very specific to Wave and its applications.
100% gets my attention, that's for sure!
Could you give a small example of a typical usage or two? I'd like to form an opinion as to how common the usage is likely to be.
I've simply used the const_string class instead of the std::string/flex_string classes. The const_string class exposes a std::string compatible interface (sans the modifying member functions, certainly), so this was possible without problems. Regards Hartmut

"Hartmut Kaiser" <hartmutkaiser@t-online.de> wrote in message news:1C96Yi-0TQmY40@afwd01.sul.t-online.com...
Beman Dawes wrote:
Just wanted to mention here, that I've used const_string as a plug-in replacement for std::string for my Wave library and succeeded without any further problems. Moreover, the const_string class gave a performance boost of about 20% if compared to the flex_string and about 100% if compared to the std::string (I can provide the concrete numbers, if somebody is interested). But these numbers are certainly very specific to Wave and its applications.
100% gets my attention, that's for sure!
Could you give a small example of a typical usage or two? I'd like to form an opinion as to how common the usage is likely to be.
I've simply used the const_string class instead of the std::string/flex_string classes. The const_string class exposes a std::string compatible interface (sans the modifying member functions, certainly), so this was possible without problems.
I wouldn't want to rush things, but I'd like to point out that if you use const_string in Wave, it might become a candidate for fast track review. Jonathan

Hartmut Kaiser wrote:
[...] Just wanted to mention here, that I've used const_string as a plug-in replacement for std::string for my Wave library and succeeded without any further problems. Moreover, the const_string class gave a performance boost of about 20% if compared to the flex_string and about 100% if compared to the std::string (I can provide the concrete numbers, if somebody is interested). But these numbers are certainly very specific to Wave and its applications.
Excellent info, Hartmut. Anyway, it doesn't bother me that they are Wave-specific numbers, because that's the point, isn't it? You wouldn't use const_string<> as a wholesale replacement for std::string, but if you choose it in an appropriate application, you *can* realize up to 100% performance improvement. The fact that this is demonstrated in Wave, and not an artificial benchmark, makes it a very valuable test point, in my opinion. Dave

Maxim Yegorushkin wrote: [...]
Here is the link: http://conststring.sourceforge.net/
"The mutability of std::basic_string<> and its interface do not allow implementers to make it a lightweight value object with cheap copy operation through string representation sharing and copy-on-write technique while maintaining thread safety." This claim borders on blatant FUD. Claims to the extent that std::string is sorta "less thread-safe than char[]" don't hold water because there isn't non-const overload of operator[] for char[]. Or am I just missing something? regards, alexander.

From: Alexander Terekhov <terekhov@web.de>
Maxim Yegorushkin wrote: [...]
Here is the link: http://conststring.sourceforge.net/
"The mutability of std::basic_string<> and its interface do not allow implementers to make it a lightweight value object with cheap copy operation through string representation sharing and copy-on-write technique while maintaining thread safety."
This claim borders on blatant FUD. Claims to the extent that std::string is sorta "less thread-safe than char[]" don't hold water because there isn't non-const overload of operator[] for char[].
I think it's just saying that COW can make strings more efficient in single-threaded applications, but it's a pessimization for multithreaded applications. Thus, an implementation can provide different versions of std::basic_string for ST versus MT builds, it can use COW (with locking for MT), or it can eschew COW altogether. The first choice is harder for the implementer, but results in (potentially) the best performance in both cases. The second is suboptimal in MT applications, and the third is suboptimal in ST applications. Using an immutable string ensures the best performance regardless of platform and threading model, even if that is the same as the performance of std::basic_string on select platform/threading model configurations. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
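The cost Rob describes can be made concrete with a toy copy-on-write string. The sketch below is single-threaded and purely illustrative: toy_cow_string is a hypothetical class, not any real std::basic_string implementation. It shows that a non-const operator[] returning char& must unshare the buffer before handing out the reference, even when the caller only reads. In a multithreaded build the sharing check and the copy would additionally need locks or atomic operations, which is the pessimization under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Toy single-threaded copy-on-write string; hypothetical, for illustration only.
class toy_cow_string {
public:
    explicit toy_cow_string(const std::string& s)
        : rep_(std::make_shared<std::string>(s)) {}

    // Const access can safely read from the shared buffer.
    char operator[](std::size_t i) const { return (*rep_)[i]; }

    // Non-const access returns a writable reference, so the buffer has to be
    // made private first, whether or not the caller actually writes.
    char& operator[](std::size_t i) {
        unshare();
        return (*rep_)[i];
    }

    bool shares_with(const toy_cow_string& o) const { return rep_ == o.rep_; }

private:
    void unshare() {
        // Not safe across threads as written; a real MT implementation
        // would have to synchronize this check and the copy.
        if (rep_.use_count() > 1) rep_ = std::make_shared<std::string>(*rep_);
    }
    std::shared_ptr<std::string> rep_;
};

// Usage sketch: a plain read through a non-const object triggers a deep copy.
inline void cow_demo() {
    toy_cow_string a("abc");
    toy_cow_string b = a;        // shares a's buffer
    assert(a.shares_with(b));
    char c = b[0];               // non-const overload chosen: b unshares
    assert(c == 'a');
    assert(!a.shares_with(b));
}
```

An immutable string has no such writable accessor, so copies can keep sharing one buffer without this bookkeeping.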

Alexander Terekhov <terekhov@web.de> wrote:
Maxim Yegorushkin wrote: [...]
Here is the link: http://conststring.sourceforge.net/
"The mutability of std::basic_string<> and its interface do not allow implementers to make it a lightweight value object with cheap copy operation through string representation sharing and copy-on-write technique while maintaining thread safety."
This claim borders on blatant FUD. Claims to the extent that std::string is sorta "less thread-safe than char[]" don't hold water because there isn't non-const overload of operator[] for char[].
Or am I just missing something?
Maybe the quoted statement is not clear, but it states that a COW std::string cannot be made thread safe due to its interface. You can find a good discussion of the subject here: http://groups.google.com/groups?ie=UTF-8&threadm=31c49f0d.0409070901.4e7a0aa6%40posting.google.com -- Maxim Yegorushkin

Maxim Yegorushkin wrote: [...]
http://groups.google.com/groups?ie=UTF-8&threadm=31c49f0d.0409070901.4e7a0aa6%40posting.google.com
Well, http://google.com/groups?selm=2bbfa355.0409090416.7ecbc59d%40posting.google.... I agree. regards, alexander.
participants (10): Alexander Terekhov, Beman Dawes, Carl Daniel, David B. Held, Hartmut Kaiser, Jonathan Turkanis, Maxim Yegorushkin, Maximilian Wilson, Miro Jurisic, Rob Stewart