Re: [boost] static sized strings

Rob Stewart wrote:
From: "Reece Dunn" <msclrhd@hotmail.com>
Martin wrote:
2. Substrings ------------- One of the applications I can see for fixed_string is when you want to avoid dynamic allocation but still want to use the std::string interface and algorithms. In that perspective it seems strange that substr returns a std::string instead of a fixed_string.
Agreed.
I agree too. It is just that the current implementation cannot create an instance of boost::char_string (because of the ABC - necessary to allow operations on variable-capacity strings). Also, you cannot get at the fixed_string type/length at compile time so it is not possible to derive the *return type* at compile time.
This is related to the above problem. You cannot create an instance of boost::char_string, etc. and as such cannot implement substr in the usual way (this is the problem I was having with operator+). The solution would be to define an alternate substr function in fixed_string, but this would not solve the problem for boost::char_string, etc.
The alternative would be to use a ranged string (pointer to the fixed_string buffer and length), but the problem with this is null-termination.
I do not see an easy way around this without removing the ABC design, since it is not possible to do fixed_string< length() > or something similar.
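The null-termination problem with a ranged string can be illustrated with a small sketch. The names char_range and ranged_substr are hypothetical and are not part of the proposed library; the only point is that a pointer-plus-length view into the parent buffer has no terminator of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical non-owning range over part of an existing character buffer.
struct char_range {
    const char* first;   // start of the range inside the parent buffer
    std::size_t length;  // number of characters in the range
};

// Clamp [pos, pos + n) to the parent string and return a view of it.
// No characters are copied and no terminator is written.
inline char_range ranged_substr(const char* buf, std::size_t len,
                                std::size_t pos, std::size_t n) {
    assert(buf != 0);
    if (pos > len) pos = len;
    if (n > len - pos) n = len - pos;
    char_range r = { buf + pos, n };
    return r;
}
```

Taking the first five characters of "hello world" this way gives a range of length 5, but passing its pointer to a C function such as strlen still sees all eleven characters, because the next terminator belongs to the parent string.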
Couldn't substr() be a pure virtual function in char_string that deferred to the derived class to create a copy of itself? The derived class can just create a duplicate of itself, including the current capacity. That capacity is guaranteed to be sufficient to hold any substring requested.
The return type cannot be boost::fixed_string_base because of the whole ABC issue. Therefore, the return type needs to be a reference or a pointer. It is possible to do this, and have:
    class fixed_string
    {
        inline fixed_string_base & substr_( size_type pos, size_type n )
        {
            return( this_type( *this, pos, n ));
        }
    };
This is problematic because of returning the address of a temporary - and leads to unpredictable output. I am looking for a better solution to this. I am also thinking about how to remove the ABC from the implementation using a technique pointed out by someone on this thread (can't remember who):
    class fs_base
    {
        size_t len;
        size_t capacity; // needed for buffer-safe operations
        CharT str[ 1 ];
        // string manipulation functions
        fs_base( size_t c ): capacity( c ){}
    };
    template< size_t n >
    class fixed_string: public fs_base
    {
        CharT data[ n ];
        fixed_string(): fs_base( n ){}
    };
I'll need to look into this, to see what effect this will have, to see if it is a possible replacement for the ABC. What do other people think?
Regards,
Reece

From: "Reece Dunn" <msclrhd@hotmail.com>
Rob Stewart wrote:
Couldn't substr() be a pure virtual function in char_string that deferred to the derived class to create a copy of itself? The derived class can just create a duplicate of itself, including the current capacity. That capacity is guaranteed to be sufficient to hold any substring requested.
The return type cannot be boost::fixed_string_base because of the whole ABC issue. Therefore, the return type needs to be a reference or a pointer. It is possible to do this, and have:
    class fixed_string
    {
        inline fixed_string_base & substr_( size_type pos, size_type n )
        {
            return( this_type( *this, pos, n ));
        }
    };
I didn't mean for it to be returned by value (because it would slice) and I certainly didn't mean for it to return a pointer or reference to a temporary.
This is problematic because of returning the address of a temporary - and leads to unpredictable output.
It's more than problematic. It results in undefined behavior.
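For contrast with the reference-returning substr_ quoted above, here is a minimal sketch of the "alternate substr function in fixed_string" mentioned earlier in the thread. It assumes a standalone template with no abstract base, so it does not help boost::char_string; the class name value_fixed_string and the clamping of out-of-range arguments are simplifications made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity string with no abstract base class.
template <std::size_t N>
class value_fixed_string {
public:
    value_fixed_string() : len_(0) { buf_[0] = '\0'; }

    // Copies at most N characters from s; anything longer is truncated.
    explicit value_fixed_string(const char* s) : len_(0) {
        assert(s != 0);
        while (len_ < N && s[len_] != '\0') {
            buf_[len_] = s[len_];
            ++len_;
        }
        buf_[len_] = '\0';
    }

    // Returns a new object by value. The result has the same capacity N,
    // so any substring fits, and nothing refers to a destroyed temporary.
    value_fixed_string substr(std::size_t pos, std::size_t n) const {
        value_fixed_string out;
        if (pos > len_) pos = len_;          // clamp instead of throwing
        if (n > len_ - pos) n = len_ - pos;
        std::memcpy(out.buf_, buf_ + pos, n);
        out.buf_[n] = '\0';
        out.len_ = n;
        return out;
    }

    const char* c_str() const { return buf_; }
    std::size_t size() const { return len_; }

private:
    char buf_[N + 1];   // N characters plus the terminator
    std::size_t len_;
};
```

Because the return type names N, this only works where the capacity is known at compile time, which is exactly what the abstract base class hides.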
I am looking for a better solution to this. I am also thinking about how to remove the ABC from the implementation using a technique pointed out by someone on this thread (can't remember who):
    class fs_base
    {
        size_t len;
        size_t capacity; // needed for buffer-safe operations
        CharT str[ 1 ];
        // string manipulation functions
        fs_base( size_t c ): capacity( c ){}
    };
    template< size_t n >
    class fixed_string: public fs_base
    {
        CharT data[ n ];
        fixed_string(): fs_base( n ){}
    };
I'll need to look into this, to see what effect this will have, to see if it is a possible replacement for the ABC. What do other people think?
This has the definite benefit of putting all of the meat in the base class, leaving to the derived class only the initialization of the string, thus avoiding most virtual functions.
In order for fs_base::substr() to do its job, however, it would still need the derived class to create an instance from the designated substring; the base class doesn't know, at compile time, the required capacity. That implies a pure virtual function, sort of a clone(), that the derived class must implement.
The only inherent danger in this approach is that the author of the derived type must be trusted to allocate the necessary number of bytes. It is convenient that the base class allocates one byte as that can be the reserved byte for the null terminator; the derived class only has to allocate N bytes.*
* Change "bytes" to "characters" to allow for wchar_t.
--
Rob Stewart stewart@sig.com
Software Engineer http://www.sig.com
Susquehanna International Group, LLP using std::disclaimer;
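The clone()-style idea described above might look roughly like the following sketch. Everything in it is an assumption made for illustration: the names char_string_sketch, cloning_fixed_string and make_empty are hypothetical, the base class holds a pointer to the derived buffer rather than relying on str[ 1 ], and std::unique_ptr (C++11) is used to own the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical base class: all string logic lives here, working through
// a buffer pointer and capacity supplied by the derived class.
class char_string_sketch {
public:
    virtual ~char_string_sketch() {}

    std::size_t size() const { return len_; }
    std::size_t capacity() const { return cap_; }
    const char* c_str() const { return buf_; }

    void assign(const char* s, std::size_t n) {
        assert(s != 0);
        if (n > cap_) n = cap_;              // truncate rather than overflow
        std::memcpy(buf_, s, n);
        buf_[n] = '\0';
        len_ = n;
    }

    // Implemented once in the base; only object creation is deferred.
    std::unique_ptr<char_string_sketch> substr(std::size_t pos,
                                               std::size_t n) const {
        if (pos > len_) pos = len_;
        if (n > len_ - pos) n = len_ - pos;
        std::unique_ptr<char_string_sketch> out(make_empty());
        out->assign(buf_ + pos, n);
        return out;
    }

protected:
    char_string_sketch(char* buf, std::size_t cap)
        : buf_(buf), len_(0), cap_(cap) {}

private:
    // Not copyable: buf_ points into the derived object's own storage.
    char_string_sketch(const char_string_sketch&) = delete;
    char_string_sketch& operator=(const char_string_sketch&) = delete;

    // The clone()-like hook: an empty string with the same capacity.
    virtual char_string_sketch* make_empty() const = 0;

    char* buf_;
    std::size_t len_;
    std::size_t cap_;
};

template <std::size_t N>
class cloning_fixed_string : public char_string_sketch {
public:
    cloning_fixed_string() : char_string_sketch(data_, N) { data_[0] = '\0'; }
    explicit cloning_fixed_string(const char* s)
        : char_string_sketch(data_, N) {
        assert(s != 0);
        assign(s, std::strlen(s));
    }

private:
    virtual char_string_sketch* make_empty() const {
        return new cloning_fixed_string;
    }
    char data_[N + 1];   // N characters plus the terminator
};
```

One trade-off worth noting: creating the result through a virtual function means allocating it on the heap, which works against the original motivation of avoiding dynamic allocation.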

Rob Stewart wrote:
From: "Reece Dunn" <msclrhd@hotmail.com>
I am looking for a better solution to this. I am also thinking about how to remove the ABC from the implementation using a technique pointed out by someone on this thread (can't remember who):
    class fs_base
    {
        size_t len;
        size_t capacity; // needed for buffer-safe operations
        CharT str[ 1 ];
        // string manipulation functions
        fs_base( size_t c ): capacity( c ){}
    };
    template< size_t n >
    class fixed_string: public fs_base
    {
        CharT data[ n ];
        fixed_string(): fs_base( n ){}
    };
I'll need to look into this, to see what effect this will have, to see if it is a possible replacement for the ABC. What do other people think?
Are you assuming that "data[n]" physically follows "str[1]" in memory, and that "data[n]" can be addressed by subscripting "str[1]" out of range? That will work with most implementations, but it's not guaranteed to work. There are often intervening alignment bytes. Some debug implementations may place sentinel areas between objects to detect buffer overruns.
John Nagle
Animats
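The layout concern raised above can be checked on a given compiler with a small probe. This is only a sketch: the struct names are hypothetical stand-ins for fs_base and fixed_string with public members and char in place of CharT, and the result is implementation-specific, so no particular value is claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for fs_base, with public members so the layout can be inspected.
struct fs_base_layout {
    std::size_t len;
    std::size_t capacity;
    char str[1];
};

// Stand-in for fixed_string< n >.
template <std::size_t n>
struct fixed_string_layout : fs_base_layout {
    char data[n];
};

// Number of bytes between the end of str[1] and the start of data[n].
// A non-zero result means indexing past str does not land in data.
template <std::size_t n>
std::size_t gap_between_str_and_data() {
    fixed_string_layout<n> s;
    std::uintptr_t end_of_str =
        reinterpret_cast<std::uintptr_t>(&s.str[0] + 1);
    std::uintptr_t start_of_data =
        reinterpret_cast<std::uintptr_t>(&s.data[0]);
    assert(start_of_data >= end_of_str);
    return static_cast<std::size_t>(start_of_data - end_of_str);
}
```

Calling gap_between_str_and_data< 4 >() and printing the result shows whether padding sits between the two arrays on that platform. Even a result of zero would not make the out-of-range subscript well defined; it would only show that this particular layout happens to be contiguous.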

From: John Nagle <nagle@animats.com>
From: "Reece Dunn" <msclrhd@hotmail.com>
I am looking for a better solution to this. I am also thinking about how to remove the ABC from the implementation using a technique pointed out by someone on this thread (can't remember who):
    class fs_base
    {
        size_t len;
        size_t capacity; // needed for buffer-safe operations
        CharT str[ 1 ];
        // string manipulation functions
        fs_base( size_t c ): capacity( c ){}
    };
    template< size_t n >
    class fixed_string: public fs_base
    {
        CharT data[ n ];
        fixed_string(): fs_base( n ){}
    };
Are you assuming that "data[n]" physically follows "str[1]" in memory, and that "data[n]" can be addressed by subscripting "str[1]" out of range? That will work with most implementations, but it's not guaranteed to work. There are often intervening alignment bytes. Some debug implementations may place sentinel areas between objects to detect buffer overruns.
Oh, right. It was C99 that blessed the struct hack. Oh well. Maybe we could do it anyway on platforms where it's been proven to work?
--
Rob Stewart stewart@sig.com
Software Engineer http://www.sig.com
Susquehanna International Group, LLP using std::disclaimer;
participants (3): John Nagle, Reece Dunn, Rob Stewart