New subject: static sized strings

13 May 2004

      Rob Stewart wrote:
...
From: John Nagle > Reece Dunn wrote:
...
...
Rob Stewart wrote:
...
From: John Nagle
...
You're comparing apples to oranges.  Either you mean for the
buffer to be exactly four bytes or you don't.  If you do, then
there's no room for a null terminator.  You're not dealing with a
"C string" (a null terminated string), you're dealing with a four
byte buffer.  If you allocated four bytes for a "C string" and
write "ABCDE" to it, or even strcat() "ABCD" to it, you overrun
the buffer.  Why would we want to provide consistent semantics
for that behavior?
The idea is to prevent overrun when this type of situation occurs. Thus, it 
will be a 3 character buffer with the extra character being a 
null-terminator.
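
To make that concrete, here is a minimal sketch of the idea. It is not taken from either posted implementation: the class name bounded_cstring and the choice to truncate silently (rather than report an error) are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch, not the proposed boost::char_string: N bytes of
// storage hold at most N - 1 characters plus the null terminator.
template <std::size_t N>
class bounded_cstring {
    static_assert(N > 0, "need room for the null terminator");
    char buf_[N];
    std::size_t len_;
public:
    bounded_cstring() : len_(0) { buf_[0] = '\0'; }

    // Appends as much of s as fits; returns the number of characters copied.
    std::size_t append(const char* s) {
        std::size_t room = (N - 1) - len_;
        std::size_t n = std::strlen(s);
        if (n > room) n = room;          // truncate instead of overrunning
        std::memcpy(buf_ + len_, s, n);
        len_ += n;
        buf_[len_] = '\0';               // the buffer is always terminated
        return n;
    }

    const char* c_str() const { return buf_; }
    std::size_t size() const { return len_; }
    static std::size_t capacity() { return N - 1; }
};

// Writing "ABCDE" into four bytes keeps only "ABC" plus the terminator.
inline void bounded_cstring_demo() {
    bounded_cstring<4> s;
    assert(s.append("ABCDE") == 3);
    assert(std::strcmp(s.c_str(), "ABC") == 0);
}
```

With N = 4 the usable capacity is 3, so neither assignment nor an strcat()-style append can write past the fourth byte.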
...
...
...
...
IOW, make it a runtime error to fix the capacity too small to
permit null termination when calling c_str().  That still leaves
room for things like the 4-character file signature to which you
referred, and yet prevents buffer overrun, but doesn't require
forgoing flexibility.
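
One way to read that suggestion is sketched below. The class name raw_capacity_string and the exception types are assumptions for illustration only, not part of any posted code: all N bytes are usable as data, and c_str() fails at run time only when no byte is left for the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical sketch: exactly N bytes of storage, none reserved for '\0'.
template <std::size_t N>
class raw_capacity_string {
    static_assert(N > 0, "capacity must be non-zero");
    char buf_[N];
    std::size_t len_;
public:
    explicit raw_capacity_string(const char* s) : len_(std::strlen(s)) {
        if (len_ > N) throw std::length_error("exceeds capacity");
        std::memcpy(buf_, s, len_);
    }

    std::size_t size() const { return len_; }
    const char* data() const { return buf_; }   // not null terminated

    // Terminates lazily; fails when all N bytes are already in use.
    const char* c_str() {
        if (len_ == N) throw std::logic_error("no room for null terminator");
        buf_[len_] = '\0';
        return buf_;
    }
};

// A full 4-character signature is storable, but c_str() on it throws.
inline void raw_capacity_string_demo() {
    raw_capacity_string<4> sig("ABCD");
    bool threw = false;
    try { sig.c_str(); } catch (const std::logic_error&) { threw = true; }
    assert(sig.size() == 4 && threw);
}
```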
If you want a 4-char file signature, you can use
"boost::array<char,4>", which does that job.  Is there any
real need for that functionality in char_string?
Isn't boost::array meant for generic arrays, whereas with a variant of 
char_string you can use string functions that are optimized for string 
operations? That is the main reason for using a special string class.
...
...
char_string might have some convenience template functions to
interconvert "boost::array" and "boost::char_string".
...
That is a good idea. It will mean keeping track of the string length,
Yes, that seems to be necessary.
The question is, which is better: char_string<4> or
boost::array<char,4>?  I suggest that the latter is better.  In C
code, and similar C++ code, arrays of char are used as buffers of
fixed length and as memory for strings.  Code will be clearer if
one uses boost::array<char,N> for the former and char_string<N>
for the latter.  Once you make that distinction of purpose, null
termination can be integral to char_string without complication
or unwanted overhead.
That makes sense.
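
A small sketch of that division of labour follows. std::array stands in for boost::array<char,4> so the example builds without Boost, and to_terminated is a hypothetical helper of the kind the conversion functions mentioned above might provide; neither name comes from the posted code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Fixed-length buffer: exactly four bytes, no terminator implied.
// std::array is used here in place of boost::array<char, 4>.
typedef std::array<char, 4> signature;

// Hypothetical helper: copy a raw N-byte buffer into N + 1 bytes of
// null-terminated storage so that string functions can be applied to it.
template <std::size_t N>
std::array<char, N + 1> to_terminated(const std::array<char, N>& raw) {
    std::array<char, N + 1> out;
    std::memcpy(out.data(), raw.data(), N);
    out[N] = '\0';
    return out;
}

// The signature itself stays a plain buffer; only the copy is a C string.
inline void signature_demo() {
    signature sig = {{'R', 'I', 'F', 'F'}};
    std::array<char, 5> s = to_terminated(sig);
    assert(std::strcmp(s.data(), "RIFF") == 0);
}
```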
...
...
What about the base class issue?  There's a need to be able to
write something like "char_string_base& s" when you want to
pass around fixed-capacity strings of more than one capacity.
If that is a necessary feature, then the length can be stored in
the base class.  However, such a base class means that the dtor
must be virtual.  Is vtable overhead acceptable in such a class?
Perhaps two types are needed.
My design does not use a virtual base class, so that isn't an issue. John 
Nagle's version does, so that is where the problem arises. I have several 
issues with the use of a virtual base class:

[1] If you want to operate on character strings of differing capacity, 
why not templatize the function:
   template< int n >
   void myfn( boost::char_array< n > & s ){ ... }

[2] How do you deal with wide-character strings? My update generalizes to 
support char and wchar_t based buffers, but with a virtual base class, you 
are limited to char buffers.

[3] One of the reasons for having a virtual class is to supply custom string 
operations, e.g. using Windows-specific string functions instead of the 
standard library ones. This can also be solved with a policy template like 
that found in basic_string. My current version uses this approach, improving 
interoperability with basic_strings.
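
The three points above can be sketched together as follows. This is an illustration under stated assumptions, not my actual implementation: the names fixed_string, length_of and to_std are hypothetical, and the capacity is a std::size_t here rather than the int used in [1].

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical fixed-capacity string parameterised on character type,
// capacity and a traits policy, mirroring basic_string's CharT/Traits.
template <class CharT, std::size_t N, class Traits = std::char_traits<CharT> >
class fixed_string {
    CharT buf_[N + 1];   // N characters plus the terminator
    std::size_t len_;
public:
    fixed_string() : len_(0) { Traits::assign(buf_[0], CharT()); }

    explicit fixed_string(const CharT* s) : len_(Traits::length(s)) {
        if (len_ > N) len_ = N;                 // truncate to capacity
        Traits::copy(buf_, s, len_);
        Traits::assign(buf_[len_], CharT());
    }

    const CharT* c_str() const { return buf_; }
    std::size_t size() const { return len_; }
};

// [1] One function template accepts any capacity, so no base class
// with virtual functions is needed to pass these strings around.
template <class CharT, std::size_t N, class Traits>
std::size_t length_of(const fixed_string<CharT, N, Traits>& s) {
    return s.size();
}

// [3] Sharing the traits type makes conversion to basic_string direct.
template <class CharT, std::size_t N, class Traits>
std::basic_string<CharT, Traits> to_std(const fixed_string<CharT, N, Traits>& s) {
    return std::basic_string<CharT, Traits>(s.c_str(), s.size());
}

// [2] The same code serves char and wchar_t buffers.
inline void fixed_string_demo() {
    fixed_string<char, 8> a("hello");
    fixed_string<wchar_t, 3> w(L"wide");
    assert(length_of(a) == 5);
    assert(to_std(w) == L"wid");
}
```

A platform-specific traits class could be supplied as the third template argument in place of std::char_traits to substitute custom string operations.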

Regards,
Reece
