Re: [boost] Re: static sized strings

Rob Stewart wrote:
John Nagle wrote:
The big question is whether the size specified in the declaration (as in "char_string<80>") includes the trailing null.
I have modified the sandbox implementation to make it easy to switch between the two models; it is currently using the 2nd model (adding space for null). One possibility is putting the choice of models as a template parameter (e.g. bool need_null -- see comments in the sandbox code for an implementation), that way the decision is up to the programmer and not the library implementor.
This is overkill.
I think you misunderstand what I am trying to say. See my comments below.
The purpose of fixed_string is to replace arrays of characters; otherwise, std::string would be the right replacement, right?
Correct.
Given that assumption, look to the syntax you're replacing:
   char s[] = "1234567890";
   assert(10 == strlen(s));

   size_t const LENGTH = 3;
   char t[LENGTH + 1];
   strncpy(t, "ABCEFGHIJKLMNOP", LENGTH);
   assert(LENGTH == strlen(t));
Now do the same with the new class:
The equivalent would be:

   fixed_string< 15 > s( "1234567890" ); // [1]
   assert( 10 == strlen( s )); // C-style
   assert( 10 == s.size()); // C++-style

   const size_t LENGTH = 3;
   fixed_string< LENGTH > t; // [2]
   t = "ABCEFGHIJKLMNOP";
   assert( LENGTH == strlen( t )); // C-style
   assert( LENGTH == t.length()); // alternate C++-style

[1] You need to specify the size of the buffer. You cannot declare an object of type fixed_string_base (as it is abstract), you can only have references and pointers to it so you can operate on variable-capacity strings.

[2] This is the point that we are discussing. Should this line be:

   fixed_string< LENGTH > t;

as will work with the current sandbox implementation (my suggestion #2), or should it be:

   fixed_string< LENGTH + 1 > t;

as in my suggestion #1.

The code calculates the correct storage and capacity values at the top of the fixed_string class, because buffer overrun checks rely on these values:

   template< size_t n, typename CharT = char, class CharStringPolicy = std::char_traits< CharT > >
   class fixed_string: public fixed_string_base< CharT, CharStringPolicy >
   {
      BOOST_STATIC_CONSTANT( size_t, storage_c = n + 1 ); // = needs_null ? n + 1 : n
      BOOST_STATIC_CONSTANT( size_t, capacity_c = n ); // = needs_null ? n : n - 1
      // ...
   };

Different programmers favour the different semantics, so I ask: why not parameterise it, providing a default behaviour.
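To make the two models concrete, here is a minimal sketch of how a boolean template parameter could select the storage and capacity values, following the formulas in the code comments above. The name fixed_string_sizes and the helper check_models are hypothetical and used only for illustration; this is not the sandbox implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch, not the sandbox code.
//   need_null == true : n counts characters; one extra element is reserved
//   need_null == false: n counts array elements; the last one holds the null
template <std::size_t n, bool need_null = true>
struct fixed_string_sizes
{
    static const std::size_t storage_c  = need_null ? n + 1 : n;
    static const std::size_t capacity_c = need_null ? n : n - 1;
};

// Usage: the same declared size gives different capacities under each model.
inline void check_models()
{
    assert((fixed_string_sizes<80>::storage_c == 81));
    assert((fixed_string_sizes<80>::capacity_c == 80));
    assert((fixed_string_sizes<80, false>::storage_c == 80));
    assert((fixed_string_sizes<80, false>::capacity_c == 79));
}
```

Note that the two instantiations are distinct types, which is where the question of conversions between them comes from.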
(I know there is no make_fixed_string() -- yet -- but such a facility would be appropriate.)
Er... the fixed_string constructors! When you declare the object, you should know the size of the buffer you are using, e.g.:

   fixed_string< 100 > data;

but when operating on them, you want to use any capacity buffers, so you either need a template (not good for existing code or code that needs to be in a static/dynamic library) or you use fixed_string_base (or [w]char_string if you want to save typing :)). E.g.:

   inline size_t strlen( const char_string & s )
   {
      return( s.length());
   }
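The split between a capacity-specific class and a capacity-agnostic base can be sketched as follows. This is a simplified illustration under stated assumptions: sketch_string_base, sketch_string and string_length are hypothetical names standing in for the sandbox classes, and the real interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical abstract base: exposes the string without its capacity
// appearing in the type, so non-template code can accept any size.
class sketch_string_base
{
public:
    virtual ~sketch_string_base() {}
    virtual const char* c_str() const = 0;
    virtual std::size_t capacity() const = 0;
    std::size_t length() const { return std::strlen(c_str()); }
};

// Hypothetical concrete class: n counts characters, plus one for the null.
template <std::size_t n>
class sketch_string : public sketch_string_base
{
    char buf_[n + 1];
public:
    sketch_string() { buf_[0] = '\0'; }
    explicit sketch_string(const char* s)
    {
        assert(s != 0);
        std::strncpy(buf_, s, n); // copies at most n characters
        buf_[n] = '\0';           // always terminate, even on truncation
    }
    const char* c_str() const { return buf_; }
    std::size_t capacity() const { return n; }
};

// A non-template function that works with any capacity via the base class.
inline std::size_t string_length(const sketch_string_base& s)
{
    return s.length();
}
```

A function like string_length can live in a compiled library, since it does not depend on the capacity parameter.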
I think this clearer reveals that fixed_string's size parameter should specify the number of characters. Remember, one can use boost::array to manage a fixed size, non-string buffer. If buffer overrun protection is insufficient in boost::array, that should be fixed (or a new class should be added to Boost). Thus, fixed_string can ignore that usage.
This is not what we are discussing. See above. The need_null name was to indicate that an extra character be reserved for the null, not whether to add a null terminator or not. Sorry for the confusion.

I have started work on documentation, but it is currently in text form. I will post the documentation in BoostBook format within the next day or so.

Regards,
Reece

From: "Reece Dunn" <msclrhd@hotmail.com>
Rob Stewart wrote:
John Nagle wrote:
The big question is whether the size specified in the declaration (as in "char_string<80>") includes the trailing null.
I have modified the sandbox implementation to make it easy to switch between the two models; it is currently using the 2nd model (adding space for null). One possibility is putting the choice of models as a template parameter (e.g. bool need_null -- see comments in the sandbox code for an implementation), that way the decision is up to the programmer and not the library implementor.
This is overkill.
I think you misunderstand what I am trying to say. See my comments below.
Actually, I think you misunderstood my points.
Given that assumption, look to the syntax you're replacing:
   char s[] = "1234567890";
   assert(10 == strlen(s));

   size_t const LENGTH = 3;
   char t[LENGTH + 1];
   strncpy(t, "ABCEFGHIJKLMNOP", LENGTH);
   assert(LENGTH == strlen(t));
Now do the same with the new class:
The equivalent would be:

   fixed_string< 15 > s( "1234567890" ); // [1]
   assert( 10 == strlen( s )); // C-style
   assert( 10 == s.size()); // C++-style

   const size_t LENGTH = 3;
   fixed_string< LENGTH > t; // [2]
   t = "ABCEFGHIJKLMNOP";
   assert( LENGTH == strlen( t )); // C-style
   assert( LENGTH == t.length()); // alternate C++-style
[1] You need to specify the size of the buffer. You cannot declare an object of type fixed_string_base (as it is abstract), you can only have references and pointers to it so you can operate on variable-capacity strings.
Ah, but you missed that I was declaring a fixed_string_base, where string capacity is not known. I was relying on make_fixed_string() to deduce the array length and create an appropriate fixed_string for it.
[2] This is the point that we are discussing. Should this line be:
fixed_string< LENGTH > t;
as will work with the current sandbox implementation (my suggestion #2), or should it be:
fixed_string< LENGTH + 1 > t;
And that's exactly my point. It should be the former as shown by my equivalencies.
Different programmers favour the different semantics, so I ask: why not parameterise it, providing a default behaviour.
I'm not so sure they do and I'm not sure you need to allow for it. If you do, I suspect you'll encourage off-by-one mistakes. Granted, those mistakes won't result in overruns, but they will be annoying. Furthermore that will result in different fixed_string types; will you provide for conversions among them? (That is, for example, from a fixed_string that adds one to the capacity to one that doesn't?)
(I know there is no make_fixed_string() -- yet -- but such a facility would be appropriate.)
Er... the fixed_string constructors! When you declare the object, you should know the size of the buffer you are using, e.g.:
fixed_string< 100 > data;
The problem is that you don't know the length of the string. That is, with this code:

   char s[] = "1234";

s is a fixed size array of char with five elements. The programmer didn't have to write:

   char s[5] = "1234";

Instead, the compiler figured out the length. That's what I'm trying to provide an equivalence for:

   fixed_string_base & s = make_fixed_string("1234");

With that approach, the client doesn't need to precompute the length of the literal and can rely on the compiler (and template magic) to deduce it and use it to create a fixed_string<5>. I've shown returning by value (with the temporary bound to the reference s), but it could return a smart pointer just as well.
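The length deduction described here can be sketched with a function template that takes the literal by reference to array. This is only an illustration: fixed_string_sketch is a hypothetical stand-in, and the sketch assumes the "size counts characters" model, so "1234" yields a capacity of 4 with 5 elements of storage (under the other model the parameter would be 5). Note also that standard C++ only lets a returned temporary bind to a const reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal string: N counts characters, excluding the null.
template <std::size_t N>
struct fixed_string_sketch
{
    char data[N + 1];
    std::size_t length() const { return std::strlen(data); }
};

// M is deduced from the literal's array type and includes the trailing
// null, so the resulting string has capacity M - 1.
template <std::size_t M>
fixed_string_sketch<M - 1> make_fixed_string(const char (&literal)[M])
{
    assert(literal[M - 1] == '\0');       // expect a terminated literal
    fixed_string_sketch<M - 1> result;
    std::memcpy(result.data, literal, M); // copies the null as well
    return result;
}
```

With this, `const fixed_string_sketch<4>& s = make_fixed_string("1234");` compiles without the caller counting characters by hand.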
I think this clearer reveals that fixed_string's size parameter
             ^^^^^^^
That should have been "clearly."
should specify the number of characters. Remember, one can use boost::array to manage a fixed size, non-string buffer. If buffer overrun protection is insufficient in boost::array, that should be fixed (or a new class should be added to Boost). Thus, fixed_string can ignore that usage.
This is not what we are discussing. See above. The need_null name was to indicate that an extra character be reserved for the null, not whether to add a null terminator or not. Sorry for the confusion.
Right, and that's what I was discussing. I showed the typical C code for creating an array of char for holding a string. In it, the null is accounted for using "+ 1" rather than being part of the length "variable." Thus, the length parameter to fixed_string should also not include the element needed for the null terminator.

--
Rob Stewart                           stewart@sig.com
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;

Reece Dunn wrote:
Different programmers favour the different semantics, so I ask: why not parameterise it, providing a default behaviour.
As with "do we null-terminate", I think we have to pick a behavior and stick to it. I could live with either set of semantics, but adding a parameter makes the issue more confusing. (And you have to supply conversions.) Most strings are sized a bit too big, anyway.

On a related subject, we should have unconditional null termination. "fixed_string" items are always null-terminated. "snprintf", "strncat", etc. have hazardous semantics: if you overflow the string, it is not null terminated. (This is a bug in my current version, incidentally.) We should guarantee null termination in all cases. The whole point of this class is improved safety, after all.

John Nagle
Team Overbot
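As an illustration of the guarantee being asked for, here is a minimal sketch of a truncating copy that always leaves the destination terminated. std::strncpy is one standard function that does not do this: when the source is at least as long as the count, it writes no terminator. The helper name assign_truncated is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies as much of src as fits into a buffer of
// `storage` elements and always terminates the result.
// Returns the number of characters written, excluding the null.
inline std::size_t assign_truncated(char* dest, std::size_t storage,
                                    const char* src)
{
    assert(dest != 0 && src != 0);
    if (storage == 0)
        return 0;                 // no room even for the terminator
    std::size_t len = std::strlen(src);
    if (len > storage - 1)
        len = storage - 1;        // truncate, keeping one element for the null
    std::memcpy(dest, src, len);
    dest[len] = '\0';             // terminated in every case
    return len;
}
```

Whatever the source length, strlen on the destination is then well defined and never exceeds storage - 1.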

From: John Nagle <nagle@overbot.com>
Reece Dunn wrote:
Different programmers favour the different semantics, so I ask: why not parameterise it, providing a default behaviour.
As with "do we null-terminate", I think we have to pick a behavior and stick to it. I could live with either set of semantics, but adding a parameter makes the issue more confusing.
I really don't think the two approaches are equivalent when thinking in terms of the C-string code we're trying to replace.
(And you have to supply conversions.) Most strings are sized a bit too big, anyway.
Reece mentioned that conversions, per se, weren't needed because a ctor and assignment operator can take references to the base class, thus ignoring the capacity differences. Oh, it just occurred to me that when the two have the same size, there should be a cctor and copy assignment operator to avoid the virtual function call overhead. You might even be able to do some metaprogramming to determine whether the capacity of the source string is less than or equal to the capacity of the destination string and avoid the virtual function call overhead for an entire class of copies.
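The compile-time capacity comparison could look something like the sketch below, which picks between an unchecked and a truncating copy based on whether the source capacity M fits in the destination capacity N. All names here (capacity_string, bool_tag) are hypothetical, and the sketch leaves out the base class and virtual functions entirely; it only shows the dispatch on M <= N.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

template <bool B> struct bool_tag {};

// Hypothetical string with capacity N (characters, excluding the null).
template <std::size_t N>
class capacity_string
{
    char buf_[N + 1];

    // Source capacity fits: its length cannot exceed N, so no check needed.
    void assign(const char* s, std::size_t len, bool_tag<true>)
    {
        std::memcpy(buf_, s, len);
        buf_[len] = '\0';
    }
    // Source may be longer than N: truncate to N characters.
    void assign(const char* s, std::size_t len, bool_tag<false>)
    {
        if (len > N) len = N;
        std::memcpy(buf_, s, len);
        buf_[len] = '\0';
    }

public:
    capacity_string() { buf_[0] = '\0'; }
    explicit capacity_string(const char* s)
    {
        assert(s != 0);
        assign(s, std::strlen(s), bool_tag<false>());
    }

    // The overload is chosen at compile time from the two capacities.
    template <std::size_t M>
    capacity_string& operator=(const capacity_string<M>& other)
    {
        assign(other.c_str(), other.length(), bool_tag<(M <= N)>());
        return *this;
    }

    const char* c_str() const { return buf_; }
    std::size_t length() const { return std::strlen(buf_); }
};
```

When both capacities are equal, the compiler-generated copy assignment is used instead of the template, which matches the same-size case mentioned above.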
On a related subject, we should have unconditional null termination. "fixed_string" items are always null-terminated. "snprintf", "strncat", etc. have hazardous semantics: if you overflow the string, it is not null terminated. (This is a bug in my current version, incidentally.) We should guarantee null termination in all cases. The whole point of this class is improved safety, after all.
Absolutely.

--
Rob Stewart                           stewart@sig.com
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;