New subject: static sized strings

13 May 2004

      Rob Stewart wrote:
...
From: "Reece Dunn" <msclrhd@hotmail.com>
...
Rob Stewart wrote:
...
From: John Nagle > Reece Dunn wrote:
The idea is to prevent overrun when this type of situation occurs. Thus, it
will be a 3 character buffer with the extra character being a
null-terminator.
I think we're miscommunicating, and it's probably me
misinterpreting something.
The four byte matter was raised because someone wanted to be able
to peek into a buffer of data read from a file.  The 5th byte
That was me :). I was suggesting that one possible application for this type 
of class was in managing header files of binary data, e.g. tar files, doing 
things like:

   struct tar_header
   {
      ...
      boost::char_string< 6 > magic;
   } hdr;

   if( hdr.magic == "ustar\x20\x20\0" ) // process GNU TAR file
...
wasn't a null terminator and it shouldn't be changed, so I took
that to suggest effectively overlaying a char_string on that
file's contents in the buffer.
I have revised my initial idea of supporting both null-terminated and 
non-terminated string buffers, in response to the points that (a) 
boost::array< char, N > is a good candidate for the latter, and (b) adding 
a boolean template parameter to select null termination, and the associated 
code, made the logic too complex.
...
...
...
...
...
...
IOW, make it a runtime error to fix the capacity too small to
permit null termination when calling c_str().  That still leaves
room for things like the 4-character file signature to which you
referred, and yet prevents buffer overrun, but doesn't require
foregoing flexibility.
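
As a rough illustration of the runtime-error idea above, here is a minimal
sketch. It is not the char_string under discussion: the class name
fixed_buffer_string, the choice of std::length_error, and the truncating
assign() are all assumptions made for the example, which also assumes N > 0.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical sketch: a buffer of exactly N characters (N > 0) with no
// extra slot reserved for a terminator.
template <std::size_t N>
class fixed_buffer_string {
public:
    fixed_buffer_string() : len_(0) { buf_[0] = '\0'; }

    // Copies at most N characters from s; longer input is truncated.
    void assign(const char* s) {
        assert(s != 0);
        len_ = 0;
        while (len_ < N && s[len_] != '\0') {
            buf_[len_] = s[len_];
            ++len_;
        }
        if (len_ < N) buf_[len_] = '\0';  // terminate only when a slot is free
    }

    std::size_t length() const { return len_; }
    const char* data() const { return buf_; }  // may be unterminated

    // Throws when the buffer is full, because no terminator could be stored.
    const char* c_str() const {
        if (len_ == N) throw std::length_error("no room for null terminator");
        return buf_;
    }

private:
    char buf_[N];
    std::size_t len_;
};
```

With this shape, a full 4-character signature can still be inspected through
data() and length(), and only c_str() refuses to hand out an unterminated
buffer.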
If you want a 4-char file signature, you can use
"boost::array<char,4>", which does that job.  Is there any
real need for that functionality in char_string?
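
For reference, a signature check on a plain fixed-size array might look like
the sketch below. It uses std::array as a stand-in for boost::array so that
it compiles without Boost, and the helper name matches_signature is
hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Compares a fixed-size, unterminated signature against an N-character
// literal. The parameter type char[N + 1] accounts for the literal's
// trailing '\0', which is not part of the comparison.
template <std::size_t N>
bool matches_signature(const std::array<char, N>& sig,
                       const char (&expected)[N + 1]) {
    return std::memcmp(sig.data(), expected, N) == 0;
}
```

Passing a literal of the wrong length fails to compile, which is one way a
raw array can still get some checking without string semantics.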
Isn't boost::array specific to generic arrays, whereas using a variant of
char_string, you can use string functions that will be optimized for string
operations? This is the main reason for using a special string class.
You're right that using boost::array doesn't offer any string
facilities.  Perhaps what we need is a string library that uses
all namespace scope functions and type generators to generalize
the notion of a string (this is not unlike Thorsten's
CollectionTraits library).  Then, a boost::array<char,N>, a
std::string, even a C string can be treated generically as a
string.  With such a facility, boost::array would work just fine
for the file signature example and char_string would be relieved
from needing to handle the "no terminator" case.
Doesn't the string algorithms library do just that?
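
To make the "namespace scope functions" idea concrete, here is a small
sketch. It is not the API of the string algorithms library or of
CollectionTraits; the namespace strgen and the functions str_begin,
str_length and str_equal are invented for illustration, and std::array
stands in for boost::array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace strgen {  // hypothetical namespace for this sketch

// Each overload maps one representation onto a (pointer, length) view.
inline const char* str_begin(const std::string& s) { return s.data(); }
inline std::size_t str_length(const std::string& s) { return s.size(); }

inline const char* str_begin(const char* s) { return s; }
inline std::size_t str_length(const char* s) {
    assert(s != 0);
    return std::strlen(s);
}

// A fixed array is treated as exactly N characters with no terminator.
template <std::size_t N>
const char* str_begin(const std::array<char, N>& a) { return a.data(); }
template <std::size_t N>
std::size_t str_length(const std::array<char, N>&) { return N; }

// A generic algorithm written only in terms of the free functions above.
template <typename A, typename B>
bool str_equal(const A& a, const B& b) {
    const std::size_t n = str_length(a);
    return n == str_length(b) &&
           std::memcmp(str_begin(a), str_begin(b), n) == 0;
}

}  // namespace strgen
```

Under these assumptions the file signature case needs no special string
class: strgen::str_equal(sig, "RIFF") works with sig declared as a
4-character array.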
...
...
...
...
What about the base class issue?  There's a need to be able to
write something like "char_string_base& s" when you want to
pass around fixed-capacity strings of more than one capacity.
If that is a necessary feature, then the length can be stored in
the base class.  However, such a base class means that the dtor
must be virtual.  Is vtable overhead acceptable in such a class?
Perhaps two types are needed.
My design does not use a virtual base class, so that isn't an issue. John
Nagle's version does, so that is where the problem arises. I have several
issues with the use of a virtual base class:
A "virtual base class" or a base class with a virtual function?
Base class with a set of virtual functions (my mistake).
...
...
[1] If you want to operate on a variable length character string
specifically, why not templatize the function:
   template< int n >
   void myfn( boost::char_array< n > & s ){ ... }
The issue had to do with being able to create collections of
variable sized string objects.
Hmmm. That could be tricky :(.
...
...
[2] How do you deal with wide-character strings? My update generalizes to
support char and wchar_t based buffers, but with a virtual base class, you
are limited to char buffers.
That's true only if the character type is encoded in the base
class.  Why would it be?
You need to know the character type you are operating on in the base class. 
This is the basic idea that John Nagle's approach takes:

   class char_string_base
   {
      public:
         virtual std::size_t length() const;
         virtual const char * c_str() const;
         virtual void copy( const char * );
   };

But I agree with John's comments that it may be necessary to have a 
wchar_string_base to support wide characters.
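
For readers following along, this is roughly how such a base class lets
strings of different capacities be passed around together. It is a sketch
only, not John Nagle's actual code: the pure virtuals, the virtual
destructor, the derived template fixed_char_string and the helper
total_length are assumptions added for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Abstract interface in the spirit of the class quoted above.
class char_string_base {
public:
    virtual ~char_string_base() {}
    virtual std::size_t length() const = 0;
    virtual const char* c_str() const = 0;
    virtual void copy(const char* s) = 0;
};

// Hypothetical derived type: N characters plus one slot for the terminator.
template <std::size_t N>
class fixed_char_string : public char_string_base {
public:
    fixed_char_string() { buf_[0] = '\0'; }
    std::size_t length() const { return std::strlen(buf_); }
    const char* c_str() const { return buf_; }
    void copy(const char* s) {
        assert(s != 0);
        std::strncpy(buf_, s, N);  // truncates input longer than N
        buf_[N] = '\0';            // the result is always terminated
    }

private:
    char buf_[N + 1];
};

// One non-template function serves strings of any capacity.
inline std::size_t total_length(
    const std::vector<const char_string_base*>& strings) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < strings.size(); ++i)
        total += strings[i]->length();
    return total;
}
```

The interface is tied to char, which is the limitation raised in [2]: a
wchar_t version would need either a second base class or a base templated
on the character type.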
...
...
[3] One of the reasons for having a virtual class is to supply custom string
operations, e.g. using Windows-specific string functions instead of the
standard library ones. This can also be solved with a policy template like
that found in basic_string. My current version uses this approach, improving
interoperability with basic_strings.
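
A minimal sketch of what a policy-based design along these lines could look
like follows. It is not the current version referred to above: the class
name basic_fixed_string, its members, and the truncating assign() are
assumptions, and the policy is simply std::char_traits, as in basic_string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical fixed-capacity string parameterized on character type and
// a traits policy, mirroring basic_string's first two template parameters.
template <std::size_t N, typename CharT = char,
          typename Traits = std::char_traits<CharT> >
class basic_fixed_string {
public:
    basic_fixed_string() : len_(0) { Traits::assign(buf_[0], CharT()); }

    void assign(const CharT* s) {
        assert(s != 0);
        const std::size_t n = Traits::length(s);
        len_ = n < N ? n : N;                 // truncate to capacity
        Traits::copy(buf_, s, len_);
        Traits::assign(buf_[len_], CharT());  // write the terminator
    }

    std::size_t length() const { return len_; }
    const CharT* c_str() const { return buf_; }

    // Sharing the traits type makes conversion to basic_string direct.
    std::basic_string<CharT, Traits> str() const {
        return std::basic_string<CharT, Traits>(buf_, len_);
    }

private:
    CharT buf_[N + 1];
    std::size_t len_;
};
```

A custom traits class (for example one forwarding to platform-specific
string functions) could be substituted for the third parameter without any
virtual functions, and the same template covers both char and wchar_t.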
You can certainly design an ABC with many pure virtual functions
that the derived types implement, but that was never the intent
of the base class idea.
Your policy approach will permit a lot of customization, but
perhaps the better approach is to externalize all operations.
Can you expand on what you mean by externalize?

Regards,
Reece

