Re: [boost] Re: static sized strings

Rob Stewart wrote:
Sure, but you've both jumped off on a tangent unrelated to the purpose of the original query. See my reply to Dave's message.
I agree (thanks for the info, though :)). What I am interested in is a comparison between my implementation at:
boost-sandbox/boost/fixed_string
and John Nagle's implementation at:
http://www.animats.com/source
if possible.
Regards, Reece
_________________________________________________________________ It's fast, it's easy and it's free. Get MSN Messenger today! http://www.msn.co.uk/messenger

Me too. Thanks.
The counted string vs. null-terminated string issue turns out to be complicated. If you use only null-terminated strings, appending a char to a string becomes very expensive, and "end()" is expensive to evaluate. If you use only counted strings, the interaction with subscripting is non-obvious. If you store a null, what happens? Are you allowed to store beyond the current length in use? Mixing C string and <string> semantics is messy. I ended up with a lazy evaluation scheme.
John Nagle
Team Overbot
Reece Dunn wrote:
Rob Stewart wrote:
Sure, but you've both jumped off on a tangent unrelated to the purpose of the original query. See my reply to Dave's message.
I agree (thanks for the info, though :)). What I am interested in is a comparison between my implementation at:
boost-sandbox/boost/fixed_string
and John Nagle's implementation at:
http://www.animats.com/source
if possible.
Regards, Reece
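To make the lazy evaluation idea concrete, here is a minimal sketch of one way it could work. It is not taken from either implementation: the class name lazy_fixed_string, the dirty flag, and every member shown are assumptions made for illustration. The cached length is trusted until raw write access is handed out, and is only rescanned the next time something asks for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class for illustration; not the interface of either library.
template <std::size_t N>
class lazy_fixed_string {
    char buf_[N + 1];            // N characters plus a guaranteed terminator
    mutable std::size_t len_;    // cached length, trusted only while !dirty_
    mutable bool dirty_;         // true after raw write access was handed out

    // Recompute the cached length only if someone may have changed the text.
    void refresh() const {
        if (!dirty_) return;
        const void* nul = std::memchr(buf_, '\0', N);
        len_ = nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - buf_) : N;
        dirty_ = false;
    }

public:
    lazy_fixed_string() : len_(0), dirty_(false) {
        std::memset(buf_, 0, sizeof buf_);   // buf_[N] stays '\0' as a sentinel
    }

    std::size_t capacity() const { return N; }
    std::size_t size() const { refresh(); return len_; }
    const char* c_str() const { return buf_; }

    // Raw access: cheap, but the length must be rediscovered afterwards.
    char* data() { dirty_ = true; return buf_; }
    char& operator[](std::size_t i) { assert(i < N); dirty_ = true; return buf_[i]; }
    char operator[](std::size_t i) const { assert(i <= N); return buf_[i]; }

    // O(1) while the cached length is valid; returns false when full.
    bool push_back(char c) {
        refresh();
        if (len_ == N) return false;
        buf_[len_++] = c;
        buf_[len_] = '\0';
        return true;
    }
};
```

With this sketch, storing a null through operator[] shortens the string the next time size() is called, which is one possible answer to "if you store a null, what happens?". A known weakness: a reference or pointer kept after size() has run can still change the text without setting the dirty flag, so the cached length can go stale.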

I have looked at both implementations and have some comments:
1. Both implementations use a size independent base class. I can see the need for such a class but don't understand how to use it in the way proposed. The base class only implements a few methods so in my view it is more or less useless. E.g. a function taking a "base" argument can't iterate or look at the string content without converting it to a c-string (so it could just as well use a c-string argument). Why not implement the full interface in the base class?
2. I couldn't find any background discussion on what non-const operations should be allowed on the string. Only the std::string interface is mentioned but since the fixed_string is guaranteed to be contiguous it makes sense to also allow non-const access to the string. In both versions c_str() is const while data() is const only in the sandbox version. I think a const c_str() and non-const data() (as an overload to keep the std::string interface) is the way to go.
3. As mentioned in other posts, length is messy and I don't think there is any way to safely store the length inside the class. There will always be a risk that the non-const operations add/remove a '\0' in the wrong place. One solution may be to add something similar to the CString's "ReleaseBuffer()" (with a better name) that allows you to force an update of the length after an unsafe operation.
And lastly, just an idea I got from the flex_string implementation: why not allow use of the "struct hack" as a performance option? E.g.
    template <typename CharT>
    class fixed_string_base
    {
        const size_t capacity_;
        mutable size_t length_;
        CharT str_[1];
    protected:
        fixed_string_base(size_t n) : capacity_(n), length_(0) {}
        // std::string interface + special fixed_string operations
    };

    typedef fixed_string_base<char> char_string;
    typedef fixed_string_base<wchar_t> wchar_string;

    template <size_t n, typename CharT>
    class fixed_string : public fixed_string_base<CharT>
    {
        CharT str_[n];
    public:
        fixed_string() : fixed_string_base<CharT>(n) {}
    };
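The struct hack above relies on the derived class's array sitting directly after the base's one-element array, which the language does not guarantee. As a point of comparison, here is a hedged sketch of a different technique: the size-independent base stores a pointer to the storage owned by the derived class. This is a swapped-in alternative, not what either implementation does. The pointer member, the n + 1 storage size, the assign() and clear() members, and the count_char() helper are all assumptions for illustration. It also shows the point from item 1: a function taking the base can iterate without knowing the capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>   // std::char_traits

// Size-independent base: it never sees n, only a pointer and a capacity.
template <typename CharT>
class fixed_string_base {
    CharT* const str_;            // points at storage owned by the derived class
    const std::size_t capacity_;  // characters available, excluding the terminator
    std::size_t length_;

protected:
    fixed_string_base(CharT* storage, std::size_t n)
        : str_(storage), capacity_(n), length_(0) { assert(storage != 0); }
    ~fixed_string_base() {}

public:
    // Not copyable in this sketch: a copied str_ would alias the source buffer.
    fixed_string_base(const fixed_string_base&) = delete;
    fixed_string_base& operator=(const fixed_string_base&) = delete;

    typedef CharT* iterator;
    typedef const CharT* const_iterator;

    std::size_t size() const { return length_; }
    std::size_t capacity() const { return capacity_; }
    const CharT* c_str() const { return str_; }
    iterator begin() { return str_; }
    iterator end() { return str_ + length_; }
    const_iterator begin() const { return str_; }
    const_iterator end() const { return str_ + length_; }

    void clear() { length_ = 0; str_[0] = CharT(); }

    // Copies at most capacity() characters; longer input is truncated.
    fixed_string_base& assign(const CharT* s) {
        std::size_t n = std::char_traits<CharT>::length(s);
        if (n > capacity_) n = capacity_;
        std::char_traits<CharT>::copy(str_, s, n);
        length_ = n;
        str_[n] = CharT();
        return *this;
    }
};

typedef fixed_string_base<char> char_string;
typedef fixed_string_base<wchar_t> wchar_string;

template <std::size_t n, typename CharT>
class fixed_string : public fixed_string_base<CharT> {
    CharT buf_[n + 1];   // the only place that depends on n
public:
    fixed_string() : fixed_string_base<CharT>(buf_, n) { this->clear(); }
};

// Written once against the base, usable with every capacity.
inline std::size_t count_char(const char_string& s, char c) {
    std::size_t hits = 0;
    for (char_string::const_iterator p = s.begin(); p != s.end(); ++p)
        if (*p == c) ++hits;
    return hits;
}
```

The cost compared with the struct hack is one extra pointer per string object, in exchange for not depending on member layout.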

Martin wrote:
I have looked at both implementations and have some comments:
1. Both implementations use a size independent base class. I can see the need for such a class but don't understand how to use it in the way proposed. The base class only implements a few methods so in my view it is more or less useless. E.g. a function taking a "base" argument can't iterate or look at the string content without converting it to a c-string (so it could just as well use a c-string argument). Why not implement the full interface in the base class?
Good point. That needs to be done.
2. I couldn't find any background discussion on what non-const operations should be allowed on the string. Only the std::string interface is mentioned but since the fixed_string is guaranteed to be contiguous it makes sense to also allow non-const access to the string. In both versions c_str() is const while data() is const only in the sandbox version. I think a const c_str() and non-const data() (as an overload to keep the std::string interface) is the way to go.
I wasn't too happy about "data", but it seems to be needed, given fixed_string's intended use as a retrofit to legacy code. People will want to code things like
    char_string<80> line;
    int cnt = read(fd, line.data(), line.capacity());
3. As mentioned in other posts, length is messy and I don't think there is any way to safely store the length inside the class. There will always be a risk that the non-const operations add/remove a '\0' in the wrong place. One solution may be to add something similar to the CString's "ReleaseBuffer()" (with a better name) that allows you to force an update of the length after an unsafe operation.
I have such a function, but it's private, and used internally within the class. Do you think the lazy evaluation approach I used is good? I have mixed feelings.
John Nagle
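For comparison with the lazy approach, here is a rough sketch of the explicit alternative discussed in item 3: a public resync call made by the caller after writing through a non-const data(). Everything here is hypothetical. The class demo_string and the members set_length() and rescan_length() are invented names, and fake_read() is a portable stand-in for POSIX read() so the example does not need a file descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class and member names; neither library is known to use them.
template <std::size_t N>
class demo_string {
    char buf_[N + 1];
    std::size_t len_;
public:
    demo_string() : len_(0) { std::memset(buf_, 0, sizeof buf_); }

    std::size_t capacity() const { return N; }
    std::size_t size() const { return len_; }
    const char* c_str() const { return buf_; }
    const char* data() const { return buf_; }
    char* data() { return buf_; }   // unsafe: size() is stale until resynced

    // The caller knows how many characters it wrote (e.g. read()'s return value).
    void set_length(std::size_t n) {
        assert(n <= N);
        len_ = n;
        buf_[len_] = '\0';
    }

    // The caller wrote a C string; find the terminator again.
    void rescan_length() {
        buf_[N] = '\0';             // guard against a missing terminator
        len_ = std::strlen(buf_);
    }
};

// Portable stand-in for POSIX read(): copies up to max bytes, adds no
// terminator, and returns the number of bytes copied.
inline std::size_t fake_read(const char* src, char* dst, std::size_t max) {
    std::size_t n = std::strlen(src);
    if (n > max) n = max;
    std::memcpy(dst, src, n);
    return n;
}
```

The trade-off against lazy evaluation is that size() stays trivially cheap and predictable, but a caller who forgets the resync call gets a stale length with no warning.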

From: "Reece Dunn" <msclrhd@hotmail.com>
What I am interested in is a comparison between
Here it is. I have some comments regarding both and some that are unique to one or the other. I'll start with the common ones.
* I think that data() would be better named "buffer." The typical usage of it would be to expose the entire buffer for external manipulation. c_str() is the means by which to expose the string data.
* A better way to handle exposing the buffer would be to have buffer() return a proxy class that implicitly converts to char *, thus exposing the string's buffer. The benefit, though, would be for the dtor to recompute the length (and inform the string of it) or do invariant checks. Unfortunately, you can't let the dtor throw exceptions, so this doesn't work as nicely as I'd like.
* Should fixed_string support zero-length strings?
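A minimal sketch of the proxy idea from the second bullet, under stated assumptions: the names proxied_string and buffer_proxy, and the choice to rescan for the terminator in the destructor, are illustrative only. The invariant check is an assert, which aborts rather than throws, since the destructor must not throw.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

template <std::size_t N> class proxied_string;

// Hypothetical proxy: converts to char* and resyncs the owner when destroyed.
template <std::size_t N>
class buffer_proxy {
    proxied_string<N>& owner_;
public:
    explicit buffer_proxy(proxied_string<N>& s) : owner_(s) {}
    ~buffer_proxy() { owner_.recompute_length(); }   // must not throw
    operator char*() const { return owner_.raw(); }
};

template <std::size_t N>
class proxied_string {
    char buf_[N + 1];
    std::size_t len_;

    friend class buffer_proxy<N>;
    char* raw() { return buf_; }
    void recompute_length() {
        buf_[N] = '\0';              // force a terminator inside the buffer
        len_ = std::strlen(buf_);
        assert(len_ <= N);           // invariant check that cannot throw
    }

public:
    proxied_string() : len_(0) { std::memset(buf_, 0, sizeof buf_); }

    std::size_t size() const { return len_; }
    std::size_t capacity() const { return N; }
    const char* c_str() const { return buf_; }

    // The temporary proxy lives until the end of the full expression.
    buffer_proxy<N> buffer() { return buffer_proxy<N>(*this); }
};
```

A call such as std::strcpy(s.buffer(), "abc") works because the temporary proxy outlives the call and then recomputes the length. Storing the converted pointer, as in char* p = s.buffer();, defeats the scheme: the proxy is destroyed at the end of that statement, before anything is written through p.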
my implementation at:
boost-sandbox/boost/fixed_string
* const_iter_offset() should be named "iter_offset" like its non-const counterpart. It is a const mf, so overload resolution will find it.
* I'm not sure that there's enough value in const_iter_offset()/iter_offset() versus begin()/end() (const and non-const) in the derived class. You eliminate four trivial functions, but you may eliminate opportunities for optimizations. (I'm thinking there may be derived types that have a better way of computing end and that returning the result of adding zero to a pointer is slower than simply returning the pointer.)
* fixed_string doesn't have a const overload of at(). Calling fixed_string's at() "at" seems wrong. You should take the tack of requiring a class derived from detail::basic_string_impl to provide uniquely named implementation functions that basic_string_impl uses to provide the public interface. You could even suggest that such functions be made private with the basic_string_impl specialization made a friend. Your current approach will generate many "hiding" warnings, too.
* fixed_string's at() shouldn't test the length and throw an exception since detail::basic_string_impl::at() does it and detail::basic_string_impl::operator[] calls it.
* The same out of range testing code is repeated in many functions. I suggest factoring it out into a validate() mf called from the rest.
* fixed_string::swap() is missing.
* detail::basic_string_impl::get_allocator() expects string_type::get_allocator() but that isn't listed in the requirements levied on the derived type and isn't defined in fixed_string.
* I wouldn't expect fixed_string's StringPolicy to govern how to determine the length of a string passed to a mf as a CharT *.
* If I kept looking, I'm sure I'd find many more things, but you are more interested in philosophical differences, so I'll stop here. Look at the end of the message for my summary.
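Two of the suggestions above (uniquely named private implementation functions with the base made a friend, and a single validate() for range checks) can be sketched together. This is a hedged illustration, not the sandbox code: string_impl stands in for detail::basic_string_impl, and small_string, buffer_impl() and length_impl() are invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical stand-in for a parameterized base such as detail::basic_string_impl.
template <typename Derived>
class string_impl {
    Derived& self() { return static_cast<Derived&>(*this); }
    const Derived& self() const { return static_cast<const Derived&>(*this); }

    // The one place that knows how to reject a bad position.
    void validate(std::size_t pos) const {
        if (pos >= size())
            throw std::out_of_range("string_impl: position out of range");
    }

public:
    std::size_t size() const { return self().length_impl(); }
    const char* c_str() const { return self().buffer_impl(); }

    // Both overloads share validate() instead of repeating the test.
    char& at(std::size_t pos) { validate(pos); return self().buffer_impl()[pos]; }
    const char& at(std::size_t pos) const { validate(pos); return self().buffer_impl()[pos]; }

    // Unchecked in release builds.
    char& operator[](std::size_t pos) {
        assert(pos <= size());
        return self().buffer_impl()[pos];
    }
};

template <std::size_t N>
class small_string : public string_impl<small_string<N> > {
    friend class string_impl<small_string<N> >;   // only the base sees the core I/F

    char buf_[N + 1];
    std::size_t len_;

    // Uniquely named, private: nothing here hides a public name in the base.
    char* buffer_impl() { return buf_; }
    const char* buffer_impl() const { return buf_; }
    std::size_t length_impl() const { return len_; }

public:
    // Truncates input longer than N characters.
    explicit small_string(const char* s = "") : len_(std::strlen(s)) {
        if (len_ > N) len_ = N;
        std::memcpy(buf_, s, len_);
        buf_[len_] = '\0';
    }
};
```

Because the derived class no longer defines its own at(), there is nothing to hide the base's overloads, and the const overload comes for free.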
and John Nagle's implementation at:
* fixed_string_base must implement the entire std::basic_string I/F.
* A core I/F in fixed_string_base must be done via pure virtual functions, relying on the derived classes to implement those functions. Then there should be a large number of mfs that can be implemented in terms of that core set.
* The current implementation puts all of the onus on derived classes.
Summary:
The two classes take quite different approaches to solve the problem. Reece's approach is like iterator_adaptor and its ilk, relying on a parameterized base class to manage the derived class to get the required behavior. John's approach is to use an ABC and let the derived class implement required functionality.
One problem with Reece's approach is how to expose a core I/F in the derived class for use by the base class without that core I/F being part of the public interface of the derived type. Since those functions are typically unsafe for normal use, exposing them is risky.
By contrast, John's approach relies on pure virtual functions to require the derived class to offer the required I/F. Since those pvf's can be declared private in both fixed_string_base and the derived class, they needn't ever be part of the public I/F.
I think John's approach is better, only because it does less type indirection. It relies on the more ordinary pvf approach to the core functionality. Otherwise, the two libraries provide the same functionality (or can when complete).
No doubt there's something else you'd rather I focused on. If so, direct me to it and I'll let you know what I think.
--
Rob Stewart stewart@sig.com
Software Engineer http://www.sig.com
Susquehanna International Group, LLP using std::disclaimer;
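The ABC approach described in the summary can be sketched as follows. This is modelled on the description in this message rather than on the actual source: the names fixed_string_abc, fixed_string_n, buffer_impl() and capacity_impl() are assumptions. The core I/F is three private pure virtual functions, and the public members are written once in terms of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical abstract base, modelled on the description above.
class fixed_string_abc {
    std::size_t len_;

    // Core interface: private, so it never becomes part of the public I/F.
    virtual char* buffer_impl() = 0;
    virtual const char* buffer_impl() const = 0;
    virtual std::size_t capacity_impl() const = 0;

protected:
    fixed_string_abc() : len_(0) {}

public:
    virtual ~fixed_string_abc() {}

    // Everything below is written once, in terms of the core interface.
    std::size_t size() const { return len_; }
    std::size_t capacity() const { return capacity_impl(); }
    const char* c_str() const { return buffer_impl(); }

    // Returns false when the string is already full.
    bool push_back(char c) {
        if (len_ == capacity_impl()) return false;
        char* p = buffer_impl();
        p[len_++] = c;
        p[len_] = '\0';
        return true;
    }

    // Truncates input longer than capacity().
    fixed_string_abc& assign(const char* s) {
        assert(s != 0);
        std::size_t n = std::strlen(s);
        if (n > capacity_impl()) n = capacity_impl();
        std::memcpy(buffer_impl(), s, n);
        len_ = n;
        buffer_impl()[n] = '\0';
        return *this;
    }
};

template <std::size_t N>
class fixed_string_n : public fixed_string_abc {
    char buf_[N + 1];

    // Private overrides: callable through the base, invisible to users.
    char* buffer_impl() { return buf_; }
    const char* buffer_impl() const { return buf_; }
    std::size_t capacity_impl() const { return N; }

public:
    fixed_string_n() { buf_[0] = '\0'; }
};
```

Code that takes a fixed_string_abc& works with any capacity, at the cost of a virtual call per core operation and a vtable pointer per object.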