[Review] Formal review of Fixed Strings library starts today

Hi all, The review of the proposed Boost Fixed Strings library written by Reece Dunn starts today (January 19th, 2006) and ends January 28th, 2006. You can download the library here: http://tinyurl.com/czrh9 --------------------------------------------------- About the library: The fixed_strings library is a buffer-overrun safe version of using fixed-size character array data buffers. It provides integration with the C API allowing strcpy, et. al. to be used safely for a fixed string and provides a basic_string interface allowing easy migration to C++ strings. There is also a formatter function object allowing (w)sprintf to be used in a single line! Example: boost::fixed_string< 5 > hi; strcpy( hi, "Hello World" ); hi.push_back( '!' ); std::cout << hi << std::endl; std::cout << boost::formatter( "Meine %s Welt!", "kleine" ) << std::endl; output: Hello Meine kleine Welt! --------------------------------------------------- Please always state in your review, whether you think the library should be accepted as a Boost library! Additionally please consider giving feedback on the following general topics: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With what compiler? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? Regards Hartmut Review Manager

Sorry to start with a negative review: What is your evaluation of the design? -------------------------------------- In the introduction it says "The boost::fixed_string class is aimed at solving the problem outlined above and is designed to be a direct replacement for character buffers, operating as a fixed-capacity string." ("problem above" is buffer overrun and null termination of c-strings) With this in mind I think the fixed_string design is very complicated. Where in the aim does the need for a fixed_string_impl class and a configurable format_policy arise? The more obvious "overrun_policy" isn't in the design. I also don't like the fixed_string_base class. Deriving fixed_string from fixed_string_base means: 1. all fixed_strings have an overhead of 1 pointer and 2 size_t. 2. all operations on fixed_string use one extra pointer indirection. 3. fixed_string can't be used in shared_memory or binary serialization. I think it would be better with a separate fixed_string_ext friend class that works on an external buffer. template <typename CharT> struct basic_fixed_string_ext { template <size_t N> basic_fixed_string_ext(const fixed_string<CharT, N>& f) : buffer_(f.buffer()), capacity(N), size_(f.len) { }; public: CharT* const buffer_; const size_t capacity_; size_t& size_; }; I also miss a few things: - operations/construction with basic_string or boost::range - Easier way to handle length. Currently it is easy to forget to update the length after using buffer(). Most of my code looks like fstr.setlength(apifunc(fstr.buffer())); or apifunc(fstr.buffer()); fstr.setlength() The solution is probably to either: * Don't store the length * let buffer() invalidate the length and update it when needed. * let buffer() return a proxy that sets the length when destroyed. - fixed_stringstream. What is your evaluation of the documentation? ----------------------------- The documentation looks good but I think it is done the wrong way. The main interface class "fixed_string" is deep inside the design page and only got 2 lines of documentation. "This is the main class that provides the fixed-capacity string implementation, deriving itself from boost::fixed_string_base. It also provides the constructors specified in the std::basic_string interface." You need to look in the documentation for fixed_string_base to find out that fixed_string got the full basic_string interface plus a few more functions. What is your evaluation of the potential usefulness of the library? ----------------------------- Could be useful in applications where you want to avoid dynamic memory allocations but I don't find it useful for char array replacement. It is just to much trouble with setlength and conversion to/from basic_string. Did you try to use the library? With what compiler? Did you have any problems? ----------------------------- I am currently using an earlier version (December 2004). The review version doesn't work for me on VC70. boost\fixed_string\fixed_string.hpp(400) : error C2065: 'recalc_' : undeclared identifier Please always state in your review, whether you think the library should be accepted as a Boost library! ----------------------------- I vote No.

Martin Adrian wrote:
Sorry to start with a negative review:
It's ok. It'll lead to a better library :).
What is your evaluation of the design? --------------------------------------
In the introduction it says "The boost::fixed_string class is aimed at solving the problem outlined above and is designed to be a direct replacement for character buffers, operating as a fixed-capacity string." ("problem above" is buffer overrun and null termination of c-strings)
With this in mind I think the fixed_string design is very complicated. Where in the aim does the need for a fixed_string_impl class and a configurable format_policy arise? The more obvious "overrun_policy" isn't in the design.
The use of the fixed_string_impl class is to save an implementor of a basic_string-style class implementing all of the basic_string functions. Really, this should be more exposed and part of a set of basic_string helper classes, like the iterator implementation helpers are. The format policy is to help provide support for printf-style functions. As for overrun_policy, the approach I have taken is to clip the string when this occurs. However, it may be useful to have a policy that throws when the buffer limit is exceeded.
I also don't like the fixed_string_base class. Deriving fixed_string from fixed_string_base means:
1. all fixed_strings have an overhead of 1 pointer and 2 size_t. 2. all operations on fixed_string use one extra pointer indirection. 3. fixed_string can't be used in shared_memory or binary serialization.
This was to get around the problem of needing to do something like: template< int n > int strlen( const boost::fixed_string< char, n> & str ) { return str.length(); } so you could write that as: int strlen( const char_string & str ) { return str.length(); }
I think it would be better with a separate fixed_string_ext friend class that works on an external buffer.
template <typename CharT> struct basic_fixed_string_ext { template <size_t N> basic_fixed_string_ext(const fixed_string<CharT, N>& f) : buffer_(f.buffer()), capacity(N), size_(f.len) { }; public: CharT* const buffer_; const size_t capacity_; size_t& size_; };
Yes. That would be better, as you could then use whichever variant you wanted. NOTE: In order to use the basic_string functions on both of these (one of the requirements of the library), they will both need to be derived from basic_string_impl (or something similar).
I also miss a few things:
- operations/construction with basic_string or boost::range
They would be useful. But then what about others. How about interacting with the const_string class or a copy-on-write string, a std::vector< CharT >, ...? I suppose you could do something like: template< typename StringT > void assign( const StringT & str ) { assign( boost::begin( str ), boost::end( str )); }
- Easier way to handle length. Currently it is easy to forget to update the length after using buffer(). Most of my code looks like fstr.setlength(apifunc(fstr.buffer())); or apifunc(fstr.buffer()); fstr.setlength()
The solution is probably to either: * Don't store the length
And use strlen() to calculate the length? Then you would miss things like this: "Text Document\0*.txt\0\0\0".
* let buffer() invalidate the length and update it when needed. * let buffer() return a proxy that sets the length when destroyed.
Both of these solutions seem promising.
- fixed_stringstream.
This would be a fixed_string_buffer, possibly making use of Johnathan's I/O stream library.
What is your evaluation of the documentation? ----------------------------- The documentation looks good but I think it is done the wrong way. The main interface class "fixed_string" is deep inside the design page and only got 2 lines of documentation.
Ok. I'll see if I can come up with a better organisation.
"This is the main class that provides the fixed-capacity string implementation, deriving itself from boost::fixed_string_base. It also provides the constructors specified in the std::basic_string interface."
You need to look in the documentation for fixed_string_base to find out that fixed_string got the full basic_string interface plus a few more functions.
I'll make this more explicit in the documentation.
What is your evaluation of the potential usefulness of the library? ----------------------------- Could be useful in applications where you want to avoid dynamic memory allocations but I don't find it useful for char array replacement. It is just to much trouble with setlength and conversion to/from basic_string.
Did you try to use the library? With what compiler? Did you have any problems? ----------------------------- I am currently using an earlier version (December 2004). The review version doesn't work for me on VC70.
boost\fixed_string\fixed_string.hpp(400) : error C2065: 'recalc_' : undeclared identifier
This is being called when you call setlength(). I most likely removed this during the modifications since your version. I will readd it and update the tests to cover the buffer(), setlength() and other misc. functions. Thanks for your feedback, - Reece

I vote NOT to accept it for the following reasons:
- What is your evaluation of the documentation?
The documentation says more about implementation details rather then public interface. There is only one informative sentence on fixed_string reference page: "This class provides the implementation of an n-ary fixed capacity string. It is an abstraction of a CharT[ n ] buffer that exposes a std::basic_string-style interface." All constructors and assignment operators don't have a documentation, they're only listed. All other functions are not even listed! I realise that these functions are implemented in fixed_string_base but this is not documented (I'm pretty sure it's a typo, public keyword does never come alone)
- What is your evaluation of the design?
fixed_string is derived from fixed_string_base. IIUC, the rationale is to get rid of compile-time N in fixed_string<N>. What is inappropriate, is that fixed_string copies arrays while fixed_string_base copies pointers. For example, fixed_string_base foo() { return fixed_string<10>("Return a pointer to local array"); } Truncating the string silently is also not a good way. This often leads to errors at later stage of execution thus making it diffuct to track down an origin. For example, a developer appends ".ext" in one place to make sure that all files have extention, then in another place he rseaches for '.' expecting that it never returns npos. I didn't try to compile anything because I simply can't compile anything: $ cat test.cpp #include <boost/fixed_string/fixed_string.hpp> int main() { } $ pwd /home/nasoale/src/boost/fixed_string $ ls boost libs tags test.cpp $ gcc -v Reading specs from /usr/lib/gcc-lib/i486-suse-linux/3.2.2/specs Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib --enable-languages=c,c++,f77,objc,java,ada --enable-libgcj --with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib --with-system-zlib --enable-shared --enable-__cxa_atexit i486-suse-linux Thread model: posix gcc version 3.2.2 $ g++ -I. -I ../boost_1_33_0/ test.cpp In file included from ./boost/fixed_string/fixed_string.hpp:12, from test.cpp:1: ./boost/fixed_string/detail/basic_string_impl.hpp: In member function `typename Base::const_iterator boost::detail::basic_string_impl<Base, ErrorPolicy>::begin() const': ./boost/fixed_string/detail/basic_string_impl.hpp:149: error: there are no arguments to `begin_' that depend on a template parameter, so a declaration of `begin_' must be available ./boost/fixed_string/detail/basic_string_impl.hpp:149: error: (if you use `-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated) ... skiped a lot of error lines
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Quick reading. Sorry for all bad news for you, Reece. Es ist nicht so schlimm :))) I hope my negative review won't stop others from posting constructive comments. -- Alexander Nasonov

Alexander Nasonov wrote:
- What is your evaluation of the documentation?
The documentation says more about implementation details rather then public interface.
Ok. I will try and address this when I rewrite the documentation.
All constructors and assignment operators don't have a documentation, they're only listed. All other functions are not even listed!
I think this is because I am using the doxygen named group feature. /** @name Group */ //@{ //@} and that is not pulling in all the functions :(.
I realise that these functions are implemented in fixed_string_base but this is not documented (I'm pretty sure it's a typo, public keyword does never come alone)
I think this one is because the fixed_string_base (and fixed_string_impl) classes are in the details namespace and these are removed when the doxygen documentation is being pulled into the Boost docs.
- What is your evaluation of the design?
fixed_string is derived from fixed_string_base. IIUC, the rationale is to get rid of compile-time N in fixed_string<N>. What is inappropriate, is that fixed_string copies arrays while fixed_string_base copies pointers. For example,
fixed_string_base foo() { return fixed_string<10>("Return a pointer to local array"); }
I think Martin Adrian's solution would work well here. You would then have basic_string_impl (or something similar) to implement all the repetitive basic_string functions, fixed_string< CharT, n > for operating on fixed-length buffers and fixed_string_any< CharT > for operating on any length character buffers.
Truncating the string silently is also not a good way. This often leads to errors at later stage of execution thus making it diffuct to track down an origin. For example, a developer appends ".ext" in one place to make sure that all files have extention, then in another place he rseaches for '.' expecting that it never returns npos.
Having an overrun_policy, as Martin suggests, would allow you to use a throwing policy to avoid this.
I didn't try to compile anything because I simply can't compile anything:
gcc version 3.2.2
$ g++ -I. -I ../boost_1_33_0/ test.cpp In file included from ./boost/fixed_string/fixed_string.hpp:12, from test.cpp:1: ./boost/fixed_string/detail/basic_string_impl.hpp: In member function `typename Base::const_iterator boost::detail::basic_string_impl<Base, ErrorPolicy>::begin() const': ./boost/fixed_string/detail/basic_string_impl.hpp:149: error: there are no arguments to `begin_' that depend on a template parameter, so a declaration of `begin_' must be available ./boost/fixed_string/detail/basic_string_impl.hpp:149: error: (if you use `-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
In order to get this to work, I'll need to use the derived().begin_ technique used by the iterator implementations.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Quick reading.
Sorry for all bad news for you, Reece. Es ist nicht so schlimm :)))
Was toten Unser nicht, macht Unser stärker. (Sorry if I have mangled the grammar, as I can't speak German fluently.) Thanks for the feedback, - Reece

"Hartmut Kaiser" <hartmut.kaiser@gmail.com> writes:
You can download the library here: http://tinyurl.com/czrh9
TinyURL is down; please post the full link. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
You can download the library here: http://tinyurl.com/czrh9
TinyURL is down; please post the full link.
http://boost-consulting.com/vault/index.php?action=downloadfile&filename=fix ed_string.zip&directory=Strings%20-%20Text%20Processing& Regards Hartmut
participants (5)
-
Alexander Nasonov
-
David Abrahams
-
Hartmut Kaiser
-
Martin Adrian
-
Reece Dunn