
Jeff Garland wrote:
David Abrahams wrote:
Jeff Garland <jeff@crystalclearsoftware.com> writes:
I've been working on a little project where I've had to do lots of string processing, so I decided to put together a string type that wraps up boost.regex and boost.string_algo into a string type. I also remember a discussion in the LWG about whether the various string algorithms should be built in or not -- well consider this a test -- personally I find it easier built into the string than as standalone functions.
I appreciate the convenience of such an interface, I really do, but doesn't this design just compound the "fat interface" problems that std::string already has?
Yes, that's partially the point :-) I understand std::string is too big for some. Sadly the members it has make it hard to do the things I tend to do most often with strings. The fact is, if you look around at the languages people are using most for string processing, they offer just as many features as super_string and then some. Somehow, programmers are managing to deal with this. I'd buy more into the fat interface being a problem if something in the string class went beyond string processing, but it doesn't. String processing is a big complex domain -- whole languages have been optimized for it -- it needs a lot of functions to cover the domain and make code easy to read. Any way you slice it the current basic_string is inferior to what most modern languages offer.
Needless to say, I understand all about stl, free functions, their power, etc, etc. But the big thing this misses is that having a single type that unifies the string processing interface means there's a single set of documentation to start figuring out how to do a string manipulation.
I am with you, Jeff. I do not think that std::string is too fat but only that a design mistake was made with it. The mistake is that after specifying a std::string constructor that takes a C null-terminated string (const char *), which std::string has, all other functionality dealing with a string should have been in terms of std::string, and nothing else should have been in terms of a C null-terminated string. This is the principle of making a clear interface which has a single good way of doing things, rather than a muddy interface with numerous ways of doing the same thing. Other than this design mistake, no doubt unfortunately done to cater to the C crowd, std::string is fine for what it does and is not too fat at all.
I don't have to wade thru 50 pages of string_algorithms, 50 pages of regex docs and so on -- there's hundreds of functions to deal with strings there. Not to mention the templatization factor in the docs of these libraries which mostly detracts from me figuring out how to process the string. If I'm a Boost novice much of this great and useful string processing capability might be lost in so many other libraries.
The other thing that gets me is the readability of code. With a built-in function, it's one less parameter to remember when calling these functions. It seems trivial, but I believe the code is ultimately easier to understand. Simple example:
  std::string s1("foo");
  std::string s2("bar");
  std::string s3("foo");
  // The next line makes me go read the docs again, every time
  replace_all(s1, s2, s3); // which string is modified exactly?

or

  s1.replace_all(s2, s3); // obvious which string is modified here
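To make the comparison above concrete, here is a minimal sketch of both call styles using only the standard library. The free function mirrors the argument order used in the example (the first argument is the one modified in place); `rich_string` is a hypothetical wrapper written for illustration and is not the actual super_string implementation.

```cpp
#include <cassert>
#include <string>

// Free-function form: the caller has to remember that the FIRST
// argument is the string that gets modified in place.
inline void replace_all(std::string& input,
                        const std::string& search,
                        const std::string& format) {
    if (search.empty()) return;  // an empty pattern would never advance
    std::string::size_type pos = 0;
    while ((pos = input.find(search, pos)) != std::string::npos) {
        input.replace(pos, search.size(), format);
        pos += format.size();  // continue after the inserted text
    }
}

// Hypothetical wrapper (assumption, for illustration only): the same
// operation exposed as a member, so the modified object is the receiver.
class rich_string {
public:
    rich_string(const std::string& s) : data_(s) {}

    rich_string& replace_all(const std::string& search,
                             const std::string& format) {
        ::replace_all(data_, search, format);  // delegate to the free function
        return *this;
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Both forms produce the same result; only the call site reads differently.
inline void check_both_forms() {
    std::string s1("foo bar foo");
    replace_all(s1, "foo", "baz");       // s1 is the string modified
    rich_string r("foo bar foo");
    r.replace_all("foo", "baz");         // r is the string modified
    assert(s1 == r.str());
}
```

The behaviour is identical in both cases; the argument being made is purely about which call site is easier to read without consulting the documentation.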
I understand this flies against the current established C++ wisdom, but that's part of the reason I've done it. After thinking about it, I think the 'wisdom' is wrong. Usability and readability have been lost -- my code is harder to understand. I expect that super_string has little chance of ever making it to Boost because it goes too radically against some of these deeply held beliefs. That said, I think there's a group of folks out there that agree with me and are afraid to speak up.
I will speak up. The passion for loosely coupled free functions has gone too far. It works when there is a reason for it, usually because it is a function template and must deal with different types, ala the algorithms in the C++ standard library, but is not a solution for all situations. I am for a rich string class and think that super string is the right idea. My only difference is that I want a string class to only deal in C++ std::strings at all times, once a constructor has been provided for converting a null-terminated C string into the string class, in order to make the interface much cleaner and clearer.
Now they can at least download it from the vault -- but maybe they'll speak up -- we'll see. In any case, it's up to individuals to decide whether to download and use super_string, or continue using their inferior string class ;-)
Even Python's string, which has a *lot* built in, doesn't try to handle the regex stuff directly.
There are plenty of counter examples: Perl, Java, Javascript, and Ruby that build regex directly into the library/language. It's very powerful and useful in my experience. And, of course, super_string doesn't take away anything, just makes these powerful tools more accessible and easier to use.
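As a rough illustration of what "regex built into the string" could look like in C++, here is a small sketch on top of `std::regex`. The class name `regex_string` and its member names are assumptions made for this example; they are not taken from super_string's actual interface.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Hypothetical string wrapper (assumption, for illustration only) that
// exposes regex search and replace as members instead of free functions.
class regex_string {
public:
    explicit regex_string(std::string s) : data_(std::move(s)) {}

    // True if any part of the string matches the pattern.
    bool contains_regex(const std::string& pattern) const {
        return std::regex_search(data_, std::regex(pattern));
    }

    // Replace every match of the pattern with the given format text.
    regex_string& replace_all_regex(const std::string& pattern,
                                    const std::string& format) {
        data_ = std::regex_replace(data_, std::regex(pattern), format);
        return *this;
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};
```

Nothing here goes beyond what the standard regex facilities already provide; the wrapper only changes where a reader looks to find them.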
Jeff