
David Abrahams wrote:
Jeff Garland <jeff@crystalclearsoftware.com> writes:
I've been working on a little project where I've had to do lots of string processing, so I decided to put together a string type that wraps up boost.regex and boost.string_algo into a single type. I also remember a discussion in the LWG about whether the various string algorithms should be built in or not -- well, consider this a test -- personally I find it easier built into the string than as standalone functions.
I appreciate the convenience of such an interface, I really do, but doesn't this design just compound the "fat interface" problems that std::string already has?
Yes, that's partially the point :-) I understand std::string is too big for some. Sadly, the members it has make it hard to do the things I most often do with strings. The fact is, if you look around at the languages people are using most for string processing, they offer just as many features as super_string and then some. Somehow, programmers are managing to deal with this. I'd buy more into the fat interface being a problem if something in the string class went beyond string processing, but it doesn't. String processing is a big complex domain -- whole languages have been optimized for it -- it needs a lot of functions to cover the domain and make code easy to read. Any way you slice it, the current basic_string is inferior to what most modern languages offer.

Needless to say, I understand all about the STL, free functions, their power, etc. But the big thing this misses is that having a single type that unifies the string processing interface means there's a single set of documentation to start from when figuring out how to do a string manipulation. I don't have to wade through 50 pages of string_algorithms, 50 pages of regex docs and so on -- there are hundreds of functions to deal with strings there. Not to mention the templatization factor in the docs of these libraries, which mostly detracts from me figuring out how to process the string. If I'm a Boost novice, much of this great and useful string processing capability might be lost among so many other libraries.

The other thing that gets me is the readability of code. With a built-in function, it's one less parameter to remember when calling these functions. It seems trivial, but I believe the code is ultimately easier to understand. Simple example:

    std::string s1("foo");
    std::string s2("bar");
    std::string s3("foo");
    // The next line makes me go read the docs again, every time
    replace_all(s1, s2, s3); // which string is modified exactly?

or

    s1.replace_all(s2, s3); // obvious which string is modified here

I understand this flies against the current established C++ wisdom, but that's part of the reason I've done it. After thinking about it, I think the 'wisdom' is wrong. Usability and readability have been lost -- my code is harder to understand. I expect that super_string has little chance of ever making it into Boost because it goes too radically against some of these deeply held beliefs. That said, I think there's a group of folks out there who agree with me and are afraid to speak up. Now they can at least download it from the vault -- but maybe they'll speak up -- we'll see. In any case, it's up to individuals to decide whether to download and use super_string, or continue using their inferior string class ;-)
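
To make the free-function versus member-function comparison concrete, here is a minimal sketch using only the standard library. It is not the actual super_string code and does not use Boost: the name fat_string, the hand-rolled replace_all free function, and demo() are all hypothetical, chosen only to show how a member can forward to a free algorithm so the call site names the modified string first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Free-function form (hypothetical stand-in, not Boost's implementation):
// the caller has to remember that the FIRST argument is modified in place.
inline void replace_all(std::string& input,
                        const std::string& search,
                        const std::string& replacement) {
    if (search.empty()) return;  // an empty pattern would never advance
    std::size_t pos = 0;
    while ((pos = input.find(search, pos)) != std::string::npos) {
        input.replace(pos, search.size(), replacement);
        pos += replacement.size();  // continue after the inserted text
    }
}

// Hypothetical wrapper type: the member simply forwards to the free
// function, so the object being modified is explicit at the call site.
class fat_string {
public:
    explicit fat_string(std::string s) : data_(std::move(s)) {}

    fat_string& replace_all(const std::string& search,
                            const std::string& replacement) {
        ::replace_all(data_, search, replacement);
        return *this;  // returning *this allows chained calls
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Both spellings perform the same edit; only the call syntax differs.
inline std::string demo() {
    std::string a("foo bar foo");
    replace_all(a, "foo", "baz");   // free function: a is the target

    fat_string b("foo bar foo");
    b.replace_all("foo", "baz");    // member: b is obviously the target

    assert(a == b.str());
    return b.str();
}
```

The member adds no capability here; it only moves the modified string out of the argument list, which is the readability point being argued.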
Even Python's string, which has a *lot* built in, doesn't try to handle the regex stuff directly.
There are plenty of counterexamples: Perl, Java, JavaScript, and Ruby all build regex directly into the library/language. It's very powerful and useful in my experience. And, of course, super_string doesn't take away anything; it just makes these powerful tools more accessible and easier to use.

Jeff