
On 23 June 2010 03:49, Rob Stewart <robertstewart@comcast.net> wrote:
My intention is that unquote() should handle a string with multiple quoted substrings rather than just assuming that the entire string is quoted. It is certainly reasonable to think that it should only handle whole strings. In that case, it would be easy to identify malformed strings on input: either it starts and ends with the delimiter, and all other occurrences are escaped, or it is malformed.
I don't think that's right, since multiple quoted substrings normally mean multiple values (apart from in C-family languages, but then text outside the quotes is interpreted differently). Looking at your code, the escape should work outside of the quotes, and you don't check for the end of the string after an escape.
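To make the whole-string rule and the end-of-string check concrete, here is a minimal sketch. It is not the code under review: the names `unquote_whole` and `unquote_error`, and the defaults of `"` as delimiter and `\` as escape, are assumptions made up for illustration. It accepts only input that starts and ends with the delimiter, treats any unescaped interior delimiter as malformed, and checks for the end of the input before consuming the character after an escape.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type (an assumption, not part of the proposal).
// It derives from std::runtime_error because malformed input is a
// runtime condition rather than a programming error.
struct unquote_error : std::runtime_error {
    explicit unquote_error(const std::string& what)
        : std::runtime_error(what) {}
};

// Hypothetical whole-string unquote: the input must start and end with
// `delim`, and every other `delim` (and every literal `escape`) inside
// must be preceded by `escape`.
inline std::string unquote_whole(const std::string& in,
                                 char delim = '"',
                                 char escape = '\\')
{
    if (in.size() < 2 || in.front() != delim)
        throw unquote_error("missing opening delimiter");

    std::string out;
    std::string::size_type i = 1;
    for (; i < in.size(); ++i) {
        const char c = in[i];
        if (c == escape) {
            // Check for end of input before reading the escaped character.
            if (++i == in.size())
                throw unquote_error("escape at end of input");
            out += in[i];
        } else if (c == delim) {
            break;  // unescaped delimiter: must be the closing one
        } else {
            out += c;
        }
    }

    if (i == in.size())
        throw unquote_error("missing closing delimiter");
    if (i + 1 != in.size())
        throw unquote_error("unescaped delimiter inside string");
    return out;
}
```

With these assumptions, `unquote_whole("\"a\\\"b\"")` yields `a"b`, while an input such as `"abc` (no closing delimiter) or `"abc\` (escape with nothing after it) throws `unquote_error`.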
I quickly chose to throw std::logic_error from unquoted() when it fails to find a closing delimiter. There may well be a better approach; feel free to suggest alternatives.
You should inherit from std::runtime_error, since this would usually be caused by bad input rather than programmer error.

To be honest, I don't see the value of this, as it's the kind of thing that is handled well in other ways (e.g. using a parser or lexer generator, or a standard data format such as XML or JSON). There tend to be odd differences in quoting, encoding and escaping styles, which makes a generic function awkward. It's not as specific as a filename extractor and not as generic as a parser, and it's not clear why there's a need for something in between.

Daniel