[String_Ref] Working with other parts of boost

Hi,

I wanted to play around with Boost.String_Ref and found some old code that I thought would be the ideal case for string_refs. It's basically a config file reader that stores the whole line and also a key/value pair that may be found in that line. So a single line looked something like this:

    std::string line;         // original line as is in the file
    boost::string_ref key;    // extracted key subpart
    boost::string_ref value;  // extracted value subpart

But when I wanted to trim the string_refs (e.g. for lexical_casting them) I got compile errors, because boost::trim uses erase, which of course is not supported by string_ref. But string_ref has the remove_prefix/remove_suffix member functions, which would be exactly what trim needs. So I quickly hacked an additional overload of trim_*_if that calls remove_*fix instead of the .erase() member function, and it seemed to work.

The second problem occurred when I tried to extract my key/value pair from a line using the Tokenizer, which seems to be old, but still works for simple cases. Tokenizer internally uses assign() to fill an internal buffer with the next token, and assign() of course is not a member function of string_ref. Since it seems Tokenizer only uses the assign overload that takes two iterators, it might be possible to add such an assign() function to string_ref. At least I added one locally as a test case and it worked in my limited testing.

Are there any plans to add either trim() or Tokenizer support for string_ref? I'm not sure about the Tokenizer/assign stuff, since one would have to change string_ref, but I really think the trim() support would be helpful.

Anyway, these issues kept me from actually using string_ref, since the code did not compile (or required changes to Boost headers in order to compile). And I did not want to redo some of the Boost string algorithm stuff in order to use string_refs.

Norbert
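For readers following along, the trimming idea in this message can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch described in the mail and not Boost code: std::string_view (C++17) is used as a stand-in for boost::string_ref because it offers the same remove_prefix/remove_suffix members, and the helper names trim_ref and split_key_value are made up for this example.

```cpp
#include <cctype>
#include <string>
#include <string_view>
#include <utility>

// Assumption: std::string_view stands in for boost::string_ref here.
// Both provide remove_prefix() and remove_suffix().
using string_ref = std::string_view;

// Hypothetical helper: trims whitespace by shrinking the view.
// Nothing is erased or copied; only the view's bounds move.
inline string_ref trim_ref(string_ref s) {
    auto is_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };
    while (!s.empty() && is_space(s.front())) s.remove_prefix(1);
    while (!s.empty() && is_space(s.back())) s.remove_suffix(1);
    return s;
}

// Hypothetical helper: splits "key = value" at the first '='.
// Both results still point into the original line's storage.
inline std::pair<string_ref, string_ref> split_key_value(string_ref line) {
    const auto pos = line.find('=');
    if (pos == string_ref::npos) {
        return {trim_ref(line), string_ref{}};  // no '=': key only, empty value
    }
    return {trim_ref(line.substr(0, pos)), trim_ref(line.substr(pos + 1))};
}
```

Calling split_key_value on a std::string holding a config line yields two views into that string, so the string has to outlive the returned key and value.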

On Nov 17, 2013, at 11:47 AM, Norbert Wenzel
When I get some free time - maybe over the holidays - I'll be taking a long look at Boost.Algorithm's string algorithm section, and seeing what I can do to make them work with string_ref. They're very general algorithms; I wonder how many people actually call them with anything other than std::string (and other specializations of std::basic_string).

Anyone? Anyone? Bueller?

-- Marshall

Marshall Clow
Idio Software
mailto:mclow.lists@gmail.com

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait). -- Yu Suzuki

On 11/18/2013 02:36 AM, Marshall Clow wrote:
When I get some free time - maybe over the holidays - I'll be taking a long look at Boost.Algorithm's string algorithm section, and seeing what I can do to make them work with string_ref. They're very general algorithms; I wonder how many people actually call them with anything other than std::string (and other specializations of std::basic_string).
I do not use these algorithms presently, but I have my own equivalent of string_ref for a suite of parsers that works on both textual and binary formats, and I would like to change that to use string_ref. Because I am parsing, I do not really need the mutable algorithms, but the rest are useful to me. Apropos string_ref, I would like to see lexical_cast extended to support a Range concept.

2013/11/18 Bjorn Reese
Apropos string_ref, I would like to see lexical_cast extended to support a Range concept.
lexical_cast has had optimizations for ranges with character types since 1.50:
http://www.boost.org/doc/libs/1_55_0/doc/html/boost_lexical_cast/changes.htm...

It has also had a lexical_cast(const CharType* chars, std::size_t count) function since 1.52. And I'm afraid that's all that can be done without breaking the initial interface and behavior of the LexicalCast library. Or were you talking about some other "range concept" support?

-- Best regards, Antony Polukhin

On 11/19/2013 02:30 PM, Antony Polukhin wrote:
Or were you talking about some other "range concept" support?
Yes, I was talking about Boost.Range concepts, which are used in Boost.StringAlgo. Here you pass a reference to an object whose beginning and end can be obtained by boost::begin()/boost::end(). This way the string algorithms work directly with string_ref. If I understood lexical_cast correctly, it uses ranges internally but does not expose them in the API.
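As a rough illustration of the range idea described in this message: an algorithm written against free begin()/end() calls accepts any type that supplies them. The sketch below is not Boost.StringAlgo code. It assumes std::begin/std::end as stand-ins for boost::begin/boost::end and std::string_view as a stand-in for boost::string_ref, and all_digits is a hypothetical name.

```cpp
#include <cctype>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical range-based algorithm. It only requires that the argument's
// bounds can be obtained via free begin()/end() functions
// (std::begin/std::end here, assumed to play the role of boost::begin/end).
template <typename Range>
bool all_digits(const Range& r) {
    auto first = std::begin(r);
    const auto last = std::end(r);
    if (first == last) return false;  // treat an empty range as "not a number"
    for (; first != last; ++first) {
        if (!std::isdigit(static_cast<unsigned char>(*first))) return false;
    }
    return true;
}
```

The same template accepts a std::string, a std::string_view, or a std::vector<char> without overloads. A raw string literal would be seen as a char array that includes the terminating '\0', so wrap literals in a view first.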

On 19.11.2013 15:57, Bjorn Reese wrote:
I'm not sure I understand your point correctly, but lexical_cast was the one thing I used that actually did work with string_ref. I didn't check whether lexical_cast made any copies or did other unnecessary work internally, but I think I got the correct results from it.

On 11/19/2013 04:26 PM, Norbert Wenzel wrote:
You are right. lexical_cast does work with string_ref. I overlooked the fact that lexical_cast works with both pointer+size and iterator_range ranges. More specifically, I overlooked the fact that lexical_cast uses string_ref::operator<< to obtain the data (and I guess that this is why Boost.Spirit generally outperforms lexical_cast.) Sorry about the noise.

2013/11/20 Bjorn Reese
There is a plan to add optimizations for string_ref to lexical_cast, so there will be minimal difference in performance. Boost.Spirit and Boost.LexicalCast differ not only in performance: lexical_cast is more precise when working with real numbers and uses std::locale during conversions, so it can process decimal and other separators in a locale-dependent way. LexicalCast is also usable with user-defined types that have operator>> and operator<< overloads. So choose the tool depending on your needs.

-- Best regards, Antony Polukhin
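To make the "works with any type that has operator>> and operator<<" point concrete, here is a minimal stream-based sketch of the conversion model. It is an assumption-laden illustration, not Boost's implementation: naive_lexical_cast and Point are hypothetical names, std::string_view stands in for boost::string_ref, and std::runtime_error is thrown where the real library throws boost::bad_lexical_cast.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical user-defined type with stream operators.
struct Point {
    int x = 0;
    int y = 0;
};

inline std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << p.x << ',' << p.y;
}

inline std::istream& operator>>(std::istream& is, Point& p) {
    char comma = 0;
    // Mark the stream as failed if the separator is not a comma.
    if (is >> p.x >> comma >> p.y && comma != ',') is.setstate(std::ios::failbit);
    return is;
}

// Hypothetical, unoptimized model of the conversion: write the source with
// operator<<, read the target with operator>>, reject leftover input.
template <typename Target, typename Source>
Target naive_lexical_cast(const Source& source) {
    std::stringstream stream;
    Target result{};
    if (!(stream << source) || !(stream >> result) || !(stream >> std::ws).eof()) {
        throw std::runtime_error("naive_lexical_cast: conversion failed");
    }
    return result;
}
```

Because everything goes through a stream, the stream's locale takes part in the conversion, which matches the locale-dependent behavior mentioned above. It also hints at where the overhead comes from, and why type-specific fast paths are worth adding.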

On 11/20/2013 09:31 AM, Antony Polukhin wrote:
There is a plan to add optimizations for string_ref to lexical_cast so there'll be minimal difference in performance.
That is good news. Can this be extended to any type that fulfils the Single Pass Range (or the Forward Range) concept?

2013/11/20 Bjorn Reese
No, that's impossible: lexical_cast must convert data as if it was passed
via operator<<. Optimizations can be added to some types with exactly known
behavior: iterator_range
participants (4)
- Antony Polukhin
- Bjorn Reese
- Marshall Clow
- Norbert Wenzel