Variadic append for std::string
Hi,

One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries. A variadic append() for std::string seems like the obvious solution. It could support string_view (boost and std), integers, and maybe floats, but without formatting options. It could even be extensible by calling append(s, t).

    append(s, "A", "B", 42);

Would this be useful for the Boost String Algo lib?

-- Olaf
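A minimal sketch of what such a variadic append() might look like, assuming C++17 (fold expressions, std::to_chars). The namespace sketch and the helper append_one are hypothetical names chosen for illustration; nothing here is an existing Boost or standard API.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <type_traits>

namespace sketch {

// Plain integers only; bool and char are deliberately left out of this sketch.
template <class T>
inline constexpr bool is_number_v =
    std::is_integral_v<T> && !std::is_same_v<T, bool> && !std::is_same_v<T, char>;

// Text-like arguments (literals, std::string, string_view) are appended as-is.
inline void append_one(std::string& s, std::string_view v) { s.append(v); }

// Integers are appended in decimal, without an intermediate std::string.
template <class T, std::enable_if_t<is_number_v<T>, int> = 0>
void append_one(std::string& s, T value) {
    char buf[std::numeric_limits<T>::digits10 + 3];
    auto res = std::to_chars(buf, buf + sizeof buf, value);
    s.append(buf, static_cast<std::size_t>(res.ptr - buf));
}

// Appends every argument to s, left to right.
template <class... Args>
std::string& append(std::string& s, const Args&... args) {
    (append_one(s, args), ...);
    return s;
}

inline void usage_example() {
    std::string s = "x=";
    append(s, "A", std::string("B"), 42, std::string_view("!"), -7);
    assert(s == "x=AB42!-7");
}

}  // namespace sketch
```

The two-argument append_one is where a user-defined type could plug in, which is one way to read the "extensible by calling append(s, t)" remark.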
Pretty please? Also concat(stringy_things...) Billy3
On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <ml@vdspek.org> wrote:
One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
Can't we already write it through (((s += "A") += "B") += to_string(42))? This is the time I think that assignment operators, other than =, should have had left associativity... pity they don't. -- Yakov Galka http://stannum.co.il/
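For reference, the parenthesized chain above does work with today's std::string, because each += returns the string by reference. A small sketch (the function name chained_append is made up for this example):

```cpp
#include <cassert>
#include <string>

// Builds "xAB42" by chaining compound assignments explicitly.
inline std::string chained_append() {
    std::string s = "x";
    // Each += returns s by reference, so the next += appends to the same string.
    (((s += "A") += "B") += std::to_string(42));
    return s;
}

inline void usage_example() { assert(chained_append() == "xAB42"); }
```

Without the parentheses, s += "A" += "B" groups as s += ("A" += "B") because compound assignment is right-associative, and that inner expression does not compile.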
On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <ml@vdspek.org> wrote:
One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
Can't we already write it through (((s += "A") += "B") += to_string(42))? This is the time I think that assignment operators, other than =, should have had left associativity... pity they don't.
We can, but it's ugly and I'd like to avoid the explicit to_string. It also wouldn't allow the two-pass optimization to calculate the final length before allocation. -- Olaf
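A minimal sketch of the two-pass idea mentioned here, assuming C++17: the first pass sums an upper bound for every argument, the string is reserved once, and the second pass writes the characters. The names max_size and write are hypothetical, and only string-like arguments and plain integers are handled.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <type_traits>

namespace sketch {

template <class T>
inline constexpr bool is_number_v =
    std::is_integral_v<T> && !std::is_same_v<T, bool> && !std::is_same_v<T, char>;

// Pass 1: upper bound on the number of characters an argument needs.
inline std::size_t max_size(std::string_view v) { return v.size(); }

template <class T, std::enable_if_t<is_number_v<T>, int> = 0>
constexpr std::size_t max_size(T) {
    return std::numeric_limits<T>::digits10 + 2;  // all digits plus a sign
}

// Pass 2: write the characters.
inline void write(std::string& s, std::string_view v) { s.append(v); }

template <class T, std::enable_if_t<is_number_v<T>, int> = 0>
void write(std::string& s, T value) {
    char buf[std::numeric_limits<T>::digits10 + 2];
    auto res = std::to_chars(buf, buf + sizeof buf, value);
    s.append(buf, static_cast<std::size_t>(res.ptr - buf));
}

template <class... Args>
std::string& append(std::string& s, const Args&... args) {
    // One reservation for the whole call; integers may be overestimated.
    s.reserve(s.size() + (std::size_t{0} + ... + max_size(args)));
    (write(s, args), ...);
    return s;
}

inline void usage_example() {
    std::string s;
    append(s, "Error ", 47, ": ", std::string("disk full"));
    assert(s == "Error 47: disk full");
}

}  // namespace sketch
```

The chained += form cannot do this, since each += only sees one operand at a time.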
On 12/29/16 11:54, Olaf van der Spek wrote:
On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <ml@vdspek.org> wrote:
One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
Can't we already write it through (((s += "A") += "B") += to_string(42))? This is the time I think that assignment operators, other than =, should have had left associativity... pity they don't.
We can, but it's ugly and I'd like to avoid the explicit to_string. It also wouldn't allow the two-pass optimization to calculate the final length before allocation.
I already mentioned in the std-proposals discussion that I don't think formatting should be dealt with by std::string or a function named append(). If formatting is to be involved I'd suggest creating a formatting library, but at that point you should provide clear advantages over the other formatting libraries we have in Boost.
On Thu, Dec 29, 2016 at 2:53 PM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
Can't we already write it through (((s += "A") += "B") += to_string(42))? This is the time I think that assignment operators, other than =, should have had left associativity... pity they don't.
We can, but it's ugly and I'd like to avoid the explicit to_string. It also wouldn't allow the two-pass optimization to calculate the final length before allocation.
I already mentioned in the std-proposals discussion that I don't think formatting should be dealt with by std::string or a function named append().
It'd be helpful if you include *why* you think so..
If formatting is to be involved I'd suggest creating a formatting library, but at that point you should provide clear advantages over the other formatting libraries we have in Boost.
-- Olaf
On 12/31/16 20:36, Olaf van der Spek wrote:
On Thu, Dec 29, 2016 at 2:53 PM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
I already mentioned in the std-proposals discussion that I don't think formatting should be dealt with by std::string or a function named append().
It'd be helpful if you include *why* you think so..
I've already explained my opinion in the std-proposals discussion. For the sake of completeness, here's a short version:

It's simply not std::string's job to do the formatting, IMO. This class should be nothing more than a container of characters (well, it is slightly more now, but I don't consider that a good thing). I guess, that's mostly because I believe one class should be responsible for doing only one thing, and in case of std::string it's storing a string. Adding formatting functionality to std::string would increase the class' bloat in terms of interface and implementation and likely add new dependencies. Though, this is probably not an argument against a separate non-intrusive library.

Back to the proposal for Boost, I don't mind if there is a standalone function or library that does the formatting, as long as it offers some advantage over the existing libraries. The proposed function though should not be named `append` IMO because it's not the primary thing the function does. I would expect an `append` algorithm to be generic and compatible with any container, i.e. something that does nothing more than `c.insert(c.end(), x)` or `c.append(x)`, for `x` being every argument in the list of arguments to be appended. I.e. this should work:

    std::list< double > c{ 1.0, 2.0, 3.0 };
    append(c, 10.0, 20.0, 30.0); // calls c.insert()

As well as this:

    std::string s{ "Hello" };
    append(s, ", world!", " Happy 2017! :)"); // calls s.append()

This, however:

    append(s, 47);

should result not in appending "47" but in appending "/" (a character with code 47). I can see how this could be confusing to someone, but that is what you'd get from calling `s.insert()` manually, and what I'd expect from a function called `append`. If formatting is required I would prefer to be required to spell my intent more clearly, like this:

    print(s, 47);

or:

    format(s) << 47;

Also, I'm not clear enough about the intended use cases of the proposed library. Is the goal just to optimize memory allocation? Is that at all possible when formatting is involved? Would it be better than snprintf into a local buffer? Does the library open new use cases? For example, someone suggested in the std-proposals discussion something similar to this:

    throw std::runtime_error(format(std::string()) << "Error " << 47);

(I wrapped the default-constructed std::string() into format(), because I don't think overloading operator<< for std::string is an acceptable approach for the same reasons I mentioned above.)

I think, something with one line capability like that would be useful. Would the library allow something like this? Would the library support targets other than std::string? E.g. would I be able to format into an `std::array< char, 10 >`?
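The container-generic `append` described in this message could be sketched roughly as follows, assuming C++17. The trait has_append and the namespace are hypothetical; the function prefers `c.append(x)` when that compiles and otherwise falls back to `c.insert(c.end(), x)`.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// True when c.append(x) is a valid expression for container C and argument X.
template <class C, class X, class = void>
struct has_append : std::false_type {};

template <class C, class X>
struct has_append<C, X,
    std::void_t<decltype(std::declval<C&>().append(std::declval<X>()))>>
    : std::true_type {};

// Generic append: no formatting, just the container's own insertion.
template <class C, class... Xs>
C& append(C& c, const Xs&... xs) {
    auto one = [&c](const auto& x) {
        if constexpr (has_append<C, decltype(x)>::value)
            c.append(x);             // e.g. std::string with a string argument
        else
            c.insert(c.end(), x);    // e.g. std::list, or std::string with an int
    };
    (one(xs), ...);
    return c;
}

inline void usage_example() {
    std::list<double> c{1.0, 2.0, 3.0};
    append(c, 10.0, 20.0, 30.0);                 // calls c.insert()
    assert(c.size() == 6 && c.back() == 30.0);

    std::string s{"Hello"};
    append(s, ", world!", " Happy 2017! :)");   // calls s.append()
    assert(s == "Hello, world! Happy 2017! :)");

    append(s, 47);                               // inserts the character with code 47
    assert(s.back() == '/');
}

}  // namespace sketch
```

With these semantics an integer argument is converted to a character, which is exactly the behaviour the message argues a function named `append` should have.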
On Sun, Jan 1, 2017 at 12:21 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
If formatting is required I would prefer to be required to spell my intent more clearly, like this:
print(s, 47);
I'd expect print to output to cout.. wouldn't you? sprint then?
or:
format(s) << 47;
I'd expect format to accept modifiers which this proposal explicitly doesn't support.
Also, I'm not clear enough about the intended use cases of the proposed library. Is the goal just to optimize memory allocation?
No, the goal is also to provide a better and simpler way to handle integers.
Is that at all possible when formatting is involved?
Yes, as manually calling reserve beforehand is always possible. How to optimally implement this is still an open question but that's kind of an implementation detail.
Would it be better than snprintf into a local buffer?
The resulting code certainly looks simpler to me.
Does the library open new use cases? For example, someone suggested in the std-proposals discussion something similar to this:
throw std::runtime_error(format(std::string()) << "Error " << 47);
(I wrapped the default-constructed std::string() into format(), because I don't think overloading operator<< for std::string is an acceptable approach for the same reasons I mentioned above.)
I disagree.. I really don't see the benefit, especially for the user, of the format wrapper. operator<< would be a different proposal but throw runtime_error(append(string(), "Error ", 47)); might work.
I think, something with one line capability like that would be useful. Would the library allow something like this?
Would the library support targets other than std::string?
Yes, probably.
E.g. would I be able to format into an `std::array< char, 10 >`?
No, as array is fixed-size you can't append to it.. vector<char> might work though. -- Olaf
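The one-line throw suggested in this reply needs an append() that accepts a temporary string, since an rvalue cannot bind to std::string&. A minimal sketch, assuming C++17 and a hypothetical pair of overloads (append_one and the namespace are illustrative names only):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

namespace sketch {

inline void append_one(std::string& s, std::string_view v) { s.append(v); }
inline void append_one(std::string& s, int v) { s.append(std::to_string(v)); }

// Lvalue target: modify in place and return a reference.
template <class... Args>
std::string& append(std::string& s, const Args&... args) {
    (append_one(s, args), ...);
    return s;
}

// Rvalue target: take ownership of the temporary and return it by value.
template <class... Args>
std::string append(std::string&& s, const Args&... args) {
    append(s, args...);  // s is an lvalue here, so this calls the overload above
    return std::move(s);
}

inline void fail() {
    throw std::runtime_error(append(std::string(), "Error ", 47));
}

inline void usage_example() {
    try {
        fail();
        assert(false);  // not reached
    } catch (const std::runtime_error& e) {
        assert(std::string(e.what()) == "Error 47");
    }
}

}  // namespace sketch
```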
On 01/03/17 13:14, Olaf van der Spek wrote:
On Sun, Jan 1, 2017 at 12:21 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
If formatting is required I would prefer to be required to spell my intent more clearly, like this:
print(s, 47);
I'd expect print to output to cout.. wouldn't you? sprint then?
sprint works for me as well.
Also, I'm not clear enough about the intended use cases of the proposed library. Is the goal just to optimize memory allocation?
No, the goal is also to provide a better and simpler way to handle integers.
Is that at all possible when formatting is involved?
Yes, as manually calling reserve beforehand is always possible. How to optimally implement this is still an open question but that's kind of an implementation detail.
But you would have to either overallocate memory or perform the formatting to determine its length. And while overallocating might be possible for standard types such as integers and FP numbers (assuming C-locale format), that does not seem possible for user's types. Or are you not planning to support user-defined types?
I think, something with one line capability like that would be useful. Would the library allow something like this?
Would the library support targets other than std::string?
Yes, probably.
E.g. would I be able to format into an `std::array< char, 10 >`?
No, as array is fixed-size you can't append to it.. vector<char> might work though.
My intent was to format into a local/preallocated buffer, without any additional allocations, but I assume that won't work because `std::array` is lacking APIs for insertion. That probably means that you have to define a concept of the possible target, what operations it must support.
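One way to picture such a target concept, as a sketch only: the formatting function requires nothing but `append(const char*, std::size_t)` from its target, and a small adapter gives that operation to a preallocated `std::array`. The names array_target, put and sprint are hypothetical, and silently truncating on overflow is an assumption of this example, not something decided in the thread.

```cpp
#include <array>
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

namespace sketch {

// Adapter that appends into a caller-owned std::array without allocating.
template <std::size_t N>
class array_target {
public:
    explicit array_target(std::array<char, N>& buf) : buf_(buf) {}

    // Copies as much as fits and remembers whether anything was dropped.
    void append(const char* p, std::size_t n) {
        const std::size_t room = N - size_;
        if (n > room) { n = room; truncated_ = true; }
        std::memcpy(buf_.data() + size_, p, n);
        size_ += n;
    }

    std::string_view view() const { return {buf_.data(), size_}; }
    bool truncated() const { return truncated_; }

private:
    std::array<char, N>& buf_;
    std::size_t size_ = 0;
    bool truncated_ = false;
};

// The only requirement on Target is append(const char*, std::size_t).
template <class Target>
void put(Target& t, std::string_view v) { t.append(v.data(), v.size()); }

template <class Target>
void put(Target& t, int v) {
    char digits[12];
    auto res = std::to_chars(digits, digits + sizeof digits, v);
    t.append(digits, static_cast<std::size_t>(res.ptr - digits));
}

template <class Target, class... Args>
Target& sprint(Target& t, const Args&... args) {
    (put(t, args), ...);
    return t;
}

inline void usage_example() {
    std::array<char, 10> buf{};
    array_target t(buf);
    sprint(t, "Error ", 47);
    assert(t.view() == "Error 47" && !t.truncated());
    sprint(t, "!!!");  // only two characters still fit
    assert(t.view() == "Error 47!!" && t.truncated());

    std::string s;     // std::string satisfies the same requirement
    sprint(s, "Error ", 47);
    assert(s == "Error 47");
}

}  // namespace sketch
```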
On Tue, Jan 3, 2017 at 1:32 PM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
Yes, as manually calling reserve beforehand is always possible. How to optimally implement this is still an open question but that's kind of an implementation detail.
But you would have to either overallocate memory or perform the formatting to determine its length.
True
And while overallocating might be possible for standard types such as integers and FP numbers (assuming C-locale format), that does not seem possible for user's types. Or are you not planning to support user-defined types?
Supporting such types in one big call to sprint would be nice but it does complicate the proposal. One could always call sprint(s, <udt>) 'manually'. Or maybe the two-argument version could be the extension point. Maybe a sprint_max_size(s, <udt>) could be used (if defined) to estimate the size required.
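The extension point described here might be sketched as below, assuming C++17. The two-argument sprint(s, x) is the customization hook, found by argument-dependent lookup for user types, and sprint_max_size(s, x) is an optional size hint. The namespaces lib and user, the point type, and the detection trait has_max_size are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

namespace lib {

// Two-argument overloads: the extension point.
inline void sprint(std::string& s, std::string_view v) { s.append(v); }
inline void sprint(std::string& s, int v) { s.append(std::to_string(v)); }

// Optional size hints for the built-in cases.
inline std::size_t sprint_max_size(const std::string&, std::string_view v) { return v.size(); }
inline std::size_t sprint_max_size(const std::string&, int) {
    return std::numeric_limits<int>::digits10 + 2;
}

// Detects whether sprint_max_size(s, x) is callable for T.
template <class T, class = void>
struct has_max_size : std::false_type {};

template <class T>
struct has_max_size<T, std::void_t<decltype(sprint_max_size(
    std::declval<const std::string&>(), std::declval<const T&>()))>>
    : std::true_type {};

template <class T>
std::size_t estimate(const std::string& s, const T& x) {
    if constexpr (has_max_size<T>::value) return sprint_max_size(s, x);
    else return 0;  // no hint: let the string grow on demand
}

// Variadic front end for two or more values.
template <class A, class B, class... Rest>
std::string& sprint(std::string& s, const A& a, const B& b, const Rest&... rest) {
    s.reserve(s.size() + estimate(s, a) + estimate(s, b) +
              (std::size_t{0} + ... + estimate(s, rest)));
    sprint(s, a);
    sprint(s, b);
    (sprint(s, rest), ...);
    return s;
}

}  // namespace lib

namespace user {

struct point { int x, y; };

// User-provided hook and size hint, picked up via ADL.
inline void sprint(std::string& s, const point& p) {
    lib::sprint(s, "(", p.x, ",", p.y, ")");
}
inline std::size_t sprint_max_size(const std::string&, const point&) { return 25; }

}  // namespace user

inline void usage_example() {
    std::string s;
    lib::sprint(s, "p = ", user::point{1, -2}, "\n");
    assert(s == "p = (1,-2)\n");
}
```

A type that provides only the two-argument sprint still works; it simply contributes nothing to the up-front reservation.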
My intent was to format into a local/preallocated buffer, without any additional allocations, but I assume that won't work because `std::array` is lacking APIs for insertion. That probably means that you have to define a concept of the possible target, what operations it must support.
Right, if we opt for a generic version. It would again complicate the proposal though. -- Olaf
On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 01.01.2017 00:21, Andrey Semashev wrote:
throw std::runtime_error(format(std::string()) << "Error " << 47);
How would that differ from
throw std::runtime_error((std::ostringstream{} << "Error " << 47).str());
Simpler syntax, better performance -- Olaf
Hi, On 03.01.2017 14:20, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <cd@okunah.de> wrote:
On 01.01.2017 00:21, Andrey Semashev wrote:
throw std::runtime_error(format(std::string()) << "Error " << 47);
How would that differ from
throw std::runtime_error((std::ostringstream{} << "Error " << 47).str());
Simpler syntax, better performance
I see the chance for better performance, but as for the syntax I don't really see any remarkable improvement. If performance matters, I'd try boost::spirit::karma. The syntax will be less concise, but I am not aware of a faster generic solution.

    std::string message;
    message.reserve(16); // preallocate enough memory for the message
    std::back_insert_iterator<std::string> sink(message);
    if( !karma::generate(sink, "Error " << karma::uint_, 47u) ) {
        // formatting the error message failed. throw something else.
    }
    // since this is the exit of the function, the compiler might apply copy elision.
    throw std::runtime_error(message);

If you have multiple places like that in your code, I guess you'd like to wrap it into a generic function, and then you have a similar API to the "append()" proposal. Now I see how it might be useful, thanks. I think append() should rely on karma generators then, instead of yet another int to string implementation, because we already have five dozen of those.

Christof
On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 03.01.2017 14:20, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <cd@okunah.de> wrote:
On 01.01.2017 00:21, Andrey Semashev wrote:
throw std::runtime_error(format(std::string()) << "Error " << 47);
How would that differ from
throw std::runtime_error((std::ostringstream{} << "Error " << 47).str());
Simpler syntax, better performance
I see the chances for better performance, but for the syntax I don't really see any remarkable improvements.
The extra parentheses and the .str() part are annoying.. same goes for boost::format. How about this one? throw std::runtime_error("Error "s << 47);
If performance matters, I'd try with boost::spirit::karma. The syntax will
Performance matters but it's not the only thing that matters. What solution do you think someone new to C++ understands better?
If you have multiple places like that in your code, I guess, you'd like to wrap it into a generic function and you have a similar API to the "append()" proposal. Now I see, how it might be useful, thanks. I think, append() should rely on karma generators then, instead of yet another int to string implementation, because we only already have five dozens.
It's an implementation detail but yes, it might be useful. -- Olaf
Hi, On 03.01.2017 15:23, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <cd@okunah.de> wrote: The extra parentheses and the .str() part are annoying.. same goes for boost::format.
I see. For me that is not a big issue, but people are different.
How about this one?
throw std::runtime_error("Error "s << 47);
Uh. How does that work with

    std::cout << "Error "s << 47;

Will that be

    (std::cout << "Error "s) << 47;

or

    std::cout << ("Error "s << 47);

Also I'd expect a std::string to behave like a stream then and try to use e.g. manipulators. Maybe that would be acceptable with a different operator. e.g. like in SQL:

    throw std::runtime_error("Error "s || 47);

Now this is explicit:

    std::cout << "Error "s || 47;
    // versus
    std::cout << "Error "s << 47;

But then again this might behave surprisingly:

    throw std::runtime_error("Error "s || 47 || 11);

I still don't feel comfortable with it.
If performance matters, I'd try with boost::spirit::karma. The syntax will
Performance matters but it's not the only thing that matters. What solution do you think someone new to C++ understands better?
I think, mixing the notion of strings and streams, but not for e.g. manipulators would confuse people a lot. I am a big fan of overloading operators and expression templates, wherever they improve the expression of intent. In this particular case my gut feeling tells me, that it will harm the expression of intent more often, than it will improve. Christof
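To make the concern concrete, here is a small sketch assuming a hypothetical operator<< for std::string (declared in a private namespace for the example; it is not part of any library). Since << is left-associative, the unparenthesized form goes through the stream, so a manipulator such as std::hex applies; the parenthesized form concatenates first and the manipulator has no effect on the number.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>
#include <string_view>

namespace sketch {

// Hypothetical string "insertion" operators, for illustration only.
inline std::string operator<<(std::string s, std::string_view v) { s.append(v); return s; }
inline std::string operator<<(std::string s, int v) { s += std::to_string(v); return s; }

inline void usage_example() {
    using namespace std::string_literals;

    assert(("Error "s << 47) == "Error 47");

    // Groups as ((a << std::hex) << "v="s) << 255: the stream formats 255.
    std::ostringstream a;
    a << std::hex << "v="s << 255;
    assert(a.str() == "v=ff");

    // The string operator runs first, so std::hex never sees the number.
    std::ostringstream b;
    b << std::hex << ("v="s << 255);
    assert(b.str() == "v=255");
}

}  // namespace sketch
```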
On Tue, Jan 3, 2017 at 3:48 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 03.01.2017 15:23, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <cd@okunah.de> wrote: The extra parentheses and the .str() part are annoying.. same goes for boost::format.
I see. For me that is not a big issue, but people are different.
I think the language still doesn't allow temporaries to bind to mutable& so we might've cheered too soon.. :(
How about this one?
throw std::runtime_error("Error "s << 47);
Uh. How does that work with
std::cout << "Error "s << 47;
Will that be
(std::cout << "Error "s) << 47;
or
std::cout << ("Error "s << 47);
We've got rules for that.. http://en.cppreference.com/w/cpp/language/operator_precedence
Also I'd expect a std::string to behave like a stream then and try to use e.g. manipulators. Maybe that would be acceptable with a different operator. e.g. like in SQL:
throw std::runtime_error("Error "s || 47);
Now this is explicit:
std::cout << "Error "s || 47; // versus
Why would you want to combine << for ostream and string? Just use the ostream one in both cases. I think << is quite elegant. -- Olaf
Hi, On 03.01.2017 16:08, Olaf van der Spek wrote:
How about this one?
throw std::runtime_error("Error "s << 47);
Uh. How does that work with
std::cout << "Error "s << 47;
Will that be
(std::cout << "Error "s) << 47;
or
std::cout << ("Error "s << 47);
We've got rules for that.. http://en.cppreference.com/w/cpp/language/operator_precedence
I know, but you came up with C++ beginners. They already often get confused with streams and bit shifts, mostly when they come to C++ from C. Now we additionally mix expressions, where we can use manipulators with those, where we can't. That will probably make things worse not only for C++ beginners.
Also I'd expect a std::string to behave like a stream then and try to use e.g. manipulators. Maybe that would be acceptable with a different operator. e.g. like in SQL:
throw std::runtime_error("Error "s || 47);
Now this is explicit:
std::cout << "Error "s || 47; // versus
Why would you want to combine << for ostream and string? Just use the ostream one in both cases.
We just came across the idea, that appending to the string will be faster than writing to a stream. Therefore it might be a good idea to put together the string with appending and then hand the complete result to the stream. Anyway, the syntax I proposed would at least make the difference explicit for the reader. The programmer expresses the intent to either first concatenate the strings and then write to the stream or write to the stream directly.
I think << is quite elegant.
I feel very uneasy with it and I think I have presented quite some reasoning why. Also I don't seem to be the only one in this discussion. At least that should count as a strong indicator, that you need much better reasons to back your proposal. Christof
Hi Olaf,
I think << is quite elegant.
I feel very uneasy with it and I think I have presented quite some reasoning why. Also I don't seem to be the only one in this discussion. At least that should count as a strong indicator, that you need much better reasons to back your proposal.
I agree with Christof. Also think about consistency within the C++ standard library. Consistency is good, because it allows you to apply the same thinking elsewhere, instead of looking up everything in a reference. There should be a minimum of surprises in using a language and a library. Python does this very well; it is codified in the "There should be one - and preferably only one - obvious way to do it" rule.

For many years now, streams and containers have been established as separate things. std::string is a container, std::ostringstream is a stream. They have separate responsibilities. You should not mix these. Herb Sutter and other experts already use std::string as a prime example of a class with too many responsibilities (in the form of many member functions). As was said before, std::string should be a dynamic container of characters, nothing more.

Hans
On Wed, Jan 4, 2017 at 11:28 AM, Hans Dembinski <hans.dembinski@gmail.com> wrote:
Hi Olaf,
I think << is quite elegant.
I feel very uneasy with it and I think I have presented quite some reasoning why. Also I don't seem to be the only one in this discussion. At least that should count as a strong indicator, that you need much better reasons to back your proposal.
I agree with Christof. Also think about consistency within the C++ standard library. Consistency is good, because it allows you to apply the same thinking elsewhere, instead of looking up everything in a reference. There should be a minimum of surprises in using a language and a library. Python does this very well, it is codified in the "There should be one - and preferably only one - obvious way to do it" rule.
For many years now, streams and containers have been established as separate things. std::string is a container, std::ostringstream is a stream. They have separate responsibilities. You should not mix these. Herb Sutter and other experts already use std::string as a prime example of a class with too many responsibilities (in the form of many member functions). As was said before, std::string should be a dynamic container of characters, nothing more.
I agree with Herb and that's why those operators are NOT member functions.. Note that the proposed syntax could also support other containers. -- Olaf
On Tue, Jan 3, 2017 at 4:55 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 03.01.2017 16:08, Olaf van der Spek wrote:
How about this one?
throw std::runtime_error("Error "s << 47);
Uh. How does that work with
std::cout << "Error "s << 47;
Will that be
(std::cout << "Error "s) << 47;
or
std::cout << ("Error "s << 47);
We've got rules for that.. http://en.cppreference.com/w/cpp/language/operator_precedence
I know, but you came up with C++ beginners. They already often get confused with streams and bit shifts, mostly when they come to C++ from C. Now we additionally mix expressions, where we can use manipulators with those, where we can't. That will probably make things worse not only for C++ beginners.
Maybe, maybe not. Overloading yet another operator for a similar (but not equal) purpose is problematic too.
I think << is quite elegant.
I feel very uneasy with it and I think I have presented quite some reasoning why.
You have, but I don't think we have a significantly better alternative.
Also I don't seem to be the only one in this discussion. At least that should count as a strong indicator, that you need much better reasons to back your proposal.
-- Olaf
Hi, On 11.01.2017 09:28, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 4:55 PM, Christof Donat <cd@okunah.de> wrote:
I know, but you came up with C++ beginners. They already often get confused with streams and bit shifts, mostly when they come to C++ from C. Now we additionally mix expressions, where we can use manipulators with those, where we can't. That will probably make things worse not only for C++ beginners.
Maybe, maybe not. Overloading yet another operator for a similar (but not equal) purpose is problematic too.
That is true. Actually I think, the best option is a different class, that behaves just like a stream. Then it'll support manipulators, etc. No one will be surprised, that strings behave similar to streams, but not really the same, etc. We already have that class in the standard library: std::ostringstream. BTW: boost::format follows that same pattern as well. It just doesn't use the shift operator to prevent confusion with stream.
I think << is quite elegant.
I feel very uneasy with it and I think I have presented quite some reasoning why.
You have, but I don't think we have a significantly better alternative.
The significantly better alternative is std::ostringstream. I think our debate showed very nicely, that all the options, we have come up with to simplify the concatenation of strings are significantly inferior in total, though we could find examples, where they slightly improve readability. Christof
On Wed, Jan 11, 2017 at 10:02 AM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 11.01.2017 09:28, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 4:55 PM, Christof Donat <cd@okunah.de> wrote:
I know, but you came up with C++ beginners. They already often get confused with streams and bit shifts, mostly when they come to C++ from C. Now we additionally mix expressions, where we can use manipulators with those, where we can't. That will probably make things worse not only for C++ beginners.
Maybe, maybe not. Overloading yet another operator for a similar (but not equal) purpose is problematic too.
That is true. Actually I think, the best option is a different class, that behaves just like a stream. Then it'll support manipulators, etc. No one will be surprised, that strings behave similar to streams, but not really the same, etc. We already have that class in the standard library: std::ostringstream.
BTW: boost::format follows that same pattern as well. It just doesn't use the shift operator to prevent confusion with stream.
I think << is quite elegant.
I feel very uneasy with it and I think I have presented quite some reasoning why.
You have, but I don't think we have a significantly better alternative.
The significantly better alternative is std::ostringstream. I think our debate showed very nicely, that all the options, we have come up with to simplify the concatenation of strings are significantly inferior in total, though we could find examples, where they slightly improve readability.
Interface is important, as is performance.. and performance of ostringstream sucks.. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0067r1.html https://gist.github.com/anonymous/7700052 -- Olaf
Hi, On 11.01.2017 16:03, Olaf van der Spek wrote:
On Wed, Jan 11, 2017 at 10:02 AM, Christof Donat <cd@okunah.de> wrote:
The significantly better alternative is std::ostringstream. I think our debate showed very nicely, that all the options, we have come up with to simplify the concatenation of strings are significantly inferior in total, though we could find examples, where they slightly improve readability.
Interface is important, as is performance.. and performance of ostringstream sucks..
Well, but obviously up to now we didn't manage to come up with a solution that has a similarly simple, maybe even better interface and allows for faster implementations. I'll try again:

Basically, the idea of that kind of interface should work as well for a high performance solution. Let's start with this syntax:

    auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();

"cat" would return an object that collects all the values by reference, and concatenates them in one go in the call to str(). Then when str() runs, the information about the string length can be calculated, or at least estimated correctly, and no data needs to be copied unnecessarily. I am sure you see the structural similarity to the usage of std::ostringstream as well. I just chose to use the operator +, because

1. we don't have stream manipulators here, so we don't want people to think that we are talking about streams, and
2. strings already can be concatenated with operator +. Therefore this is an intuitive and not very surprising interface for string concatenation.

People might just think they can add numbers to strings as well without a "cat"-object. There is some room for confusion, but it is much less than people thinking strings are a kind of stream. Maybe "cat" is not the best name for this function template; it should express that here the concatenation includes conversions.

We also need an extensible way to define these conversions, because we also will want to add e.g. complex<myBigRelativeType, myBigRelativeType> to a string. We also will want to define the format, because sometimes we'll want to concat a number in decimal representation, sometimes in hexadecimal, octal, etc. I think these conversion functions should write to a string view. e.g.:

    template <> // the default conversion for this type
    void boost::cat::convert<complex<...>>(std::string_view& output, size_t max_length, const complex<...>& data);

We could use parameters to cat to define conversion functions.

    auto myString = (cat(hex<int>) + 42).str();

I am not sure now if that syntax is so great to define the conversion functions. We'll have to discuss.

Of course there should be an optimized default conversion function for cat-objects as well:

    auto myString = (cat() + "Hello"s + " "s + "World! "s + 42 + (cat(hex<int>) + " "s + 42)).str();

Here the outer cat-object would "steal" the values from the inner cat-object and use its conversion functions directly with a view on the big result string.

I understand that the syntax is not 100% what you try to achieve, but I still think that conversion to string should not be directly done on a string object. This specific object to do concatenation and conversion actually is the key to gain the maximum possible performance, because it can delay the actual concatenation and conversion to a point where all necessary information is available.

Christof
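A rough sketch of the basic cat() idea, assuming C++17: operator+ only records references in a tuple, and str() sizes the result, reserves once, and then writes every piece. Only the plain case is covered; the conversion parameters (such as hex<int>) and nested cat objects from the message are left out, and the helper names cat_expr, piece_size and piece_write are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

namespace sketch {

inline std::size_t piece_size(std::string_view v) { return v.size(); }
inline std::size_t piece_size(int) { return std::numeric_limits<int>::digits10 + 2; }

inline void piece_write(std::string& out, std::string_view v) { out.append(v); }
inline void piece_write(std::string& out, int v) { out.append(std::to_string(v)); }

// Collects references to the operands; nothing is copied until str().
template <class... Ts>
class cat_expr {
public:
    explicit cat_expr(std::tuple<const Ts&...> parts) : parts_(parts) {}

    template <class T>
    cat_expr<Ts..., T> operator+(const T& x) const {
        return cat_expr<Ts..., T>(std::tuple_cat(parts_, std::tuple<const T&>(x)));
    }

    // Estimate the total size, allocate once, then write all pieces.
    std::string str() const {
        std::string out;
        std::apply([&out](const auto&... p) {
            out.reserve((std::size_t{0} + ... + piece_size(p)));
            (piece_write(out, p), ...);
        }, parts_);
        return out;
    }

private:
    std::tuple<const Ts&...> parts_;
};

inline cat_expr<> cat() { return cat_expr<>(std::tuple<>()); }

inline void usage_example() {
    using namespace std::string_literals;
    auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();
    assert(myString == "Hello World! 42");
}

}  // namespace sketch
```

Because the object stores references, str() has to be called within the same full expression as the operands, as in the usage above; keeping a cat_expr built from temporaries in a variable would leave it with dangling references.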
Hi,

I've been working on a format library that contains a function template called appendf that can do this:

    std::string str("blabla");
    boost::stringify::appendf(str) () (" AAA ", 25, " BBB ", {255, "x"});
    assert(str == "blabla AAA 25 BBB ff"); // 255 formatted in hexadecimal

This library - I call it Boost.Stringify - is at a very early stage of development and I thought it would be too premature to mention it now. But given the response in this thread so far, I changed my mind. So I will soon publish it on GitHub and start writing brief documentation so that I can present it here to gauge interest.

On Wed, Jan 11, 2017 at 2:45 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 11.01.2017 16:03, Olaf van der Spek wrote:
On Wed, Jan 11, 2017 at 10:02 AM, Christof Donat <cd@okunah.de> wrote:
The significantly better alternative is std::ostringstream. I think our debate showed very nicely, that all the options, we have come up with to simplify the concatenation of strings are significantly inferior in total, though we could find examples, where they slightly improve readability.
Interface is important, as is performance.. and performance of ostringstream sucks..
Well, but obviously up to now we didn't manage to come up with a solution, that has a similar simple, maybe even better interface and allows for faster implementations.
I'll try again:
Basically the Idea of that kind of interface should work as well for a high performance solution. Let's start with this syntax:
auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();
"cat" would return an object that collects all the values by reference, and concatenates them in one go in the call to str(). Then when str() runs, the information about the string length can be calculated, or at least estimated correctly and no data needs to be copied unnecessarily. I am sure, you see the structural similarity to the usage of std::ostringstream as well. I just chose to use the operator +, because
1. we don't have stream manipulators here, so we don't want people to think, that we are talking about streams and 2. strings already can be concatenated with operator +. Therefore this is an intuitive and not very surprising interface for string concatenation.
People might just think, they can add numbers to strings as well without a "cat"-object. There is some room for confusion, but it is much less, than people thinking, strings are a kind of streams. Maybe "cat" is not the best name for this function template, it should express, that here the concatenation includes conversions.
We also need an extensible way to define these conversions, because we also will want to add e.g. complex<myBigRelativeType, myBigRelativeType> to a string. We also will want to define the format, because sometimes, we'll want to concat a number in decimal representation, sometimes in hexadecimal, octal, etc. I think, these conversion functions should write to a string view. e.g.:
template <> // the default conversion for this type void boost::cat::convert<complex<...>>(std::string_view& output, size_t max_length, const complex<...>& data);
We could use parameters to cat to define conversion functions.
auto myString = (cat(hex<int>) + 42).str();
I am not sure now, if that syntax is so great to define the conversion functions. We'll have to discuss.
Of course there should be an optimized default conversion function for cat-objects as well:
auto myString = (cat() + "Hello"s + " "s + "World! "s + 42 + (cat(hex<int>) + " "s + 42)).str();
Here the outer cat-object would "steal" the values from the inner cat-object and use its conversion functions directly with a view on the big result string.
I understand, that the syntax is not 100% what you try to achieve, but I still think, that conversion to string should not be directly done on a string object. This specific object to do concatenation and conversion actually is the key to gain the maximum possible performance, because it can delay the actual concatenation and conversion to a point, where all necessary information is available.
Christof
Sorry for top-posting in the previous message. It wasn't intentional.

Anyway, you can take a look at the source: https://github.com/robhz786/stringify

Although there is no documentation, the unit tests and performance tests provide some usage examples. I think it solves this variadic string append perfectly. However, it will take time to get ready. As I said, perhaps it is premature to gauge interest. Still, any interest?

This is a C++14 format library that:

- has good performance.
- allows the user to extend input types and output types. When adding a new input type, the user can create formatting options specifically for that input type.
- is highly customizable. The user will be able to customize, for instance:
  - how the width is calculated. It can be simply the length of the string, or it can be the count of Unicode code points, or it can be even more sophisticated.
  - the numeric digits. For instance, to use Arabic digits.
- supports all character types.
- is UTF friendly. For instance, while in std::ostream and others the fill must be a single char, in Boost.Stringify it is a char32_t. If you are writing to char*, it is converted into UTF-8 by default, but you can customize that.
- currently can write to std::basic_string and char*. It also supports FILE* and std::basic_streambuf.
- is not locale sensitive. This is not necessarily always an advantage. But it would be nice to have one format that is totally locale insensitive, and thus ensures that it will always write exactly what the user expects.

Note: I did not test it in Visual Studio (gcc and clang only).
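To illustrate why the width calculation is worth customizing, here is a minimal sketch of the difference between the byte length of a UTF-8 string and its count of code points. The helper count_code_points is hypothetical and not taken from Boost.Stringify; it assumes the input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of Boost.Stringify.
// Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
// Assumes the input is valid UTF-8.
constexpr std::size_t count_code_points(std::string_view utf8) {
    std::size_t n = 0;
    for (const char c : utf8) {
        if ((static_cast<unsigned char>(c) & 0xC0u) != 0x80u) {
            ++n;
        }
    }
    return n;
}
```

For the UTF-8 encoding of "héllo", size() reports 6 bytes while the function above reports 5 code points, so padding to a fixed width gives different results depending on which measure is used.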
On Wed, Jan 11, 2017 at 5:45 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
Am 11.01.2017 16:03, schrieb Olaf van der Spek:
On Wed, Jan 11, 2017 at 10:02 AM, Christof Donat <cd@okunah.de> wrote:
The significantly better alternative is std::ostringstream. I think our debate showed very nicely, that all the options, we have come up with to simplify the concatenation of strings are significantly inferior in total, though we could find examples, where they slightly improve readability.
Interface is important, as is performance.. and performance of ostringstream sucks..
Well, but obviously up to now we didn't manage to come up with a solution, that has a similar simple, maybe even better interface and allows for faster implementations.
I'll try again:
Basically the Idea of that kind of interface should work as well for a high performance solution. Let's start with this syntax:
auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();
I like fmt's syntax much more: string s = fmt::format("The answer is {}", 42); http://fmtlib.net/latest/index.html I really don't get why we should bother with the (cat(...)).str() bit.
"cat" would return an object that collects all the values by reference, and concatenates them in one go in the call to str(). Then when str() runs, the information about the string length can be calculated, or at least estimated correctly and no data needs to be copied unnecessarily. I am sure, you see the structural similarity to the usage of std::ostringstream as well. I just chose to use the operator +, because
1. we don't have stream manipulators here, so we don't want people to think, that we are talking about streams and 2. strings already can be concatenated with operator +. Therefore this is an intuitive and not very surprising interface for string concatenation.
People might just think, they can add numbers to strings as well without a "cat"-object. There is some room for confusion, but it is much less, than people thinking, strings are a kind of streams. Maybe "cat" is not the best name for this function template, it should express, that here the concatenation includes conversions.
We also need an extensible way to define these conversions, because we also will want to add e.g. complex<myBigRelativeType, myBigRelativeType> to a string. We also will want to define the format, because sometimes, we'll want to concat a number in decimal representation, sometimes in hexadecimal, octal, etc. I think, these conversion functions should write to a string view. e.g.:
template <> // the default conversion for this type void boost::cat::convert<complex<...>>(std::string_view& output, size_t max_length, const complex<...>& data);
We could use parameters to cat to define conversion functions.
auto myString = (cat(hex<int>) + 42).str();
If you want that kind of formatting I'd really go for http://fmtlib.net/latest/syntax.html
I am not sure now, if that syntax is so great to define the conversion functions. We'll have to discuss.
Of course there should be an optimized default conversion function for cat-objects as well:
auto myString = (cat() + "Hello"s + " "s + "World! "s + 42 + (cat(hex<int>) + " "s + 42)).str();
Here the outer cat-object would "steal" the values from the inner cat-object and use its conversion functions directly with a view on the big result string.
I understand, that the syntax is not 100% what you try to achieve, but I still think, that conversion to string should not be directly done on a string object. This specific object to do concatenation and conversion actually is the key to gain the maximum possible performance, because it can delay the actual concatenation and conversion to a point, where all necessary information is available.
True, but then we're back again to an append / sprint style function. -- Olaf
Hi, Am 12.01.2017 20:05, schrieb Olaf van der Spek:
On Wed, Jan 11, 2017 at 5:45 PM, Christof Donat <cd@okunah.de> wrote:
Basically the Idea of that kind of interface should work as well for a high performance solution. Let's start with this syntax:
auto myString = (cat() + "Hello"s + " "s + "World! "s + 42).str();
I like fmt's syntax much more:
string s = fmt::format("The answer is {}", 42);
http://fmtlib.net/latest/index.html
I really don't get why we should bother with the (cat(...)).str() bit.
Format will have to parse the format string at runtime and then dynamically choose the function to stringify the number. The cat() approach can be implemented to make that choice at compile time. So no string parsing and no dynamic choice of formatting functions. We were talking about performance in this part of the thread, weren't we?

Christof
Christof Donat wrote:
Am 12.01.2017 20:05, schrieb Olaf van der Spek:
I like fmt's syntax much more:
string s = fmt::format("The answer is {}", 42);
fmtlib looks pretty cool. In fact it's basically a superset of my private library. Some of the design decisions aren't the same but the overall style is very similar.
Format will have to parse the format string at runtime and then chose dynamically the function to stringify the number.
It does, of course, parse the format string, but I'm not sure what you mean by "choose function dynamically." The above basically does something like os << "The answer is "; os << 42; return os.str(); using one virtual call for the os << 42 part.
What about the typesafe sprintf from Abel in the context of this discussion? A library based on it could forward to either a compile-time or a runtime version and benefit from the ease of the printf syntax. http://abel.web.elte.hu/mpllibs/safe_printf/sprintf.html

I think the API of sprintf is simple and good; it just needs to be made typesafe and overloaded on a string concept, like a formatted print algorithm. Any other new API will fail the way Boost.Format did: Boost.Format is good but hasn't gained much traction, because its added value, weighed against its complexity, doesn't make it better than sprintf in use. I mean: str(boost::format("%1%") % myvar) is not so much cooler than stream formatting or sprintf.

So why not stick with a modernized sprintf for formatting?

-- Damien Buhl Alias daminetreg
Hi, Am 13.01.2017 17:29, schrieb Peter Dimov:
Christof Donat wrote:
Am 12.01.2017 20:05, schrieb Olaf van der Spek:
I like fmt's syntax much more:
string s = fmt::format("The answer is {}", 42);
fmtlib looks pretty cool. In fact it's basically a superset of my private library. Some of the design decisions aren't the same but the overall style is very similar.
Format will have to parse the format string at runtime and then chose dynamically the function to stringify the number.
It does, of course, parse the format string, but I'm not sure what you mean by "choose function dynamically." The above basically does something like
os << "The answer is "; os << 42; return os.str();
using one virtual call for the os << 42 part.
A call to a virtual function is a dynamic choice of the function to call. A non-virtual call can be inlined. Even if it is not, it is simply an absolute jump, while a virtual call at least has to load the vtable pointer and then load the pointer to the function from said vtable before it can jump to the function.

But that is not the worst part of virtual function calls. With non-virtual calls, the branch prediction will always be correct, the upcoming instructions will already be loaded and fed into the pipeline, and the code runs almost as fast as the inlined version (ignoring parameter passing here, which is where inlining really pays off). Virtual function calls give the branch prediction only a statistical chance to choose correctly. When it fails, the instruction cache has to be filled with instructions from the correct branch, while the prefetched instructions that were already partially executed in the pipeline have to be rolled back. Depending on your hardware, that might cost up to several dozen clock cycles in case of a cache miss.

We were talking about performance here. Therefore I tried to come up with a syntax that can be implemented without virtual function calls as much as possible and avoids parsing format strings. That syntax should also allow for creating the result in one go, in order to avoid unnecessary reallocations. Keeping that in mind, I think my proposal is not the worst possible, though, of course, someone might come up with a better one.

Libraries like boost::format have their strengths when simply concatenating information does not fit the needs, e.g. for internationalized text output. I like boost::format a lot, but speed actually is not its focus.

auto translate(const std::string& key) -> std::string;
// ...
// will translate to e.g. "%2% Bytes werden von der Datei '%1%' belegt."
std::cerr << boost::format(translate("file '%1%' contains %2% bytes"s)) % filename % filesize << std::endl;

On the other hand, e.g. for writing a set of values as CSV, my proposed interface will produce faster code:

for(const auto& line : all_lines)
    out << (cat() + line<1> + ',' + line<2> + ',' + line<3> + ',' + line<4> + ',' + line<5> + '\n').str();

Christof
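A minimal sketch of the CSV case, assuming each line is a std::tuple and C++17 is available. The names put and append_csv_line are hypothetical and stand in for the cat() interface; the point is only that the conversion for each field is selected by overload resolution at compile time, with no format string to parse and no virtual call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical per-type conversions; the overload is chosen at compile time.
inline void put(std::string& out, const std::string& s) { out += s; }
inline void put(std::string& out, int v) { out += std::to_string(v); }

// Appends the fields of one line, separated by commas, plus a newline.
template <class... Fields>
void append_csv_line(std::string& out, const std::tuple<Fields...>& line) {
    std::apply(
        [&out](const auto&... field) {
            std::size_t i = 0;
            // Emit a comma before every field except the first.
            ((out += (i++ ? "," : ""), put(out, field)), ...);
        },
        line);
    out += '\n';
}
```

Appending every line to the same output string also means the buffer is reused across lines, which is the other half of the performance argument made in this thread.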
On 15/01/2017 14:39, Christof Donat wrote:
We were talking about performance here. Therefore I tried to come up with a syntax, that can be implemented without virtual function calls as much as possible and avoids parsing format strings. Also that syntax should allow for creating the result in one go, in order to avoid unnecessary reallocations. Keeping that in mind, I think, my proposal is not the worst possible, though, of course, someone might come up with a better one.
Yes, sorry, I got distracted by the discussion about formatting, because your API idea does indeed look like a good choice when it comes to the initial problem of this discussion: appending / concatenating. Users might find a variadic concat(...) API more intuitive, though. But to close the digression about formatting: with something like Abel Sinkovics's snprintf, in principle you only pay for the parsing at compile time, and at runtime you have your string formatted with no or fewer dynamic allocations.

Libraries like boost::format have their strengths, when simply concatenating information does not fit the needs. E.g. for internationalized text output. I like boost::format a lot, but speed actually is not its focus.

I personally use boost::format a lot, but I think boost::format would be easier and shorter to use if it were callable as an `std::string snprintf(format_str, args...)`. And it would be awesome if format_str, when already known at compile time, were handled with Metaparse. Sorry for getting off-topic. :D
-- Damien Buhl
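The idea of paying for format-string parsing at compile time can be illustrated with a small constexpr function. This is only a sketch under stated assumptions: count_specs is a hypothetical helper, it is not how mpllibs safe_printf or Metaparse work internally, and it only counts conversions without validating them.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts printf-style conversions in a format string,
// treating "%%" as a literal percent sign. It does not validate the
// conversion characters themselves.
constexpr std::size_t count_specs(std::string_view fmt) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '%') {
            continue;
        }
        if (i + 1 < fmt.size() && fmt[i + 1] == '%') {
            ++i;  // skip the second '%' of a literal "%%"
            continue;
        }
        ++n;
    }
    return n;
}

// The count is known to the compiler, so a mismatch with the number of
// arguments could be rejected before the program runs.
static_assert(count_specs("%d bytes in %s") == 2, "two conversions expected");
static_assert(count_specs("100%% of %d") == 1, "%% is not a conversion");
```

A real checker would also have to compare each conversion against the type of the corresponding argument, which is the part a typesafe printf has to solve.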
On Sun, Jan 15, 2017 at 9:42 PM, Damien Buhl <damien.buhl@lecbna.org> wrote:
On 15/01/2017 14:39, Christof Donat wrote:
We were talking about performance here. Therefore I tried to come up with a syntax, that can be implemented without virtual function calls as much as possible and avoids parsing format strings. Also that syntax should allow for creating the result in one go, in order to avoid unnecessary reallocations. Keeping that in mind, I think, my proposal is not the worst possible, though, of course, someone might come up with a better one.
Yes sorry I've been distracted by the discussion about formatting, because indeed your API idea looks like a good choice when it come to the initial problem of this discussion : appending / concatenating. While users might find more intuitive a variadic concat(...) api.
But to close the ellipsis about formatting : with something like Abel Sincovick's snprintf in principle you only pay for the parsing at compile-time and at runtime you have your string formatted with no/less dynamic allocation.
http://abel.web.elte.hu/mpllibs/safe_printf/snprintf.html It appears to only do checking at compile-time and then forwards to sprintf..
Libraries like boost::format have their strengths, when simply concatenating information does not fit the needs. E.g. for internationalized text output. I like boost::format a lot, but speed actually is not its focus. I personally uses boost::format alot, but I think boost::format would be easier and shorter to use if it would be callable with an `std::string snprintf(format_str, args...)`. And it would be awesome that format_str would be, if already known at compile time handled with Metaparse. Sorry for getting off-topic. :D
Can't boost::format be updated to a variadic variant? -- Olaf
Am 16.01.2017 09:54, schrieb Olaf van der Spek:
On Sun, Jan 15, 2017 at 9:42 PM, Damien Buhl <damien.buhl@lecbna.org> wrote:
On 15/01/2017 14:39, Christof Donat wrote:
Libraries like boost::format have their strengths, when simply concatenating information does not fit the needs. E.g. for internationalized text output. I like boost::format a lot, but speed actually is not its focus.
I personally use boost::format a lot, but I think boost::format would be easier and shorter to use if it were callable as an `std::string snprintf(format_str, args...)`. And it would be awesome if format_str, when already known at compile time, were handled with Metaparse. Sorry for getting off-topic. :D
Can't boost::format be updated to a variadic variant?
I think so. Something like this should be possible, of course:

#include <boost/format.hpp>
#include <string>

// operator% modifies the format object, so it is passed by non-const reference
auto format_args_impl(boost::format& f) -> boost::format& { return f; }

template<typename Arg1, typename... Args>
auto format_args_impl(boost::format& f, const Arg1& arg1, const Args&... args) -> boost::format& {
    return format_args_impl(f % arg1, args...);
}

template<typename... Args>
auto format_args(const std::string& f, const Args&... args) -> std::string {
    boost::format fmt(f); // build the format object once, then feed it the arguments
    return format_args_impl(fmt, args...).str();
}

// ...

auto first_which = format_args("When shall we %1% meet again? In %2%, %3%, or in %4%?"s, 3, "thunder"s, "lightning"s, "rain"s);

I have not tried to compile this, so please just take it as a sketch. It inherits most of the advantages and disadvantages of boost::format, of course.

Christof
Sorry to chime in so late in the discussion.

What about a syntax similar to this?

int main() {
    auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
    std::cout << s << std::endl;

    s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana"));
    std::cout << s << std::endl;
}

Which would produce the following output:

Hello , World. The hex for 58 is 3a
a : b : c8 : "banana"

Sample implementation (io manipulators may be incomplete, some efficiency gains could be made by re-implementing ostringstream more cleverly):

#include <sstream>
#include <iostream>
#include <iomanip>

namespace detail {
    template<class SepStr>
    struct separator_object {
        template<class T>
        std::ostream& operator ()(std::ostream& s, T&& t) const {
            return s << sep << t;
        }

        //
        // other iomanip specialisations here
        //
        std::ostream& operator ()(std::ostream& s, std::ios_base&(*t)(std::ios_base&)) const {
            t(s);
            return s;
        }

        SepStr const& sep;
    };

    struct no_separator_object {
        template<class T>
        std::ostream& operator ()(std::ostream& s, T&& t) const {
            return s << t;
        }
    };

    template<class Separator, class String, class...Rest>
    auto join(Separator&& sep, String&& s, Rest&&...rest) {
        std::ostringstream ss;
        ss << s;
        using expand = int [];
        void(expand{0, ((sep(ss, rest)), 0)... });
        return ss.str();
    };
}

template<class Sep>
static constexpr auto separator(Sep const& sep) {
    using sep_type = std::remove_const_t<std::remove_reference_t<Sep>>;
    return detail::separator_object<sep_type> { sep };
}

template<class SepObject, class String, class...Rest>
auto join(const detail::separator_object<SepObject>& sep, String&& s, Rest&&...rest) {
    return detail::join(sep, std::forward<String>(s), std::forward<Rest>(rest)...);
};

template<class String, class...Rest>
auto join(String&& s, Rest&&...rest) {
    return detail::join(detail::no_separator_object(), std::forward<String>(s), std::forward<Rest>(rest)...);
};

int main() {
    auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
    std::cout << s << std::endl;

    s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana"));
    std::cout << s << std::endl;
}
On Mon, Jan 16, 2017 at 11:41 AM, Richard Hodges <hodges.r@gmail.com> wrote:
Sorry to chime in so late in the discussion.
What about a syntax similar to this?
int main() { auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58); std::cout << s << std::endl;
s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana")); std::cout << s << std::endl;
}
Which would produce the following output:
Hello , World. The hex for 58 is 3a
a : b : c8 : "banana"
The syntax is fine but it's missing an appending variant, like append(s, "A", "B", 42); This variant is important as it (also) allows you to reuse existing storage.
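A minimal sketch of such an appending variant, assuming C++17. The namespace sketch and the helpers max_size and append_one are hypothetical, and only string-like values and int are handled. The existing contents and capacity of s are kept, and at most one reallocation happens per call.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {  // hypothetical namespace and helper names

// Upper bound on the characters each argument needs.
inline std::size_t max_size(std::string_view v) { return v.size(); }
inline std::size_t max_size(int) { return 11; }  // "-2147483648" for a 32-bit int

inline void append_one(std::string& s, std::string_view v) { s.append(v); }
inline void append_one(std::string& s, int v) {
    char buf[16];
    const auto res = std::to_chars(buf, buf + sizeof buf, v);
    s.append(buf, static_cast<std::size_t>(res.ptr - buf));
}

// Appends all arguments to s in order and returns s.
template <class... Args>
std::string& append(std::string& s, const Args&... args) {
    s.reserve(s.size() + (std::size_t{0} + ... + max_size(args)));
    (append_one(s, args), ...);
    return s;
}

}  // namespace sketch
```

With this, sketch::append(s, "A", "B", 42) on a string that already holds "x" leaves it holding "xAB42". Extending it to other types would mean adding further max_size and append_one overloads.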
That's pretty straightforward with another overload:

auto& s = join(to(y), separator(", "), "A", "b", 42);

where to(y) is something like

template<class String> struct to_existing_type { String& get() { return s_; } String& s_; };

template<class String> auto to(String& s) { return to_existing_type<String>{s}; }

With a bit of template unwrapping, we could imagine something like this:

join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz", 42);

which would return a tuple:

std::tuple<std::string&, std::string&, std::string>

In C++17 this would allow:

auto&& [x, y, z] = join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz", 42);

But this may be taking it a bit far... What do you think?
Ok. here's my second attempt. It has 2 improvements: 1. allows appending to an existing string by specifiying std::string& join(onto(s), [optional separator("xxx"), parts...); 2. replaces use of std::ostringstream with a (very simple!) version that does string appending. #include <sstream> #include <iostream> #include <iomanip> struct string_ref_buffer : std::streambuf { using inherited = std::streambuf; using char_type = inherited::char_type; using char_traits = std::char_traits<char_type>; int overflow(int c) override { if (c != char_traits::eof()) { buffer_.push_back(c); } return char_traits::not_eof(c); } string_ref_buffer(std::string& buffer) : buffer_(buffer) { } const std::string& str() const & { return buffer_; } std::string&& str()&& { return std::move(buffer_); } std::string& buffer_; std::size_t inpos_ = 0; std::size_t outpos_ = 0; }; namespace detail { template<class SepStr> struct separator_object { template<class T> std::ostream& operator ()(std::ostream& s, T&& t) const { return s << sep << t; } // // other iomanp specialisations here // std::ostream& operator ()(std::ostream& s, std::ios_base&(*t)(std::ios_base&)) const { t(s); return s; } SepStr const& sep; }; struct no_separator_object { template<class T> std::ostream& operator ()(std::ostream& s, T&& t) const { return s << t; } }; template<class Target, class Separator, class...Rest> auto& join_onto(Target&& target, Separator&& sep, Rest&&...rest) { string_ref_buffer sbuf { target.str() }; std::ostream ss(std::addressof(sbuf)); using expand = int []; void(expand{0, ((sep(ss, rest)), 0)... }); return target; }; template<class Separator, class String, class...Rest> auto join(Separator&& sep, String&& s, Rest&&...rest) { std::string result {}; string_ref_buffer sbuf { result }; std::ostream ss(std::addressof(sbuf)); ss << s; using expand = int []; void(expand{0, ((sep(ss, rest)), 0)... 
}); return result; }; template<class String> struct onto_type { String& str() { return target_.get(); } std::reference_wrapper<String> target_; }; } template<class String> auto onto(String& target) { return detail::onto_type<String> { target }; } template<class Sep> static constexpr auto separator(Sep const& sep) { using sep_type = std::remove_const_t<std::remove_reference_t<Sep>>; return detail::separator_object<sep_type> { sep }; } template<class SepObject, class String, class...Rest> decltype(auto) join(detail::separator_object<SepObject> sep, String&& s, Rest&&...rest) { return detail::join(sep, std::forward<String>(s), std::forward<Rest>(rest)...); }; template<class String, class...Rest> decltype(auto) join(String&& s, Rest&&...rest) { return detail::join(detail::no_separator_object(), std::forward<String>(s), std::forward<Rest>(rest)...); }; template<class Target, class SepObject, class String, class...Rest> decltype(auto) join(detail::onto_type<Target> target, detail::separator_object<SepObject> sep, String&& s, Rest&&...rest) { return detail::join_onto(target, sep, std::forward<String>(s), std::forward<Rest>(rest)...); }; template<class Target, class String, class...Rest> decltype(auto) join(detail::onto_type<Target> target, String&& s, Rest&&...rest) { return detail::join_onto(target, detail::no_separator_object(), std::forward<String>(s), std::forward<Rest>(rest)...); }; int main() { auto s= std::string("foo"); s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58); std::cout << s << std::endl; s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana")); std::cout << s << std::endl; join(onto(s), separator(", "), "funky", "chicken"); join(onto(s), "====="); std::cout << s << std::endl; } expected output: Hello , World. The hex for 58 is 3a a : b : c8 : "banana" a : b : c8 : "banana", funky, chicken===== On 18 January 2017 at 09:53, Richard Hodges <hodges.r@gmail.com> wrote:
That's pretty straightforward with another overload:
auto& s = join(to(y), separator(", "), "A", "b", 42);
where to(y) is something like
template<String> struct to_existing_type<String> { String& get() { return s_; } String s_; };
template<class String> auto to(String& s) { return to_existing_type<S>(s); }
With a bit of template unwrapping, we could imagine something like this:
join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz", 42);
which would return a tuple:
std::tuple<std::string&, std::string&, std::string>
in c++17 this would allow:
auto&& [x, y, z] = join(to(x), 2, 3, to(y), "foo", "bar", create(), "baz", 42);
But this maybe taking it a bit far... What do you think?
On 18 January 2017 at 09:06, Olaf van der Spek <ml@vdspek.org> wrote:
On Mon, Jan 16, 2017 at 11:41 AM, Richard Hodges <hodges.r@gmail.com> wrote:
Sorry to chime in so late in the discussion.
What about a syntax similar to this?
int main() { auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58); std::cout << s << std::endl;
s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana")); std::cout << s << std::endl;
}
Which would produce the following output:
Hello , World. The hex for 58 is 3a
a : b : c8 : "banana"
The syntax is fine but it's missing an appending variant, like append(s, "A", "B", 42); This variant is important as it (also) allows you to reuse existing storage.
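A minimal sketch of the appending variant asked for here, assuming C++17. The namespace append_sketch and the helper append_one are hypothetical names chosen for illustration, not part of any proposed Boost interface. Each argument is written straight into the target string, so its existing capacity is reused and no intermediate std::string is created.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <type_traits>

namespace append_sketch {

// Text-like arguments (literals, std::string, std::string_view) are copied
// straight into the target, with no intermediate std::string.
inline void append_one(std::string& out, std::string_view text) {
    out.append(text.data(), text.size());
}

inline void append_one(std::string& out, char c) { out.push_back(c); }

// Integers are formatted into a small stack buffer and then appended.
template <class T,
          std::enable_if_t<std::is_integral_v<T> && !std::is_same_v<T, char> &&
                               !std::is_same_v<T, bool>,
                           int> = 0>
void append_one(std::string& out, T value) {
    char buf[std::numeric_limits<T>::digits10 + 3];
    const auto result = std::to_chars(buf, buf + sizeof buf, value);
    out.append(buf, static_cast<std::size_t>(result.ptr - buf));
}

// Appends every argument to `out`, in order, and returns `out`.
template <class... Args>
std::string& append(std::string& out, const Args&... args) {
    (append_one(out, args), ...);
    return out;
}

// Usage, mirroring the call from the message above.
inline void demo() {
    std::string s = "x";
    append(s, "A", "B", 42);
    assert(s == "xAB42");
}

}  // namespace append_sketch
```

Extending this to a user type would, under these assumptions, only need another append_one overload.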
Hi, Am 18.01.2017 10:59, schrieb Richard Hodges:
int main() { auto s= std::string("foo");
s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58); std::cout << s << std::endl;
s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana")); std::cout << s << std::endl;
join(onto(s), separator(", "), "funky", "chicken"); join(onto(s), "====="); std::cout << s << std::endl; }
I do like the idea, but not the naming. With a function name "join" I'd expect to be able to pass an iterator range and have all its elements concatenated into a string with a defined separator, like this:
auto my_number = std::vector<int>{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::cout << join(std::begin(my_number), std::end(my_number), ", ") << std::endl;
expected output:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Maybe the name cat() or concatenate() would fit better here.
Also, I am not 100% happy with the reuse of iostream manipulators. They already don't have a distinct range of effect in the iostream library. Another issue is that the way they work is pretty complicated to reuse in a high-performance implementation.
How about an API like this:
s = concat("Hello ", ", World.", " The hex for ", 58, " is ", concat_lib::hex(58));
s = concat(separator(" : "), "a", "b", concat_lib::hex(100), std::quoted("banana"));
concat(onto(s), separator(", "), "funky", "chicken"); // will return a reference to s I guess
concat(onto(s), "=====");
This would go well with what I'd expect with the function name join():
s = join(std::begin(my_number), std::end(my_number)); // -> "12345678910"
s = join(separator(", "), std::begin(my_number), std::end(my_number)); // -> "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
join(onto(s), separator(" - "), std::begin(my_number), std::end(my_number)); // -> "1, 2, 3, 4, 5, 6, 7, 8, 9, 101 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10"
Now, when we put it like that, there is no way to let the functions "steal" work from each other. Take this, for example:
concat(onto(s), join(separator(", "), std::begin(my_number), std::end(my_number)), " ", join(separator(" - "), std::begin(my_number), std::end(my_number)));
This will first execute the two join() calls, which return strings that are then passed to concat(). Instead, join() and concat() could return objects that can be converted to std::string. Then concat() can make the result of join() render its output into a common buffer.
auto s = concat(join(separator(", "), std::begin(my_number), std::end(my_number)), " ", join(separator(" - "), std::begin(my_number), std::end(my_number))).str();
Christof
I agree re naming. concat() or concatenate() seems a better fit.
I also share your unease re formatting. I have of course shoehorned the ostream formatters in here without thinking it through deeply (there is also the issue of the separator getting formatted, so any sequence of formatters must be executed *after* the separator has been applied, and formatters really ought to restore previous state).
I think it's reasonable to develop a series of lightweight function object factories that provide trivial formatting objects in the same way that std::quoted does for std::ostream. In that case, it is perfectly reasonable to do away with the entire atrocious ostream performance bottleneck altogether (perhaps there are questions, though, about formatting options and locales?).
Totally agree with returning a string factory. That makes perfect sense.
onto(x) could return the correct kind of wrapper, depending on the argument type of x. So it could cope with x being, for example, std::string&, std::string const&, std::string&& or std::ostream&.
As an observation, expressing the join as an iterator pair lends itself to being implemented in terms of std::copy(first, last, formatting_iterator<...>).
I think this is good for containers, but for a series of disjoint types, or for joining words (as opposed to letters), you'd still need some templatery.
boost::range springs to mind as a reasonable helper for expressing to concat (or join) that you want to treat each element of a container.
On 18 January 2017 at 16:11, Christof Donat <cd@okunah.de> wrote:
Hi,
Am 18.01.2017 10:59, schrieb Richard Hodges:
int main() { auto s= std::string("foo");
s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58); std::cout << s << std::endl;
s = join(separator(" : "), "a", "b", std::hex, 200 , std::quoted("banana")); std::cout << s << std::endl;
join(onto(s), separator(", "), "funky", "chicken"); join(onto(s), "====="); std::cout << s << std::endl; }
I do like the idea, but not the naming. With a function name "join" I'd expect to be able to pass an iterator range and have all its elements concatenated into a string with a defined separator, like this:
auto my_number = std::vector<int>{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; std::cout << join(std::begin(my_number), std::end(my_number), ", ") << std::endl;
expected output:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Maybe the name cat() or concatenate() would fit better here.
Also, I am not 100% happy with the reuse of iostream manipulators. They already don't have a distinct range of effect in the iostream library. Another issue is that the way they work is pretty complicated to reuse in a high-performance implementation.
How about an API like this:
s = concat("Hello ", ", World.", " The hex for ", 58, " is ", concat_lib::hex(58));
s = concat(separator(" : "), "a", "b", concat_lib::hex(100), std::quoted("banana"));
concat(onto(s), separator(", "), "funky", "chicken"); // will return a reference to s I guess
concat(onto(s), "=====");
This would go well with what I'd expect with the function name join():
s = join(std::begin(my_number), std::end(my_number)); // -> "12345678910"
s = join(separator(", "), std::begin(my_number), std::end(my_number)); // -> "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
join(onto(s), separator(" - "), std::begin(my_number), std::end(my_number)); // -> "1, 2, 3, 4, 5, 6, 7, 8, 9, 101 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10"
Now, when we put it like that, there is no way, to let the functions "steal" work from each other. If you take this e.g.:
concat(onto(s), join(separator(", "), std::begin(my_number), std::end(my_number)), " ", join(separator(" - "), std::begin(my_number), std::end(my_number)));
This will first execute the two join() calls that return strings, which are then passed to concat(). Instead join() and concat() could return objects, that can be converted to std::string. Then concat() can make the result of join() render its output into a common buffer.
auto s = concat(join(separator(", "), std::begin(my_number), std::end(my_number)), " ", join(separator(" - "), std::begin(my_number), std::end(my_number))).str();
Christof
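The std::copy observation in the reply above can be sketched as follows. This is only an illustration: formatting_iterator is a hypothetical output iterator written for this example, and it still writes through std::ostringstream for brevity, which is exactly the bottleneck the thread would like to avoid in a real implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace copy_sketch {

// Hypothetical output iterator: every assigned value is streamed out,
// preceded by the separator for all values except the first.
class formatting_iterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    formatting_iterator(std::ostream& os, std::string sep)
        : os_(&os), sep_(std::move(sep)) {}

    template <class T>
    formatting_iterator& operator=(const T& value) {
        if (!first_) *os_ << sep_;
        first_ = false;
        *os_ << value;
        return *this;
    }

    // Dereference and increment are no-ops, as with std::ostream_iterator.
    formatting_iterator& operator*() { return *this; }
    formatting_iterator& operator++() { return *this; }
    formatting_iterator& operator++(int) { return *this; }

private:
    std::ostream* os_;
    std::string sep_;
    bool first_ = true;
};

// join() over an iterator pair, expressed through std::copy.
template <class Iterator>
std::string join(const std::string& sep, Iterator first, Iterator last) {
    std::ostringstream out;
    std::copy(first, last, formatting_iterator(out, sep));
    return out.str();
}

inline void demo() {
    const std::vector<int> numbers{1, 2, 3};
    assert(join(", ", numbers.begin(), numbers.end()) == "1, 2, 3");
}

}  // namespace copy_sketch
```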
Hi, Am 18.01.2017 17:30, schrieb Richard Hodges:
Totally agree with returning a string factory. That makes perfect sense. onto(x) could return the correct kind of wrapper, depending on the argument type of x. So it could cope with x being for example, std::string&, std::string const&, std::string&& or std::ostream&.
How about this:
auto s = concat(1, " ", 2).str(); // -> s = "1 2"
concat(" ", 3).append_to(s); // -> s = "1 2 3"
// reuse preallocated memory
concat(4, " ", 5).overwrite(s); // -> s = "4 5"
overwrite() could also take a std::string_view with C++17, or const char* and size_t with earlier versions of the standard library.
As an observation, expressing the join as an iterator pair lends itself to being implemented in terms of std::copy(first, last, formatting_iterator<...>).
That definitely is a possible implementation, yes. Though, of course, having join(), concat() and maybe other functions return string factories instead of strings enables generating the whole string in a single buffer without copying.
concat(join(...), " ", 42, " - ", join(...), format("format string %1% %2% %3%", a, b, 42)).str();
concat() will allocate a long enough string and call overwrite() on the results of join() and format() with string views on that string.
Taking everything together, a string factory will have at least an interface like this:
template<typename T> constexpr bool string_factory() {
    return requires(T a, std::string& s, std::string_view v, const char* p, size_t len) {
        // estimate necessary memory to render the string
        { a.size() } const -> size_t;
        // render and return result
        { a.str() } const -> std::string;
        // render at the end of an existing string - return the number of generated chars
        { a.append_to(s) } -> size_t;
        // render into an existing string, reusing its preallocated memory
        { a.overwrite(s) } -> size_t;
        // render into a string view
        { a.overwrite(v) } -> size_t;
        // render into a character buffer
        { a.overwrite(p, len) } -> size_t;
    };
}
I hope the constraint syntax is more or less correct. I haven't used constraints in real code up to now ;-)
I think this is good for containers, but for a series of disjoint types, or for joining words (as opposed to letters), you'd still need some templatery.
Yes, sure. You might also want to have something like this:
std::tuple<std::string, int, double> my_tuple{"asdf", 42, 2.5};
join(separator(", "), my_tuple); // -> "asdf, 42, 2.5"
I think, using C++17's std::apply(), it should be a straightforward wrapper around concat().
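A minimal sketch of the tuple case, assuming C++17. Since concat() does not exist yet, this version streams the elements directly instead of wrapping it; join_tuple is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <tuple>

namespace tuple_sketch {

// Streams every tuple element, writing the separator between elements.
template <class... Ts>
std::string join_tuple(const std::string& sep, const std::tuple<Ts...>& t) {
    std::ostringstream out;
    std::apply(
        [&](const auto&... elems) {
            bool first = true;
            // Fold over the comma operator: one "separator, element" step per element.
            ((out << (first ? "" : sep.c_str()) << elems, first = false), ...);
        },
        t);
    return out.str();
}

inline void demo() {
    const std::tuple<std::string, int, double> my_tuple{"asdf", 42, 2.5};
    assert(join_tuple(", ", my_tuple) == "asdf, 42, 2.5");
}

}  // namespace tuple_sketch
```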
boost::range springs to mind as a reasonable helper for expressing to concat (or join) that you want to treat each element of a container.
Yes, when I wrote about my idea of join(), I thought of ranges as well. With range adaptors, that will make for a very powerful and, by the way, fast library to generate strings:
format("file names and sizes:\n%1%\n",
       join(separator('\n'),
            my_files | range::transformed([](const std::filesystem::path& f) -> auto {
                return concat(separator(": "), f.filename(), std::filesystem::file_size(f));
            }))).str();
format().str() will ask the join() string factory to render into the preallocated buffer. Then join() will walk through its range and find that it has a range of string factories, returned by concat(). Therefore it asks every string factory to render to the given buffer.
We "just" need format(), join(), concat(), and the corresponding string factories.
Christof
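A rough sketch of the string factory idea described in this message, assuming C++17. It covers only size(), append_to() and str(), and only text pieces and nested factories; numbers, overwrite() and separators are left out. concat_factory, piece_size, piece_append and as_part are hypothetical names. The factory stores string views, so it is only safe to use while the arguments are alive, typically within the same full expression.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

namespace factory_sketch {

// Leaf pieces: plain text is measured and appended without a temporary string.
inline std::size_t piece_size(std::string_view v) { return v.size(); }
inline void piece_append(std::string& out, std::string_view v) {
    out.append(v.data(), v.size());
}

template <class... Parts>
class concat_factory {
public:
    explicit concat_factory(Parts... parts) : parts_(std::move(parts)...) {}

    // Number of characters the rendered result will have.
    std::size_t size() const {
        return std::apply(
            [](const auto&... p) { return (std::size_t{0} + ... + piece_size(p)); },
            parts_);
    }

    // Renders at the end of `out`; returns the number of characters added.
    std::size_t append_to(std::string& out) const {
        const std::size_t before = out.size();
        std::apply([&](const auto&... p) { (piece_append(out, p), ...); }, parts_);
        return out.size() - before;
    }

    // Allocates once, then lets every piece render into the same buffer.
    std::string str() const {
        std::string result;
        result.reserve(size());
        append_to(result);
        return result;
    }

private:
    std::tuple<Parts...> parts_;
};

// Nested factories render directly into the outer buffer.
template <class... Parts>
std::size_t piece_size(const concat_factory<Parts...>& f) { return f.size(); }
template <class... Parts>
void piece_append(std::string& out, const concat_factory<Parts...>& f) {
    f.append_to(out);
}

// Maps each argument to the piece type stored in the factory.
inline std::string_view as_part(std::string_view v) { return v; }
template <class... Parts>
concat_factory<Parts...> as_part(concat_factory<Parts...> f) { return f; }

template <class... Args>
auto concat(const Args&... args) {
    return concat_factory<decltype(as_part(args))...>(as_part(args)...);
}

inline void demo() {
    assert(concat("a", concat("b", "c"), "d").str() == "abcd");
}

}  // namespace factory_sketch
```

In this sketch the inner concat() never builds its own string: the outer str() reserves the total size and both levels append into that one buffer.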
I think this is looking good. Some suggested separator flavours...
join(separator(", "), ...) -- results in "a, b, c"
join(prefix(", "), ...) -- results in ", a, b, c" (for appending to an existing list)
join(suffix(", "), ...) -- results in "a, b, c, "
and the start of an idea...
join(wrapped_sequence(range, "[", "]", ",", " ")) -- results in "[ a, b, c ]" or "[ ]" if the sequence is empty.
The last one would allow automatic generation of JSON...
On 19 January 2017 at 14:35, Christof Donat <cd@okunah.de> wrote:
Hi,
Am 18.01.2017 17:30, schrieb Richard Hodges:
Totally agree with returning a string factory. That makes perfect sense. onto(x) could return the correct kind of wrapper, depending on the argument type of x. So it could cope with x being for example, std::string&, std::string const&, std::string&& or std::ostream&.
How about this:
auto s = concat(1, " ", 2).str(); // -> s = "1 2"
concat(" ", 3).append_to(s); // -> s = "1 2 3"
// reuse preallocated memory
concat(4, " ", 5).overwrite(s); // -> s = "4 5"
overwrite() could also take a std::string_view with C++17, or const char* and size_t with earlier versions of the standard library.
As an observation, expressing the join as an iterator pair lends itself to
being implemented in terms of std::copy(first, last, formatting_iterator<...>).
That definitely is a possible implementation, yes. Though, of course, having join(), concat() and maybe other functions return string factories instead of strings enables generating the whole string in a single buffer without copying.
concat(join(...), " ", 42, " - ", join(...), format("format string %1% %2% %3%", a, b, 42)).str();
concat() will allocate a long enough string and call overwrite() on the results of join() and format() with string views on that string.
Taking everything together, a string factory will have at least an interface like this:
template<typename T> constexpr bool string_factory() {
    return requires(T a, std::string& s, std::string_view v, const char* p, size_t len) {
        // estimate necessary memory to render the string
        { a.size() } const -> size_t;
        // render and return result
        { a.str() } const -> std::string;
        // render at the end of an existing string - return the number of generated chars
        { a.append_to(s) } -> size_t;
        // render into an existing string, reusing its preallocated memory
        { a.overwrite(s) } -> size_t;
        // render into a string view
        { a.overwrite(v) } -> size_t;
        // render into a character buffer
        { a.overwrite(p, len) } -> size_t;
    };
}
I hope, the constraint syntax is more or less correct. I haven't used constraints in real code up to now ;-)
I think this is good for containers, but for a series of disjoint types, or
for joining words (as opposed to letters), you'd still need some templatery.
Yes, sure. You might also want to have something like this:
std::tuple<std::string, int, double> my_tuple{"asdf", 42, 2.5}; join(separator(", "), my_tuple); // -> "asdf, 42, 2.5"
I think, using C++17's std::apply(), it should be a straightforward wrapper around concat().
boost::range springs to mind as a reasonable helper for expressing to
concat (or join) that you want to treat each element of a container.
Yes, when I wrote about my idea of join(), I thought of ranges as well. With range adaptors, that will make for a very powerful and, by the way, fast library to generate strings:
format("file names and sizes:\n%1%\n", join(separator('\n'), my_files | range::transformed([](const std::filesystem::path& f) -> auto { return concat(separator(": "), f.filename(), std::filesystem::file_size(f)); }))).str();
format().str() will ask the join() string factory to render into the preallocated buffer. Then join() will walk through its range and find that it has a range of string factories, returned by concat(). Therefore it asks every string factory to render to the given buffer.
We "just" need format(), join(), concat(), and the corresponding string factories.
Christof
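The separator flavours suggested in the reply above could behave roughly like this. It is a sketch under simplifying assumptions: it works on a range of std::string only, returns a plain std::string, and replaces the separator(), prefix() and suffix() tag objects with a hypothetical placement enum to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

namespace flavour_sketch {

// Hypothetical stand-in for the separator(), prefix() and suffix() tags.
enum class placement { between, before_each, after_each };

template <class Range>
std::string join(placement where, std::string_view sep, const Range& items) {
    std::string out;
    bool first = true;
    for (const auto& item : items) {
        if (where == placement::before_each ||
            (where == placement::between && !first)) {
            out += sep;
        }
        out += item;
        if (where == placement::after_each) out += sep;
        first = false;
    }
    return out;
}

// wrapped_sequence: open/close brackets, a separator, and padding inside.
template <class Range>
std::string wrapped_sequence(const Range& items, std::string_view open,
                             std::string_view close, std::string_view sep,
                             std::string_view pad) {
    std::string out(open);
    bool first = true;
    for (const auto& item : items) {
        if (!first) out += sep;
        out += pad;
        out += item;
        first = false;
    }
    out += pad;
    out += close;
    return out;
}

inline void demo() {
    const std::vector<std::string> v{"a", "b", "c"};
    assert(join(placement::between, ", ", v) == "a, b, c");
    assert(wrapped_sequence(v, "[", "]", ",", " ") == "[ a, b, c ]");
}

}  // namespace flavour_sketch
```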
Hi, Am 19.01.2017 14:53, schrieb Richard Hodges:
and the start of an idea...
join(wrapped_sequence(range, "[", "]", ",", " ")) -- results in "[ a, b, c ]" or "[ ]" if the sequence is empty.
The last one would allow automatic generation of JSON...
I think that is a great idea. Maybe we'd prefer a more ranges-like API here:
join(separator(", "), range | wrapped("[", "]"));
The ranges expression then of course returns a range of string factories. Just the first one and the last one do more than pass through to the underlying type.
Christof
Except that "[ a, b, c ]" is not valid JSON.
You're right, a JSON adapter would need to enquote strings, write bools as alphas and have access to an ADL-aware function to turn objects into tuples of NVP generators. That shouldn't be too much of an issue.
On 19 January 2017 at 18:07, Bjorn Reese <breese@mail1.stofanet.dk> wrote:
On 01/19/2017 02:53 PM, Richard Hodges wrote:
join(wrapped_sequence(range, "[", "]", ",", " ")) -- results in "[ a,
b, c ]" or "[ ]" if the sequence is empty.
The last one would allow automatic generation of JSON...
Except that "[ a, b, c ]" is not valid JSON.
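Two of the requirements named in the reply above, quoting strings and writing bools as true/false, can be sketched like this. json_append and json_array are hypothetical names. Only '"' and '\' are escaped here, so control characters, objects and the ADL-based conversion of user types are not handled.

```cpp
#include <cassert>
#include <string>
#include <string_view>

namespace json_sketch {

inline void json_append(std::string& out, bool b) { out += b ? "true" : "false"; }
inline void json_append(std::string& out, int n) { out += std::to_string(n); }

// Strings are quoted; only '"' and '\' are escaped in this sketch.
inline void json_append(std::string& out, std::string_view s) {
    out += '"';
    for (const char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    out += '"';
}

// Without this overload a string literal would pick the bool overload,
// because pointer-to-bool is a standard conversion.
inline void json_append(std::string& out, const char* s) {
    json_append(out, std::string_view(s));
}

// Renders the arguments as a JSON array.
template <class... Args>
std::string json_array(const Args&... args) {
    std::string out = "[";
    bool first = true;
    ((out += (first ? "" : ","), json_append(out, args), first = false), ...);
    out += ']';
    return out;
}

inline void demo() {
    assert(json_array("a", 1, true) == "[\"a\",1,true]");
}

}  // namespace json_sketch
```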
Hi Christof and Richard,
On 18 Jan 2017, at 16:11, Christof Donat <cd@okunah.de> wrote: I do like the idea, but not the naming. With a function name "join" I'd expect to be able to pass an iterator range and have all its element concatenated into a string with a defined separator like this:
Please don't change the name. I don't see why "join" implies iterators. "join" is a standard name for similar functions which join strings in other languages, e.g. Python, Perl, C#, Java and JavaScript. There is also the Qt library, which has a QStringList::join method. "join" is well established and a short word, too, certainly better than "concat", which has to be abbreviated to not be too long.
I have more arguments. The word "concatenate" sounds awkward and artificial. "join" is a word you use in daily conversations, "concatenate" is not. Google search for "join" yields 3e9 hits, Google search for "concatenate" yields 3e6 hits, so you can say "join" is about 1000x more common. I personally hate technical jargon in any field. Language was invented to include, not to exclude.
Finally, "concatenate" in other programming contexts usually means that you append one collection to another collection. This is very different from what "join" does, which is piecing many individual strings and string-converted arguments together. That's clearly a "joining" operation.
Best regards, Hans
Am 19.01.2017 15:12, schrieb Hans Dembinski:
Hi Christof and Richard,
On 18 Jan 2017, at 16:11, Christof Donat <cd@okunah.de> wrote: I do like the idea, but not the naming. With a function name "join" I'd expect to be able to pass an iterator range and have all its element concatenated into a string with a defined separator like this:
please don't change the name. I don't see why "join" implies iterators. "join" is a standard name for similar functions which join strings in other languages, e.g.
Python
https://docs.python.org/3/library/stdtypes.html#str.join
", ".join(my_numbers)
versus
"Hello" + ", " + "World"
or, a bit awkward,
", ".join(["Hello", "World"])
or
"Hello".append(", ").append("World")
Perl
join($sep, @array)
versus
join($sep, $a, $b, $c, $d)
C#
https://msdn.microsoft.com/de-de/library/57a79xd0(v=vs.110).aspx
String.Join(sep, string_arr);
versus
String.Concat(a, b);
Java
https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#join-java.la... String.join(sep, some_iterable); versus String.join(sep, a, b, c, d);
JavaScript
https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Object...
my_array.join(sep)
versus
String.concat(a, b, c, d);
While all the languages you named support the idea of join() working on iterables, only Java and Perl reuse the name for parameter lists as well. Python can be shoehorned into concatenating with join(). Where I proposed to use the word concat(), Python uses the operator +, or append(), while C# and JavaScript call that concat(). Actually, consistency with the experience from other languages is exactly my concern here. We already have operator + and operator += for string concatenation, but they are not the most efficient solutions.
There is also the Qt library which has a QStringList::join method.
https://doc.qt.io/qt-5/qstringlist.html#join
Yes. Again, just like I proposed to use that word. Thanks for making my point. For concatenation of QStrings you'd use append(), push_back() or operator +:
s.append("Hello").append(", ").append("World");
s.push_back("Hello").push_back(", ").push_back("World");
s += "Hello" + ", " + "World";
"join" is well established and a short word, too, certainly better than "concat", which has to be abbreviated to not be too long.
Yes, join is well established for a specific use case. I just ask not to use it for a different one. Instead my proposal includes a join() function, that works just like you'd expect from the established word.
I have more arguments. The word "concatenate" sounds awkward and artificial. "join" is a word you use in daily conversations, "concatenate" is not. Google search for "join" yields 3e9 hits, Google search for "concatenate" yields 3e6 hits, so you can say "join" is about 1000x more common. I personally hate technical jargon in any field. Language was invented to include, not to exclude.
There are two distinct concepts, so we'll have to either use two distinct words or resort to overloading the same name, as Java and Perl do. Since I like to call different things differently, I'd propose to go the way C# and JavaScript go.
Finally, "concatenate" in other programming contexts usually means that you append one collection to another collection.
See above. To make things clear, a naive implementation of join that only supports strings could look like this:
template<typename Iterator>
auto join(const std::string& separator, Iterator i, Iterator end) -> std::string {
    auto rval = std::ostringstream{};
    auto addSeparator = false;
    for (; i != end; ++i) {
        if (addSeparator) rval << separator;
        else addSeparator = true;
        rval << *i;
    }
    return rval.str();
}
A naive implementation of concat() could look like this:
// end of recursion: nothing left to write
inline void concat_impl(std::ostringstream&) {}
template <typename FirstArg, typename ... Args>
void concat_impl(std::ostringstream& out, FirstArg&& first_arg, Args&& ... args) {
    out << std::forward<FirstArg>(first_arg);
    concat_impl(out, std::forward<Args>(args)...);
}
template <typename ... Args>
auto concat(Args&& ... args) -> std::string {
    auto out = std::ostringstream{};
    concat_impl(out, std::forward<Args>(args)...);
    return out.str();
}
Both implementations are untested. They are just here to emphasize my point.
Christof
Dear Christof, you did a much more thorough (and correct) job on researching the use of "join" in other languages, while mine was really quick. I respect that and feel embarrassed that I didn't do a better job myself.
To make things clear, a naive implementation of join, that only supports strings could look like this:
template<typename Iterator>
auto join(const std::string& separator, Iterator i, Iterator end) -> std::string {
    auto rval = std::ostringstream{};
    auto addSeparator = false;
    for (; i != end; ++i) {
        if (addSeparator) rval << separator;
        else addSeparator = true;
        rval << *i;
    }
    return rval.str();
}
A naive implementation of concat() could look like this:
// end of recursion: nothing left to write
inline void concat_impl(std::ostringstream&) {}
template <typename FirstArg, typename ... Args>
void concat_impl(std::ostringstream& out, FirstArg&& first_arg, Args&& ... args) {
    out << std::forward<FirstArg>(first_arg);
    concat_impl(out, std::forward<Args>(args)...);
}
template <typename ... Args>
auto concat(Args&& ... args) -> std::string {
    auto out = std::ostringstream{};
    concat_impl(out, std::forward<Args>(args)...);
    return out.str();
}
Both implementations are untested. They are just here to emphasize my point.
I still think that "join" is a better name for what you call "concat", as the function is literally joining various pieces and that name does not require an abbreviation. What you call "join" could be called "join_sequence". Best regards, Hans
Hi, Am 20.01.2017 10:41, schrieb Hans Dembinski:
I still think that "join" is a better name for what you call "concat", as the function is literally joining various pieces and that name does not require an abbreviation. What you call "join" could be called "join_sequence".
I think we should try and go for the least surprise. People coming from other languages will probably associate with join() what you call "join_sequence", while they will expect a name like concat(), operator +, append(), etc. for what you prefer to call "join".
So the least surprise is, I still think, with concat(a, b, c, d) and join(separator, sequence), just as with C# and JavaScript. With Qt and Python you use append() or operator + instead of concat(), but join() is consistent with them as well. The advantage of concat() over the operator + on strings in C++ is that concat() can easily be implemented in a much more efficient way, with fewer realloc()s and therefore less data copying. At least, when we don't want to have an API like this:
auto my_new_str = (cat() + "Hello" + ", " + "World" + "!").str();
Here cat() would return a string factory type that collects all the stuff that you "add" to it. In str() it allocates a string big enough for all the content and renders in there. Compare that to
auto my_new_str = concat("Hello", ", ", "World", "!").str();
Judge for yourself which one is easier for the user to understand.
Christof
On 20 Jan 2017, at 20:03, Christof Donat <cd@okunah.de> wrote:
So the least surprise is, I still think, with concat(a, b, c, d) and join(separator, sequence), just as with C# and JavaScript. With Qt and Python you use append() or operator + instead of concat(), but join() is consistent with them as well. The advantage of concat() over the operator + on strings in C++ is that concat() can easily be implemented in a much more efficient way, with fewer realloc()s and therefore less data copying. At least, when we don't want to have an API like this:
I agree with the principle of least surprise, it is something I am also applying. :) I think that "join" is less surprising, but it looks like the evidence is against me, so I stand down.
auto my_new_str = (cat() + "Hello" + ", " + "World" + "!").str();
Here cat() would return a string factory type, that collects all the stuff, that you "add" to it. In str() it allocates a string big enough for all the content and renders in there. Well compared to
auto my_new_str = concat("Hello", ", ", "World", "!").str();
Judge for yourself, which one is easier for the user to understand.
Oh, I never questioned this. I see the obvious technical and stylistic advantages of the latter. I am just arguing about names. Your last proposal was without the ".str()", which made more sense: auto my_new_str = concat("Hello", ", ", "World", "!"); I hope this was just a typo.
Hi, Am 23.01.2017 09:55, schrieb Hans Dembinski:
On 20 Jan 2017, at 20:03, Christof Donat <cd@okunah.de> wrote:
auto my_new_str = concat("Hello", ", ", "World", "!").str();
[...]
Your last proposal was without the ".str()", which made more sense:
auto my_new_str = concat("Hello", ", ", "World", "!");
I hope this was just a typo.
Actually no, it wasn't. The idea is that functions like concat(), join() and format() return string factories instead of strings. This is an advantage when you combine calls to these functions:
auto my_new_str = concat("Hello ", join(", ", std::begin(my_nums), std::end(my_nums)), format(" the file %1% contains %2% bytes", filename, filesize)).str();
As you see, I only call str() on the return value of concat(). We can make the string factories write to a preallocated buffer. Here the concat factory will allocate enough memory and ask the join string factory and the format string factory to write directly to that buffer. If join() and format() returned strings, we'd have to copy them in concat().
An alternative to a function like str() could be an implicit conversion operator to std::string. Then we would write:
auto my_new_str = std::string{concat("Hello ", join(", ", std::begin(my_nums), std::end(my_nums)), format(" the file %1% contains %2% bytes", filename, filesize))};
or
std::string my_new_str{concat("Hello ", join(", ", std::begin(my_nums), std::end(my_nums)), format(" the file %1% contains %2% bytes", filename, filesize))};
Again we pass two string factories to concat(). Therefore it can allocate enough memory and ask these factories to write there. The only difference is that all of this happens not in a call to str(), but in a call to operator std::string().
I am not completely opposed to implicit conversions, but in this case an explicit function call feels better to me. Sorry that I don't have a better reason at the moment.
No matter whether we use str() or an implicit conversion, combining these calls also lets us use tag parameters for formatting details:
format("%1% is %2% and %3%", 42, concat(format::hex<int>, 42, " in hex"), concat(format::oct<int>, 42, " in oct")).str();
Unlike the iostream manipulators, these formatting instructions have a defined scope. And still we don't have to copy temporary strings, because we pass string factories.
Christof
On 23 Jan 2017, at 11:23, Christof Donat <cd@okunah.de> wrote:
Hi,
Am 23.01.2017 09:55, schrieb Hans Dembinski:
On 20 Jan 2017, at 20:03, Christof Donat <cd@okunah.de> wrote:
auto my_new_str = concat("Hello", ", ", "World", "!").str(); [...] Your last proposal was without the ".str()", which made more sense: auto my_new_str = concat("Hello", ", ", "World", "!"); I hope this was just a typo.
Actually no, it wasn't. The idea is, that functions like concat(), join() and format() return string factories, instead of strings. This is an advantage, when you combine calls to these functions:
auto my_new_str = concat("Hello ", join(", ",std::begin(my_nums), std::end(my_nums)), format(" the file %1% contains %2% bytes", filename, filesize)).str();
It makes sense to me for "join" to return a string factory, because it is likely to be nested in "concat". But I don't see the practical case of nested "concat" calls, at least it is not going to be a common pattern in the need of optimising. If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.
Hi, Am 23.01.2017 16:32, schrieb Hans Dembinski:
On 23 Jan 2017, at 11:23, Christof Donat <cd@okunah.de> wrote: auto my_new_str = concat("Hello ", join(", ",std::begin(my_nums), std::end(my_nums)), format(" the file %1% contains %2% bytes", filename, filesize)).str();
It makes sense to me for "join" to return a string factory, because it is likely to be nested in "concat". But I don't see the practical case of nested "concat" calls, at least it is not going to be a common pattern in the need of optimising.
There are several use cases:
1. scope for formatting tags:
concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
Here the inner concat will convert the 42 to its decimal representation, while the outer one converts the first 42 to its hex representation.
2. concat() in calls to format():
format("%|1$40t|%2%", concat(first_name, " ", last_name), phone_number).str();
format().str() will allocate the buffer and ask the concat string factory to write into it.
3. results from concat() in a boost::range that is passed to join():
join(separator("\n"), my_files | transformed([](const std::filesystem::path& f) -> auto { return concat(f.filename(), ": ", std::filesystem::file_size(f)); })).str();
join().str() will ask every concat string factory to render directly into the common buffer.
If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.
Of course better proposals are welcome :-) Would you prefer the implicit conversion? If so, why? Christof
On Mon, Jan 23, 2017 at 5:32 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
Am 23.01.2017 16:32, schrieb Hans Dembinski:
On 23 Jan 2017, at 11:23, Christof Donat <cd@okunah.de> wrote:
auto my_new_str = concat("Hello ", join(", ",std::begin(my_nums), std::end(my_nums)), format(" the file %1% contains %2% bytes", filename, filesize)).str();
It makes sense to me for "join" to return a string factory, because it is likely to be nested in "concat". But I don't see the practical case of nested "concat" calls, at least it is not going to be a common pattern in the need of optimising.
There are several use cases:
1. scope for formatting tags:
concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
Here the inner concat will convert the 42 to its decimal representation, while the outer one converts the first 42 to its hex representation.
Wouldn't concat(hex(42), " is hex for", 42) make more sense?
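The value-wrapping form suggested here can be sketched as follows, assuming C++17. hex_value, hex, append_one and this simplified concat (which returns a std::string directly rather than a string factory) are hypothetical names for illustration. Because the formatting is attached to one value, its scope is that value alone and no nested concat() is needed.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

namespace hex_sketch {

// Wrapper that marks a single integer for hexadecimal output.
template <class T>
struct hex_value {
    T value;
};

template <class T>
hex_value<T> hex(T v) {
    static_assert(std::is_integral_v<T>, "hex() expects an integer");
    return {v};
}

inline void append_one(std::string& out, std::string_view text) {
    out.append(text.data(), text.size());
}
inline void append_one(std::string& out, int v) { out += std::to_string(v); }

// Two hex digits per byte, plus one character for a possible minus sign.
template <class T>
void append_one(std::string& out, hex_value<T> h) {
    char buf[sizeof(T) * 2 + 1];
    const auto result = std::to_chars(buf, buf + sizeof buf, h.value, 16);
    out.append(buf, static_cast<std::size_t>(result.ptr - buf));
}

template <class... Args>
std::string concat(const Args&... args) {
    std::string out;
    (append_one(out, args), ...);
    return out;
}

inline void demo() {
    assert(concat(hex(42), " is hex for ", 42) == "2a is hex for 42");
}

}  // namespace hex_sketch
```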
2. concat() in calls to format():
format("%|1$40t|%2%", concat(first_name, " ", last_name), phone_number).str();
Why not fold the name concat into the format string?
format().str() will allocate the buffer and ask the concat string factory to write into it.
3. results from concat() in a boost::range that is passed to join():
join(separator("\n"), my_files | transformed([](const std::filesystem::path& f) -> auto { return concat(f.filename(), ": ", std::filesystem::file_size(f)); })).str();
join().str() will ask every concat string factory to render directly into the common buffer.
If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.
Of course better proposals are welcome :-) Would you prefer the implicit conversion? If so, why?
Implicit is problematic with auto.. -- Olaf
Why not fold the name concat into the format string?
Because format strings are evil. They cause errors that can only be detected at runtime, and that is not a Good Thing (tm).
On 24/01/2017 07:26, Olaf van der Spek wrote:
1. scope for formatting tags:
concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
Here the inner concat will convert the 42 to its decimal representation, while the outer one converts the first 42 to its hex representation.
Wouldn't concat(hex(42), " is hex for", 42) make more sense?
+1. If you like persistent formatting states (and all the unexpected fun they cause when you forget to cancel them), use iostreams instead.
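The remark about persistent formatting state can be illustrated with plain iostreams. This is a small sketch using only the standard library; the two function names are made up for the illustration.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// std::hex is "sticky": it stays in effect for every later integer
// written to the same stream until it is explicitly reset.
inline std::string sticky_hex_example() {
    std::ostringstream os;
    os << std::hex << 42 << " is hex for " << 42;  // second 42 is hex as well
    return os.str();
}

// Resetting the base with std::dec gives the text that was intended.
inline std::string reset_hex_example() {
    std::ostringstream os;
    os << std::hex << 42 << " is hex for " << std::dec << 42;
    return os.str();
}

inline void sticky_state_example() {
    assert(sticky_hex_example() == "2a is hex for 2a");
    assert(reset_hex_example() == "2a is hex for 42");
}
```

A per-argument manipulator such as the suggested hex(42) avoids this class of surprise because its effect ends with the argument it wraps.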
3. results from concat() in a boost::range that is passed to join():
join(separator("\n"), my_files | transformed([](const std::filesystem::path& f) -> auto { return concat(f.filename(), ": ", std::filesystem::file_size(f)); })).str();
Maybe I missed something, but what was the intended distinction between concat() and join()?
To me, as vocabulary words, concat() implies "concatenate without separator" and join() implies "concatenate with separator"; as such it seems unnecessary to explicitly decorate the separator here since it's a required parameter.
(Both should accept either variadics or ranges, if that's not too complicated to arrange, though it's likely for join() to be used more often on ranges; concat() might be more evenly split, though probably leaning toward variadics.)
Although I suppose http://www.boost.org/doc/libs/1_63_0/doc/html/string_algo/reference.html#header.boost.algorithm.string.join_hpp and http://www.boost.org/doc/libs/1_63_0/libs/range/doc/html/range/reference/utilities/join.html might not entirely agree on this vocabulary...
(Loads the bikeshed on the back of a bike and rides away.)
If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.
Of course better proposals are welcome :-) Would you prefer the implicit conversion? If so, why?
Implicit is problematic with auto..
While that's true, I think the flexibility of returning a factory from concat() is more useful than the discomfort of either remembering the str() or using std::string explicitly as the type instead of auto (or using auto combined with explicit std::string construction).
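The trade-off between implicit conversion and auto can be made concrete with a toy factory. This is a sketch under assumptions: `pair_factory` and this two-argument `concat` are hypothetical stand-ins for the factory discussed in the thread, reduced to two std::string pieces so the example stays short.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical factory: holds references to its two pieces.
class pair_factory {
    const std::string& a_;
    const std::string& b_;

public:
    pair_factory(const std::string& a, const std::string& b) : a_(a), b_(b) {}

    std::string str() const {
        std::string out;
        out.reserve(a_.size() + b_.size());
        out += a_;
        out += b_;
        return out;
    }

    // The implicit conversion under discussion.
    operator std::string() const { return str(); }
};

inline pair_factory concat(const std::string& a, const std::string& b) {
    return pair_factory(a, b);
}

}  // namespace sketch

inline void auto_example() {
    std::string first = "Hello ";
    std::string second = "world";

    std::string s1 = sketch::concat(first, second);        // explicit type: a string
    auto f = sketch::concat(first, second);                 // auto: still the factory
    auto s2 = std::string(sketch::concat(first, second));   // auto + explicit construction
    auto s3 = sketch::concat(first, second).str();          // explicit .str()

    static_assert(std::is_same_v<decltype(f), sketch::pair_factory>);
    static_assert(std::is_same_v<decltype(s2), std::string>);
    assert(s1 == "Hello world" && s2 == s1 && s3 == s1);
    assert(f.str() == s1);  // fine here only because first and second are still alive

    // Had the arguments been temporaries, e.g. auto g = concat("a", "b");
    // the references stored in g would dangle after that statement.
}
```

In this sketch the auto case silently keeps the factory instead of a string, which is the problem raised above; the other three spellings all produce a std::string.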
So let me just say that this is a bike shed topic -- I was in the shed a decade ago on the subject -- but hey, why not :) I dropped the proposal below because it was clear to me that no agreement was possible on the topic. The proposal isn't variadic because that feature didn't exist at the time, but you can see how it could trivially be changed to be -- why, it is begging to be variadic. It brings together many of the discussed libraries (format, string_algo, regex) into an interface that is simple and clear. It allows for both formatting and appending. Extremely small sample:
super_string s(" (456789) [123] 2006-10-01 abcdef ");
s.to_upper();
cout << s << endl;
s.trim(); //lop off the whitespace on both sides
cout << s << endl;
double dbl = 1.23456;
s.append(dbl); //append any streamable type
s += " ";
cout << s << endl;
date d(2006, Jul, 1);
s.insert_at(28, d); //insert any streamable type
cout << s << endl;
super_string s;
double dbl = 1.123456789;
int i = 1000;
s.append_formatted(dbl, i, dbl, i, "a string", "%-7.2f %-7d %-7.2f %-7d %s");
//s == "1.12 1000 1.12 1000 a string"
//other overloads are available with fewer parameters
super_string s1;
s1.append_formatted(dbl, "This is the value: %-7.2f");
//s1 == "This is the value: 1.12"
super_string reference: http://www.crystalclearsoftware.com/libraries/super_string/classbasic__super__string.html
append_formatted reference: http://www.crystalclearsoftware.com/libraries/super_string/classbasic__super__string.html#67740abfd6224fab0402b3dd7076f216
main page: http://www.crystalclearsoftware.com/libraries/super_string/index.html
My justification for *why* I did this as a type against all the *standard judgement* of the c++ experts: http://www.crystalclearsoftware.com/libraries/super_string/index.html#why_type
My original post in 2006 http://lists.boost.org/Archives/boost/2006/07/107087.php
Jeff
For what it's worth, in my view join and/or concat should return a generator.
To me the idiomatic way to convert it to a string would be an ADL function to_string(generator) -> std::string.
The generator itself should be streamable to at least a basic_ostream via ADL operator<<.
It seems to me that since C++17 is going to have string_view, and boost already does, then there should be a to_string_view free function to return the view of a temporary. The result will be that string construction is then only necessary when the user wants it.
basic ADL interface something like this:
#include <memory>
#include <iostream>
#include <string>
#include <experimental/string_view>
namespace boost {
template<class Char, class Traits = std::char_traits<Char>, class Allocator = std::allocator<Char>>
struct join_engine_type
{
    // implementation of joiner here
};
template<class Char, class Traits, class Allocator>
auto operator<<(std::basic_ostream<Char, Traits>& os, const join_engine_type<Char, Traits, Allocator>& eng) -> std::basic_ostream<Char, Traits>&
{
    // implementation here
    return os;
}
template<class Char, class Traits, class Allocator>
auto allocate_string(const join_engine_type<Char, Traits, Allocator>& eng) -> std::basic_string<Char, Traits, Allocator>
{
    // note that the allocator is copied, so if the joiner has a memory pool
    // allocator, the strings will share the memory pool
    // implementation here
}
template<class Char, class Traits, class Allocator>
auto to_string(const join_engine_type<Char, Traits, Allocator>& eng) -> std::basic_string<Char, Traits>
{
    // note that the standard allocator is used. Strings are now independent.
    // implementation here
}
template<class Char, class Traits, class Allocator>
auto to_string_view(const join_engine_type<Char, Traits, Allocator>& eng) -> std::experimental::basic_string_view<Char, Traits>
{
    // returns a view of the engine's own buffer, so no string is constructed;
    // the view is only valid while eng is alive
    // implementation here
}
} // namespace boost
Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel that there should be a utf8 version, which accepts wide and narrow strings and emits them correctly.
Hi,
On 24.01.2017 at 10:07, Richard Hodges wrote:
For what it's worth, in my view join and/or concat should return a generator.
To me the idiomatic way to convert it to a string would be an ADL function to_string(generator) -> std::string.
Then why not use the std::string constructor instead? That would be the case with the implicit conversion:
auto generator = concat(...);
auto my_str = std::string{generator};
The generator itself should be streamable to at least a basic_ostream via ADL operator<<
Yes, that looks like a good idea to me as well. The simple and straightforward implementation (using the std::string constructor) might be:
auto operator << (std::basic_ostream<...>& stream, generator g) -> std::basic_ostream<...>& { }
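One way the streaming operator mentioned above could look, as a sketch only: `generator` here is a hypothetical stand-in with two string members, the operator is written for std::ostream rather than a general basic_ostream, and it takes the generator by const reference. It goes through the std::string conversion, so it is the simple version and not the allocation-free one.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical stand-in for the string factory discussed in the thread.
struct generator {
    std::string left;
    std::string right;
    explicit operator std::string() const { return left + right; }
};

// Found through ADL because it lives in the same namespace as generator.
inline std::ostream& operator<<(std::ostream& stream, const generator& g) {
    return stream << std::string(g);  // build the string once, then stream it
}

}  // namespace sketch

inline void stream_example() {
    std::ostringstream os;
    os << sketch::generator{"Hello ", "world"};
    assert(os.str() == "Hello world");
}
```

A more elaborate version could write each piece to the stream directly and skip the temporary string altogether.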
Hi,
Sorry for the unfinished mail. I just was unable to handle my mail client correctly :-(
On 24.01.2017 at 10:07, Richard Hodges wrote:
It seems to me that since c++17 is going to have string_view, and boost already does, then there should be a to_string_view free function to return the view of a temporary.
But where will the temporary live then? Someone will have to free the memory.
basic ADL interface something like this:
I didn't get that part. What exactly does ADL stand for?
Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel that there should be a utf8 version, which accepts wide and narrow strings and emits them correctly.
UTF-8 is 8 bits only. Just some characters take more than a single byte. See https://en.wikipedia.org/wiki/UTF-8 Anyway, I agree, that we should have a wide char version as well for UTF-16 support. Christof
Responses inline. On 24 January 2017 at 10:45, Christof Donat <cd@okunah.de> wrote:
Hi,
Sorry for the unfinished mail. I just was unable to handle my mail client correctly :-(
On 24.01.2017 at 10:07, Richard Hodges wrote:
It seems to me that since c++17 is going to have string_view, and boost already does, then there should be a to_string_view free function to return the view of a temporary.
But where will the temporary live then? Someone will have to free the memory.
Imagine a function:
void foo(std::string_view s);
which we then call with:
foo(join("the answer to life, the universe and everything is: ", hex(42)));
The generator returned by join would stay alive until the end of the function foo, so there would be no need to construct a string, only to take a string_view of it. We could use the string_view implicit in the joiner object. This saves us an allocation and a copy.
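The lifetime argument above can be sketched like this. Assumptions: `joiner` and this two-argument `join` are hypothetical, the generator renders into its own buffer when it is constructed, and C++17 std::string_view is available. The temporary generator lives until the end of the full expression containing the call, so the view passed to foo stays valid while foo runs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

// Hypothetical joiner that owns the rendered characters.
class joiner {
    std::string buffer_;

public:
    joiner(std::string_view a, std::string_view b) {
        buffer_.reserve(a.size() + b.size());
        buffer_.append(a);
        buffer_.append(b);
    }

    // View into the joiner's own buffer; no second string is created.
    operator std::string_view() const noexcept { return buffer_; }
};

inline joiner join(std::string_view a, std::string_view b) {
    return joiner(a, b);
}

}  // namespace sketch

// Consumer that only needs to look at the characters.
inline std::size_t foo(std::string_view s) { return s.size(); }

inline void lifetime_example() {
    // The temporary joiner is destroyed only after foo has returned.
    assert(foo(sketch::join("the answer is: ", "2a")) == 17);

    // Not safe: the joiner would die at the end of this statement
    // and leave the view dangling.
    // std::string_view v = sketch::join("a", "b");
}
```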
basic ADL interface something like this:
I didn't get that part. What exactly does ADL stand for?
ADL stands for Argument Dependent Lookup. It means that when you call a free function, the namespaces of its arguments are searched for that function. This means that you can write operator<<, to_string, hash_code etc in the namespace of your custom object and the compiler will select the correct one. It's used a great deal in boost for customisation of structures like boost::hash
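A small illustration of argument-dependent lookup as described above. The namespaces `shapes` and `app` and the function `describe` are invented for the example; only std::to_string is a real library function.

```cpp
#include <cassert>
#include <string>

namespace shapes {

struct point {
    int x;
    int y;
};

// Lives next to point, so ADL can find it from any other namespace.
inline std::string to_string(const point& p) {
    return "(" + std::to_string(p.x) + ", " + std::to_string(p.y) + ")";
}

}  // namespace shapes

namespace app {

template <class T>
std::string describe(const T& value) {
    using std::to_string;     // fallback for built-in arithmetic types
    return to_string(value);  // unqualified call: T's namespace is searched too
}

}  // namespace app

inline void adl_example() {
    assert(app::describe(shapes::point{1, 2}) == "(1, 2)");  // shapes::to_string via ADL
    assert(app::describe(42) == "42");                        // std::to_string via using
}
```

The generic code in app never names the shapes namespace; the compiler picks the right to_string from the argument type alone.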
Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel that there should be a utf8 version, which accepts wide and narrow strings and emits them correctly.
UTF-8 is 8 bits only. Just some characters take more than a single byte. See https://en.wikipedia.org/wiki/UTF-8
Anyway, I agree, that we should have a wide char version as well for UTF-16 support.
I have no problem with a u16 version (UTF-16) for the windows crowd and a u8 version (UTF-8) for everyone else. The standard supports this idea in its unicode specialisations of std::string - std::u16string and std::u32string. A utf-8 string is just a std::basic_string<char> with utf-aware traits type.
Hi,
On 24.01.2017 at 11:29, Richard Hodges wrote:
On 24 January 2017 at 10:45, Christof Donat <cd@okunah.de> wrote:
Sorry for the unfinished mail. I just was unable to handle my mail client correctly :-(
On 24.01.2017 at 10:07, Richard Hodges wrote:
It seems to me that since c++17 is going to have string_view, and boost already does, then there should be a to_string_view free function to return the view of a temporary.
But where will the temporary live then? Someone will have to free the memory.
Imagine a function:
void foo(std::string_view s);
which we then call with:
foo(join("the answer to life, the universe and everything is: ", hex(42)));
Did you mean
foo(concat(hex<int>, "the answer to life, the universe and everything is: ", 42));
? SCNR.
The generator returned by join would stay alive until the end of the function foo, so there would be no need to construct a string, only to take a string_view of it. We could use the string_view implicit in the joiner object. This saves us an allocation and a copy.
I see. So the string would live inside the string factory as a member object, when we implicitly convert to a string_view. With C++17 std::string will have an implicit conversion operator to std::string_view. So this will be sufficient:
foo(std::string{concat("the answer to life, the universe and everything is: ", hex(42))});
basic ADL interface something like this:
I didn't get that part. What exactly does ADL stand for?
ADL stands for Argument Dependent Lookup. It means that when you call a free function, the namespaces of its arguments are searched for that function. This means that you can write operator<<, to_string, hash_code etc in the namespace of your custom object and the compiler will select the correct one.
Ah, I see. Thank you. I was aware of the mechanism, but not of the name.
Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel that there should be a utf8 version, which accepts wide and narrow strings and emits them correctly.
UTF-8 is 8 bits only. Just some characters take more than a single byte. See https://en.wikipedia.org/wiki/UTF-8
Anyway, I agree, that we should have a wide char version as well for UTF-16 support.
I have no problem with a u16 version (UTF-16) for the windows crowd and a u8 version (UTF-8) for everyone else. The standard supports this idea in its unicode specialisations of std::string - std::u16string and std::u32string. A utf-8 string is just a std::basic_string<char> with utf-aware traits type.
Yes, but I don't get why you want wide string versions for UTF-8 support. Is it about converting wide strings to UTF-8? Like this:
std::string{concat(my_wide_string)};
Christof
Answers to answers inline :) On 24 January 2017 at 12:17, Christof Donat <cd@okunah.de> wrote:
Hi,
On 24.01.2017 at 11:29, Richard Hodges wrote:
On 24 January 2017 at 10:45, Christof Donat <cd@okunah.de> wrote:
Sorry for the unfinished mail. I just was unable to handle my mail client correctly :-(
On 24.01.2017 at 10:07, Richard Hodges wrote:
It seems to me that since c++17 is going to have string_view, and boost already does, then there should be a to_string_view free function to return the view of a temporary.
But where will the temporary live then? Someone will have to free the memory.
Imagine a function:
void foo(std::string_view s);
which we then call with:
foo(join("the answer to life, the universe and everything is: ", hex(42)));
Did you mean
foo(concat(hex<int>, "the answer to life, the universe and everything is: ", 42));
I probably meant:
foo(to_string_view(concat("the answer to life, the universe and everything is: ", hex(42))));
or
foo(to_string_view(join(separator(' '), "the answer to life, the universe and everything is:", hex(42))));
The idea being to avoid the construction of any unnecessary string objects. The generator already contains a buffer (or could), so it seems wasteful to me to create a string temporary simply to view its buffer. I personally prefer the limited-scope manipulators. They feel more portable and less surprising when, for example, refactoring or merging code.
? SCNR.
The generator returned by join would stay alive until the end of the function foo, so there would be no need to construct a string, only to take a string_view of it. We could use the string_view implicit in the joiner object. This saves us an allocation and a copy.
I see. So the string would live inside the string factory as a member object, when we implicitly convert to a string_view. With C++17 std::string will have an implicit conversion operator to std::string_view. So this will be sufficient:
A string, a string-like buffer, or a reference to a string. I feel that the generator should be able to work on a supplied string reference so that it can be used to extend an existing string without reallocations or copies if required.
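Working on a supplied string reference, as suggested above, is close to the variadic append that started this thread. The following is a minimal sketch, assuming C++17 and string-like pieces only (literals, std::string, std::string_view); `sketch::append` is a made-up name and numbers are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

// Appends all pieces to an existing string with at most one reallocation.
// Limitation of this sketch: the pieces must not alias `target` itself.
template <class... Pieces>
std::string& append(std::string& target, const Pieces&... pieces) {
    const std::size_t needed =
        target.size() + (std::size_t{0} + ... + std::string_view(pieces).size());
    if (needed > target.capacity()) {
        target.reserve(needed);  // grow once, up front
    }
    (target.append(std::string_view(pieces)), ...);
    return target;
}

}  // namespace sketch

inline void append_example() {
    std::string s = "x";
    sketch::append(s, "A", std::string("B"), std::string_view("C"));
    assert(s == "xABC");
}
```

Compared with s += "A" + ..., no intermediate std::string is built from the pieces; they are measured first and then copied straight into the target.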
foo(std::string{concat("the answer to life, the universe and everything is: ", hex(42))});
basic ADL interface something like this:
I didn't get that part. What exactly does ADL stand for?
ADL stands for Argument Dependent Lookup. It means that when you call a free function, the namespaces of its arguments are searched for that function. This means that you can write operator<<, to_string, hash_code etc in the namespace of your custom object and the compiler will select the correct one.
Ah, I see. Thank you. I was aware of the mechanism, but not of the name.
Furthermore, since the entire web is now (thankfully) UTF8, I strongly feel that there should be a utf8 version, which accepts wide and narrow strings and emits them correctly.
UTF-8 is 8 bits only. Just some characters take more than a single byte. See https://en.wikipedia.org/wiki/UTF-8
Anyway, I agree, that we should have a wide char version as well for UTF-16 support.
I have no problem with a u16 version (UTF-16) for the windows crowd and a u8 version (UTF-8) for everyone else. The standard supports this idea in its unicode specialisations of std::string - std::u16string and std::u32string. A utf-8 string is just a std::basic_string<char> with utf-aware traits type.
Yes, but I don't get why you want wide string versions for UTF-8-support. Is it about converting wide string to utf8? Like this:
std::string{concat(my_wide_string)};
Maybe to_utf8(concat(...)); would be better. It's again explicit and could be given options to control behaviour. It also decouples the concept of UTF8 from the concept of concatenation. This adheres more to the c++ way of only paying for what you need.
Christof
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Hi, Am 24.01.2017 12:52, schrieb Richard Hodges:
On 24 January 2017 at 12:17, Christof Donat <cd@okunah.de> wrote:
Am 24.01.2017 11:29, schrieb Richard Hodges:
Imagine a function:
void foo(std::string_view s);
which we then call with:
foo(join("the answer to life, the universe and everything is: ", hex(42)));
Did you mean
foo(concat(hex<int>, "the answer to life, the universe and everything is: ", 42));
I probably meant: foo(to_string_view(concat("the answer to life, the universe and everything is: ", hex(42)))); or foo(to_string_view(join(separator(' '), "the answer to life, the universe and everything is:", hex(42))));
The idea being to avoid the construction of any un-necessary string objects. The generator already contains a buffer (or could) so it seems wasteful to me to create a string temporary simply to view its buffer.
The idea was, that not the generator holds a buffer, but the function, that actually executes it, instantiates a std::string. When that is returned, either the compiler elides that copy, or it will be moved out. So in reality the waste is very minimal, if not non existent.
The generator returned by join would stay alive until the end of the function foo, so there would be no need to construct a string, only to take a string_view of it. We could use the string_view implicit in the joiner object. This saves us an allocation and a copy.
I see. So the string would live inside the string factory as a member object, when we implicitly convert to a string_view. With C++17 std::string will have an implicit conversion operator to std::string_view. So this will be sufficient:
A string, a string-like buffer, or a reference to a string. I feel that the generator should be able to work on a supplied string reference so that it can be used to extend an existing string without reallocations or copies if required.
Yes, that is possible when the function that executes the generator is responsible for the buffer. Then you can have e.g.

concat(...).str(); // allocate a string and return it
concat(...).append_to(s); // append to an existing string
concat(...).replace(s); // write over an existing string, reusing its memory

I think we all agree here that implicit conversion is not the way to go. So my current proposal still is .str() and the like. You propose free functions instead.
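A rough sketch of how those three member functions could share one implementation follows. Everything here is an assumption for illustration: the class name concat_factory, the namespace factory_sketch, and the simplification that the factory stores std::string copies of its parts (a real factory would presumably hold references or views to avoid exactly those temporaries).

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace factory_sketch {  // hypothetical names throughout

class concat_factory {
public:
    explicit concat_factory(std::vector<std::string> parts)
        : parts_(std::move(parts)) {}

    // Total number of characters the factory will produce.
    std::size_t size() const {
        std::size_t n = 0;
        for (const auto& p : parts_) n += p.size();
        return n;
    }

    // Append to an existing string: one reserve, then plain appends.
    std::string& append_to(std::string& out) const {
        out.reserve(out.size() + size());
        for (const auto& p : parts_) out += p;
        return out;
    }

    // Write over an existing string; clear() keeps the capacity on the
    // common standard library implementations, so memory is reused.
    std::string& replace(std::string& out) const {
        out.clear();
        return append_to(out);
    }

    // Allocate a fresh string and return it.
    std::string str() const {
        std::string out;
        append_to(out);
        return out;
    }

private:
    std::vector<std::string> parts_;
};

inline concat_factory concat(std::vector<std::string> parts) {
    return concat_factory(std::move(parts));
}

}  // namespace factory_sketch
```

With this shape, concat({"a", "b"}).append_to(s) extends s in place, and str() is just append_to on an empty string.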
Yes, but I don't get why you want wide string versions for UTF-8-support. Is it about converting wide string to utf8? Like this:
std::string{concat(my_wide_string)};
Maybe to_utf8(concat(...)); would be better.
Uh, coming back to the formatting tags and my currently preferred syntax:

concat(utf_8, ...).str();

If you need additional options for utf_8, it can be a function call:

concat(utf_8(more, options), ...).str();
It's again explicit and could be given options to control behaviour. It also decouples the concept of UTF8 from the concept of concatenation. This adheres more to the c++ way of only paying for what you need.
The functions that execute the string factories should, in my opinion, only care whether they have enough memory, and let the factories care about the content. Therefore I think that the question of character encoding should be dealt with in the factories.

I don't see how the question of character encoding can be decoupled from the concept of converting arbitrary objects to strings. The converter has to have a way to encode its result.

Christof
It's almost looking like conversion to utf8, wide strings, strings or string_views should be a filter rather than an option. I agree that the factory hierarchy should be as static as possible, so that it's trivially copyable.

Thinking on that, it seems to me that the act of joining or concatenating is the result of applying an output filter (e.g. .str() or to_string()) to a sequence of input filters. Or put another way, executing a sequence of input filters with their outputs set to some output filters.

While I don't like the idea of pipes, the following structure seems to model the process:

auto s = some_string();
join(a, b, c) | to_utf8() | append(s);

Another way to express this is:

join(a, b, c).apply(to_utf8()).apply(append(s));

This is not dissimilar to the architecture of the boost::iostreams library (although that library is polymorphic, join can be generic).

Other options spring to mind:

join(a, b, c) | widen() | prepend(some_wide_string);

auto s = separator(" : ") | join(a, b, as_hex(fixed(4), c), std::quoted(d)) | create_string();

example output might be:

foo : bar : 003a : "baz"

alternative syntax:

auto s = separator(" : ").join(a, b, as_hex(fixed(4), c), std::quoted(d)).apply(create_string());

Note that I am still resisting the idea of .str() as a member function. If the joiner or concatenation object exports begin() and end(), it's unnecessary, because the object returned by create_string() (or similar) can use the iterators.

Having iterators also means that the attributed factory can be used as a source in std::copy, std::transform, std::for_each etc. Whether the factory should simply export input iterators or random access iterators will depend on how much state we'd want the factory to carry. For now, I think input iterators are sufficient.
Hi, Am 24.01.2017 14:00, schrieb Richard Hodges:
It's almost looking like conversion to utf8, wide strings, strings or string_views should be a filter rather than an option.
A filter always needs an input encoding and an output encoding. If we want to have filters to choose the output encoding between UTF-8, Latin 1, etc., what is going to be the input encoding?
Note that I am still resisting the idea of .str() as a member function. If the joiner or concatenation object exports begin() and end(), it's un-necessary, because the object returned by create_string() (or similar) can use the iterators.
I like that idea, though I still don't like the name to_string(). How about this:

auto s = generate<std::string>(concat(...));
auto ws = generate<std::wstring>(concat(...));
generate<std::string>(s, concat(...)); // returns reference to s for consistency
auto v = generate<std::vector<int>>(any_other_range); // actually will just copy

A very naive implementation might look like this:

#include <algorithm>
#include <iterator>

template<typename ReturnT, typename Sequence>
auto generate(ReturnT& r, const Sequence& seq) -> ReturnT& {
    r.resize(seq.size()); // make room for the output before copying
    std::copy(std::begin(seq), std::end(seq), std::begin(r));
    return r;
}

template<typename ReturnT, typename Sequence>
auto generate(const Sequence& seq) -> ReturnT {
    ReturnT r;
    generate(r, seq); // sizes r, then copies
    return r;
}

The word "generate" is very generic, therefore I'm still not really happy with it.

The string factories, of course, need more than only begin() and end(). They also have to provide a way to estimate the size of the output, because we want to be able to allocate the complete buffer upfront. In the naive implementation I called that size().
Having iterators also means that the attributed factory can be used as a source in std::copy, std::transform, std::for_each etc.
.. and in range expressions :-) Christof
The string factories, of course need more than only begin() and end(). They also have to provide a way to estimate the size of the output, because we want to be able to allocate the complete buffer upfront. In the naive implementation I called that size().
If you have size(), it means you know the size because the conversion work has already been done. If you know the size, then the iterators returned by begin() and end() can be of category std::random_access_iterator_tag, in which case std::distance(begin(), end()) is both trivial and yields the size.

This is a long way of saying that if you have size(), you can't have the simpler initial option of using forward-only iterators.

The practical fallout of this is that either the filter objects will have to separate the concerns of size computation and formatting (they are very much related), or the filter objects will have to carry some mutable state, even when const (this is done, for example, in Google protocol buffers). The latter option makes them a little less flexible.

Whether all this is worth it in terms of the performance gain of being able to pre-compute size and therefore avoid unnecessary memory allocations... I don't know. I think on balance it might.
Hi, Am 25.01.2017 18:37, schrieb Richard Hodges:
The string factories, of course need more than only begin() and end(). They also have to provide a way to estimate the size of the output, because we want to be able to allocate the complete buffer upfront. In the naive implementation I called that size().
If you have size(), it means you know the size because the conversion work has already been done.
I think, it is fine, if it returns a reasonable upper bound, like e.g. 4 for a 16 bit value in hex representation. It is just meant, to make sure, that we can allocate enough memory upfront, and avoid calls to realloc(). Christof
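The upper-bound idea can be sketched without doing any conversion work up front. The names below (bound_sketch, hex_factory, as_hex, and this particular generate) are hypothetical, and the sketch handles unsigned integers only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

namespace bound_sketch {  // hypothetical names throughout

template <typename UInt>
class hex_factory {
    static_assert(std::is_unsigned<UInt>::value,
                  "this sketch handles unsigned integers only");

public:
    explicit hex_factory(UInt value) : value_(value) {}

    // Worst case is two hex digits per byte, e.g. 4 for a 16 bit value.
    // This is known from the type alone, before any conversion happens.
    static constexpr std::size_t max_size() { return sizeof(UInt) * 2; }

    // Does the real work; may write fewer than max_size() characters.
    void append_to(std::string& out) const {
        static const char digits[] = "0123456789abcdef";
        char buf[sizeof(UInt) * 2];
        std::size_t pos = sizeof(buf);
        UInt v = value_;
        do {
            buf[--pos] = digits[v & 0xF];
            v >>= 4;
        } while (v != 0);
        out.append(buf + pos, sizeof(buf) - pos);
    }

private:
    UInt value_;
};

template <typename UInt>
hex_factory<UInt> as_hex(UInt value) {
    return hex_factory<UInt>(value);
}

// Reserve the upper bound once, then let the factory fill the string.
template <typename Factory>
std::string generate(const Factory& f) {
    std::string out;
    out.reserve(f.max_size());
    f.append_to(out);
    return out;
}

}  // namespace bound_sketch
```

Because max_size() is only a bound, the result can be shorter than the reserved capacity; the cost is a little slack memory rather than a second pass over the input.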
On 26/01/2017 06:44, Christof Donat wrote:
If you have size(), it means you know the size because the conversion work has already been done.
I think, it is fine, if it returns a reasonable upper bound, like e.g. 4 for a 16 bit value in hex representation. It is just meant, to make sure, that we can allocate enough memory upfront, and avoid calls to realloc().
Perhaps spell it as capacity() instead, to indicate the *possible* (worst-case) output size of the factory. It's not exactly the same as the meaning in containers, but it's close enough that people would understand it better, I think. max_size() is another possibility.
Hi, Am 25.01.2017 23:50, schrieb Gavin Lambert:
On 26/01/2017 06:44, Christof Donat wrote:
If you have size(), it means you know the size because the conversion work has already been done.
I think, it is fine, if it returns a reasonable upper bound, like e.g. 4 for a 16 bit value in hex representation. It is just meant, to make sure, that we can allocate enough memory upfront, and avoid calls to realloc().
Perhaps spell it as capacity() instead, to indicate the *possible* (worst-case) output size of the factory.
It's not exactly the same as the meaning in containers, but it's close enough that people would understand it better, I think.
max_size() is another possibility.
I like max_size(). Christof
Note that I am still resisting the idea of .str() as a member function. If the joiner or concatenation object exports begin() and end(), it's un-necessary, because the object returned by create_string() (or similar) can use the iterators.
I like that idea, though I still don't like the name to_string().
How about this:
auto s = generate<std::string>(concat(...)); auto ws = generate<std::wstring>(concat(...));
Why don't you add explicit conversion operators to the string factory

explicit operator std::string() { … }
explicit operator std::wstring() { … }

Explicit type conversion was added in C++11 for exactly this situation. The call is then

auto s = static_cast<std::string>(concat(…));

etc.

Otherwise I don't care if you add .str() or .to_string() members. I would prefer writing

auto s = concat(…).str();

over

auto s = static_cast<std::string>(concat(…));

But both should be supported.
On Thu, Jan 26, 2017 at 10:41 AM, Hans Dembinski <hans.dembinski@gmail.com> wrote:
Note that I am still resisting the idea of .str() as a member function. If the joiner or concatenation object exports begin() and end(), it's un-necessary, because the object returned by create_string() (or similar) can use the iterators.
I like that idea, though I still don't like the name to_string().
How about this:
auto s = generate<std::string>(concat(...)); auto ws = generate<std::wstring>(concat(...));
Why don't you add explicit conversion operators to the string factory
explicit operator std::string() { … } explicit operator std::wstring() { … }
explicit type conversion was exactly added for this situation in C++11. The call is then
auto s = static_cast<std::string>(concat(…));
etc.
Otherwise I don't care if you add .str() or .to_string() members. I would prefer writing
auto s = concat(…).str();
over
auto s = static_cast<std::string>(concat(…));
auto s = std::string(concat(…));

Or even

std::string s(concat(…));

? Why bother with the static_cast?

-- Olaf
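As a small illustration of the spellings discussed here, consider a stand-in factory with an explicit conversion operator. The class greeting_factory and the namespace conv_sketch are invented for this sketch; only the explicit operator itself mirrors the suggestion above.

```cpp
#include <string>
#include <utility>

namespace conv_sketch {  // hypothetical names throughout

class greeting_factory {
public:
    explicit greeting_factory(std::string name) : name_(std::move(name)) {}

    // Explicit: usable in direct-initialization and casts, but never
    // picked up silently in copy-initialization or argument passing.
    explicit operator std::string() const { return "hello, " + name_; }

private:
    std::string name_;
};

}  // namespace conv_sketch

// Given: conv_sketch::greeting_factory f("boost");
//
//   std::string a(f);                       // OK, direct-initialization
//   auto b = std::string(f);                // OK, functional cast
//   auto c = static_cast<std::string>(f);   // OK, but the longest spelling
//   std::string d = f;                      // error: conversion is explicit
```

So the static_cast is one option among several; direct-initialization reaches the same explicit operator.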
Hi, Am 26.01.2017 10:41, schrieb Hans Dembinski:
Note that I am still resisting the idea of .str() as a member function. If the joiner or concatenation object exports begin() and end(), it's un-necessary, because the object returned by create_string() (or similar) can use the iterators.
I like that idea, though I still don't like the name to_string().
How about this:
auto s = generate<std::string>(concat(...)); auto ws = generate<std::wstring>(concat(...));
Why don't you add explicit conversion operators to the string factory
explicit operator std::string() { … } explicit operator std::wstring() { … }
explicit type conversion was exactly added for this situation in C++11. The call is then
auto s = static_cast<std::string>(concat(…));
That is the worst we had up to now. Even an implicit conversion is better, which is not at all my favourite as well. Christof
Hi, Am 24.01.2017 00:14, schrieb Gavin Lambert:
On 24/01/2017 07:26, Olaf van der Spek wrote:
1. scope for formatting tags:
concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
Here the inner concat will convert the 42 to its decimal representation, while the outer one converts the first 42 to its hex representation.
Wouldn't concat(hex(42), " is hex for", 42) make more sense?
+1. If you like persistent formatting states (and all the unexpected fun they cause when you forget to cancel them), use iostreams instead.
My proposal does not have anything similar to iostream manipulators. The conversion tags always extend to the whole concat parameter list, but not to enclosed calls to concat. There is no change in behavior that depends on the position of the conversion tag in the parameter list. Actually I think, we should be able to put all the conversion tags at the beginning of the parameter list.
3. results from concat() in a boost::range that is passed to join():
join(separator("\n"), my_files | transformed([](const std::filesystem::path& f) -> auto { return concat(f.filename(), ": ", std::filesystem::file_size(f)); })).str();
Maybe I missed something, but what was the intended distinction between concat() and join()?
The distinction is the same as in many other languages: join() works on sequences, while concat() works on its parameters.
To me, as vocabulary words, concat() implies "concatenate without separator" and join() implies "concatenate with separator"; as such it seems unnecessary to explicitly decorate the separator here since it's a required parameter.
I see. The approach we had here up to now was to define formatting specifics as tag parameters at the beginning of the parameter list. If you give no separator to join(), it just puts all the representations of the sequence's objects in a row without any space in between. In Python the equivalent would be to join on the empty string: "".join(mySequence).

Then, of course, it is reasonable to use the same tags for join() and concat(), and actually for format() as well, but there separator() is unused.

I agree that your definition is mostly consistent with what I found for other languages as well, but most of them also only use join() on sequences and concat() or equivalents on parameter lists. In languages where join() is used on parameter lists as well, like Java or Perl, there is no other concatenation function.

I guess concat() usually has no separator in most languages because you can easily put it in as additional parameters. With my proposal, it just comes for free.
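For comparison, a bare-bones join over a sequence might look like the sketch below. It is deliberately simpler than the tag-based interface under discussion: the separator is a plain first argument, the elements are assumed to be string-like (they must offer size() and be appendable to std::string), and the namespace join_sketch is invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace join_sketch {  // hypothetical name

// Joins the elements of a range, putting the separator between them.
// An empty separator behaves like Python's "".join(seq).
template <typename Range>
std::string join(const std::string& separator, const Range& range) {
    // First pass: compute the exact output size so we allocate once.
    std::size_t total = 0;
    std::size_t count = 0;
    for (const auto& item : range) {
        total += item.size();
        ++count;
    }
    if (count > 1) total += separator.size() * (count - 1);

    std::string out;
    out.reserve(total);

    // Second pass: append items, with the separator before all but the first.
    bool first = true;
    for (const auto& item : range) {
        if (!first) out += separator;
        out += item;
        first = false;
    }
    return out;
}

}  // namespace join_sketch
```

With a std::vector<std::string> holding "a", "b" and "c", join(", ", v) gives "a, b, c" and join("", v) gives "abc".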
(Both should accept either variadics or ranges, if that's not too complicated to arrange, though it's likely for join() to be used more often on ranges; concat() might be more evenly split, though probably leaning toward variadics.)
That part is not consistent with what I found in other languages. Only Perl and Java (from the list of languages I checked) use join() for both use cases, but they don't have anything else that is like concat(). All the others distinguish between concatenating a few parameters and joining a sequence.
If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.
Of course better proposals are welcome :-) Would you prefer the implicit conversion? If so, why?
Implicit is problematic with auto..
While that's true, I think the flexibility of returning a factory from concat() is more useful than the discomfort of either remembering the str() or using std::string explicitly as the type instead of auto (or using auto combined with explicit std::string construction).
I agree. Would you prefer str(), or implicit conversion? Christof
Hi Christof
Would you prefer str(), or implicit conversion?
Please, no implicit conversion. I don't like .str() but it is better than implicit conversion. Implicit conversion is confusing, especially in the context of templates and auto. I quote the Zen of Python: "Explicit is better than implicit".
There are several use cases: […]
Ok, you made a point. Also, it follows the principle of least surprise if concat and friends all consistently return string factories.
Am 24.01.2017 00:14, schrieb Gavin Lambert:
On 24/01/2017 07:26, Olaf van der Spek wrote:
1. scope for formatting tags: concat(format::hex<int>, 42, " is hex for ", concat(42)).str(); Here the inner concat will convert the 42 to its decimal representation, while the outer one converts the first 42 to its hex representation. Wouldn't concat(hex(42), " is hex for", 42) make more sense? +1. If you like persistent formatting states (and all the unexpected fun they cause when you forget to cancel them), use iostreams instead.
My proposal does not have anything similar to iostream manipulators. The conversion tags always extend to the whole concat parameter list, but not to enclosed calls to concat. There is no change in behavior that depends on the position of the conversion tag in the parameter list. Actually I think, we should be able to put all the conversion tags at the beginning of the parameter list.
+1 for concat(hex(42), " is hex for", 42)

If format::hex<int> is allowed to be in any position and still affect the whole string, that is not at all intuitive. What happens when you have two conflicting formatting tags in the argument list? Which one wins? Hard to judge for the user without looking into the reference manual. A design is intuitive if you don't need to look up stuff in the reference manual all the time. There is the principle of least surprise again.

If you use template magic to detect the conflicting formatting tags and raise a compile time error, you are still left with the natural expectation of users that positions in "concat" matter. They matter for all the other arguments, so why shouldn't they matter for the formatting tags? It is breaking an implicit rule.

Let's get rid of them to avoid the whole mess. As other people pointed out, formatting tags should be left to streams, where they make sense. They don't make sense in concat, because the mental model for "concat" is "function call", not "stream".

The most intuitive approach in this case is to use explicit converters, unary functions like hex(42). Everyone who reads that can immediately understand what it does and that the call to hex(…) does not affect the other arguments in concat. Of course, hex(42) would return a string factory.
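The explicit-converter style can be sketched in a few lines. This is only an illustration under simplifying assumptions: hex returns a plain std::string here rather than a string factory, concat returns std::string directly, and the helper append_one and the namespace concat_sketch are invented names.

```cpp
#include <string>

namespace concat_sketch {  // hypothetical names throughout

// Explicit converter: affects only its own argument.
inline std::string hex(unsigned value) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    do {
        out.insert(out.begin(), digits[value & 0xF]);
        value >>= 4;
    } while (value != 0);
    return out;
}

// One overload per supported argument kind.
inline void append_one(std::string& out, const std::string& s) { out += s; }
inline void append_one(std::string& out, const char* s) { out += s; }
inline void append_one(std::string& out, int v) { out += std::to_string(v); }

// Variadic concat: arguments are appended strictly in order, and no
// argument changes how another one is formatted.
template <typename... Args>
std::string concat(const Args&... args) {
    std::string out;
    // Pack expansion inside a braced list guarantees left-to-right order.
    int expand[] = {0, (append_one(out, args), 0)...};
    (void)expand;
    return out;
}

}  // namespace concat_sketch
```

Here concat(hex(42), " is hex for ", 42) yields "2a is hex for 42": the first 42 is converted by hex, the second by the default integer overload.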
join(hex<int>, separator(" "), my_nums);
Here all the numbers are converted to their hex representation. With your approach this would look like:
join(separator(" "), my_nums | transform([](int i) -> int { return hex(i); }));
That is much more difficult to understand.
I don't see how that follows. You are free to write an overload for "join" which accepts a unary function as the first argument, which is then applied to all the values in the range. If hex<int> is a unary function, the first version makes even more sense.

The second version I reject regardless, because it uses an overloaded operator |. Operator overloads are ambiguous outside the realm of mathematics. We want to follow the principle of least surprise, and this is surprising syntax.

Hans
Hi, Am 24.01.2017 11:52, schrieb Hans Dembinski:
Would you prefer str(), or implicit conversion?
Please, no implicit conversion. I don't like .str() but it is better than implicit conversion. Implicit conversion is confusing, especially in the context of templates and auto.
I didn't have good reasons, but this is in line with my gut feeling. I totally agree.
If format::hex<int> is allowed to be in any position and still affect the whole string, that is not at all intuitive.
My idea was to allow those formatting tags only at the beginning of the parameter list. Then separator("sadf") is just another formatting tag.
What happens when you have two conflicting formatting tags in the argument list? Which one wins?
OK, here you made a point. In a specification I'd leave the behavior undefined. In an actual implementation I'd have the later ones overwrite the previous ones, because I expect that to be easy to implement. We could also try and prevent that with meta programming, but I think, that is not worth the effort. Leaving it unspecified in the specification, still leaves us the option to do that later.
join(hex<int>, separator(" "), my_nums);
Here all the numbers are converted to their hex representation. With your approach this would look like:
join(separator(" "), my_nums | transform([](int i) -> int { return hex(i); }));
That is much more difficult to understand.
I don't see how that follows. You are free to write an overload for "join" which accepts an unary function as the first argument, which is then applied to all the values in the range. If hex<int> is an unary function, the first version makes even more sense.
I think that composes less elegantly with boost::range or ranges::v3. Maybe we could lean towards their adaptor APIs like this:

join(separator(" "), my_nums | hex<int>());

Still, separator("...") is a bit out of the picture now. Up to now I thought that it is some kind of formatting information and therefore I handled it like other formatting tags. Actually I begin to like the idea of formatting functions that return string factories. That fits very nicely with the rest of the API and makes concat(), join() and format() simpler.

A completely different approach, inspired by Python:

separator(" ").join(my_nums | hex<int>());

Then we add free functions like this:

template<typename Sequence> auto join(const Sequence& seq) { }
The second version I reject regardless, because it uses an overloaded operator |.
That is the adaptor API from boost::range and is the same in ranges::v3. It's not my invention. Christof
The second version I reject regardless, because it uses an overloaded operator |.
That is the adaptor API from boost::range and is the same in ranges::v3. It's not my invention.
I must admit that I share Hans' distaste of the pipe overloads. I find them harder to read than a sequence of function calls, and to me they are not immediately expressive. I agree that explicit is better than implicit. I also strongly feel that .str() is an error, since it couples the concepts of strings with joiners, or concatenators. Strings have allocators and traits. I feel that a free function is more decoupled, and could even go in a separate header file to reduce complexity. In summary, to_string(x) is better than x.str() is better than implicit conversion. It is my view that this little interface change would make boost::format more std-like and consistent. It would also make template programming more convenient, since template expansion works much more nicely with ADL than with member functions. The ADL free functions act as glue for objects that lack a .str() member. e.g: #include <boost/format.hpp> #include <iostream> namespace boost { auto to_string(const boost::format& f) -> std::string { return f.str(); } } int main() { auto s = to_string(boost::format("%1%") % "hello"); std::cout << s << std::endl; } I'll make a suggestion about this in the boost::format forum. On 24 January 2017 at 12:54, Christof Donat <cd@okunah.de> wrote:
Hi,
Am 24.01.2017 11:52, schrieb Hans Dembinski:
Would you prefer str(), or implicit conversion?
Please, no implicit conversion. I don't like .str() but it is better than implicit conversion. Implicit conversion is confusing, especially in the context of templates and auto.
I didn't have good reasons, but this is in line with my gut feeling. I totally agree.
If format::hex<int> is allowed to be in any position and still affect
the whole string, that is not at all intuitive.
My idea was to allow those formatting tags only at the beginning of the parameter list. Then separator("sadf") is just another formatting tag.
What happens when you
have two conflicting formatting tags in the argument list? Which one wins?
OK, here you made a point. In a specification I'd leave the behavior undefined. In an actual implementation I'd have the later ones overwrite the previous ones, because I expect that to be easy to implement. We could also try and prevent that with meta programming, but I think, that is not worth the effort. Leaving it unspecified in the specification, still leaves us the option to do that later.
join(hex<int>, separator(" "), my_nums);
Here all the numbers are converted to their hex representation. With your approach this would look like:
join(separator(" "), my_nums | transform([](int i) -> int { return hex(i); }));
That is much more difficult to understand.
I don't see how that follows. You are free to write an overload for "join" which accepts an unary function as the first argument, which is then applied to all the values in the range. If hex<int> is an unary function, the first version makes even more sense.
I think, that composes less elegantly with boost::range or ranges::v3. Maybe we could lean towards their adaptor APIs like this:
join(separator(" "), my_nums | hex<int>());
Still separator("...") is a bit out of the picture now. Up to now I thought that it is some kind of formatting information, and therefore I handled it like other formatting tags. Actually I begin to like the idea of formatting functions that return string factories. That fits very nicely with the rest of the API and makes concat(), join() and format() simpler.
A completely different approach, inspired by Python:
separator(" ").join(my_nums | hex<int>());
Then we add free functions like this:
template<typename Sequence> auto join(const Sequence& seq) {
}
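The Python-inspired form above, together with the free function that forwards to it, could look roughly like the following. This is only a minimal sketch: the `separator` class, its `join` member, the choice of `std::to_string` for element conversion, and the empty default separator in the free `join` are all assumptions made for illustration, not part of any existing Boost API. Range adaptors such as `| hex<int>()` are left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not an existing Boost interface.
class separator {
public:
    explicit separator(std::string text) : text_(std::move(text)) {}

    // Joins the elements of seq, converting each one with std::to_string.
    template <typename Sequence>
    std::string join(const Sequence& seq) const {
        std::string out;
        bool first = true;
        for (const auto& item : seq) {
            if (!first) out += text_;   // separator goes between elements only
            out += std::to_string(item);
            first = false;
        }
        return out;
    }

private:
    std::string text_;
};

// Free-function form (assumption: it joins with an empty separator).
template <typename Sequence>
std::string join(const Sequence& seq) {
    return separator("").join(seq);
}

// Small usage check.
inline void separator_example() {
    const std::vector<int> my_nums{1, 2, 3};
    assert(separator(" ").join(my_nums) == "1 2 3");
    assert(join(my_nums) == "123");
}
```

The member form keeps the separator and the sequence visually apart, which is the point of the Python analogy; the free function is just a thin wrapper over it.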
The second version I reject regardless, because it uses an overloaded
operator |.
That is the adaptor API from boost::range and is the same in ranges::v3. It's not my invention.
Christof
Hi, On 24.01.2017 13:12, Richard Hodges wrote:
The second version I reject regardless, because it uses an overloaded operator |.
That is the adaptor API from boost::range and is the same in ranges::v3. It's not my invention.
I must admit that I share Hans' distaste of the pipe overloads.
As I said, this is not my invention. That notation is used with boost::ranges and ranges::v3 and has good chances to be part of the C++ standard library in future. At least the committee seems to discuss ranges::v3.
I also strongly feel that .str() is an error, since it couples the concepts of strings with joiners, or concatenators. Strings have allocators and traits. I feel that a free function is more decoupled, and could even go in a separate header file to reduce complexity.
The allocator could be given with a parameter: concat(...).str(my_allocator); We could have a character type as template parameter, which might default to char: concat(...).str<wchar_t>(my_allocator); The whole discussion started with a question for a concat()-like function that would return std::string. Therefore I don't feel too embarrassed that up to now we came up with stuff that is coupled to producing strings.
to_string(x) is better than x.str() is better than implicit conversion.
Maybe it is just my strong distaste of the name to_string(), which at the moment is just a feeling, no good reasons. It just feels clumsy to me, while .str() is in line with regular expression matches, and stringstreams in the standard library. Christof
On 26/01/2017 05:23, Christof Donat wrote:
to_string(x) is better than x.str() is better than implicit conversion.
Maybe it is just my strong distaste of the name to_string(), which at the moment is just a feeling, no good reasons. It just feels clumsy to me, while .str() is in line with regular expression matches, and stringstreams in the standard library.
Clumsy or not, it's in the standard now. [1] str() is shorter, of course. I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts. [1] http://en.cppreference.com/w/cpp/string/basic_string/to_string
I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts.
There's a problem with implicit conversion. Imagine the function:

    void foo(std::string const& s);

Then we have a "joining engine" with various useful implicit conversions:

    struct engine {
        operator std::string const& () const;
        operator std::string_view () const;
        operator const char*() const;
    };

We have the code:

    foo(join(...));

And all is well. The string conversion is chosen. Then, because of user performance concerns around the construction of strings, the author of the foo library adds some useful overloads:

    void foo(std::string_view s);
    void foo(const char* s);

Now the above code will not compile, because there are 3 ambiguous overloads. So users are forced to modify their code to:

    foo(std::string_view(join(...)));

or

    foo(to_string_view(join(...)));

or, much as I dislike the idea,

    foo(join(...).strview());

The moral of the story is that if we are going to provide conversion operators, they need to be explicit anyway. If you're going to have explicit conversion operators, it is just as expressive to have the to_xxx free functions. It's no more typing, because what you lose in typing to_, you gain in not having to prefix the type with std:: (no-one uses using namespace std; here, right? It doesn't play well with boost). In return, you get decoupling. This means that if you later want to convert the formatted text to a vector (say) or a vector of terms, you can simply provide the free function overloads to_vector(), to_vector_of_strings() and so on. Template specialisations of free functions are always a bad idea - they don't play nicely with ADL. So I would not recommend convert<std::string>(join(...)) etc.

R

On 25 January 2017 at 23:55, Gavin Lambert <gavinl@compacsort.com> wrote:
On 26/01/2017 05:23, Christof Donat wrote:
to_string(x) is better than x.str() is better than implicit conversion.
Maybe it is just my strong distaste of the name to_string(), which at the moment is just a feeling, no good reasons. It just feels clumsy to me, while .str() is in line with regular expression matches, and stringstreams in the standard library.
Clumsy or not, it's in the standard now. [1]
str() is shorter, of course. I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts.
[1] http://en.cppreference.com/w/cpp/string/basic_string/to_string
On 26 Jan 2017, at 10:47, Richard Hodges <hodges.r@gmail.com> wrote:
There's a problem with implicit conversion. […]
Thank you Richard for this nice example, which illustrates the problem with implicit conversion. :) It would be nice if there was a wiki for C++ with items like this ("don't do this and why") so that these explanations do not have to be written again and again.
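Richard's example can be checked at compile time. The sketch below is a reduced version under stated assumptions: `implicit_joiner`, `explicit_joiner`, and the namespaces `v1` and `v2` (the foo library before and after the extra overload) are hypothetical names, only the std::string and std::string_view conversions are modelled, and C++17 is assumed. A detection idiom reports whether a call to `foo` is well-formed.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the object a join(...) call might return.
struct implicit_joiner {
    std::string text;
    operator std::string() const { return text; }
    operator std::string_view() const { return text; }
};

struct explicit_joiner {
    std::string text;
    explicit operator std::string() const { return text; }
    explicit operator std::string_view() const { return text; }
};

// The library as first shipped: a single overload.
namespace v1 {
inline int foo(const std::string&) { return 1; }
}

// The library after its author adds a string_view overload.
namespace v2 {
inline int foo(const std::string&) { return 1; }
inline int foo(std::string_view) { return 2; }
}

// Detection idiom: true when v1::foo(T) or v2::foo(T) is a well-formed call.
template <class T, class = void>
struct v1_accepts : std::false_type {};
template <class T>
struct v1_accepts<T, std::void_t<decltype(v1::foo(std::declval<T>()))>> : std::true_type {};

template <class T, class = void>
struct v2_accepts : std::false_type {};
template <class T>
struct v2_accepts<T, std::void_t<decltype(v2::foo(std::declval<T>()))>> : std::true_type {};

// Implicit conversions compile against v1 ...
static_assert(v1_accepts<implicit_joiner>::value, "");
// ... but the same call is ambiguous once v2 adds the second overload.
static_assert(!v2_accepts<implicit_joiner>::value, "");
// Explicit operators never allowed the bare call, so nothing breaks later.
static_assert(!v1_accepts<explicit_joiner>::value, "");
static_assert(!v2_accepts<explicit_joiner>::value, "");

// Naming the target type works against both versions of the library.
inline int call_v2_explicitly() {
    return v2::foo(std::string(explicit_joiner{"a b"}));
}
```

With the explicit operators the caller names the target type from the start, so adding overloads to `foo` later does not change which one is selected.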
Hi, On 26.01.2017 10:47, Richard Hodges wrote:
I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts.
There's a problem with implicit conversion.
imagine the overloads function:
void foo(std::string const& s); [...] foo(join(...)); [...] void foo(std::string_view s); void foo(const char* s);
Now the above code will not compile, because there are 3 ambiguous overloads.
But that is only an issue for users, who have relied on the implicit conversion upfront. You can always use an implicit conversion explicitly as well, of course, and implicit conversion will be just one of multiple options, if we add it.
The moral of the story is that if we are going to provide conversion operators, they need to be explicit anyway.
Up to now, you haven't made a point here. Just those users, who decide to rely on implicit conversion, instead of one of the explicit variants, have to cope with the downsides of implicit conversion. Others don't. I think that fits well with "don't pay for what you don't use".
Template specialisations of free functions are always a bad idea - they don't play nicely with ADL. So I would not recommend convert<std::string>(join(...)) etc.
Please elaborate more on that. I can't see it. Christof
On 26.01.2017 10:47, Richard Hodges wrote:
I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts. There's a problem with implicit conversion. imagine the overloads function: void foo(std::string const& s); [...] foo(join(...)); [...] void foo(std::string_view s); void foo(const char* s); Now the above code will not compile, because there are 3 ambiguous overloads.
But that is only an issue for users, who have relied on the implicit conversion upfront. You can always use an implicit conversion explicitly as well, of course, and implicit conversion will be just one of multiple options, if we add it.
You don't know why the user added the overloads for foo, perhaps she suddenly had to adapt foo so that it also works with C code which uses a lot of const char*. As a designer, you have no control over other people's interfaces. You argued about the principle of least surprise. Let's say the user started to use concat in her code with a function foo(std::string const & s). Then she decided to add the overload foo(const char* s). All of a sudden her code does not compile anymore. This should not happen. As a user, she will be very surprised at that moment. That's why it is a bad idea to have implicit conversions. Why do you think the standards committee added explicit operator <type>() to the language? They don't do these language changes for fun. Also, if you won't take it from us, please go and take the wisdom from Herb Sutter http://www.gotw.ca/gotw/019.htm "It's almost always a good idea to avoid writing automatic conversions, either as conversion operators or as single-argument non-explicit constructors."
The call is then auto s = static_cast<std::string>(concat(…));
That is the worst we had up to now. Even an implicit conversion is better, which is not at all my favourite as well.
As Olaf pointed out, it is sufficient to do auto s = std::string(concat(…)); you don't need the static_cast. Nevertheless, casts are the official way of doing type conversions, whether you like the syntax or not. I think Stroustrup intentionally made them ugly, because he wanted type conversions to be the exception in his statically typed language. Hans
Hi, On 26.01.2017 15:04, Hans Dembinski wrote:
On 26.01.2017 10:47, Richard Hodges wrote:
I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts. There's a problem with implicit conversion. imagine the overloads function: void foo(std::string const& s); [...] foo(join(...)); [...] void foo(std::string_view s); void foo(const char* s); Now the above code will not compile, because there are 3 ambiguous overloads.
But that is only an issue for users, who have relied on the implicit conversion upfront. You can always use an implicit conversion explicitly as well, of course, and implicit conversion will be just one of multiple options, if we add it.
You don't know why the user added the overloads for foo, perhaps she suddenly had to adapt foo so that it also works with C code which uses a lot of const char*. As a designer, you have no control over other peoples' interfaces.
But only people that have relied on implicit conversion might have an issue. Richard proposed to have member functions like .str(), free functions like to_string() and implicit conversion. Use the explicit member function, or free function, and you'll be fine. Whoever, for whichever reason, decides to rely on implicit conversion will probably get easier to read code, but has to cope with the downsides of implicit conversion. Generally, please calm down. I am not attacking you personally. I am just discussing whether implicit conversion might be a good idea as an additional way to execute string factories. My current opinion is that it is not my preferred interface, but as an additional option, why not? Only those who use it pay for it.
Why do you think the standards committee added explicit operator <type>() to the language? They don't do these language changes for fun.
They also added the possibility to create implicit type conversions. If implicit conversions really are so bad, as you put them, why did they do that? For fun? Were they drunk? And I really dislike explicit type conversions. Every time I had considered to write an explicit type conversion, a member function, or a free function with a speaking name was the better choice. So if we come to the conclusion, that the implicit conversions harms anyone, but those who use them, I'd vote for no conversion operators at all. Use the explicit member functions then.
Also, if you won't take it from us, please go and take the wisdom from Herb Sutter
http://www.gotw.ca/gotw/019.htm
"It's almost always a good idea to avoid writing automatic conversions, either as conversion operators or as single-argument non-explicit constructors."
It is almost always a good idea to listen to the wise people and then think for yourself, if what they say really fits to your situation. [Christof Donat, just now]
The call is then auto s = static_cast<std::string>(concat(…));
That is the worst we had up to now. Even an implicit conversion is better, which is not at all my favourite as well.
As Olaf pointed out, it is sufficient to do
auto s = std::string(concat(…));
you don't need the static_cast. Nevertheless, casts are the official way of doing type conversions, whether you like the syntax or not. I think Stroustrup intentionally made them ugly, because he wanted type conversions to be the exception in his statically typed language.
I totally agree with him. Explicit type conversions are ugly as hell, and it is good, that way. Implicit type conversions might be dangerous, so we should think twice, before we add them. But if in this case they don't hurt anyone, then I think there is no good reason to not add them. Christof
Generally, please calm down. I am not attacking you personally. I am just discussing whether implicit conversion might be a good idea as an additional way to execute string factories. My current opinion is that it is not my preferred interface, but as an additional option, why not? Only those who use it pay for it.
Sorry, if I seem upset. :/ It is true that this point annoys me a bit, because I really think that the case against implicit conversion has been well made. Nevertheless, I believe in discussions as a device to exchange knowledge, so…
Why do you think the standards committee added explicit operator <type>() to the language? They don't do these language changes for fun.
They also added the possibility to create implicit type conversions. If implicit conversions really are so bad, as you put them, why did they do that? For fun? Were they drunk?
People learn from mistakes. There was a time when they (the standard committee) thought that implicit conversions are great, until they discovered that implicit conversions lead to subtle and dangerous bugs. They undermine the idea of discovering bugs at compile time rather than at runtime. If you find the one example given by Herb not convincing - although it is nicely down to the point -, please go ahead and read the references given in the article, from Scott Meyers and others, where you will find more.
And I really dislike explicit type conversions. Every time I had considered to write an explicit type conversion, a member function, or a free function with a speaking name was the better choice. So if we come to the conclusion, that the implicit conversions harms anyone, but those who use them, I'd vote for no conversion operators at all. Use the explicit member functions then.
There are also things I dislike, but personal preference should only help you make decisions when the case is ambiguous. Most of the time, there are good arguments for one side, and then - because you don't write a library for yourself, but for a potentially large group of people - it is just rational to put the needs of others before your own preferences. "The needs of the many outweigh the needs of the few or the one", as someone famously said. Consistency is very important. Consistency with the standard library. Consistency with contemporary use of C++. Consistency with other boost libraries. Consistency helps one to grow an intuition for how things work, which in turn reduces the number of times that you have to look up stuff in the reference manual. Python is very good at this, and this is one reason why it is so popular. I think it is intuitive if you allow an explicit conversion to string for a string factory. If people don't use it, fine, no harm done.
Also, if you won't take it from us, please go and take the wisdom from Herb Sutter http://www.gotw.ca/gotw/019.htm "It's almost always a good idea to avoid writing automatic conversions, either as conversion operators or as single-argument non-explicit constructors."
It is almost always a good idea to listen to the wise people and then think for yourself, if what they say really fits to your situation. [Christof Donat, just now]
How does it not fit your situation?
I totally agree with him. Explicit type conversions are ugly as hell, and it is good, that way. Implicit type conversions might be dangerous, so we should think twice, before we add them. But if in this case they don't hurt anyone, then I think there is no good reason to not add them.
Explicit type conversions will not hurt anyone, implicit type conversions might. Hans
So, we've seen a lot of nice ideas, who is going to implement his ideas?
Hi, On 17.02.2017 13:22, Olaf van der Spek via Boost wrote:
So, we've seen a lot of nice ideas, who is going to implement his ideas?
I am currently trying to create a proof of concept. The API now looks like this:

    auto s = std::string{};
    replace(s).with(to_string(42));
    append(hex(42)).to(s);
    assert(s == "422A");
    replace(s).with(concat(42, " "s, hex(42)));
    assert(s == "42 2A");

to_string() and hex() are boost::spirit::karma generators and are therefore probably pretty fast. replace().with() and append().to() both return a reference to the "string" they have been working on and can take an rvalue reference to that string. So this will work as well, without any unnecessary copies:

    auto s = replace(""s).with(to_string(42));
    assert(s == "42");
    auto t = append(hex(42)).to("Test test "s);
    assert(t == "Test test 2A");

    auto the_answer_to_everything() -> std::string {
        return append(to_string(42)).to("The answer to everything is "s);
    }

The functions to_string(), hex(), concat(), join(), format(), etc. return char ranges that can be iterated as well, e.g. for direct output:

    auto r = hex(42);
    std::copy(begin(r), end(r), std::ostream_iterator<char>(std::cout));

The only restrictions that append() puts on the "string" it appends to are that std::back_inserter() has to be available and that char has to be assignable to the value type of that back_insert_iterator. For replace(), the "string" also has to provide a member function clear(). It works on e.g. std::vector<char>, or std::list<int> as well. My current implementation does not care about wide chars, or utf8. I am happy when it works with simple chars. At the moment I am struggling with concat() and I haven't yet started with join() and format().

Plans:
1. get concat() and join() working
2. extend replace().with() and append().to() to implicitly use to_string() on everything that is not an iterable char range
3. extend append().to() with a variant that takes an output iterator instead of a "string"-like object
4. add concept checks for better error messages
5. think about wide chars and utf8
6. try with format()
7. think about how we can determine the size of the resulting string in advance and call resize() in the appender and the replacer before we iterate over the generating range, if available.

Since for me this is a hobby project, things are going slowly. I don't have too much time I can spend on it, but it will be helpful for another, bigger hobby project. If there is someone who needs such a library more urgently, I am happy to share my code, so he can help. I just have to clean it up a little bit before. Christof
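The append().to() and replace().with() calls described in this message can be approximated in a few lines. The following is a minimal sketch, not Christof's proof of concept: it does not use boost::spirit::karma, `hex` is a hand-written helper that returns a std::string, `appender` and `replacer` are hypothetical class names, and `replace()` accepts only lvalue targets to keep the example short. The target only needs to work with std::back_inserter (plus clear() for replace), as the message states.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Uppercase hexadecimal digits of an unsigned value, returned as a char range.
inline std::string hex(unsigned value) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    do {
        out.insert(out.begin(), digits[value % 16]);
        value /= 16;
    } while (value != 0);
    return out;
}

template <typename Range>
class appender {
public:
    explicit appender(Range r) : range_(std::move(r)) {}

    // Lvalue target: append in place and return a reference to it.
    template <typename Target>
    Target& to(Target& target) const {
        std::copy(std::begin(range_), std::end(range_), std::back_inserter(target));
        return target;
    }

    // Rvalue target: append to the temporary, then move it out to the caller.
    template <typename Target,
              typename = std::enable_if_t<!std::is_lvalue_reference<Target>::value>>
    Target to(Target&& target) const {
        to(target);  // 'target' is an lvalue here, so the overload above runs
        return std::move(target);
    }

private:
    Range range_;
};

template <typename Range>
appender<Range> append(Range r) { return appender<Range>(std::move(r)); }

template <typename Target>
class replacer {
public:
    explicit replacer(Target& target) : target_(target) {}

    // Clears the target, then copies the range into it.
    template <typename Range>
    Target& with(const Range& range) const {
        target_.clear();
        std::copy(std::begin(range), std::end(range), std::back_inserter(target_));
        return target_;
    }

private:
    Target& target_;
};

template <typename Target>
replacer<Target> replace(Target& target) { return replacer<Target>(target); }

inline void append_example() {
    std::string s;
    replace(s).with(std::to_string(42));
    append(hex(42)).to(s);
    assert(s == "422A");
}
```

Because the sketch only relies on std::back_inserter, the same `append(hex(42)).to(v)` call also compiles for a std::vector<char>. Pre-sizing the target, as in plan item 7, is not attempted here.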
On 27/01/2017 03:04, Hans Dembinski wrote:
You don't know why the user added the overloads for foo, perhaps she suddenly had to adapt foo so that it also works with C code which uses a lot of const char*. As a designer, you have no control over other peoples' interfaces.
That wouldn't be a problem, because no sane implicit conversion could ever return const char*. And even if std::string had its own conversion (which it doesn't), the compiler won't chain two implicit conversions, so it can't be ambiguous. For string vs. string_view: I don't think returning string_view is practical for concat/join anyway, since they don't have a pre-assembled string to return a view onto; the string isn't constructed until the actual concat call, and this can't return a view as it has nowhere to store the "real" string to keep that view valid. The only other one that seems more likely is if the factories could output std::wstring as well as std::string, since it's reasonable that user code would have overloads for both (or accept basic_string). In practice though I don't think even this would be a problem, as:

    std::string x = concat("foo", "bar");
    std::wstring y = concat(L"foo", L"bar");
    std::wstring z = concat("foo", "bar");

x and y are well-formed; z is not. Converting between character types is a potentially lossy operation (and ambiguous in the case of wstring to string -- did you mean to convert to UTF-8 or some other encoding?) and as such never makes sense as an implicit conversion. If you want that, you will have to put in an explicit conversion request, either on the output of concat or on both of its inputs. (Similarly concat should not accept mixed inputs as parameters either.) So I don't see these as real concerns.
Hi Gavin,
On 26 Jan 2017, at 23:28, Gavin Lambert <gavinl@compacsort.com> wrote:
On 27/01/2017 03:04, Hans Dembinski wrote:
You don't know why the user added the overloads for foo, perhaps she suddenly had to adapt foo so that it also works with C code which uses a lot of const char*. As a designer, you have no control over other peoples' interfaces.
That wouldn't be a problem, because no sane implicit conversion could ever return const char*. And even if std::string had its own conversion (which it doesn't), the compiler won't chain two implicit conversions, so it can't be ambiguous.
It was just an example to illustrate the general danger of implicit casts.
The only other one that seems more likely is if the factories could output std::wstring as well as std::string, since it's reasonable that user code would have overloads for both (or accept basic_string). In practice though I don't think even this would be a problem, as:
std::string x = concat("foo", "bar"); std::wstring y = concat(L"foo", L"bar"); std::wstring z = concat("foo", "bar");
So let's say concat returns a factory with implicit casts to std::string and std::wstring. And I have function foo(…) which accepts std::string and std::wstring. Now if I write foo(concat(42)) I get an ambiguity again. Hans
On 27/01/2017 23:48, Hans Dembinski wrote:
The only other one that seems more likely is if the factories could output std::wstring as well as std::string, since it's reasonable that user code would have overloads for both (or accept basic_string). In practice though I don't think even this would be a problem, as:
std::string x = concat("foo", "bar"); std::wstring y = concat(L"foo", L"bar"); std::wstring z = concat("foo", "bar");
So let's say concat returns a factory with implicit casts to std::string and std::wstring. And I have function foo(…) which accepts std::string and std::wstring. Now if I write
foo(concat(42))
I get an ambiguity again.
concat should never provide implicit casts to both string and wstring, because that's silly. (Read further in my original message, where I mention that z above should be a compile error.) concat("foo", L"bar") should also be a compile error. If you use char-based parameters, it should only provide the conversion to string. If you use wchar_t-based parameters, it should only provide the conversion to wstring. If you don't use either, then either it needs to default to one or the other (most likely string), or you'd have to explicitly specify, eg. concat<char>. Which one of these would actually happen is likely to arise as an implementation detail, but the second seems more likely to me as part of the argument filtering.
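The rule Gavin describes (one character type in, exactly one string type out, mixed inputs rejected) can be expressed by deducing a single character type for all arguments. The sketch below assumes a two-argument `concat` over C strings and a hypothetical `concat_result` class; it is not the thread's proposed implementation, and C++17 is assumed.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical result type: converts only to the string type matching CharT.
template <typename CharT>
class concat_result {
public:
    explicit concat_result(std::basic_string<CharT> text) : text_(std::move(text)) {}
    operator std::basic_string<CharT>() const { return text_; }

private:
    std::basic_string<CharT> text_;
};

// Both arguments must share one character type; otherwise deduction of CharT
// fails and the call does not compile.
template <typename CharT>
concat_result<CharT> concat(const CharT* a, const CharT* b) {
    std::basic_string<CharT> text(a);
    text += b;
    return concat_result<CharT>(std::move(text));
}

// Detection idiom: is concat(A, B) a well-formed call?
template <typename A, typename B, typename = void>
struct can_concat : std::false_type {};
template <typename A, typename B>
struct can_concat<A, B,
                  std::void_t<decltype(concat(std::declval<A>(), std::declval<B>()))>>
    : std::true_type {};

static_assert(can_concat<const char*, const char*>::value, "");
static_assert(can_concat<const wchar_t*, const wchar_t*>::value, "");
static_assert(!can_concat<const char*, const wchar_t*>::value, "mixed widths rejected");

// A char-based result converts to std::string but not to std::wstring.
static_assert(std::is_convertible<concat_result<char>, std::string>::value, "");
static_assert(!std::is_convertible<concat_result<char>, std::wstring>::value, "");

inline void concat_example() {
    std::string x = concat("foo", "bar");
    std::wstring y = concat(L"foo", L"bar");
    assert(x == "foobar");
    assert(y == L"foobar");
}
```

Since each result type offers a single conversion, an overload set taking std::string and std::wstring stays unambiguous for it. A call without any character-typed argument, such as concat(42), is outside this sketch.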
Template specialisations of free functions are always a bad idea - they don't play nicely with ADL. So I would not recommend convert<std::string>(join(...)) etc.
Please elaborate more on that. I can't see it.
imagine:

    namespace boost {
        // the concept
        template<class To> To convert(joiner const& j);

        // some specialisations
        template<> std::string convert<std::string>(joiner const& j) { ... }
        template<> std::wstring convert<std::wstring>(joiner const& j) { ... }
    }

then in user code:

    template<class T> void do_something(boost::joiner const& j) {
        using boost::convert;
        auto v = something(convert<T>(j));
        something_else(v);
    }

now someone wishes to provide their own converter, not in the boost namespace:

    namespace user {
        struct UserRepresentation;

        // this is not allowed. There is not already a general template called
        // user::convert<>. It's called boost::convert<>
        template<> UserRepresentation convert<UserRepresentation>(boost::joiner const& j) { ... }
    }

As mentioned in the comments, this is not allowed. So the user is now forced to specialise in the boost namespace. This is not the boost way (see boost::hash), and for good reason. Namespaces are for separation. This forces crowding of the boost namespace. Also, specialising templates in foreign namespaces is a source of user confusion. Google mentioned this in their proposal to make std::hash behave more like boost::hash (std::hash being an example of the standards committee turning a fantastic tool into an incomplete shambles, because they left out the bits that make it work well). If you want a template convert function (and I don't, but I can imagine that some might), then the model to follow would be that of boost::hash<>. This uses a function object (boost::hash) which then calls out through a namespace collector and finally to the ADL hash_value function. In our case it would want to call out to a function like convert(boost::tag<T>, boost::joiner const& j) -> T.
The general form would be:

    namespace boost {
        template<class T> struct join_converter {
            decltype(auto) operator()(joiner const& j) const {
                return convert(tag<T>(), j); // unqualified call, resolved by ADL
            }
        };
    }

The signature of the general converter would then be

    auto convert(boost::tag<T>, joiner const& j) -> T

and the user's non-template overload would be:

    namespace user {
        auto convert(boost::tag<UserRepresentation>, boost::joiner const& j) -> UserRepresentation;
    }

boost would never define a convert function. Conversions for std::string, std::wstring etc would be specialisations of boost::join_converter. This allows ADL to find non-template converters in namespaces associated with the thing that they are converting to. The generic user calling function would then look more like this:

    template<class T> void do_something(boost::joiner const& j) {
        auto convert = boost::join_converter<T>();
        auto v = something(convert(j));
        something_else(v);
    }

I'm afraid that this "class template which calls free function" dance is necessary to allow calls to ADL free functions to search beyond the boost namespace. Again, see boost::hash for the gory details.

On 26 January 2017 at 14:27, Christof Donat <cd@okunah.de> wrote:
Hi,
On 26.01.2017 10:47, Richard Hodges wrote:
I don't see any particular reason why you can't provide all of them,
though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts.
There's a problem with implicit conversion.
imagine the overloads function:
void foo(std::string const& s);
[...]
foo(join(...));
[...]
void foo(std::string_view s); void foo(const char* s);
Now the above code will not compile, because there are 3 ambiguous overloads.
But that is only an issue for users, who have relied on the implicit conversion upfront. You can always use an implicit conversion explicitly as well, of course, and implicit conversion will be just one of multiple options, if we add it.
The moral of the story is that if we are going to provide conversion
operators, they need to be explicit anyway.
Up to now, you haven't made a point here. Just those users, who decide to rely on implicit conversion, instead of one of the explicit variants, have to cope with the downsides of implicit conversion. Others don't. I think that fits well with "don't pay for what you don't use".
Template specialisations of free functions are always a bad idea - they
don't play nicely with ADL. So I would not recommend convert<std::string>(join(...)) etc.
Please elaborate more on that. I can't see it.
Christof
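The tag-dispatch scheme Richard outlines in his reply can be tried out in isolation. The sketch below is an assumption-laden reduction: namespace `lib` stands in for the library namespace, `joiner` simply holds already-joined text, and `UserRepresentation` stores only a length. The function object makes an unqualified call to `convert`, so argument-dependent lookup also searches the namespace of the type named in the tag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace lib {  // hypothetical stand-in for the library namespace

template <class T>
struct tag {};

struct joiner {
    std::string text;  // already-joined text, to keep the sketch short
};

// Function object that forwards to whichever convert() ADL finds for T.
template <class T>
struct join_converter {
    T operator()(const joiner& j) const {
        return convert(tag<T>{}, j);  // unqualified: ADL also searches T's namespace
    }
};

// Built-in targets are specialisations of the function object, so the library
// itself never has to define a free convert() function.
template <>
struct join_converter<std::string> {
    std::string operator()(const joiner& j) const { return j.text; }
};

}  // namespace lib

namespace user {

struct UserRepresentation {
    std::size_t length;
};

// Plain non-template overload in the user's own namespace; nothing is
// specialised inside namespace lib.
inline UserRepresentation convert(lib::tag<UserRepresentation>, const lib::joiner& j) {
    return UserRepresentation{j.text.size()};
}

}  // namespace user

// Generic caller: works for built-in and user-provided targets alike.
template <class T>
T convert_joined(const lib::joiner& j) {
    return lib::join_converter<T>()(j);
}
```

`user::convert` is found because `lib::tag<user::UserRepresentation>` has `user` among its associated namespaces, which is the effect the hash_value-style design relies on.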
Hi,
This is not the boost way (see boost::hash), and for good reason.
Well, the extension points to boost::spirit work by specializing e.g. is_container<>, or is_string<>. See http://www.boost.org/doc/libs/1_61_0/libs/spirit/doc/html/spirit/advanced/cu... That is the same for traits in the standard library, so it actually is a common pattern. I understand that e.g. std::swap does it the way you propose. On 26.01.2017 15:04, Richard Hodges wrote:
Template specialisations of free functions are always a bad idea - they don't play nicely with ADL. So I would not recommend convert<std::string>(join(...)) etc.
Please elaborate more on that. I can't see it.
imagine: [...] now someone wishes to provide their own converter, not in the boost namespace:
Actually I prefer to define a concept for targets of something like convert(). The generic convert() then just works with that concept. Then only users who want to write to something that does not fulfill the concept have to provide a specialization. We might also provide a few specializations, like char*, that do not obey the defined concept. Actually I'd avoid the word "convert" here, because I'd use convert() for a function that returns a string factory to convert a single value. Other than concat(), the "string factory" returned by convert() will be able to convert in different ways, similar to boost::lexical_cast<>. Currently my favorite API looks like this:

    auto i = convert("42"s).to<int>(); // i = 42
    // convert can convert from string to objects as well, like lexical_cast
    auto s = convert(i).to<std::string>(); // s == "42"
    hex(i).to(s); // s == "2A"
    convert(23).to(s); // s = "23"
    // convert(), hex(), and similar functions return string factories.

    append(concat(" - the hex representation of ", i, " is ", hex(i))).to(s);
    // s = "23 - the hex representation of 42 is 2A"
    concat("the hex representation of ", i, " is ", hex(i)).to(s);
    // s = "the hex representation of 42 is 2A"
    append(" Yeah!").to(s);
    // s = "the hex representation of 42 is 2A Yeah!"

    auto my_numbers = std::vector<int>{11, 12, 13, 14, 15};
    join(separator(", "), my_numbers).to(s);
    // s = "11, 12, 13, 14, 15"
    join(separator(", "), std::begin(my_numbers), std::end(my_numbers)).to(s);
    // s = "11, 12, 13, 14, 15"
    join(separator(", "), my_numbers | hex).to(s);
    // s = "B, C, D, E, F"
    // join() works with iterator pairs, ranges and range expressions

    append(format(" - the hex representation of {2} is {1}", hex(i), i)).to(s);
    // s = "B, C, D, E, F - the hex representation of 42 is 2A"
    format("the hex representation of {2} is {1}", hex(i), i).to(s);
    // s = "the hex representation of 42 is 2A"

    auto v = hex(i).to<std::vector<char>>(); // v == {'2', 'A'}
    append(convert(i)).to(v); // v == {'2', 'A', '4', '2'}

It almost reads like English cluttered with some weird punctuation. Of course I have no issue when .to() and .append_to() resort to a free function that can be overloaded using ADL, like you had proposed:

    class string_factory {
    public:
        // ...
        template <typename TargetT>
        auto to(TargetT& t) -> TargetT& { return stringify_to(*this, t); }

        template <typename TargetT>
        auto to() -> TargetT {
            TargetT r;
            to(r);
            return r;
        }
    };

Is there a problem for ADL when stringify_to() is a template on the string factory?

    template <typename StringFactory>
    myTargetType& stringify_to(StringFactory& f, myTargetType& t) {
        // ...
    }

Then we can avoid virtual function calls in the string factory. Christof
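The closing question can be explored with a small experiment. The sketch below covers one case only: a user who adds an overload of `stringify_to` (not a specialisation) in the namespace of their own target type, templated on the factory. The names `lib`, `user::Log`, and the `text()` accessor are assumptions for illustration, and the default `stringify_to` simply clears the target and copies characters into it. It says nothing about the specialisation case discussed elsewhere in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

namespace lib {  // hypothetical library namespace

// Default: replace the target's contents with the factory's characters.
template <typename Factory, typename Target>
Target& stringify_to(const Factory& f, Target& t) {
    const std::string& text = f.text();
    t.clear();
    std::copy(text.begin(), text.end(), std::back_inserter(t));
    return t;
}

class string_factory {
public:
    explicit string_factory(std::string text) : text_(std::move(text)) {}
    const std::string& text() const { return text_; }

    template <typename TargetT>
    TargetT& to(TargetT& t) const {
        return stringify_to(*this, t);  // unqualified, so ADL can add candidates
    }

    template <typename TargetT>
    TargetT to() const {
        TargetT r;
        to(r);
        return r;
    }

private:
    std::string text_;
};

}  // namespace lib

namespace user {

struct Log {
    std::vector<std::string> lines;
};

// Overload templated on the factory type, found through the Log argument.
// It is more specialised than the default, so overload resolution prefers it.
template <typename Factory>
Log& stringify_to(const Factory& f, Log& log) {
    log.lines.push_back(f.text());
    return log;
}

}  // namespace user
```

In this arrangement no virtual call is involved: the factory type is a template parameter of both the default and the user's overload.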
Of course I have no issue, when .to() and .append_to() resort to a free function, that can be overloaded using ADL, like you had proposed:
If member functions are resolving to ADL free functions via a function object, I'm fully ok with that.
Is there a problem for ADL, when stringify_to() is a template on the string factory?
Yes, unfortunately. The template has the name `boost::stringify_to` regardless of the types. But if you go via a template functor object that uses ADL to find the free function, then the free function does not have to be a template function, and ADL will work as expected. In order to make the free function non-template you need to pass some trivial identifier from the function's namespace as an argument. So you'll need a tag type.
So:
    namespace boost {
        template<class Tag> struct string_factory {
            using type = typename Tag::type;
            type operator()(joiner const& j) const {
                return stringify_to(Tag(), j);
            }
        };
    }
Which gets syntactically nasty, which is why we have names like to_string(), hash_code() and so on... :)
On 26 January 2017 at 18:08, Christof Donat <cd@okunah.de> wrote:
Hi,
This is not the boost way (see boost::hash), and for good reason.
Well, the extension points to boost::spirit work by specializing e.g. is_container<>, or is_string<>. See http://www.boost.org/doc/libs/1_61_0/libs/spirit/doc/html/spirit/advanced/customize/is_container.html That is the same for traits in the standard library, so it actually is a common pattern. I understand that e.g. std::swap does it the way you propose.
On 26.01.2017 15:04, Richard Hodges wrote:
Template specialisations of free functions are always a bad idea - they
don't play nicely with ADL. So I would not recommend convert<std::string>(join(...)) etc.
Please elaborate more on that. I can't see it.
imagine: [...] now someone wishes to provide their own converter, not in the boost namespace:
Hi, On 26.01.2017 19:44, Richard Hodges wrote:
Is there a problem for ADL, when stringify_to() is a template on the string factory?
Yes unfortunately. The template has the name `boost::stringify_to` regardless of the types. But if you go via a template functor object that uses ADL to find the free function, then the free function does not have to be a template function, and ADL will work as expected. In order to make the free function non-template you need to pass some trivial identifier from the function's namespace as an argument. So you'll need a tag type.
so:
    namespace boost {
        template<class Tag> struct string_factory {
            using type = typename Tag::type;
            type operator()(joiner const& j) const {
                return stringify_to(Tag(), j);
            }
        };
    }
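The tag-based dispatch described above can be tried out with a small stand-alone sketch. All names here (`lib::joiner`, `lib::converter`, `my_app::shouty_tag`, `my_app::shouty_string`) are hypothetical stand-ins; the only point is that the free function is a plain, non-template overload found through the namespace of the tag.

```cpp
#include <string>

namespace lib {

// Simplified stand-in for the library's joiner object (assumption).
struct joiner {
    std::string text;
};

// Function object: the unqualified call below is resolved by ADL at
// instantiation time, using the namespace in which Tag is declared.
template <class Tag>
struct converter {
    using type = typename Tag::type;
    type operator()(joiner const& j) const {
        return stringify_to(Tag(), j);
    }
};

} // namespace lib

namespace my_app {

// User-defined result type and its tag; neither lives in namespace lib.
struct shouty_string {
    std::string value;
};
struct shouty_tag {
    using type = shouty_string;
};

// Plain (non-template) overload, reachable only through the tag's namespace.
inline shouty_string stringify_to(shouty_tag, lib::joiner const& j) {
    shouty_string r{j.text};
    r.value += "!";
    return r;
}

} // namespace my_app
```

A call such as `lib::converter<my_app::shouty_tag>()(lib::joiner{"hi"})` then picks `my_app::stringify_to` without anything being added to namespace `lib`.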
I think we misunderstood each other. I want to have multiple string factories, because the concat() string factory will work differently than the format() string factory. I could do that with virtual functions, but that adds computational overhead at runtime. If I could templatize on the string factory, while I overload on the object to write to, I can resolve the string factory behavior at compile time and therefore get rid of the virtual function call. Virtual function calls are suboptimal for the branch prediction of modern CPUs, and usually cannot be inlined.
This explanation suggests that my approach should work: http://en.cppreference.com/w/cpp/language/adl
If there is a template function with that name available by ordinary lookup, then Koenig lookup kicks in and finds the correct overloaded template function.
    namespace boost {
        template <typename StringFactory>
        std::string& stringify_to(StringFactory& f, std::string& t) {
            // ...
        }
        template <typename StringFactory>
        std::wstring& stringify_to(StringFactory& f, std::wstring& t) {
            // ...
        }
        class string_factory {
        public:
            // ...
            template <typename TargetT>
            auto to(TargetT& t) -> TargetT& { return stringify_to(*this, t); }
            // ...
        };
    }
Here the ordinary lookup will find the standard overloads of the stringify_to template function. Then there is no syntax error any more, and Koenig lookup will find the overloaded template function in the target object's namespace as well:
    namespace my_target {
        template <typename StringFactory>
        myTargetType& stringify_to(StringFactory& f, myTargetType& t) {
            // ...
        }
    }
Now concat(...).to<myTargetType>() should find this template. Did I misinterpret the explanation on cppreference, or is it wrong?
Christof
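The question above can be checked with a reduced example. The sketch below assumes simplified, hypothetical names (`lib::concat_factory`, `my_target::log_line`); it templates `stringify_to` on the factory and overloads it on the target, with the user's overload living in the target's namespace.

```cpp
#include <string>

namespace lib {

// Library-provided target: templated on the factory, overloaded on the target.
template <typename Factory>
std::string& stringify_to(Factory const& f, std::string& t) {
    t = f.text();
    return t;
}

// One concrete factory; a format() factory could be a second, unrelated type.
struct concat_factory {
    std::string a, b;
    std::string text() const { return a + b; }

    template <typename Target>
    Target& to(Target& t) const {
        // Unqualified call: ordinary lookup sees lib::stringify_to,
        // ADL adds overloads from the namespace of Target.
        return stringify_to(*this, t);
    }
};

} // namespace lib

namespace my_target {

struct log_line {
    std::string data;
};

// User overload, also a template on the factory type.
template <typename Factory>
log_line& stringify_to(Factory const& f, log_line& t) {
    t.data = "[log] " + f.text();
    return t;
}

} // namespace my_target
```

Because the second argument has type `my_target::log_line`, namespace `my_target` is an associated namespace of the call, so the function template declared there takes part in overload resolution and no virtual call is needed.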
On 25.01.2017 23:55, Gavin Lambert wrote:
On 26/01/2017 05:23, Christof Donat wrote:
to_string(x) is better than x.str() is better than implicit conversion.
Maybe it is just my strong distaste of the name to_string(), which at the moment is just a feeling, no good reasons. It just feels clumsy to me, while .str() is in line with regular expression matches, and stringstreams in the standard library.
Clumsy or not, it's in the standard now. [1]
Just as .str() is - in stringstream and in regular expression matches :-)
The issue is that, the way to_string() was explained, it seemed obvious to me that it should not only be able to produce strings, but also e.g. wstrings. The original proposal was to use a different name for every potential output:
    to_string(concat(...));
    to_wstring(concat(...));
I'd prefer a more generic approach, because this does not give us many more possibilities than .str(). The worst proposal was to_utf8(), because then the string factories have to provide some representation that can be converted to any character encoding later. I'd really prefer to set the character encoding on the string factory. Therefore I have another proposal:
    to<std::string>(concat(...));
    to<std::wstring>(concat(...));
    to<std::string>(concat(...).encoding(utf8));
    to<std::string>(concat(...).encoding(latin1));
    to<std::wstring>(concat(...).encoding(utf16));
    to<std::string>(concat(...).encoding(utf16)); // error!
    to<std::vector<char>>(concat(...));
    to<QString>(concat(...));
or alternatively:
    concat(...).to<std::string>();
    concat(...).to<std::wstring>();
    concat(...).encoding(utf8).to<std::string>();
    concat(...).encoding(latin1).to<std::string>();
    concat(...).encoding(utf16).to<std::wstring>();
    concat(...).encoding(utf16).to<std::string>(); // error!
    concat(...).to<std::vector<char>>();
    concat(...).to<QString>();
Both versions can have overloads to reuse existing result values. In that case we actually don't need the template parameter, because it can be derived from the parameter's type.
    to(s, concat(...));
    to(ws, concat(...));
    to(s, concat(...).encoding(utf8));
    to(s, concat(...).encoding(latin1));
    to(ws, concat(...).encoding(utf16));
    to(v, concat(...));
    to(qs, concat(...));
or:
    concat(...).to(s);
    concat(...).to(ws);
    concat(...).encoding(utf8).to(s);
    concat(...).encoding(latin1).to(s);
    concat(...).encoding(utf16).to(ws);
    concat(...).to(v);
    concat(...).to(qs);
I prefer the second version. It reads like an English sentence. We can have the compiler check that encoding(utf16) does not allow to<std::string>(). We just have to make sure it returns a type that only implements to() for wide character types.
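One way the compile-time check mentioned above might look is sketched here. The tag types, the `code_unit` trait and `encoded_factory` are hypothetical, the sketch handles ASCII input only, and it uses `char16_t`/`std::u16string` for UTF-16 instead of `std::wstring` because the width of `wchar_t` differs between platforms.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical encoding tags.
struct utf8_t {};
struct utf16_t {};

// Maps an encoding tag to the code unit type it requires.
template <typename Encoding> struct code_unit;
template <> struct code_unit<utf8_t>  { using type = char; };
template <> struct code_unit<utf16_t> { using type = char16_t; };

// A string factory that has already been bound to an encoding.
template <typename Encoding>
class encoded_factory {
public:
    explicit encoded_factory(std::string ascii) : ascii_(std::move(ascii)) {}

    template <typename Target>
    Target to() const {
        // Reject targets whose character type does not fit the encoding.
        static_assert(
            std::is_same<typename Target::value_type,
                         typename code_unit<Encoding>::type>::value,
            "target character type does not match the selected encoding");
        // ASCII-only sketch: each input char becomes exactly one code unit.
        return Target(ascii_.begin(), ascii_.end());
    }

private:
    std::string ascii_;
};
```

With this sketch, `encoded_factory<utf16_t>("hi").to<std::u16string>()` compiles, while asking the same factory for `to<std::string>()` stops at the `static_assert`.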
str() is shorter, of course. I don't see any particular reason why you can't provide all of them, though (even the implicit conversion), as different things are going to feel more natural to different people, particularly in different contexts.
Yes. With the above proposal, we can easily implement all of them:
    class string_factory {
    public:
        // ...
        template<typename TargetType> auto to() -> TargetType {...}
        template<typename TargetType> auto to(TargetType& t) -> TargetType& {...}
        auto str() -> std::string { return to<std::string>(); }
        auto wstr() -> std::wstring { return to<std::wstring>(); }
        auto str(std::string& s) -> std::string& { return to<std::string>(s); }
        auto wstr(std::wstring& ws) -> std::wstring& { return to<std::wstring>(ws); }
        operator std::string() { return str(); }
        operator std::wstring() { return wstr(); }
    };
    template <typename StringFactory>
    auto to_string(StringFactory& factory) -> std::string { return factory.str(); }
    template <typename StringFactory>
    auto to_wstring(StringFactory& factory) -> std::wstring { return factory.wstr(); }
    template <typename StringFactory>
    auto to_string(std::string& s, StringFactory& factory) -> std::string& { return factory.str(s); }
    template <typename StringFactory>
    auto to_wstring(std::wstring& ws, StringFactory& factory) -> std::wstring& { return factory.wstr(ws); }
Christof
On 24 Jan 2017, at 12:54, Christof Donat <cd@okunah.de> wrote:
That is the adaptor API from boost::range and is the same in ranges::v3. It's not my invention.
I didn't have time to look into boost::range, yet, so I didn't recognise this. Well that's a pity. :((
I think, that composes less elegantly with boost::range or ranges::v3. Maybe we could lean towards their adaptor APIs like this:
join(separator(" "), my_nums | hex<int>());
If you use the design with the unary function in the beginning, it works for boost::range and classic iterators. You can still compose several unary functions by using a lambda.
OK, here you made a point. In a specification I'd leave the behavior undefined. In an actual implementation I'd have the later ones overwrite the previous ones, because I expect that to be easy to implement. We could also try and prevent that with meta programming, but I think, that is not worth the effort. Leaving it unspecified in the specification, still leaves us the option to do that later.
If you can catch the error at compile time, so that it costs nothing at runtime, it is certainly worth the effort.
Still, separator("...") is a bit out of the picture now. Up to now I thought that it is some kind of formatting information and therefore I handled it like other formatting tags. Actually I begin to like the idea of formatting functions that return string factories. That fits very nicely with the rest of the API and makes concat(), join() and format() simpler.
A completely different approach, inspired by Python:
separator(" ").join(my_nums | hex<int>());
-1. It looks artificial in C++. In Python it is okay, because it creates a nice symmetry with .split(…). Here, there is no symmetry with .split. And in any case, both should be methods of std::string then. For a user it will feel quite arbitrary that he/she has to use separator(" ").join(…) instead of the simpler std::string(" ").join(…).
Hi, On 24.01.2017 17:27, Hans Dembinski wrote:
On 24 Jan 2017, at 12:54, Christof Donat <cd@okunah.de> wrote: I think, that composes less elegantly with boost::range or ranges::v3. Maybe we could lean towards their adaptor APIs like this:
join(separator(" "), my_nums | hex<int>());
If you use the design with the unary function in the beginning, it works for boost::range and classic iterators. You can still compose several unary functions by using a lambda.
Sure, but a lambda is still more noise than a list of range adaptors (in ranges::v3 they are called "views").
    join(separator("\n"), my_bytes | hex<int>(fixed_size(2)) | view::chunk(16) | join(separator(" "))).str();
Here I have added another, new idea. When join is called without a range, it returns a range view/adaptor that expects to iterate over a range of ranges and will return a range of string factories. The above code would produce multiple lines with 16 hex representations of bytes each.
OK, here you made a point. In a specification I'd leave the behavior undefined. In an actual implementation I'd have the later ones overwrite the previous ones, because I expect that to be easy to implement. We could also try and prevent that with meta programming, but I think, that is not worth the effort. Leaving it unspecified in the specification, still leaves us the option to do that later.
If you can catch the error at compile time, so that it costs nothing at runtime, it is certainly worth the effort.
The issue is, that doing that is a lot of work. If we really want to do stuff and not just talk about, what could possibly be done, I'd propose to not implement it in the first release, but keep the behavior undefined in the specification, so that it can be implemented later.
A completely different approach, inspired by Python:
separator(" ").join(my_nums | hex<int>());
-1. It looks artificial in C++. In Python it is okay, because it creates a nice symmetry with .split(…). Here, there is no symmetry with .split. And in any case, both should be methods of std::string then. For a user it will feel quite arbitrary that he/she has to use separator(" ").join(…) instead of the simpler std::string(" ").join(…).
The latter is, in my opinion, even worse, because I really think, std::string should not have many member functions. But yes, the syntax I proposed here is not really intuitive in C++. Let's just forget about it. Christof
Hi, On 23.01.2017 19:26, Olaf van der Spek wrote:
On Mon, Jan 23, 2017 at 5:32 PM, Christof Donat <cd@okunah.de> wrote:
1. scope for formatting tags:
concat(format::hex<int>, 42, " is hex for ", concat(42)).str();
Here the inner concat will convert the 42 to its decimal representation, while the outer one converts the first 42 to its hex representation.
Wouldn't concat(hex(42), " is hex for", 42) make more sense?
That is a valid approach for concat() and format(), but suboptimal for join(). Think of this example:
    join(hex<int>, separator(" "), my_nums);
Here all the numbers are converted to their hex representation. With your approach this would look like:
    join(separator(" "), my_nums | transform([](int i) -> int { return hex(i); }));
That is much more difficult to understand. Since, for join(), tag parameters that define the conversion are, to me, the superior choice, I think we should use them for concat() and format() as well, for consistency.
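For comparison, a third spelling passes the per-element conversion to join as an ordinary callable. The sketch below is only an illustration; `join_with` and `to_hex` are hypothetical names and not part of any proposed API.

```cpp
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins a range, passing each element through fmt.
template <typename Range, typename Formatter>
std::string join_with(const std::string& sep, const Range& range, Formatter fmt) {
    std::string out;
    bool first = true;
    for (const auto& element : range) {
        if (!first) out += sep;  // separator goes between elements only
        out += fmt(element);
        first = false;
    }
    return out;
}

// Formatter playing the role of the hex<int> tag: uppercase hex digits.
inline std::string to_hex(int value) {
    std::ostringstream os;
    os << std::hex << std::uppercase << value;
    return os.str();
}
```

For a `std::vector<int>` holding 11 to 15, `join_with(" ", my_nums, to_hex)` yields "B C D E F". A tag such as `hex<int>` would move the same conversion into the library.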
2. concat() in calls to format():
format("%|1$40t|%2%", concat(first_name, " ", last_name), phone_number).str();
Why not fold the name concat into the format string?
In this example I want the full name to take up 40 characters, no matter if the first name is long or short. With format strings as used by boost::format I don't know how that could be achieved, and extending the format language, of course, makes the interpreter slower and more complex.
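The requirement described above, padding the combined name rather than each part, can be illustrated without any formatting library. The helper names `pad_right` and `phone_book_line` are made up for this sketch, which relies only on standard stream manipulators.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: left-aligns text in a field of the given width.
inline std::string pad_right(const std::string& text, int width) {
    std::ostringstream os;
    os << std::left << std::setw(width) << text;
    return os.str();
}

inline std::string phone_book_line(const std::string& first_name,
                                   const std::string& last_name,
                                   const std::string& phone_number) {
    // Concatenate first, then pad the combined name to one 40-character column.
    return pad_right(first_name + " " + last_name, 40) + phone_number;
}
```

Padding the two name parts separately would give a different layout, which is why the concatenation has to happen before the width is applied.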
If "concat" is the outer layer anyway, I would return a std::string directly for convenience. It is easy to forget the trailing .str() and it does not look elegant.
Of course better proposals are welcome :-) Would you prefer the implicit conversion? If so, why?
Implicit is problematic with auto..
That is one of the reasons, I prefer the explicit str() function. It also fits well to other factory functions, we might prefer to have as well, like e.g. concat(...).append_to(my_string), or concat(...).overwrite(my_string). Christof
Sorry to chime in so late in the discussion. What about a syntax similar to this?
    int main() {
        auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
        std::cout << s << std::endl;
        s = join(separator(" : "), "a", "b", std::hex, 200, std::quoted("banana"));
        std::cout << s << std::endl;
    }
Which would produce the following output:
    Hello , World. The hex for 58 is 3a
    a : b : c8 : "banana"
Sample implementation (I/O manipulator handling may be incomplete, and some efficiency gains could be made):
    #include <sstream>
    #include <iostream>
    #include <iomanip>
    #include <string>
    #include <type_traits>
    #include <utility>
    namespace detail {
        template<class SepStr> struct separator_object {
            // Ordinary values: emit the separator, then the value.
            template<class T> std::ostream& operator ()(std::ostream& s, T&& t) const {
                return s << sep << t;
            }
            //
            // other iomanip specialisations here
            //
            // Manipulators such as std::hex: apply them without emitting a separator.
            std::ostream& operator ()(std::ostream& s, std::ios_base&(*t)(std::ios_base&)) const {
                t(s);
                return s;
            }
            SepStr const& sep;
        };
        struct no_separator_object {
            template<class T> std::ostream& operator ()(std::ostream& s, T&& t) const {
                return s << t;
            }
        };
        template<class Separator, class String, class...Rest>
        auto join(Separator&& sep, String&& s, Rest&&...rest) {
            std::ostringstream ss;
            ss << s;
            // Expand the pack in order, streaming each remaining argument.
            using expand = int [];
            void(expand{0, ((sep(ss, rest)), 0)...});
            return ss.str();
        }
    }
    template<class Sep> static constexpr auto separator(Sep const& sep) {
        using sep_type = std::remove_const_t<std::remove_reference_t<Sep>>;
        return detail::separator_object<sep_type> { sep };
    }
    // The separator is taken by value so that this overload, not the generic
    // one below, is selected for separator(...) temporaries.
    template<class SepObject, class String, class...Rest>
    auto join(detail::separator_object<SepObject> sep, String&& s, Rest&&...rest) {
        return detail::join(sep, std::forward<String>(s), std::forward<Rest>(rest)...);
    }
    template<class String, class...Rest>
    auto join(String&& s, Rest&&...rest) {
        return detail::join(detail::no_separator_object(), std::forward<String>(s), std::forward<Rest>(rest)...);
    }
    int main() {
        auto s = join("Hello ", ", World.", " The hex for ", 58, " is ", std::hex, 58);
        std::cout << s << std::endl;
        s = join(separator(" : "), "a", "b", std::hex, 200, std::quoted("banana"));
        std::cout << s << std::endl;
    }
On 16/01/2017 09:54, Olaf van der Spek wrote:
http://abel.web.elte.hu/mpllibs/safe_printf/snprintf.html
It appears to only do checking at compile-time and then forwards to sprintf..
Yes, that is the current implementation, but based on the expression template resulting from the parsing, one could easily produce efficient formatting code at compile time.
But I wasn't clear enough; I just wanted to say that with the underlying library, metaparse, one can do the compile-time dispatching. Naturally, I think the compile-time cost would surely be higher than with the cat() solution. -- Damien Buhl
On Mon, 16 Jan 2017 21:23:58 +0100 Damien Buhl <damien.buhl@lecbna.org> wrote:
On 16/01/2017 09:54, Olaf van der Spek wrote:
http://abel.web.elte.hu/mpllibs/safe_printf/snprintf.html
It appears to only do checking at compile-time and then forwards to sprintf..
Yes, that is the current implementation, but based on the expression template resulting from the parsing, one could easily produce efficient formatting code at compile time.
I have a project http://code.leeclagett.com/prima (currently just a redirect to github) which generates a spirit::karma expression from a C-string literal using metaparse. The functions are closely modeled on the C format functions, but take an output iterator or a std::ostream instead. The functions are also constexpr objects, so they should work with the Fit library. For some reason the README.md does not mention the `fprintf` function, but it is currently in the repo too (the thread-safe variant has not been pushed out yet). There are a decent number of tests that compare the results against the system C formatting functions. The compile times are not great; metaparse, proto, and spirit v2 are a rough combination. And the documentation is not 100% accurate - a number of format string errors are not trapped at compile-time yet. I'm hoping to get a nicer error reporting system, currently the wrong type can yield an impenetrable error from within spirit. The "backend" is currently configurable, and if you can make sense of my "IR" from the metaparse frontend an output system that does not use spirit could be written. It would likely result in faster compile times, but I didn't want to write a floating point generator since there was much to experiment with on the interface and format string parsing side. Perhaps something to do if there is interest ...
But I wasn't clear enough; I just wanted to say that with the underlying library, metaparse, one can do the compile-time dispatching. Naturally, I think the compile-time cost would surely be higher than with the cat() solution.
Lee
On 01/03/17 17:23, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 3:16 PM, Christof Donat <cd@okunah.de> wrote:
Hi,
On 03.01.2017 14:20, Olaf van der Spek wrote:
On Tue, Jan 3, 2017 at 2:19 PM, Christof Donat <cd@okunah.de> wrote:
On 01.01.2017 00:21, Andrey Semashev wrote:
throw std::runtime_error(format(std::string()) << "Error " << 47);
How would that differ from
throw std::runtime_error((std::ostringstream{} << "Error " << 47).str());
Simpler syntax, better performance
I see the chances for better performance, but for the syntax I don't really see any remarkable improvements.
The extra parentheses and the .str() part are annoying.. same goes for boost::format.
How about this one?
throw std::runtime_error("Error "s << 47);
Well, if we're using UDLs, we might as well use my `format` proposal or a stream or Boost.Format behind the scenes.
    throw std::runtime_error("Error "fmt << 47);
    throw std::runtime_error("Error "strm << 47);
    throw std::runtime_error("Error %d"fmt % 47);
Each of the UDL operators would create a wrapper that implements formatting and is convertible to std::string. No need to infect std::string itself with formatting.
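A wrapper of the kind described above could look roughly like the following sketch. The class name `stream_formatter` and the suffix `_fmt` are assumptions; user code has to use a suffix with a leading underscore, since suffixes without one are reserved for the standard library. The sketch formats through `std::ostringstream` for brevity, so it shows the syntax rather than any performance gain.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical wrapper: collects streamed values, converts to std::string.
class stream_formatter {
public:
    explicit stream_formatter(std::string prefix) : text_(std::move(prefix)) {}

    // Formats one value with the usual stream inserter and appends it.
    template <typename T>
    stream_formatter& operator<<(const T& value) {
        std::ostringstream os;
        os << value;
        text_ += os.str();
        return *this;
    }

    // Implicit conversion, so the wrapper can be passed where a string is expected.
    operator std::string() const { return text_; }

private:
    std::string text_;
};

// User-defined literal that starts a formatter from a string literal.
inline stream_formatter operator""_fmt(const char* s, std::size_t n) {
    return stream_formatter(std::string(s, n));
}
```

With this in place, `throw std::runtime_error("Error "_fmt << 47);` compiles, and `std::string` itself is left untouched.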
On Thu, Dec 29, 2016 at 2:53 PM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
On 12/29/16 11:54, Olaf van der Spek wrote:
On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <ml@vdspek.org> wrote:
One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
Can't we already write it through (((s += "A") += "B") += to_string(42))? This is the time I think that assignment operators, other than =, should have had left associativity... pity they don't.
We can, but it's ugly and I'd like to avoid the explicit to_string. It also wouldn't allow the two-pass optimization to calculate the final length before allocation.
I already mentioned in the std-proposals discussion that I don't think formatting should be dealt with by std::string or a function named append(). If formatting is to be involved I'd suggest creating a formatting library, but at that point you should provide clear advantages over the other formatting libraries we have in Boost.
Or that exist elsewhere, e.g. https://github.com/fmtlib/fmt --DD
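The two-pass optimization mentioned earlier in this exchange (compute the final length, allocate once, then append) can be sketched as follows. This is a minimal illustration, not a proposed interface: the namespace `sketch` and the helpers `piece_size` and `append_piece` are hypothetical, and integers still go through `std::to_string` for brevity.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

namespace sketch {

// Pass 1 helpers: length of one argument once rendered.
inline std::size_t piece_size(const std::string& v) { return v.size(); }
inline std::size_t piece_size(const char* v) { return std::strlen(v); }
inline std::size_t piece_size(long long v) {
    std::size_t n = (v < 0) ? 1 : 0;  // room for the minus sign
    do { ++n; v /= 10; } while (v != 0);
    return n;
}

// Pass 2 helpers: append one argument.
inline void append_piece(std::string& s, const std::string& v) { s += v; }
inline void append_piece(std::string& s, const char* v) { s += v; }
inline void append_piece(std::string& s, long long v) { s += std::to_string(v); }

template <typename... Pieces>
std::string& append(std::string& s, const Pieces&... pieces) {
    using expand = int[];
    // Pass 1: add up the lengths so the string grows at most once.
    std::size_t extra = 0;
    (void)expand{0, (extra += piece_size(pieces), 0)...};
    s.reserve(s.size() + extra);
    // Pass 2: append the pieces in order.
    (void)expand{0, (append_piece(s, pieces), 0)...};
    return s;
}

} // namespace sketch
```

A call like `sketch::append(s, "A", "B", 42)` then needs no explicit to_string at the call site, which addresses both objections raised against the chained `+=` form.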
On Thu, Dec 29, 2016 at 1:19 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
On Tue, Dec 27, 2016 at 4:47 PM, Olaf van der Spek <ml@vdspek.org> wrote:
One frequently needs to append stuff to strings, but the standard way (s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
Can't we already write it through (((s += "A") += "B") += to_string(42))? This is the time I think that assignment operators, other than =, should have had left associativity... pity they don't.
<< does the trick:
    s << "A" << "B" << 42;
    std::string& operator<<(std::string& os, std::string_view v) { return os += v; }
    std::string& operator<<(std::string& os, long long v) { return os += std::to_string(v); }
-- Olaf
participants (16)
- Andrey Semashev
- Billy O'Neal (VC LIBS)
- Bjorn Reese
- Christof Donat
- Damien Buhl
- damien.buhl@lecbna.org
- Dominique Devienne
- Gavin Lambert
- Hans Dembinski
- Jeff Garland
- Lee Clagett
- Olaf van der Spek
- Peter Dimov
- Richard Hodges
- Roberto Hinz
- Yakov Galka