On Mon, Aug 22, 2022 at 9:04 AM Phil Endecott via Boost
auto r1 = f1(); // r1 is a string, OK. auto r2 = f2(); // r2 is a string, not a string&, so it's safe. auto& r2a = f2(); // r2a is a string&, but the user knows that because they wrote a &. auto r3 = f3(); // r3 is a string_view, danger!
Ah, yes I see. To me the problem here is auto. In other words, the issue is not specific to URLs; it is general for any user that uses auto to capture the result of a function that returns a reference-like type over its arguments. For example, I see no problems here: std::string f1(); std::string& f2(); std::string_view f3(); std::string r1 = f1(); // r1 is a string, OK. std::string r2 = f2(); // r2 is a string, not a string&, so it's safe. std::string& r2a = f2(); // r2a is a string&, but the user knows that because they wrote a &. std::string_view r3 = f3(); // r3 is a string_view, danger! Instead of burdening every function that has ever been written, has been written, and will ever be written with the responsibility of communicating its return type twice - once in the actual return type and again in the function name, it seems to me that a more logical solution is that the author of the code chooses whether to opt-in to brevity (using auto) or opt-in to clarity (naming the return type).
I feel that maybe what I'm saying is not getting through:
*I am not insisting that you remove these dangerous functions. You are welcome to keep them for those users who feel that they need the benefits that they offer. I would just like you to *also* offer safe-to-use functions, and to give the safe functions more concise names than the dangerous functions
Okay so, if I am understanding you right, you are saying that this
would be better:
auto u = parse_uri_into_string_view( s ).value();
I'm not sure that I agree. The "name" of a function includes its
return type and parameter list types. `parse_uri` is really
"result
Yes well, we considered that design and concluded that it wasn't practical. The library has multiple value types for URLs, which one should it return?
One of the safe ones. I'd suggest one that uses a std::string to store the URL.
Using std::string as the storage is an implementation detail. We could do this, and it would enable the use-case where the URL can take ownership of the string, which could be cool in some cases. However if we do that then we close the door to also storing any additional variable-length metadata such as offsets or sizes of the path segments or query params. We need more field experience and real-world measurements before we make this change.
you think that the library should not let the user parse the substring and return a view to the URL? And instead it should allocate memory, all to protect beginners?
No, I'm NOT SAYING THAT IT SHOULD NOT LET THE USER DO THAT, I'm saying that a user who wants to do that should have to do a bit more typing to make it clear to the next person who reads their code that they are using the unsafe view-returning function.
I wonder if clang-tidy has a setting to warn when someone uses auto on a reference-like type...
I'm asking for an option to select either upper case or lower case letters in the hex encodings, i.e. %2D or %2d.
Hah.. I knew someone would ask for that. We have the framework in place (pct_encode_opts struct).)
*I DON'T WANT THE POINTY ENDS FILED DOWN, I JUST WANT THEM TO HAVE DANGER SIGNS ON THEM AND FOR THE TOOL TO ALSO BE USEABLE IN A SAFE WAY.*
I'm not sure how to warn people who use auto variables. Perhaps this could be something added as a compiler warning or to clang-tidy if it isn't already there? And perhaps we could propose an attribute for the language to inform the compiler of reference-like types. Like this maybe? class [[reference-value]] url_view; and then if you write extern url_view get_url_view_from( string_view ); .... auto u = get_url_view_from( s ); the compiler could show the danger sign that you clamour for, if you specify -Wreference-value. Now that I think about it Phil, this is quite a good idea because it solves the problem for everyone, forever, instead of having to have each individual author solve it over and over again for every piece of code they write. And users who want to opt-in to brevity can still do so. If you want to write this up as a proposal I would be in favor of it and you could take credit.
Dave Sankel's talk has a lot about inclusiveness in Rust. Do please view it.
Yes that was quite a good video! Thanks Regards