On Tue, Aug 23, 2022 at 11:30 PM Vinnie Falco wrote:
On Mon, Aug 22, 2022 at 11:48 PM Andrzej Krzemienski
wrote: Like in this example from the docs:
https://www.boost.org/doc/libs/1_80_0/libs/beast/doc/html/beast/quick_start/...
http::request<http::string_body> req{http::verb::get, target, version};
req.set(http::field::host, host);
req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);
At no point do I have to see or provide a full URL.
"target" is the URL there, it came from the command line. It is a relative-ref.
Ok, then maybe I do not understand how Beast works. I thought req.set(http::field::host, host); was setting the host. Anyway, other than this example, I did not have any other interface in mind.
* modifiers which take un-encoded inputs have a wide contract: all input strings are valid - however the url might need to reallocate memory to encode the result
Does set_port() taking a string_view fall into that category? Unlike other parts, port has special requirements on the string contents that cannot be satisfied by pct-encoding.
Yeah, we have a problem here. It sounds like we need to design an extra set of functions. Maybe:
url_base& url_base::set_port( string_view ); // throws
result<void> url_base::try_set_port( string_view ); // returns result
what do you think about that?
If you used result<>, you would lose the ability to chain the setters.
Yeah... well, I think I'm OK with that.
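To make the trade-off between the two proposed signatures concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption made for illustration: mini_url, set_host, and the use of std::error_code in place of result<void> are hypothetical stand-ins, not the Boost.URL implementation. The validation rule follows RFC 3986, where a port is zero or more digits.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for url_base; not the Boost.URL class.
class mini_url {
    std::string host_;
    std::string port_;

    // RFC 3986: port = *DIGIT, so the empty string is also accepted.
    static bool is_valid_port(std::string_view s) {
        return std::all_of(s.begin(), s.end(),
                           [](char c) { return c >= '0' && c <= '9'; });
    }

public:
    mini_url& set_host(std::string_view s) { host_ = s; return *this; }

    // Throwing variant: returns *this, so setters can be chained.
    mini_url& set_port(std::string_view s) {
        if (!is_valid_port(s))
            throw std::invalid_argument("port must consist of digits only");
        port_ = s;
        return *this;
    }

    // Non-throwing variant: the return value carries the error, so it
    // cannot also return *this for chaining. std::error_code is used
    // here as an assumed substitute for result<void>.
    std::error_code try_set_port(std::string_view s) {
        if (!is_valid_port(s))
            return std::make_error_code(std::errc::invalid_argument);
        port_ = s;
        return {};
    }

    const std::string& host() const noexcept { return host_; }
    const std::string& port() const noexcept { return port_; }
};
```

With this sketch, u.set_host("example.com").set_port("8080") chains as usual, while try_set_port("80a") returns an error and leaves the stored port untouched; the caller has to check the returned value before continuing.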
As an alternative, you could say that function set_port() has a precondition: the input string has to represent a number, the caller is responsible for that, and set_port() performs no validation, and therefore throws nothing. But I guess that would violate one of the design goals of the library: "securely, including the case where the inputs come from untrusted sources".
My original thinking was that untrusted sources would only be presented to parse functions. But our dialog has convinced me that we should treat the parameters to modification functions as untrusted as well. True, in some situations we will be performing unavoidable, needless re-validation. But I think it is the right tradeoff, as offering the stronger invariant has more value for users. Besides, there are workarounds for assembling a URL which trade back performance in exchange for weaker invariants.
According to the design-by-contract theory, putting a precondition on class members (like, string_view must represent an int) does not weaken the class invariant. It is just that the invariant is enforced through the new precondition rather than runtime validation. "I guarantee the invariant as long as users guarantee that they satisfy the preconditions." Of course, this may not work for your case, as you may want to allow for a use case when the user calls set_port() with an argument obtained from an untrusted source.
Regards,
&rzej;
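The precondition-based alternative could look roughly like the sketch below. It is only an illustration of the design-by-contract idea described above: contract_url, set_port_unchecked, is_port_string, and apply_untrusted_port are hypothetical names, not part of Boost.URL. The setter performs no runtime validation in release builds; the assert merely documents the contract and catches caller bugs in debug builds.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper the caller can use to establish the precondition.
inline bool is_port_string(std::string_view s) noexcept {
    return std::all_of(s.begin(), s.end(),
                       [](char c) { return c >= '0' && c <= '9'; });
}

// Hypothetical stand-in for a URL class; not the Boost.URL class.
class contract_url {
    std::string port_;

public:
    // Precondition: is_port_string(s).
    // The class invariant holds as long as callers honor this contract.
    contract_url& set_port_unchecked(std::string_view s) {
        assert(is_port_string(s)); // debug-only check of the contract
        port_ = s;
        return *this;
    }

    const std::string& port() const noexcept { return port_; }
};

// A caller holding untrusted input validates it first, so the
// precondition is satisfied before the setter is reached.
inline bool apply_untrusted_port(contract_url& u, std::string_view input) {
    if (!is_port_string(input))
        return false;
    u.set_port_unchecked(input);
    return true;
}
```

The cost of this design is that every caller with untrusted input must remember to validate it first, which is the concern raised above for a library that aims to handle untrusted sources securely.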
Just to clarify what I mean: https://godbolt.org/z/qY3Y76fd9
urls::string_view s = "https://path?id=42&id=43";
urls::url r = urls::parse_uri( s ).value();
Should the above input string not cause the invariant to be broken? (because param 'id' appears twice.)
No, because the query is just a string. The interpretation of the query as "params" (an ampersand delimited list of name=value pairs) is an HTTP thing (and it has also spread to some other domains). When the query is used this way, duplicates are allowed. So the invariant is preserved in this case. It is always a valid URL. It might not be valid for custom schemes though. I could invent the "boost" scheme which requires that keys used in query parameters are unique. But there is no way for Boost.URL to enforce this (see my previous message regarding "mailto"). Users are provided with the tools to further specialize the generic URLs into custom schemes in a way that can preserve scheme-specific invariants.
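As an illustration of a scheme-specific invariant layered on top of a generic URL, the following sketch checks that the keys in a raw query string are unique, as the imagined "boost" scheme would require. It uses only the standard library; has_unique_keys is a hypothetical helper, and it compares keys byte for byte without percent-decoding, which a real implementation would probably need to handle.

```cpp
#include <set>
#include <string_view>

// Hypothetical check for a custom scheme that forbids duplicate keys.
// Treats the query as an ampersand-delimited list of name=value pairs.
// Assumption: keys are compared as raw bytes, without percent-decoding.
inline bool has_unique_keys(std::string_view query) {
    std::set<std::string_view> seen;
    while (!query.empty()) {
        const auto amp = query.find('&');
        const std::string_view param = query.substr(0, amp);
        // The key is everything before the first '=', or the whole
        // parameter when no '=' is present.
        const std::string_view key = param.substr(0, param.find('='));
        if (!seen.insert(key).second)
            return false; // key was already present
        if (amp == std::string_view::npos)
            break;
        query.remove_prefix(amp + 1);
    }
    return true;
}
```

For the query from the example above, "id=42&id=43", this check returns false, even though the string is a perfectly valid generic URL query.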
Thanks