On 24/08/2022 20:03, Andrzej Krzemienski wrote:
śr., 24 sie 2022 o 01:37 Vinnie Falco napisał(a):
Because this library is capable of representing ALL URLs, it is necessary for the interface to allow the caller to interact with the port as a string which is valid according to the grammar.
I don't really think that's a good reason. The grammar specifically
states that the value must be numeric, and while it doesn't explicitly
place a limit on its range, it's not really reasonable to use it for
anything other than a TCP/UDP port number, which *is* explicitly limited
to 16-bit values. While URLs do occasionally get used for non-Internet
protocols, I can't think of a case where a larger port number would be used.
As such, I do think it's reasonable for it to fail parsing if someone
tries to use an out-of-range port number, at least until someone
demonstrates a (non-arbitrary) case where a larger range is needed.
Regarding 0 being potentially ambiguous, I don't think that's an issue
either, since it's not a valid TCP/UDP port number. And you do have
.has_port() if someone really wants to disambiguate that case. You
could potentially change it to optional
Is this the only reason? I would think that the primary reason is performance: usually you obtain a URL as a string, so the port-string is just sitting there. Then you may want to set it in another URL, so it will be a string again.
This is perhaps a better reason, and there are some other network libraries that do take the port as a string (or as a full authority string including hostname and colon). And if you're just disassembling, modifying, and reassembling the URL to pass on as a string then you may have no need of converting to integer. But then, in those cases you'd never call .port() anyway. So I don't think it would cost anything to remove the string accessors from the public API, even if it keeps the internal string representation and only parses to int on-demand. And it would avoid some complications with .set_port(s) and invalid input. Or perhaps make .port() and .set_port(n) use uint16_t but retain a string getter (but not setter) .port_string() for the corner cases.
2. The query part doesn't have duplicate keys.
I definitely disagree with this one. Repeated keys in the query are quite commonplace to represent array values or other multi-selects. Having said that, it may be useful to present (both for read and write) the params as a multi-map-like structure with unique keys and multiple values. This can't be the primary interface (since it can only be used when the order of the keys is not important, which is commonly the case but not *always* the case) but it would be a more convenient one for most usage. Aside: at the bottom of https://master.url.cpp.al/url/containers/query.html the examples 3 & 4 are both labelled "a key with no value", which is not what #4 shows. There is also no example for "no key and no value", which was stated to be a possible result in the preceding paragraphs. The example query string suggests that perhaps that should have appeared in between 3 and 4.