On Wed, Aug 24, 2022 at 4:44 PM Gavin Lambert via Boost
On Wed, 24 Aug 2022 at 01:37, Vinnie Falco wrote:
Because this library is capable of representing ALL URLs, it is necessary for the interface to allow the caller to interact with the port as a string which is valid according to the grammar.
While URLs do occasionally get used for non-Internet protocols, I can't think of a case where a larger port number would be used.
I can't either, but when building this library we have done our best to refrain from subjective interpretations of the grammar, which states unambiguously:

    port = *DIGIT

This means zero or more digits. TCP/IP had been well established for decades before RFC 3986 was written, so I have to think that if the authors wanted the port to be limited to only what is possible with TCP/IP, they would have stated so explicitly. Port zero, for example, is an invalid port number, but it is allowed by this grammar. Some of the grammars in the spec are explicit when it comes to numeric limits. For example, a dec-octet is limited to 0-255:

    dec-octet = DIGIT                 ; 0-9
              / %x31-39 DIGIT         ; 10-99
              / "1" 2DIGIT            ; 100-199
              / "2" %x30-34 DIGIT     ; 200-249
              / "25" %x30-35          ; 250-255

Had the authors intended to restrict the port, they would have written it this way. I have been reluctant to put my own spin on interpreting the RFC, if for no other reason than that I do not have sufficient field experience with the countless published and unpublished schemes currently in use. A conservative design choice follows the specification to the letter.
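To make the contrast between the two rules concrete, here is a minimal sketch of both as standalone predicates. It uses only the standard library and is not Boost.URL code; the function names matches_port and matches_dec_octet are made up for illustration.

```cpp
#include <string_view>

// port = *DIGIT (RFC 3986, section 3.2.3)
// Zero or more digits; the grammar imposes no upper bound on the value.
// Hypothetical helper, not part of Boost.URL.
constexpr bool matches_port(std::string_view s)
{
    for (char c : s)
        if (c < '0' || c > '9')
            return false;
    return true;   // the empty string also matches
}

// dec-octet (RFC 3986, section 3.2.2)
// Here the grammar itself limits the value to 0-255 and, by the shape of
// its alternatives, rejects leading zeros such as "01".
// Hypothetical helper, not part of Boost.URL.
constexpr bool matches_dec_octet(std::string_view s)
{
    if (s.empty() || s.size() > 3 || !matches_port(s))
        return false;
    if (s.size() > 1 && s[0] == '0')
        return false;
    int value = 0;
    for (char c : s)
        value = value * 10 + (c - '0');
    return value <= 255;
}
```

Under these definitions "0", "00", "" and "99999" all match port, while "256" and "01" do not match dec-octet. The range limit exists only where the grammar spells it out.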
As such, I do think it's reasonable for it to fail parsing if someone tries to use an out-of-range port number
This is weird because you're saying that we should not accept valid productions according to the grammar?
So I don't think it would cost anything to remove the string accessors from the public API, even if it keeps the internal string representation and only parses to int on-demand. And it would avoid some complications with .set_port(s) and invalid input.
There's a problem here. What if we get

    url_view u( "//example.com:00" );

Is this a valid URL? According to the RFC it is. We parse it, and return 0 from port_number(). But you want to take away the port string modifiers.

One of the principles of this library is that the API allows the user to create any possible valid URL. That is, given a valid URL string, there exists a finite sequence of calls to the library that will produce the string. Taking away the port string modifiers leaves the library in the questionable position that users cannot create a URL that the parser thinks is valid.

Regards
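The "00" case can be sketched without the library. The following is a minimal illustration, assuming a hypothetical port_field type and parse_port function (neither is Boost.URL API), of why a numeric accessor alone cannot round-trip every valid port. Returning 0 for values that do not fit in 16 bits is a choice made for this sketch only.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical type, not part of Boost.URL: keeps the port exactly as written.
struct port_field
{
    std::string text;   // verbatim characters after ':', e.g. "00"

    // Numeric value computed on demand.
    // Sketch assumption: returns 0 if empty or too large for 16 bits.
    std::uint16_t number() const
    {
        std::uint32_t v = 0;
        for (char c : text)
        {
            v = v * 10 + static_cast<std::uint32_t>(c - '0');
            if (v > 65535)
                return 0;
        }
        return static_cast<std::uint16_t>(v);
    }
};

// Accepts any string matching port = *DIGIT; returns nullopt otherwise.
std::optional<port_field> parse_port(std::string_view s)
{
    for (char c : s)
        if (c < '0' || c > '9')
            return std::nullopt;
    return port_field{ std::string(s) };
}
```

Here parse_port("0") and parse_port("00") both report number() == 0, yet their text differs. An interface that only accepts an integer can produce ":0" but never ":00", which is the gap described above.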