On Tue, Oct 12, 2021 at 6:06 PM Vinnie Falco
What you are thinking of as a "valid URL parser input" is actually an Internationalized Resource Identifier, which supports the broader universal character set instead of just ASCII and is abbreviated by the even more obscure acronym "IRI." It is covered by RFC 3987:
We looked over this RFC, and I think it would be possible to support IRIs simply by providing a new set of parsing functions, for example:

    void parse_iri ( string_view, error_code&, url& );
    void parse_irelative_ref ( string_view, error_code&, url& );
    void parse_absolute_iri ( string_view, error_code&, url& );
    void parse_iri_reference ( string_view, error_code&, url& );

It wouldn't be possible to parse into a url_view, since the UTF-8 encoded characters have to be converted to percent-encoded escapes, and that requires writable storage rather than a view of the input. But parsing into a url could be made to work, and it fits neatly into the current implementation.

There would be an additional function to take a url and convert it back into its IRI string, which is mostly just decoding percent-escaped characters. The library would disallow invalid UTF-8.

Thanks
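
To make the conversion step concrete, here is a minimal sketch of turning the non-ASCII UTF-8 bytes of an IRI into percent-encoded escapes while rejecting invalid UTF-8. The names utf8_sequence_length and percent_encode_non_ascii are hypothetical, chosen only for illustration; they are not part of the library or of the functions proposed above. The sketch also ignores the RFC 3987 grammar entirely and passes ASCII through unchanged, which a real parse_iri would not do.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: length in bytes of the well-formed UTF-8 sequence
// starting at s[i], or 0 if the sequence is malformed.
std::size_t utf8_sequence_length(std::string_view s, std::size_t i)
{
    auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };

    unsigned char const b0 = byte(i);
    if (b0 < 0x80)
        return 1; // plain ASCII

    std::size_t len = 0;
    unsigned long cp = 0;
    if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else
        return 0; // stray continuation byte or invalid lead byte

    if (s.size() - i < len)
        return 0; // truncated sequence

    for (std::size_t k = 1; k < len; ++k)
    {
        if ((byte(i + k) & 0xC0) != 0x80)
            return 0; // not a continuation byte
        cp = (cp << 6) | (byte(i + k) & 0x3F);
    }

    // Reject overlong encodings, UTF-16 surrogates, and values past U+10FFFF.
    static const unsigned long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    if (cp < min_cp[len])
        return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;
    if (cp > 0x10FFFF)
        return 0;
    return len;
}

// Hypothetical helper: percent-encode every byte of each non-ASCII
// character; returns std::nullopt if the input is not valid UTF-8.
std::optional<std::string> percent_encode_non_ascii(std::string_view iri)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(iri.size());
    for (std::size_t i = 0; i < iri.size();)
    {
        std::size_t const n = utf8_sequence_length(iri, i);
        if (n == 0)
            return std::nullopt; // invalid UTF-8 is rejected
        if (n == 1)
        {
            out.push_back(iri[i]); // ASCII is copied unchanged
        }
        else
        {
            for (std::size_t k = 0; k < n; ++k)
            {
                unsigned char const b = static_cast<unsigned char>(iri[i + k]);
                out.push_back('%');
                out.push_back(hex[b >> 4]);
                out.push_back(hex[b & 0x0F]);
            }
        }
        i += n;
    }
    return out;
}
```

Because the output can be longer than the input, the result has to live in owned storage, which is the reason a url rather than a url_view would be the parse target. Going back to an IRI string would be roughly the reverse: decode the escapes whose bytes form valid non-ASCII UTF-8 and leave the rest alone.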