On 21.08.22 17:13, Vinnie Falco via Boost wrote:
On Sun, Aug 21, 2022 at 7:15 AM Rainer Deyke via Boost wrote:
- The lack of IRI support is unfortunate. It's 2022; we should all be writing software with Unicode support by default. However, this can be built on top of Boost.URL, and isn't needed in all cases.
We will probably add something to parse an IRI, but in all likelihood it would convert it to UTF-8 as a regular URL. I don't know if I have the stomach for a total duplication of the existing library, differing only in names like iri_view, iri, static_iri, plus the duplication of all the segments and params containers, and the modification of all those mutating algorithms to support Unicode. I'm not even sure that it is called for, given that IRIs are for more user-facing purposes. Such things cannot be submitted in HTTP requests, and as far as I know, Unicode host names would need to be converted to Punycode anyway (which we could do) to submit them to a DNS server.
I see IRIs not as a different datatype, but as a specific interpretation of URIs. The transparent percent en-/decoding of Boost.URL already gets us most of the way there. Additional IRI support would mean:

- Decoding accessors that perform UTF-8 validation. (Arbitrary percent-encoded 8-bit values are legal in URLs, but not in IRIs.)
- Additional mutators that perform NFC normalization or validation.
- Punycode encoding/decoding.

-- 
Rainer Deyke (rainerd@eldwood.com)
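To make the first bullet concrete, here is a rough sketch of what a "decoding accessor with UTF-8 validation" could look like. It is standalone C++17 and deliberately does not use Boost.URL; the names `percent_decode`, `is_valid_utf8`, and `decode_iri_component` are hypothetical and not part of any existing API. NFC normalization and Punycode are not covered.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: numeric value of a hex digit, or -1 if not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Plain URL-style decoding: any %XX octet is accepted.
// Returns nullopt only for a malformed escape such as "%G1" or a trailing "%4".
std::optional<std::string> percent_decode(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%') { out.push_back(in[i]); continue; }
        if (in.size() - i < 3) return std::nullopt;
        int hi = hex_value(in[i + 1]);
        int lo = hex_value(in[i + 2]);
        if (hi < 0 || lo < 0) return std::nullopt;
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
    }
    return out;
}

// Strict UTF-8 check: rejects stray continuation bytes, truncated sequences,
// overlong encodings, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { ++i; continue; }
        std::size_t len = 0;
        unsigned long cp = 0, min = 0;
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false;
        if (s.size() - i < len) return false;
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        i += len;
    }
    return true;
}

// IRI-style accessor: same decoding, but the result must be valid UTF-8.
std::optional<std::string> decode_iri_component(std::string_view in) {
    auto decoded = percent_decode(in);
    if (!decoded || !is_valid_utf8(*decoded)) return std::nullopt;
    return decoded;
}
```

With this sketch, `caf%C3%A9` decodes under both interpretations, while `%FF` is accepted by `percent_decode` (legal in a URL) but rejected by `decode_iri_component` (not valid UTF-8, so not legal in an IRI).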