Re: [boost] boost.url review

21 Aug 2022


On 21.08.22 17:13, Vinnie Falco via Boost wrote:
> On Sun, Aug 21, 2022 at 7:15 AM Rainer Deyke via Boost
> <boost@lists.boost.org> wrote:
>> - The lack of IRI support is unfortunate.  It's 2022; we should all
>> be writing software with Unicode support by default.  However, this can
>> be built on top of Boost.URL, and isn't needed in all cases.
>
> We will probably add something to parse an IRI but in all likelihood
> it would convert it to UTF-8 as a regular URL. I don't know if I have
> the stomach for a total duplication of the existing library except
> names like iri_view, iri, static_iri, the duplication of all the
> segments and params containers, and the modification of all those
> mutating algorithms to support Unicode. I'm not even sure that it is
> called for, given that IRIs are for more user-facing purposes. Such
> things cannot be submitted in HTTP requests, and as far as I know,
> Unicode host names would need to be converted to punycode anyway
> (which we could do) to submit them to a DNS server.

I see IRIs not as a different datatype, but as a specific interpretation
of URIs.  The transparent percent-encoding and decoding in Boost.URL
already gets us most of the way there.  Additional IRI support would mean:
   - Decoding accessors that perform UTF-8 validation.  (Arbitrary
percent-encoded 8-bit values are legal in URLs, but not in IRIs.)
   - Additional mutators that perform NFC normalization or validation.
   - Punycode encoding/decoding.
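
To make the first point concrete, here is a rough sketch of what a validating decoding accessor could do.  It assumes C++17 and uses only the standard library; hex_value, percent_decode, is_valid_utf8 and decode_iri_component are hypothetical names for illustration, not part of Boost.URL.  The validation follows the usual UTF-8 rules (no overlong forms, no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: plain percent-decoding, as for an ordinary URL.
// Any byte value is accepted; only a malformed escape yields nullopt.
inline std::optional<std::string> percent_decode(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%') { out.push_back(in[i]); continue; }
        if (in.size() - i < 3) return std::nullopt;   // truncated escape
        const int hi = hex_value(in[i + 1]);
        const int lo = hex_value(in[i + 2]);
        if (hi < 0 || lo < 0) return std::nullopt;    // non-hex digit
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
    }
    return out;
}

// Hypothetical helper: strict UTF-8 check on the decoded bytes.
inline bool is_valid_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const auto b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { ++i; continue; }             // ASCII
        std::size_t len = 0;
        std::uint32_t cp = 0;
        std::uint32_t min = 0;                        // smallest non-overlong value
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false;                            // stray continuation or invalid lead
        if (s.size() - i < len) return false;         // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const auto b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;     // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate
        if (cp > 0x10FFFF) return false;              // beyond Unicode range
        i += len;
    }
    return true;
}

// Hypothetical "IRI-flavoured" accessor: decode, then insist on valid UTF-8.
inline std::optional<std::string> decode_iri_component(std::string_view encoded) {
    auto decoded = percent_decode(encoded);
    if (!decoded || !is_valid_utf8(*decoded)) return std::nullopt;
    return decoded;
}
```

With this, "caf%C3%A9" decodes in both interpretations, while "%FF" decodes as a URL component but is rejected as an IRI component.  The underlying data is the same; only the accessor differs, which is why I don't think a separate iri type is needed.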


-- 
Rainer Deyke (rainerd@eldwood.com)