On Tue, Oct 12, 2021 at 1:02 PM Alex Christensen wrote:
> I would say that the WhatWG URL specification is that something newer, but I sympathize. It is difficult to get started with.
I plan to pick through the WhatWG specification and see if there are any tidbits that could have value. The procedural exposition (append this character, execute this algorithm, output this string) makes it very difficult to grasp the higher-level semantics of the thing. The BNF in the RFC is way easier to grok, e.g.:

    scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

and in fact this is exactly how I have decomposed the parsing: into each individual named element described by the RFC.
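To illustrate how directly that one BNF rule maps onto code, here is a minimal sketch. The names `is_alpha`, `is_digit`, and `parse_scheme` are hypothetical and chosen for this example only; this is not the library's actual interface.

```cpp
#include <cstddef>
#include <string_view>

// RFC 3986 ALPHA and DIGIT are ASCII-only, so avoid the locale-dependent
// <cctype> functions.
constexpr bool is_alpha(char c) noexcept {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

constexpr bool is_digit(char c) noexcept {
    return c >= '0' && c <= '9';
}

// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
//
// Returns the number of leading characters of `s` that match the rule,
// or 0 if `s` does not begin with a scheme.
// (Hypothetical helper, not the Boost.URL API.)
constexpr std::size_t parse_scheme(std::string_view s) noexcept {
    if (s.empty() || !is_alpha(s.front()))
        return 0;
    std::size_t n = 1;
    while (n < s.size() &&
           (is_alpha(s[n]) || is_digit(s[n]) ||
            s[n] == '+' || s[n] == '-' || s[n] == '.'))
        ++n;
    return n;
}

// The rule can be checked at compile time (C++17).
static_assert(parse_scheme("http://example.com/") == 4);
static_assert(parse_scheme("1http://example.com/") == 0);
```

A complete parser would also check that the scheme is followed by ":"; that belongs to the enclosing URI rule rather than to `scheme` itself.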
> That is certainly a choice you can make, but at some point you may run into issues with people trying to give your library input like http://example.com/💩
I'm not sure how someone would give the library that input. Is this expressible in a string_view?
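At the byte level, at least, the answer is yes: a string_view is just a range of chars, and U+1F4A9 in UTF-8 is the four bytes F0 9F 92 A9. None of those bytes is allowed by the RFC 3986 grammar unless it is percent-encoded. A minimal sketch, where `pct_encode`, `poo`, and `pct_encode_example` are hypothetical names for illustration and not part of the library:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// U+1F4A9 spelled as its four UTF-8 bytes, so the encoding of the source
// file does not matter.
constexpr std::string_view poo = "\xF0\x9F\x92\xA9";

// Percent-encode every byte that is not an RFC 3986 "unreserved" character.
// (Hypothetical helper, not the Boost.URL API.)
std::string pct_encode(std::string_view s) {
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];   // high nibble
            out += hex[c & 0x0F]; // low nibble
        }
    }
    return out;
}

// Usage: the raw bytes fit in a string_view, but only the encoded form
// is made of characters the RFC grammar accepts.
inline void pct_encode_example() {
    assert(poo.size() == 4);
    assert(pct_encode(poo) == "%F0%9F%92%A9");
}
```

So the open question is less whether the input can be expressed and more whether a parser should reject the raw bytes or accept and encode them.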
> I see Punycode encoding and decoding doesn’t seem to be in the scope of this library, and for your use cases that may be fine and for others that might not be fine.
I actually have all the punycode algorithms ready, including tests:

https://github.com/CPPAlliance/url/tree/punycode

But after giving it some thought, I couldn't see the use-case for it. Callers who want to perform name resolution on a host can't pass a UTF-8 string; they have to pass the punycode-encoded string. The only use-case that I can discern for punycode is for display purposes, which is out of scope. Unless, do you know of any other use-case?
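For readers unfamiliar with what the encoding step involves, here is a minimal sketch of the RFC 3492 encoder for a single label. It is not the code in the branch linked above. The name `punycode_encode` is hypothetical, the input is assumed to be already decoded to code points, and the sketch omits overflow checks, the "xn--" prefix, and all IDNA mapping and validation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

namespace {
// Parameter values fixed by RFC 3492, section 5.
constexpr std::uint32_t base = 36, tmin = 1, tmax = 26, skew = 38,
                        damp = 700, initial_bias = 72, initial_n = 128;

// Map a digit value 0..35 to 'a'..'z', '0'..'9'.
char digit(std::uint32_t d) {
    return static_cast<char>(d < 26 ? 'a' + d : '0' + (d - 26));
}

// Bias adaptation, RFC 3492 section 6.1.
std::uint32_t adapt(std::uint32_t delta, std::uint32_t numpoints, bool first) {
    delta = first ? delta / damp : delta / 2;
    delta += delta / numpoints;
    std::uint32_t k = 0;
    while (delta > ((base - tmin) * tmax) / 2) {
        delta /= base - tmin;
        k += base;
    }
    return k + (base - tmin + 1) * delta / (delta + skew);
}
} // namespace

// Encoding procedure, RFC 3492 section 6.3 (hypothetical helper name).
std::string punycode_encode(std::u32string_view in) {
    std::string out;
    for (char32_t c : in)                 // copy the basic (ASCII) code points
        if (c < initial_n) out += static_cast<char>(c);
    const auto b = static_cast<std::uint32_t>(out.size());
    std::uint32_t h = b;                  // code points handled so far
    if (b > 0) out += '-';                // delimiter after the basic part
    std::uint32_t n = initial_n, delta = 0, bias = initial_bias;
    while (h < in.size()) {
        std::uint32_t m = UINT32_MAX;     // smallest code point >= n
        for (char32_t c : in)
            if (c >= n && c < m) m = c;
        delta += (m - n) * (h + 1);
        n = m;
        for (char32_t c : in) {
            if (c < n) ++delta;
            if (c == n) {
                std::uint32_t q = delta;  // emit delta as a variable-length integer
                for (std::uint32_t k = base;; k += base) {
                    const std::uint32_t t = k <= bias ? tmin
                                          : k >= bias + tmax ? tmax : k - bias;
                    if (q < t) break;
                    out += digit(t + (q - t) % (base - t));
                    q = (q - t) / (base - t);
                }
                out += digit(q);
                bias = adapt(delta, h + 1, h == b);
                delta = 0;
                ++h;
            }
        }
        ++delta;
        ++n;
    }
    return out;
}

// Usage: in a host name the label would appear as "xn--bcher-kva".
inline void punycode_example() {
    assert(punycode_encode(U"b\u00FCcher") == "bcher-kva");
}
```

Going from a UTF-8 host string to this function would additionally require UTF-8 decoding and splitting the host on dots, which the sketch leaves out.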
> It seems like you’re aware of this design choice, though.
Yes.

Actually, come to think of it, there is a use-case for punycode encoding, and that is to take an international domain name string in UTF-8 encoding and apply punycode encoding. I think... I have no experience with this, so some guidance would be helpful.

--
Regards,
Vinnie

Follow me on GitHub: https://github.com/vinniefalco