Re: [boost] Boost.URL review

22 Aug 2022


      On Mon, Aug 22, 2022 at 9:41 AM Zach Laine <whatwasthataddress@gmail.com> wrote:
...
On Sun, Aug 21, 2022 at 6:47 PM Vinnie Falco <vinnie.falco@gmail.com> wrote:
...
On Sun, Aug 21, 2022 at 4:03 PM Zach Laine <whatwasthataddress@gmail.com> wrote:
...
...
It wouldn't be a proper review without at least one person forgetting
to use "Reply All" :) So here it is, from Zach. I have quoted his
message:
...
This makes sense.  The larger issue I'm raising is that a user is
going to have a harder time remembering this, if the names don't
match.  A way to mitigate that is to make this naming rationale
explicit, so that user
expectations will adjust.  I didn't get that this was the pattern from
the documentation, though I think you did say it.  A call-out
specifically noting this would porbably be enough.
Yeah we can do that.
...
Hm.  By lazy flat_map, do you mean that the positions are computed one
step at a time during iteration?  is that why they have
forward_iterators?
Yes. operator++ and operator-- intelligently find the next param or
segment using a non-validating parse algorithm (since we know that the
underlying character buffer has already been validated).
...
If so, please document that.  It might make a
pretty big difference whether I decided to keep using a params object
that I need to update a lot, or copy everything over to a proper
flat_map or vector to do my work.
Yes I agree, but I thought we did document that. Here, no?
<https://master.url.cpp.al/url/ref/boost__urls__params_encoded_view.html#url.ref.boost__urls__params_encoded_view.complexity>
We could probably add something here too:
<https://master.url.cpp.al/url/ref/boost__urls__params_encoded_view/iterator.html>
Right, and I understood how those worked.  However, I had no idea that
the owning containers were based on the same lazy approach (unless I
misunderstand what you said in this thread...).
...
...
This works for an immutable container.  It is *very surprising* for a
mutable one.  The rest of my notes about the containers are predicated
on a false assumption that mutability implied non-laziness in the
container itself.  I think all of these container would be better off
referred to as views, and made immutable.  All you really need to
support is begin/end, plus maybe key-lookup.  I suggest you drop the
mutability entirely.  With mutability, these containers are trying to
do two things at once.
Well, that's another way to do it and it certainly would remove
complexity and simplify the names. Right now we have 8 containers:
segments
segments_view
segments_encoded
segments_encoded_view
params
params_view
params_encoded
params_encoded_view
All of these containers reference the underlying character buffer of
the URL (although you can also construct the views from separately
parsed string views but that's basically the same thing in terms of
the ownership model being reference-like).
If we remove mutability as you suggest, then we can cut the number of
containers down to 4:
params_view
segments_view
params_encoded_view
segments_encoded_view
But this leaves us with a problem - how do you modify individual path
segments and query params?
Easy -- you just copy them into a flat_map, vector, or whatever.  The
90% use case is just to look at the result of the parse.  Also making
the parsing library responsible for mutations you do to the data later
seems unnecessary.  If you provide parsing -- including the lazy views
-- and then writing of URLs, and the writing can be done via a generic
container, that's all your library needs, IMO.  To be clear:
Parsing -> lazy_containers -> user mutations outside of Boost.URL ->
writing URLs
That last step only needs to understand the different kinds of data it
might write, abstractly.  That abstraction might be a params_view, a
flat_map, or a std::vector<std::pair<strd::string, std::string>>.
...
How do you do position-based insert and
erase of elements using an iterator? These features seem kind of
useful, I'm not sure I want to lose them.
You won't lose them, because the STL exists.
...
...
...
Did you mean that it "doesn't" have proper error reporting? Where do
you see this lack of reporting (if that's what you meant)?
By "proper" I mean that if the parse fails it should provide an
end-user-friendly diagnostic, and optionally quote the line and put a
caret under the position that the parse failed (a la the Clang
diagnostics). Your non-end-user error reporting is fine.
Oh... hmm.. well, each rule does leave the iterator out-param at the
spot where an error occurred. But I'm not quite sure how to emit the
diagnostic - surely you aren't suggesting we write to std::cout or
std::cerr? Are we using curses or something? Or maybe just plain ASCII
art? How exactly would this look?
The diagnostics go to some optional output channel, which might be a
stream or file or logger.  If you don't set the output, the diagnostic
is never generated.  You set this output in the call to parse*(), as
an optional param -- usually a std::function<void(std::string const
&)>.
...
Thanks to Peter's work on boost::system::error_code, when a parse
error occurs the error is augmented with file name, line, and function
name information. And I have worked on adding visualizers so these are
nicely presented in the debugger. So at least we have that.
That's really nice for programmers, but it leaves the programmer with
a lot of work to do to present the error to the user.
Zach
Forgot to reply-all *again*.

Zach