16 Aug
2022
16 Aug
'22
6:03 p.m.
Hi all, Here is my review of Boost.Url. > - Does this library bring real benefit to C++ developers for real world > use-case? Do you have an application for this library? I'm happy to see this library being proposed. I strongly believe we need such functionality in Boost. If it gets accepted, I'm planning on using it in Boost.Mysql to create type-erased connections from connection URLs (such as mysql://user@localhost:3306/database?ssl-mode=disable). I think URL functionality is a must for the C++ for web environment we're developing. - Does the API match with current best practices? The API is clear and concise. It allows parsing and modifying URLs in a pretty nice way. I've found the naming for view containers confusing (e.g. params and params_view). When I saw it, I thought params would take ownership of the query string params buffer, while params_view would not. If I've understood correctly, the difference is that params allows for modification (as it's returned from url containers) and params_view does not (as it's returned from url_view). So it's more about mutability than ownership. If that's the case, I would either emphasize it in the docs, or rename the containers to something that reflects this fact (e.g. const_params and mutable_params, as asio does). I'm happy to see percent-encoding standalone functions being exposed, as this is something present in almost all other languages. I'd consider exposing higher-level functions here, maybe taking a lvalue reference to a std::string, allowing the user to percent-encode strings easier (so they resemble JavaScript encodeURIComponent). Something I've found when playing with mutating URLs is that modifying a query parameter to a certain value is a little bit verbose: url u; // get it from somewhere auto it = u.params().find("k"); if (it == u.params().end()) { u.params().append("k", "value"); } else { u.params().replace_value(it, "value"); } I had this use case in the last web application I wrote (although it wasn't C++). I don't know if this is a common use case, but I'd consider adding a function that simplifies it (i.e. params::set_value(string_view key, string_view value). I'm not very keen on exposing the parsing infrastructure and helper functions (other than percent encoding/decoding) as public API, as I think it violates the principle of API surface minimization. I understand that these parsing facilities may help libraries such as HTTPProto, but I'm not sure if the general public should be using them. I may be missing something, but I haven't found the example on magnet links very compelling - I feel the same result could be achieved in a cleaner way by using the higher level API. If the parsing facilities are to be kept public, I think we need a more compelling example for them. >From reading this comment https://github.com/CPPAlliance/url/issues/379#issuecomment-1212026752 I get the impression that some schemes have different reserved character sets, and that these functions help with that problem - maybe an example of such an scheme could be helpful. I can see other public types in the reference such as recycled, recycled_ptr and aligned_storage, which again have the feeling should not be exposed as public types. - Is the documentation helpful and clear? The documentation is good and the reference is complete, but I miss some examples. We've just got two examples, and one of them is about a very concrete feature that most users will likely not be using day-to-day. In particular, I miss an example (and possibly a page) about composing and mutating URLs (starting with a base URL, add some segments and query parameters, emphasizing on when percent encoding gets applied). This is a pretty typical task when you're consuming a third party REST API. I'd add a page and an example on percent encoding functions, as they are only explained in the reference and the explanation assumes familiarity with the CharSet concept. As there are several container view types, I would also suggest creating a comparison table, like in json::value (https://www.boost.org/doc/libs/1_80_0/libs/json/doc/html/json/dom/value.html) (I've found that one particularly useful). The docstring for segments::insert seems to be wrong (https://master.url.cpp.al/url/ref/boost__urls__segments/insert/overload1.html), as I've passed a non-percent encoded string and it has accepted it, encoding the value. It seems an incorrect copy/paste from segments_encoded::insert. params::insert overloads with key only and with key/value also seem to have swapped docstrings. I'm also curious about the security review that is announced in the main page - when is that going to take place? - Did you try to use it? What problems or surprises did you encounter? I used it in a toy project via CMake FetchContent, with Boost 1.80.0, Linux and clang12. I had no problems building it. I initially tried `mkdir build && cmake -DCMAKE_PREFIX_PATH=path/to/boost ..` but got errors about linking to non existent targets. I only built the source (not the official examples and tests) and my toy programs. I focused on composing and mutating URLs, but also tried parsing and percent encoding. Except for the mismatched docstrings I've already mentioned, no surprises. - What is your evaluation of the implementation? I haven't looked into it. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? I've dedicated half day to build and play with the library, and had read most of the docs (and some code) beforehand. - Are you knowledgeable about the problem domain? I'm not a URL expert, just a regular URL user. I've written several web applications where URL handling is a day-to-day task. My recommendation is to CONDITIONALLY ACCEPT this library as part of Boost, subject to reviewing which types/functions in the lower-level API are exposed and which ones are not. Thanks Vinnie and Alan for your work. Regards, Ruben.