
Hi Everyone,

I am trying to do a review of the Boost.URL library (I usually find the official ten-day review period to be too short), and I came across a number of interesting things that I thought I would mention.

The docs say:

    The library requires Boost and a compiler supporting at least C++11.
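As an aside, a requirement like this can be enforced at compile time rather than only stated in prose. Below is a minimal sketch, assuming the usual BOOST_VERSION encoding from <boost/version.hpp> (major * 100000 + minor * 100 + patch). The helper names and the 107800 threshold (Boost 1.78, the release I get to below) are my own illustration, not something the library provides.

```cpp
#include <cassert> // assert, for callers who want a runtime check as well

// Pull in BOOST_VERSION only when the Boost headers are actually installed.
#if defined(__has_include)
#  if __has_include(<boost/version.hpp>)
#    include <boost/version.hpp>
#  endif
#endif

// BOOST_VERSION is encoded as major * 100000 + minor * 100 + patch,
// so Boost 1.78.0 is 107800. The name below is hypothetical.
constexpr int min_boost_version = 107800;

// Build a version number in the BOOST_VERSION encoding.
constexpr int encode_boost_version(int major, int minor, int patch)
{
    return major * 100000 + minor * 100 + patch;
}

// True when the given encoded version meets the assumed minimum.
constexpr bool new_enough(int version)
{
    return version >= min_boost_version;
}

static_assert(encode_boost_version(1, 78, 0) == min_boost_version,
              "encoding sanity check");
static_assert(!new_enough(encode_boost_version(1, 77, 0)),
              "Boost 1.77 is below the assumed minimum");

// The actual guard: active only when a real Boost installation was found.
#ifdef BOOST_VERSION
static_assert(new_enough(BOOST_VERSION),
              "this code assumes Boost 1.78 or newer");
#endif
```

With a guard of this kind, an outdated Boost installation produces one readable diagnostic instead of a cascade of errors about missing names.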
Even though the library is a candidate for a Boost library, I understand that it is now offered as a "stand alone" version, and this is what the docs describe. I tried to use it with the latest MinGW Distro on Windows (https://nuwen.net/mingw.html), which uses GCC 11.2 and Boost 1.77, without success. This is because Boost.URL relies on the component boost::system::result<T>, which has been present in Boost.System only since version 1.78:
https://www.boost.org/doc/libs/1_79_0/libs/system/doc/html/system.html#chang...

First, it is news to me that we have `result<T>` in Boost.System, which overlaps with result<T, E> from Boost.Outcome. Second, I recommend that the Boost.URL docs say that the library requires Boost 1.78 or higher.

Next, we read:

    Aliases for standard types, such as string_view <https://master.url.cpp.al/url/ref/boost__urls__string_view.html>, use their Boost equivalents.
After reading this, I expected that Boost.URL would use boost::string_view from the Boost.Utility library:
https://www.boost.org/doc/libs/1_79_0/libs/utility/doc/html/utility/utilitie...
But instead, it uses boost::core::string_view, which is an implementation detail of the Boost.Core library:
https://github.com/CPPAlliance/url/blob/master/include/boost/url/string_view...

Again, it is news to me that Boost has two implementations of string_view. Why? Also, I do not think that Boost.URL should rely on the implementation details of Boost.Core. A better alternative would be to use the official boost::string_view from Boost.Utility. Or is there a good reason not to?

Next, the section on the parsers (https://master.url.cpp.al/url/parsing/url.html) describes the function parse_uri(), which returns result<url_view>. What strikes me is this difference: URI (Identifier) in the function name, and URL (Locator) in the return type. I have always used the terms URL and URI interchangeably. But now that I see them used in this way in a well-designed library, it looks disturbing. The quoted RFC 3986 (https://datatracker.ietf.org/doc/html/rfc3986#section-1.1.3) says that URLs are a subset of URIs. Now, the name `parse_uri` implies that it will recognize any URI, but on the other hand the result cannot always fit into a url_view, because not every URI is a URL.

The synopsis for parse_uri (https://master.url.cpp.al/url/ref/boost__urls__parse_uri.html) says:

    Exception safety: throws nothing.
And the line below it says that the function throws std::length_error when the input is too long. This looks like a bug in the specification.

Later we read:

    Return value: A result containing the view to the URL, or an error code if the parsing was unsuccessful.
This is not precise enough to answer the URI-vs-URL question. When can parsing be unsuccessful? Is it only when the input does not conform to the grammar? The synopsis says "This function parses a string according to the URI grammar below", but is it actually a URI grammar or a URL grammar? Maybe the "return value" section should say instead:

    Return value: A result containing the view to the URL, or an error code if the contents of `s` were not conformant with the above grammar.
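To make the contract I have in mind concrete, here is a minimal sketch in plain C++11. It is not Boost.URL code: `parse_result`, `parse_scheme` and `max_input_size` are hypothetical names, and the toy parser only recognizes the scheme production from RFC 3986 (ALPHA followed by ALPHA / DIGIT / "+" / "-" / "."). Grammar violations come back as an error code, while the oversized-input case is reported by an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical stand-in for result<url_view>: a value or an error code.
struct parse_result
{
    std::string scheme;   // meaningful only when ec is clear
    std::error_code ec;   // set when the input does not match the grammar
    bool has_value() const { return !ec; }
};

// Hypothetical size limit, chosen only for the sake of the example.
constexpr std::size_t max_input_size = 4096;

inline bool is_alpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

inline bool is_scheme_char(char c)
{
    return is_alpha(c) || (c >= '0' && c <= '9') ||
           c == '+' || c == '-' || c == '.';
}

// Toy parser for the leading "scheme:" part of a URI.
// Returns an error code when the input does not conform to the grammar.
// Throws std::length_error when the input exceeds max_input_size, so it
// could not honestly be documented as "throws nothing".
parse_result parse_scheme(const std::string& s)
{
    if (s.size() > max_input_size)
        throw std::length_error("input too long");

    parse_result r;
    const std::size_t colon = s.find(':');
    bool ok = colon != std::string::npos && colon > 0 && is_alpha(s[0]);
    for (std::size_t i = 1; ok && i < colon; ++i)
        ok = is_scheme_char(s[i]);

    if (!ok)
    {
        r.ec = std::make_error_code(std::errc::invalid_argument);
        return r;
    }
    r.scheme = s.substr(0, colon);
    return r;
}
```

A caller of such a function checks has_value() for grammar problems and needs a try block only if oversized input is possible. That is the distinction I would like the reference page to spell out.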
With the wording I propose above, any other cause of failure (for example, a failed allocation, if any resources need to be allocated) may still be reported via exceptions.

Now, there is probably a good explanation for the URI vs URL discrepancy. I think it would be good to put it in the docs, so that users don't get confused.

While this might look like a list of complaints, I really appreciate the effort the authors put into creating this library and its documentation. The documentation is of really high quality, way higher than the average you will find on GitHub. And it is actually because of this high quality that I am able to spot and report these issues.

Regards,
&rzej;