[JSON][review] a small review of boost.json
Hi, Here's my small review of boost json. ## Are you knowledgeable about the problem domain? I've written a json library (for reference: https://github.com/Julien-Blanc-tgcm/jbc-json ) which i used as playground / experiments. I guess this makes me knowledgeable enough of the domain. ## What is your evaluation of the design? I think the library is well designed, although it has several shortcomings. These shortcomings may not alter general usage of the library (with main target seeming to be REST api over http), but render it useless in specific cases. For me, a json library consists of three independant parts and some glue: * a parser * a type for representing values (usually a variant) * a printer (serializer) ### The parser In boost.json, the separation between the parser and seems neat enough (ie, you can use the parser without the json::value type). However, it resides in a <detail> header, which would indicate that it is not an expected use case. I believe it should be. The callbacks of the parser (push parser) are not documented. I'm a bit puzzled by what the on_number_part does, and how it fits with the on_int64/on_double callbacks, for example. The parser is an incremental push parser. However, it seems to expect to be feeded with partial json documents, but not consecutive json documents. I could not find a way to make : auto v = p.write(R"json({"foo":1}{"bar":2})json"); return 9. I may have missed something, but this use case is IMHO important to address. Since the only way to know that a json document has ended is to parse it, the parser should provide an option to parse a stream containing multiple consecutive json documents (ie, stopping without error at the end of each document). A common use case for this is JSON notifications received via raw sockets, so no envelope may be used to isolate JSON documents. ### The value type The value type is a variant, the kind that you would expect from a json library. Most of the interface uses string_view, which is a very good thing (the fact that boost.json uses another string type internally has not been an issue in my tests, due to the usage of string_view). It lacks visitation, though (IIRC there’s an open issue for this). The fact that this value type has been highly optimized is the great strength of this library. #### objects The object interface relies on the [] operator, which incurs the same shortcomings as with std::map : auto v = foo["bar"]; // will silently create bar if it does not exists Typos in keys had always been a source of hard to find bugs, i wished there was an option to disable operator[] and force the use of find. The library has no support for object with duplicate keys, which also are valid json. This, however, can be worked around by using the parser callbacks and not the variant (but that makes it difficult to produce such documents, cf later). Whether the order of the keys is kept by the object structure or not does not appear in the documentation. From my small testing, it seems that it is the case, but the api does not offer any interface to add elements at specific position, so i would assume it is not guaranted. The json specification, AFAIR, does not forbid to rely on a specific order of the keys. Some serialization libraries do expect the keys to be in a specific order. #### numbers Having two number types can lead to unexpected results : auto v = boost::json::parse("{\"foo\":1,\"bar\":2.0}"); std::cout << v.as_object()["bar"].as_double() << std::endl; // ok std::cout << " " << v.as_object()["foo"].as_double(); // asserts It is unexpected, because several json libraries will never output “2E0” or “2.0” for the value “2”, but a plain “2”. It means that whenever you expect a double value, you also have to check for the possibility that it has been parsed as an integer. The to_double function should silently convert integers to their double equivalent whenever it is possible. * as has already been noted by others, the number precision is hardcoded to IEE754 or 64bits integer. This is seldom an issue, since it is the case with most implementations. #### The printer The printer is not usable without the value item. I would expect it to be (producing json without resorting to an intermediate representation). Apart from this, it supports incremental printing, which is what is expected. It produces a strange representation for numbers, but it is conforming to the JSON specification. Because it is tied to the json::value, it cannot produce anything that cannot be represented by a json::value. This is an issue that should be addressed. ## What is your evaluation of the implementation? I did not have a thorough look into the implementation details. From what i have seen, the implementation looks fine. A parser is usually a landmine for security exposures, and would need a very careful review which i cannot do. IIRC the authors plan on having an independant audit on the library, which is a good thing. There is also fuzzing and tests, so we can expect that this point is correctly being taken care of. The rest of the code looks fine, easy to navigate into. ## What is your evaluation of the documentation? Very good. Easy to use, complete. There is some missing information, but they all seems related to design decision that have been taken implicitly (already discussed in design section). An example on using the parser on standalone mode would be a very welcome addition. Another thing that the doc should state clearly : boost.json is **not** a JSON (ECMA-404) library, but rather an I-JSON (RFC 7493) library with some extensions. ## What is your evaluation of the potential usefulness of the library? Very useful. While it does not address all use cases (and does not try to), it addresses what is probable the *most common* case. As JSON is very widespread, i expect this library to find a large audience, because it combines ease of use and performance. ## Did you try to use the library? With what compiler? Did you have any problems? I tried to use it on a small toy application, on a debian stable system. The main target seems to be to use the library with boost. I ran into several issues when trying to use the standalone version. The first one was easy to address, and due to gcc 8.3 lacking c++17 support. A patch was submitted, and merged quickly. The second one was incompatibilites between standalone asio bringing boost headers (because boost is installed on my machine) and json redefining them (because of standalone mode), resulting in multiple defenitions of throw_exception. I'm not sure yet if i should file this to boost.json, asio library, or asio debian package, or if it is just a missing define on my part. At that point, i switched to boost json, which fixed the issues i had. ## How much effort did you put into your evaluation? I'd say around 10 hours reading, testing and playing, including writing of the review. ## Do you think the library should be accepted as a Boost library? I think the library should be ACCEPTED. There are shortcomings in the library, it does not address every use case in the world, but it does a very good job at what it is designed to do. It is easy to use, performant, and it addresses a very widespread use case, so the pros clearly outweighs the cons. Regards, Julien
On Sat, Sep 19, 2020 at 3:32 AM Julien Blanc via Boost
However, it resides in a <detail> header, which would indicate that it is not an expected use case. I believe it should be.
The callbacks of the parser (push parser) are not documented. I'm a bit puzzled by what the on_number_part does, and how it fits with the on_int64/on_double callbacks, for example.
basic_parser is a class template whose header is in
I could not find a way to make :
auto v = p.write(R"json({"foo":1}{"bar":2})json");
`write()` returns an error if it does not consume all the input. You need to call `write_some()`, which will work exactly the way you want. This is even documented :) The second to last code snippet in this subsection: https://master.json.cpp.al/json/usage/parsing.html#json.usage.parsing.parser...
The json specification, AFAIR, does not forbid to rely on a specific order of the keys. Some serialization libraries do expect the keys to be in a specific order.
Boost.JSON keeps the order after parsing, if there are no duplicates, but does not guarantee preservation of ordering after the value is modified. The JSON spec says that the objects are unordered.
The to_double function should silently convert integers to their double equivalent whenever it is possible.
There's no `to_double`, maybe you mean `as_double`? In any event, the function that does what you want is called json::number_cast
It produces a strange representation for numbers, but it is conforming to the JSON specification.
Yes the subject of converting floating point to string is an area of improvement.
Another thing that the doc should state clearly : boost.json is **not** a JSON (ECMA-404) library, but rather an I-JSON (RFC 7493) library with some extensions.
The second one was incompatibilites between standalone asio bringing boost headers (because boost is installed on my machine) and json redefining them (because of standalone mode), resulting in multiple defenitions of throw_exception.
I'm open to suggestions on how to fix it: https://github.com/CPPAlliance/json/blob/8c1075ab0bdcd0691ebc6f3574b9988cc5d...
it does not address every use case in the world, but it does a very good job at what it is designed to do. It is easy to use, performant, and it addresses a very widespread use case
I fully agree with this description, that is precisely what was intended.
I think the library should be ACCEPTED.
Thank you for your thoughtful review and time, and for submitting a pull request! Regards
Le samedi 19 septembre 2020 à 05:37 -0700, Vinnie Falco a écrit :
On Sat, Sep 19, 2020 at 3:32 AM Julien Blanc via Boost
wrote: It *is* possible to use basic_parser with your own handler (and it is documented). You just need to include the header manually, like this:
#include
This gives access to the class declaration and the member function definitions. The callbacks are documented here:
< https://master.json.cpp.al/json/ref/boost__json__basic_parser.html#json.ref....
Then i shall submit a doc bug report :). This is the doc i’ve found,
this page explictely state that basic_parser is “ Defined in header
The second one was incompatibilites between standalone asio bringing boost headers (because boost is installed on my machine) and json redefining them (because of standalone mode), resulting in multiple defenitions of throw_exception.
I'm open to suggestions on how to fix it:
< https://github.com/CPPAlliance/json/blob/8c1075ab0bdcd0691ebc6f3574b9988cc5d...
Given that throw_expection is just a wrapper around a throw call in that case, i guess it could simply live in the boost::json namespace. That would avoid name clashes with asio (rapid testing has shown that it seems to work). Regards, Julien
Julien Blanc wrote:
Still, after reading the doc again, i don't get a clue what is this on_number_part or why it exists at all.
IIRC on_number_part gives you the number in string form, exactly as it exists in the original JSON, in case you want to preserve all its digits when it doesn't fit in int64 or double.
participants (3)
-
Julien Blanc
-
Peter Dimov
-
Vinnie Falco