
On 11/28/2010 04:54 PM, Dean Michael Berris wrote:
> On Mon, Nov 29, 2010 at 6:25 AM, Marsh Ray <marsh@extendedsubset.com> wrote:
>> What I would really like is a clean and simple JSON library.
>
> At the risk of sounding PR'ish...
>
>> Last time I looked around (a year or two ago) it seemed like there were a lot of 50-80% side projects, none of which gave me the warm fuzzies about being tested and maintained. Many would parse but not generate, or vice versa. The DOM v SAX architectural decisions seem relevant too.
>
> It's actually on the list of things for me to do on cpp-netlib for 0.9

Cool!

> -- I'm working on cleaning up the internals of the library, and then preparing to do higher level utilities that will make web application or web service (REST+JSON) development with C++ easier.
> One of the things that I will be working on is a simple, robust, and type-safe way for doing JSON parsing/generation using Boost.Spirit.

Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb. I look at the diagrams at http://www.json.org/ and I see a simple byte-by-byte (or character-by-character) state machine. The kind of thing that's been done since the early C compilers, only much simpler. Something I could understand in a debugger or, more importantly, review for security in a network-facing application.
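To make the "simple byte-by-byte state machine" idea concrete, here is a minimal sketch covering only the string diagram from json.org. It is an illustration, not code from this thread or from any existing library: the function name `is_json_string_literal` and the `StrState` enum are made up for the example, and it does not check that the bytes are valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names; one state per node of the json.org "string" diagram.
enum StrState { ST_START, ST_BODY, ST_ESCAPE, ST_HEX, ST_DONE, ST_ERROR };

static bool is_hex_digit(unsigned char c) {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') ||
           (c >= 'A' && c <= 'F');
}

// Returns true if `s` is exactly one JSON string literal, including the
// surrounding double quotes. One pass over the input, constant extra memory.
bool is_json_string_literal(const std::string& s) {
    StrState st = ST_START;
    int hex_left = 0;  // hex digits still expected after "\u"
    for (std::size_t i = 0; i < s.size() && st != ST_ERROR; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        switch (st) {
        case ST_START:
            st = (c == '"') ? ST_BODY : ST_ERROR;
            break;
        case ST_BODY:
            if (c == '"') st = ST_DONE;
            else if (c == '\\') st = ST_ESCAPE;
            else if (c < 0x20) st = ST_ERROR;  // raw control characters are not allowed
            break;
        case ST_ESCAPE:
            if (c == 'u') { hex_left = 4; st = ST_HEX; }
            else if (c == '"' || c == '\\' || c == '/' || c == 'b' ||
                     c == 'f' || c == 'n' || c == 'r' || c == 't') st = ST_BODY;
            else st = ST_ERROR;
            break;
        case ST_HEX:
            if (!is_hex_digit(c)) st = ST_ERROR;
            else if (--hex_left == 0) st = ST_BODY;
            break;
        case ST_DONE:
            st = ST_ERROR;  // trailing bytes after the closing quote
            break;
        case ST_ERROR:
            break;  // not reached: the loop condition stops on error
        }
    }
    return st == ST_DONE;
}
```

Each input byte causes exactly one transition and the only state is an enum plus a small counter, which is the property that makes the runtime and memory bounds easy to reason about.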
> I'm positive there's already an example of how to do it with Boost.Spirit's Qi/Karma and I'm almost sure that I'll start with those.

I hate to say it, but what I want is not that. I can't put Spirit code out on a network-facing environment for the same reason that I can't put a Haskell program out in such an environment - I don't understand it under the hood well enough to reason about the upper limits on its runtime resource consumption. (Actually, in the Haskell case it's not clear that anyone does. :-)

> The idea with the utility library is that it will be usable in many different contexts -- and I'm actually prioritizing the parsing of HTTP requests that have JSON payload in PUT/POST requests.
> Of course that's just work waiting to be done -- if you have specific use cases in mind aside from just (simple) configuration file parsing, I'd definitely appreciate guidance/thoughts on what you would look for in a JSON parsing/generation library.
Haha, cool, I get to play the customer for once.

My wishlist/thoughts:

* An interface based on UTF-8 encoded std::strings. Locales and other string encodings are not helpful to me.

* Require minimal header dependencies. For example, I take std::vector, map, string, shared_ptr, and BOOST_FOREACH as a given. But other big header trees should have a justification.

* You mentioned type-safe. But the documents are completely dynamic, there's no schema. I'd rather just have everything presented as strings, but maybe the library would do reasonable automatic conversions on output. I would not want incoming untrusted JSON to create objects of attacker-chosen types unless the interface makes the code state its expectations and throws an exception. Like dynamic_cast to a reference type (not like to a pointer type, which defaults to a null pointer crash).

* Some types I see as valuable to work with are "string of arbitrary text" (e.g. an unqualified std::string), "string claimed to be JSON" (we received it), and "string of known-valid JSON" (we generated or validated it). These are things that tend to get confused in applications, can result in security holes (double escaping bugs), and that stricter typing could help.

* What would make it really industrial-strength (i.e., good enough for web apps) is a first-class mechanism for declaring limits on total memory usage and object allocation count before beginning a parsing or generation operation.

* The DOM could have an interface sort of like:

    void f(shared_ptr<json::dom_node> jdn)
    {
        // as<json::object>() throws if somehow not a json::object
        shared_ptr<json::object> jo = jdn->as<json::object>();

        std::string username;
        BOOST_FOREACH(json::object_pair & jopr, jo->pairs())
        {
            // Iteration actively randomizes the order.
            // It's not significant according to the spec, right? :-)
            if (jopr.name() == "username")
                // value_as<std::string>() throws if not a json::string node
                username = jopr.value_as<std::string>();
        }
        ...
    }

* shared_ptr is great, but an intrusive_ptr could be good too. Hopefully cyclic references shouldn't be a problem, but a whole-document pool deallocator could be helpful. I like a convention where node types expose a typedef like 'sptr_type' with its preferred smart pointer type.

* It doesn't have to be a header-only library. It'd be better to have the interface small and simple.

* Interfacing to boost::serialization could be cool, but it's probably not the primary use case right now.

* I don't much care about what type of exception gets thrown. Anything under std::exception is fine. It would be good to have line and char position information for parsing errors.

* It would be cool if the parser could be incrementally spoon-fed input data and code could pull data out of the generator incrementally as well. This would facilitate usage with ASIO-like callbacks.

* A simple pair of functions for escaping and unescaping according to the actual JSON rules for the between-doublequotes context.

* And a pony.

Thanks,

- Marsh
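
P.S. As an illustration of two of the wishlist items (the escaping function and the "string of known-valid JSON" type), here is a rough sketch. Everything in it is an assumption made for the example: `json_escape` and `json_text` are hypothetical names, not part of cpp-netlib or any other library, and only the escaping half of the escape/unescape pair is shown. The input is assumed to be valid UTF-8; bytes at or above 0x80 are passed through untouched.

```cpp
#include <string>

// Hypothetical helper: escapes arbitrary UTF-8 text for use between the
// double quotes of a JSON string (the json.org "string" rules).
std::string json_escape(const std::string& text) {
    static const char hex[] = "0123456789abcdef";
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\b': out += "\\b";  break;
        case '\f': out += "\\f";  break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;
        default:
            if (c < 0x20) {
                // Remaining control characters must be written as \u00XX.
                out += "\\u00";
                out += hex[c >> 4];
                out += hex[c & 0x0f];
            } else {
                out += static_cast<char>(c);
            }
        }
    }
    return out;
}

// Hypothetical wrapper for "string of known-valid JSON". The constructor is
// private, so a value can only come from a function that escapes its input.
class json_text {
public:
    // Builds a JSON string literal (with quotes) from arbitrary text.
    static json_text from_string(const std::string& text) {
        return json_text('"' + json_escape(text) + '"');
    }
    const std::string& str() const { return s_; }
private:
    explicit json_text(const std::string& s) : s_(s) {}
    std::string s_;
};
```

Because a `json_text` does not convert implicitly to or from std::string, passing already-escaped JSON back into `from_string` requires an explicit `.str()` call, which makes a double-escaping bug easier to spot in review.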