On Tue, Oct 22, 2019 at 12:55 PM Mateusz Loskot via Boost wrote:
> I'd consider covering the thing with https://google.github.io/oss-fuzz/ instead.
My strategy for ensuring correctness is two-fold. First, as with Beast, it will be reviewed by an external company (they will do the fuzzing). Second, there are the special tests I write so that I have confidence everything works. The methodology is as follows:

* Create a set of representative test vectors (examples of valid and invalid inputs). I have written my own inputs, and I have imported these tests:
  https://github.com/nst/JSONTestSuite/tree/master/test_parsing

Then, for each test vector:

* Parse the input as one string and verify the output.

* Loop over every possible location at which the input may be split into two pieces, parse the input as two individual pieces, and verify the output. Code:
  https://github.com/vinniefalco/json/blob/cb348218345cfe2bea09d4a8ca8ea4c0f13...

Then, for each possible split point, also perform these algorithms:

* Using a special allocator (`fail_storage`) which throws after N calls to allocate, attempt to parse the input in a loop where N starts out at 1 and is incremented on each allocation failure. The test succeeds if the loop exits within a maximum number of iterations and the output is verified correct. Code for this failing allocator is here:
  https://github.com/vinniefalco/json/blob/cb348218345cfe2bea09d4a8ca8ea4c0f13...

* Using a special parser (`fail_parser`) which returns an error after N calls to the parser's SAX API, attempt to parse the input in a loop where N starts out at 1 and is incremented on each failure. The test succeeds if the loop exits within a maximum number of iterations and the output is verified correct. Code for this failing parser is here:
  https://github.com/vinniefalco/json/blob/cb348218345cfe2bea09d4a8ca8ea4c0f13...

These tests are run under valgrind, address sanitizer, undefined behavior sanitizer, and code coverage. Then I look at the code coverage to find uncovered or partially covered lines, and devise individual tests to ensure that code is exercised. By now, there are only a handful of such lines, if that.
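
To make the split-point loop and the fail-after-N allocator loop concrete, here is a rough, self-contained sketch. It is not the code from the repository linked above: `fail_state`, `fail_alloc`, `toy_parser`, `parse_split` and `check_all_splits` are hypothetical names invented for illustration, and a toy grammar (comma-separated non-negative integers) stands in for JSON.

```cpp
#include <cassert>   // assert(), for callers of check_all_splits
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <vector>

// Hypothetical shared counter: the N-th call to allocate() throws.
struct fail_state {
    std::size_t fail_at; // which call fails (1-based)
    std::size_t calls;   // calls seen so far
};

// Hypothetical minimal allocator that injects std::bad_alloc.
template<class T>
struct fail_alloc {
    using value_type = T;
    fail_state* st;

    explicit fail_alloc(fail_state* s) noexcept : st(s) {}
    template<class U>
    fail_alloc(fail_alloc<U> const& o) noexcept : st(o.st) {}

    T* allocate(std::size_t n) {
        if (++st->calls == st->fail_at)
            throw std::bad_alloc();
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>().deallocate(p, n);
    }
};
template<class T, class U>
bool operator==(fail_alloc<T> const& a, fail_alloc<U> const& b) noexcept { return a.st == b.st; }
template<class T, class U>
bool operator!=(fail_alloc<T> const& a, fail_alloc<U> const& b) noexcept { return a.st != b.st; }

using int_vec = std::vector<int, fail_alloc<int>>;

// Toy incremental parser for input such as "12,7,305".
class toy_parser {
    int_vec out_;
    int cur_ = 0;
    bool have_digit_ = false;
public:
    explicit toy_parser(fail_state* s) : out_(fail_alloc<int>(s)) {}

    // Consume one chunk; returns false on a syntax error.
    bool write(std::string const& chunk) {
        for (char c : chunk) {
            if (c >= '0' && c <= '9') {
                cur_ = cur_ * 10 + (c - '0');
                have_digit_ = true;
            } else if (c == ',' && have_digit_) {
                out_.push_back(cur_); // may allocate, so may throw
                cur_ = 0;
                have_digit_ = false;
            } else {
                return false;
            }
        }
        return true;
    }
    // Signal end of input; returns false if no element is pending.
    bool finish() {
        if (!have_digit_) return false;
        out_.push_back(cur_);
        have_digit_ = false;
        return true;
    }
    int_vec const& result() const { return out_; }
};

// Parse `input` as two pieces split at `split`; the allocator fails on call `fail_at`.
bool parse_split(std::string const& input, std::size_t split,
                 std::size_t fail_at, std::vector<int>& got) {
    fail_state st{fail_at, 0};
    toy_parser p(&st);
    if (!p.write(input.substr(0, split))) return false;
    if (!p.write(input.substr(split))) return false;
    if (!p.finish()) return false;
    got.assign(p.result().begin(), p.result().end());
    return true;
}

// For every split point, retry with N = 1, 2, ... until no injected failure is hit.
bool check_all_splits(std::string const& input, std::vector<int> const& expected,
                      std::size_t max_iters = 100) {
    for (std::size_t split = 0; split <= input.size(); ++split) {
        std::size_t n = 1;
        for (; n <= max_iters; ++n) {
            try {
                std::vector<int> got;
                if (!parse_split(input, split, n, got)) return false;
                if (got != expected) return false;
                break; // completed without an injected failure
            } catch (std::bad_alloc const&) {
                // injected failure: move the failure point one call later
            }
        }
        if (n > max_iters) return false; // loop never completed
    }
    return true;
}
```

A test driver would then assert on calls such as `check_all_splits("12,7,305", {12, 7, 305})`. The same retry loop applies to the failing-parser variant: replace the throwing allocator with a handler that returns an error on its N-th callback.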
With these techniques I achieve close to 100% code coverage and very high confidence that every path through the parser is correct. After a round of user testing (which consists of telling users it is "ready" and seeing what they report back), I submit it to the external code auditing company to get a report.

After fixing any issues they raise in the report, my strategy changes: touch the code as little as possible. If this code used an external dependency and that upstream code changed, then transitively my code changed too. For this reason I avoid using external code like Spirit (or regex), even if it means I have to duplicate stuff.

Thanks