
I'm really appreciative of the testing you are all doing with parsing code, and I hope that at the end we can see both how fast and how clear and maintainable the various styles of parser code turn out to be (focusing on accuracy first, then speed, then perhaps more on clarity). It's great to see the enthusiasm and results. I'm looking forward to more tips once we have a fair test directly pitting Spirit against expressive and the other parsing methodologies.

I'm also encouraged by what Edward's timer might add, though I'm a little wary of possibly including code that falls under FFTW's license, as it may be incompatible with Boost's.

Edward, I do wonder about the statistical significance metrics you provide (a great idea, by the way). Does your code assume a normal distribution of timing durations, and if so, can it measure whether the distribution of results actually conforms to that assumption? In my experience, timing benchmarks can suffer from outliers (usually OS-induced, since we're not running on an RTOS) that make the distribution less than normal. Would it be possible to establish that a given sample set of timings corresponds to a normal distribution (or perhaps to discard a certain percentage of outliers if necessary)? I'm no statistics person, but I have seen cases where 10,000 timing samples were biased by 10 samples that probably related to a program issue or a severe OS issue, e.g. normally 1 ms per iteration, with 10 samples at 1-10 s each.