Sorry for the late reply; your email was filed as spam because it failed SPF. Here is the SPF failure cause:

Received-SPF: Softfail (domain owner discourages use of this host) identity=mailfrom; client-ip=149.217.99.100; helo=mail2.mpi-hd.mpg.de; envelope-from=hans.dembinski@gmail.com; receiver=s_sourceforge@nedprod.com

You might want to fix this.

On 08/03/2017 08:29, Hans Dembinski wrote:
Dear Niall,
Those of you who watch reddit/r/cpp will know I've been working for the past month on a pure Python implementation of a C99 conforming preprocessor. I am pleased to be able to ask for Boost feedback on a fairly high quality implementation:
did you do some benchmarks on how fast pcpp is compared to a "normal" C-based preprocessor?
Not with really large inputs yet, no. But its scaling curves are ideal: it is linear in tokens processed, linear in macros expanded, and linear in macros defined. It's an ideally minimum-copy implementation, made easy by Python never copying anything unless asked, plus we keep token objects below 512 bytes so the small-object Python allocator is used instead of malloc. In absolute terms it will always be far slower than a C or C++ implementation, but we're talking half a second versus a tenth of a second for small inputs. I would suspect that for large inputs the gap will close; Python ain't half bad at performance once objects are allocated, especially Python 3, where pcpp runs noticeably faster than on Python 2. I haven't tried pcpp with PyPy (a JIT compiler) yet, but it does nothing weird, so it should work. That would close the absolute performance gap substantially, I would guess.
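As a rough illustration of the small-object point (a sketch only; this Token class is hypothetical, not pcpp's actual token type), a slotted Python object carrying a handful of token fields sits comfortably under the 512-byte threshold below which CPython's pymalloc small-object allocator serves allocations from pooled arenas rather than calling malloc per object:

```python
import sys

class Token:
    """Hypothetical preprocessor token -- not pcpp's real class,
    just a stand-in to show the object sizes involved."""
    __slots__ = ('type', 'value', 'lineno', 'source')

    def __init__(self, type_, value, lineno, source):
        self.type = type_
        self.value = value
        self.lineno = lineno
        self.source = source

tok = Token('CPP_ID', 'FOO', 42, 'example.h')
# CPython's pymalloc handles allocations of <= 512 bytes itself;
# only larger objects fall through to the system malloc.
print(sys.getsizeof(tok), sys.getsizeof(tok) <= 512)
```

Using __slots__ keeps the per-instance dict from being allocated at all, which is one way to stay under that threshold.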
I understand that you wrote this implementation to get more features and better standard-compliance than some commercial preprocessors, but since some people around me have claimed that preprocessing takes a significant fraction of total compile time, I wonder about performance.
If a build step can pre-preprocess all the #includes into a single file and run most of the preprocessing up front, compilers can parse the result much more quickly. If that step takes a few seconds but saves minutes on the overall build, you win. I can't say anything about MSVC, but GCC and clang have a special fast path in the preprocessor for chunks of text in which no macro expansion is possible. With already-preprocessed input, each translation unit can therefore save big, and the overall build time is substantially reduced. That's why Facebook Warp, HPX and other projects have implemented a pre-preprocessing build step.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
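P.S. A toy sketch of such a pre-preprocessing step (hypothetical code, not what Warp or HPX actually do; it ignores <...> system headers, include guards and conditional compilation) is to recursively splice quoted #include directives into one stream:

```python
import os
import re
import tempfile

def inline_includes(path, seen=None):
    """Recursively splice quoted #include "..." directives into one
    text stream. Toy sketch only: ignores <...> system headers,
    #pragma once, include guards and #if conditionals."""
    if seen is None:
        seen = set()
    real = os.path.realpath(path)
    if real in seen:          # crude guard: visit each file at most once
        return ''
    seen.add(real)
    out = []
    with open(path) as f:
        for line in f:
            m = re.match(r'\s*#\s*include\s*"([^"]+)"', line)
            if m:
                inc = os.path.join(os.path.dirname(path), m.group(1))
                out.append(inline_includes(inc, seen))
            else:
                out.append(line)
    return ''.join(out)

# Tiny demonstration: a source file pulling in one header.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, 'config.h'), 'w') as f:
        f.write('#define ANSWER 42\n')
    with open(os.path.join(d, 'main.c'), 'w') as f:
        f.write('#include "config.h"\nint main(void) { return ANSWER; }\n')
    flattened = inline_includes(os.path.join(d, 'main.c'))
    print(flattened)
```

A real pre-preprocessing step would of course also expand macros and evaluate conditionals, which is exactly what running pcpp (or cc -E) over the amalgamated file gives you.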