Re: [boost] My Proto Review

26 Mar 2008

      Eric Niebler wrote:
...
Joel de Guzman wrote:
...
...
Eric and I had our share of squabbles in the past ;-) related
to how I expected Proto to be. I knew from the start that I
needed something like it. Joao Abecasis, a Spirit developer,
started one like it, from my urging. Fusion was developed,
in part, due to the need for manipulation of heterogeneous
data structures, specifically for spirit: syntax trees.
But there still isn't a Fusion tree data structure, is there?
Nope. But I see no need. A fusion sequence can represent a
hierarchy because the elements are heterogeneous. A sequence
element can simply be another sequence.
...
...
I recall that our biggest disagreement, and one that we
resolved soon enough, was my need for flattened sequences.
Eric insisted on presenting the results with no flattening,
I wanted flattening. For example, I wanted:
a >> b >> c >> d
to be a flattened fusion sequence, not a tree. I was reluctant
to have Spirit go through a 2-step process that necessitated
a conversion from a tree to a fusion-sequence:
proto-tree-->fusion-sequence
All things come around again. With proto domains and generators, there 
is now a way to get the incremental flattening you originally wanted -- 
within a domain, where it belongs. But as I was able to show, the 
current solution in Spirit-2 -- letting proto build the tree and 
flattening it all at once -- actually performs better.
Cool. Actually, in Spirit-2, not all nodes are flattened.
So that makes perfect sense; I'm already doing per node
flattening in the transforms. Perhaps I can take advantage
of this new feature to further simplify Spirit-2?
...
...
At first, this lead to the development of segmented fusion
iterators. Eric actually implemented this:
http://lafstern.org/matt/segmented.pdf into Fusion!
It was OK and efficient at runtime. Unfortunately, compile
time suffered considerably.
To be clear, the compile time of a segmented Fusion algorithm with a 
segmented Fusion data structure is very good. It's when you use a 
non-segmented algorithm with a segmented data structure that the compile 
times get bad -- because you need to flatten the sequence first. Any 
attempt to use an ordinary Fusion algorithm with a hierarchical data 
structure will have this problem.
The biggest problem with segmented fusion is the need to implement 
segmented variants of all the algorithms. That's a non-trivial undertaking.
Yes. And it's always better to deal with it on the sequence level,
(not the iterator level) as we found out. Indeed, that's the
fundamental idea behind SYB.
...
...
Our endless discussions culminated with:
http://article.gmane.org/gmane.comp.parsers.spirit.devel/2765
Seems some folks were tuning in to our discussions.
The C++ version of "Scrap Your Boiler Plate", a very
powerful, well studied and widely used design pattern
for generic traversal from the Haskell world:
http://portal.acm.org/citation.cfm?id=1159861.1159871
references Eric's post above.
So far so good, yet, SYB ("Scrap Your Boiler Plate")
was not available in a mature form in C++ yet. This then,
again, lead to Dan Marsden's proposed traversal library.
What Eric ended up doing was to provide a zero-overhead
tree structure that's converted very efficiently to a
flattened fusion sequence. It was as fast as my non-proto
Spirit2 implementation and with some compile time performance
hit.
I would be surprised if the flatten-all-at-once approach causes longer 
compile times than the flatten-incrementally approach. I remember 
showing that, since flattening incrementally amounts to pushing at the 
back of a cons list, it causes more templates to be instantiated than 
just building the cons list all at once back-to-front when all the 
elements are known.
If the proto-ified version of Spirit-2 has longer compile times than the 
non-proto-ified version, I suspect the problem is elsewhere.
Right. Perhaps I was under an obsolete impression :P
It might be a good idea to show some new benchmarks on
this? Alas, I no longer have the old non-proto code :(
...
...
In due time, I'll be using Dan traversal library.
With Proto? Or as a replacement for Proto? How do you intend to use 
Dan's code in Spirit-2?
Proto+Fusion+SYB.

It seems Hartmut already outlined one plan. I've some
more ideas in addition to Hartmut's:

1 Proto+Spirit creates an ET tree from the Spirit grammar
   and rules. No need to flatten, Spirit stores the ET
   nodes as-is plus some node specific info (e.g. sequence,
   alternate tags).
2 Spirit+Fusion+Traversal traverses the ET tree using per-node
   specific specialized algorithms (e.g. parse/generate for
   terminals, sequences, alternates, etc) and generates
   attributes. Traversal deals with the walking strategy (e.g.
   I want to have a full stride over all sequence elements
   as if it was flattened).
3 Attributes are fusion-sequences, stl-sequences and variants
   essentially representing the AST. This AST can again be
   traversed/transformed by Traversal (e.g. for subsequent passes
   over the AST like optimizations and actual backend code
   generation).

If this is a bit unclear at the moment, no worries. The ideas
are just forming :-)

Regards,
-- 
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net