Various comments:
http://en.wikipedia.org/wiki/Algol68:
I don't see any discussion there of van Wijngaarden grammars, or even
anything that is recognizable as a formal grammar. What am I supposed to
be looking at on that page?
The Wikipedia Algol68 article is a reasonable brief informal
introduction. It does not mention VWG; I cited it so you could see what
VWG was used to define, and so you would have a more congenial starting
point than the Algol68 definition itself.
http://burks.brighton.ac.uk/burks/language/other/a68rr/rrtoc.htm
I see a large table of contents there. Care to narrow it down for me?
This citation is the formal definition. The discussion of the
definition mechanism is chapter 1. Chapters 2-9 are the language
proper; 10-11 are what a modern language would call the "standard
library" and in this context can be ignored; and 12 is tabular summary
material that appears to have been omitted from the document. The whole
report is <10% the size of the C++ standard without its
standard library, and for a richer language than C++ at that.
In consequence the definition is extremely dense semantically. For
example, Algol68 permits slicing (a generalization of subscripting) a
multidimensional array so that the result is an array of the same or
lower dimension that is a regular partial alias of the original. For
example, given a rectangular matrix you can apply a slicing operation
that yields the tridiagonal of the original, which then can be
manipulated as an array in its own right. The whole formal
definition of this capability (including syntax, type and semantic
definition) is this gem in VWG (from 5.3.2.1 in the cite):
a) REFETY MODE1 NEST slice{5D} :
     weak REFLEXETY ROWS1 of MODE1 NEST PRIMARY{5D},
       ROWS1 leaving EMPTY NEST indexer{b,c,-} STYLE bracket,
       where (REFETY) is derived from (REFLEXETY){531b,c,-};
     where (MODE1) is (ROWS2 of MODE2),
       weak REFLEXETY ROWS1 of MODE2 NEST PRIMARY{5D},
       ROWS1 leaving ROWS2 NEST indexer{b,d,-} STYLE bracket,
       where (REFETY) is derived from (REFLEXETY){531b,c,-}.
The rest of the section comprises a few non-terminal definitions that
are used in the above, examples of each production rule, and a
(perhaps) helpful narrative comment to explain the lot:
{A subscript decreases the number of dimensions by one, but a trimmer
leaves it unchanged. In rule a, 'ROWS1' reflects the number of
trimscripts in the slice, and 'ROWS2' the number of these which are
trimmers or revised-lower-bound-options. If the value to be sliced is a
name, then the yield of the slice is also a name. Moreover, if the mode
of the former name is 'reference to flexible ROWS1 of MODE', then that
yield is a transient name (see 2.1.3.6.c).}
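For readers who find the rule hard to picture, here is a rough Python
sketch of the behaviour that comment describes. It is only an
illustration and not Algol68 semantics in full: the class ArrayView and
its methods slice, get and set are hypothetical names invented here, and
it assumes 0-based indices with half-open (lo, hi) trimmers, where
Algol68 has its own bounds conventions. It also ignores flexible arrays
and transient names. What it does show is that each subscript removes a
dimension, each trimmer keeps one, and the result aliases the original
storage rather than copying it.

```python
class ArrayView:
    """A strided view onto a shared flat list (hypothetical illustration)."""

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data              # shared storage; never copied
        self.shape = tuple(shape)
        if strides is None:
            # Row-major strides for a freshly created array.
            strides, step = [], 1
            for extent in reversed(self.shape):
                strides.append(step)
                step *= extent
            strides.reverse()
        self.strides = tuple(strides)
        self.offset = offset

    def slice(self, *trimscripts):
        """One trimscript per dimension: an int is a subscript,
        a (lo, hi) pair is a trimmer (assumed half-open here)."""
        if len(trimscripts) != len(self.shape):
            raise ValueError("need one trimscript per dimension")
        offset = self.offset
        shape, strides = [], []
        for t, extent, stride in zip(trimscripts, self.shape, self.strides):
            if isinstance(t, int):
                # Subscript: fixes this dimension, so it disappears.
                if not 0 <= t < extent:
                    raise IndexError("subscript out of bounds")
                offset += t * stride
            else:
                # Trimmer: narrows this dimension but keeps it.
                lo, hi = t
                if not 0 <= lo <= hi <= extent:
                    raise IndexError("trimmer out of bounds")
                offset += lo * stride
                shape.append(hi - lo)
                strides.append(stride)
        return ArrayView(self.data, shape, strides, offset)

    def _position(self, index):
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        pos = self.offset
        for i, extent, stride in zip(index, self.shape, self.strides):
            if not 0 <= i < extent:
                raise IndexError("index out of bounds")
            pos += i * stride
        return pos

    def get(self, *index):
        return self.data[self._position(index)]

    def set(self, *index, value):
        self.data[self._position(index)] = value


# A 3x4 matrix holding 0..11 in row-major order.
matrix = ArrayView(list(range(12)), (3, 4))

row = matrix.slice(1, (0, 4))          # subscript + trimmer: one dimension left
block = matrix.slice((0, 2), (1, 3))   # two trimmers: still two dimensions

row.set(2, value=99)                   # write through the alias
```

After the last line, matrix.get(1, 2) and block.get(1, 1) both see the
new value, because all three views share one underlying list.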
Note that Algol68 dates from the days when Fortran II was the
preeminent language for such things; C did not exist; object
orientation was known only in an obscure simulation language; and even
the notion of structured programming was not yet fully worked out.
If you really want to pursue this then I strongly recommend that you
start with Lindsey & van der Meulen (Lindsey, C.H. and van der
Meulen, S.G., /Informal Introduction to
ALGOL 68/, North-Holland, 1971). It is well written and clear, and you
will know far more about language design and definition than perhaps
you really wanted when you have finished it.
As for my own contribution:
How badly do you want to see something like this in
Boost? Badly enough to jump in and get your hands dirty with some code?
Maybe you could help me to add two-level grammar support to Proto.
the answer is not badly enough to do it myself, and you (or anyone)
have much to do even before deciding to actually try. Yes,
I'll help, but I have no capacity to hand-hold.
However, by all means jump in:
Well, is there room for *yet another guy* in this stuff cause I'm really interested to get something out of this.
However, the boost list is probably not the right place. You can reach
me directly through ivan at ootbcomp dot com. Google has a lot
on both Algol68 and van Wijngaarden grammars. Have fun :-)
Ivan