Re: [boost] [contract] syntax redesign

4 Jun 2011

      On Thu, Jun 2, 2011 at 1:56 PM, Matt Calabrese <rivorus@gmail.com> wrote:
...
On Thu, Jun 2, 2011 at 9:28 AM, lcaminiti <lorcaminiti@gmail.com> wrote:
...
Hey Matt, I am still looking over your slides so sorry in advance if the
following suggestions don't make sense...
First, thank you so much for taking the time to do this. I don't think
anyone other than myself has really picked apart the latest syntax and tried
to improve it.
:)
...
...
Is it possible remove some of the
extra parenthesis from your examples on page 79 and 96 as I suggest below?
If not, why?
...
// Is the following syntax possible?
BOOST_GENERIC(
// Use leading `concept` "keyword" to determine macro functionality.
I considered this during implementation, and same with "auto" concepts, but
decided against it for a few reasons. For one, it complicates the
implementation for, imo, little or no gain, and it can impact compile-time,
which is something that I consider a problem with the library at this point
in time.
OK.
...
...
// `typename` doesn't need extra parenthesis.
It does if I want to be consistent with all parameter kinds, but I could
special-case typename and class if I disallow fully qualified names as you
currently do. Right now I support typename/class, value parameters, and soon
varidiac parameters which will work via the syntax ( (ParameterKindHere,,,)
SomeNameHere ). I'll give some thought to special-casing, but for now, it's
more consistent for users and easier to preprocess if I don't do that, and
fully qualified names are handled without problem. The more I look at your
library though and the more you mention it, the more I question my choice,
so it is possible I'll start special-casing things as you are doing, but I
have a lot on my plate before I devote time to changing features that
already work.
My new syntax also supports these cases:

    template< class T = int, typename U, T x, int y = 0 > // usual C++

    template( class T, default int, typename U, (T) x, int y, default
0 ) // contract syntax

If for consistency you want to use extra parenthesis around all types,
the following also works:

    template( class T, default int, typename U, (T) x, (int) y, default 0 )
...
// `extends` is not necessary but I find it more readable.
...
This was just because I wanted to use a keyword that an IDE would highlight.
In the predecessor to this library I was using words that weren't keywords
and Dave Abrahams suggested that I switch over to keywords, with a rationale
that I ultimately agreed with (easy to remember, it's not highlighted if you
make a typo). Extends is not and likely will not be a keyword in C++.
Also, with concepts, there's no such thing as "private" or "protected"
refinement -- the public there acts as "extends" acts in your example in
that it's only used to signify that what follows is a list of less refined
concepts. If I change it, the public part would just be replaced rather than
both words being there, but I'd  probably use "refines" instead of extends
since that's the proper term in Generic Programming, though that too is not
a keyword.
This still keeps public that is highlighted (BTW, in the library docs
I will provide the DSEL grammar and a list of the new "keyword" like
extends, requires, precondition, etc so you can configure your IDE to
color them up) and yes, refines makes more sense:

    refines( public Semiregular<T> )
...
...
   // BOOST_GENERIC_ASSIGN just expand to `,` but I find it more readable.
I'm not a big fan of that. I use , because it's short and simple, though I
have considered (,) to make it jump out as being different and unambiguous.
If you can provide a good rationale for BOOST_GENERIC_ASSIGN over those
options, I'm open to it, but at this point I favor simplicitly over a big
macro name (and my gut is that most users will too).
...
   , (MoveConstructible) reference BOOST_GENERIC_ASSIGN typename
The encapsulating set of parentheses around each individual parameter are
for important reasons. For one, your current example is actually ambiguous.
//////////
(MoveConstructible) reference, typename something_here
//////////
could either be a contrained typename requirement with a default, or it
could be a constrained typename without a default followed by an
unconstrained typename. The encapsulating parentheses is imo the simplest
solution for users to remember and for implementation, although if I use
something like (,) or BOOST_GENERIC_ASSIGN for default assignment I could
disambiguate this particular issue that way.
I used ", default ..." for default parameters:

    (MoveConstructible) reference, default typename x::reference
...
However, another issue is that the default may, itself, have commas in it.
Without the encapsulating parentheses, I can't know if the next comma in a
default is a parameter separator or a part of the default (for instance, a
default of boost::array< int, 2 >). If I special-case it so that you have
the option of wrapping the default in parentheses if it has commas, then
it's ambiguous with the situation where there is no default and where the
next parameter to the concept macro starts with parentheses, etc. I could
keep trying to special-case further, or I could just make a single,
consistent rule that's easy to remember as I've done. IMO, this option is
simpler for both myself and users and also aids in keeping compile-times
down.
To handle unparenthesized commas, I'd use the IDENTITY_TYPE/VALUE
macros. Commas not wrapped within parenthesis are rare so IMO it's not
worth to ask users to always use extra parenthesis to handle them.
It's simpler to ask user to use a special macro when they are present.
...
Also, since I am consistent with wrapping each parameter with parentheses,
I also went down this path and called the previous syntax
"parenthesized syntax" because all tokens were consistently wrapped
within extra parenthesis. Then I was think it's easier for users to
learn the syntax "just wrap all tokens of the usual C++ syntax within
parenthesis (more or less)". Everyone that looked at the syntax
complained about the extra parenthesis :(( I also now think that the
newer syntax with only a few extra parenthesis is better.
...
it is much easier for me to give descriptive error reporting if the user
screws something up that I can catch at preprocessing time. For instance, if
a user is creating a many-line concept, such as an allocator-like concept,
and he makes a typo somewhere in the middle, I want to be able to give the
best error possible, including a parameter number, that particular parameter
text in full, and even be able to continue parsing the remaining parameters
to the concept, catching other potential errors during preprocessing. If
each parameter is wrapped in parentheses, it is much easier to do this --
any noticeable error that occurs in the middle of a given parameter, even if
it involves something tricky like commas, does not affect the parsing of
further parameters, and I'm able to output the full parameter text of the
parameter with the error in a static assert without accidentally cutting off
part of what the user thought was a single parameter but ends up not being
the case because of a misplaced comma.
Error reporting is a good point especially because all your code is
within the macros (for Boost.Local the function body code is outside
the macro and that's were most of the actual programming errors will
be so I likely don't have to worry about error reporting too much -- I
still check the syntax using the pp as much as possible).
...
Finally, the concept macro, as I'm sure you can imagine, is fairly nasty
underneath the hood. One nice thing about the implementation, however, is
that I handle most of the top-level implementation via a simple preprocessor
for-each operation over all of the parameters after the name and
refinements. Having the parameters consistently separated is what makes this
directly possible without any difficulty -- since the parameters are just
individual macro arguments, I use a construct that invokes a macro for each
parameter. It doesn't require any complicated logic to figure out where
individual parameters are separated, is very efficient, and it keeps the
loop logic separate from the parsing of individual parameters.
...
   // `operator(symbol, name)` allow user to name the operators
I can't use a symbol, and a user-provided name is unfortunately meaningless
both to me and the user here. Associated operator requirements of concepts
check if the operator is callable with respect to the pseudo-signature. The
reason I use a fixed name is because, internally, I have to generate
metafunctions during preprocessing that check to see if the operation is
callable for the given parameter types, and what is generated can vary
depending on which operator is used. In order to do this, I have to know,
entirely during preprocessing, exactly which operator is specified. You
can't do this if the user provides the symbol representation of the
operator, and an arbitrary, user-specified name wouldn't help. The names I
use come from N2914 standard concept names, so there is also some
consistency here.
I see.
...
...
   // Types `X&`, `X&&`, `int`, etc do not need extra parenthesis.
I could special case int and other built-ins, but I don't see how it would
be possible to special case X, or X&, etc. The "X" is a user-defined concept
parameter for that specific macro invocation, not a universal identifier
that I'm using.
, (postincrement_result) operator(++, postinc)( X&, int )

That's a tuple and X& is the only token... why can't you just get it with:

    PP_VARIADIC_TUPLE_ELEM(0, (X&, int)) // expand to X&

I am missing something here... maybe sometimes you have some other
tokens like in ( (X&) x, int y ) ??
...
...
BOOST_GENERIC(
// Use `concept_map` "keyword" to determine macro functionality.
Yes, I can do that, but again, it makes preprocessing more complicated,
affects compile-time, and is one more place for a user to make a mistake
that I would want to explicitly look for if I were to give an easy-to-read
error message. Overall, I don't think I'm gaining enough, if anything, by
pushing more stuff into the macro itself. In an ideal world, I agree, that
looks cool, but I just don't think it's practical. Simply calling the macro
BOOST_GENERIC_CONCEPT_MAP gives me all of the information that I need
without additional preprocessing.
...
// Do you need ObjectType? Is was missing from above syntax...
Sorry, that's covered in the talk itself and is not directly in the slides
(the video is not online yet, but I'll link it whenever whoever is doing
post uploads it) -- I can't handle concept-constrained concept maps just yet
and if it's not clear, ObjectType is a concept as opposed to a type. That's
why the example is not exactly the same between the two. In the end, if you
are curious, the syntax will be
//////////
template( ((ObjectType)) T )
// or if I disallow fully-qualified names
template( (concept ObjectType) T )
//////////
  , typedef T value_type
...
   , typedef ptrdiff_t difference_type
   , typedef T& reference
   , typedef T* pointer
No, that's not really possible in practice. The problem is that the typedefs
themselves may contain commas in them and what comes after the comma could
potentially be something not parenthesized that is not valid to concatenate
with at preprocessing time. I.E. a bool argument with a !, etc. Trying to
Again, can't you use IDENTITY_TYPE to handle unwrapped commas as
special cases? That way you simply the syntax removing the extra
parenthesis for the common cases (unwrapped commas are rare in my
experience).
...
handle this syntax and do special casing would be complicated and
incomplete. Worse, when someone does something that looks as though it
should be valid but really is not, solely for preprocessor reasons that I
can't explicitly check for such as the one mentioned above, they would get a
strange preprocessor concatenation error that would likely be completely
meaningless to them. I try hard to make it so that the user should not be
seeing errors like that, and without them having to remember what may seem
like arbitrary rules that only make sense if you have written something
along the lines of this library or your library, or Boost.Parameter, etc.
which most programmers have not. Even then, the issues are extremely subtle.
Again, thanks for all of this feedback. I've been wanting another head to
kind of tear things apart. If anything, I think I may go ahead and
special-case parameters like void, class, typename, etc. though that won't
be for a while. Not using a comma as a replacement for = when specifying
defaults may happen too. All of the other special-casing and attempted
removal of parentheses I'll have to give further thought to, but I think the
current design has the least subtle gotchas.
Thanks to you to take a look at Boost.Contract newer syntax. BTW, this
is major syntax re-design #4 for Boost.Contract... it requires a lot
of patience and perseverance :)

--Lorenzo

Re: [boost] [contract] syntax redesign

Lorenzo Caminiti