
-----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Douglas Gregor
I'm basing my comments on the most recent "Indiana" proposal for concepts, described in committee paper N1849:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1849.pdf
Okay.
However, we are busy working on a new proposal for concepts that should be available in the near future. The new proposal will have different syntax in several places and should have greatly improved presentation.
Okay, this is what I suspected--that a lot of the details exist mostly in the collective minds of those close to the proposal.
Imagine a world where either 1) copious amounts of implementation detail bubbles up the hierarchy of concept requirements,
This claim has been made before, but has yet to be proven. With concepts, your templates need to require what they use, e.g., if you want to treat some type "Iter" as an Input Iterator in your implementation, you have to say that in the interface. Otherwise, you don't get the benefits of separate type checking for templates. We've looked at a few examples that others have had where implementation details bubble up, but they were all either due to misunderstanding of how concepts worked or very poor abstractions. Did you have a particular example in mind?
Well, for a simple scenario:

template<class T> T f(T x) { return x; }
template<class T> T g(T x) { return f(x); }

If I add concept requirements to f and g, the set of requirements for g has to include the set of requirements of f and f itself. If I add another function:

template<class T> T h(T x) { return g(x); }

Its set of requirements must include the set of requirements from g and g itself. So far, for h, that set includes CopyConstructible, the ability to call f, and the ability to call g. That's what I'm referring to when I say "bubbling of implementation detail." Note, in particular, that CopyConstructible is not sufficient as the sole requirement of g or h. f is needed in the requirements of g, and both f and g are needed in the requirements of h, because both are bound during the second phase--which could result in (e.g.) an ambiguity. You need to raise that ambiguity to the top-level instantiation (or to a concept check immediately prior to instantiation) in order to avoid the two-megabyte error messages (whether they be in regular templates or in concept checks). Something like CopyConstructible is fairly innocuous as an implementation detail, but f and g are not--particularly the requirement for f in h. Such bubbling makes it virtually impossible to let those calls be resolved during the second phase (without some sort of concept-requirement union mechanism, either explicit or implicit).
2) the 2-megabyte error messages we get now are replaced with 2-megabyte error messages of concept failures deep in the hierarchy of concept requirements,
The 2MB error messages we get today are mostly due to the need to include the instantiation context with the message. We get something that says "no operator< for x < y", which shows up in some implementation detail function __introsort_loop(), called from __introsort(), called from __sort(), called from sort(), called from <where the user actually made the mistake>. Each of these functions needs to also have the list of template arguments it was given. That's a lot of text.
Yes, I understand that.
Concepts provide separate type-checking, so the error will show up in one of two places:
1) In the definition of __introsort_loop(), before it is instantiated, if there is no suitable operator<. No instantiation stack needed, because there's no instantiation yet.
2) In the user code, where the user actually made the mistake. This error will say that the requirement for, e.g., LessThanComparable is not satisfied by the types involved. No instantiation stack needed, because we haven't tried to instantiate sort().
What I fail to understand is why this is just LessThanComparable here instead of all of LessThanComparable, __introsort_loop, __introsort, __sort, and sort. All of those can fail to be bound during the second phase--even if they are namespace qualified--because of overloading.
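[A compilable sketch of that failure mode, with illustrative names--lib, user, helper, and sort_impl are made up here: argument-dependent lookup lets the user's namespace inject a competing overload of an internal helper, so any name bound in the second phase can become ambiguous, not just the top-level operator<.]

```cpp
#include <cassert>

namespace lib {
    template<class T> int helper(T) { return 1; }

    // helper(x) is a dependent call, bound only at instantiation time.
    template<class T> int sort_impl(T x) { return helper(x); }
}

namespace user {
    struct Thing {};

    // Found by ADL whenever sort_impl is instantiated with user::Thing.
    template<class T> int helper(T) { return 2; }
}

// lib::sort_impl(user::Thing{});  // error: ambiguous call to helper
// lib::sort_impl(0);              // fine: only lib::helper is found
```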
Concepts (and Generic Programming itself) have always been about extensibility and avoiding lock-in. Let's take the trivial example of a min() function and examine its use with and without concepts. Without concepts, it looks like this:
template<typename T> const T& min(const T& x, const T& y) { return x < y? x : y; }
You can instantiate min() with any type U so long as there is a < operator that can compare two U's. The compiler does this by substituting U for T throughout the body of the template, and then looking up the < operator for two U's.
With concepts, one would create a LessThanComparable concept and use that within the declaration of min():
auto concept LessThanComparable<typename T> { bool operator<(T, T); };
template<typename T> where LessThanComparable<T> const T& min(const T& x, const T& y) { return x < y? x : y; }
You can instantiate min() with any type U so long as there is a < operator that can compare two U's. There are some subtle differences (i.e., the concept-based min() has been fully type-checked at definition time), but the use of min() will be the same either way.
Yes, I understand all of the above, and everything is simple at this simple level. What happens, on the other hand, when some other template uses min? It then needs both LessThanComparable and min (or some concept name that describes the operation). In this particular case it isn't so bad, but when a function (or any other name bound during the second phase) is a non-public implementation detail (such as func() calling func_impl()), it is a serious problem (func_impl is an operation on the types passed to it just as much as operator< is). Say you have the operator< above, but it is implemented as:

template<class T> bool operator<(const X<T>& x, const X<T>& y) { return x.less(y); }

Now the min above breaks because the where clause does not include the requirement that T (the one in min) has a less() member function. I don't see how you can avoid the bubbling of implementation detail without severely restricting second-phase lookup--ultimately to the point of listing viable alternatives beforehand (subsetting second-phase lookup) and prioritizing them (to avoid ambiguity).
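[Rendered as today's unconstrained templates, the scenario compiles fine--X, less(), and my_min are illustrative names. Second-phase lookup binds operator< only at instantiation, so the less() requirement never has to appear in my_min's interface; that is the flexibility a checked where clause would have to capture somehow.]

```cpp
#include <cassert>

template<class T>
const T& my_min(const T& x, const T& y) { return x < y ? x : y; }

template<class T>
struct X {
    T value;
    bool less(const X& other) const { return value < other.value; }
};

// The operator< that my_min's "x < y" eventually binds to; it silently
// imposes a further requirement: a less() member on X<T>.
template<class T>
bool operator<(const X<T>& x, const X<T>& y) { return x.less(y); }
```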
Concepts allow one to adapt syntax through the use of "concept maps" (called "models" in N1849). Concept maps mean that you never have to match the exact syntax required by a concept, because you can tell the compiler how your types meet the requirements of a concept. For instance:
concept_map LessThanComparable<Employee> { bool operator<(Employee x, Employee y) { return x.id < y.id; } };
void f(Employee e1, Employee e2) { min(e1, e2); // Okay, uses LessThanComparable<Employee>::operator< }
This is a great idea, but that is not my main concern. My main concern is how you can avoid implementation detail bubbling--even in simple cases--without some sort of premature binding.
IOW, I've yet to see a concept mechanism that could even be used to implement the STL (about the simplest possible example) with the same degree of extensibility that we have now. Maybe a concept mechanism can be designed that minimizes how much extensibility is lost, but either way, it's hardly the all-around win scenario that you're implying above.
We immediately rejected any design that could not express the STL.
Note that I said "...implement the STL with the same degree of extensibility that we have now."
The appendix of N1849 contains many examples illustrating how the STL can be expressed with our concept proposal. These aren't straw men or guesses or wishes: they were copied directly out of the implementation of the Standard Library in ConceptGCC, and they work as expected.
Okay.
I am fully aware that I sound like a raving lunatic, claiming to fix all of the current problems with templates. ConceptGCC implements enough of the Indiana concepts proposal to express most of the STL, from the simplest concept like CopyConstructible all the way to proxy-laden InputIterators. We know how to express the rest of the STL, and much of the non-STL Standard Library, using concepts, and have also looked at expressing concepts from other libraries (e.g., Boost.Graph, Boost.Function). What we've found is that we can improve the quality of template libraries (by catching bugs earlier; we've found several in libstdc++), their usability (with shorter/better error messages), and their composability (by allowing syntax adaptation).
I really hope that is the case. However, all examples that you've shown seem to allow transitive compatibility--meaning f can call g without listing the requirement for g if f and g both have the same requirements on their operands (etc.). I don't see how this could possibly work without binding g at the point of f's definition--i.e. _not_ during the second phase.
Perhaps my understanding is out-of-date. If so, corrections are welcome. However, my opinion (watching this from the very beginning) is that concepts are fundamentally at odds with ad-hoc polymorphism and second-phase lookup. In order to do it, you have to make significant concessions in those two areas--and I doubt this has changed.
There are always tradeoffs when introducing a type system into an untyped language. The question is whether one can express all of the interesting and useful constructs within the type system. If there's some useful construct that you are concerned might not be expressible with concepts, please give us an example and we'll discuss it.
My main concern is what I've mentioned above--which is a generic concern related to all generic programming under a concept mechanism. I'm not trying to be a naysayer; I really want a concept mechanism to work, but I'm not sure that it can work without major concessions in the ad-hoc way that regular templates work. That ad-hoc nature is what yields two-megabyte error messages, but it is also what makes templates so powerful and so incredibly flexible.

Regards,
Paul Mensonides