
-----Original Message-----
From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Tobias Schwinger
The paragraph might not be in the right spot, but what the paragraph says is important. The process of macro expansion includes more than just expanding a single macro invocation. Rather, it is the process of scanning for macro expansions, and the whole thing is defined that way.
When reading the text, it becomes clear that there is more to macro expansion than expanding a single macro and that multiple steps are required...
Okay, but I think that I still need something at the beginning to clarify what the article is about.
The paragraph seems to be attempting an introduction, but it does a bad job of it, IMO.
That's fair. I'll rework it.
The reader probably has no idea what "painted" means at this point. Indicate the forward-declaration by "see below" or something like that.
I do in the very next sentence.
Yeah, but with too much text, IMO.
Okay. I was trying to summarize the notational conventions that were used throughout the document. I suppose that I could avoid the forward references here, and just introduce the notations when I introduce the concepts.
of tokens looking for macro invocations to expand. The most obvious of these is between preprocessing directives (subject to conditional compilation). For example,
I had to read this sentence multiple times to make sense of it...
What part was difficult to follow? It seems pretty straightforward to me (but then, I know what I'm looking for).
"Between preprocesing directives" -- what?!
Sure, it is correct. But it's written more from the viewpoint of the preprocessor than from where your reader is at.
Well, given that this article is not about the preprocessor as a whole, I think it is safe to assume that readers are familiar with directives--I'm not even describing #define directives in this article. I'm really not sure how else to describe the block of tokens between directives. Except for unused conditional compilation blocks, all tokens between directives are scanned for macro expansion.
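For example (just a quick sketch; ID here is a made-up macro, not anything from the article):

#define ID(x) x

int a = ID(1);   // this block of tokens is scanned: it becomes  int a = 1;

#if 0
int b = ID(2);   // skipped conditional block: never scanned for macro expansion
#endif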
<snip>
in undefined behavior. For example,
#define MACRO(x) x
MACRO( #include "file.h" )
Indicate more clearly that this code is not OK.
The next sentence says that it is undefined behavior. I'm not sure how to make it more clear than that.
An obvious source-code comment (e.g. in red).
I'll add a comment, such as: MACRO( #include "file.h" // undefined behavior )
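Or, to contrast it with the well-defined case (just a sketch; the file name is purely illustrative):

#define MACRO(x) x

MACRO( #include "file.h" )   // undefined behavior: a directive inside the arguments

#include "file.h"            // fine: the directive is processed on its own, outside any invocation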
[Blue Paint]
If the current token is an identifier that refers to a macro, the preprocessor must check to see if the token is painted. If it is painted, it outputs the token and moves on to the next.
When an identifier token is painted, it means that the preprocessor will not attempt to expand it as a macro (which is why it outputs it and moves on). In other words, the token itself is flagged as disabled, and it behaves like an identifier that does not correspond to a macro. This disabled flag is commonly referred to as "blue paint," and if the disabled flag is set on a particular token, that token is called "painted." (The means by which an identifier token can become painted is described below.)
Remove redundancy in the two paragraphs above.
I believe I was unclear, here:
The redundancy isn't the problem (redundancy is actually a good thing in documentation, when used right) but too much redundancy in one spot...
Well, I got complaints before about glossing over blue paint, so I'm trying to be explicit. At the same time, I'm trying to maintain the algorithm's point-of-view. How about: If the current token is an identifier that refers to a macro, the preprocessor must check to see if it is painted. If an identifier token is painted, it means that the preprocessor will not attempt to expand it as a macro. In other words, the token itself is flagged as disabled and behaves like an identifier that does not correspond to a macro. This disabled flag is commonly referred to as "blue paint", and thus a token that is marked as disabled is called "painted." (The means by which an identifier token can become painted is described below.) If the current token is painted, the preprocessor outputs the token and moves on to the next.
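For example (a small sketch with a made-up macro):

#define F() F()

F()   // expands to F(), but the F in the result was found inside F's own
      // disabling context, so it is painted and is never expanded again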
function-like macro has no formal parameters, and therefore any use of the stringizing operator is automatically an error.) The result of token-pasting in F's replacement list is
It's not clear to me why the stringizing operator leads to an error rather than a '#' character. Probably too much of a sidenote, anyway.
I don't know the rationale for why it is the way it is.
In this case, "therefore" is a bit strange...
I don't know. It is fairly cause-and-effect. If you use the stringizing operator in a nullary function-like macro, it is an error. For the same reason, if you use the stringizing operator in a non-nullary function-like macro without applying it to an instance of a formal parameter, it is an error.
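For example (made-up names, just to illustrate the constraint):

#define F()  # x   // error: '#' must be followed by a formal parameter, and F has none
#define G(a) # a   // OK: '#' is applied to the formal parameter a

G(text)   // expands to the string literal "text"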
[Interleaved Invocations]
It is important to note that disabling contexts only exist during a _single_ scan. Moreover, when scanning passes the end of a disabling context, that disabling context no longer exists. In other words, the output of a scan results only in tokens and whitespace separations. Some of those tokens might be painted (and they remain painted), but disabling contexts are not part of the result of scanning.
(If they were, there would be no need for blue paint.)
Misses (at least) a reference to 16.3.4-1 (the wording "with the remaining tokens of the source" (or so) is quite nice there, so consider using something similar).
I have to clarify: I'm missing a hint (in the text not the examples) that tokens from outside the replacement list can form a macro invocation together with expansion output. The sentence from 16.3.4-1 is actually quite good.
I think that any need for this (i.e. such a hint) is symptomatic of trying to view macros like functions (e.g. F calls G, G returns 1 to F, F returns 1 to whatever called F). I'm in favor of doing anything that I can do to eradicate this preconception. At the same time, it is worth noting that the syntax of macro invocations is dynamic. That is, a series of macro invocations cannot be parsed into a syntax tree and then evaluated. The preprocessor can only parse one invocation at a time, and the result of that invocation is basically appended to the input.
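For example (a sketch with made-up macros), tokens following the invocation in the source can combine with the output of an expansion to form a new invocation:

#define A() 123
#define B() A

B()()   // B() expands to A; together with the following () from the source,
        // that A forms the invocation A(), which expands to 123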
Now for the "rescanning" part: You don't have to introduce that term. Anyway I wouldn't have figured out what "a _single_ scan" was supposed to mean without knowing it, so it feels to me here is something missing.
This is the same kind of problem as forward references to (e.g.) blue paint, except that it is way worse. The article as a whole describes a single scan for macro expansion (because that is all there is unless another is introduced by non-arbitrary means: the entire process is recursively applied to some arguments).
I'm pretty sure that I don't use the term "rescanning" anywhere in the whole article (yep, I checked, and I don't).
"Rescanning" comes from the standard, of course. I bit myself through chapter 16 because I wanted to know how things work before you posted this article.
It is an ill-chosen term, IMO.
The previous version of the language is still widely used and taught, so disambiguation makes some sense, IMO.
Okay, noted.
[Virtual Tokens]
BTW, I probably would've called them "control tokens" in analogy to "control characters" -- "virtual" has that dynamic-polymorphism association, especially for C++ programmers with non-native English...
It is a *somewhat* common term that has been around for a while. In a way, they aren't that different from polymorphism. Regular tokens (e.g. ++), special tokens (i.e. canonical forms), and virtual tokens are all concrete forms of a more general concept of token.

Regards,
Paul Mensonides