
Paul Mensonides wrote:
The process of macro expansion is best viewed as the process of scanning for macro expansion (rather than the process of a single macro expansion alone). When the preprocessor encounters a sequence of preprocessing tokens and whitespace separations that needs to be scanned for macro expansion, it has to perform a number of steps. These steps are examined in this document in detail.
Strike that paragraph. It uses terms not yet defined and doesn't say much more than the title (assuming it's still "how macro expansion works").
The paragraph might not be in the right spot, but what the paragraph says is important. The process of macro expansion includes more than just expanding a single macro invocation. Rather, it is the process of scanning for macro expansions, and the whole thing is defined that way.
It becomes clear there is more to macro expansion than expanding a single macro and that multiple steps are required when reading the text... The paragraph seems to try an introduction but does a bad job, IMO.
The reader probably has no idea what "painted" means at this point. Indicate the forward-declaration by "see below" or something like that.
I do in the very next sentence.
Yeah, but with too much text, IMO.
[Locations]
There are several points where the preprocessor must scan a sequence of tokens looking for macro invocations to expand. The most obvious of these is between preprocessing directives (subject to conditional compilation). For example,
I had to read this sentence multiple times to make sense of it...
What part was difficult to follow? It seems pretty straightforward to me (but then, I know what I'm looking for).
"Between preprocessing directives" -- what?! Sure, it is correct. But it's written too much from the viewpoint of the preprocessor rather than from where your reader is. <snip>
in undefined behavior. For example,
#define MACRO(x) x
MACRO( #include "file.h" )
Indicate more clearly that this code is not OK.
The next sentence says that it is undefined behavior. I'm not sure how to make it more clear than that.
An obvious sourcecode comment (e.g. in red).
[Blue Paint]
If the current token is an identifier that refers to a macro, the preprocessor must check to see if the token is painted. If it is painted, it outputs the token and moves on to the next.
When an identifier token is painted, it means that the preprocessor will not attempt to expand it as a macro (which is why it outputs it and moves on). In other words, the token itself is flagged as disabled, and it behaves like an identifier that does not correspond to a macro. This disabled flag is commonly referred to as "blue paint," and if the disabled flag is set on a particular token, that token is called "painted." (The means by which an identifier token can become painted is described below.)
Remove redundancy in the two paragraphs above.
I believe I was unclear, here: The redundancy isn't the problem (redundancy is actually a good thing in documentation, when used right) but too much redundancy in one spot...
In the running example, the current token is the identifier OBJECT, which _does_ correspond to a macro name. It is not painted, however, so the preprocessor moves on to the next step.
[Disabling Contexts]
If the current token is an identifier token that corresponds to a macro name, and the token is _not_ painted, the preprocessor must check to see if a disabling context that corresponds to the macro referred to by the identifier is active. If a corresponding disabling context is active, the preprocessor paints the identifier token, outputs it, and moves on to the next token.
A "disabling context" corresponds to a specific macro and exists over a range of tokens during a single scan. If an identifier that refers to a macro is found inside a disabling context that corresponds to the same macro, it is painted. Disabling contexts apply to macros themselves over a given geographic sequence of tokens, while blue paint applies to particular identifier tokens. The former causes the latter, and the latter is what prevents "recursion" in macro expansion. (The means by which a disabling context comes into existence is discussed below.)
In the running example, the current token is still the identifier OBJECT. It is not painted, and there is no active disabling context that would cause it to be painted. Therefore, the preprocessor moves on to the next step.
The introductions of these terms feel structurally too abrupt to me. Introduce these terms along the way, continuing with the example.
They appear at the first point where their definition must appear.
I believe it's useful to sustain it. <snip>
from the replacement list.
+ X OBJECT F() +
  |__________|
  OBJECT disabling context (DC)
<-- explain here what a disabling context is, and then what blue paint is
Do you mean that they should be defined here for the first time, or that they should be defined here again (but maybe with less detail)?
I meant: introduce the terms here.
function-like macro has no formal parameters, and therefore any use of the stringizing operator is automatically an error.) The result of token-pasting in F's replacement list is
It's not clear to me why the stringizing operator leads to an error rather than producing a '#' character. Probably too much of a side note, anyway.
I don't know the rationale for why it is the way it is.
In this case, "therefore" is a bit strange...
[Interleaved Invocations]
It is important to note that disabling contexts only exist during a _single_ scan. Moreover, when scanning passes the end of a disabling context, that disabling context no longer exists. In other words, the output of a scan results only in tokens and whitespace separations. Some of those tokens might be painted (and they remain painted), but disabling contexts are not part of the result of scanning.
(If they were, there would be no need for blue paint.)
Misses (at least) a reference to 16.3.4-1 (the wording "with the remaining tokens of the source" (or so) is quite nice there, so consider using something similar).
I have to clarify: I'm missing a hint (in the text not the examples) that tokens from outside the replacement list can form a macro invocation together with expansion output. The sentence from 16.3.4-1 is actually quite good.
I believe I wouldn't really understand what you are talking about here without knowing that part of the standard. "A single scan" -- the concept of rescanning was introduced too peripherally to make much sense to someone unfamiliar with the topic.
This all comes back to the beginning--the process is scanning a sequence of tokens for macros to expand (i.e. the first paragraph that you said I should strike). This entire process is recursively applied to arguments to macros (without being an operand...) and thus this entire scan for macros to expand can be applied to the same sequence of tokens more than once. It is vitally important that disabling contexts don't continue to exist beyond the scan in which they were created, but that blue paint does. As I mentioned, there would be no need for blue paint--what the standard calls "non-replaced macro name preprocessing tokens"--if the disabling contexts weren't transient.
Now for the "rescanning" part: You don't have to introduce that term. Anyway I wouldn't have figured out what "a _single_ scan" was supposed to mean without knowing it, so it feels to me here is something missing.
I'm pretty sure that I don't use the term "rescanning" anywhere in the whole article (yep, I checked, and I don't).
"Rescanning" comes from the standard, of course. I fought my way through chapter 16 because I wanted to know how things work before you posted this article.
In C++, if any argument is empty or contains only whitespace separations, the behavior is undefined. In C, an empty argument is allowed, but gets special treatment. (That special treatment is described below.)
It requires at least C99, right? If so, say it (it's likely there are C compilers that don't support that version of the language).
As far as I am concerned, the 1999 C standard defines what C is until it is replaced by a newer standard. Likewise, the 1998 standard defines what C++ is until it is replaced by a newer standard. I.e. an unqualified C implies C99, and an unqualified C++ implies C++98. If I wished to reference C90, I'd say C90. Luckily, I don't wish to reference C90 because I don't want to maintain an article that attempts to be backward compatible with all revisions of a language. This is precisely why I should have a note above that variadic macros are not part of C++, BTW, even though I know they will be part of C++0x.
The previous version of the language is still widely used and taught so disambiguation makes some sense, IMO.
Furthermore, deficiencies of compilers (like not implementing the language as it is currently defined or not doing what they should during this process) are not a subject of this article.
OTOH, it wouldn't hurt to mention in the "Conventions" section that, at the time of this writing, C is C99 and C++ is C++98.
It adds clutter no one wants to read -- adding the version number still seems the better solution to me ;-).
[Virtual Tokens]
BTW, I would've probably called them "control tokens" in analogy to "control characters" -- "virtual" has that dynamic-polymorphism association, especially to C++ programmers with non-native English...
I hope it's of some use.
Definitely. Thanks again.
You're welcome -- it's my way to thank you for your support. -- Tobias