Re: [boost] Can't understand how the BOOST_PP_REPEAT_FROM_TO exampleworks]

Thanks Paul, So are you saying that the preprocessor should look at BOOST_PP_REPEAT_FROM_TO and begin expanding that first, in isolation from what follows next on the line i.e. the parameters that are to be passed to whatever BOOST_PP_REPEAT_FROM_TO "returns", and then reset the macro replacement stack before continuing? That would explain a lot. It would also explain why even if I modify the other code to this: #define CONCAT_1(a, b) CONCAT_1_D(a, b) #define CONCAT_1_D(a, b) a ## b #define CONCAT_2(a, b) CONCAT_2_D(a, b) #define CONCAT_2_D(a, b) a ## b #define AB(c, x, y) CONCAT_ ## c(x,y) #define CONCAT CONCAT_1 CONCAT(A, B(1, p, q)) // Not calling CONCAT_1 directly any more Expansion still stops early because after expanding CONCAT (which is not a function macro this time) and deleting the replacement stack we are still kicking the next lot of expansion off with CONCAT_1, so we cannot expand it again? Thanks for your time Paul. I tried responding personally but your mail server considers me to be a spammer ;-) Brad

-----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Bradley Smith
Okay, this is going to be a very long reply... The process of macro expansion is best viewed as the process of scanning for macro expansions (rather than the process of a single macro expansion). When the preprocessor encounters a sequence of preprocessing tokens (and whitespace separations) that needs to be scanned for macro expansion, it has to perform a number of steps. For example, #define OBJECT 123 + X OBJECT + The preprocessor must scan the sequence of preprocessing tokens (and whitespace separations) between directives looking for macros to expand. For the sake of simplicity, say the preprocessor's current position is on the first +. Also for the sake of simplicity, I'm going to be implicitly skipping whitespace--which the preprocessor can't actually do--and referring to "preprocessing tokens" as "tokens"--though there is a difference. + X OBJECT + ^ The preprocessor sees that this token is not an identifier (and therefore can't be a macro name) and outputs it (or gives it to the parser--IOW, it is done with that token). So, scanning moves on to the next token. + X OBJECT + ^ The preprocessor sees that this token is an identifier (and therefore might be a macro name). The first thing it must do is check to see if the particular token itself is marked as disabled (often called "painted blue" or just "painted"). If it is painted, it outputs the token and moves on. If it is not painted, which is the case here with X, it looks it up in the symbol table (of the preprocessor) and finds that X is not a macro name, so it outputs it and moves on to the next token. + X OBJECT + ^ Again the preprocessor sees that this token is an identifier, so it checks for blue paint on the token (which isn't present) and then looks it up in the symbol table and finds that OBJECT is a macro name. Next, the preprocessor checks if a disabling context on the OBJECT macro exists (not on a particular token). If that disabling context does exist, it paints the particular token, outputs the token, and moves on. If that disabling context doesn't exist, which is the case here, the preprocessor checks to see what type of macro it is--object-like or function-like. In this case, it sees that OBJECT is an object-like macro, so it begins the replacement process. The first step in that process (for an object-like macro) is retrieving the replacement list of the macro definition (stored in the symbol table). In this case, that's just 123. The second step is performing any token-pasting operations inside the replacement list--which is a no-op in this case. Note that there is no stringizing operator in object-like macros. (Note also that the preprocessor has to do more with function-like macros, but I'll get to those later.) After token-pasting, it replaces the invocation (which consists only of a single token in the object-like macro scenario) with the token-pasted version of the replacement list--again only 123. It also creates a geographic disabling context on the macro name (in this case OBJECT). E.g. + X 123 + |^ | |___| | OBJECT disabling context (DC) It then resumes scanning at the first token of the replacement list, and we're back to the beginning of the process. Everything else is a no-op, so the entire result is: + X 123 + It is important to note that the replacement list is not scanned for macro expansion prior to being substituted in place of the invocation. Specifically, there is no "stack of macro expansions" nor does macro expansion form a functional hierarchy. Most people think that in a situation like this... #define A() B() #define B() 123 A() ...the top level calls A, A calls B, B returns 123 to A, and A returns 123 to the top level. That is not the way that it works. Rather, the top-level scan replaces A() with B(), the resumption of the top-level scan replaces B() with 123. More simply put, it is an iterative inlining process, not a recursive functional application. (This is why I put "returns" in quotes in my previous reply.) Continuing, consider a new example (and I'll just diagram the expansion process)... #define A B #define B A A * ^ B * |^| |_| | A DC A * ||^|| ||_|| | | | | B DC |___| | A DC A' * ^ Here, the result is ultimately A' (where the apostrophe indicates blue paint), because the invocation of B is geographically nested within the disabling context of A. It is also important to note that disabling contexts only exist through a single scan for macro expansion. Thus, it is a closing context as scanning moves forward. For example: #define A + + + A * ^ + + + * |^ | |_____| | A DC + + + * |^ | |___| | A DC + + + * |^| |_| | A DC + + + * ^ (This distinction does matter, though it may not seem like it.) On to function-like macros... Macro expansion of function-like macros behaves in exactly the same way except that the preprocessor does a few more things. #define F() 123 F() * ^ The preprocessor sees that the current token is an identifier. As before (and with any identifier token that it encounters) it checks to see if the token is painted. If it is, it outputs the token and moves on. In this case, the F token isn't painted, so the preprocessor looks up F in the symbol table and finds that it is a macro name. As before, it then checks to see if a disabling context exists on that macro. If that disabling context does exist, it paints the token and moves on. If it doesn't exist, which is the case here, it checks the type (object-like or function-like) of the macro. If it is defined as object-like, everything proceeds as above. If it is defined as function-like, which is the case here, then the preprocessor looks at the next token to see if it is a left parenthesis. If it isn't, the preprocessor outputs the token an moves on. [aside... Note that a particular token that refers to a function-like macro still gets painted if it is found inside a corresponding disabling context even if there is no attempt (i.e. parentheses) to invoke it.] If there is a left parenthesis, then invocation must occur or it is an error (e.g. if there is no closing parenthesis). Given that there is a left parenthesis, which is the case here, the preprocessor checks the arity of the macro definition. In this case, it is nullary (i.e. no arguments), so the preprocessor makes sure there are no tokens between the left and right parenthesis of the invocation. If there are, it is a "too many arguments" error. (I'll get to function-like macros with arguments later; I'm trying to gradually introduce complexity.) Assuming that everything is good, the preprocessor begins the replacement process. As with object-like macros, the first step of that process is retrieving the replacement list of the macro definition. In this case, that's again just 123. The next step is performing any token-pasting operations and stringizing operations in the replacement list--which is a no-op here. (Note that this is a function-like macro, so the stringizing operator exists. Unlike the token-pasting operator, the stringizing operator must always be applied to a parameter. If it is applied to anything else, it is an error. In the unique case of a nullary function-like macro, there are no parameters, so any use of the stringizing operator will result in an error at this point.) After token-pasting and stringizing (the order between the two is unspecified), the preprocessor replaces the invocation (which consists of the macro name token and the parentheses) with the token-pasted (etc.) version of the replacement list--just 123. As with object-like macros, it creates a geographic disabling context on the macro name. E.g. 123 * |^ | |___| | F DC The preprocessor then resumes scanning at the first token of the replacement list, and we're again back to the beginning of the process. Everything else is a no-op, so the entire result is: 123 * In essence, it is pretty much the same process for nullary function-like macros as with object-like macros but with a few extra steps. Before moving on to non-nullary function-like macros, we need to examine what it means for an invocation to be nested in a disabling context. Consider #define A() B() #define B() A() A() * ^ A() * |^ | |___| | A invocation (INV) B() * |^ | |___| | A DC B() * ||^ || ||___|| | | | | B INV |_____| | A DC A() * ||^ || ||___|| | | | | B DC |_____| | A DC A'() * || ^ || ||____|| | | | | B DC |______| | A DC A'() * || ^|| ||____|| | | | | B DC |______| | A DC A'() * ^ This one is pretty obvious, as the invocation of B is clearly nested within the disabling context of A. However, things are different when an invocation is *interleaved* with a disabling context: #define A() B #define B() A A()()() * ^ A() ()() * |^ | |___| | A INV B ()() * |^| |_| | A DC B () () * ||^| | ||_|___| | | | | | B INV |__| | A DC A () * |^| |_| | B DC A () * ||^| | ||_|___| | | | | | A INV |__| | B DC B * |^| |_| | A DC B * ^ IOW, an invocation that is interleaved with a disabling context is not geographically nested within that context. When scanning resumes after replacement, that context is no longer active. Thus, in a situation like the above, A and B will continue to expand (back and forth) indefinitely until a left-parenthesis is not found as the next token. (This is the fundamental reason why the REPEAT_FROM_TO example works, but the CONCAT example does not.) On to non-nullary function-like macros... Again, the process is basically the same except that a few more steps are involved. Everything proceeds exactly the same as will nullary invocations until the preprocessor checks the arity of the macro--which it finds to be N. It then takes the delineates the arguments in the argument list and verifies that there are N arguments. If there are more or less, than you its a "too many arguments" or "too few arguments" error. Assuming that everything is good, the preprocessor begins the replacement process. The first step, again, is retrieving the replacement list of the macro definition. It then replaces formal parameters in the replacement list with the actual parameters provided in the invocation. However, it does this in a variety of ways depending on how the formal parameter appears in the replacement list. If a particular instance of a formal parameter in the replacement list is an operand of the token-pasting or stringizing operator (more on this later), the formal parameter is directly replaced by the actual argument. However, if it is not an operand of one of either operator, it is scanned on its own for macro expansion. (This scanning for macro expansion in an argument is the only true instance of recursion in the macro expansion process.) Though it is scanned independently, any disabling contexts still exist (more on this later too). [aside--Note also that an argument is scanned for macro expansion if and *only* if at least one instance of a the corresponding formal parameter exists in the replacement list without being an operand of one of the operators. Thus, in something like #define EAT(x) #define BINARY(a, b) // ... EAT( BINARY() ) This is not an error, even though it would be if any attempt is made to expand BINARY() with too few arguments. Because there is no occurrence of x in the replacement list of EAT, the argument should never undergo a scan for macro expansion. It only undergoes being delineated as an argument.] After argument substitution, token-pasting and stringizing are performed. Again, the order is unspecified, as is the associativity of token-pasting. However, the operands to the token-pasting operator are single tokens--the ones immediately to the left and right of it. The operand to the stringizing operator, OTOH, is the entire argument (recall that it must be applied to a formal parameter). Thus, even though the order of evaluation between token-pasting and stringizing is unspecified, you can think of a use of the stringizing operator as being replaced with a third "stringized" version of the actual argument (rather than substituting the argument in the replacement list, marking the beginning and ending of it in some way, and then performing token-pasting and stringizing). Of course, you can't rely on any order particular of evaluation. Regarding token-pasting... Even though the token-pasting operator operates on exactly two tokens (rather than entire arguments), it does affect entire arguments in the sense that it prevents the entire argument from being scanned for macro expansion prior to substitution. After token-pasting and stringizing, the preprocessor replaces the invocation (which consists of the macro name token, the left parenthesis, all of the arguments, and the right parenthesis) with the argument-substituted-token-pasted-and-stringized version of the replacement list and creates a geographic disabling context on the macro name. Note that the disabling context is created *after* arguments have been expanded (if necessary), which is why stuff like A(A(1)) expands both A's. For example (note that it is hard to diagram the recursive scan of the parameters--so apologies in advance), #define A(x) B(x, x + 1) #define B(p, q) p + q A(7) * ^ A(7) * |^ | |____| | A INV A(7) * |^ | |____| | A INV [recursive scan of argument #1] 7 ^ [result = 7] B(7, 7 + 1) * ||^ || ||___________|| | | | | B INV | |_____________| | A DC [recursive scan of argument #1] 7 ^ _ _ ___ _ _ | A DC [result = 7] [recursive scan of argument #2] 7 + 1 ^ _ _ _____ _ _ | A DC (etc.) [result = 7 + 1] 7 + 7 + 1 * ||^ || ||_________|| | | | | B DC | |___________| | A DC 7 + 7 + 1 * || ^ || ||_________|| | | | | B DC | |___________| | A DC (etc.) 7 + 7 + 1 * ^ Okay, now things get interesting... As with nullary function-like macro invocations, an interleaved invocation is not nested, but when arguments are involved, an invocation inside an argument might be nested in a disabling context even though the invocation to which it is an argument is not nested in that context. (Very very few preprocessors get this completely right, so it is dangerous to mess with.) E.g. #define A(m) m( B(f) #define B(x) A(x) #define C(x) < x > A(C) ) * ^ A(C) ) * |^ | |____| | A INV // recursive scan of argument #1 results in C C( B(f) ) * |^ | |_______| | A DC C( B(f) ) * ||^ | | ||_______|_| | | | | C INV |________| | A DC // recursive scan of argument #1... B(f) ^ | _ _ ____| | A DC B(f) |^ || |____|| | | B INV _ _ _____| | A DC // recursive scan of argument #1 (to B) results in f A(f) |^ || |____|| | | B DC _ _ _____| | A DC A'(f) | ^ || |_____|| | | B DC _ _ ______| | A DC (etc.) // result = A'(f) < A'(f) > * |^ | |_________| | C DC < A'(f) > * | ^ | |_________| | C DC < A'(f) > * | ^ | |_________| | C DC (etc.) < A'(f) > * ^ Phew. As you can see, things can get complex in scenarios like this. In any case, that's a relatively detailed overview of macro expansion. Applying it to the CONCAT example... #define CONCAT_1(a, b) CONCAT_1_D(a, b) #define CONCAT_1_D(a, b) a ## b #define CONCAT_2(a, b) CONCAT_2_D(a, b) #define CONCAT_2_D(a, b) a ## b #define AB(c, x, y) CONCAT_ ## c(x, y) #define CONCAT CONCAT_1 CONCAT(A, B(1, p, q)) ^ CONCAT (A, B(1, p, q)) |^ | |______| | CONCAT INV CONCAT_1 (A, B(1, p, q)) |^ | |________| | CONCAT DC CONCAT_1 (A, B(1, p, q)) ||^ | | ||________|_______________| | | | |_________| CONCAT_1 INV | CONCAT DC // recursive scan of argument #1 results in: A // recursive scan of argument #2 results in: B(1, p, q) CONCAT_1_D(A, B(1, p, q)) |^ | |_________________________| | CONCAT_1 DC CONCAT_1_D(A, B(1, p, q)) ||^ || ||_________________________|| | | | | CONCAT_1_D INV | |___________________________| | CONCAT_1 DC // no recursive scan of argument #1 (only used as ## operand) // no recursive scan of argument #2 (only used as ## operand) // token-pasting: -> A ## B(1, p, q) -> AB(1, p, q) AB(1, p, q) ||^ || ||___________|| | | | | CONCAT_1_D DC |_____________| | CONCAT_1 DC AB(1, p, q) |||^ ||| |||___________||| || | || || AB INV || ||_____________|| | | | | CONCAT_1_D DC |_______________| | CONCAT_1 DC // no recursive scan of argument #1 (only used as ## operand) // recursive scan of argument #2 results in: p // recursive scan of argument #3 results in: q // token-pasting: -> CONCAT_ ## 1(p, q) -> CONCAT_1(p, q) CONCAT_1(p, q) |||^ ||| |||______________||| || | || || AB DC || ||________________|| | | | | CONCAT_1_D DC |__________________| | CONCAT_1 DC CONCAT_1'(p, q) ||| ^ ||| |||_______________||| || | || || AB DC || ||_________________|| | | | | CONCAT_1_D DC |___________________| | CONCAT_1 DC (etc.) CONCAT_1'(p, q) ^ -------------------- Note that the disabling context on the object-like macro CONCAT is long gone, but that doesn't really help because we still get the disabling context on CONCAT_1. -------------------- CONCAT(A, B(2, p, q)) ^ CONCAT (A, B(2, p, q)) |^ | |______| | CONCAT INV CONCAT_1 (A, B(2, p, q)) |^ | |________| | CONCAT DC CONCAT_1 (A, B(2, p, q)) ||^ | | ||________|_______________| | | | |_________| CONCAT_1 INV | CONCAT DC // recursive scan of argument #1 results in: A // recursive scan of argument #2 results in: B(2, p, q) CONCAT_1_D(A, B(2, p, q)) |^ | |_________________________| | CONCAT_1 DC CONCAT_1_D(A, B(2, p, q)) ||^ || ||_________________________|| | | | | CONCAT_1_D INV | |___________________________| | CONCAT_1 DC // no recursive scan of argument #1 (only used as ## operand) // no recursive scan of argument #2 (only used as ## operand) // token-pasting: -> A ## B(2, p, q) -> AB(2, p, q) AB(2, p, q) ||^ || ||___________|| | | | | CONCAT_1_D DC |_____________| | CONCAT_1 DC AB(2, p, q) |||^ ||| |||___________||| || | || || AB INV || ||_____________|| | | | | CONCAT_1_D DC |_______________| | CONCAT_1 DC // no recursive scan of argument #1 (only used as ## operand) // recursive scan of argument #2 results in: p // recursive scan of argument #3 results in: q // token-pasting: -> CONCAT_ ## 2(p, q) -> CONCAT_2(p, q) CONCAT_2(p, q) |||^ ||| |||______________||| || | || || AB DC || ||________________|| | | | | CONCAT_1_D DC |__________________| | CONCAT_1 DC CONCAT_2(p, q) ||||^ |||| ||||______________|||| ||| | ||| ||| CONCAT_2 INV |||________________||| || | || || AB DC || ||__________________|| | | | | CONCAT_1_D DC |____________________| | CONCAT_1 DC // recursive scan of argument #1 results in: p // recursive scan of argument #2 results in: q CONCAT_2_D(p, q) ||||^ |||| ||||________________|||| ||| | ||| ||| CONCAT_2 DC |||__________________||| || | || || AB DC || ||____________________|| | | | | CONCAT_1_D DC |______________________| | CONCAT_1 DC CONCAT_2_D(p, q) |||||^ ||||| |||||________________||||| |||| | |||| |||| CONCAT_2_D INV ||||__________________|||| ||| | ||| ||| CONCAT_2 DC | |||____________________||| || | || || AB DC || ||______________________|| | | | | CONCAT_1_D DC |________________________| | CONCAT_1 DC // no recursive scan of argument #1 (only used as ## operand) // no recursive scan of argument #2 (only used as ## operand) // token-pasting: -> p ## q -> pq pq |||||^ ||||| |||||__||||| |||| | |||| |||| CONCAT_2_D DC ||||____|||| ||| | ||| ||| CONCAT_2 DC |||______||| || | || || AB DC ||________|| | | | | CONCAT_1_D DC |__________| | CONCAT_1 DC pq ^ Okay, now we'll get a little fancy: #define INTERNAL_CAT(a, b) INTERNAL_PRIMITIVE_CAT(a, b) #define INTERNAL_PRIMITIVE_CAT(a, b) a ## b #define EMPTY() // ----- // #define CAT_1(a, b) PRIMITIVE_CAT_1(a, b) #define CAT_1_ID() CAT_1 #define PRIMITIVE_CAT_1(a, b) a ## b #define CAT_2(a, b) PRIMITIVE_CAT_2(a, b) #define CAT_2_ID() CAT_2 #define PRIMITIVE_CAT_2(a, b) a ## b #define CAT_3(a, b) PRIMITIVE_CAT_3(a, b) #define CAT_3_ID() CAT_3 #define PRIMITIVE_CAT_3(a, b) a ## b #define CAT_4(a, b) PRIMITIVE_CAT_4(a, b) #define CAT_4_ID() CAT_4 #define PRIMITIVE_CAT_4(a, b) a ## b // ----- // #define CAT TRY_1() #define TRY_1() INTERNAL_CAT(TRY_1_, CAT_1(0, 0))() #define TRY_2() INTERNAL_CAT(TRY_2_, CAT_2(0, 0))() #define TRY_3() INTERNAL_CAT(TRY_3_, CAT_3(0, 0))() #define TRY_4() INTERNAL_CAT(TRY_4_, CAT_4(0, 0))() #define TRY_1_00 CAT_1_ID #define TRY_2_00 CAT_2_ID #define TRY_3_00 CAT_3_ID #define TRY_4_00 CAT_4_ID #define TRY_1_CAT_1(a, b) TRY_2 #define TRY_2_CAT_2(a, b) TRY_3 #define TRY_3_CAT_3(a, b) TRY_4 #define TRY_4_CAT_4(a, b) error: ran out of CAT depth! EMPTY // ----- // CAT(C, AT(C, AT(1, 2))) // 12 ------------------------- To check you're understanding, you might try to diagram this expansion. I'll leave that as an exercise, though, this reply is already very long. This is an example of how one might detect the next available CAT implementation. It does a linear search. It tries to expand CAT_1. If that succeeds, CAT_1 is "returned". If not, it tries CAT_2, and so on. This is a less than ideal way to do it (both the linear search and macro replication), but it's how you might do it from scratch. The pp-lib uses a binary search rather than a linear search, and encapsulates as much as possible: #include <boost/preprocessor/detail/auto_rec.hpp> #define INTERNAL_CAT(a, b) INTERNAL_PRIMITIVE_CAT(a, b) #define INTERNAL_PRIMITIVE_CAT(a, b) a ## b // ----- // #define CAT_1(a, b) PRIMITIVE_CAT_1(a, b) #define PRIMITIVE_CAT_1(a, b) a ## b #define CAT_2(a, b) PRIMITIVE_CAT_2(a, b) #define PRIMITIVE_CAT_2(a, b) a ## b #define CAT_3(a, b) PRIMITIVE_CAT_3(a, b) #define PRIMITIVE_CAT_3(a, b) a ## b #define CAT_4(a, b) PRIMITIVE_CAT_4(a, b) #define PRIMITIVE_CAT_4(a, b) a ## b // ----- // #define CAT INTERNAL_CAT(CAT_, BOOST_PP_AUTO_REC(CAT_P, 4)) #define CAT_P(n) INTERNAL_CAT(CAT_, CAT_ ## n(0, 0)) #define CAT_00 1 #define CAT_CAT_1(a, b) 0 #define CAT_CAT_2(a, b) 0 #define CAT_CAT_3(a, b) 0 #define CAT_CAT_4(a, b) 0 CAT(C, AT(C, AT(1, 2))) // 12 ------------------------- Of course, we can do better than that: #include <boost/preprocessor/detail/auto_rec.hpp> #include <boost/preprocessor/detail/is_nullary.hpp> #define INTERNAL_CAT(a, b) INTERNAL_PRIMITIVE_CAT(a, b) #define INTERNAL_PRIMITIVE_CAT(a, b) a ## b // ----- // #define CAT_1(a, b) PRIMITIVE_CAT_1(a, b) #define PRIMITIVE_CAT_1(a, b) a ## b #define CAT_2(a, b) PRIMITIVE_CAT_2(a, b) #define PRIMITIVE_CAT_2(a, b) a ## b #define CAT_3(a, b) PRIMITIVE_CAT_3(a, b) #define PRIMITIVE_CAT_3(a, b) a ## b #define CAT_4(a, b) PRIMITIVE_CAT_4(a, b) #define PRIMITIVE_CAT_4(a, b) a ## b // ----- // #define CAT INTERNAL_CAT(CAT_, BOOST_PP_AUTO_REC(CAT_P, 4)) #define CAT_P(n) BOOST_PP_IS_NULLARY(INTERNAL_CAT(CAT_, CAT_ ## n(0, 0))) #define CAT_00 () // ----- // CAT(C, AT(C, AT(C, AT(1, 2)))) // 12 ------------------------- These examples are just trying to give you an idea of what REPEAT_FROM_TO is doing to make it appear that it is recursive. It might make things more concrete with a possible implementation strategy. The single most design-affecting part of this whole process is the disabling contexts--everything else is pretty straightforward--so I'll start with that. Recall that disabling contexts can be viewed as closing contexts because they are only active during a single scan for macro expansion. (I said before that this matters, but I wasn't referring to possible implementation strategies, but rather what one can do because of that behavior. I won't get into that here, however.) Because disabling contexts are only active during a single scan for macro expansion, they can be implemented using flags in the symbol table and virtual tokens. I.e. instead of maintaining an expensive structure similar to my diagrams, at the point when an invocation is replaced by the replacement list, the preprocessor can set a "disabled" flag in the symbol table on the macro definition. It can then put a virtual token at the end of the replacement list (diagramed as <MACRO>) that when encountered resets that flag (and is deleted). E.g. #define A() ... A() A() * |^ | |___| | A INV (set flag on A in symbol table) ... <A> A() * ^ ... <A> A() * ^ (reset flag on A) ... A() * |^ | |___| | A INV (set flag on A) ... ... <A> * ^ ... ... <A> * ^ (reset flag on A) ... ... * ^ This is pretty straightforward. It is a little more complex when parameters are involved. For any given formal parameter, how it is handled during substitution is a property of the formal parameter itself. It is known at the time of definition what variations of a particular formal parameter are needed. I.e. if the parameter is unused no variation of it is needed, if it is used without being an operand of # or ## then a macro-expanded variation is needed, if it is used as an operand of # or ## then a non-macro-expanded variation is needed. Note that both of the last two variations might be needed for a given parameter (i.e. they aren't mutually exclusive). Starting with the easy case--when no variation of the parameter is needed... This case is easy because the actual parameter is just going to be discarded. However, prior to being discarded, the argument has to be scanned in order to execute any virtual tokens that it might contain. After it does that, it can be discarded, and the preprocessor is done with that parameter. In the case where only the macro-expanded variation of the parameter is required, it just needs to be scanned for macro replacement as normal. In the case where only the non-macro-expanded variation is required, it needs to be scanned 1) to paint tokens and 2) to execute any virtual tokens that it might contain (deleting them after it executes them). The only hard case is when both a macro-expanded variation of the parameter is required and a non-macro-expanded variation of the same parameter is required. In that case, the preprocessor has to do both of the last two cases. The semi-hard part is that it has to start both scenarios with the same state--meaning all the same flags must be set, etc.. This can be accomplished by doing the non-macro-expanded variation first, but keeping track of every virtual token that gets executed so that all of the resets can be undone before the preprocessor starts the recursive scan-for-macro-expansion on the argument.
Thanks for your time Paul. I tried responding personally but your mail server considers me to be a spammer ;-)
Hmm. I don't have any filters set up on the server itself, so that's strange. Regards, Paul Mensonides

That was a very enlightening tutorial. Is there any chance it could be added to the boost pp documentation? -Jason
participants (4)
-
Bradley Smith
-
Deane Yang
-
Jason Hise
-
Paul Mensonides