
On Tue, Aug 17, 2010 at 2:12 AM, Paul Mensonides <pmenso57@comcast.net> wrote:
On 8/16/2010 9:21 PM, Lorenzo Caminiti wrote:
Yes, I am aware of this "limitation". However, for my application it is not a problem to limit the argument of `IS_PUBLIC()` to pp-identifiers and pp-numbers with no decimal points (if interested, see "MY APPLICATION" below).
1) Out of curiosity, is there a way to implement `IS_PUBLIC()` (perhaps without using `BOOST_PP_CAT()`) so it does not have this limitation? (I could not think of any.)
The limitation is not BOOST_PP_CAT per se, but token-pasting in general. The "good" part of using BOOST_PP_CAT in combination with BOOST_PP_IS_NULLARY, et al, is that they have been "hacked" together for preprocessors that are broken. Effectively, the detection macros work by
Yes, I understand.
manipulating the operational syntax of macro expansion. For that to work, stuff has to happen (namely, macros being expanded) at roughly the correct time. The basic problem with VC++, for example, is that they don't, so the pp-lib works overtime to attempt to _force_ expansions all over the library.
I got my pp-parsers to work successfully under both GCC and MSVC. Especially on MSVC, I also had to "hack" some of the macros to make sure they expand when they are supposed to -- BTW, having a library like Boost.Preprocessor has proven to be immensely useful.
Unfortunately, there is a limit to what can be forced--particularly with more advanced manipulations of the macro expansion process, such as those used by Chaos, where there is an analogy to the uncertainty principle (e.g. you cannot force expansion in many contexts without changing the result = you cannot measure particle velocity and position at the same time). Even with those types of manipulations, however, there is no way to do the above without "smashing the particles together and seeing what comes out."
That's an interesting analogy :) (I do have an engineering/physics background).
The limitation is caused by the ridiculous rule that token-pasting arbitrary tokens together, where the result is not a single token, results in undefined behavior. Even to detect this scenario, the simplest implementation in a preprocessor is simply to juxtapose the characters making up the tokens and re-tokenize them: if the result is more than one token, issue a diagnostic; otherwise, insert the single token. A better definition would be to simply insert the resulting sequence of tokens.
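For illustration, a minimal example of the rule (the `PASTE` macro is just a stand-in for this illustration, not something from the thread):

#define PASTE(a, b) a ## b

PASTE(foo, bar) // OK: "foobar" is a single identifier token.
PASTE(x, 1)     // OK: "x1" is a single identifier token.
PASTE(x, ::)    // Undefined behavior: "x::" is not a single preprocessing token.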
2) Also, does the expansion of any of the following result in undefined behavior? (I don't think so...)
IS_PUBLIC(public abc)         // Expands to 1.
IS_PUBLIC(public::)           // Expands to 1.
IS_PUBLIC(public(abc, ::))    // Expands to 1.
IS_PUBLIC(public (abc) (yxz)) // Expands to 1.
(My application relies on some of these expansions to work.)
All of those look fine. Basically, here is what happens in the following:
#define M(a) id ## a
The appearance of the formal parameter 'a' adjacent to the token-pasting operator affects _which_ actual parameter is substituted. Namely, the version of the actual parameter which has _not_ had macros replaced in it. However, the token-pasting operation doesn't occur until after that substitution, and its operands are only the two _tokens_ immediately adjacent to it. E.g.
#define A() 123
#define B(x) x id ## x
B(A()) => 123 id ## A() => 123 idA()
OK, now I understand much better how my `IS_PUBLIC()` macro actually works -- thanks a lot!
I.e. the token-pasting operator affects the expansion of the actual parameter (at least in that substitution context), but its operands are only the tokens on either side after that substitution.
Because of that, you're basically getting:
PREFIX_ ## public abc
PREFIX_ ## public ::
PREFIX_ ## public ( abc , :: )
PREFIX_ ## public ( abc ) ( yxz )
...all of which are okay.
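(As an aside, here is a minimal sketch of how this kind of detection can be written on a conforming preprocessor with variadic macro support. The helper names `CAT`, `SECOND`, and `IS_PUBLIC_PREFIX_public` are hypothetical, and this is not necessarily how the `IS_PUBLIC()` discussed here is actually implemented:)

#define CAT(a, b) CAT_(a, b)
#define CAT_(a, b) a ## b

// Expands to its second argument (requires variadic macros).
#define SECOND(...) SECOND_(__VA_ARGS__)
#define SECOND_(a, b, ...) b

// Pasting the prefix onto an argument that starts with `public` forms this
// macro, which injects "1" as the second comma-separated element.
#define IS_PUBLIC_PREFIX_public ~, 1,

// The trailing "0, ~" keeps SECOND_ supplied with enough arguments when the
// argument does not start with `public`.
#define IS_PUBLIC(tokens) SECOND(CAT(IS_PUBLIC_PREFIX_, tokens) 0, 0, ~)

IS_PUBLIC(public abc) // 1: the paste hits IS_PUBLIC_PREFIX_public.
IS_PUBLIC(void)       // 0: IS_PUBLIC_PREFIX_void is not a macro.
IS_PUBLIC(int*)       // 0: the paste only involves the `int` token.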
MY APPLICATION
I am using `IS_PUBLIC()` and similar macros to program the preprocessor to *parse* a Boost.Preprocessor sequence of tokens that represents a function signature. For example:
class c {
public:
    void f(int x) const; // Usual function declaration.
};
class c {
    PARSE_FUNCTION_DECL( // Equivalent declaration using pp-sequences.
        (public) (void) (f)( (int)(x) ) (const)
    );
};
What happens with stuff like pointers, or does that not matter for your application? E.g. (public) (void) (f)( (int*)(x) ) (const) ?
My library does not need to detect pointers at the preprocessor metaprogramming level. I can wait until using the compiler at the template metaprogramming level to detect and handle pointers (using Boost.MPL, Boost.TypeTraits, etc). So my pp-parser macros simply have to expand:

IS_PUBLIC(int*) // Expands to 0.
IS_INT(int*)    // Expands to 1.

Where I never actually use the last expansion, because I use template metaprogramming to detect and manipulate types. Similarly for references, etc. (There is actually one exception to this for functions returning `void*`, because my pp-parser macros need to detect functions returning `void`. I have implemented a workaround for this case allowing a special syntax within the signature sequence... but that is _very_ specific to my application.)
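(For example, the kind of check that gets deferred to the compiler might look roughly like the following sketch, using Boost.TypeTraits and BOOST_STATIC_ASSERT rather than whatever the library actually does:)

#include <boost/type_traits/is_pointer.hpp>
#include <boost/static_assert.hpp>

// Pointer-ness is detected at the template metaprogramming level...
BOOST_STATIC_ASSERT((boost::is_pointer<int*>::value));
// ...so the pp-parser never needs to distinguish `int` from `int*`.
BOOST_STATIC_ASSERT((!boost::is_pointer<int>::value));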
The parser macro above can say "the signature sequence starts with `public`, so this is a member function" at the preprocessor metaprogramming level and then expand to whatever special code the library needs to handle member functions. The parser macros can even do some basic syntax error checking -- for example, if `(const)` is specified as a cv-qualifier at the end of the signature sequence of a non-member function, the parser macro can detect that and expand to a compile-time error like `SYNTAX_ERROR_unexpected_cv_qualifier` (using `BOOST_MPL_ASSERT_MSG()`).
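(Roughly, the error expansion could look something like the following sketch; the dummy condition and type list are illustrative only, not the library's actual expansion:)

#include <boost/mpl/assert.hpp>

// What the parser might expand to on an unexpected cv-qualifier: the
// identifier in the middle shows up in the compiler's error message.
BOOST_MPL_ASSERT_MSG(
      false                                  // grammar check failed
    , SYNTAX_ERROR_unexpected_cv_qualifier   // message embedded in the error
    , (int)                                  // dummy type list
);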
Most of the tokens within C++ function signatures are pp-identifiers, such as the words `public`, `void`, `f`, etc. There are some exceptions, like `,` to separate function parameters, `<`/`>` for templates, `:` for constructors' member initializers, etc. The grammar of my preprocessor parser macros requires the use of different tokens in these cases. For example, parentheses `(`/`)` are used for templates instead of `<`/`>`:
template<typename T> void f(T x); // Usual.
PARSE_FUNCTION_DECL( // PP-sequence.
    (template)( (typename)(T) ) (void) (f)( (T)(x) )
);
(Instead of `(template)(<) (typename) (T) (>) (void) (f)( (T)(x) )`, which would have caused the parser macro to fail when inspecting `(<)` via one of the `IS_XXX()` macros, as per the limitation from using `BOOST_PP_CAT()` mentioned above.)
The grammar of my preprocessor parser macros clearly documents that only pp-identifiers can be passed as tokens of the function signature sequence. Therefore, the "limitation" of `IS_PUBLIC()` indicated above is not a problem for my application.
Thank you very much.
You're welcome. I don't know the ultimate purpose of this encoding, but the
This encoding, which I am calling "parenthesized syntax" (given the ridiculous amount of parentheses that it requires :) ), is used by my library under construction "Boost.Contract" https://sourceforge.net/projects/dbcpp/ to implement contract programming for C++ as specified by N1962. For example:

template<typename T>
class myvector {
public:
    CONTRACT_FUNCTION(
        (public) (void) (push_back)( (const T&)(element) ) (copyable)
        (precondition)(
            (size() < max_size())
            // More preconditions here...
        )
        (postcondition)(
            (size() == (CONTRACT_OLDOF(this)->size() + 1))
            // More postconditions here...
        )
    ({
        ... // Original implementation.
    })
    )
    ...
};

Note how I can define new "keywords" like `precondition`, `postcondition`, `copyable`, etc.; program `IS_XXX()` macros for those; and use the pp-parser macros to parse them and expand to code that checks these assertions at the right time during execution.

I have also extended the parenthesized syntax to support concepts (interfacing with Boost.ConceptCheck) and named parameters (interfacing with Boost.Parameter). The idea is that contracts + concepts + named parameters fully specify the interface requirements. An example of concepts + contracts:

CONTRACT_FUNCTION(
    (template)( (typename)(T) )
    (requires)(
        (boost::CopyConstructible<T>)
        (boost::Assignable<T>)
        (Addable<T>)
    )
    (T) (sum)( (T*)(array) (int)(n) (T)(result) )
    (precondition)(
        (array)
        (n > 0)
    )
({
    ... // Original implementation.
})
)
encoding itself doesn't look too bad.
In my experience, the parenthesized syntax is OK for this application -- it's not terrible, but it's not great either... My programmer's life would be better without this syntax, but worse without contracts :) However, using the preprocessor to parse and generate every function declaration (with a contract) slows down compilation quite a bit... I think I can optimize the code of my macros and the way I am using Boost.Preprocessor, but I am still finishing up the implementation and I am leaving optimizations for later.

BTW, for this optimization it would be useful to assess the computational complexity (maybe in terms of "number of macro expansions"?) of the Boost.Preprocessor macros -- how can I do that?

Thank you.
--
Lorenzo