[Regex] Is it possible to assign an alias to a subexpression (to use later instead of its index)?
Hi,
I would like to know whether it's possible to assign an alias/name to a sub-expression. With a short regex there is no problem or confusion, but with a long regex I get lost working out the correct index to use. It would be much easier if I could assign an alias.
So, instead of using a regex like this:
"\s*(\d+)\s+(\d+)\s+(\d+)\s+"
I would like to be able to use something like:
"\s*([$VALUE1]\d+)\s+([$VALUE2]\d+)\s+([$VALUE3]\d+)\s+"
I don't know whether this feature exists in regex. I haven't found it, but maybe I didn't know where to look...
Thanks in advance,
Jordi
No, that's not supported, but don't forget that you can always use non-marking parentheses to avoid throwing out too many fields. John.
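One way to combine non-marking parentheses with readable indices is to give the remaining capture indices names in an enum. The following is only a minimal sketch under stated assumptions, not a Boost.Regex feature: it uses std::regex from the C++ standard library so that it is self-contained, and the names Field, Value1, Value2, Value3 and parse_values, as well as the optional "id=" prefix, are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical names for the capture indices (index 0 is the whole match).
// If the regex changes, only this enum has to be updated.
enum Field : std::size_t { WholeMatch = 0, Value1 = 1, Value2 = 2, Value3 = 3 };

// Searches `text` for three whitespace-separated integers, each followed by
// whitespace. The optional "id=" prefix is wrapped in (?:...), a non-marking
// group, so it does not shift the indices of the three captures.
bool parse_values(const std::string& text, std::smatch& what) {
    static const std::regex re(R"(\s*(?:id=)?(\d+)\s+(\d+)\s+(\d+)\s+)");
    return std::regex_search(text, what, re);
}
```

With this in place, a caller writes what[Value2] instead of what[2]. Note that `text` must outlive `what`, because the match results hold iterators into the string.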
I already use non-marking parentheses, but with a long regex it's still very difficult to work out which index refers to the subexpression I want. Furthermore, when using "conditional expressions" in the format strings, the difficulties are even more obvious... Thanks anyway for your reply, Jordi
On Apr 5, 2005 10:00 AM, jordi
I already use non-marking parentheses, but with a long regex it's still very difficult to work out which index refers to the subexpression I want. Furthermore, when using "conditional expressions" in the format strings, the difficulties are even more obvious...
Thanks anyway for your reply,
Jordi
Jordi - something like Regex Coach (http://www.weitz.de/regex-coach/) makes it a lot easier to work out what match indices to use for the different captures in your regular expressions... Stuart Dootson
I'm already using Regex Coach because someone in this group suggested it for checking my regexes. I think it's a great tool! I have just realized there is an option to get the index in the "Highlight" section, but I don't know how to get an index greater than 10... (does anyone know how to do that?) Thanks, Jordi
On Apr 6, 2005 11:49 AM, jordi
I'm already using Regex Coach because someone in this group suggested it for checking my regexes. I think it's a great tool!
I have just realized there is an option to get the index in the "Highlight" section, but I don't know how to get an index greater than 10... (does anyone know how to do that?)
Thanks,
Jordi
I think that might have been my suggestion :-) Hmmm - 10 captures!!! The only thing I can suggest for indices > 10 is contacting the tool maintainer through the mailing list (http://common-lisp.net/mailman/listinfo/regex-coach). It says something about feature requests there.... Stuart Dootson
jordi wrote:
I would like to know whether it's possible to assign an alias/name to a sub-expression. With a short regex there is no problem or confusion, but with a long regex I get lost working out the correct index to use. It would be much easier if I could assign an alias.
So, instead of using a regex like this:
"\s*(\d+)\s+(\d+)\s+(\d+)\s+"
I would like to be able to use something like:
"\s*([$VALUE1]\d+)\s+([$VALUE2]\d+)\s+([$VALUE3]\d+)\s+"
Have a look at xpressive in boost-sandbox. It's a regex library with an interface similar to Boost.Regex, and it lets you write regexes as expression templates, which can call other regexes. You should be able to do this:

    using namespace boost::xpressive;
    sregex value1, value2, value3;
    // ... initialize value1, value2, value3
    sregex rex = *_s >> (s1= value1 >> +_d) >> +_s
                     >> (s2= value2 >> +_d) >> +_s
                     >> (s3= value3 >> +_d) >> _s ;

This creates a regex rex that refers to three other regexes: value1, value2 and value3. (The s1=, s2= and s3= are for capturing backreferences and are unrelated to the nested regex functionality. You can remove them if you're not interested in backreferences.)

xpressive docs: http://boost-sandbox.sf.net/libs/xpressive
xpressive download: http://boost-sandbox.sourceforge.net/vault/index.php?directory=eric_niebler

(Caveat: xpressive is still under development, although it is quite usable at this point.)

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
In Boost::regex, is it possible (by setting flags) to set up a syntax which is like boost::regex::extended in that it finds the longest match, but which also supports non-marking brackets, i.e. (?: ... )?

With

    boost::regex_constants::syntax_option_type sflags =
        boost::regex::extended
        | boost::regex::perlex // So that (?:...) non-marking brackets are allowed
        | boost::regex_constants::escape_in_lists;
    total_re.assign("(\n)|(.)|(\\.)|(\[[^\]]+\])", sflags);

regex_search on "[0-9]+" matches "[", while with

    boost::regex_constants::syntax_option_type sflags =
        boost::regex::extended
        | boost::regex_constants::escape_in_lists;
    total_re.assign("(\n)|(.)|(\\.)|(\[[^\]]+\])", sflags);

regex_search on "[0-9]+" matches "[0-9]", which is what I wanted. But I'd also like non-marking brackets, which don't work with extended on its own.

I'd like non-marking brackets because I want users to be able to specify an ordered set of regexps, glom them together into one regexp, and then know the number of the regexp that gave the longest match. Rather than parsing the user regexps myself to find out whether they have used brackets, I tell users to use non-marking brackets.

In both cases

    boost::regex_constants::match_flag_type mflags =
        boost::match_default
        | boost::match_not_dot_newline
        | boost::match_continuous;

Thanks, David
Kind of: you need to compile the expression as a Perl regex, then pass match_posix to the regex algorithm(s) to tell the matcher to find the leftmost longest match. Be aware that POSIX matches tend to be quite a bit slower than Perl style matches on the whole. John.
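For David's underlying goal (finding which of an ordered set of user regexps gives the longest match), a different approach from the one John describes is to try each pattern separately and compare match lengths, which sidesteps the question of marking brackets entirely. The sketch below is an assumption-laden illustration, not Boost.Regex code: it uses std::regex with its default ECMAScript grammar so that it is self-contained, and the function name longest_match_index is hypothetical. Each individual pattern is still matched Perl-style, so only the choice between patterns is "longest wins".

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the pattern whose match, anchored
// at the start of `text`, is longest, or -1 if no pattern matches there.
// On a tie, the pattern that appears earlier in the list wins.
int longest_match_index(const std::vector<std::string>& patterns,
                        const std::string& text) {
    int best = -1;
    std::size_t best_len = 0;
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        const std::regex re(patterns[i]);
        std::smatch what;
        // match_continuous only accepts matches starting at the first character.
        if (std::regex_search(text, what, re,
                              std::regex_constants::match_continuous)) {
            const std::size_t len = static_cast<std::size_t>(what.length(0));
            if (best == -1 || len > best_len) {
                best = static_cast<int>(i);
                best_len = len;
            }
        }
    }
    return best;
}
```

Because every pattern is compiled on its own, users are free to use ordinary marking brackets, at the cost of one search per pattern instead of a single combined search.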
participants (5): David McKelvie, Eric Niebler, John Maddock, jordi, Stuart Dootson