Running Regular expression (RE) over a list, array or map
Hi,

I am wondering whether it is possible to run a regular expression directly over a list and get the indexes (beginning and end) of the match. For example, I have a list of strings and a regular expression, and I want to know which part of the list matches the RE and get the corresponding indexes in the list. Any suggestion is welcome. Thank you.

Regards
Olivier
> I am wondering whether it is possible to run a regular expression directly over a list and get the indexes (beginning and end) of the match. For example, I have a list of strings and a regular expression, and I want to know which part of the list matches the RE and get the corresponding indexes in the list. Any suggestion is welcome. Thank you.

You have two options:

1) Use the partial match option to search each item in the collection in turn, and then "do something" if regex indicates that there may be a match that spans two items in the collection.
2) Write a composite iterator which enumerates single characters over the whole collection of strings and pass that to the regex algorithms. In fact, I can't believe we don't have this already in Boost, but it appears not :-(

HTH, John.
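To make the second option concrete, here is a minimal sketch of such a composite iterator. It is not an existing Boost component: `flat_char_iterator`, `list_match` and `search_list` are hypothetical names invented for this illustration, and `std::regex` stands in for Boost.Regex so that the sketch builds without Boost. It assumes the collection is a `std::vector<std::string>` that outlives the iterators, and it reports only the first match.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost or standard class): a bidirectional
// iterator over every character of every string in a vector, in order.
class flat_char_iterator {
public:
    using iterator_category = std::bidirectional_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char*;
    using reference = const char&;

    flat_char_iterator() = default;
    // Positions at the first character of items[item], or at the end.
    flat_char_iterator(const std::vector<std::string>& items, std::size_t item)
        : items_(&items), item_(item) { skip_empty(); }

    reference operator*() const {
        assert(items_ && item_ < items_->size());
        return (*items_)[item_][pos_];
    }
    flat_char_iterator& operator++() {
        if (++pos_ == (*items_)[item_].size()) { ++item_; pos_ = 0; skip_empty(); }
        return *this;
    }
    flat_char_iterator operator++(int) { auto old = *this; ++*this; return old; }
    flat_char_iterator& operator--() {
        // Step back across item boundaries (and empty items) as needed.
        while (pos_ == 0) { --item_; pos_ = (*items_)[item_].size(); }
        --pos_;
        return *this;
    }
    flat_char_iterator operator--(int) { auto old = *this; --*this; return old; }

    friend bool operator==(const flat_char_iterator& a, const flat_char_iterator& b) {
        return a.item_ == b.item_ && a.pos_ == b.pos_;
    }
    friend bool operator!=(const flat_char_iterator& a, const flat_char_iterator& b) {
        return !(a == b);
    }

    std::size_t item() const { return item_; }    // index into the vector
    std::size_t offset() const { return pos_; }   // index into that string

private:
    void skip_empty() {
        while (item_ < items_->size() && (*items_)[item_].empty()) ++item_;
    }
    const std::vector<std::string>* items_ = nullptr;
    std::size_t item_ = 0;
    std::size_t pos_ = 0;
};

// Hypothetical result type: where the first match starts and where its
// last matched character sits, both as (item index, offset in item).
struct list_match {
    bool found = false;
    std::size_t first_item = 0, first_offset = 0;
    std::size_t last_item = 0, last_offset = 0;
    std::string text;
};

list_match search_list(const std::vector<std::string>& items, const std::regex& re) {
    flat_char_iterator first(items, 0), last(items, items.size());
    std::match_results<flat_char_iterator> m;
    list_match r;
    if (!std::regex_search(first, last, m, re)) return r;
    r.found = true;
    r.text = m.str();
    r.first_item = m[0].first.item();
    r.first_offset = m[0].first.offset();
    flat_char_iterator back = m[0].second;
    if (m[0].first != m[0].second) --back;  // move onto the last matched character
    r.last_item = back.item();
    r.last_offset = back.offset();
    return r;
}
```

For example, calling `search_list` on `{"foo", "bar", "baz"}` with the pattern `o+ba` should report a match that starts in item 0 and ends in item 1. The same iterator type should also be usable with the Boost.Regex algorithms, since they accept bidirectional iterators, but that is untested here.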
Thanks John, I will try both options.
Regards
Olivier
2013/9/26 John Maddock wrote:
>> I am wondering whether it is possible to run a regular expression directly over a list and get the indexes (beginning and end) of the match. For example, I have a list of strings and a regular expression, and I want to know which part of the list matches the RE and get the corresponding indexes in the list. Any suggestion is welcome. Thank you.
>
> You have two options:
>
> 1) Use the partial match option to search each item in the collection in turn, and then "do something" if regex indicates that there may be a match that spans two items in the collection.
> 2) Write a composite iterator which enumerates single characters over the whole collection of strings and pass that to the regex algorithms. In fact, I can't believe we don't have this already in Boost, but it appears not :-(
>
> HTH, John.
On 9/26/2013 12:02 PM, John Maddock wrote:
>> I am wondering whether it is possible to run a regular expression directly
>> over a list and get the indexes (beginning and end) of the match. For
>> example, I have a list of strings and a regular expression, and I want to
>> know which part of the list matches the RE and get the corresponding
>> indexes in the list. Any suggestion is welcome. Thank you.

...

> 2) Write a composite iterator which enumerates single characters over
> the whole collection of strings and pass that to the regex algorithms,
> in fact I can't believe we don't have this already in Boost, but it
> appears not :-(

Perhaps boost::range::join would work, something like:

    std::string s1, s2;
    auto r12 = boost::join(s1, s2);
    ...regex...(...begin(r12), end(r12)...)

at least for a list of size 2.

Jeff
Olivier, greetings --
Olivier Austina wrote:
> I am wondering whether it is possible to run a regular expression directly over a list and get the indexes (beginning and end) of the match. For example, I have a list of strings and a regular expression, and I want to know which part of the list matches the RE and get the corresponding indexes in the list.

It looks like you got some other suggestions, but if they're not what you're looking for, you might want to clarify your request.

In particular, it's not clear to me whether you want to match the RE against each individual item (which can be parallelized, but the result is a membership bitmap or subset, not a range), or if you want to match the RE against the concatenated value, or the largest span of continuous values (which are the requests that make the most sense if you want a starting and ending index).

Some sample code would probably clarify things. E.g., given:

    typedef std::vector< std::string > string_vec;
    const string_vec sv{ "foo", "bar", "baz" };

And you wanted to match:

    const boost::regex re{ "ba.*" };

What answer do you want to see?

* The membership bitmap would be something like: [ 0, 1, 1 ]
* The subset would be: { "bar", "baz" } (This is basically a "grep" operator.)
* The "range" answer would be sv.begin()+1, sv.begin()+3 (since "barbaz" matches).
* The other "range" answer would be the same, but because "bar" matches, and "baz" matches, so the range represents the (longest?) set of elements that individually match the given regex. This ends up being something of a meta-regex, or if you prefer, the result of searching the concatenated string for instances of "(?:re)+".

Or is there some other interpretation that you're trying to get at?

Happy hacking,
Tony

p.s. Heh. Guess who just finished a few interviews where "spot the under-specified problem" was important...
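The four interpretations listed above can be sketched side by side. This is only an illustration under stated assumptions: `std::regex` stands in for `boost::regex` so the code builds without Boost, "matches" for an individual item is taken to mean a whole-item `regex_match`, and the function names (`match_bitmap`, `match_subset`, `concat_range`, `longest_run`) are hypothetical. Ranges are returned as half-open pairs of item indexes.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

using string_vec = std::vector<std::string>;

// Membership bitmap: does each item match the regex on its own?
std::vector<bool> match_bitmap(const string_vec& items, const std::regex& re) {
    std::vector<bool> bits;
    for (const auto& s : items) bits.push_back(std::regex_match(s, re));
    return bits;
}

// Subset ("grep"): the items that match on their own.
string_vec match_subset(const string_vec& items, const std::regex& re) {
    string_vec out;
    for (const auto& s : items)
        if (std::regex_match(s, re)) out.push_back(s);
    return out;
}

// Range over the concatenation: search the joined text and map the first
// non-empty match back to the half-open item range [first, last) it touches.
// Returns false when there is no match or the match is empty.
bool concat_range(const string_vec& items, const std::regex& re,
                  std::size_t& first, std::size_t& last) {
    std::string joined;
    std::vector<std::size_t> starts;  // offset of each item in `joined`
    for (const auto& s : items) {
        starts.push_back(joined.size());
        joined += s;
    }
    std::smatch m;
    if (!std::regex_search(joined, m, re) || m.length(0) == 0) return false;
    const std::size_t begin_pos = static_cast<std::size_t>(m.position(0));
    const std::size_t last_pos = begin_pos + static_cast<std::size_t>(m.length(0)) - 1;
    // upper_bound finds the first item starting after the given offset.
    first = static_cast<std::size_t>(
        std::upper_bound(starts.begin(), starts.end(), begin_pos) - starts.begin()) - 1;
    last = static_cast<std::size_t>(
        std::upper_bound(starts.begin(), starts.end(), last_pos) - starts.begin());
    return true;
}

// Longest run of consecutive items that each match on their own, as a
// half-open range; {0, 0} when no item matches.
std::pair<std::size_t, std::size_t> longest_run(const std::vector<bool>& bits) {
    std::size_t best_first = 0, best_len = 0, run_first = 0, run_len = 0;
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (!bits[i]) { run_len = 0; continue; }
        if (run_len == 0) run_first = i;
        if (++run_len > best_len) { best_len = run_len; best_first = run_first; }
    }
    return {best_first, best_first + best_len};
}
```

With the sample data above (`sv` holding "foo", "bar", "baz" and the pattern `ba.*`), the bitmap should come out as [ 0, 1, 1 ], the subset as { "bar", "baz" }, and both range functions should give items 1 up to (but not including) 3.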
participants (4)
- Anthony Foiani
- Jeff Flinn
- John Maddock
- Olivier Austina