Dear all,
I want to misuse boost regex for pattern searching. The intermediate form will be a string in which some characters can take multiple values, e.g.:
aaa aba aac abc
Thus the string can be represented as a[a,b][a,c]. On this multi-valued representation I still want the full power of regular expressions, so I still want to search for aaa, or (a|b)*, etc.
Is it possible to create some sort of adjusted string type and overload operator==? I know that as an alternative I can adjust the matching string, but this would be a fallback.
Wkr, me
# gast128@hotmail.com / 2006-08-09 10:44:06 +0000:
I want to misuse boost regex for pattern searching.
That's what regular expressions are for, no? What abuse?
The intermediate form will then be a string, but some of its characters can have multiple values, e.g:
aaa aba aac abc
Thus the string can be represented as a[a,b][a,c]. Still on this multiple string representation I want the full power of regular expressions, so I still want to search for aaa, or (a|b)* etc.
I don't understand what you're trying to achieve. Are you trying to hide the fact that you're using boost::regex from the interface, so that you can do this?

    some_type anyofaababc(/* what should be here? */);
    assert(anyofaababc == "aaa");
    assert(anyofaababc == "aba");
    assert(anyofaababc == "aac");
    assert(anyofaababc == "abc");

If so, what are you envisioning to construct those objects with, if not regular expression literals? You'll end up using regex_match() in operator==() anyway, so again, what's the question exactly?
-- 
How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
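The remark that operator==() would end up calling regex_match() can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name pattern_set is hypothetical, and std::regex is used as a stand-in for boost::regex so that the snippet builds without Boost.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical wrapper: stores a pattern and compares equal to every
// string that the pattern matches in full.
class pattern_set {
public:
    explicit pattern_set(const std::string& pattern) : re_(pattern) {}

    bool matches(const std::string& s) const {
        return std::regex_match(s, re_);  // whole-string match, not a search
    }

private:
    std::regex re_;
};

// Equality here means "s is one of the strings described by p".
inline bool operator==(const pattern_set& p, const std::string& s) {
    return p.matches(s);
}

// Small usage check mirroring the asserts in the message above.
inline void check_pattern_set() {
    pattern_set anyofaababc("a[ab][ac]");
    assert(anyofaababc == "aaa");
    assert(anyofaababc == "aba");
    assert(anyofaababc == "aac");
    assert(anyofaababc == "abc");
}
```

Note that the object is still constructed from a regular expression, which is exactly the point being made: the wrapper hides the regex library but does not remove the need for it.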
Let's leave the actual case, but take the example of input which has different combinations for each character. One can make a string for each of these cases, but this of course explodes the number of combinations.
So suppose I still have a string like a[a,b][a,c] representing all cases of this string. Now suppose regex_match is going to match this against (a|b)*. I can imagine that in the end it will call something like if (*it == 'a'). This is the equality operator called for the type the iterator is pointing to.
Thus, e.g.:

    struct multiple_string {
        typedef std::string char_type;
        std::vector<std::string> m_Data;
    };
    bool operator==(char_type k1, multiple_string::char_type k2) { //etc... }

Now the whole question was: how do I get something like this to compile with boost regex? I see that it has overloads for ICU and MFC strings, but these always seem to be character based.
Put another way: is boost regex biased towards string types, or does it support a generic interface?
Wkr, me
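A compilable variant of the multiple_string idea, as a rough sketch only. The names multi_char, make_example() and represents() are hypothetical (they do not appear in the thread), and each position is assumed to store its alternatives as a plain std::string. The sketch checks one concrete string against the representation; it does not involve any regex library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One "character" of the multi-valued string: the set of values it may take.
struct multi_char {
    std::string alternatives;  // e.g. "ab" stands for [a,b]
};

// A multi-valued position equals a plain char if that char is an alternative.
inline bool operator==(const multi_char& m, char c) {
    return m.alternatives.find(c) != std::string::npos;
}

struct multiple_string {
    typedef multi_char char_type;
    std::vector<char_type> m_Data;
};

// Builds the representation a[a,b][a,c] used as the running example.
inline multiple_string make_example() {
    multiple_string ms;
    ms.m_Data.push_back(multi_char{"a"});
    ms.m_Data.push_back(multi_char{"ab"});
    ms.m_Data.push_back(multi_char{"ac"});
    return ms;
}

// True if s is one of the concrete strings that ms stands for.
inline bool represents(const multiple_string& ms, const std::string& s) {
    if (ms.m_Data.size() != s.size()) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!(ms.m_Data[i] == s[i])) return false;
    }
    return true;
}
```

With this, represents(make_example(), "abc") holds while represents(make_example(), "abb") does not. Feeding such a type to a regex engine is a separate matter, since the engine decides how it compares characters.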
gast128 wrote:
Let's leave the actual case, but take the example of input which has different combinations for each character. One can make a string for each of these cases, but this of course explodes the number of combinations.
So suppose I still have a string like a[a,b][a,c] representing all cases of this string. Now suppose regex_match is going to match this against (a|b)*. I can imagine that in the end it will call something like if (*it == 'a'). This is the equality operator called for the type the iterator is pointing to.
Thus, e.g.:
struct multiple_string { typedef std::string char_type;
std::vector<std::string> m_Data; };
bool operator==(char_type k1, multiple_string::char_type k2) { //etc... }
Now the whole question was: how do I get something like this to compile with boost regex? I see that it has overloads for ICU and MFC strings, but these always seem to be character based.
Put another way: is boost regex biased towards string types, or does it support a generic interface?
It works on iterators, and on comparisons between characters.
You could define your own traits class so that, say, a character 'a' in your regex would always match any of [abc] in the string.
But you can't match a regex against a regex, which seems to be what you're asking for. As has been said already, that's a much harder problem.
John.
John Maddock writes:
It works on iterators, and on comparisons between characters.
You could define your own traits class so that say: a character 'a' in your regex would always match any of [abc] in the string.
I can try this, but a first shot didn't compile. So I wondered whether this was possible at all, but I should probably give it another try. In the end I could always write a simple regex engine myself (a la CUJ 'A Different Interpretation of the Interpreter Design Pattern'), but that's what I want to avoid.
But you can't match a regex against a regex which seems to be what you're asking for. As has been said already that's a much harder problem.
Well, that would be even more general :)
Wkr, me
gast128 wrote:
John Maddock writes:
It works on iterators, and on comparisons between characters.
You could define your own traits class so that say: a character 'a' in your regex would always match any of [abc] in the string.
I can try this, but a first shot didn't compile. So I wondered whether this was possible at all, but I should probably give it another try. In the end I could always write a simple regex engine myself (a la CUJ 'A Different Interpretation of the Interpreter Design Pattern'), but that's what I want to avoid.
The easiest way is to define a traits class that inherits from regex_traits and overrides the definition of translate() and/or translate_nocase().
John.
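A minimal sketch of the translate() suggestion, under stated assumptions. It is written against std::regex rather than Boost so that it builds without Boost; with Boost.Regex the class would presumably derive from boost::regex_traits<char> and be used through boost::basic_regex, and the details may differ. The names fold_abc_traits and fold_match are hypothetical. Note that std::regex consults translate() only when the collate flag is set, which is why the flag appears below.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical traits class: treats 'a', 'b' and 'c' as one equivalence
// class by folding 'b' and 'c' onto 'a'. The base member is hidden, not
// virtually overridden; the regex template calls it through this type.
struct fold_abc_traits : std::regex_traits<char> {
    char translate(char c) const {
        return (c == 'b' || c == 'c') ? 'a' : c;
    }
};

typedef std::basic_regex<char, fold_abc_traits> fold_regex;

// Full match of text against pattern using the folding traits.
// Assumption: the collate flag makes the engine compare characters
// via translate(), as the C++ standard specifies for std::regex.
inline bool fold_match(const std::string& text, const std::string& pattern) {
    fold_regex re(pattern,
                  std::regex_constants::ECMAScript |
                  std::regex_constants::collate);
    return std::regex_match(text, re);
}

inline void check_fold_match() {
    assert(fold_match("abc", "aaa"));   // 'b' and 'c' fold onto 'a'
    assert(!fold_match("abd", "aaa"));  // 'd' is not in the class
}
```

The folding applies to every position and to both the pattern and the input, so it makes whole characters equivalent everywhere. It does not express per-position alternatives such as a[a,b][a,c], which would need a different approach.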
participants (3)
- gast128
- John Maddock
- Roman Neuhauser