This is annoying me to no end. I have a string called 'sql' that I am matching against a set of patterns. My program tests 'sql' against 14 possible patterns, which are stored in an array of strings. EVERYTHING works except for the last pattern, and even then it fails only when the 2nd sub-expression is longer than a certain length. For example, this works:

    insert into test (one,two,three) values ()

Through a lot of testing, I have seen that once the comma-delimited list in the second sub-expression surpasses a certain length, my program gets aborted at the line

    if(sqlRegex.Match(sql)) {

I am running the latest version of boost regex++, and I am running g++ 2.96 on Linux. I ran gdb, and I seem to have narrowed down the problem to c_str(), but this is not logical. Here is the relevant part of my program:

    string checks[] = {
        /* 0 */ "create table (_ID_) (\\(((\\s*,?\\s*(_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?\\s*)+)\\))",
        /* 1 */ "show tables",
        /* 2 */ "drop table (_ID_)",
        /* 3 */ "show tdb\\.xml",
        /* 4 */ "desc( table)? (_ID_)",
        /* 5 */ "alter table (_ID_) add( column)? ((_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?)",
        /* 6 */ "alter table (_ID_) drop( column)? (_ID_)",
        /* 7 */ "alter table (_ID_) add( column)? \\(((\\s*,?\\s*(_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?\\s*)+)\\)",
        /* 8 */ "rename table (_ID_) (to|as) (_ID_)",
        /* 9 */ "alter table (_ID_) alter( column)? (_ID_) (set default ((\\d+(\\.\\d+)?)|(\"[^\"]*\"))|drop default)",
        /* 10 */ "alter table (_ID_) modify( column)? ((_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?)",
        /* 11 */ "alter table (_ID_) change( column)? (_ID_) ((_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?)",
        /* 12 */ "alter table (_ID_) rename( to| as) (_ID_)",
        /* 13 */ "insert( into)? (_ID_)( \\(((\\s*,?\\s*(_ID_)\\s*)+)\\))? values \\(\\)"
    };
    int checksLength = sizeof checks / sizeof *checks;
    int i;
    RegEx sqlRegex;
    for(i=0;i<checksLength;i++) {
        sqlRegex.SetExpression(checks[i],true);
        if(sqlRegex.Match(sql)) {
            break;
        }
    }

The interesting thing is that the first string, checks[0], is similar to the insert syntax, and when I use a very long string for the create table syntax, it works perfectly fine. I also tried adding another element to checks, just as an experiment, but that didn't help. I had no such problem with the other 13 cases; it only happens with this last insert case.

Any help will be greatly appreciated, I have been stuck on this for days!!

Thanks,
Kevin
I tried testing using the program below and it checked out fine with VC6. Can you verify that this reproduces the problem? If not, can you please submit one that does? If the issue is occurring in a call to c_str(), can you verify that the argument is actually a valid string, and/or get a stack backtrace?

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm

Here's my test code:

    #include <iostream>
    #include <string>
    #include "boost/cregex.hpp"

    int main(int, char**)
    {
        std::string checks[] = {
            /* 0 */ "create table (_ID_) (\\(((\\s*,?\\s*(_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?\\s*)+)\\))",
            /* 1 */ "show tables",
            /* 2 */ "drop table (_ID_)",
            /* 3 */ "show tdb\\.xml",
            /* 4 */ "desc( table)? (_ID_)",
            /* 5 */ "alter table (_ID_) add( column)? ((_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?)",
            /* 6 */ "alter table (_ID_) drop( column)? (_ID_)",
            /* 7 */ "alter table (_ID_) add( column)? \\(((\\s*,?\\s*(_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?\\s*)+)\\)",
            /* 8 */ "rename table (_ID_) (to|as) (_ID_)",
            /* 9 */ "alter table (_ID_) alter( column)? (_ID_) (set default ((\\d+(\\.\\d+)?)|(\"[^\"]*\"))|drop default)",
            /* 10 */ "alter table (_ID_) modify( column)? ((_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?)",
            /* 11 */ "alter table (_ID_) change( column)? (_ID_) ((_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?( first| after (_ID_))?)",
            /* 12 */ "alter table (_ID_) rename( to| as) (_ID_)",
            /* 13 */ "insert( into)? (_ID_)( \\(((\\s*,?\\s*(_ID_)\\s*)+)\\))? values \\(\\)"
        };
        int checksLength = sizeof checks / sizeof *checks;
        int i;
        std::string sql = "insert into test (one,two,three) values ( aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb , cccccccccccccccccccccccccccccccccccccc)";
        boost::RegEx sqlRegex;
        for(i=0;i<checksLength;i++) {
            sqlRegex.SetExpression(checks[i],true);
            if(sqlRegex.Match(sql)) {
                break;
            }
        }
        return 0;
    }
Sorry, I didn't qualify my problem correctly. The problem comes in when the column list is too long, the opposite of what you tested. Also, I doubt the problem can be reproduced outside such specific conditions, so if you want to try the full program, it will take about 5 minutes to download and test; then try the same test, except with the gibberish in the first ()'s and not the second ()'s. Code is here:

http://www.myplaceonline.com/code/code.htm

Just compile everything together and run ./try.cgi, and from there enter an sql string similar to the following (you'll also have to create a table first like this: "create table test (one int)"):

    insert into test (lkfsdjlkafdjklfsdajflkjfdklsjfsdlkjsfldkjfskldaj) values (1);

When you go to download the code, the links are all to .htm files, so just take off the .htm and you will get just the .cpp or .h files. This is the compile string I used:

    g++ tdbsql.cpp textdb.cpp textdbut.cpp $HOME/boost_1_29_0/libs/regex/build/gcc/libboost_regex.a /usr/lib/libexpat.a -I$HOME/boost_1_29_0 -o try.cgi -Wall

You might be able to take out both the .a files and the -I flag if everything is set up right on your Linux/Unix system, but mine isn't.

It is a very interesting problem, because the create table syntax is almost the same, and when I try long delimited lists there, it works fine, but it fails seemingly at random for the insert syntax. I've tried everything to narrow down the problem, but it just makes no sense. It definitely occurs in RegEx.Match(), but I can't figure out what is going wrong, which is why I posted here. It has something to do with RegEx; whether implicitly or explicitly is what I'm not sure of. The problem is just really confusing, and running gdb, like I said, backtraces to a weird spot.

It doesn't backtrace to c_str like I said earlier. What I meant is that if I step through each line in gdb, then right after the call to RegEx.Match() is made (and again, it fails only in the case of the insert syntax), execution goes into c_str, and somewhere around there the abort occurs.

Hopefully you can help me with this. Thank you VERY MUCH for all your time and help,
Kevin Grigorenko
I tried to look at your code, but can't compile because of:

    #include <xmlparse.h>

whatever that is. If you want me to look at this, can you please try to boil it down to a reproducible test case, or failing that, mail me a .zip file privately containing everything needed to build the app, plus a text file containing the input that demonstrates the problem. Remember that I'm likely to be building on Win32, not Linux: I know next to nothing about debugging gcc-built apps, so there's no mileage to be gained by me trying to do that rather than you.

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm
This test is completely stripped down, should work for you on Win32, and still reproduces the problem:

    #include <iostream>
    #include <string>
    #include <boost/regex.hpp>

    using namespace std;
    using namespace boost;

    const string re_ID = "[\\w\\-]+";
    const string re_TYPES = "int|decimal|datetime|varchar";
    const string re_BASIC_COLDEF = "(_ID_) (_TYPES_)( not null)?( unique)?( default ((\\d+(\\.\\d+)?)|(\"[^\"]*\")))?( unique)?";
    const string re_EXT_COLDEF = re_BASIC_COLDEF+"( first| after (_ID_))?";

    // Replace every occurrence of Find in Source with Replacement.
    string replaceStr(string Source, const string& Find, const string& Replacement) {
        string::size_type Length = Find.length();
        string::size_type ReplacementLength = Replacement.length();
        string::size_type Pos = 0;
        while((Pos = Source.find(Find,Pos)) != string::npos) {
            Source.replace(Pos,Length,Replacement);
            Pos += ReplacementLength;
        }
        return Source;
    }

    // Expand spaces to \s+ first, then the _COL_, _ECOL_, _ID_ and _TYPES_ macros.
    string replaceMacros(string reg) {
        return(replaceStr(replaceStr(replaceStr(replaceStr(replaceStr(reg," ","\\s+"),"_COL_",re_BASIC_COLDEF),"_ECOL_",re_EXT_COLDEF),"_ID_",re_ID),"_TYPES_",re_TYPES));
    }

    int main() {
        string sql = "insert into test (vklajsfkljsdfkjsldfsdjkfdslklfsdjklfsjfsd) values (1)";
        string checks[] = {
            /* 0 */ "create table (_ID_) (\\(((\\s*,?\\s*_COL_\\s*)+)\\))",
            /* 1 */ "show tables",
            /* 2 */ "drop table (_ID_)",
            /* 3 */ "show database",
            /* 4 */ "desc( table)? (_ID_)",
            /* 5 */ "alter table (_ID_) add( column)? (_ECOL_)",
            /* 6 */ "alter table (_ID_) drop( column)? (_ID_)",
            /* 7 */ "alter table (_ID_) add( column)? \\(((\\s*,?\\s*_ECOL_\\s*)+)\\)",
            /* 8 */ "rename table (_ID_) (to|as) (_ID_)",
            /* 9 */ "alter table (_ID_) alter( column)? (_ID_) (set default ((\\d+(\\.\\d+)?)|(\"[^\"]*\"))|drop default)",
            /* 10 */ "alter table (_ID_) modify( column)? (_COL_)",
            /* 11 */ "alter table (_ID_) change( column)? (_ID_) (_COL_)",
            /* 12 */ "alter table (_ID_) rename( to| as) (_ID_)",
            /* 13 */ "insert( into)? (_ID_)( \\(((\\s*,?\\s*(_ID_)\\s*)+)\\))? values \\(((\\s*,?\\s*((\\d+(\\.\\d+)?)|(\"[^\"]*\"))\\s*)+)\\)"
        };
        int checksLength = sizeof checks / sizeof *checks;
        int i;
        for(i=0;i<checksLength;i++) {
            checks[i] = "^(\\s*"+replaceMacros(checks[i])+"\\s*;?)$";
        }
        RegEx sqlRegex;
        for(i=0;i<checksLength;i++) {
            sqlRegex.SetExpression(checks[i],true);
            if(sqlRegex.Match(sql)) {
                break;
            }
        }
        cout << "Finished." << endl;
        return 0;
    }

If you change the "sql" string to match any of the other 13 cases, they work fine. Only the last case fails.

Again, thanks a lot for your time,
Kevin
eydelber wrote:
Code is here:
http://www.myplaceonline.com/code/code.htm
Just compile everything together and run ./try.cgi, and from there enter an sql string similar to (you'll also have to create a table first like this: "create table test (one int)"):
insert into test (lkfsdjlkafdjklfsdajflkjfdklsjfsdlkjsfldkjfskldaj) values (1);
With some difficulty, I have reproduced your problem with GCC 3.2 on Red Hat 8.0.

The difficulties were mainly non-conforming code:

1. Get James Clark's expat parser.
2. Include <iostream> where necessary.
3. Add further using directives where necessary.
4. Remove the non-conforming duplication of default parameter values in the headers and the implementation files.
5. There are further warnings which matter but are irrelevant to this exercise.

In the general case, if a C++ program compiled with GCC prints "Aborted" and exits, this is caused by an unhandled exception.

In this case, you promise that textdb::TextDB::execute will only throw TextDBException. However, RegEx::Match is throwing a different exception. You are not handling this exception in textdb::TextDB::execute, and you have promised not to leak it; the result is that GCC causes your program to abort.

By removing the throw specification from your function and adding extra catches in main(), I can tell you that RegEx::Match is throwing an exception whose what() reports "Max regex search depth exceeded."

I hope this helps.

Regards,
Stephen Jackson
--
stephen.jackson@scribitur.com
http://www.scribitur.com/spj/
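One way to keep the promise described above is to translate foreign exceptions at the function boundary. The following is a minimal sketch, assuming modern C++ and using only the standard library; `TextDBException`, `matchStatement`, `execute`, `tryExecute` and `demo` are hypothetical stand-ins written for this illustration, not the actual textdb or Boost code. The stand-in matcher simply throws a `std::runtime_error` carrying the message reported in this thread.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the application's own exception type.
class TextDBException : public std::runtime_error {
public:
    explicit TextDBException(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical stand-in for the regex call: for long input it fails with the
// same message that RegEx::Match was reported to produce in this thread.
bool matchStatement(const std::string& sql) {
    if (sql.size() > 40) {
        throw std::runtime_error("Max regex search depth exceeded.");
    }
    return !sql.empty();
}

// execute() is documented to throw only TextDBException, so any other
// exception raised underneath is caught here and translated, not leaked.
bool execute(const std::string& sql) {
    try {
        return matchStatement(sql);
    } catch (const TextDBException&) {
        throw;  // already the documented type
    } catch (const std::exception& e) {
        throw TextDBException(std::string("regex failure: ") + e.what());
    }
}

// Returns the error text, or an empty string when execute() succeeds.
std::string tryExecute(const std::string& sql) {
    try {
        execute(sql);
        return "";
    } catch (const TextDBException& e) {
        return e.what();
    }
}

// Usage example: call demo() from main() to exercise both paths.
void demo() {
    assert(tryExecute("show tables").empty());
    assert(!tryExecute(std::string(50, 'x')).empty());
}
```

With this shape the caller sees a single exception type, and the original what() text is preserved inside the translated message.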
Well, you are absolutely right. The weird thing is that I tried to catch exceptions within my execute method, but for some reason it wasn't catching anything; however, now that I have tried the stripped-down code that I sent in reply to John Maddock, with a catch added, I get the error that you also found.

Now that my problem has finally been isolated, do you know what this exception means? I didn't think that the regex was that complicated, and it works fine for other regexes that are very similar. What specifically is search depth?

Also, following up on the introduction to your answer, can you expand on the problems you had compiling, and any general problems you found with my code? I am only 18, and I've just started getting into C++, so I am trying to learn as much as possible from people who know what they are talking about. It is obvious that I have made many mistakes, as you point out, so any expansion on these would be extremely helpful. I am running GCC 2.96 with -Wall, and I didn't have any warnings or errors compiling, so how did you have so many problems with GCC 3.2? It doesn't seem very backwards-compatible if I am not having problems and you are.

A few questions on only the comments that you included:

2: Where did you have to include <iostream>?
3: Again, where did you have to add using directives?
4: Do default values only have to be in the declaration?
5: Can you expand further on these unmentioned warnings?

Thank you VERY much for all your help, I learn something new every day.
Kevin
eydelber wrote:
Well, you are absolutely right. The weird thing is that I tried to catch exceptions within my execute method, but for some reason it wasn't catching anything; however, now that I tried the stripped down code that I replied with to John Maddock, and added a catch, I get the error that you also found.
I would expect that catching within your execute method should work. However, your stripped down standalone example does show the same exception being thrown by RegEx::Match.
Now that my problem has finally been isolated, do you know what this exception means? I didn't think that the regex was that
I don't have time to debug it right now. Perhaps John Maddock may be able to help now that you have posted a simplified case.
Also, following up on your introduction to your answer, can you expand on the problems you had compiling, and any general problems you found with my code? I am only 18, and I've just started getting
I'll mail you off list since these are not Boost issues. It could be tonight (UK time), or possibly this weekend.
Thank you VERY much for all your help, I learn something new everyday.
No problem. We are all learning. Regards, Stephen Jackson -- stephen.jackson@scribitur.com http://www.scribitur.com/spj/
Great, thanks. If it is at all possible, please email me at kevin@myplaceonline.com, and not the email used here. Again, thanks a lot for the help. Kevin Grigorenko
In this case, you promise that textdb::TextDB::execute will only throw TextDBException. However, RegEx::Match is throwing a different exception. You are not handling this exception in textdb::TextDB::execute, and you have promised not to leak it; the result is that GCC causes your program to abort.
By removing the throw specification from your function and adding extra catches in main(), I can tell you that RegEx::Match is throwing an exception whose what() reports "Max regex search depth exceeded."
I hope this helps.
Nice work! Thank you :-)

It seems like the expression is getting pathological with some text inputs and throwing (otherwise it would just go round and round indefinitely, so throwing is the least worst option in this case).

Looking again at your expressions I see:

    /* 13 */ "insert( into)? (_ID_)( \\(((\\s*,?\\s*(_ID_)\\s*)+)\\))? values \\(((\\s*,?\\s*((\\d+(\\.\\d+)?)|(\"[^\"]*\"))\\s*)+)\\)"

Now, I haven't picked this apart. The trick is to ensure that each time the matcher has to choose which option to take (repeat or not, take an alternative or not), there is only one option it can take. Whatever regex engine is in use, this will optimise performance, and for backtracking engines it will prevent pathological behaviour. To pick just one example in your expression:

    \\s*,?\\s*

This will misbehave if there is a lot of whitespace and no ",". Changing it to:

    \\s*(,\\s*)?

fixes the issue.

Elsewhere, several of your repeats both start and finish with \\s*, so again there is plenty of room for optimisation.

Hope this gets you started,

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm
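To make the rewrite above concrete, here is a small sketch that checks the two forms accept exactly the same text. It is an illustration only: it uses the standard library's `std::regex` rather than boost::RegEx, and the function names (`matchesAmbiguous`, `matchesUnambiguous`, `demo`) are made up for this example. What differs between the two forms is how many ways a run of whitespace can be divided up, not what they match.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Ambiguous form from the thread: when no comma is present, the two \s*
// runs can share the same whitespace in many different ways.
bool matchesAmbiguous(const std::string& text) {
    static const std::regex re(R"re(\s*,?\s*)re");
    return std::regex_match(text, re);
}

// Suggested form: the second \s* is reachable only after a comma, so a run
// of whitespace without a comma can be matched in just one way.
bool matchesUnambiguous(const std::string& text) {
    static const std::regex re(R"re(\s*(,\s*)?)re");
    return std::regex_match(text, re);
}

// Usage example: both forms agree on every sample, accepted or rejected.
void demo() {
    const char* samples[] = {"", "   ", ",", " , ", "  ,", ",,", "a", " , ,"};
    for (const char* s : samples) {
        assert(matchesAmbiguous(s) == matchesUnambiguous(s));
    }
}
```

Inside a repeated group, the number of ways to split the whitespace multiplies with every iteration, which is the behaviour a backtracking matcher has to guard against.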
Thank you very much, that is most likely the problem then. I'll try that and if I'm still having problems I'll post back here.

The point you make is interesting however, because if you look at the first Regex:

    create table (_ID_) (\\(((\\s*,?\\s*_COL_\\s*)+)\\))

It is essentially the same, but I have had no problems at all with it. Is it then safe to assume that the problem does not lie specifically with \s*,\s*, but somewhere deeper? Here is the actual regex that finally gets used (after the call to replaceMacros):

    ^(\s*insert(\s+into)?\s+([\w\-]+)(\s+\(((\s*(?:,?\s*)?([\w\-]+)\s*)+)\))?\s+values\s+\(((\s*,?\s*((\d+(\.\d+)?)|("[^"]*"))\s*)+)\)\s*;?)$

I'm guessing it is something with:

    (\s+\(((\s*(?:,\s*)?([\w\-]+)\s*)+)\))?

This one has a few more groupings than all the other regexes, but I can't seem to figure out the problem.

BTW, during the course of writing this reply, I tested the expression with \s*(,\s*)? (as you can see above), and it still throws an exception.
I think that if you have to parse very complicated expressions you should try matching them with Spirit (which is also a Boost candidate library). This library uses a "recursive descent parser" instead of regular expressions, and is very convenient (at least for me). Look at:

http://spirit.sourceforge.net

Regards
Daniel
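As a rough illustration of the parser-based approach suggested above, the sketch below parses a parenthesised column list by hand in plain C++. It does not use Spirit, and every name in it (`skipSpaces`, `parseIdentifier`, `parseColumnList`, `demo`) is hypothetical; the identifier rule is assumed to mirror the `[\w\-]+` definition of re_ID from the stripped-down test. The parser only ever moves forward through the input, so there is no backtracking to blow up.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Advance pos past any whitespace.
void skipSpaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) {
        ++pos;
    }
}

// identifier := one or more letters, digits, '_' or '-'
bool parseIdentifier(const std::string& s, std::size_t& pos, std::string& out) {
    const std::size_t start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) ||
            s[pos] == '_' || s[pos] == '-')) {
        ++pos;
    }
    if (pos == start) return false;
    out = s.substr(start, pos - start);
    return true;
}

// column_list := '(' identifier (',' identifier)* ')'
// On success returns true and stores the names in `columns`;
// on failure `columns` is left unchanged.
bool parseColumnList(const std::string& s, std::vector<std::string>& columns) {
    std::vector<std::string> found;
    std::size_t pos = 0;
    skipSpaces(s, pos);
    if (pos >= s.size() || s[pos] != '(') return false;
    ++pos;
    for (;;) {
        std::string id;
        skipSpaces(s, pos);
        if (!parseIdentifier(s, pos, id)) return false;
        found.push_back(id);
        skipSpaces(s, pos);
        if (pos < s.size() && s[pos] == ',') { ++pos; continue; }
        break;
    }
    if (pos >= s.size() || s[pos] != ')') return false;
    ++pos;
    skipSpaces(s, pos);
    if (pos != s.size()) return false;  // trailing junk after ')'
    columns.swap(found);
    return true;
}

// Usage example: the long single-column input from the thread parses fine.
void demo() {
    std::vector<std::string> cols;
    assert(parseColumnList("(lkfsdjlkafdjklfsdajflkjfdklsjfsdlkjsfldkjfskldaj)", cols));
    assert(cols.size() == 1);
}
```

A full SQL front end would need many more rules than this, which is where a parser library such as Spirit would earn its keep.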
BTW, during the course of writing this reply, I tested the expression with \s*(,\s*)? (as you can see above), and it still throws an exception.
I didn't really think that was the issue: it was an example of the kind of thing that can trip you up. Basically I'm trying not to get too deep into your expression, but I think the comma-separated list part is probably messing you up. Try something along the lines of:

    \(\s*([^\s][^,]*?\s*,\s*)+\)

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm
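For comparison, a common way to write a comma-separated list without ambiguity is "first item, then zero or more (comma, item) groups". The sketch below is not the expression suggested above and not necessarily what was finally used in this thread; it is an assumed alternative, written with the standard library's `std::regex` instead of boost::RegEx, with the identifier class spelled out as `[A-Za-z0-9_-]` to mirror `[\w\-]+`. The function names `isColumnList` and `demo` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// '(' item (',' item)* ')' with optional whitespace around each item.
// Every repetition starts with a mandatory comma, so a run of whitespace
// belongs either to the next ", item" group or to the closing ')' and
// there is only one successful way to match it.
bool isColumnList(const std::string& text) {
    static const std::regex re(
        R"re(\(\s*[A-Za-z0-9_-]+(?:\s*,\s*[A-Za-z0-9_-]+)*\s*\))re");
    return std::regex_match(text, re);
}

// Usage example: the long single-column input from the thread is accepted.
void demo() {
    assert(isColumnList("(lkfsdjlkafdjklfsdajflkjfdklsjfsdlkjsfldkjfskldaj)"));
    assert(isColumnList("(one,two,three)"));
}
```

The same shape would apply to the values list, with the item pattern swapped for the number-or-quoted-string alternative.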
I got it to work finally, thank you very much for all the help! Kevin
participants (5)
- Daniel Yerushalmi
- eydelber
- eydelber <eydelber@yahoo.com>
- John Maddock
- Stephen Jackson