
This is a follow-up to the message and response quoted below. Boost regex seems to work fine on Mac OS X and on our Linux platforms, but on 32-bit Windows it behaves differently. This message is on the long side because it includes a short test program and its output on Windows and Linux.

The program searches for the pattern [A-Z][a-z]* in a fixed string, first with boost::regex and then with the POSIX API. The expected match is "Great", at offset 9 with length 5. On Windows, the POSIX API returns the right offset only with boost::REG_PERL or boost::REG_PERLEX; every other flag matches "small" at offset 0. The boost::regex class shows the same split on Windows: only normal and perl give pos=9. On Linux, every flag gives the right result.

Program
----------
#include <boost/regex.hpp>
#include <boost/regex.h>
#include <string>
#include <iostream>

using namespace std;

static const char* szPattern="[A-Z][a-z]*";
static const char* szString="small is Great for the Big and Tall";

// Search with the boost::regex class using the given syntax flag.
void f1_(boost::regex::flag_type flag, const char* flag_str)
{
    cout << "\nUsing boost::regex, flag=" << flag << " (" << flag_str << ")" << endl;
    std::string s = szString;
    boost::regex re(szPattern, flag);
    boost::match_results<std::string::const_iterator> what;
    boost::regex_search(s, what, re);
    std::cout << "pos=" << what.position() << " len=" << what.length() << std::endl;
}

// Search with the POSIX-style API (regcomp/regexec) using the given flag.
void f2_(int flag, const char* flag_str)
{
    cout << "\nUsing Posix, flag=" << flag << " (" << flag_str << ")" << endl;
    regex_t pattern;
    int x = regcomp(&pattern, szPattern, flag);
    if ( x != 0 )
    {
        std::cout << "regcomp - error" << std::endl;
        return;
    }
    regmatch_t matches[5];
    x = regexec(&pattern, szString, 5, matches, 0);
    if ( x != 0 )
    {
        std::cout << "regexec - error" << std::endl;
        regfree(&pattern);
        return;
    }
    std::cout << "matches[0].rm_so=" << matches[0].rm_so << std::endl;
    std::cout << "matches[0].rm_eo=" << matches[0].rm_eo << std::endl;
    regfree(&pattern);  // release the compiled pattern
}

// Pass each flag together with its spelling so the output can name it.
#define f1(x) f1_(x, #x)
#define f2(x) f2_(x, #x)

int main()
{
    cout << "Regex=" << szPattern << endl;
    cout << "Input=" << szString << endl;
    f1(boost::regex::normal);
    f1(boost::regex::basic);
    f1(boost::regex::extended);
    f1(boost::regex::awk);
    f1(boost::regex::grep);
    f1(boost::regex::egrep);
    f1(boost::regex::sed);
    f1(boost::regex::perl);
    f2(0); // default
    f2(boost::REG_EXTENDED);
    f2(boost::REG_BASIC);
    f2(boost::REG_PERL);
    f2(boost::REG_AWK);
    f2(boost::REG_GREP);
    f2(boost::REG_EGREP);
    f2(boost::REG_PERLEX);
    return 0;
}

Output From Windows
--------------------------
Regex=[A-Z][a-z]*
Input=small is Great for the Big and Tall

Using boost::regex, flag=0 (boost::regex::normal)
pos=9 len=5

Using boost::regex, flag=2162689 (boost::regex::basic)
pos=0 len=5

Using boost::regex, flag=2163456 (boost::regex::extended)
pos=0 len=5

Using boost::regex, flag=2097920 (boost::regex::awk)
pos=0 len=5

Using boost::regex, flag=2293761 (boost::regex::grep)
pos=0 len=5

Using boost::regex, flag=2294528 (boost::regex::egrep)
pos=0 len=5

Using boost::regex, flag=2162689 (boost::regex::sed)
pos=0 len=5

Using boost::regex, flag=0 (boost::regex::perl)
pos=9 len=5

Using Posix, flag=0 (0)
matches[0].rm_so=0
matches[0].rm_eo=5

Using Posix, flag=1 (boost::REG_EXTENDED)
matches[0].rm_so=0
matches[0].rm_eo=5

Using Posix, flag=0 (boost::REG_BASIC)
matches[0].rm_so=0
matches[0].rm_eo=5

Using Posix, flag=2817 (boost::REG_PERL)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=513 (boost::REG_AWK)
matches[0].rm_so=0
matches[0].rm_eo=5

Using Posix, flag=1024 (boost::REG_GREP)
matches[0].rm_so=0
matches[0].rm_eo=5

Using Posix, flag=1025 (boost::REG_EGREP)
matches[0].rm_so=0
matches[0].rm_eo=5

Using Posix, flag=2048 (boost::REG_PERLEX)
matches[0].rm_so=9
matches[0].rm_eo=14

Linux (Red Hat) Output
----------------------------
Regex=[A-Z][a-z]*
Input=small is Great for the Big and Tall

Using boost::regex, flag=0 (boost::regex::normal)
pos=9 len=5

Using boost::regex, flag=2162689 (boost::regex::basic)
pos=9 len=5

Using boost::regex, flag=2163456 (boost::regex::extended)
pos=9 len=5

Using boost::regex, flag=2097920 (boost::regex::awk)
pos=9 len=5

Using boost::regex, flag=2293761 (boost::regex::grep)
pos=9 len=5

Using boost::regex, flag=2294528 (boost::regex::egrep)
pos=9 len=5

Using boost::regex, flag=2162689 (boost::regex::sed)
pos=9 len=5

Using boost::regex, flag=0 (boost::regex::perl)
pos=9 len=5

Using Posix, flag=0 (0)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=1 (boost::REG_EXTENDED)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=0 (boost::REG_BASIC)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=2817 (boost::REG_PERL)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=513 (boost::REG_AWK)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=1024 (boost::REG_GREP)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=1025 (boost::REG_EGREP)
matches[0].rm_so=9
matches[0].rm_eo=14

Using Posix, flag=2048 (boost::REG_PERLEX)
matches[0].rm_so=9
matches[0].rm_eo=14
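
As a cross-check that does not depend on Boost, the same search can be run through the standard library's std::regex. This is only a sketch: std::regex is a different engine, first_match is a hypothetical helper name, and the result says nothing about why Boost behaves differently on Windows. It only confirms which offset the pattern is expected to produce with POSIX extended syntax.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not part of Boost): returns {position, length} of the
// first match of `pattern` in `input`, or {npos, 0} when nothing matches.
std::pair<std::size_t, std::size_t>
first_match(const std::string& pattern, const std::string& input)
{
    // std::regex::extended selects POSIX extended syntax in the standard library.
    const std::regex re(pattern, std::regex::extended);
    std::smatch what;
    if (!std::regex_search(input, what, re))
        return {std::string::npos, 0};
    return {static_cast<std::size_t>(what.position(0)),
            static_cast<std::size_t>(what.length(0))};
}
```

Calling first_match("[A-Z][a-z]*", "small is Great for the Big and Tall") should give position 9 and length 5, the same values the Linux run reports for every flag.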
Message: 4
Date: Mon, 10 Mar 2008 18:08:16 -0000
From: "John Maddock" <john@johnmaddock.co.uk>
Subject: Re: [Boost-users] REG_PERLEX
To: <boost-users@lists.boost.org>
Message-ID: <00a201c882d9$bab38360$83d56b51@fuji>
Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original
Phil Hystad wrote:
Does anyone know the definition of REG_PERLEX?
I am using the traditional Unix/POSIX regcomp API supported by the Boost Regular Expression library. On a 32-bit Windows platform we are forced to pass REG_PERLEX in the regcomp flags argument, whereas the same application gets by with a flag value of zero on Mac OS X and Linux.
REG_PERLEX allows the engine to accept Perl-style regular expressions. What kind of expressions are you using, and what differences do you observe on the different platforms? There shouldn't really be any difference in behaviour.
John.
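
On the question of what REG_PERLEX is, the flag values printed by the test program can be split into individual bits. The sketch below uses only the numbers from that output, not the Boost headers, so the values may differ between Boost versions; set_bits is a hypothetical helper name.

```cpp
#include <vector>

// Hypothetical helper: lists the individual bits set in a flag value,
// lowest bit first. An input of 0 yields an empty list.
std::vector<unsigned> set_bits(unsigned flags)
{
    std::vector<unsigned> bits;
    // Stop once the bit exceeds the value, or wraps to 0 after the top bit.
    for (unsigned bit = 1; bit != 0 && bit <= flags; bit <<= 1)
        if (flags & bit)
            bits.push_back(bit);
    return bits;
}
```

With the printed values, set_bits(2817) for REG_PERL gives 1, 256, 512 and 2048, so REG_PERL contains both the REG_EXTENDED bit (1) and the REG_PERLEX bit (2048); the output does not say what 256 and 512 stand for. None of the other printed values (0, 1, 513, 1024, 1025) contain 2048, which lines up with REG_PERL and REG_PERLEX being the only flags that return rm_so=9 on Windows.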