
Actually, I was asking about initial construction cost, in particular of an object representing a failed match. The acceptance of N1610 means that copy costs should be insignificant for cases like this one, provided that the smatch author puts in the required effort to make it moveable. ;-)
Sounds like a hint - maybe if we could make shared_ptr moveable then we could all delegate the work to that :-)

As for the initial construction cost - yes, there is a cost: it has to allocate memory to store the sub-expression matches. The matcher needs some working space, and therefore starts storing the submatches before it knows whether there will be a match. Consider your current code:

    std::string line;
    boost::regex pat("^Subject: (Re: )?(.*)");
    boost::smatch matches;
    while (std::cin)
    {
        std::getline(std::cin, line);
        if (boost::regex_match(line, matches, pat))
            std::cout << matches[2];
    }

The first time regex_match gets called it allocates the storage it needs in the match_results class; subsequent calls then re-use this storage. This is efficient - in fact the cost of a single memory allocation is about 10 times that of a simple regex_match attempt - so this is very important IMO. In fact I spent a lot of last year eliminating unnecessary memory allocations from regex, and there are some more I intend to stamp on this year. Believe me, it makes a difference, and other libraries like GRETA and PCRE have all been through the same process for the same reasons.

In contrast, if regex_match returns a match_results structure then you effectively "pessimise" the performance for a small improvement in ease of use (although I admit that there are options similar to the small-string optimisation that *might* be applicable here).

BTW, just to be hyper-critical, your alternative code:

    std::string line;
    boost::regex pat("^Subject: (Re: )?(.*)");
    while (std::cin)
    {
        std::getline(std::cin, line);
        if (boost::smatch m = boost::regex_match(line, pat))
            std::cout << m[2];
    }

contains an assignment inside an if condition, which, while "neat", I have often seen criticised as potentially error-prone. There are even some compilers that throw out a helpful(!) warning if you do that (along the lines of "didn't you want to use operator==?").
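To illustrate the reuse pattern described above, here is a minimal sketch. It makes two assumptions: std::regex stands in for Boost.Regex so that the example needs nothing outside the standard library, and the function name extract_subjects is made up for illustration. Whether a given implementation really reuses the storage held by the match_results object between calls is an implementation detail. The sketch only shows the calling pattern that makes such reuse possible.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the subject text (capture group 2) of
// every line that matches. The single smatch object lives outside the
// loop, so storage it acquires on the first call is available to
// later calls instead of being constructed afresh each time.
std::vector<std::string> extract_subjects(const std::vector<std::string>& lines)
{
    static const std::regex pat("^Subject: (Re: )?(.*)");
    std::smatch matches;                 // one object, reused every iteration
    std::vector<std::string> subjects;
    for (const std::string& line : lines) {
        if (std::regex_match(line, matches, pat))
            subjects.push_back(matches[2].str());
    }
    return subjects;
}
```

Declaring the smatch inside the loop body instead would construct and destroy it on every iteration, which is roughly the cost a return-by-value interface would impose.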
One other thing - the regex_match overload that doesn't take a match_results as a parameter currently returns bool. The intent is that if the user doesn't need the info generated in the match_results, then some time can be saved by not storing it. Boost.Regex doesn't currently take advantage of that, but I was planning to in the next revision (basically you can cut out memory allocation altogether, and that's an order of magnitude saving).
But I do need the match results, when the match succeeds.
I understand that, but there is a group of users who don't - one example is a (commercial) email spam-filter that uses Boost.Regex. It only needs a true/false result "does this message have this pattern or not", and it wants the answer as fast as possible. For uses like this even a small change in performance can make the difference between "coping" and "not coping" with the email traffic they're seeing these days.
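For a yes/no use like this one, the call needs no match_results at all. A minimal sketch, again assuming std::regex as a stand-in for Boost.Regex and using a made-up function name, is_subject_line:

```cpp
#include <regex>
#include <string>

// Hypothetical predicate: reports only whether the whole line matches.
// No match_results object is passed, so the library is free to skip
// recording sub-expression positions (whether it actually does so
// depends on the implementation).
bool is_subject_line(const std::string& line)
{
    static const std::regex pat("^Subject: (Re: )?(.*)");
    return std::regex_match(line, pat);  // bool-only overload
}
```

Changing the return type of this overload to a match_results would take away the opportunity to skip that bookkeeping.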
I guess my original suggestion of making it implicitly convertible to some safe_bool solves that problem. I guess I prefer that idea, though Allan probably has more experience with this than I do.
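As a rough sketch of what the safe_bool suggestion could look like, the wrapper below holds a match_results and converts to a pointer-to-member type, so it can be tested in a condition without converting to int. The names match_attempt and subject_of are hypothetical, std::regex stands in for Boost.Regex, and none of this is the actual Boost.Regex interface.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical wrapper: a match_results plus a success flag.
class match_attempt {
    typedef void (match_attempt::*safe_bool)() const;
    void true_tag() const {}
    std::smatch results_;   // declared before matched_, so initialised first
    bool matched_;
public:
    // The text must outlive this object: results_ stores iterators into it.
    match_attempt(const std::string& text, const std::regex& pat)
        : matched_(std::regex_match(text, results_, pat)) {}

    // Safe-bool idiom: non-null member pointer on success, null on failure.
    operator safe_bool() const
    { return matched_ ? &match_attempt::true_tag : 0; }

    const std::ssub_match& operator[](std::size_t i) const
    { return results_[i]; }
};

// Hypothetical usage: returns the subject text, or an empty string.
std::string subject_of(const std::string& line)
{
    static const std::regex pat("^Subject: (Re: )?(.*)");
    if (match_attempt m = match_attempt(line, pat))
        return m[2].str();
    return std::string();
}
```

Note that each call constructs a fresh match_results, so this form gives up the storage reuse discussed earlier in the thread in exchange for the shorter calling code.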
OK, let me mull this over, maybe we can find a way to keep everyone happy, maybe not ... John.