Thanks for taking the time to answer, John.
> How desperate are you? Are there really that many regexes that loading them
> from their string representation is a problem?
Well, not *that* desperate - I'm looking at the existing options for now.
The number of regexps is roughly 4000 currently, but unconstrained a priori,
and a lot of them are quite huge. Furthermore, the match is done for all
those regexps on a list that can contain around 1 million strings
(so recompiling each regexp every time is likely to be a problem).
The problem is the following: I'm supposed to implement a match
on general tokens, i.e. being able to code regexps that would
contain tokens like:
(CITY)
where CITY is defined elsewhere as a list of possible cities.
The only way I see to do this with boost-regexp is to translate those pseudo-regexps
into ones containing
(boston|chicago|.....)
I.e. replace all references to generic tokens with their expanded values - I'm afraid
that will use too much memory (if all loaded) or too much time (if recompiled each time) -
unless there is a way to refer to another regexp in a regexp?
Note that I'm just foreseeing problems here - if you tell me there's no solution then
I'll implement the expansion and see what the performance looks like. Just trying here
to code it right the 1st time :)
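To make the expansion idea concrete, here is a rough sketch of what I have in mind. It is only an illustration: TokenTable, expand_tokens and compile_all are names I made up, it uses std::regex just to stay self-contained (I'd expect the boost version to look much the same), and it assumes the token values are plain literals and that the patterns contain no escaped parentheses.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical token table: token name -> list of literal alternatives.
using TokenTable = std::map<std::string, std::vector<std::string>>;

// Replace every "(NAME)" whose NAME is a known token with "(alt1|alt2|...)".
// Assumption: alternatives contain no regex metacharacters, and the pattern
// has no escaped parentheses such as "\(".
std::string expand_tokens(const std::string& pattern, const TokenTable& tokens) {
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] == '(') {
            std::size_t close = pattern.find(')', i + 1);
            if (close != std::string::npos) {
                auto it = tokens.find(pattern.substr(i + 1, close - i - 1));
                if (it != tokens.end()) {
                    out += '(';
                    for (std::size_t k = 0; k < it->second.size(); ++k) {
                        if (k != 0) out += '|';
                        out += it->second[k];
                    }
                    out += ')';
                    i = close + 1;  // skip past the whole "(NAME)" reference
                    continue;
                }
            }
        }
        out += pattern[i++];  // ordinary character, copy as-is
    }
    return out;
}

// Expand and compile each pattern once, so the compiled objects can be
// reused across the whole list of input strings.
std::vector<std::regex> compile_all(const std::vector<std::string>& patterns,
                                    const TokenTable& tokens) {
    std::vector<std::regex> compiled;
    compiled.reserve(patterns.size());
    for (const auto& p : patterns)
        compiled.emplace_back(expand_tokens(p, tokens));
    return compiled;
}
```

With CITY defined as {boston, chicago}, the pattern "flights to (CITY)" expands to "flights to (boston|chicago)". Compiling everything up front avoids the recompilation cost but keeps all the expanded regexps in memory, which is exactly the trade-off I'm worried about.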
On a side note:
> regex under one locale and read back in under another, sadly bad things will
> very likely happen :-(
Not if you save the locale in the compiled version and throw if a mismatch occurs,
but anyway I understand serializing is not an easy thing to do.
Cheers
/jog