
Hi there,
I am using regex_replace for string replacement. I am working with a very large string (about 3M) and need to make the following replacements:
%R% --> #R#
%A% --> #R1#
%C% --> #R2#
and so on.
I am new to regular expressions and regex. Having read the documentation, I know how regex works, but I am still not sure whether regex_replace meets my needs.
1) I invoke regex_replace several times to replace different patterns. For a large input string, this is very slow. Can I call regex_replace once to do all the replacements, e.g. by calling
regex_replace( output, begin, end, "%R ,%A ,%C", "#R#, #R1#, #R2#", match_default);
to replace %R% with #R#, %A% with #R1# and %C% with #R2#?
2) According to the documentation, the form of the output depends on the match_flag_type. In my case, I want the output string to contain both the matched and the unmatched parts, i.e. the whole input string with the replacements applied. For example, with
SourceStr = "xxxx%R%yyy%A%zzz%C%"
after applying regex_replace( output_string, SourceStr.begin, SourceStr.end, ... ), I want
output_string = "xxxx#R#yyy#R1#zzz#R2#"
rather than #R#, #R1# and #R2#.
Any reply will be highly appreciated.

llwaeva@21cn.com wrote:
Hi there, I am using regex_replace for string replacement. I am working with a very large string (about 3M) and need to make the following replacements:
%R% --> #R# %A% --> #R1# %C% --> #R2#
and so on.
I am new to regular expressions and regex. Having read the documentation, I know how regex works, but I am still not sure whether regex_replace meets my needs.
1) I invoke regex_replace several times to replace different patterns. For a large input string, this is very slow. Can I call regex_replace once to do all the replacements, e.g. by calling regex_replace( output, begin, end, "%R ,%A ,%C", "#R#, #R1#, #R2#", match_default); to replace %R% with #R#, %A% with #R1# and %C% with #R2#?
You can: use something like "(%R%)|(%A%)|(%C%)" as the regex, and then use the replace string "(?1#R#)(?2#R1#)(?3#R2#)" with the flag "format_all" set (this enables Boost-specific format-string extensions that allow you to do conditional search and replace like this).
2) According to the document, the output fashion depends on the match_flag_type, for my case, I hope the output string, including both matched and nonmatched part, replace the whole input string. e.g.
SourceStr = "xxxx%R%yyy%A%zzz%C%" Apply regex_replace( output_string, SourceStr.begin, SourceStr.end, ... ) We have , output_string = "xxxx#R#yyy#R1#zzz#R2#" rather than #R#, #R1# and #R2#
Use "format_no_copy" to suppress copying unmatched parts to the output. Otherwise the default behaviour is to always copy unmatched parts of the input to output.
John.
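The single-pass idea in the reply above can also be illustrated without Boost. The following is only a minimal sketch using the C++ standard library's std::regex, which does not support the Boost-specific (?N...) conditional format strings, so the replacement is chosen in code by checking which capture group matched. The function name replace_markers and the replacement table are hypothetical; the regex and the replacement strings are the ones from the thread.

```cpp
#include <array>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: replaces %R%, %A% and %C% in a single pass over src.
std::string replace_markers(const std::string& src) {
    // One alternation with one capture group per marker (regex from the reply).
    static const std::regex re("(%R%)|(%A%)|(%C%)");
    // Replacement text for capture groups 1, 2 and 3 respectively.
    static const std::array<const char*, 3> replacements = {"#R#", "#R1#", "#R2#"};

    std::string out;
    out.reserve(src.size());
    auto last = src.cbegin();  // end of the previous match
    for (std::sregex_iterator it(src.cbegin(), src.cend(), re), end; it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);  // copy the unmatched text before this match
        for (std::size_t g = 1; g <= replacements.size(); ++g) {
            if (m[g].matched) {        // exactly one group takes part in each match
                out += replacements[g - 1];
                break;
            }
        }
        last = m[0].second;
    }
    out.append(last, src.cend());      // copy the unmatched tail
    return out;
}
```

Under these assumptions, replace_markers("xxxx%R%yyy%A%zzz%C%") yields "xxxx#R#yyy#R1#zzz#R2#": the unmatched text is kept and each marker is replaced while the input is scanned only once.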

on 2006-7-28 1:42:39, "John Maddock"
llwaeva@21cn.com wrote:
Hi there, I am using regex_replace for string replacement. I am working with a very large string (about 3M) and need to make the following replacements:
%R% --> #R# %A% --> #R1# %C% --> #R2#
and so on.
I am new to regular expressions and regex. Having read the documentation, I know how regex works, but I am still not sure whether regex_replace meets my needs.
1) I invoke regex_replace several times to replace different patterns. For a large input string, this is very slow. Can I call regex_replace once to do all the replacements, e.g. by calling regex_replace( output, begin, end, "%R ,%A ,%C", "#R#, #R1#, #R2#", match_default); to replace %R% with #R#, %A% with #R1# and %C% with #R2#?
You can, use something like:
"(%R%)|(%A%)|(%C%)"
as the regex, and then use the replace string:
"(?1#R#)(?2#R1#)(?3#R2#)"
with the flag "format_all" set (this enables Boost-specific format-string extensions that allow you to do conditional search and replace like this).
2) According to the document, the output fashion depends on the match_flag_type, for my case, I hope the output string, including both matched and nonmatched part, replace the whole input string. e.g.
SourceStr = "xxxx%R%yyy%A%zzz%C%" Apply regex_replace( output_string, SourceStr.begin, SourceStr.end, ... ) We have , output_string = "xxxx#R#yyy#R1#zzz#R2#" rather than #R#, #R1# and #R2#
Use "format_no_copy" to suppress copying unmatched parts to the output. Otherwise the default behaviour is to always copy unmatched parts of the input to output.
John.
Thanks for your help, but the result is still not what I want. Here is my program and its output; please check it for me.
    string src = "xxx%R%__xy\r\n%\r\nRyyyy%A__%A\r\n%R%zzz%C%%A%ppp_%C%0123\r\n%R%ooo";
    string output = src;
    re = "(%R%)|(%A%)|(%C%)";
    string format = "(?1#R#)(?2#A#)(?3#C#)";
    cout << "SOURCE:" << endl << src << endl << endl;
    regex_replace( output.begin(), src.begin(), src.end(), re, format, format_all | format_no_copy );
    cout << "OUTPUT:" << endl << output << endl << endl;
    cout << "SRC:" << endl << src << endl;
The output is:
    SOURCE:
    xxx%R%__xy
    %
    Ryyyy% A__%A
    %R%zzz%C%%A%ppp_%C%0123
    %R%ooo

    OUTPUT:
    #R##R##C##A##C##R#yy% A__%A
    %R%zzz%C%%A%ppp_%C%0123
    %R%ooo

    SRC:
    xxx%R%__xy
    %
    Ryyyy% A__%A
    %R%zzz%C%%A%ppp_%C%0123
    %R%ooo
This is not what I want. I would like the output, with the matching patterns replaced, to be copied back into the input string as well. For that purpose, I modified the program as follows:
    string src = "xxx%R%__xy\r\n%\r\nRyyyy%A__%A\r\n%R%zzz%C%%A%ppp_%C%0123\r\n%R%ooo";
    re = "(%R%)|(%A%)|(%C%)";
    format = "(?1#R#)(?2#A#)(?3#C#)";
    cout << "SOURCE:" << endl << src << endl << endl;
    regex_replace( src.begin(), src.begin(), src.end(), re, format, format_all);
    cout << "OUTPUT:" << endl << src << endl;
Now the output, with the matching patterns replaced, is copied exactly into the input string, i.e. after the replacement src became
    xxx#R#__xy
    %
    Ryyyy% A__%A
    #R#zzz#C##A#ppp_#C#0123
    #R#ooo
NOTICE that in the above code there is NO format_no_copy flag!
Thanks.
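A note on the two programs above: writing through output.begin() or src.begin() overwrites characters of an existing string and never grows it. The second program appears to work because each replacement (#R#, #A#, #C#) is exactly as long as the marker it replaces, which would not hold for replacements such as #R1#. One way to avoid relying on that is to append to an empty string through std::back_inserter. The sketch below shows this with the standard library's std::regex_replace, whose iterator overload takes an output iterator in the same position as the calls above. It is an illustration rather than the Boost call itself: the function names are hypothetical, and only a single marker is handled because std::regex lacks the (?N...) conditional format.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: replaces every %A% with #R1#, appending to a fresh string.
std::string replace_into_new_string(const std::string& src) {
    static const std::regex re("%A%");
    std::string output;  // starts empty; back_inserter grows it as needed
    std::regex_replace(std::back_inserter(output), src.begin(), src.end(), re, "#R1#");
    return output;
}

// Same call with format_no_copy: unmatched text is dropped, only replacements remain.
std::string replacements_only(const std::string& src) {
    static const std::regex re("%A%");
    std::string output;
    std::regex_replace(std::back_inserter(output), src.begin(), src.end(), re, "#R1#",
                       std::regex_constants::format_no_copy);
    return output;
}
```

With the input "yyy%A%zzz%A%", the first function yields "yyy#R1#zzz#R1#" and the second yields "#R1##R1#", which mirrors the difference the format_no_copy flag makes in the programs above.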