Hi there,

I am using regex_replace to find and replace a pattern. The code is shown below:

    string src = "xxx%R__xy\r\n%\r\nRyyyy% A__%%A\r\n%Rzzz%C%Appp_%C0123\r\n%Rooo";
    re = "(%R)|(%A)|(%C)";
    format = "(?1#R)(?2#A)(?3#C)";
    cout << "SOURCE:" << endl << src << endl << endl;
    regex_replace( src.begin(), src.begin(), src.end(), re, format, format_all );
    cout << "OUTPUT:" << endl << src << endl << endl;

1) For replacing %R with #R, %A with #A and %C with #C, and saving the output back to the source string, the above code does a good job. The output is

    xxx#R__xy
    %
    Ryyyy% A__%#A
    #Rzzz#C#Appp_#C0123
    #Rooo

NOTE that %%A is replaced with %#A.

2) If I change the format so that the replacement text is longer than the text it replaces, e.g.

    re = "(%R)|(%A)|(%C)";
    format = "(?1#RRR)(?2#AAA)(?3#CCC)";

regex_replace raises an error. I think the error comes from the original string not being long enough to store the output string. The problem can be solved by the following code:

    string src = "xxx%R__xy\r\n%\r\nRyyyy% A__%%A\r\n%Rzzz%C%Appp_%C0123\r\n%Rooo";
    string output=src;
    re = "(%R)|(%A)|(%C)";
    format = "(?1#RRR)(?2#AAA)(?3#CCC)";
    cout << "SOURCE:" << endl << src << endl << endl;
    regex_replace( output.begin(), src.begin(), src.end(), re, format, format_all);
    cout << "OUTPUT:" << endl << output << endl << endl;

But I do want the output stored in the original string. Reassigning the source string solves the problem, i.e. src = output, but for a large input string (in my case, >5M) this is not a good approach. I am looking for a better one.

BTW, if the output string is shorter than the original string, the output carries some extra characters, e.g.

    string src = "xxx%Ry%Az%Ce";
    string output=src;
    re = "(%R)|(%A)|(%C)";
    format = "(?1R)(?2A)(?3C)";

The output is "xxxRyAzCe%Ce", where the trailing "%Ce" is left over from the source string. How can I get rid of the extra characters?

3) Finally, I want to modify the search condition so that only %X and not %%X (where X can be R, A or C) is replaced. I tried the following regular expression:

    re = "([^%]%R)|([^%]%A)|([^%]%C)";

but it doesn't work properly for my problem. E.g. for src = "xxx%R %%R" and the format "(?1#R)(?2#A)(?3#C)", the regular expression removes the 'x' before %R, i.e. the output is "xx#R %%R" rather than "xxx#R %%R".

Please help! Thanks in advance.

llwaeva@21cn.com wrote:
> Hi there, I am using regex_replace to find_replace a pattern. The code is shown below
>
>     string src = "xxx%R__xy\r\n%\r\nRyyyy% A__%%A\r\n%Rzzz%C%Appp_%C0123\r\n%Rooo";
>     re = "(%R)|(%A)|(%C)";
>     format = "(?1#R)(?2#A)(?3#C)";
>     cout << "SOURCE:" << endl << src << endl << endl;
>     regex_replace( src.begin(), src.begin(), src.end(), re, format, format_all );
>     cout << "OUTPUT:" << endl << src << endl << endl;

OK, I've just realised what you're doing, and in short: DON'T DO THAT!!!!

regex_replace does not do in-place search and replace. YOU ABSOLUTELY CANNOT OVERWRITE A STRING YOU'RE STILL READING FROM!

Use:

    src = regex_replace(src, re, format, format_all);

instead.

John.
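
A minimal sketch of the copy-returning pattern recommended above, offered as an illustration and not as code from the thread. It uses std::regex from the C++11 standard library as a stand-in so that it builds without Boost. The conditional format syntax (?1...) is a Boost.Regex extension, so the sketch assumes a single capture group and the $1 placeholder instead. The helper name expand_markers is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): replaces %R, %A and %C
// with #RRR, #AAA and #CCC. The input is only read; the result is
// built in a separate string, so the output is never written over
// text that is still being scanned.
std::string expand_markers(const std::string& src) {
    // One capture group holds the marker letter (R, A or C).
    static const std::regex re("%([RAC])");
    // "$1" is the captured letter, emitted three times after '#'.
    return std::regex_replace(src, re, "#$1$1$1");
}

// Usage, storing the result back in the original variable:
//     src = expand_markers(src);
```

Because the function returns a temporary, the assignment back to src is a move in C++11 and later, so the new string is built once and not copied a second time. The result is also exactly as long as it needs to be, which avoids both the overflow with longer replacements and the leftover trailing characters with shorter ones.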
participants (2)
- John Maddock
- llwaeva@21cn.com