Hi there,

I wrote a piece of code based on Boost to handle MIME headers. The program substitutes the MIME header fields (from, to, subject, etc.) into a template that contains specific macros. Because a macro may occur anywhere in the template, once or more, I use replace_all_regex.

First of all, I need to transform the substitute string to escape special characters:

    // input is a substitute string that will replace the macro
    string& escape_format( string& input )
    {
        replace_all( input, "$", "$$" );
        replace_all( input, "\\", "\\\\" );
        replace_all( input, ":", "\\:" );
        replace_all( input, "?", "\\?" );
        replace_all( input, "(", "\\(" );
        replace_all( input, ")", "\\)" );
        return input;
    }

There are several macros defined; some of them are listed below:

    %s  represents the subject (e.g. Re: hi philip!)
    %S  represents the subject without the prefix "Re:" (e.g. hi philip!)
    %f  represents the "from" (e.g. tom@hotmail.com)
    %t  represents the "to" (e.g. philip@hotmail.com)
    ...

With regex, I can replace all the macros with the help of a format string, something like:

    replace_all_regex( source,
                       boost::regex("(%f)|(%t)|(%s)|(%S)"),
                       "(?1 xxx)(?2 yyy)(?3 aaa)(?4 bbb)",
                       format_all );

To get the subject without the prefix "Re:", I use erase_regex as follows:

    string& deRe(string& input)
    {
        boost::regex regexp("^[[:space:]]*Re:[[:space:]]*");
        erase_regex( input, regexp );
        return input;
    }

So simple!!! However, the order and number of macros change sometimes, so I need a better way to build the regex and the format string. I've got a function to do that:

    // rformat is a stringstream for building the format string
    // text is a substitute string
    // re_text is a string for building the regular expression
    // re is a string or regular expression, i.e. the macro
    // format_num is a global integer holding the current number of
    // format items (initialized to zero)
    void format_text(stringstream& rformat, string& text, string& re_text, const string& re)
    {
        format_num++;
        rformat << "(?" << format_num << " " << escape_format(text) << ")";
        if (format_num>1)
            re_text += "|";
        re_text += "(" + re + ")";
    }

Now everything is ready to build the regular expression and the substitution format dynamically. Here is an example:

    string regex_text;
    stringstream reformat;
    string source;  // source is a string containing text and macros

    // spHeader is a char* obtained from the MIME parser.
    //
    if (spHeader) {
        format_text( reformat, string(spHeader), regex_text, "%s" );        // for %s
        format_text( reformat, deRe(string(spHeader)), regex_text, "%S" );  // for %S

        // Suppose spHeader is "Re: hi philip!", then
        //   regex_text     = "(%s)|(%S)"
        //   reformat.str() = "(?1 Re: hi philip!)(?2 hi philip!)"
        replace_all_regex( source, boost::regex(regex_text), reformat.str(), format_all );
    }

Everything is fine except that there are some meaningless characters at the end of the output string (still source). After checking the code carefully, I found that the problem may be caused by the format_text call for %S: commenting out the second format_text call solves the problem. However, commenting out only the first format_text call also solves the problem. This is confusing me!

Could someone please point me in the right direction? Thanks.
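For comparison, here is a minimal sketch of the same macro expansion done without a format string, so the replacement text never needs escaping. It is not a diagnosis of the stray characters described above. It swaps in std::regex from the standard library in place of Boost, the helper names strip_re_prefix and expand_macros are hypothetical, and it assumes every macro is a '%' followed by a single letter.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of the subject with a leading
// "Re:" prefix (and surrounding whitespace) removed. Working on a copy
// avoids binding a temporary to a non-const reference.
std::string strip_re_prefix(const std::string& subject) {
    static const std::regex prefix("^[[:space:]]*Re:[[:space:]]*");
    return std::regex_replace(subject, prefix, "",
                              std::regex_constants::format_first_only);
}

// Hypothetical helper: scans the template once and replaces each macro
// found in the table with its literal text. Macros missing from the
// table are copied through unchanged.
std::string expand_macros(const std::string& source,
                          const std::map<std::string, std::string>& table) {
    static const std::regex macro("%[A-Za-z]");  // assumed macro shape
    std::string out;
    std::size_t last = 0;
    for (std::sregex_iterator it(source.begin(), source.end(), macro), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position());
        out.append(source, last, pos - last);  // text before the macro
        const auto found = table.find(m.str());
        out += (found != table.end()) ? found->second : m.str();
        last = pos + static_cast<std::size_t>(m.length());
    }
    out.append(source, last, std::string::npos);  // trailing text
    return out;
}
```

Because the substitute text is appended literally, characters such as '$', ':' or '(' in a header need no special treatment, and adding or reordering macros only changes the table.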
llwaeva@21cn.com