
Hello,

I have the following problem: I need to filter some records from one file and save them to another in my C++ application. For example, from this file:

/////////////////// in.txt ////////////////////////////////
http://google.com
http://yahoo.com
...
http://google.com/analytics
////////////////////////////////////////////////////////////

I want to extract only the lines that match the regex ^(?:http://google.com).* to get:

//////////////// out.txt ///////////////////////////////
http://google.com
...
http://google.com/analytics
/////////////////////////////////////////////////////////

So I wrote something like this:

#include <fstream>
#include <iostream>
#include <string>
#include <boost/regex.hpp>
#include <boost/iostreams/device/file.hpp>
#include <boost/iostreams/filter/regex.hpp>
#include <boost/iostreams/filtering_stream.hpp>

class Writer
{
public:
    Writer() : matchesCount_(0) {}
    virtual std::string operator() (const boost::match_results<const char*>& result)
    {
        // size() is the number of sub-matches in this match, not a running total
        matchesCount_ = result.size();
        return aux_;
    }
    int getMatchesCount() const { return matchesCount_; }
    virtual ~Writer() {}
private:
    std::string aux_; // this is completely useless, but I must return something in operator()
    int matchesCount_;
};
///////////////////////////////////////////////////////////////////////////////
class FileWriter : public Writer
{
public:
    FileWriter(std::ostream* of) : of_(of) {}
    std::string operator() (const boost::match_results<const char*>& result)
    {
        // write the whole match (sub-match 0) to the output file
        *of_ << *result.begin() << std::endl;
        return Writer::operator()(result);
    }
private:
    std::ostream* of_;
};
///////////////////////////////////////////////////////////////////////////////
int main(int argc, char *argv[])
{
    boost::regex match_lower("^(?:http://google.com).*");
    std::ofstream out("out.txt");
    boost::iostreams::filtering_istream first(
        boost::iostreams::regex_filter(match_lower, FileWriter(&out)));
    first.push(boost::iostreams::file_source("in.txt", std::ios_base::in));
    first.ignore(); // my output is a side effect of filtering, so I don't have to process this stream
    return 0;
}
/////////////////////////////////////////////////////////////////////////////

It works fine for short files (IMO, for files whose size is smaller than the size of the stream buffer).
But I work with very large files (~4.7 GB), and for those this is not a good solution. Do you have any idea how to solve this?

--
Regards,
Michał Nowotka
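
One possible direction, offered only as a minimal sketch: as far as I understand, regex_filter in Boost.Iostreams is built on aggregate_filter, which collects the whole input in memory before filtering, so a multi-gigabyte file is a poor fit for it. Because the records here are separate lines, each line can be read and tested on its own instead. The helper name filter_lines is hypothetical and not from the post, and std::regex is swapped in for boost::regex to keep the example free of external dependencies; boost::regex with boost::regex_search could presumably be used the same way.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): copies every line of `in`
// that matches `pattern` to `out` and returns the number of lines written.
// Only the current line is held in memory, so the input size does not matter.
std::size_t filter_lines(std::istream& in, std::ostream& out,
                         const std::regex& pattern)
{
    std::size_t written = 0;
    std::string line;
    while (std::getline(in, line)) {
        // The pattern is anchored with ^, so regex_search succeeds only
        // when the match begins at the start of the line.
        if (std::regex_search(line, pattern)) {
            out << line << '\n';
            ++written;
        }
    }
    return written;
}
```

With std::ifstream in("in.txt") and std::ofstream out("out.txt"), a call such as filter_lines(in, out, std::regex("^(?:http://google.com).*")) would write only the lines that the pattern matches, and the return value takes over the role of the match counter in Writer.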