Hi,
I am trying to write a simple program which processes large gzipped files as
streams. However, it turns out that for some reason reading is very slow:
just counting lines takes almost 10 times longer than a Java equivalent or
"gunzip -c largefile.gz | wc -l". I would like to ask the experts to point
out which of my assumptions is wrong. The source is attached below.
Thanks in advance,
Andy
#include <iostream>
#include <fstream>
#include <string>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
int main(int argc, char** argv)
{
    char buf[10 * 4096];
    std::cout << argv[1] << '\n';
    long c = 0;

    // Open the compressed file and ask for a larger buffer on it.
    std::ifstream file(argv[1], std::ios_base::in | std::ios_base::binary);
    file.rdbuf()->pubsetbuf(buf, 10 * 4096);

    try {
        // Filter chain: gzip decompressor -> file.
        boost::iostreams::filtering_istream in;
        in.push(boost::iostreams::gzip_decompressor());
        in.push(file);

        // Count the lines of the decompressed stream.
        for (std::string str; std::getline(in, str); )
            ++c;
    }
    catch (const boost::iostreams::gzip_error& e) {
        std::cout << e.what() << '\n';
    }

    std::cout << c << '\n';
    return 0;
}