gzip_compressor cannot be flushed and closed properly
Hi,

I have a code block trying to output a series of gzip files. I wish to have all the files flushed after the block. But the files can only be flushed after exiting the whole program. The files still have zero size after I try to close them. I am using std::vector to store all the pointers to boost filtering_ostream. Can you help?

    //Open iostreams
    char filename[20];
    std::vector<boost::iostreams::filtering_ostream *> os_vector;
    for(i=0; i<4; i++)
    {
        sprintf(filename, "file_%d.gz", i );
        std::ofstream *of = new std::ofstream(filename, std::ios_base::binary);
        boost::iostreams::filtering_ostream* os = new boost::iostreams::filtering_ostream;
        os->push(boost::iostreams::gzip_compressor());
        os->push(*of);
        os_vector->push_back(os);
    }

    //Output something
    for(i=0; i<4; i++)
    {
        boost::iostreams::filtering_ostream* os = os_vector[i];
        os<<"Output something here" << std::endl;
    }

    //Close streams
    for(i=0; i<4; i++)
    {
        boost::iostreams::filtering_ostream* os = os_vector[i];
        os->strict_sync();
        os->pop();
        os->reset();
    }

Thanks,
Mengda
Mengda Wu wrote:
Hi,
I have a code block trying to output a series of gzip files. I wish to have all the files flushed after the block. But the files can only be flushed after exiting the whole program. The files still have zero size after I try to close them. I am using std::vector to store all the pointers to boost filtering_ostream. Can you help?
I haven't tried out your sample... but I am not sure if gzip filters are flushable?
Hi,

I am trying to save gzip files and close them in my program without quitting it. And I would like to read these files using another program at the same time. The problem is I cannot access the gzip files unless I quit my program. Do you know whether I can properly save and close the gzip files with boost iostreams?

Thanks,
Mengda

2008/2/18, eg <egoots@gmail.com>:
I haven't tried out your sample... but I am not sure if gzip filters are flushable?
Mengda Wu wrote:
Hi,
I am trying to save gzip files and close them in my program without quitting it. And I would like to read these files using another program at the same time. The problem is I cannot access the gzip files unless I quit my program. Do you know whether I can properly save and close the gzip files with boost iostreams?
The following works for me when I call it in a function using boost 1.33.1 (in Windows XP):

    using namespace std;
    namespace io = boost::iostreams;

    std::ifstream ifs(infile, std::ios_base::in | std::ios::binary);
    io::filtering_ostream out;
    out.push(io::gzip_compressor());
    out.push(io::file_sink(outfile, ios_base::out | ios_base::binary));
    out << ifs.rdbuf();
    ifs.close();
    out.flush();
    out.reset();
    // After the reset, the output file is closed.
The code works for me.

Thanks,
Mengda

2008/2/21, eg <egoots@gmail.com>:
The following works for me when I call it in a function using boost 1.33.1 (in Windows XP):
    using namespace std;
    namespace io = boost::iostreams;

    std::ifstream ifs(infile, std::ios_base::in | std::ios::binary);
    io::filtering_ostream out;
    out.push(io::gzip_compressor());
    out.push(io::file_sink(outfile, ios_base::out | ios_base::binary));
    out << ifs.rdbuf();
    ifs.close();
    out.flush();
    out.reset();
    // After the reset, the output file is closed.
Hi Mengda,

I'm sorry I didn't see this post sooner. If you include the library name in your message subject it is more likely that the library author will respond quickly.

Mengda Wu wrote:
Hi,
I have a code block trying to output a series of gzip files. I wish to have all the files flushed after the block. But the files can only be flushed after exiting the whole program. The files still have zero size after I try to close them. I am using std::vector to store all the pointers to boost filtering_ostream. Can you help?
What you are noticing is that data is not being written to disk until the streams are closed. This is not actually a bug, as I will explain, but it still may warrant a change to the library.

flush() is just a suggestion; in general you can't force a filter to output all the filtered data it is currently storing, except at the end of the stream. There may be internal constraints, depending on the format of the data output by the filter, that dictate when new data is available for flushing. For example, an encryption filter might not be able to output any new characters until its input has length equal to a multiple of its block size, or until EOF occurs.

In the case of the gzip filters, Boost.Iostreams simply lets zlib determine when new characters in the filtered sequence are available. In your example, the compressed text is very short (21 characters) and it looks like zlib is simply waiting for more input before it spits anything out. When I run your example with 250K of uncompressed data, there is output written to disk before the streams are closed.

I have opened a ticket (http://svn.boost.org/trac/boost/ticket/1656) raising the question whether symmetric filters (including gzip) should attempt to force the underlying filtering algorithm to spit out as many bytes as possible when flush() is called.
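To make the block-size constraint concrete, here is a minimal sketch of a toy filter that can only emit whole blocks. BlockFilter and its write/flush/close members are hypothetical names invented for this illustration; this is not how zlib or Boost.Iostreams is implemented, only a model of why a flush request may legitimately produce nothing until the end of the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy filter (not a Boost or zlib type): it can only emit
// output in whole blocks of block_size bytes, like a block cipher might.
class BlockFilter {
public:
    explicit BlockFilter(std::size_t block_size) : block_size_(block_size) {
        assert(block_size_ > 0);
    }

    // Buffer incoming data; nothing is emitted yet.
    void write(const std::string& data) { pending_ += data; }

    // A flush request: emit every complete block, keep the remainder.
    // With less than one block buffered, this returns an empty string.
    std::string flush() {
        const std::size_t whole = (pending_.size() / block_size_) * block_size_;
        std::string out = pending_.substr(0, whole);
        pending_.erase(0, whole);
        return out;
    }

    // End of stream: emit complete blocks, then zero-pad and emit the rest.
    std::string close() {
        std::string out = flush();
        if (!pending_.empty()) {
            pending_.resize(block_size_, '\0');
            out += pending_;
            pending_.clear();
        }
        return out;
    }

private:
    std::size_t block_size_;
    std::string pending_;
};
```

With a 64-byte block, writing the 22-byte line "Output something here\n" and calling flush() yields no bytes at all; only close() produces output, as one padded 64-byte block.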
    //Open iostreams
    char filename[20];
    std::vector<boost::iostreams::filtering_ostream *> os_vector;
    for(i=0; i<4; i++)
    {
        sprintf(filename, "file_%d.gz", i );
        std::ofstream *of = new std::ofstream(filename, std::ios_base::binary);
        boost::iostreams::filtering_ostream* os = new boost::iostreams::filtering_ostream;
        os->push(boost::iostreams::gzip_compressor());
        os->push(*of);
        os_vector->push_back(os);
    }

    //Output something
    for(i=0; i<4; i++)
    {
        boost::iostreams::filtering_ostream* os = os_vector[i];
        os<<"Output something here" << std::endl;
    }

    //Close streams
    for(i=0; i<4; i++)
    {
        boost::iostreams::filtering_ostream* os = os_vector[i];
        os->strict_sync();
        os->pop();
        os->reset();
    }
There are several other problems with this code. First, os_vector is not a pointer, so instead of os_vector->push_back(os) you should use os_vector.push_back(os). Second, the dynamically allocated ofstreams are leaked; when you add them to a filtering stream, it does not take ownership of them; it merely stores a reference. Third, the dynamically allocated filtering_ostreams are in danger of being leaked if an exception is thrown by any of the code following the allocation; you should consider some other method of storing the streams -- possibly using a ptr_vector (http://tinyurl.com/3x4yor). Fourth, it is useless to call pop() immediately before reset(): pop() removes the last element in a chain, while reset() removes all the elements in a chain.
Thanks, Mengda
Best Regards, -- Jonathan Turkanis CodeRage http://www.coderage.com
Jonathan Turkanis wrote:
What you are noticing is that data is not being written to disk until the streams are closed. This is not actually a bug, as I will explain, but it still may warrant a change to the library.
Actually, while everything I said is correct if you are using the Boost trunk, if you are using 1.34.1, the problem is that the dynamically allocated fstreams are not being closed until their destructors are called at program termination. You can fix this by:

  i. changing the way you store the ofstreams, so their destructors are called earlier;
  ii. manually closing the ofstreams after you are done writing to them; or
  iii. switching to Boost.Iostreams file_sinks or file_descriptor_sinks.

As I mentioned at the end of my last post, if you continue to use fstreams, you should change the way you store them, since currently they represent a resource leak.

Best,
--
Jonathan Turkanis
CodeRage
http://www.coderage.com
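Options (i) and (ii) can be sketched with the standard library alone. The following is only an illustration under stated assumptions: it leaves out the gzip layer entirely, write_files and file_size are hypothetical helper names, and std::unique_ptr is a C++11 facility that postdates this thread. In the real program the filtering_ostream would presumably have to be reset or destroyed before the underlying file is closed, so that the compressor's final bytes reach it first.

```cpp
#include <cassert>
#include <fstream>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: size of a file in bytes, or -1 if it cannot be opened.
std::streamoff file_size(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    return in ? std::streamoff(in.tellg()) : std::streamoff(-1);
}

// Hypothetical helper: writes `count` files named <prefix>_<i>.txt and
// returns their names. Every file is closed before the function returns.
std::vector<std::string> write_files(const std::string& prefix, int count) {
    assert(count > 0);
    std::vector<std::string> names;
    {
        // The vector owns the streams, so nothing is leaked even if an
        // exception is thrown part-way through.
        std::vector<std::unique_ptr<std::ofstream>> files;
        for (int i = 0; i < count; ++i) {
            names.push_back(prefix + "_" + std::to_string(i) + ".txt");
            files.push_back(std::unique_ptr<std::ofstream>(
                new std::ofstream(names.back(), std::ios::binary)));
        }
        for (auto& f : files) {
            *f << "Output something here\n";
        }
        // Option (ii): close a stream explicitly once writing is done.
        files.front()->close();
    }   // Option (i): the remaining ofstream destructors run here and close the files.
    return names;
}
```

After write_files returns, each file has its full contents on disk while the program keeps running, so another process can read it.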
participants (3): eg, Jonathan Turkanis, Mengda Wu