Functions as Filters (was Program Reuse ...)

Thank you very much to everyone who has been helping me out and has made some suggestions. It took me a bit to realize the implications of what some people were suggesting, so I apologize if I am a bit slow to catch up.

I now have code which allows two arbitrary functions (which have no parameters and return void) to be chained together, from the cout of one to the cin of the other:

  void HelloWorld() {
    cout << "hello world" << endl;
  }

  void CountChars() {
    string s;
    getline(cin, s);
    cout << static_cast<int>(s.size()) << endl;
  }

  int main() {
    filter(HelloWorld) | filter(CountChars);
    return 0;
  }

This is done by redirecting cout for the first function to a temporary stringstream buffer, and then redirecting cin for the second function to read from that buffer. The code to accomplish this is trivial, but potentially interesting for the community:

  #include <iostream>
  #include <sstream>

  using namespace std;

  typedef void(*procedure)();

  class filter {
  public:
    filter(procedure x) : proc(x) { }
    void operator()(istream& in, ostream& out) {
      streambuf* inbuf = cin.rdbuf();
      streambuf* outbuf = cout.rdbuf();
      cin.rdbuf(in.rdbuf());
      cout.rdbuf(out.rdbuf());
      proc();
      cin.rdbuf(inbuf);
      cout.rdbuf(outbuf);
    }
  private:
    procedure proc;
  };

  void operator|(filter f1, filter f2) {
    stringstream s;
    f1(cin, s);
    s.seekg(0);
    f2(s, cout);
  }

Many good suggestions have been made so far. Some possibilities for more features are:

1) create wrapper objects around the filters; this would allow the passing of data to functions, like a command line string
2) chain arbitrarily long sequences by creating a pipeline object and doing the piping in its destructor
3) allow the chaining of streams
4) allow the chaining of the various iostreams concepts
5) allow the chaining of FILE* (i.e. popen, etc.)
6) allow the chaining of processes
7) allow threading of functions

Any comments or suggestions? Is this a direction people would like to see continued? Also, is anyone interested in collaborating?

Christopher Diggins
http://www.cdiggins.com
http://www.heron-language.com

On Sat, 08 Jan 2005 14:24:51 -0500, christopher diggins <cdiggins@videotron.ca> wrote:
Thank you very much to everyone who has been helping me out, and have made some suggestions. It took me a bit to realize the implications of what some people were suggesting, so I apologize if I am a bit slow to catch up.
I now have code which allows two arbitrary functions ( which have no parameters and return void ) to be chained together from cout of one to the cin of the other:
void HelloWorld() { cout << "hello world" << endl; }
void CountChars() { string s; getline(cin, s); cout << static_cast<int>(s.size()) << endl; }
int main() { filter(HelloWorld) | filter(CountChars); return 0; }
This is done by redirecting cout for the first function to a temporary stringstream buffer, and then redirecting cin for the second function to read from that buffer. The code to accomplish this is trivial, but potentially interesting for the community:
#include <iostream>
#include <sstream>
using namespace std;
typedef void(*procedure)();
class filter {
public:
  filter(procedure x) : proc(x) { }
  void operator()(istream& in, ostream& out) {
    streambuf* inbuf = cin.rdbuf();
    streambuf* outbuf = cout.rdbuf();
    cin.rdbuf(in.rdbuf());
    cout.rdbuf(out.rdbuf());
    proc();
    cin.rdbuf(inbuf);
    cout.rdbuf(outbuf);
  }
private:
  procedure proc;
};
void operator|(filter f1, filter f2) {
  stringstream s;
  f1(cin, s);
  s.seekg(0);
  f2(s, cout);
}
Many good suggestions have been made so far. Some possibilities for more features are:
1) create wrapper objects around the filters; this would allow the passing of data to functions, like a command line string
2) chain arbitrarily long sequences by creating a pipeline object and doing the piping in its destructor
3) allow the chaining of streams
4) allow the chaining of the various iostreams concepts
5) allow the chaining of FILE* (i.e. popen, etc.)
6) allow the chaining of processes
7) allow threading of functions
Any comments or suggestions? Is this a direction people would like to see continued? Also is anyone interested in collaborating?
Allow the two functions to be run as coroutines, and you'll have something really interesting. --Matt

----- Original Message ----- From: "Matt Austern" <austern@gmail.com>
<cdiggins@videotron.ca> wrote:
Any comments or suggestions? Is this a direction people would like to see continued? Also is anyone interested in collaborating?
Allow the two functions to be run as coroutines, and you'll have something really interesting.
That would be quite feasible, I believe, using boost::threads. I think that this would make most sense as a separate class, i.e.

  co_filter(MyProc) | co_filter(MyProc)

Do you agree?

On another note, I am thinking that it makes perfect sense to allow these kinds of pipe expressions to be combined with devices and filters as defined in the boost::iostreams library:

  source | input_filter | filter(MyProc) | output_filter | sink;

I am concerned about the name "filter" because it clashes with iostreams::filter; perhaps fxn_filter and co_fxn_filter make more sense? Any suggestions?

Christopher Diggins
http://www.cdiggins.com
http://www.heron-language.com

"christopher diggins" wrote:
Allow the two functions to be run as coroutines, and you'll have something really interesting.
That would be quite feasible, I believe, using boost::threads. I think that this would make most sense as a separate class, i.e.
It could be done even w/o threads, using some cooperative model of execution. Stephen Dewhurst gave an example of such an approach in http://www.semantics.org/once_weakly/w11_judgement.pdf

/Pavel
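As a minimal illustration of that switch-based technique, here is a self-contained sketch (an illustration only, not code taken from Dewhurst's article): a function that "yields" a sequence of values across calls by jumping back into the middle of its loop.

  #include <iostream>

  // Returns 0, 1, 2 on successive calls, then -1. The case label inside
  // the loop lets execution resume just after the return on the next call.
  int next_value() {
      static int state = 0;
      static int i;
      switch (state) {
      case 0:
          for (i = 0; i < 3; ++i) {
              state = 1;
              return i;      // "yield" i to the caller
      case 1: ;              // execution resumes here on the next call
          }
      }
      state = 0;             // sequence exhausted; reset for reuse
      return -1;
  }

  int main() {
      for (int v = next_value(); v != -1; v = next_value())
          std::cout << v << '\n';
  }

All state lives in statics here, which is exactly the limitation (no recursion, not reentrant) that the macro libraries mentioned below try to work around.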

Pavel Vozenilek wrote:
It could be made even w/o threads, using some cooperative model of execution.
Stephen Dewhurst gave example of such way in http://www.semantics.org/once_weakly/w11_judgement.pdf
Simon Tatham also wrote about it here: http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html He disagrees with Stephen Dewhurst's dislike of macros, though.

A while ago I wrote a library which did something similar for C++ and allowed the 'coroutines' to recurse, have parameters and local variables (which aren't easy because they have to be stored between calls and can't be defined inside a switch statement) and be members of a class. But I was never very happy with it, mainly because the implementation involves some pretty messy preprocessor metaprogramming and the coroutines had to be written using a strange macro-based language. For example, a simple 'toupper' coroutine:

  // Roughly equivalent to:
  // char toupper(std::istream& in)
  COROUTINE(toupper,
      yield(char)
      parameter(std::istream&, in)
      local(char, c))
  {
      while (in.get(c)) {
          COROUTINE_YIELD_VALUE(std::toupper(c));
      }
  }
  COROUTINE_END

  int main() {
      coroutines::coroutine<char> co(toupper(std::cin));

      // Coroutines return a boost::optional which is empty
      // when the coroutine exits.
      boost::optional<char> c;
      while ((c = co.resume())) {
          std::cout << *c;
      }
  }

If anyone's interested I'll make the code available. I'll need to fix it up a little first, as it currently only works on old versions of gcc and isn't documented.

Daniel

"Daniel James" wrote: [coroutines]
A while ago I wrote a library which did something similar for C++ and allowed the 'coroutines' to recurse, have parameters and local variables (which aren't easy because they have to be stored between calls and can't be defined inside a switch statement) and be members of a class. But I was never very happy with it, mainly because the implementation involves some pretty messy preprocessor metaprogramming and the coroutines had to be written using a strange macro based language,
[snip code]
If anyone's interested I'll make the code available.

Yes, me is.

/Pavel

Pavel Vozenilek wrote:
"Daniel James" wrote:
If anyone's interested I'll make the code available.
Yes, me is. /Pavel
OK, you can get it from: http://myweb.tiscali.co.uk/calamity/code/coroutines.tar.gz I've added some quickly written (and pretty rough) documentation in a README file. Feel free to email me with any questions or suggestions. Be warned that the implementation is pretty incomprehensible at the moment. I was thinking of reimplementing it using less preprocessor code and more template stuff. This would make using it more C++-like, but also more verbose. If I removed the class-member coroutines, it could probably be a bit simpler. Daniel

Matt Austern wrote:
Allow the two functions to be run as coroutines, and you'll have something really interesting.
I think having a coroutine lib within boost would be really nice. However, I think it should not have an explicit dependency on boost::thread; rather, it should be regarded as an alternative means of concurrency. It might, e.g., be implemented in terms of the "fiber" API that is available on some operating systems. Having such a library would not only allow the "program reuse" the original poster describes, but would also make it possible to write iterators in terms of coroutines. Roland

christopher diggins wrote:
Thank you very much to everyone who has been helping me out, and have made some suggestions. It took me a bit to realize the implications of what some people were suggesting, so I apologize if I am a bit slow to catch up.
I now have code which allows two arbitrary functions ( which have no parameters and return void ) to be chained together from cout of one to the cin of the other:
Hi Christopher,

I don't have much time to write today but I'd like to make several observations about this discussion.

It seems to me that what you are trying to achieve is something that is already well-supported by the iostreams library. The iostreams library offers users a number of different ways to write filters, with the understanding that some methods will be more efficient or convenient for a particular purpose than another.

----

The basic types of filters are these:

- push filter (aka output filter): given a sequence of characters and a model of Sink, the filter writes filtered characters to the Sink using the generic output functions put() and write().

- pull filter (aka input filter): given a model of Source, the filter returns a specified number of characters from the filtered sequence.

Two important criteria in designing the filter and device concepts were these:

- They should not be hard-wired to deal with a particular type of upstream or downstream object; e.g., they should not be forced to deal with standard input and output streams. This is important for flexibility and efficiency.

- They should be able to filter small subsequences from the middle of an input sequence, rather than processing entire streams at once. This is important for memory usage and because some input sequences, such as continuous stock-tickers, have no natural end.

On top of these basic filters can be built filters with a more user-friendly interface or filters which are suited to some specific purpose. For instance:

- symmetric filter (useful for wrapping C filtering APIs such as zlib): given two character arrays, one for input and one for output, the filter consumes some characters from the input array and writes some characters from the filtered sequence to the output array.

- one_step_filter: given an entire input sequence as a std::vector, the filter appends the entire filtered sequence to a second std::vector.

----

one_step_filters are useful when memory usage is not an issue and when streams have a well-defined beginning and end. Using one_step_filters, it is simple to define filters which take an input stream and an output stream, read input from the input stream and write filtered output to the output stream:

  #include <boost/iostreams/device/back_inserter.hpp>
  #include <boost/iostreams/filter/one_step_filter.hpp>
  #include <boost/iostreams/stream_facade.hpp>

  template<typename Ch>
  class co_filter : public one_step_filter<Ch> {
      typedef std::vector<Ch> vector_type;

      virtual void do_filter( std::basic_istream<Ch>& in,
                              std::basic_ostream<Ch>& out ) = 0;

      // declared in one_step_filter:
      virtual void do_filter(const vector_type& src, vector_type& dest)
      {
          // Input stream which reads from src:
          stream_facade< basic_array_source<Ch> >
              in( &src[0], &src[0] + src.size() );

          // Output stream which appends to dest:
          stream_facade< boost::io::back_insert_device<vector_type> >
              out(boost::io::back_inserter(dest));

          do_filter(in, out);
      }
  };

Given the above definition, if you write a class which derives from co_filter and overrides the pure virtual function do_filter, you can add it to the filtering streams from the iostreams library and it will work as you have described, if I understand you correctly.
----

Regarding the pipe notation, currently it can only be used as follows:

  filtering_istream in(filter1() | filter2() | filter3() | source());

or

  filtering_ostream out(filter1() | filter2() | filter3() | sink());

However, I should be able to extend it so that if a chain contains both a source and a sink, boost::io::copy is invoked. E.g.,

  source() | filter1() | filter2() | filter3() | sink()

would be equivalent to

  filtering_ostream out(filter1() | filter2() | filter3() | sink());
  boost::io::copy(source, out);

If I can make this work, and there are no objections, I'll add it.

----
1) create wrapper objects around the filters, this would allow the passing of data to functions, like a command line string
Filters can already be passed arbitrary data.
2) chain arbitrarily long sequences by creating a pipeline object, and doing the piping in its destructor
Arbitrarily long sequences can already be chained.
3) allow the chaining of streams
I'm not sure what this means, but it can probably already be done ;-)
4) allow the chaining of the various iostreams concepts
See above.
5) allow the chaining of FILE* (i.e. popen, etc.) 6) allow the chaining of processes
This would be a good addition to the library, in the form of the system_filter I described here: http://tinyurl.com/53w9o
7) allow threading of functions
Do you mean this: support filters which think they are processing an entire stream at once, but really their threads are waiting on some synchronization object whenever there is no more input available, or output buffers are full?

Jonathan

Hi Christopher,
I don't have much time to write today but I'd like to make several observations about this discussion.
It seems to me that what you are trying to achieve is something that is already well-supported by the iostreams library. The iostreams library offers users a number of different ways to write filters, with the understanding that some methods will be more efficient or convenient for a particular purpose than another.
Hi Jonathan, Thanks for responding. I have now managed to get the iostreams library to do what I wanted (see my code below). The only thing missing now is the ability to chain sequences of filters.
Given the above definition, if you write a class which derives from co_filter and override the pure virtual function do_filter, you can add it to the filtering streams from the iostreams library and it will work as you have described, if I understand you correctly.
Using what you provided, here is the code which enables us to use an arbitrary procedure as an iostream filter:

  typedef void(*procedure)();

  class proc_as_filter : public co_filter<char> {
  public:
    proc_as_filter(procedure x) : proc(x) { }
    virtual void do_filter(basic_istream<char>& in, basic_ostream<char>& out) {
      streambuf* inbuf = cin.rdbuf();
      streambuf* outbuf = cout.rdbuf();
      cin.rdbuf(in.rdbuf());
      cout.rdbuf(out.rdbuf());
      proc();
      cin.rdbuf(inbuf);
      cout.rdbuf(outbuf);
    }
  private:
    procedure proc;
  };

  void ToUpperFilterFxn() {
    char ch;
    while (cin.get(ch)) {
      cout.put(toupper(ch));
    }
  }

  int main() {
    string s = "hello jonathan\n";
    filtering_ostream out;
    out.push(proc_as_filter(ToUpperFilterFxn));
    out.push(cout);
    boost::io::copy(stringstream(s), out);
    return 0;
  }

I compiled and successfully ran this on Visual C++ 7.1. What are the chances something like this could find its way into the iostreams library?
However, I should be able to extend it so that if a chain contains both a source and a sink, boost::io::copy is invoked. E.g.,
source() | filter1() | filter2() | filter3() | sink()
would be equivalent to
filtering_ostream out(filter1() | filter2() | filter3() | sink()); boost::io::copy(source, out);
If I can make this work, and there are no objections, I'll add it.
This is what I am ultimately striving for. Is there any reason why the source() and sink() can not be assumed to be cin and cout respectively when absent?
7) allow threading of functions
Do you mean this: support filters which think they are processing an entire stream at once, but really their threads are waiting on some synchronization object whenever there is no more input available, or output buffers are full?
What I want is to allow two proc_as_filter objects to be executed simultaneously, so that this code:

  proc_as_filter(Proc1) | proc_as_filter(Proc2)

runs optimally on a multi-processor machine. I don't know how hard this is; I am quite inexperienced in multithreaded code.

Christopher Diggins
http://www.cdiggins.com

hi,

christopher diggins <cdiggins@videotron.ca> writes:
7) allow threading of functions
Do you mean this: support filters which think they are processing an entire stream at once, but really their threads are waiting on some synchronization object whenever there is no more input available, or output buffers are full?
What I want is to allow two proc_as_filter objects to be executed simultaneously, so that this code:
proc_as_filter(Proc1) | proc_as_filter(Proc2)
Runs optimally on a multi-processor machine. I don't know how hard this is, I am quite inexperienced in multithreaded code.
Proc1 and Proc2 will be using the global cin and cout objects simultaneously if they are run in parallel.

mvrgr, frank

christopher diggins wrote:
Hi Christopher,
It seems to me that what you are trying to achieve is something that is already well-supported by the iostreams library. The iostreams library offers users a number of different ways to write filters, with the understanding that some methods will be more efficient or convenient for a particular purpose than another.
Hi Jonathan,
Thanks for responding. I have now managed to get the iostreams library to do what I wanted (see my code below). The only thing missing now is the ability to chain sequences of filters.
Filters can already be chained.
Given the above definition, if you write a class which derives from co_filter and override the pure virtual function do_filter, you can add it to the filtering streams from the iostreams library and it will work as you have described, if I understand you correctly.
Using what you provided, here is the code which enables us to use an arbitrary procedure as an iostream filter:
<code omitted>
I compiled and successfuly ran this on Visual C++ 7.1. What are the chances something like this could find its way into the iostreams library?
If you can convince me that it's useful.

If it really were possible to reuse existing code, as you originally suggested, then it would be a clear win. However, I haven't yet seen an example of a pre-existing procedure which meets the requirements of the proc_as_filter class.

Therefore, I first must decide whether writing a filter as a function which reads from standard input and writes to standard output is ever the best way to write a filter, given the other choices that the library provides. You clearly think the answer is yes, but I'd like to see some examples.

Second, is there any reason to prefer functions with the signature

  void (*) ()                                          [A]

over member functions taking references to an input stream and an output stream:

  struct myfilter {
      void filter(std::istream&, std::ostream&);       [B]
  };

This is more in keeping with the rest of the library. Finally, why shouldn't the signature be:

  struct myfilter {
      template<typename Source, typename Sink>
      void filter(Source& src, Sink& snk);             [C]
  };

? I expect the answer is that you want to be able to use formatted i/o functions in the implementation of filter(). Here, again, I'd like to see some examples to prove that it is useful.

Personally, I'd prefer [C], and I'd like to change the specification so that instead of filter() consuming all the characters in src, it reads some characters from src, writes some characters to snk and returns a status code, e.g. partial, eof or ok.
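To make that last idea concrete, here is a rough sketch (my own, not the library's actual specification) of what a [C]-style filter returning a status code might look like, with std::istream/std::ostream standing in for the Source and Sink concepts and an arbitrary per-call limit standing in for "reads some characters":

  #include <cctype>
  #include <iostream>
  #include <sstream>

  enum filter_status { status_ok, status_eof };

  struct toupper_filter {
      // Process a bounded amount of input per call and report whether
      // more work may remain (status_ok) or the source is exhausted.
      template<typename Source, typename Sink>
      filter_status filter(Source& src, Sink& snk) {
          char c;
          for (int i = 0; i < 16; ++i) {        // at most 16 chars per call
              if (!src.get(c))
                  return status_eof;
              snk.put(static_cast<char>(
                  std::toupper(static_cast<unsigned char>(c))));
          }
          return status_ok;
      }
  };

  int main() {
      std::istringstream in("hello, world");
      toupper_filter f;
      while (f.filter(in, std::cout) == status_ok)
          ;                                     // keep pumping until EOF
      std::cout << '\n';
  }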
However, I should be able to extend it so that if a chain contains both a source and a sink, boost::io::copy is invoked. E.g.,
source() | filter1() | filter2() | filter3() | sink()
would be equivalent to
filtering_ostream out(filter1() | filter2() | filter3() | sink()); boost::io::copy(source, out);
If I can make this work, and there are no objections, I'll add it.
This is what I am ultimately striving for. Is there any reason why the source() and sink() can not be assumed to be cin and cout respectively when absent?
Yes. When you use the pipe operator you don't always want to form a complete chain:

  filtering_ostream out(filter1() | filter2() | filter3());
  out.push(sink());

In a real-world example the first and second lines might be at different locations in a program.
7) allow threading of functions
Do you mean this: support filters which think they are processing an entire stream at once, but really their threads are waiting on some synchronization object whenever there is no more input available, or output buffers are full?
What I want is to allow two proc_as_filter objects to be executed simultaneously, so that this code:
proc_as_filter(Proc1) | proc_as_filter(Proc2)
Runs optimally on a multi-processor machine. I don't know how hard this is, I am quite inexperienced in multithreaded code.
std::cin doesn't provide any way to distinguish between EOF and input being temporarily unavailable. Therefore Proc2 must assume that EOF has been reached the first time std::cin.eof() returns true, for otherwise it could block indefinitely at EOF, waiting for more input to become available. Therefore Proc1 must have finished execution before Proc2 begins, for otherwise, if Proc2 consumes input faster than Proc1 generates output, a false EOF will be detected.

Jonathan

If it really were possible to reuse existing code, as you originally suggested, then it would be a clear win. However, I haven't yet seen an example of a pre-existing procedure which meets the requirements of the proc_as_filter class. Therefore, I first must decide whether writing a filter as a function which reads from standard input and writes to standard output is ever the best way to write a filter, given the other choices that the library provides. You clearly think the answer is yes, but I'd like to see some examples.
Second, is there any reason to prefer functions with the signature
void (*) () [A]
This is too strict, since it forces the use of cin and cout... Any other possible use would require redirecting the streams (which might not always be possible, particularly if there are multiple threads, or if some code assumes that cin and cout really are the standard streams and can be used freely).
over member functions taking references to an input stream and an output stream:
struct myfilter { void filter(std::istream&, std::ostream&); [B] };
This is more in keeping with the rest of the library.
This is better, but why not use operator() instead of a named function, since that would allow us to use an ordinary function or function object, and it would work well with boost::function?
Finally, why shouldn't the signature be:
struct myfilter { template<typename Source, typename Sink> void filter(Source& src, Sink& snk); [C] };
?
This, combined with my suggestion of using operator(), would be the best solution in my opinion. But if the Source and Sink can have arbitrary types, we should have a set of traits that would allow the code to work with most or all types of Source and Sink. OTOH, if someone only needs to support one kind of stream in their algorithm, they might prefer to implement the non-template version.
I expect the answer is that you want to be able to use formatting i/o functions in the implementation of filter(). Here, again, I'd like to see some examples to prove that it is useful.
Personally, I'd prefer [C], and I'd like to change the specification so that instead of filter() consuming all the characters in src, it reads some characters from src, writes some characters to snk and returns a status code, e.g. partial, eof or ok.
However, I should be able to extend it so that if a chain contains both a source and a sink, boost::io::copy is invoked. E.g.,
source() | filter1() | filter2() | filter3() | sink()
would be equivalent to
filtering_ostream out(filter1() | filter2() | filter3() | sink()); boost::io::copy(source, out);
If I can make this work, and there are no objections, I'll add it.
This is what I am ultimately striving for. Is there any reason why the source() and sink() can not be assumed to be cin and cout respectively when absent?
Yes. When you use the pipe operator you don't always want to form a complete chain:
filtering_ostream out(filter1() | filter2() | filter3()); out.push(sink());
Well, passing filters to the constructor might be more limiting than working with objects. So I think we should be able to do something like:

  filtering_ostream out;
  out < cin;
  out | filter1() | filter2();
  out | filter3();
  out > cout;

Philippe

Philippe Mori wrote:
... is there any reason to prefer functions with the signature
void (*) () [A]
This is too strict, since it forces the use of cin and cout... Any other possible use would require redirecting the streams (which might not always be possible, particularly if there are multiple threads, or if some code assumes that cin and cout really are the standard streams and can be used freely).
I'm still looking for an example showing that it's useful at all.
over member functions taking references to an input stream and an output stream:
struct myfilter { void filter(std::istream&, std::ostream&); [B] };
This is more in keeping with the rest of the library.
This is better, but why not use operator() instead of a named function, since that would allow us to use an ordinary function or function object, and it would work well with boost::function?
All the filter and device concepts are formulated in terms of ordinary functions rather than operators. This is so that read, write, close etc. can be distinguished. If there were lots of existing function object types lying around which took references to an istream and an ostream and had the correct semantics, it might be worth providing direct support for them. But I don't know of a single example. Also, I can't see how you would use boost::function here.
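For illustration only, here is a small sketch (mine, not part of the library; the class name is hypothetical) of how a boost::function could carry a [B]-style callable, which may be roughly what Philippe has in mind:

  #include <cctype>
  #include <iostream>
  #include <sstream>
  #include <boost/function.hpp>

  // Hypothetical wrapper: adapts any callable taking (istream&, ostream&)
  // to the named-member-function style labelled [B] above.
  class function_filter {
  public:
      explicit function_filter(
          const boost::function<void (std::istream&, std::ostream&)>& f)
          : f_(f) { }

      void filter(std::istream& in, std::ostream& out) { f_(in, out); }

  private:
      boost::function<void (std::istream&, std::ostream&)> f_;
  };

  // An ordinary function it could wrap:
  void to_upper(std::istream& in, std::ostream& out) {
      char c;
      while (in.get(c))
          out.put(static_cast<char>(
              std::toupper(static_cast<unsigned char>(c))));
  }

  int main() {
      function_filter f(&to_upper);
      std::istringstream in("hello");
      f.filter(in, std::cout);
      std::cout << '\n';
  }

Whether such a wrapper pulls its weight is, of course, exactly the question Jonathan raises.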
Finally, why shouldn't the signature be:
struct myfilter { template<typename Source, typename Sink> void filter(Source& src, Sink& snk); [C] };
This, combined with my suggestion of using operator(), would be the best solution in my opinion. But if the Source and Sink can have arbitrary types, we should have a set of traits that would allow the code to work with most or all types of Source and Sink.
The Source and Sink concepts are well defined. You access them using the functions boost::io::read, boost::io::write(), etc.
OTOH, if someone only needs to support one kind of stream in their algorithm, they might prefer to implement the non-template version.
Again, I haven't yet seen a single convincing example of either, so I'm in no position to judge which is preferable. What I'm looking for is a case where it's significantly easier to write a filter like so
struct myfilter { void filter(std::istream&, std::ostream&); [B] };
than using one of the existing types of filters, such as input filter, output filter or symmetric filter.
...Is there any reason why the source() and sink() can not be assumed to be cin and cout respectively when absent?
Yes. When you use the pipe operator you don't always want to form a complete chain:
filtering_ostream out(filter1() | filter2() | filter3()); out.push(sink());
Well, passing filters to the constructor might be more limiting than working with objects.
I don't follow.
So I think we should be able to do something like:
filtering_ostream out; out < cin; out | filter1() | filter2(); out | filter3(); out > cout;
Okay, but what is it supposed to do? ;-)
Philippe
Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
I compiled and successfuly ran this on Visual C++ 7.1. What are the chances something like this could find its way into the iostreams library?
If you can convince me that it's useful.
It is not that useful, unless you are inexperienced in C++, or just want something that is simple, easy to use and "good enough" (I fall into all of these camps). I understand that these may not be sufficient motivations to make an addition to the library, especially one written in the style of the STL, which is so rigorous.
If it really were possible to reuse existing code, as you originally suggested, then it would be a clear win. However, I haven't yet seen an example of a pre-existing procedure which meets the requirements of the proc_as_filter class.
You mean a simple void procedure with no parameters, which reads from standard in, and outputs to standard out? The main() from many programs fits this bill.
Therefore, I first must decide whether writing a filter as a function which reads from standard input and writes to standard output is ever the best way to write a filter, given the other choices that the library provides.
It is possibly never the "best way", technically speaking.
You clearly think the answer is yes, but I'd like to see some examples.
Second, is there any reason to prefer functions with the signature [a, b, c] ?
The only reason to ever choose a over b or c, is simplicity and ease of use.
This is what I am ultimately striving for. Is there any reason why the source() and sink() can not be assumed to be cin and cout respectively when absent?
Yes. When you use the pipe operator you don't always want to form a complete chain:
filtering_ostream out(filter1() | filter2() | filter3()); out.push(sink());
In a real world example the first and second lines might be at different locations in a program.
But in the example of:

  source() | filter1() | filter2() | sink();

then why couldn't we just write:

  filter1() | filter2();

and assume that source and sink were cin and cout respectively?

Christopher Diggins
Object Oriented Template Library (OOTL)
http://www.ootl.org

christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
I compiled and successfuly ran this on Visual C++ 7.1. What are the chances something like this could find its way into the iostreams library?
If you can convince me that it's useful.
It is not that useful, unless you are inexperienced in C++, or just want something that is simple, easy to use and "good enough" (I fall into all of these camps). I understand that these may not be sufficient motivations to make an addition to the library, especially one written in the style of the STL, which is so rigorous.
I'm happy to add classes to make the library easier to use even if efficiency is sacrificed. one_step_filter is an example of this. I'm just looking for an example where it would be easier to write a filter as above than to use one of the existing filters concepts.
If it really were possible to reuse existing code, as you originally suggested, then it would be a clear win. However, I haven't yet seen an example of a pre-existing procedure which meets the requirements of the proc_as_filter class.
You mean a simple void procedure with no parameters, which reads from standard in, and outputs to standard out? The main() from many programs fits this bill.
Therefore, I first must decide whether writing a filter as a function which reads from standard input and writes to standard output is ever the best way to write a filter, given the other choices that the library provides.
Not conforming programs: main must return int. Also, you can't *use* the main function. E.g.,

  void f() {
    return main();   // error
  }
It is possibly never the "best way", technically speaking.
I'm willing to define 'best' broadly. It might mean easiest to teach, fastest to write, ... .
You clearly think the answer is yes, but I'd like to see some examples.
Second, is there any reason to prefer functions with the signature [a, b, c] ?
The only reason to ever choose a over b or c, is simplicity and ease of use.
I understand that's what you're arguing, and it's a perfectly acceptable justification. But I haven't seen an example yet.
... Is there any reason why the source() and sink() can not be assumed to be cin and cout respectively when absent?
Yes. When you use the pipe operator you don't always want to form a complete chain:
filtering_ostream out(filter1() | filter2() | filter3()); out.push(sink());
In a real world example the first and second lines might be at different locations in a program.
But in the example of:
source() | filter1() | filter2() | sink();
Then why couldn't we just write:
filter1() | filter2();
And assume that source and sink were cin and cout respectively.
The expression

  filter1() | filter2()                                [A]

yields a lightweight object which, when added to a chain c, has the effect of executing c.push(f1); c.push(f2), where f1 is the temporary instance of filter1 and f2 is the temporary instance of filter2.

Suppose evaluating [A] had the effect of copying input from std::cin and pumping it through f1, f2 and std::cout. How could you prevent this from happening in the ordinary case:

  filtering_ostream out( filter1() | filter2() );

?
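To illustrate why that must be so, here is a toy, self-contained sketch (not Boost.Iostreams' implementation; all names are made up) of such a lightweight object: evaluating the pipe expression only records the filters, and nothing is pumped until the result is pushed onto a chain.

  #include <cstddef>
  #include <iostream>
  #include <string>
  #include <vector>

  struct named_filter {                  // stand-in for a filter
      explicit named_filter(const std::string& n) : name(n) { }
      std::string name;
  };

  struct chain {                         // stand-in for a filtering stream
      void push(const named_filter& f) { std::cout << "push " << f.name << "\n"; }
  };

  struct pipeline {                      // result of f1 | f2: just a record
      std::vector<named_filter> parts;
      void push_onto(chain& c) const {
          for (std::size_t i = 0; i < parts.size(); ++i)
              c.push(parts[i]);
      }
  };

  pipeline operator|(const named_filter& a, const named_filter& b) {
      pipeline p;
      p.parts.push_back(a);
      p.parts.push_back(b);
      return p;
  }

  int main() {
      chain c;
      pipeline p = named_filter("f1") | named_filter("f2");  // nothing pumped here
      p.push_onto(c);                                        // now both are pushed
  }

If evaluating the pipe expression itself pumped data, the ordinary case Jonathan shows (passing the result to a filtering_ostream constructor) could not work.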
Christopher Diggins
Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
I'm happy to add classes to make the library easier to use even if efficiency is sacrificed. one_step_filter is an example of this. I'm just looking for an example where it would be easier to write a filter as above than to use one of the existing filters concepts.
Writing, learning, understanding and reading concepts is hard for inexperienced (and some experienced) C++ programmers. I still get confused and frustrated when using them, and I am not completely wet behind the ears. It took me a couple of hours to figure out how to write that trivial extension to your library, despite the fact that the library and documentation are extremely well written.
Not conforming programs: main must return int. Also, you can't *use* the main function. E.g.,
void f() { return main(); // error. }
I realize that. I meant that many of these programs can easily have their guts placed in void functions:

  void DoWork() {
    // do work
  }

  int main() {
    DoWork();
  }

By taking this one simple step, a person's code could then be easily reused. This does overlook the fact that most filter programs take parameters, which is trivially remedied.
The only reason to ever choose a over b or c, is simplicity and ease of use.
I understand that's what you're arguing, and it's a perfectly acceptable justification. But I haven't seen an example yet.
Consider the following program:

  int main() {
    char c;
    while (cin.get(c)) cout.put(toupper(c));
    return 0;
  }

Now let's say that after the fact, I want to reuse my code as a filter. First, using [c], I would have to rewrite it as:

  struct ToUpper {
    template<typename Source, typename Sink>
    void filter(Source& src, Sink& snk) {
      char c;
      while (src.get(c)) snk.put(toupper(c));
    }
  };

Notice the introduction of five new identifiers (filter, Source, src, Sink, snk) and the fact that the while line has to be rewritten. Now compare that to using [a], where I simply put the two lines in a void function:

  void ToUpper() {
    char c;
    while (cin.get(c)) cout.put(toupper(c));
  }

I think it is clear that this would be simpler to teach, document, read, write and refactor.
The expression
filter1() | filter2() [A]
Sorry, I thought it was a statement. Wouldn't it be useful to also allow one-liners with a separate syntax:

  source() > filter1() > filter2() > sink();

Christopher Diggins
Object Oriented Template Library (OOTL)
http://www.ootl.org

christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
I'm happy to add classes to make the library easier to use even if efficiency is sacrificed. one_step_filter is an example of this. I'm just looking for an example where it would be easier to write a filter as above than to use one of the existing filters concepts.
Writing, learning, understanding and reading concepts is hard for inexperienced (and some experienced) C++ programmers. I still get confused and frustrated when using them, and I am not completely wet behind the ears.
I agree with this. I'm happy to add features which make the library easier to use.
It took me a couple of hours to figure out how to write that trivial extension to your library, despite the fact that the library and documentation is extremely well written.
Thanks.
Not conforming programs: main must return int. Also, you can't *use* the main function. E.g.,
void f() { return main(); // error. }
I realize that. I meant that many of these programs can easily have their guts placed in void functions.
void DoWork() { // do work }
int main() { DoWork(); }
Okay, but do you really expect people to start writing programs this way? I think people would only do this to conform to your filter concept. My question is: why aren't the other concepts sufficient?
By taking this one simple step, a person's code could then be easily reused. This does overlook the fact that most filter programs take parameters which is trivially remedied.
Note that none of the other concepts has this problem.
The only reason to ever choose a over b or c, is simplicity and ease of use.
I understand that's what you're arguing, and it's a perfectly acceptable justification. But I haven't seen an example yet.
Consider the following program:
int main() { char c; while (cin.get(c)) cout.put(toupper(c)); return 0; }
...
I guess I should have asked for a *realistic* example. If you really write such simple programs you don't have to worry about reuse; it's simpler to write the whole program again from scratch.
The expression
filter1() | filter2() [A]
Sorry, I thought it was a statement. Wouldn't it be useful to also allow one liners with a separate syntax:
source() > filter1() > filter2() > sink();
As I mentioned in a previous message, I think this is a good idea, if it uses the pipe notation. I don't see why a different operator should be used. Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
void DoWork() { // do work }
int main() { DoWork(); }
Okay, but do you really expect people to start writing programs this way?
Yes, especially if the iostreams library provides the functionality to them. My point is that C++ code is pointlessly hard to reuse as is, and I am pushing for new ways to make small programs more reusable. This is incredibly important when managing large numbers of small programs (for instance library tests and demos). It is trivial to refactor code to make it look like the above: just cut and paste the main!
I think people would only do this to conform to your filter concept. My question is: why aren't the other concepts sufficient?
The other concepts are fine; they are just more obfuscated than most programmers require. Just imagine trying to explain how to use a filter concept in a way which makes sense to a Java / Delphi / C programmer. I think it is important to try and provide alternatives, where possible, which make sense to professional programmers who may not be familiar with the intricacies of generic programming techniques and functors.
By taking this one simple step, a person's code could then be easily reused. This does overlook the fact that most filter programs take parameters which is trivially remedied.
Note that none of the other concepts has this problem.
Noted. I think that all of the concepts should be supported! I am simply arguing the case for void(*)() as a valid concept for now.
I guess I should have asked for a *realistic* example. If you really write such simple programs you don't have to worry about reuse; it's simpler to write the whole program again from scratch.
First off, I do write programs as simple as that and I have a lot of them. This occurs frequently for testing, prototypes, demos, and systems admin. I strongly disagree with maintaining multiple code bases, rather than refactoring and reusing the code. As a professional coder I am always looking for ways to be more productive and have less code to manage.

Nonetheless, I do currently have a non-trivial program which converts C++ into a <pre></pre> html tag, CppToHtmlPreTag; it obviously operates on stdin and outputs to stdout. It looks essentially like this:

  void CppToHtmlPreTag() {
    // calls multiple other functions to do the work
  }

  int main() {
    CppToHtmlPreTag();
    return 0;
  }

I want to reuse this program in another program which outputs an entire Html Document with a header and footer (CppToHtmlDoc). The easiest way I can think of to do this is to write a new program such as (this is to a certain degree pseudo-code):

  struct CppToHtmlDoc {
    CppToHtmlDoc(string css, string title) : mCss(css), mTitle(title) { }
    void filter() {
      cout << "<html><head><title>" << mTitle;
      cout << "</title><link rel='stylesheet' type='text/css' href='";
      cout << mCss << "'/><body>";
      cin | CppToHtmlPreTag();
      cout << "</body></html>";
    }
    string mCss;
    string mTitle;
  };

  int main(int argc, char** argv) {
    assert(argc == 4);
    CppToHtmlDoc(argv[1], argv[2]) | filestream(argv[3]);
  }

So I wrote program 2 using the [b] approach you outlined, which I agree is superior for this program. I also managed to retain my original code precisely as is using the [a] approach. If I had to rewrite the original program to use a filter concept, I would have had to rewrite *all* of my functions to pass the Source and Sink types to each one, and to use src and snk instead of cin / cout.

I guess my point here is that I am able to refactor existing code more easily and quickly if you support [a] and [b] syntax. [c] is perfectly acceptable, and has its advantages in several scenarios, even though it is overkill for my work.
The expression
filter1() | filter2() [A]
Sorry, I thought it was a statement. Wouldn't it be useful to also allow one liners with a separate syntax:
source() > filter1() > filter2() > sink();
As I mentioned in a previous message, I think this is a good idea, if it uses the pipe notation. I don't see why a different operator should be used.
I just want to be able to write:

  filter1() | filter2();

as a statement, with the implicit understanding that it pumps from cin to cout. But you told me I can't have that, so I am offering ">" as a possible work-around. Implementation aside, from the POV of an end-user, isn't it obvious that the above statement should be equivalent to:

  cin | filter1() | filter2() | cout;

I see that this conflicts with the current meaning of filter1() | filter2(), so I would propose that instead that could be rewritten as filter1() + filter2(). As an end-user I expect a | to have executed the data pumping by the end of the statement. If it does so sometimes (i.e. source() | filter() | sink(); statements) but not other times (i.e. filter1() | filter2() expressions), then these are two separate and conflicting meanings of filter | filter, which I am not comfortable with.

best regards
CD

From: christopher diggins <cdiggins@videotron.ca>
From: "Jonathan Turkanis" <technews@kangaroologic.com>
void DoWork() { // do work }
int main() { DoWork(); }
Okay, but do you really expect people to start writing programs this way?
Yes, especially if the iostreams library provides the functionality to them. My point is that C++ code is pointlessly hard to reuse as is, and I am pushing for new ways to make small programs more reusable. This is incredibly important when managing large numbers of small programs (for instance library tests and demos). It is trivial to refactor code to make it look like the above: just cut and paste the main!
You assume that nothing fails in the above, since main() falls through and, therefore, returns zero. Are you proposing that main() should actually catch exceptions and, possibly, extract its exit status from the exception object?
I think people would only do this to conform to your filter concept. My question is: why aren't the other concepts sufficient?
The other concepts are fine, they are just more obfuscated than most programmers require. Just imagine trying to explain how to use a filter
You've admitted to being relatively new to C++. Can you rightly determine what "most programmers require?"
concept in a way which makes sense to a Java / Delphi / C programmer. I think it is important to try and provide alternatives where possible which makes sense to professional programmers who may not be familiar with the intricacies of generic programming techniques and functors.
A reasonable goal, though all C++ programmers need to become familiar with generic programming and function objects. These really aren't novel or academic techniques.
I guess I should have asked for a *realistic* example. If you really write such simple programs you don't have to worry about reuse; it's simpler to write the whole program again from scratch.
First off I do write programs as simple as that and I have a lot of them.
In the *nix tradition, such simple filters are exactly what's desirable, at least for assembly into pipelines.
This occurs frequently for testing, prototypes, demos, and systems admin. I strongly disagree with maintaining multiple code bases, rather than refactoring and reusing the code. As a professional coder I am always looking for ways to be more productive and have less code to manage.
I think Jonathan meant that a program as simple as your toupper example would more likely be written as a library function, to be reused in other programs, than as a standalone program.
Nonetheless, I do currently have a non-trivial program which converts C++ into a <pre></pre> html tag, CppToHtmlPreTag, it operates obviously on the stdin and outputs to stdout. It looks essentially like this:
void CppToHtmlPreTag() { // calls multiple other functions to do the work };
int main() { CppToHtmlPreTag(); return 0; }
So the result of this program is to write "<PRE>" to stdout, copy stdin to stdout, and write "</PRE>"? Why do you need a program for that? I can see needing a general purpose program for copying stdin to stdout such that a script can print/echo "<PRE>", call the no-op filter, and then print/echo "</PRE>", as well as other variations. You could even write a general purpose program that took two arguments -- strings -- that it writes before and after copying stdin to stdout.

On *nix, that "general purpose program for copying stdin to stdout" could be awk:

  #!/bin/sh
  echo "<PRE>"
  awk '{print}'
  echo "</PRE>"

A "general purpose program that took two arguments -- strings -- that it writes before and after copying stdin to stdout" can be, on *nix:

  #!/bin/sh
  if test -n "$1"; then
      echo "$1"
  fi
  awk '{print}'
  if test -n "$2"; then
      echo "$2"
  fi

The point is that there are several approaches to your simple goal that don't involve something as complex as you've built, and yet provide all of the benefits.

Another approach is something like Microsoft's rundll32.exe, which loads a DLL, locates a named entry point, and then passes some arguments to it. If the function signature requirements are a problem, you could create your own version. Assembling your chain would involve a batch/command file that calls rundll32.exe (or your tool) as many times as needed, using the shell's I/O redirection to assemble the pieces.

IOW, I don't think I see the value in rewriting existing code to conform to your filter's interface such that it can be assembled into a pipeline via C++ when pipelining is a forte of the shell (at least *nix shells).
I want to reuse this program in another program which outputs an entire Html Document with a header and footer (CppToHtmlDoc). The easiest way I can think of to do this is to write a new program such as (this is to a certain degree pseudo-code):
struct CppToHtmlDoc {
  CppToHtmlDoc(string css, string title) : mCss(css), mTitle(title) { }
  void filter() {
    cout << "<html><head><title>" << mTitle;
    cout << "</title><link rel='stylesheet' type='text/css' href='";
    cout << mCss << "'/><body>";
    cin | CppToHtmlPreTag();
    cout << "</body></html>";
  }
  string mCss;
  string mTitle;
};
int main(int argc, char** argv) {
  assert(argc == 4);
  CppToHtmlDoc(argv[1], argv[2]) | filestream(argv[3]);
}
So I wrote program 2 using the [b] approach you outlined, which I agree is superior for this program. I also managed to retain my original code precisely as is using the [a] approach. If I had to rewrite the original program to use a filter concept, I would have had to rewrite *all* of my functions to pass the Source and Sink types to each one, and to use src and snk instead of cin / cout.
This would be even easier to assemble via scripting and you don't need to compile anything or maintain source and binaries independently.
I guess my point here is that I am able to refactor existing code more easily and quickly if you support [a] and [b] syntax. [c] is perfectly acceptable, and has its advantages in several scenarios, even though it is overkill for my work.
That argument is even stronger for using scripting to assemble such building blocks.
I just want to be able to write:
filter1() | filter2();
as a statement, with the implicit understanding it pumps from cin and to cout.
You can do that if "filter1" and "filter2" are filter applications: #!/bin/sh filter1 | filter2 With that, you didn't have to make any changes to filter1 or filter2 to be able to form a pipeline. That's even simpler! -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

----- Original Message ----- From: "Rob Stewart" <stewart@sig.com> To: <boost@lists.boost.org> Cc: <boost@lists.boost.org> Sent: Tuesday, January 11, 2005 1:18 PM Subject: Re: [boost] Re: Re: Re: Re: Functions as Filters (was Program Reuse...)
From: christopher diggins <cdiggins@videotron.ca>
From: "Jonathan Turkanis" <technews@kangaroologic.com>
void DoWork() { // do work }
int main() { DoWork(); }
Okay, but do you really expect people to start writing programs this way?
Yes, especially if the iostreams library provides the functionality to them. My point is that C++ code is pointlessly hard to reuse as is, and I am pushing for new ways to make small programs more reusable. This is incredibly important when managing large numbers of small programs (for instance library tests and demos). It is trivial to refactor code to make it look like the above: just cut and paste the main!
You assume that nothing fails in the above, since main() falls through and, therefore, returns zero. Are you proposing that main() should actually catch exceptions and, possibly, extract its exit status from the exception object?
No I am not proposing that. I don't care at this point how a programmer chooses to deal with that.
I think people would only do this to conform to your filter concept. My question is: why aren't the other concepts sufficient?
The other concepts are fine, they are just more obfuscated than most programmers require. Just imagine trying to explain how to use a filter
You've admitted to being relatively new to C++. Can you rightly determine what "most programmers require?"
First off, I said "I'm not exactly wet behind the ears". Exactly how much, or how little I know about C++ should be irrelevant. I think it is extremely rude to put someone's credentials into question in such a discussion. If you disagree with something I said, then please state why, don't question the validity of my opinion.
concept in a way which makes sense to a Java / Delphi / C programmer. I think it is important to try and provide alternatives where possible which makes sense to professional programmers who may not be familiar with the intricacies of generic programming techniques and functors.
A reasonable goal, though all C++ programmers need to become familiar with generic programming and function objects. These really aren't novel or academic techniques.
No, but they are overkill in cases when a void procedure will do just as well.
Nonetheless, I do currently have a non-trivial program which converts C++ into a <pre></pre> html tag, CppToHtmlPreTag, it operates obviously on the stdin and outputs to stdout. It looks essentially like this:
void CppToHtmlPreTag() { // calls multiple other functions to do the work };
int main() { CppToHtmlPreTag(); return 0; }
So the result of this program is to write "<PRE>" to stdout, copy stdin to stdout, and write "</PRE>"? Why do you need a program for that?
Sorry, I was unclear: CppToHtmlPreTag adds <div class="xxx"></div> tags around the various syntactic elements so that a css document can control coloring. [big snip]
This would be even easier to assemble via scripting and you don't need to compile anything or maintain source and binaries independently.
I am perfectly aware of how to use the various shells and scripting languages. I don't see how that is relevant to a discussion of C++ techniques. CD

From: christopher diggins <cdiggins@videotron.ca>
From: "Rob Stewart" <stewart@sig.com>
From: christopher diggins <cdiggins@videotron.ca>
From: "Jonathan Turkanis" <technews@kangaroologic.com>
void DoWork() { // do work }
int main() { DoWork(); }
Okay, but do you really expect people to start writing programs this way?
Yes, especially if the iostreams library provides the functionality to them. My point is that C++ code is pointlessly hard to reuse as is, and I am pushing for new ways to make small programs more reusable. This is incredibly important when managing large numbers of small programs (for instance library tests and demos). It is trivial to refactor code to make it look like the above: just cut and paste the main!
You assume that nothing fails in the above, since main() falls through and, therefore, returns zero. Are you proposing that main() should actually catch exceptions and, possibly, extract its exit status from the exception object?
No I am not proposing that. I don't care at this point how a programmer chooses to deal with that.
If your framework doesn't account for exit status codes and doesn't provide an exception mechanism for communicating the exit status, then you assume nothing can fail, whether by omission or commission.
I think people would only do this to conform to your filter concept. My question is: why aren't the other concepts sufficient?
The other concepts are fine, they are just more obfuscated than most programmers require. Just imagine trying to explain how to use a filter
You've admitted to being relatively new to C++. Can you rightly determine what "most programmers require?"
First off, I said "I'm not exactly wet behind the ears". Exactly how much, or how little I know about C++ should be irrelevant. I think it is extremely rude to put someone's credentials into question in such a discussion. If you disagree with something I said, then please state why, don't question the validity of my opinion.
You've suggested that you weren't expert enough to explain various shortcomings, and here you've tried to present yourself as knowledgeable, if not expert, on what "most programmers require." I simply called you on that. Granted, I could have omitted the first sentence and avoided the trouble. For that, I'm sorry.
concept in a way which makes sense to a Java / Delphi / C programmer. I think it is important to try and provide alternatives where possible which makes sense to professional programmers who may not be familiar with the intricacies of generic programming techniques and functors.
A reasonable goal, though all C++ programmers need to become familiar with generic programming and function objects. These really aren't novel or academic techniques.
No, but they are overkill in cases when a void procedure will do just as well.
Then keep your arguments to the merits of the approach; don't bring the knowledge and skills of unmotivated programmers into your argument.
Nonetheless, I do currently have a non-trivial program which converts C++ into a <pre></pre> html tag, CppToHtmlPreTag, it operates obviously on the stdin and outputs to stdout. It looks essentially like this:
void CppToHtmlPreTag() { // calls multiple other functions to do the work };
int main() { CppToHtmlPreTag(); return 0; }
So the result of this program is to write "<PRE>" to stdout, copy stdin to stdout, and write "</PRE>"? Why do you need a program for that?
Sorry, I was unclear: CppToHtmlPreTag adds <div class="xxx"></div> tags around the various syntactic elements so that a CSS document can control coloring.
Ah, got it.
This would be even easier to assemble via scripting and you don't need to compile anything or maintain source and binaries independently.
I am perfectly aware of how to use the various shells and scripting languages. I don't see how that is relevant to a discussion of C++ techniques.
I had no idea of what you or others reading the thread knew, hence my discussing the matter. As to how scripting is relevant, I thought I made that pretty clear in what you snipped: shells are very good at I/O redirection and assembling multiple programs into a new program, without requiring that the code conform to any structure except using stdin and stdout. This raises the question of whether your idea has merit within C++. (If all you have is a hammer....) -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

----- Original Message ----- From: "Rob Stewart" <stewart@sig.com>
If your framework doesn't account for exit status codes and doesn't provide an exception mechanism for communicating the exit status, then you assume nothing can fail whether by omission or commission.
This is simply beyond the scope of my framework at this point. I don't think the ideas I proposed will ever see the boost::light, so this point is probably moot.
You've tried to suggest that you weren't expert enough to explain various shortcomings, and here you've tried to make yourself out to be knowledgeable, if not expert, on what "most programmers require." I simply called you on that.
I am very competent in C++; however, I have worked for quite a while in a lot of different languages, which helps give me insight into what programmers require. I think being an expert in C++ would be a hindrance when designing the interface for libraries. Besides, being an expert in C++ is a very subjective idea.
Granted, I could have omitted the first sentence and avoided the trouble. For that, I'm sorry.
No worries.
This would be even easier to assemble via scripting and you don't need to compile anything or maintain source and binaries independently.
I am perfectly aware of how to use the various shells and scripting languages. I don't see how that is relevant to a discussion of C++ techniques.
I had no idea of what you or others reading the thread knew, hence my discussing the matter. As to how scripting is relevant, I thought I made that pretty clear in what you snipped: shells are very good at I/O redirection and assembling multiple programs into a new program, without requiring that the code conform to any structure except using stdin and stdout. This raises the question of whether your idea has merit within C++. (If all you have is a hammer....)
I thought it was obvious that shell scripts are superior to C++ for several tasks. What concerns me is that whether or not something is easier in another language (e.g. bash or the DOS shell) shouldn't affect whether these capabilities have merit for C++. However, some concrete reasons for not combining scripts with C++ code are that most scripts are platform dependent and don't integrate well with C++. CD

Hi Christopher, I think I haven't made my main point clear, so let me try to rephrase it. Some filtering operations are easier to express using the InputFilter concept than the OutputFilter concept, and vice versa. For example, the tab-expanding filter http://home.comcast.net/~jturkanis/iostreams/libs/iostreams/doc/?path=4.1 is easier to express as an output filter, since whenever it encounters a tab character it can simply write the appropriate number of space characters to the provided Sink. If it were an InputFilter, whenever it read a tab character from the provided Source it would have to return a single space character and then record the number of space characters to return on subsequent calls to get(). Similarly, whenever a new filter concept is added, I'd like to have a concrete, real-world example (preferably several examples) of a filtering operation which is easier to express using the new concept than using any of the existing concepts. I was hoping you would be able to say something like this: "Consider the XXX filter; look how easy it is to express using a co_filter: [code] Now look how hard it is to express as an InputFilter [code] or as an OutputFilter [code] Therefore, adding direct support for the co_filter will make the iostreams library easier to use." I'm getting the feeling, however, that you think the existing concepts are simply too hard to understand. Is this correct? Do you think more examples or tutorial material might help?
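To make the asymmetry concrete, here is a rough, untested pull-style sketch in the spirit of the get()-returning-int filters quoted in this thread. To keep it self-contained it reads from a plain std::istream instead of the library's Source concept, and it assumes a fixed run of spaces per tab rather than true column-aware expansion:

#include <cstdio>
#include <iostream>
#include <sstream>

// Pull-style ("InputFilter"-shaped) tab expansion: when a tab is seen, one space
// is returned immediately and the rest must be remembered and doled out on later
// calls -- exactly the bookkeeping described above.
struct expand_tabs_pull {
    int pending;                                    // spaces still owed from the last tab
    expand_tabs_pull() : pending(0) {}
    int get(std::istream& src) {                    // stand-in for the library's Source
        if (pending > 0) { --pending; return ' '; }
        int c = src.get();
        if (c == '\t') { pending = 7; return ' '; } // fixed 8-space runs assumed
        return c;                                   // EOF passes through unchanged
    }
};

int main() {
    std::istringstream in("a\tb\n");
    expand_tabs_pull f;
    for (int c; (c = f.get(in)) != EOF; )
        std::cout.put(static_cast<char>(c));        // prints "a", eight spaces, "b"
}

The push-style version needs no pending counter at all: it can write every space to the Sink the moment it sees the tab, which is the asymmetry being described.
christopher diggins wrote: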
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
void DoWork() { /* do work */ }
int main() { DoWork(); }
Okay, but do you really expect people to start writing programs this way?
Yes, especially if the iostreams library provides the functionality to them. My point is that C++ code is pointlessly hard to reuse as is, and I am pushing for new ways to make small programs more reusable. This is incredibly important when managing large numbers of small programs (for instance library tests and demos).
I agree that reuse is critical. The iostreams library was designed to allow developers to create highly reusable components.
It is trivial to refactor code to make it look like the above, just cut and paste the main!
If you can convince me that there are lots of existing programs lying around which can be transformed into co_filters by modifying a few lines, I'd be inclined to add co_filters to the library. One thing I'd like to know is why system_filters wouldn't be an acceptable vehicle for reuse in that case.
I think people would only do this to conform to your filter concept. My question is: why aren't the other concepts sufficient?
The other concepts are fine; they are just more obfuscated than most programmers require. Just imagine trying to explain how to use a filter concept in a way which makes sense to a Java / Delphi / C programmer.
Actually, the InputFilter and OutputFilter concepts are very similar to the classes FilterInputStream and FilterOutputStream from java.io. The main difference is that the Java classes store a reference to the downstream Device as member data while models of the Boost concepts are passed the downstream Devices as function arguments. This was done to allow the exact type of the downstream Device to vary, to prevent user-defined filters from having to derive from specific classes, and to shield the user from managing the lifetime of the downstream Device.
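To illustrate that design difference in isolation, here is a small self-contained sketch; the class names are invented and plain standard streams stand in for the library's Devices:

#include <cctype>
#include <iostream>
#include <sstream>

// Stored-reference shape: the downstream device has a fixed type, must outlive
// the filter, and every filter author repeats this plumbing.
struct upper_stored {
    std::istream& downstream;
    explicit upper_stored(std::istream& d) : downstream(d) {}
    int get() { return std::toupper(downstream.get()); }
};

// Passed-argument shape (as in the concepts described above): the downstream
// source arrives per call, so its exact type can vary and the filter neither
// stores it nor manages its lifetime.
struct upper_passed {
    template<typename Source>
    int get(Source& src) { return std::toupper(src.get()); }
};

int main() {
    std::istringstream a("abc"), b("xyz");
    upper_stored s(a);                           // tied to 'a' until destroyed
    upper_passed p;
    std::cout.put(static_cast<char>(s.get()));   // 'A'
    std::cout.put(static_cast<char>(p.get(b)));  // 'X' -- any source with get()
    std::cout << '\n';
}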
I think it is important to try and provide alternatives where possible which make sense to professional programmers who may not be familiar with the intricacies of generic programming techniques and functors.
My hope is that users who do not feel comfortable reading the semi-formal concept specifications will be able to learn to use the library quickly by studying the examples. I intend to provide many additional examples in the final documentation.
By taking this one simple step, a person could make their code easily reusable. This does overlook the fact that most filter programs take parameters, which is trivially remedied.
Note that none of the other concepts has this problem.
Noted. I think that all of the concepts should be supported! I am simply arguing the case for void(*)() as a valid concept for now.
I don't think I ever said it was invalid.
I guess I should have asked for a *realistic* example. If you really write such simple programs you don't have to worry about reuse; it's simpler to write the whole program again from scratch.
First off I do write programs as simple as that and I have a lot of them. This occurs frequently for testing, prototypes, demos, and systems admin. I strongly disagree with maintaining multiple code bases, rather than refactoring and reusing the code. As a professional coder I am always looking for ways to be more productive and have less code to manage.
To me, if you want to be able to reuse code which converts to uppercase, simply write an uppercase filter; e.g.: struct toupper_filter : input_filter { template<typename Source> int get(Source& src) { return toupper(boost::io::get(src)); } };
Nonetheless, I do currently have a non-trivial program, CppToHtmlPreTag, which converts C++ into a <pre></pre> HTML tag; it obviously operates on stdin and outputs to stdout. It looks essentially like this:
void CppToHtmlPreTag() { /* calls multiple other functions to do the work */ }
int main() { CppToHtmlPreTag(); return 0; }
I want to reuse this program in another program which outputs an entire HTML document with a header and footer (CppToHtmlDoc). The easiest way I can think of to do this is to write a new program such as (this is to a certain degree pseudo-code):
struct CppToHtmlDoc { CppToHtmlDoc(string css, string title) : mCss(css), mTitle(title) { } void filter() { cout << "<html><head><title>" << mTitle; cout << "</title><link rel='stylesheet' type='text/css' href='"; cout << mCss << "'/><body>"; cin | CppToHtmlPreTag(); cout << "</body></html>"; } string mCss; string mTitle; };
int main(int argc, char** argv) { assert(argc == 4); CppToHtmlDoc(argv[1], argv[2]) | filestream(argv[3]); }
So I wrote program2 using the [b] approach you outlined, which I agree is superior for this program. I also managed to retain my original code precisely as is using the [a] approach. If I had to rewrite the original program to use a filter concept I would have had to rewrite *all* of my functions to pass the Source and Sink types to each one, and to use src and snk instead of cin / cout.
I'm starting to believe that having to pass the Source or Sink (or both) as function arguments is what you find problematic, but I don't follow the entire discussion. Why would [c] require you to rewrite each function to take Source and Sink parameters, but not [b]?
I guess my point here is that I am able to refactor existing code more easily and quickly if you support [a] and [b] syntax. [c] is perfectly acceptable, and has its advantages in several scenarios, even though it is overkill for my work.
If [c] is perfectly acceptable, then it can't be passing the Source and Sink as function arguments which is troubling you. As a result, I don't see why it is easier to express CppToHtmlDoc as a co_filter than as one of the other types of filters. You omitted the implementation, which is where the difference would presumably show itself.
The expression
filter1() | filter2() [A]
Sorry, I thought it was a statement. Wouldn't it be useful to also allow one liners with a separate syntax:
source() > filter1() > filter2() > sink();
As I mentioned in a previous message, I think this is a good idea, if it uses the pipe notation. I don't see why a different operator should be used.
I just want to be able to write:
filter1() | filter2();
as a statement, with the implicit understanding that it pumps from cin to cout. But you told me I can't have that, so I am offering ">" as a possible work-around.
Okay. But wouldn't the main use case be writing command-line filters? In that case, the bulk of the main function could be written: filtering_ostream out(filter1() | filter2() | ref(cout)); copy(std::cin, out); Is that really so hard? (I'll eliminate the need for "ref()" shortly.)
Implementation aside, don't you agree that from the POV of an end-user it is obvious that the above statement should be equivalent to:
cin | filter1() | filter2() | cout;
No.
I see that this conflicts with the current meaning of filter1() | filter2(), so I would propose that it could instead be rewritten as filter1() + filter2(). As an end-user I expect a | to have executed the data pumping by the end of the statement. If it does so sometimes (i.e. source() | filter() > sink(); statements) but not other times (i.e. filter1() | filter2() expressions) then these are two separate and conflicting meanings of filter | filter, which I am not comfortable with.
You were the one who suggested the first meaning. Now you say you are not comfortable with it because it conflicts with the second (pre-existing) meaning! Best Regards, Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com> To: <boost@lists.boost.org> Sent: Tuesday, January 11, 2005 10:56 PM Subject: [boost] Re: Re: Re: Re: Re: Functions as Filters (was Program Reuse...)
Hi Christopher,
I think I haven't made my main point clear, so let me try to rephrase it.
Hi Jonathan, I am pretty sure I understand you. Before I continue, I first want to know one thing, do you (or anyone else) see any value of being able to write C++ code like the following? #include <iostream> #include <fstream> #include <cctype> #include "fxn_filters.hpp" using namespace std; void ToUpper() { char c; while (cin.get(c)) cout.put(toupper(c)); } int main() { fstream("input.txt") > Filter(ToUpper); } Christopher Diggins http://www.cdiggins.com

christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com> To: <boost@lists.boost.org> Sent: Tuesday, January 11, 2005 10:56 PM Subject: [boost] Re: Re: Re: Re: Re: Functions as Filters (was Program Reuse...)
Hi Christopher,
I think I haven't made my main point clear, so let me try to rephrase it.
Hi Jonathan,
I am pretty sure I understand you. Before I continue, I first want to know one thing, do you (or anyone else) see any value of being able to write C++ code like the following?
#include <iostream> #include <fstream> #include <cctype> #include "fxn_filters.hpp"
using namespace std;
void ToUpper() { char c; while (cin.get(c)) cout.put(toupper(c)); }
I don't see why this is better than struct toupper_filter : input_filter { template<typename Source> int get(Source& src) { return toupper(boost::io::get(src)); } }; and its cousins.
int main() { fstream("input.txt") > Filter(ToUpper); }
This won't work, since fstream("input.txt") produces a temporary object which cannot be used to initialize a non-const reference. Therefore operator> would have to be declared to take a const reference, and the non-const extraction functions would not be usable. I have to go now, but I'll continue this discussion later.
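For readers unfamiliar with the language rule being invoked, a standalone C++98/03 illustration (the function names are made up):

#include <fstream>

void use(std::fstream&) {}               // non-const reference parameter
void use_const(const std::fstream&) {}   // const reference parameter

int main() {
    std::fstream named("input.txt");
    use(named);                           // fine: an lvalue binds to fstream&
    // use(std::fstream("input.txt"));    // error: a temporary cannot bind to a
                                          // non-const reference in C++98/03
    use_const(std::fstream("input.txt")); // allowed, but only const member functions
                                          // can be called through the const reference,
                                          // which rules out the extraction operations
}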
Christopher Diggins http://www.cdiggins.com
Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
I don't see why this is better than
struct toupper_filter : input_filter { template<typename Source> int get(Source& src) { return toupper(boost::io::get(src)); } };
and its cousins.
Then that pretty much closes the discussion. If you do not see the benefits of the lower complexity, then there really is little more I can do to convince you otherwise.
int main() { fstream("input.txt") > Filter(ToUpper); }
This won't work, since fstream("input.txt") produces a temporary object which cannot be used to initialize a non-const reference. Therefore operator> would have to be declared to take a const reference, and the non-const extraction functions would not be usable.
I already made it work by creating temporary helper objects which take the address of the temporary objects.
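Christopher's helpers aren't shown, so the following is only a guess at the shape they might take: a proxy that accepts the temporary by const reference and then strips the const, which is legal here only because the stream object itself was not declared const. For brevity the chained function takes the streams directly rather than using cin/cout:

#include <fstream>
#include <iostream>

// Hypothetical proxy: binds to a temporary via const& and recovers a usable
// non-const pointer to the same (non-const) stream object.
struct istream_holder {
    std::istream* p;
    istream_holder(const std::istream& s) : p(const_cast<std::istream*>(&s)) {}
};

// Hypothetical chaining operator; the left operand arrives through the proxy.
void operator>(istream_holder in, void (*proc)(std::istream&, std::ostream&)) {
    proc(*in.p, std::cout);
}

void Copy(std::istream& in, std::ostream& out) {
    char c;
    while (in.get(c)) out.put(c);
}

int main() {
    std::ifstream("input.txt") > Copy;    // the temporary binds to the proxy's const&
}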
I have to go now, but I'll continue this discussion later.
CD

christopher diggins <cdiggins <at> videotron.ca> writes:
----- Original Message ----- From: "Jonathan Turkanis" <technews <at> kangaroologic.com>
I don't see why this is better than
struct toupper_filter : input_filter { template<typename Source> int get(Source& src) { return toupper(boost::io::get(src)); } };
and its cousins.
Then that pretty much closes the discussion. If you do not see the benefits of the lower complexity, then there really is little more I can do to convince you otherwise.
The first thing that's apparent to me is that the actual difference in complexity between the ToUpper you wrote and the toupper_filter quoted here is really quite minor. The second thing that's apparent is that your ToUpper relies on global variables, whereas toupper_filter relies on function arguments, which in my mind is a more heavily-weighted concern than the complexity difference. My question is: why do you think relying on globals is a better approach? Some of your points have been motivated by the needs of novice C++ programmers; do you really think it is better to advocate a programming style that relies on global variables? Bob

Bob Bell wrote:
quoted here is really quite minor. The second thing that's apparent is that your ToUpper relies on global variables, whereas toupper_filter relies on function arguments, which in my mind is a more heavily-weighted concern than the complexity difference. My question is: why do you think relying on globals is a better approach? Some of your points have been motivated by the needs of novice C++ programmers; do you really think it is better to advocate a programming style that relies on global variables?
I was going to mention the global variables, but in fairness I think the four global streams have special status. If I could redesign the standard iostreams library without concern for backward compatibility, I'd change a number of things but I'm not sure I'd require users to write std::cout::instance() << "Hello World!\n"; I'm interested to know exactly what Christopher thinks is too complex about the filter concepts, since I'm sure others will share his view. I suspect it's the member templates and typedefs, but I'd like to hear it from him. Christopher? ;-) Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com> To: <boost@lists.boost.org> Sent: Thursday, January 13, 2005 1:40 AM Subject: [boost] Re: Functions as Filters (was Program Reuse...)
Bob Bell wrote:
quoted here is really quite minor. The second thing that's apparent is that your ToUpper relies on global variables, whereas toupper_filter relies on function arguments, which in my mind is a more heavily-weighted concern than the complexity difference. My question is: why do you think relying on globals is a better approach? Some of your points have been motivated by the needs of novice C++ programmers; do you really think it is better to advocate a programming style that relies on global variables?
Reading and writing to cin and cout is a long-established, effective practice which results in generic and easily reused code. I am not simply advocating for the needs of novice programmers. I am not a novice, but I always want my code to be written as if for novices wherever possible. It is my experience that writing as if for novices usually results in code which is easier to maintain, reuse, and document.
I was going to mention the global variables, but in fairness I think the four global streams have special status. If I could redesign the standard iostreams library without concern for backward compatibility, I'd change a number of things but I'm not sure I'd require users to write
std::cout::instance() << "Hello World!\n";
I'm interested to know exactly what Christopher thinks is too complex about the filter concepts, since I'm sure others will share his view.
I don't think they are too complicated! They do precisely what they should do, the way they should do it. I don't think that the filter concepts can or should be simplified any further. I do, however, want the alternative option of using a void function which operates on cin and cout as a filter when it is appropriate. CD

christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis"
void ToUpper() { char c; while (cin.get(c)) cout.put(toupper(c)); }
I don't see why this is better than
struct toupper_filter : input_filter { template<typename Source> int get(Source& src) { return toupper(boost::io::get(src)); } };
and its cousins.
Then that pretty much closes the discussion. If you do not see the benefits of the lower complexity, then there really is little more I can do to convince you otherwise.
It's not that I don't see the benefit of lower complexity; I don't really see the lower complexity. ;-) The only noticeable difference is that mine contains "template<typename Source>" and uses a non-member get function. Your version requires that an entire stream of data be processed at once -- leading to poor memory use -- and doesn't work at all for streams which have no natural end. To justify this I'd like to see something more substantial than the elimination of "template<typename Source>".
int main() { fstream("input.txt") > Filter(ToUpper); }
This won't work, since fstream("input.txt") produces a temporary object which cannot be used to initialize a non-const reference. Therefore operator< would have to be declared to take a const reference, and the non-const extraction functions would not be usable.
I already made it work by creating temporary helper objects which take the address of the temporary objects.
The Boost.Iostreams solution (if I add your extension to the |-syntax) is as follows: file("input.txt") | toupper_filter() | ref(std::cout); This seems very elegant to me. I don't see any reason to make the reference to std::cout implicit. In fact, I think it would make the above more difficult to understand. Jonathan

Jonathan Turkanis wrote:
It's not that I don't see the benefit of lower complexity; I don't really see the lower complexity. ;-)
Simon Tatham provides a nice motivation for this kind of thing at: http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html Here's a rough (untested) translation of his run-length decoding example: void decompress(std::istream& in, std::ostream& out) { char c; while(in.get(c)) { if(static_cast<unsigned char>(c) == 0xFF) { int len = in.get(); if(!in || !in.get(c)) { /* return an error */ } while(len--) out.put(c); } else { out.put(c); } } } And as a filter: struct decompress_filter : input_filter { int repeat_char; int repeat_length; decompress_filter() : repeat_char(0), repeat_length(0) {} template<typename Source> int get(Source& src) { if(repeat_length > 0) { repeat_length--; return repeat_char; } else { int c = boost::io::get(src); if(c == 0xFF) { repeat_length = boost::io::get(src); repeat_char = boost::io::get(src); repeat_length--; return repeat_char; } else { return c; } } } }; And that's a fairly simple example. (Sorry if you have a better way to do this, I haven't really looked at the library).
Your version requires that an entire stream of data be processed at once -- leading to poor memory use -- and doesn't work at all for streams which have no natural end.
Not necessarily, he could use threads or fibres with pipes, although that's quite expensive. That's why I was playing around with using a Duff's Device style switch statement for implementing coroutines. Daniel

Daniel James wrote:
Jonathan Turkanis wrote:
It's not that I don't see the benefit of lower complexity; I don't really see the lower complexity. ;-)
Simon Tatham provides a nice motivation for this kind of thing at:
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
Here's a rough (untested) translation of his run-length decoding example:
void decompress(std::istream& in, std::ostream& out) { char c; while(in.get(c)) { if(static_cast<unsigned char>(c) == 0xFF) { int len = in.get(); if(!in || !in.get(c)) { /* return an error */ } while(len--) out.put(c); } else { out.put(c); } } }
And as a filter:
struct decompress_filter : input_filter { int repeat_char; int repeat_length;
decompress_filter() : repeat_char(0), repeat_length(0) {}
template<typename Source> int get(Source& src) { if(repeat_length > 0) { repeat_length--; return repeat_char; } else { int c = boost::io::get(src); if(c == 0xFF) { repeat_length = boost::io::get(src); repeat_char = boost::io::get(src); repeat_length--; return repeat_char; } else { return c; } } } };
And that's a fairly simple example. (Sorry if you have a better way to do this, I haven't really looked at the library).
Nice example! That's the type of evidence I was hoping Christopher would produce. Referring to one of my old messages, which doesn't seem to be archived yet, you have written a filter in form [B] (using streams as function arguments). I think [C] (using Sources and Sinks) would be sufficient here. So I'm leaning toward allowing filters along the lines of [C]. I guess I should mention that this was first suggested by Rob Stewart in a private email during the iostreams review: <email> Jonathan Turkanis wrote:
Rob Stewart wrote:
Can we simplify all of this to the following?
template <typename Source, typename Sink> unspecified-status-indicator filter(Source & in, Sink & out);
IOW, if the framework provided both the source and the sink, the call to filter() would cause data to flow from in to out. Whether the data flow is input or output for the entire stream doesn't matter. The filter just knows its own source and sink.
The source and sink could even be objects in your library that wrap a Device and hook into the framework mechanisms to move data along, if you need to intervene in any way. That's particularly useful for async I/O.
I've thought of this too, and I like it. This could be called a CoprocessFilter.
</email>
Daniel James wrote: Jonathan Turkanis wrote:
Your version requires that an entire stream of data be processed at once -- leading to poor memory use -- and doesn't work at all for streams which have no natural end.
Not necessarily, he could use threads or fibres with pipes, although that's quite expensive. That's why I was playing around with using a Duff's Device style switch statement for implementing coroutines.
I think I already answered this. Standard input streams don't have a way to indicate that input is temporarily unavailable. Therefore if the producer and consumer are operating in separate threads and the consumer gets ahead of the producer, a false EOF will be detected. Version [C] will not suffer from this problem, because the final version of the filter and device concepts will provide a way to distinguish EOF from EAGAIN. Unfortunately, there's no way to retrofit this onto the standard streams or stream buffers, since they do not recognize this distinction.
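A small, self-contained illustration of why standard streams cannot express "no data yet" (no threads are needed to see the effect; the names are mine):

#include <iostream>
#include <sstream>

int main() {
    std::stringstream pipe;
    pipe << "partial";               // the producer has written only part of the data

    char c;
    while (pipe.get(c))              // the consumer drains what is there...
        std::cout.put(c);
    // ...and eofbit/failbit are now set: to the reader, "no data yet" looks
    // exactly like "no data ever".

    pipe.clear();                    // reading can resume only after an explicit
    pipe << " rest";                 // clear(), and nothing in the stream interface
    while (pipe.get(c))              // tells the consumer *when* that is safe --
        std::cout.put(c);            // the missing EOF-vs-EAGAIN distinction
    std::cout << '\n';               // prints "partial rest"
}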
Daniel
Jonathan

Jonathan Turkanis wrote:
Daniel James wrote:
Simon Tatham provides a nice motivation for this kind of thing at:
Nice example! That's the type of evidence I was hoping Christopher would produce. Referring to one of my old messages, which doesn't seem to be archived yet, you have written a filter in form [B] (using streams as function arguments). I think [C] (using Sources and Sinks) would be sufficient here. So I'm leaning toward allowing filters along the lines of [C].
Not necessarily, he could use threads or fibres with pipes, although that's quite expensive. That's why I was playing around with using a Duff's Device style switch statement for implementing coroutines.
I think I already answered this. Standard input streams don't have a way to indicate that input is temporarily unavailable. Therefore if the producer and consumer are operating in separate threads and the consumer gets ahead of the producer, a false EOF will be detected.
Version [C] will not suffer from this problem, because the final version of the filter and device concepts will provide a way to distinguish EOF from EAGAIN. Unfortunately, there's no way to retrofit this onto the standard streams or stream buffers, since they do not recognize this distinction.
I may have spoken too soon (as I have several times in this thread ;-). Once the run-length encoding example is rewritten to handle the possibility that get() can return an indication that input is temporarily unavailable, I'm not sure the simplification will survive. If we want to preserve the simplicity of the example without requiring that an entire stream of data be processed at once, we could switch to thread-aware concepts. E.g., template<typename AsyncSource, typename AsyncSink> void decompress(AsyncSource& in, AsyncSink& out) { int c; while((c = boost::io::blocking_get(in)) != EOF) { if(c == 0xFF) { int len = boost::io::blocking_get(in); if(!in || (c = boost::io::blocking_get(in)) == EOF) { /* return an error */ } while(len--) boost::io::blocking_put(out, c); } else { boost::io::blocking_put(out, c); } } } (I've been planning to introduce such concepts eventually, but definitely not in the initial release.) With some sacrifice of efficiency, we could even use your example unchanged, and stipulate that the provided istream and ostream wait on some synchronization object until input is available or EOF is reached (for the istream) or until the output buffers have free space (for the ostream). In short, I'm not sure the co_filter idea can be implemented without some cost in efficiency or simplicity. Jonathan

Coroutines or threads are overkill for this problem, IMHO. What you need is laziness, as in Haskell. FC++ probably can handle decompress with less overhead than threads. Bruno Martinez On Thu, 13 Jan 2005 16:12:44 -0700, Jonathan Turkanis <technews@kangaroologic.com> wrote:
If we want to preserve the simplicity of the example without requiring that an entire stream of data be processed at once, we could switch to thread-aware concepts. E.g.,
template<typename AsyncSource, typename AsyncSink> void decompress(AsyncSource& in, AsyncSink& out) { int c; while((c = boost::io::blocking_get(in)) != EOF) { if(c == 0xFF) { int len = boost::io::blocking_get(in); if(!in || (c = boost::io::blocking_get(in)) == EOF) { /* return an error */ } while(len--) boost::io::blocking_put(out, c); } else { boost::io::blocking_put(out, c); } } }
(I've been planning to introduce such concepts eventually, but definitely not in the initial release.)
With some sacrifice of efficiency, we could even use your example unchanged, and stipulate that the provided istream and ostream wait on some synchronization object until input is available or EOF is reached (for the istream) or until the output buffers have free space (for the ostream).
In short, I'm not sure the co_filter idea can be implemented without some cost in efficiency or simplicity.

Bruno Martínez Aguerre wrote:
Coroutines or threads are overkill for this problem, IMHO. What you need is laziness, as in Haskell. FC++ probably can handle decompress with less overhead than threads.
I'm not sure what you mean. The algorithm Daniel mentioned can be implemented efficiently in the current framework. The question is whether it can be simplified without sacrificing performance. If you think this can be done with FC++, please post some code. Jonathan

On Mon, 17 Jan 2005 11:31:36 -0700, Jonathan Turkanis <technews@kangaroologic.com> wrote:
Bruno Martínez Aguerre wrote:
Coroutines or threads are overkill for this problem, IMHO. What you need is laziness, as in Haskell. FC++ probably can handle decompress with less overhead than threads.
I'm not sure what you mean. The algorithm Daniel mentioned can be implemented efficiently in the current framework. The question is whether it can be simplified without sacrificing performance.
You can write decompress as a FC++ function taking and returning lazy lists, in the straightforward way. FC++'s lists can be constructed from a pair of iterators, which are used lazily, and can be converted to forward iterators themselves, so it should be possible to interface FC++ with the current framework.
If you think this can be done with FC++, please post some code.
I don't have FC++ installed, but what I say should be possible according to: http://www.cc.gatech.edu/~yannis/fc++/boostpaper/fcpp.sectrelation.html#id27... A Haskell implementation can be found here: http://community.moertel.com/ss/space/Talk+-+Haskell+for+Perl+Hackers/pgh-pm... page 74. Regards, Bruno

Bruno Martínez Aguerre wrote:
On Mon, 17 Jan 2005 11:31:36 -0700, Jonathan Turkanis <technews@kangaroologic.com> wrote:
Bruno Martínez Aguerre wrote:
Coroutines or threads are overkill for this problem, IMHO. What you need is laziness, as in Haskell. FC++ probably can handle decompress with less overhead than threads.
I'm not sure what you mean. The algorithm Daniel mentioned can be implemented efficiently in the current framework. The question is whether it can be simplified without sacrificing performance.
You can write decompress as a FC++ function taking and returning lazy lists, in the straightforward way.
Judging from measurements posted by Gennadiy Rozental during the FC++ review, I doubt the lazy list solution qualifies according to the above criteria. Still, it sounds like an interesting idea. It's been a while since I looked at FC++, so I'm having trouble seeing how the lazy list solution would work. Does it work only in cases similar to the compression example, or is it a general replacement for Christopher's co_filters? I'd appreciate a small code sample, even if it is untested.
Regards, Bruno
Jonathan

----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
It's not that I don't see the benefit of lower complexity; I don't really see the lower complexity. ;-)
The only noticeable difference is that mine contains "template<typename Source>" and uses a non-member get function.
And the facts that: 1) ToUpper() is not a non-member function 2) ToUpper() uses cin rather than a passed argument Nonetheless, consider the following pseudo-program (which does model a lot of software): void Fu(string s) { cout << s; } string Bar() { string ret; cin >> ret; return ret; } string Transform(string s) { string ret(s); /* do some stuff */ return ret; } void DoTransform() { while (!cin.eof()) { Fu(Transform(Bar())); } } int main() { DoTransform(); return 0; } Now if at a later point I want to refactor and reuse the code from a program like this in another, I can either rewrite the entire program as an iostreams filter (which is not trivial), or I can simply write (using my personal library): int main() { fstream("in.txt") > Filter(DoTransform) > fstream("out.txt"); } Regardless of the arguments for using iostreams filters (which, trust me, I am very aware of), most of the time I am simply going to opt to use this syntax, because it requires far less work. Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org

christopher diggins wrote:
Nonetheless, consider the following pseudo-program (which does model a lot of software):
void Fu(string s) { cout << s; }
string Bar() { string ret; cin >> ret; return ret; }
string Transform(string s) { string ret(s); /* do some stuff */ return ret; }
void DoTransform() { while (!cin.eof()) { Fu(Transform(Bar())); } }
int main() { DoTransform(); return 0; }
IMO, this is how a more realistic example would look like: void consume( string const & s, ostream & out ) { out << s; } string produce( istream & in ) { string ret; in >> ret; return ret; } /* Transform as above */ void do_transform( istream & in, ostream & out ) { while( !in.eof() ) { consume( Transform( produce(in) ), out ); } } int main() { do_transform( cin, cout ); } This is just a reflection of the general "globals are bad" principle.

----- Original Message ----- From: "Peter Dimov" <pdimov@mmltd.net>
IMO, this is how a more realistic example would look like:
[snip]
This is just a reflection of the general "globals are bad" principle.
Most of the application code I have seen writes directly to cout rather than to a function parameter. I also don't see any advantage to passing istream and ostream as function parameters when cin and cout can be easily redirected using rdbuf. Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org

The problem I personally have run into in situations like that (embedding references to std::cout and std::cin within my functions) is that I often find myself needing to input from or output to an ifstream or ofstream. Designing the functions from the ground up to use references to i/ostreams alleviates the headache of heavily modifying the program at a later date. Bryan Ross me@daerid.com christopher diggins wrote:
----- Original Message ----- From: "Peter Dimov" <pdimov@mmltd.net>
IMO, this is how a more realistic example would look like:
[snip]
This is just a reflection of the general "globals are bad" principle.
Most of the application code I have seen writes directly to cout rather than to a function parameter. I also don't see any advantage to passing istream and ostream as function parameters when cin and cout can be easily redirected using rdbuf.
Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org

christopher diggins wrote:
This is just a reflection of the general "globals are bad" principle.
Most of the application code I have seen writes directly to cout rather than to a function parameter. I also don't see any advantage to passing istream and ostream as function parameters when cin and cout can be easily redirected using rdbuf.
The advantages of following a style of programming where functions do not take hidden arguments and do not reference global state are well known. Even if in this particular case the "wrong way" does not cause much harm because of cin and cout's redirection capability, in general, if a module has been written to operate on a hardcoded global (or a singleton), you will not be able to reuse it.

From: christopher diggins <cdiggins@videotron.ca>
From: "Peter Dimov" <pdimov@mmltd.net>
IMO, this is how a more realistic example would look like:
[snip]
This is just a reflection of the general "globals are bad" principle.
Most of the application code I have seen writes directly to cout rather than to a function parameter. I also don't see any advantage to passing istream and ostream as function parameters when cin and cout can be easily redirected using rdbuf.
Part of the issue being raised here is whether it is wise to promote code that uses cin and cout in these ways and relies upon redirection via rdbuf() to make it reusable. If a client of your library must rewrite main() in order to make it usable, is it onerous to expect that the code would be altered to use supplied streams rather than cin and cout? Even a slight alteration to your library would provide the same feature set but would encourage clients to write better code in the future. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
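For concreteness, the kind of client code being suggested might look like the sketch below: the filtering function takes the streams it works on, yet running it over cin/cout, or over files, costs one line and needs no rdbuf() redirection. The names are illustrative only and come from neither library:

#include <cctype>
#include <fstream>
#include <iostream>

// Written against supplied streams rather than cin/cout.
void to_upper(std::istream& in, std::ostream& out) {
    char c;
    while (in.get(c))
        out.put(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
}

int main() {
    to_upper(std::cin, std::cout);     // use it as a classic stdin/stdout filter...

    std::ifstream fin("input.txt");    // ...or reuse the very same function with
    std::ofstream fout("output.txt");  // files, string streams, or any other
    if (fin && fout)                   // std::istream/std::ostream, without touching
        to_upper(fin, fout);           // rdbuf() or globals
}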

christopher diggins wrote:
----- Original Message ----- From: "Jonathan Turkanis" <technews@kangaroologic.com>
It's not that I don't see the benefit of lower complexity; I don't really see the lower complexity. ;-)
The only noticeable difference is that mine contains "template<typename Source>" and uses a non-member get function.
And the facts that: 1) ToUpper() is not a non-member function
It looks like a non-member function. Anyway, I don't see the relevance.
2) ToUpper() uses cin rather than a passed argument
Right, but I was suggesting that you preferred using std::cin because it avoided the member template.
Nonetheless, consider the following pseudo-program (which does model a lot of software):
void Fu(string s) { cout << s; }
string Bar() { string ret; cin >> ret; return ret; }
string Transform(string s) { string ret(s); /* do some stuff */ return ret; }
void DoTransform() { while (!cin.eof()) { Fu(Transform(Bar())); } }
int main() { DoTransform(); return 0; }
Now if at a later point I want to refactor and reuse the code from a program like this in another, I can either rewrite the entire program as an iostreams filter (which is not trivial), or I can simply write (using my personal library):
int main() { fstream("in.txt") > Filter(DoTransform) > fstream("out.txt"); }
This discussion is way off track now. Whenever a new framework for code reuse is proposed, there are two types of reuse to consider. (1) Reuse of code predating the framework. (2) Reuse of code yet to be written. Here we can assume that code intended to be used with the framework will be designed specifically for this purpose. The question is which patterns or concepts the framework should rely on. The ability to reuse old code is irrelevant; what matters is which concepts or patterns are easiest to use and most efficient. Regarding (1), I agree that this is important, and I've indicated a willingness to support this type of reuse several times during our discussion. All I asked was that you show me some specific examples of old code which could be quickly modified to conform to your filter concept. So far you haven't done so. Regarding (2), I wanted to see a specific filtering operation which could be expressed clearly using a co_filter but was awkward to express using one of the existing filter concepts. Daniel James has now given an example of this. Best Regards, Jonathan

From: christopher diggins <cdiggins@videotron.ca>
Before I continue, I first want to know one thing, do you (or anyone else) see any value of being able to write C++ code like the following?
#include <iostream> #include <fstream> #include <cctype> #include "fxn_filters.hpp"
using namespace std;
void ToUpper() { char c; while (cin.get(c)) cout.put(toupper(c)); }
int main() { fstream("input.txt") > Filter(ToUpper); }
In principle, the approach seems interesting. It is the approach that is being questioned here. In fact, if the idea didn't strike the fancy of anyone, you'd have gotten no discussion. How well will your scheme work with a file of tens of megabytes? If you have a great deal of memory on your machine, then make it tens of gigabytes. Eventually, your approach fails because each stage of the pipeline must have a (possibly modified) copy of its input. Thus, there are at least two copies of the data in memory at once. Jonathan's library scales better, so if you can make use of it, your library will scale better. Perhaps your library could provide some classes/functions to reduce the requirements levied on users to achieve your goal. IOW, there might be a way to recast your ideas to eliminate reliance on cin and cout and to avoid having to process all data at once. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
participants (14)
- Beman Dawes
- Bob Bell
- Bruno Martínez Aguerre
- Bryan Ross
- christopher diggins
- Daniel James
- Frank van Dijk
- Jonathan Turkanis
- Matt Austern
- Pavel Vozenilek
- Peter Dimov
- Philippe Mori
- Rob Stewart
- Roland Schwarz