[boost] Re: IOStreams formal review start

4 Sep 2004

      "Pavel Vozenilek" <pavel_vozenilek@hotmail.com> wrote in message
news:chade8$n8k$1@sea.gmane.org...
...
...
start of the formal review of the Iostreams
library by Jonathan Turkanis.
I vote to include Iostreams into Boost with both hands.
It is prime example of professional work.
Thanks!
...
Looking back on MoreIo library I recommend:
 - to prefere Iostreams for all overlapping functionality
 - to move the applicable rest from MoreIo into Iostreams
   with:
    - naming conventions taken from Iostreams
    - documentation look and feel from Iostreams
I assume you view null_buf, pointerbuf, etc. as overlapping functionality. What
about 'streambuf wrapping'?
...
For user it is always better to have one library
with consistent structure/naming/documentation
than to have many of them. This is hard earned
experience.
Yes.
...
As the Iostreams is large, I didn't yet read reference
docs and test code. I plan it and will post rest
of review later.
I look forward to it. Note that the end date for this review is not yet fixed.
...
Current review notes are bellow.
To deal with its complexity I recommend to add many
more overcommented code snippets into documentation
and add hyperlinks to relevant examples on every fitting place.
Pictures, flowcharts and diagrams would help too.
Some information would look better in tabular form.
(Details bellow).
Yes; the review so far has made this very clear to me.
...
__________________________________________________________________
1. overview docs:
The first occurence of terms "data source"
   and "sink" should be hyperlinked (it is but only later
   in text). Dtto "filters".
My reason for not hyperlinking the first occurence was my feeling that in the
first two paragraphs I was simply using them as 'common sense' terms, but that
when I used the terms in the section marked 'Concepts' I was formally
introducing them as technical terms defined by the library. Obviously this was
not very clear.
...
It may be also useful to put there some diagram. In addition
   link to short code example could be added to the first
   occurence for each term.
Good idea. I've become convinced of the need to add explanatory material right
at the beginning.
...
The enumeration of components as 'access to memory-mapped files'
   may look better in <ul>...</ul> list.
Okay.
...
Sentence: "Sources, Sinks and their refinements are called
      resources."
The term is overused in  programming. Maybe "data resources"
   or "data-module" or so would be better.
I know that 'resource' is a loaded term ;-) but I couldn't think of anything
better. 'Data-module' doesn't seem very descriptive. Unfortunately, the standard
term seems to be "Source/Sink" -- which I find ugly.
...
Relation between sins/sources/filters and streams should be
   explained immediatelly. Possibly using picture.
Okay. What do you think of the pictures here: http://tinyurl.com/5v2ak? Too much
detail for the introduction?
...
Link to "adapters.html" is broken.
Which link?
...
The links to iterator_facade and Iterators library should
   be local.
I'm planning to make all the boost links local eventually. I wanted them to work
properly for people viewing the docs from the zipped package.
...
__________________________________________________________________
2. tutorial docs:
The first code example:
      s[z] = (char) (rand() * 256 / RAND_MAX);
   I think this causes overflows way too often.
      s[z] = (char) (rand() % 256);
   is more likely intended even if it
   adds small bias.
Thanks.
...
In example:
    in.open(634875);
   also stream example should be added,
   just to show how it is written.
I don't understand.
...
streambuf_facade and stream_facade should be hyperlinked.
   (I am fan of hyperlinking everything everywhere.)
I tried to be liberal with my hyperlinks. Sometimes I find it annoying when the
same word is hyperlinked several times in a sentence. But you're right, in the
tutorial streambuf_facade and stream_facade definitely need to link the user's
guide or the reference.
...
In code example line:
       streambuf_facade<vector_sink> out(v); // error!!
   it should be sais if it is compile time error,
   assertion or whether user needs to check it manually.
I wonder it is is possible to workaround this limitation
   internally in library by const_casts/whatever nasty.
These are the standard forwarding problem trade offs. The one thing I defintely
want to avoid is allowing const references to bind to temporary stream objects:

     filtering_istream in(stringstream("hello"));

This would immediately lead to disaster.
...
Sentence:
      "... and modifies it before returning it to the user."
   should be
      "... and modifies or eliminates it before returning it to the user."
Isn't elimination a type of modification?
...
In example:
filtering_streambuf<input> in;
    in.push(alphabetic_input_filter());
    in.push(cin);
It should be explained what is "input" and where
   does it come from.
You're right. To clarify, input is a tag struct representing the i/o mode
'input'.
So, filtering_streambuf<input> is a read-only filtering stream, synonymous with
filtering_istream, and  filtering_streambuf<output> is a write-only filtering
stream, synonymous with filtering_ostream
...
Shouldn't it have more expressive
   name e.g. input_filter_base? Maybe the source/sink too.
I don't follow this.
...
It is a bit confusing to see push() of both input_filter
   descendant and source descendant. Should be noted,
   maybe with class diagram.
Others have said this too. Originally I had something like push_filter and
push_resource. The i/o categories made this unnecessary. I consider the
std::stack-like push/pop/size/empty interface to be much more elegant. After
all, there's just one stack. Internally, also, adding a filter is essentially
the
same as adding a resource.
...
All examples use variable names 'in' and 'out'. It may be
   confusing for one just skimming through code snippets.
   Maybe different names (e.g. vector_in) could be used.
Okay.
...
It would be also nice to distinguish variables and types
   systematically. E.g. variables having _ suffix.
That's the convention I use for member variables.
...
filters.html link is broken.
Which link?
...
Generally, tutorial may be sprinkled with few more
   overcommented code snippets. They may be hide-able.
   Other docs won't suffer having them too.
Okay. What do you mean by hideable? I'm hoping not to have to use any javascript
except for the menu.
...
__________________________________________________________________
2. examples docs:
Link to regex should be local.
Link "Uncommenting Input Filter" is broken.
If you mean the link under "Regex OutputFilter", I think it appears broken only
because there's nowhere left to scroll.
...
__________________________________________________________________
3. presidential_output_filter.hpp: a nitpick, in code:
if (word == "subliminable")
       return "subliminal";
   else if (word == "nucular")
   ....
the 'else' parts are not needed.
True.
...
__________________________________________________________________
4. docs: all <img> elements should have their width/height set in HTML
For slow connections, I guess?
...
__________________________________________________________________
5. concepts docs: all concepts here can be also listen in table,
   sorted and with one line info.
The individual concepts should be added as sub tree into left panel.
   Dtto compression filters, etc.
   Every page should be accessible from left panel.
I'll try to do this, but I might find that the menu is so big it takes too long
to load.
...
__________________________________________________________________
6. Installation docs: it should be in moredetail described what is
   BOOST_IO_NO_LIB for and who needs it for what.
Maybe there should be list of *.cpp files one needs
   to use for zlib/bzip2/..., for those who do not want
   to use autolink.
All the .cpp files are enumerated here: http://tinyurl.com/6klot.
...
__________________________________________________________________
7. "Classes by category" docs: each link could have very short
   decription on the left:
array_resouce (xyz...)
    array_sink    (abc...)
Similarly classes sorted alphabetically, functions, etc.
   It would help when one is trying quickly locate some info.
Good idea.
...
__________________________________________________________________
8. All applicable headers should have on top:
#if (defined _MSC_VER) && (_MSC_VER >= 1200)
#  pragma once
#endif
to keep compilation times for VC and Intel C++ down.
I thought I already made this change.

________________________________________________________________
...
9. Syntax like this was suggested:
filter_stream out(tee(std::cout) | encode | gzip | file("some file"));
Maybe support for this could be built up on existing framework.
I think it's really cool. I've already contacted the (co-)author JC van Winkel,
who gave me permission to use it.
...
__________________________________________________________________
10. To question:
...
do you prefer the names filter_stream and filter_streambuf to
filtering_stream and filtering_streambuf?
The first. Steve McConnell in his Code Complete 2 quotes
study recommending identifier length 10-16 chars.
Me too.
...
__________________________________________________________________
11. A collection of useful stream(buffer)s could be added into library,
    so they are handy to users:
- /dev/nul like stream (like the one from more_io)
- tee, tee3, ... etc stream
- random (the one from code snippet)
These are all good. I'm thinking as the collection of provided components grows,
I may need to be more systematic about library organization. So one might have
directories boost/io/filters/compression, etc.
...
- repeater (some sequence of characters over and over)
   This one could ideally accept syntax used inn Boost.Assign.
   It would allow to easily create complex but predictable
   test data structures.
Could you elaborate?
...
- 'a switcher'. Something what sends data into another stream
   but has method switch_to_other_stream(s). E.g. for circular logs.
Doable, but presents some problems with buffer synchronization. There is
currently no generic sync() function or Synchronizable concept.
...
__________________________________________________________________
12. One could dream about ability to use lambda expressions directly
     for filters.
Lambda expressions which define function objects which transform characters one
at a time whould be easy. But I suppose you mean lambda expressions representing
complex filtering operations.

Let's see ....

How about this: instead of place-holders _1, _2, etc. we could have _get, _peek,
_put... The following could be an inline version of Kanze's uncommenting input
filter (http://tinyurl.com/6qn3a):

    if_(_get == '#')
    [
       while_(_peek != EOF && _peek != '\n') [ _get ]
    ]
    ,
    _peek

:-)
...
__________________________________________________________________
13. Possible feature of library (I do not know whether it is
    implementable or implementable efficiently/elegantly):
- lets have chain of streams (e.f. buffer/compress/write file)
- someone feeding the head of chain may want to execute an action
   somwhere down in the chain (e.g. switching files).
- now it would require 'someone' to know all details of chain
   and to have access points to its parts (=> high coupling)
Right. That's the status quo.
...
It would be nice to be able to 'send a command' down the stream
and have some part of chain picking it up and executing.
This way sender would not need to know anything about
current chain structure.
If the command wouldn't be recognizable by any part of stream
chain it would become nop.
Similarly one can imagine events going upstream.
If anyone handles them: OK, if not they will be ignored.
(They are different from exceptions that ignoring them
 is default and harmless.)
Sounds interesting. I believe this would have to rely on runtime rather than
compile-time polymorphism. I'll think about it.
...
Examples of downstream commands:
 - switch file (for logging)
 - flush compressor dictionaries
Examples of upstream events:
 - predetermined file size reached
 - current compression ratio info
...
__________________________________________________________________
14. May it be possible to add "synchronize panels" button into
    left panel? If clicked, it would expand and highlight item
    in left panel for currently opened page in right panel.
The [link to this page] button already expands the tree to the current page, if
the tree contains a link to the current page. I guess highlighting would be easy
enough to do. I'm afraid some would find this annoying.
...
__________________________________________________________________
15, cloable.html: it should be explained (+ example) what
    'notification' here means.
How's this:  A Closable filter or resource receives notifications -- via the
function <A HREF='...'>boost::io::close</A> --immediately before a containing
stream or stream buffer is closed.

I defintely need an example here.
...
__________________________________________________________________
16. The examples section should contain all available examples,
    e.g. compressors. There may be subsection for them.
Maybe something as:
    + Examples
      + Simple Examples
          ....
      + Compressors Examples
          ....
      + ....
It is because many people would not go any futher than here
    and would miss the rest.
Good idea.

Thanks for the detailed examination!

Jonathan