[boost] Re: IOStreams formal review start

3 Sep 2004

...
start of the formal review of the Iostreams
library by Jonathan Turkanis.
I vote to include Iostreams into Boost with both hands.
It is prime example of professional work.

Looking back on MoreIo library I recommend:
 - to prefere Iostreams for all overlapping functionality
 - to move the applicable rest from MoreIo into Iostreams
   with:
    - naming conventions taken from Iostreams
    - documentation look and feel from Iostreams

For user it is always better to have one library
with consistent structure/naming/documentation
than to have many of them. This is hard earned
experience.

In case both MoreIo and Iostreams will be added
into Boost, then they *BOTH* should have disambiguation
section on the top of documentation, with info:
 - what funcionality overlaps
 - how can they cooperate together

As the Iostreams is large, I didn't yet read reference
docs and test code. I plan it and will post rest
of review later. Current review notes are bellow.

To deal with its complexity I recommend to add many
more overcommented code snippets into documentation
and add hyperlinks to relevant examples on every fitting place.
Pictures, flowcharts and diagrams would help too.
Some information would look better in tabular form.
(Details bellow).

/Pavel

__________________________________________________________________
1. overview docs:

   The first occurence of terms "data source"
   and "sink" should be hyperlinked (it is but only later
   in text). Dtto "filters".

   It may be also useful to put there some diagram. In addition
   link to short code example could be added to the first
   occurence for each term.

   The enumeration of components as 'access to memory-mapped files'
   may look better in <ul>...</ul> list.

   Sentence: "Sources, Sinks and their refinements are called
      resources."

   The term is overused in  programming. Maybe "data resources"
   or "data-module" or so would be better.

   Relation between sins/sources/filters and streams should be
   explained immediatelly. Possibly using picture.

   Link to "adapters.html" is broken.

   The links to iterator_facade and Iterators library should
   be local.

__________________________________________________________________
2. tutorial docs:

   The first code example:
      s[z] = (char) (rand() * 256 / RAND_MAX);
   I think this causes overflows way too often.
      s[z] = (char) (rand() % 256);
   is more likely intended even if it
   adds small bias.

   In example:
    in.open(634875);
   also stream example should be added,
   just to show how it is written.

   streambuf_facade and stream_facade should be hyperlinked.
   (I am fan of hyperlinking everything everywhere.)

   In code example line:
       streambuf_facade<vector_sink> out(v); // error!!
   it should be sais if it is compile time error,
   assertion or whether user needs to check it manually.

   I wonder it is is possible to workaround this limitation
   internally in library by const_casts/whatever nasty.

   Sentence:
      "... and modifies it before returning it to the user."
   should be
      "... and modifies or eliminates it before returning it to the user."

   In example:

    filtering_streambuf<input> in;
    in.push(alphabetic_input_filter());
    in.push(cin);

   It should be explained what is "input" and where
   does it come from. Shouldn't it have more expressive
   name e.g. input_filter_base? Maybe the source/sink too.

   Chainig of filters example feels as good place for a picture.

   It is a bit confusing to see push() of both input_filter
   descendant and source descendant. Should be noted,
   maybe with class diagram.

   All examples use variable names 'in' and 'out'. It may be
   confusing for one just skimming through code snippets.
   Maybe different names (e.g. vector_in) could be used.

   It would be also nice to distinguish variables and types
   systematically. E.g. variables having _ suffix.

   filters.html link is broken.

   Generally, tutorial may be sprinkled with few more
   overcommented code snippets. They may be hide-able.
   Other docs won't suffer having them too.

__________________________________________________________________
2. examples docs:

   Link to regex should be local.

   Link "Uncommenting Input Filter" is broken.

__________________________________________________________________
3. presidential_output_filter.hpp: a nitpick, in code:

   if (word == "subliminable")
       return "subliminal";
   else if (word == "nucular")
   ....

   the 'else' parts are not needed.

__________________________________________________________________
4. docs: all <img> elements should have their width/height set in HTML

__________________________________________________________________
5. concepts docs: all concepts here can be also listen in table,
   sorted and with one line info.

   The individual concepts should be added as sub tree into left panel.
   Dtto compression filters, etc.
   Every page should be accessible from left panel.

__________________________________________________________________
6. Installation docs: it should be in moredetail described what is
   BOOST_IO_NO_LIB for and who needs it for what.

   Maybe there should be list of *.cpp files one needs
   to use for zlib/bzip2/..., for those who do not want
   to use autolink.

__________________________________________________________________
7. "Classes by category" docs: each link could have very short
   decription on the left:

    array_resouce (xyz...)
    array_sink    (abc...)

   Similarly classes sorted alphabetically, functions, etc.
   It would help when one is trying quickly locate some info.

__________________________________________________________________
8. All applicable headers should have on top:

#if (defined _MSC_VER) && (_MSC_VER >= 1200)
#  pragma once
#endif

to keep compilation times for VC and Intel C++ down.

__________________________________________________________________
9. Syntax like this was suggested:

   filter_stream out(tee(std::cout) | encode | gzip | file("some file"));

Maybe support for this could be built up on existing framework.

__________________________________________________________________
10. To question:
...
do you prefer the names filter_stream and filter_streambuf to
filtering_stream and filtering_streambuf?
The first. Steve McConnell in his Code Complete 2 quotes
study recommending identifier length 10-16 chars.

__________________________________________________________________
11. A collection of useful stream(buffer)s could be added into library,
    so they are handy to users:

 - /dev/nul like stream (like the one from more_io)

 - tee, tee3, ... etc stream

 - random (the one from code snippet)

 - repeater (some sequence of characters over and over)
   This one could ideally accept syntax used inn Boost.Assign.
   It would allow to easily create complex but predictable
   test data structures.

 - 'a switcher'. Something what sends data into another stream
   but has method switch_to_other_stream(s). E.g. for circular logs.

__________________________________________________________________
12. One could dream about ability to use lambda expressions directly
     for filters.

__________________________________________________________________
13. Possible feature of library (I do not know whether it is
    implementable or implementable efficiently/elegantly):

 - lets have chain of streams (e.f. buffer/compress/write file)

 - someone feeding the head of chain may want to execute an action
   somwhere down in the chain (e.g. switching files).

 - now it would require 'someone' to know all details of chain
   and to have access points to its parts (=> high coupling)

It would be nice to be able to 'send a command' down the stream
and have some part of chain picking it up and executing.
This way sender would not need to know anything about
current chain structure.

If the command wouldn't be recognizable by any part of stream
chain it would become nop.

Similarly one can imagine events going upstream.
If anyone handles them: OK, if not they will be ignored.

(They are different from exceptions that ignoring them
 is default and harmless.)

Examples of downstream commands:
 - switch file (for logging)
 - flush compressor dictionaries

Examples of upstream events:
 - predetermined file size reached
 - current compression ratio info

__________________________________________________________________
14. May it be possible to add "synchronize panels" button into
    left panel? If clicked, it would expand and highlight item
    in left panel for currently opened page in right panel.

__________________________________________________________________
15, cloable.html: it should be explained (+ example) what
    'notification' here means.

__________________________________________________________________
16. The examples section should contain all available examples,
    e.g. compressors. There may be subsection for them.

    Maybe something as:
    + Examples
      + Simple Examples
          ....
      + Compressors Examples
          ....
      + ....

    It is because many people would not go any futher than here
    and would miss the rest.

__________________________________________________________________
EOF

[boost] Re: IOStreams formal review start

Pavel Vozenilek