RE: [boost] Re: Formal review of "Output Formatters" library begins today

Vladimir Prus wrote:
John Torjo wrote:
The FORMAL Review of "Output Formatter" library begins today, Sept 12, 2004.
The primary case is debugging. I want to either output small vectors to some log/dump file (in which case the output will be a short single line), or output some huge structures (vector<Function>, where Function has a lot of data). In the latter case, the output should be multiline, with nice indentation, or I won't understand anything.
You can explicitly state where/how new lines and indentation are going to be within my library by adding them to the open/close/separator formatting (see output-3D.cpp for an example). As for more intelligent indentation, it is hard to know in advance if an object you are writing to the stream should be wrapped or not. This is where feeding it through Jonathan's library (via an indentation filter) would be useful.
A particularly interesting question is how the proposed library overlaps with serialization. When outputting vector<Function> I'd prefer the content of 'Function' to be outputted too, preferably by describing the members with the 'serialize' method.
The default behaviour for an unrecognised type is to output it directly to the stream so it will use the << operator associated with that type. If you want to use a particular method, such as 'serialize', then you will need to write your own format object that calls that function.
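A minimal sketch of that idea, not the library's actual interface: the names default_format, serialize_format, formatted and format_with are hypothetical, and the serialize() signature is assumed. The default format object falls back to operator<<, while a user-written one calls serialize instead.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical default format object: falls back to the type's own operator<<.
struct default_format {
    template <typename T>
    void write(std::ostream& os, const T& value) const { os << value; }
};

// Hypothetical user-written format object: calls a 'serialize' member instead.
struct serialize_format {
    template <typename T>
    void write(std::ostream& os, const T& value) const { value.serialize(os); }
};

// Pairs a value (by reference) with the format object that should render it.
template <typename T, typename Format>
struct formatted {
    const T& value;
    Format format;
};

template <typename T, typename Format>
formatted<T, Format> format_with(const T& value, Format format) {
    formatted<T, Format> f = { value, format };
    return f;
}

template <typename T, typename Format>
std::ostream& operator<<(std::ostream& os, const formatted<T, Format>& f) {
    f.format.write(os, f.value);
    return os;
}

// Example user type that describes itself only through serialize().
struct Function {
    std::string name;
    int arity;
    void serialize(std::ostream& os) const {
        os << "{ name: " << name << ", arity: " << arity << " }";
    }
};

void serialize_example() {
    Function f = { "sqrt", 1 };
    std::ostringstream os;
    os << format_with(f, serialize_format()) << ' '
       << format_with(42, default_format());
    assert(os.str() == "{ name: sqrt, arity: 1 } 42");
}
```

The wrapper only keeps a reference, so it is meant to be used within a single output expression.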
1. What is your evaluation of the design?
There are some things I don't like. First, as many mentioned, the naming is not optimal. For example:
boost::io::formatob(v).format(" | ");
The naming scheme for the various elements has undergone an overhaul. See my other posts for comments. From here on in, I will use the new naming, with:

   namespace io = boost::io;
   namespace fmt = boost::io::format;
or maybe even
("[" + io(v)/"," + "]")
Cool! I shall look into trying to implement something like this.
It's desirable that the library support some multiline output style out of the box, so that I could write:
os << io::multiline(v) << ...
Do you mean something like:

   std::cout << io::object( v ).decorate( "[\n ", "\n]", ",\n " );

resulting in:

   [
    1,
    2,
    3
   ]
2. What is your evaluation of the implementation?
There should be a <boost/outfmt.hpp> header that includes everything else in the library.
ok
Some of the lines are longer than 80 characters (e.g. template header of formatob_t has 111 characters).
ok
Methods defined in the body of the class are implicitly inline, there's no need to put the "inline" in front of them, as it's just one more word to read.
ok
I don't think defining methods in the class is a good idea -- this makes the class interface less obvious.
I agree. I prefer to keep interface and implementation separate, but MS VC (prior to 7.1) will choke on nested templates declared out of line, e.g.:

   template< typename T > struct foo
   {
      template< typename U > void bar();
      template< typename U > void ok(){}
   };

   template< typename T >
   template< typename U > // VC<=7.0 ==> error
   void foo< T >::bar(){}

I also try to be consistent where possible, so don't tend to mix the two.
Why is the formatob_t necessary? It seems to work by delegating everything to the underlying formatter. Can't the 'formatob' just return the appropriate formatter?
It's necessary because the format objects don't define << and it is necessary to keep a reference to the object passed to it for formatting so that it can be accessed when inside the implementation of the << operator.
I might be missing something, but the mechanism for getting the type of formatter from a type to be output seems too complex. First, the type_deducer.hpp file is used, and 'select' computes a 'category'. Then format_deducer.hpp takes the category, and again uses 'select' to obtain the real type of formatter. Why is type_deducer.hpp needed?
I have been working on improving this. There are 3 parts to the type deduction:

[1] identifying the type and mapping it to its generic type (e.g. container, array, etc.). This is done by type_traits.hpp.

[2] working out the formatter type that is needed to render this to the stream. In the review implementation this is done by format_deducer.hpp, using the information gathered by type_deducer.hpp to compute the nested structure (e.g. std::vector< std::pair< char, std::list< int > > >).

[3] creating an instance of the format object. This is also done by format_deducer.hpp in the review version.

In the version I am working on at the moment, I have a deduce_type< int > template that is used to simulate partial specialization (for compilers that don't support it). The int parameter is the format object category and is taken from type_traits.hpp (or your own value if you add a custom type and format object). The deduce_type template contains

   template< typename CharT, typename T > struct type_from{ ... };

that has a format_object typedef and the method:

   static format_object deduce( const T & );

for constructing an instance of that format object. For example:

   template<> struct deduce_type< io::seq_container_type >
   {
      template< typename CharT, typename T > struct type_from
      {
         typedef typename T::value_type value_type;
         typedef typename get_deducer< CharT, value_type >::type value_deducer;
         typedef container_t
                 < CharT, typename value_deducer::format_object > format_object;

         static format_object deduce( const T & )
         {
            return( format_object( value_deducer::deduce( value_type())));
         }
      };
   };
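The integer-category scheme described above can be sketched in a self-contained form. This is an illustration under assumptions, not the library's code: category_of, the category constants and write_object are hypothetical names, the character type parameter is dropped, and the classification step uses ordinary partial specialization for brevity (which the real library avoids for older compilers). Only deduce_type is fully specialized on an int; the nested type_from member template does the per-type work.

```cpp
#include <cassert>
#include <list>
#include <ostream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical category values standing in for those in type_traits.hpp.
const int basic_type = 0;
const int seq_container_type = 1;
const int pair_type = 2;

// Step 1: map a type onto its category.
template <typename T>
struct category_of { static const int value = basic_type; };
template <typename T, typename A>
struct category_of<std::vector<T, A> > { static const int value = seq_container_type; };
template <typename T, typename A>
struct category_of<std::list<T, A> > { static const int value = seq_container_type; };
template <typename A, typename B>
struct category_of<std::pair<A, B> > { static const int value = pair_type; };

// Step 2: only full specializations on the category are needed from here on.
template <int Category> struct deduce_type;

template <typename T>
struct get_deducer {
    typedef typename deduce_type<category_of<T>::value>::template type_from<T> type;
};

template <> struct deduce_type<basic_type> {
    template <typename T> struct type_from {
        static void write(std::ostream& os, const T& v) { os << v; }
    };
};

template <> struct deduce_type<seq_container_type> {
    template <typename T> struct type_from {
        typedef typename get_deducer<typename T::value_type>::type value_deducer;
        static void write(std::ostream& os, const T& c) {
            os << "[ ";
            for (typename T::const_iterator it = c.begin(); it != c.end(); ++it) {
                if (it != c.begin()) os << ", ";
                value_deducer::write(os, *it);   // recurse into the element type
            }
            os << " ]";
        }
    };
};

template <> struct deduce_type<pair_type> {
    template <typename T> struct type_from {
        typedef typename get_deducer<typename T::first_type>::type first_deducer;
        typedef typename get_deducer<typename T::second_type>::type second_deducer;
        static void write(std::ostream& os, const T& p) {
            os << "( ";
            first_deducer::write(os, p.first);
            os << ", ";
            second_deducer::write(os, p.second);
            os << " )";
        }
    };
};

// Step 3: entry point that picks the deducer for T and uses it.
template <typename T>
void write_object(std::ostream& os, const T& value) {
    get_deducer<T>::type::write(os, value);
}

void deduce_example() {
    std::list<int> numbers;
    numbers.push_back(1);
    numbers.push_back(2);
    std::vector<std::pair<char, std::list<int> > > v;
    v.push_back(std::make_pair('a', numbers));
    numbers.push_back(3);
    v.push_back(std::make_pair('b', numbers));

    std::ostringstream os;
    write_object(os, v);
    assert(os.str() == "[ ( a, [ 1, 2 ] ), ( b, [ 1, 2, 3 ] ) ]");
}
```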
3. What is your evaluation of the documentation?
The documentation leaves much to be desired. I'll walk through some of the aspects (indented text is quoted from the docs).
I will attempt to address the issues you have raised.
"providing an extensible framework that sits"
How is it extensible? I see only one section about extending, and it's just one paragraph long.
I really need to add more documentation regarding this. The idea is that you can:

* add your own decorators (delimiters in the review docs) to support more complex types (e.g. trees and graphs).

* add your own format objects to control the way the data is rendered, for example adding a format object to call the 'serialize' method on a class.

* create state objects used by fmt::state to perform custom formatting (e.g. in the john-torjo.cpp example, a state object is used to render the position: { [0] Vladimir, [1] Terje, [2] John, [3] Dave, [4] Rene } )

* register a type with the type deduction system so it can be implicitly deduced, e.g.:

   BOOST_IO_CLASSIFY_TYPE( 2, boost::array, io::seq_container_type );
   // ...
   boost::array< int, 5 > a;
   std::cout << io::object( a ); // output: [ 1, 2, 3, 4, 5 ]
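To make the state-object point more concrete, here is a rough sketch of how a stateful writer could produce the position-prefixed output quoted above. It does not use fmt::state; position_state and write_with_state are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical state object: remembers how many elements it has written.
class position_state {
public:
    position_state() : index_(0) {}

    template <typename T>
    void write(std::ostream& os, const T& value) {
        os << '[' << index_++ << "] " << value;
    }

private:
    std::size_t index_;
};

// Writes a container, delegating each element to the (copied) state object.
template <typename Container, typename State>
void write_with_state(std::ostream& os, const Container& c, State state) {
    os << "{ ";
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        if (it != c.begin()) os << ", ";
        state.write(os, *it);
    }
    os << " }";
}

void position_example() {
    std::vector<std::string> names;
    names.push_back("Vladimir");
    names.push_back("Terje");
    names.push_back("John");
    names.push_back("Dave");
    names.push_back("Rene");

    std::ostringstream os;
    write_with_state(os, names, position_state());
    assert(os.str() == "{ [0] Vladimir, [1] Terje, [2] John, [3] Dave, [4] Rene }");
}
```

Taking the state by value means every call starts counting from zero again.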
boost::io::formatobex< DelimeterType >( const T & ob );
This will format ob according to it's underlying type.
How is this different from using the 'formatob' function? What's "underlying type"?
Underlying type is: typeid(ob).name(). That is, std::vector< int > will be formatted differently to std::pair< char, std::list< std::complex< float > > >.
I'd also suggest that formatobex use the "basic" prefix to indicate it's templated on the character type. E.g. stlio and basic_stlio, or something like that.
That is a good idea. Especially considering the revisions I am in the process of making.
If you need to specify a range or sub-range, boost::io::formatob will not recognise it unless it is a container.
boost::io::range( ForwardIterator first, ForwardIterator last );
This is more of a language problem. Range, by definition, is a pair (but not std::pair) of iterators. It's never a container.
What I meant is that:

   std::cout << io::object( vec ); // [1] ok - know how to process containers
   std::cout << io::object( a, a + 5 ); // [2] oops! don't know how to handle ranges
   std::cout << io::object( io::range( a, a + 5 )); // [3] ok - range is now explicit

The reason I don't allow variant 2 is because that would mess up argument resolution.
For a reference, the code examples are not necessary. For a tutorial, you don't need to list every overload, just mention that there are others.
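A small sketch of why an explicit range wrapper keeps the overload set simple. The names range_t, make_range, separator_format and write_object are hypothetical stand-ins, not the library's io::range or io::object. Because the wrapper exposes begin() and end(), it goes through the same one-argument path as a container, and the two-argument form stays reserved for (value, format object).

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for io::range: names a [first, last) pair explicitly.
template <typename Iterator>
class range_t {
public:
    range_t(Iterator first, Iterator last) : first_(first), last_(last) {}
    Iterator begin() const { return first_; }
    Iterator end() const { return last_; }
private:
    Iterator first_, last_;
};

template <typename Iterator>
range_t<Iterator> make_range(Iterator first, Iterator last) {
    return range_t<Iterator>(first, last);
}

// Hypothetical separator-only format object.
struct separator_format {
    const char* separator;
};

template <typename Iterator>
void write_elements(std::ostream& os, Iterator first, Iterator last,
                    const char* separator) {
    os << "[ ";
    for (Iterator it = first; it != last; ++it) {
        if (it != first) os << separator;
        os << *it;
    }
    os << " ]";
}

// Two-argument form: value plus format object. An extra (first, last)
// overload would have to compete with this one for two-argument calls.
template <typename Sequence, typename Format>
void write_object(std::ostream& os, const Sequence& seq, const Format& format) {
    write_elements(os, seq.begin(), seq.end(), format.separator);
}

// One-argument form: anything with begin()/end(), using the default separator.
template <typename Sequence>
void write_object(std::ostream& os, const Sequence& seq) {
    separator_format f = { ", " };
    write_object(os, seq, f);
}

void range_example() {
    int a[5] = { 1, 2, 3, 4, 5 };
    std::vector<int> vec(a, a + 5);

    std::ostringstream whole;
    write_object(whole, vec);                        // container
    assert(whole.str() == "[ 1, 2, 3, 4, 5 ]");

    std::ostringstream part;
    separator_format bar = { " | " };
    write_object(part, make_range(a + 1, a + 4), bar); // explicit sub-range
    assert(part.str() == "[ 2 | 3 | 4 ]");
}
```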
noted.
boost::io::openclose_formatter is a class that allows the user to change and access the format used for open and close delimiters.
Here we're definitely in the reference docs already, while I did not get an overall picture yet. Then, what's "change and access the format"? If the format can be changed, it is stored somewhere. Where?
It is stored inside the [openclose_]formatter object (wrapper_decorators and sequence_decorators in the new version). You can set the delimiters (decorators in the new version) using the various 'format' functions ('decorate') and get their values via open(), close() and separator() (used when implementing your own format objects).
The code block before this comment defines two classes, openclose_formatter_t and openclose_formatter. Is this a typo, or do you really have two classes?
This is not a typo. The _t variants take a template parameter that names the return type of the 'format' ('decorate') functions. This is so that:

   std::cout << io::object( vec, fmt::container().decorate( " | " ));

works properly.
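One common way to get this behaviour is to let the base class take the derived type as a template parameter, so that decorate() can return the most derived type and further members remain reachable after the call. The sketch below assumes that pattern; decorators_t, decorators, container_format and render are hypothetical names, not the library's classes.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical "_t" base: Derived is the type decorate() should return.
template <typename Derived>
class decorators_t {
public:
    decorators_t() : open_("[ "), close_(" ]"), separator_(", ") {}

    Derived& decorate(const std::string& separator) {
        separator_ = separator;
        return static_cast<Derived&>(*this);
    }

    Derived& decorate(const std::string& open, const std::string& close,
                      const std::string& separator) {
        open_ = open;
        close_ = close;
        separator_ = separator;
        return static_cast<Derived&>(*this);
    }

    const std::string& open() const { return open_; }
    const std::string& close() const { return close_; }
    const std::string& separator() const { return separator_; }

private:
    std::string open_, close_, separator_;
};

// The plain variant simply names itself as the return type.
class decorators : public decorators_t<decorators> {};

// A format object reuses the same base but keeps its own type after decorate().
class container_format : public decorators_t<container_format> {
public:
    template <typename Container>
    std::string render(const Container& c) const {
        std::ostringstream os;
        os << open();
        for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
            if (it != c.begin()) os << separator();
            os << *it;
        }
        os << close();
        return os.str();
    }
};

void decorate_example() {
    std::vector<int> vec;
    vec.push_back(1);
    vec.push_back(2);
    vec.push_back(3);

    // decorate() yields container_format&, so render() is still available.
    assert(container_format().decorate(" | ").render(vec) == "[ 1 | 2 | 3 ]");
    assert(container_format().decorate("<", ">", ",").render(vec) == "<1,2,3>");
}
```

If decorate() returned a plain base type instead, the chained render() call would not compile, because the base knows nothing about rendering.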
Do you mean that every FormatObject class should have this signature? Then, the phrase about "FormatObject::write(os, fo.ob)" is not necessary.
ok.

Regards,
Reece

Reece Dunn wrote:
The primary case is debugging. I want to either output small vectors to some log/dump file (in which case the output will be a short single line), or output some huge structures (vector<Function>, where Function has a lot of data). In the latter case, the output should be multiline, with nice indentation, or I won't understand anything.
You can explicitly state where/how new lines and indentation are going to be within my library by adding them to the open/close/separator formatting (see output-3D.cpp for an example).
Yes, but I'd rather write os << v and have the right formatting, even if 'v' is a vector of something, which has vector members, and the vector element type has another vector member, and so on...
As for more intelligent indentation, it is hard to know in advance if an object you are writing to the stream should be wrapped or not. This is where feeding it through Jonathan's library (via an indentation filter) would be useful.
The wrapping is indeed a separate issue. What I was asking about is support for indentation in your library. For example, output such as

   - foo: 1
   - bar: 2
   - biz:
     - [1, 2, 3 ]
   - giz:
       p1: 1
       p2:
         - [1, 2, 3]
A particularly interesting question is how the proposed library overlaps with serialization. When outputting vector<Function> I'd prefer the content of 'Function' to be outputted too, preferably by describing the members with the 'serialize' method.
The default behaviour for an unrecognised type is to output it directly to the stream so it will use the << operator associated with that type. If you want to use a particular method, such as 'serialize', then you will need to write your own format object that calls that function.
The problem is that I need support from your library to get the right indentation.
It's desirable that the library support some multiline output style out of the box, so that I could write:
os << io::multiline(v) << ...
Do you mean something like:
std::cout << io::object( v ).decorate( "[\n ", "\n]", ",\n " );
resulting in:
   [
    1,
    2,
    3
   ]
Yes, except that I'd want "multiline" to act recursively, so if I print a vector of maps, the result will look like:

   [
     {
       a: 1
       b: 2
     }
     {
       c: 3
     }
   ]

or using a different syntax.
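A rough sketch of what recursive multiline output could look like, independent of the reviewed library. The function name write_multiline, the two-space indent and the exact layout are assumptions made for illustration. Each container overload passes the current depth down, so nested containers indent themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Leaf case: anything that is not a recognised container uses operator<<.
template <typename T>
void write_multiline(std::ostream& os, const T& value, std::size_t /*depth*/) {
    os << value;
}

// Forward declarations so the containers can nest in either order.
template <typename T>
void write_multiline(std::ostream& os, const std::vector<T>& v, std::size_t depth);
template <typename K, typename V>
void write_multiline(std::ostream& os, const std::map<K, V>& m, std::size_t depth);

inline void indent(std::ostream& os, std::size_t depth) {
    os << std::string(2 * depth, ' ');
}

template <typename T>
void write_multiline(std::ostream& os, const std::vector<T>& v, std::size_t depth) {
    os << "[\n";
    for (typename std::vector<T>::const_iterator it = v.begin(); it != v.end(); ++it) {
        indent(os, depth + 1);
        write_multiline(os, *it, depth + 1);   // elements are one level deeper
        os << '\n';
    }
    indent(os, depth);
    os << ']';
}

template <typename K, typename V>
void write_multiline(std::ostream& os, const std::map<K, V>& m, std::size_t depth) {
    os << "{\n";
    for (typename std::map<K, V>::const_iterator it = m.begin(); it != m.end(); ++it) {
        indent(os, depth + 1);
        os << it->first << ": ";
        write_multiline(os, it->second, depth + 1);
        os << '\n';
    }
    indent(os, depth);
    os << '}';
}

void multiline_example() {
    std::map<std::string, int> first, second;
    first["a"] = 1;
    first["b"] = 2;
    second["c"] = 3;
    std::vector<std::map<std::string, int> > v;
    v.push_back(first);
    v.push_back(second);

    std::ostringstream os;
    write_multiline(os, v, 0);
    assert(os.str() ==
           "[\n"
           "  {\n"
           "    a: 1\n"
           "    b: 2\n"
           "  }\n"
           "  {\n"
           "    c: 3\n"
           "  }\n"
           "]");
}
```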
I don't think defining methods in the class is a good idea -- this makes the class interface less obvious.
I agree. I prefer to keep interface and implementation separate, but MS VC (prior to 7.1) will choke on nested templates declared out of line, e.g.:

   template< typename T > struct foo
   {
      template< typename U > void bar();
      template< typename U > void ok(){}
   };

   template< typename T >
   template< typename U > // VC<=7.0 ==> error
   void foo< T >::bar(){}
Yes, that's a bad thing.
I also try to be consistent where possible, so don't tend to mix the two.
I'd suggest that you at least try to make top-level functions non-inline. That would reduce the code size for applications which use your library -- I've had this experience both with program_options and function. For a single library, it's probably not a huge difference, but if all libraries used inline less liberally, the code size of C++ apps would be lower.
Why is the formatob_t necessary? It seems to work by delegating everything to the underlying formatter. Can't the 'formatob' just return the appropriate formatter?
It's necessary because the format objects don't define << and it is necessary to keep a reference to the object passed to it for formatting so that it can be accessed when inside the implementation of the << operator.
Understood.
I might be missing something, but the mechanism for getting the type of formatter from a type to be output seems too complex. First, the type_deducer.hpp file is used, and 'select' computes a 'category'. Then format_deducer.hpp takes the category, and again uses 'select' to obtain the real type of formatter. Why is type_deducer.hpp needed?
...
   template<> struct deduce_type< io::seq_container_type >
   {
      template< typename CharT, typename T > struct type_from
      {
         typedef typename T::value_type value_type;
         typedef typename get_deducer< CharT, value_type >::type value_deducer;
         typedef container_t
                 < CharT, typename value_deducer::format_object > format_object;

         static format_object deduce( const T & )
         {
            return( format_object( value_deducer::deduce( value_type())));

Does this expect T::value_type to be DefaultConstructible? IIRC, containers only require that objects be CopyConstructible and Assignable.
}
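The concern can be shown with a type that has no default constructor: a std::vector of it is fine, but an expression like value_type() would not compile. One possible way around it, sketched here with hypothetical names (type_tag, element_format, deduce_for_elements), is to deduce from the element type alone via an empty tag, so no element object is ever constructed.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// A perfectly usable element type that cannot be default-constructed.
struct NoDefault {
    explicit NoDefault(int v) : value(v) {}
    int value;
};

static_assert(!std::is_default_constructible<NoDefault>::value,
              "NoDefault() does not compile, so value_type() would not either");

// Hypothetical tag: carries a type without needing a value of that type.
template <typename T> struct type_tag {};

// Hypothetical format object, reduced to a pair of delimiters.
struct element_format {
    const char* open;
    const char* close;
};

// Deduction driven by the type only.
template <typename T>
element_format deduce(type_tag<T>) {
    element_format f = { "<", ">" };
    return f;
}

template <typename Container>
element_format deduce_for_elements(const Container&) {
    return deduce(type_tag<typename Container::value_type>());
}

void deduce_without_default_ctor_example() {
    std::vector<NoDefault> v(3, NoDefault(7));   // legal: only copies are made
    element_format f = deduce_for_elements(v);
    assert(std::strcmp(f.open, "<") == 0);
    assert(v.size() == 3 && v[0].value == 7);
}
```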
Still, I don't understand the need for an integer type category. Can't you have 'deduce_sequence_type', 'deduce_pair_type' and so on, and use 'select' to map a type directly into those classes, not into an integer?
"providing an extensible framework that sits"
How is it extensible? I see only one section about extending, and it's just one paragraph long.
I really need to add more documentation regarding this. The idea is that you can:
* add your own decorators (delimiters in the review docs) to support more complex types (e.g. trees and graphs).
That's interesting. Can you explain?
boost::io::formatobex< DelimeterType >( const T & ob );
This will format ob according to it's underlying type.
How is this different from using the 'formatob' function? What's "underlying type"?
Underlying type is: typeid(ob).name().
Sorry, that's the name of the type ;-) Maybe you just mean "according to its type"? (Also note "its", not "it's".) "Underlying" implies there's some other type besides 'T'.
This is more of a language problem. Range, by definition, is a pair (but not std::pair) of iterators. It's never a container.
What I meant is that:
   std::cout << io::object( vec ); // [1] ok - know how to process containers
   std::cout << io::object( a, a + 5 ); // [2] oops! don't know how to handle ranges
   std::cout << io::object( io::range( a, a + 5 )); // [3] ok - range is now explicit
The reason I don't allow variant 2 is because that would mess up argument resolution.
Maybe, you can use something like: "The 'object' function can only output a container, so to output a range of iterators you need to make a container from it using io::range".
boost::io::openclose_formatter is a class that allows the user to change and access the format used for open and close delimiters.
Here we're definitely in the reference docs already, while I did not get an overall picture yet. Then, what's "change and access the format"? If the format can be changed, it is stored somewhere. Where?
It is stored inside the [openclose_]formatter object (wrapper_decorators and sequence_decorators in the new version). You can set the delimiters (decorators in the new version) using the various 'format' functions ('decorate') and get their values via open(), close() and separator() (used when implementing your own format objects).
The original sentence sounds like the format is something external to openclose_formatter, and the class is just a proxy which lets you set it.
The code block before this comment defines two classes, openclose_formatter_t and openclose_formatter. Is this a typo, or do you really have two classes?
This is not a typo. The _t variants take a template parameter that names the return type of the 'format' ('decorate') functions. This is so that:

   std::cout << io::object( vec, fmt::container().decorate( " | " ));

works properly.
Why doesn't 'decorate' return the same type as 'fmt::container()'?

- Volodya
participants (2):
- Reece Dunn
- Vladimir Prus