Formal review of "Output Formatters" library begins today

Dear boosters,

The FORMAL Review of the "Output Formatters" library begins today, Sept 12, 2004.

Author: Reece Dunn

Description: The outfmt library provides a means to write and read containers, arrays, and ranges (pairs of iterators) to and from STL streams. It also provides a means to customize formatting as you write to or read from a stream:

    std::list<int> l; // { 1, 4, 10 }
    std::cout << formatob(l); // output: [ 1, 4, 10 ]
    std::cout << formatob(l).format("<",">"); // output: <1, 4, 10>

Download: You can get it from the boost sandbox:
http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/libs/outfm...
http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/boost/outf...

As a convenience (only for the period of the review), you can also get it from http://www.torjo.com/code/outfmt.zip (153Kb)

Review process: Your comments may be brief or lengthy, but basically the Review Manager needs your evaluation of the library. If you identify problems along the way, please note if they are minor, serious, or showstoppers.

When doing the formal review, please answer the following questions:

1. What is your evaluation of the design?
2. What is your evaluation of the implementation?
3. What is your evaluation of the documentation?
4. What is your evaluation of the potential usefulness of the library?
5. Did you try to use the library? With what compiler? Did you have any problems?
6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
7. Are you knowledgeable about the problem domain?

And most important,
8. Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.

More on the formal review process: http://www.boost.org/more/formal_review_process.htm

The Review Manager is: John Torjo john.lists@torjo.com

Contributing editor, C/C++ Users Journal -- "Win32 GUI Generics" -- generics & GUI do mix, after all -- http://www.torjo.com/win32gui/

Hi Reece,

Here are some general comments:
----------------------------------

0. I think the library should be put in the directory "boost/output_format". I hate cryptic names.

1. In this example

    std::cout << boost::io::formatob( vec ); // output: [ 1, 2, 3 ]
    std::cout << boost::io::formatob( vec ).format( "{ ", " }" ); // output: { 1 : 2 : 3 }

why does the change of braces change "," to ":"?

2. In this example

    int a[] = { 5, 4, 3, 2, 1 };
    std::cout << boost::io::formatob( boost::io::range( a, a + 5 ));
    std::cin >> boost::io::formatob( boost::io::range( a, a + 5 ));
    // output: [ 5, 4, 3, 2, 1 ]

it might be good to support a range version with the possibility to
  1. throw if there are too few elements
  2. default-initialize if there are too few elements

3. Cryptic names: formatob.hpp and formatobex; these should be format_object and whatever the latter really means.

4. Given

    std::cout << boost::io::formatobex< std::string >( v );

maybe std::string should be a default argument?

5. A lot of the boost::io::range() overloads can be replaced with one taking a Single Pass Range.

6. STL IO: It is not obvious how insertion is done. With push_back, or by overwriting elements?

7. Why is it necessary to write

    boost::io::formatob( boost::io::range( a, a + 5 ));

Could we not just say

    boost::io::range( a, a + 5 );

8. It should be possible to say

    vector<int> v = ...;
    cout << v;

without the format_ob() if I include stl.hpp, i.e., overloads for all standard types.

9. I would prefer basicfmt_t to be basic_format_type, or perhaps basic_fmt_type if the "fmt" shorthand is used consistently, etc. That is, all the cryptic concatenation should be removed.

And now my review comments.

| When doing the formal review, please answer the following questions:
|
| 1. What is your evaluation of the design?

Seems OK. I'm a little irritated about the overly cryptic names, concatenation, and _t types.

| 2. What is your evaluation of the implementation?

Haven't looked.

| 3. What is your evaluation of the documentation?

Good, although I miss return types and template parameters, and some synopses giving a better overview. For example, what does

    boost::io::formatobex< DelimeterType >( const T & ob );

return? Maybe it doesn't matter, but then it should say

    <i>Implementation-defined</i> boost::io::formatobex< DelimeterType >( const T & ob );

etc.

| 4. What is your evaluation of the potential usefulness of the library?

Quite useful.

| 5. Did you try to use the library?
| With what compiler? Did you have any problems?

No.

| 6. How much effort did you put into your evaluation?
| A glance? A quick reading? In-depth study?

A quick glance.

| 7. Are you knowledgeable about the problem domain?

No.

| And most important,
| 8. Do you think the library should be accepted as a Boost library?
| Be sure to say this explicitly so that your other comments don't obscure
| your overall opinion.

Yes. I think all issues can be dealt with quite easily post-review.

br

Thorsten

Anybody have a hint on how to build the docs for this (or be willing to just build them and put them up somewhere)? Thanks, -t On Sep 12, 2004, at 2:41 PM, John Torjo wrote:
Dear boosters,
The FORMAL Review of "Output Formatter" library begins today, Sept 12, 2004.
Author: Reece Dunn
Description: The outfmt library provides means to write and read containers/ arrays/ ranges (pairs of iterators) to/from STL streams. It also provides means to customize formatting as your write to/read from a stream: std::list<int> l; // { 1, 4, 10 } std::cout << formatob(l); // output: [ 1, 4, 10 ] std::cout << formatob(l).format("<",">"); // output: <1, 4, 10>
Download: You can get it from the boost sandbox: http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/ libs/outfmt/docs/ http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/ boost/outfmt/
As a convenience (only for the period of the review), you can also get it from http://www.torjo.com/code/outfmt.zip (153Kb)
Review process: Your comments may be brief or lengthy, but basically the Review Manager needs your evaluation of the library. If you identify problems along the way, please note if they are minor, serious, or showstoppers.
When doing the formal review, please answer the following questions:
1. What is your evaluation of the design? 2. What is your evaluation of the implementation? 3. What is your evaluation of the documentation? 4. What is your evaluation of the potential usefulness of the library? 5. Did you try to use the library? With what compiler? Did you have any problems? 6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? 7. Are you knowledgeable about the problem domain?
And most important, 8. Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
More on the formal review process: http://www.boost.org/more/formal_review_process.htm
The Review Manager is: John Torjo john.lists@torjo.com
Contributing editor, C/C++ Users Journal -- "Win32 GUI Generics" -- generics & GUI do mix, after all -- http://www.torjo.com/win32gui/

John Torjo <john.lists@torjo.com> writes:
When doing the formal review, please answer the following questions:
1. What is your evaluation of the design?
It seems to introduce a great deal of syntactic noise:

    std::cout << boost::io::formatob( vec ); // output: [ 1, 2, 3 ]
    std::cout << boost::io::formatob( vec ).format( "{ ", " }" ); // output: { 1 : 2 : 3 }

I'd like to (at least) see rationale on why something cleaner wasn't chosen. For example:

    namespace io = boost::io;
    std::cout << io::sequence( vec ); // output: [ 1, 2, 3 ]
    std::cout << io::sequence( vec, "{ ", " }" ); // output: { 1 : 2 : 3 }

In case it's not obvious, I don't like the name "formatob." Abbrevs should generally be avoided, and frankly I don't know what 'ob' stands for. Also "format" doesn't seem to add much semantically.

I'm generally not a great fan of statefulness, but this also seems like a reasonable thing to want:

    namespace io = boost::io;
    std::cout << io::sequence_delimiters("{ ", " }");
    std::cout << io::sequence( vec ); // output: { 1, 2, 3 }

Library organization seems problematic. On one hand, I'd like to see the io for individual boost types like octonion distributed across their respective libraries (e.g. in boost/math/io.hpp). On the other hand that will introduce an "apparent library dependency" that may not be needed by some. The "apparent dependency" of this library on all the others isn't very comforting, either. Perhaps some refactoring is appropriate.

Another point: I'd like to see functionality for pretty-printing sequences. It's very common to have sequences that are too long to represent comfortably on one line. There are several strategies for dealing with that, and it seems to me that to be *really* indispensable, this library should help in that department. It might also need ways to prevent infinite recursion in self-referential sequences (consider sequences that embed ranges).

Where's I/O for tuples?
2. What is your evaluation of the implementation?
Haven't looked.
3. What is your evaluation of the documentation?
What little I saw of the tutorial documentation looked readable and comprehensible. The lack of namespace aliases in the examples doesn't help the library's case. I'd like to see more rationales. AFAICT the docs lack any sort of formalized reference guide, showing interface summaries, what headers to include, requirements, etc. IMO it's unacceptable without that component. Did I miss something? The use of "implementation defined" in the documentation is incorrect. To be correct, the Boost implementation would have to define whatever it is. The appropriate term is "unspecified."
4. What is your evaluation of the potential usefulness of the library?
Sequence/STL printing: For quickly throwing together "script-like" programs and diagnostic purposes, very high. I'm less sure how useful it will be in hardcore application/library programming. I/O for existing Boost types such as rational<>: IMO the fact that we didn't have I/O for these is sort of embarrassing, but at the same time it's unclear to me that not having inserters/extractors was a burden for anyone. I can't recall a single post asking where these things were.
5. Did you try to use the library?
Nope.
6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Somewhere between a glance and a quick reading. I admit that my reading was quite cursory and my comments should be judged accordingly. I might do a more in-depth followup if the concerns I've raised are addressed during the review process.
7. Are you knowledgeable about the problem domain?
I guess a little. It hasn't been uncommon for me to want this sort of capability.
And most important, 8. Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
The lack of reference docs is a showstopper for me. Aside from that, I guess I'm not sure yet that this library solves enough/important problems to be worth accepting in its current form. I would want to be very convinced that the design was as slick and beautiful as possible, and I'm not there yet. So I'm provisionally at "No." -- Dave Abrahams Boost Consulting http://www.boost-consulting.com
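The stateful variant suggested in the design comments above (set the delimiters on the stream once, then print sequences) can be sketched with the standard per-stream storage that `std::ios_base::xalloc` and `pword` provide. This is only an illustration under assumptions: the `sketch` namespace and everything in it are hypothetical, `sequence` and `sequence_delimiters` merely borrow the names proposed in the review, and none of it is outfmt's API. The delimiter strings are stored as raw pointers, so they must outlive the stream (string literals do).

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace; not part of outfmt

// One pword slot per delimiter, allocated once and shared by all streams.
inline int open_index()  { static const int i = std::ios_base::xalloc(); return i; }
inline int close_index() { static const int i = std::ios_base::xalloc(); return i; }

// Manipulator object: carries the delimiters until it is streamed.
struct sequence_delimiters {
    const char* open;
    const char* close;
    sequence_delimiters(const char* o, const char* c) : open(o), close(c) {}
};

// Streaming the manipulator stores the pointers in the stream's pword slots.
inline std::ostream& operator<<(std::ostream& os, const sequence_delimiters& d)
{
    os.pword(open_index())  = const_cast<char*>(d.open);
    os.pword(close_index()) = const_cast<char*>(d.close);
    return os;
}

// Proxy that refers to the container to be printed.
template <class Container>
struct sequence_t {
    const Container& c;
    explicit sequence_t(const Container& c_) : c(c_) {}
};

template <class Container>
sequence_t<Container> sequence(const Container& c)
{
    return sequence_t<Container>(c);
}

template <class Container>
std::ostream& operator<<(std::ostream& os, const sequence_t<Container>& s)
{
    // pword slots start out null, so fall back to "[ " and " ]"
    // when no delimiters were set on this stream.
    const char* open  = static_cast<const char*>(os.pword(open_index()));
    const char* close = static_cast<const char*>(os.pword(close_index()));

    os << (open ? open : "[ ");
    typename Container::const_iterator it = s.c.begin(), end = s.c.end();
    if (it != end) {
        os << *it;
        for (++it; it != end; ++it)
            os << ", " << *it;
    }
    return os << (close ? close : " ]");
}

} // namespace sketch
```

Because the state lives in the stream, the delimiters stay in effect for every later `sequence` written to that stream, which is exactly the statefulness the review is wary of.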

"John Torjo" <john.lists@torjo.com> wrote in message news:414443E0.60008@torjo.com...
Dear boosters,
The FORMAL Review of "Output Formatter" library begins today, Sept 12, 2004.
1. What is your evaluation of the design?
I find the naming conventions misleading. 'outfmt' but it does io? What is an '...ob' and '...obex'? This effort at reducing verbosity is then countered, as mentioned by Dave Abrahams, by the "syntactic noise".

I do like the delimiter traits objects approach to dealing with leader/trailer existence issues, although I have not found passing "" (empty string) as the leader/trailer to be a performance bottleneck, since they are only written once per range.

I like the range approach, but isn't a range library coming? Will io::range be entirely compatible?

My home-grown range_streamer is often used with filter and transform iterators. It would be nice to use a shorter syntax like:

    os << range_stream( aC, lFilterFnc, "\n=\n{ ", "\n, ", "\n}" );

rather than

    s << range_stream( boost::make_filter_iterator( lFilterFnc, aC.begin(), aC.end() )
                     , boost::make_filter_iterator( lFilterFnc, aC.end (), aC.end() )
                     , "\n= { "
                     , "\n , "
                     , "\n }" );
2. What is your evaluation of the implementation?
The main motivation I had for developing a similar output facility was dealing with the dangling separator when using std::copy and ostream_iterator in a consistent/performant fashion. list.hpp does:

    os << open();
    while( first != last )
    {
        fo.write( os, *first );
        if( ++first != last ) os << separator();
    }
    return( os << close());

which does two comparisons for each element in [first,last). This can be rearranged to do a single comparison, as shown below.

    os << open();
    if( first != last )
    {
        fo.write( os, *first );
        for( ++first ; first != last ; ++first )
        {
            os << separator();
            fo.write( os, *first );
        }
    }
    return( os << close());

This is the only code recognizably similar to that which I've been using (which I'll attach for comparison purposes at the end of this review).
3. What is your evaluation of the documentation?
After reading Jonathon's IOStream documentation, my expectations have been raised. Some of the problems are due to the misleading names used in the implementation itself. Almost everything is a format_this or format_that with format methods... Perhaps an additional namespace level would alleviate some of this. Missing are Motivation and Rationale sections. When/Why would someone use this library versus ostream_iterator/Tokenizer/Spirit/...
4. What is your evaluation of the potential usefulness of the library?
At least on the output side, I've found this functionality very useful. It replaced a lot of legacy repeated code within my company where everyone had their own way of dealing with the issue.
5. Did you try to use the library? With what compiler? Did you have any problems?
No.
6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
An hour with the documentation and looking at some of the code.
7. Are you knowledgeable about the problem domain?
I've implemented and extensively use a simplified version that does output only given a pair of iterators and delimiters.
8. Do you think the library should be accepted as a Boost library?
Not in current state.

Jeff Flinn

==========

    #if !defined(RangeStreamerHeaderIncluded)
    #define RangeStreamerHeaderIncluded

    template< class tItr, class tDBeg, class tDInner, class tDEnd >
    class range_streamer
    {
        tItr    mBeg;
        tItr    mEnd;
        tDBeg   mDBeg;
        tDInner mDInner;
        tDEnd   mDEnd;

    public:
        range_streamer( tItr aBeg, tItr aEnd, tDBeg aDBeg, tDInner aDInner, tDEnd aDEnd )
        : mBeg   ( aBeg )
        , mEnd   ( aEnd )
        , mDBeg  ( aDBeg )
        , mDInner( aDInner )
        , mDEnd  ( aDEnd )
        {}

        // Writes nothing at all (not even the delimiters) for an empty range.
        template< class tStream >
        void OutputTo( tStream& s )const
        {
            if( mBeg != mEnd )
            {
                tItr lItr = mBeg;
                s << mDBeg << *lItr;
                for( ++lItr ; lItr != mEnd ; ++lItr )
                {
                    s << mDInner << *lItr;
                }
                s << mDEnd;
            }
        }
    };

    template< class tItr, class tDBeg, class tDInner, class tDEnd >
    inline range_streamer<tItr,tDBeg,tDInner,tDEnd>
    range_stream( tItr aBeg, tItr aEnd, tDBeg aDBeg, tDInner aDInner, tDEnd aDEnd )
    {
        return range_streamer<tItr,tDBeg,tDInner,tDEnd>( aBeg, aEnd, aDBeg, aDInner, aDEnd );
    }

    template< class tStream, class tItr, class tDBeg, class tDInner, class tDEnd >
    inline tStream& operator<<( tStream& s, const range_streamer<tItr,tDBeg,tDInner,tDEnd>& aRS )
    {
        aRS.OutputTo(s);
        return s;
    }

    #endif //RangeStreamerHeaderIncluded
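The single-comparison loop described under question 2 of this review can be tried in isolation. The sketch below is a standalone illustration, not code from outfmt or from the attachment above: `write_delimited` and `delimited_string` are hypothetical helpers, and the default delimiters simply mirror the "[ 1, 4, 10 ]" style shown in the review announcement. Unlike range_streamer, it follows the rearranged list.hpp loop and still writes the opening and closing delimiters for an empty range.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: writes [first, last) between `open` and `close`,
// with `sep` between adjacent elements and no dangling separator.
template <class Iter>
std::ostream& write_delimited(std::ostream& os, Iter first, Iter last,
                              const std::string& open  = "[ ",
                              const std::string& sep   = ", ",
                              const std::string& close = " ]")
{
    os << open;
    if (first != last) {
        os << *first;                    // first element: nothing precedes it
        for (++first; first != last; ++first)
            os << sep << *first;         // later elements: separator, then value
    }
    return os << close;
}

// Convenience wrapper for experiments: returns the formatted text.
template <class Container>
std::string delimited_string(const Container& c)
{
    std::ostringstream os;
    write_delimited(os, c.begin(), c.end());
    return os.str();
}
```

The loop body compares `first != last` once per element, because the separator is always written before an element rather than conditionally after it.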

"John Torjo" wrote:
The FORMAL Review of "Output Formatter" library begins today,
I briefly looked at the library and wrote down a few notes. I am especially interested in point [4].

1. The name of the library would be better changed. Hearing it for the first time, I suspected something like a variant of std::endl.

2. The first part of the documentation should state the purpose of the library. I guess in this case it is debugging and testing support, in 100% of cases.

3. The name 'formatob' is not very good. "naryfmt" etc. is horrible.

4. There seems to be significant overlap with Boost.Serialization (I suspect outfmt doesn't handle cyclic structures). It may or may not be possible to implement outfmt functionality with a special archive. Was this considered?

5. To support debugging I would like to see many more features:
   - ability to wrap long lines in a pretty way
   - ability to indent outputted data when they are part of a complex structure
   - ability to generate HTML as output (+ ability to fold/unfold big data structures using Javascript)
   - helper function generate_html_output_and_show_it_in_browser()
   - ability to diff two sets of outputted data where applicable and produce some easy-to-read report, possibly with a helper function: generate_html_output_and_show_it_in_browser_together_with_diff_of_previous_data_from_here()
   - some time ago John Torjo designed the SMART_ASSERT library. If this makes it into Boost, outfmt would be a perfect complement for it.

/Pavel

2. The first part of documentation should be what is purpose of the library.
I guess in this case it is debugging and testing support, 100% cases.
Introduction: "The Standard Template Library (STL) provides a mechanism for storing collections of objects, but does not provide I/O support via streams. [...] The outfmt library is an attempt to solve the problems outlined above, providing an extensible framework [...]" So, what I gather is that it provides customizable I/O for containers/ranges. That said, I do not think its purpose is debugging/testing support in 100% of cases. It definitely does support that, but you can do more, like pretty printing to file(s), std::cout, etc. I would even go further and say that you can produce pretty reports to show in a GUI (for example, print a report of the files that were copied from place A to B, with any errors, in an Edit Box).
4. There seems to be significant overlap with Boost.Serialization (I suspect outfm doesn't handle cyclic structures).
I would not say so. I see serialization as filling a quite different gap. For instance, for what you say above - testing/debugging, I would certainly not use boost.serialization. For testing/debugging, I will want pretty printing.
5. To support debugging I would like to see many more features:
- ability to wrap long lines in pretty way
I guess this could be solved by IO streams lib ;)
- ability to generate HTML as output (+ ability to fold/unfold big data structures using Javascript).
I assume this could be added in time... Reece?
- ability to diff two outputed data where applicable and produce some easy to read report, possibly helper function:
I've been toying with this a while ago. But I think you'd better have this as a different application.
- some time ago John Torjo designed SMART_ASSERT library. If this will make into Boost, outfm would be perfect complement for it.
Oh yes ;) Indeed, I could adapt SMART_ASSERT to allow pretty printing of containers and such. -- John Torjo

| -----Original Message-----
| From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of John Torjo
| Sent: 12 September 2004 13:41
| To: Boost list; boost-announce@lists.boost.org
| Subject: [boost] Formal review of "Output Formatters" library begins today
|
| 1. What is your evaluation of the design?

A good effort at a complex and messy problem(s).

| 2. What is your evaluation of the implementation?

OK, but

* There is a terribly central spelling mistake which nobody seems to have noted so far: A _delimiter_ does de-limit things and has nothing to do with deli_s or meter_s! A global search and replace in code and docs is vital. Unlike me, Boosters are obviously not native __English__ speakers ;-)

* I strongly dislike abbreviations and concatenations. It is MUCH easier to read/understand, and consistent with STL styling, to use full words and _s, if a bit longer to type. naryfmt really must qualify for some prize. nary_format or n_ary_format. "Boost prefers clarity to curtness".

| 3. What is your evaluation of the documentation?

OK, though I am not a fan of background-coloured boxes for code; I think the font change is enough. One thing I would REALLY REALLY like is a Boost 'Standard' way of colouring and indenting code similar to the Visual Studio IDE, though I don't feel their colour scheme is quite as good as mine! Nor do I like Doxygen-generated docs much; they never seem to tell me what I want to know. And the greyed-out background boxes obscure the Doxygen code colouring.

| 4. What is your evaluation of the potential usefulness of the library?

Very useful.

| 5. Did you try to use the library?

Yes, previously; it worked OK.

| 6. How much effort did you put into your evaluation?

About an hour re-reviewing previous work.

| 7. Are you knowledgeable about the problem domain?

A would-be user.

| 8. Do you think the library should be accepted as a Boost library?

Yes, but with some name changes.
And I would like to see active work by all authors to ensure that this interacts with the filtering, range, more_io libraries before the first actual full release (in 1.33?) so that the documentation cross-references too, including examples of combinations of these techniques. I can understand that authors are unwilling/unable to work together until they are sure which other libraries can be assumed a part of Boost, and I think we must accept (and flag up) that there will probably be changes, perhaps major, while interworking is perfected. The three recent IO contributions, with serialisation, are prime candidates for mutual refinement. Paul Paul A Bristow Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB +44 1539 561830 +44 7714 330204 mailto: pbristow@hetp.u-net.com

Reece and fellow boosters,

I have a special interest in Output Formatters working together with lexical_cast. The idea is this: since lexical_cast internally uses a stringstream and its associated operators >> and <<, and since outfmt defines << and >> for STL containers, it should be possible to lexical_cast any STL container. Well, in theory. While it works for "output" operations like this

    std::vector<int> v;
    v.push_back( 7 );
    v.push_back( 9 );
    boost::lexical_cast<std::string>( v ); // gives "[ 7, 9 ]"

the reverse "input" operation

    boost::lexical_cast< std::vector<int> >( "[ 7, 9 ]" ); // throws bad_lexical_cast

fails. This is because lexical_stream turns off white space skipping, and outfmt uses "[ ", ", ", and " ]" (with all those nifty spaces) as default formatting.

The problem is that I can't pass any formatting options to lexical_stream at runtime. Hence, I'd like to propose that

a) either the default formatting of containers is changed to "[", ",", and "]",
b) or a compile-time mechanism (macro? template magic?) should be provided to set the formatting.

OK, I just tested this lexical_cast-specific behaviour, thus I can't really answer most of the other questions:
1. What is your evaluation of the design? 2. What is your evaluation of the implementation? 3. What is your evaluation of the documentation? 4. What is your evaluation of the potential usefulness of the library?
Quite useful. Indeed, _very, very_ useful if the above-mentioned limitations were solved.
5. Did you try to use the library? With what compiler? Did you have any problems?
VC++ 7.0; see above.
6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? 7. Are you knowledgeable about the problem domain?
And most important, 8. Do you think the library should be accepted as a Boost library?
Yes. - Roland

PS: I noticed that in earlier versions strings within STL containers were surrounded by "quotes"; now, they no longer are, which causes problems if a string contains ",". Why was that changed?

PS: I noticed that in earlier versions strings within STL containers were surrounded by "quotes"; now, they no longer are, which causes problems if a string contains ",". Why was that changed?
At my suggestion ;) I think you'd use this library a lot for pretty printing. Thus, the quotes would only hurt. You can still specify them manually (if you wish to read the strings back). Best, John

Roland Richter wrote:
Reece and fellow boosters,
I have a special interest in Output Formatters working together with lexical_cast. The idea is this:
Since lexical_cast internally uses a stringstream and its associated operators >> and <<, and since outfmt defines << and >> for STL containers, it should be possible to lexical_cast any STL container. ... the reverse "input" operation
boost::lexical_cast< std::vector<int> >( "[ 7, 9 ]" ); // throws bad_lexical_cast
fails. This is because lexical_stream turns off white space skipping, and outfmt uses "[ ", ", ", and " ]" (with all those nifty spaces) as default formatting.
The problem is that I can't pass any formatting options to lexical_stream at runtime. Hence, I'd like to propose that
a) either the default formatting of containers is changed to "[", ",", and "]", b) or a compile-time mechanism (macro? template magic?) should be provided to set the formatting.
I've made a noise about this in the past. I strongly believe that current lexical_cast behaviour does not play nice with existing stream operators (which are commonly written to assume whitespace is skipped). There's no good reason for that, and lexical_cast should not use the "noskipws" flag. - Volodya
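The failure discussed in this subthread can be reproduced with the standard library alone. The sketch below is a demonstration under assumptions, not outfmt's or lexical_cast's actual code: `read_int_list` is a hypothetical extractor written, like many hand-written `operator>>` overloads, purely in terms of formatted extraction, and `parses` toggles the stream's `skipws` flag to imitate what the thread says lexical_cast does.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader: parses a non-empty list such as "[ 7, 9 ]" using only
// formatted extraction, so it relies on the stream to skip whitespace.
bool read_int_list(std::istream& is, std::vector<int>& out)
{
    char ch = 0;
    if (!(is >> ch) || ch != '[')
        return false;
    for (;;) {
        int value = 0;
        if (!(is >> value))
            return false;   // with skipws off, a leading space stops the parse here
        out.push_back(value);
        if (!(is >> ch))
            return false;
        if (ch == ']')
            return true;
        if (ch != ',')
            return false;
    }
}

// Parses `text` with whitespace skipping either left on (the stream default)
// or switched off before reading.
bool parses(const std::string& text, bool skip_whitespace)
{
    std::istringstream is(text);
    if (!skip_whitespace)
        is.unsetf(std::ios_base::skipws);
    std::vector<int> values;
    return read_int_list(is, values);
}
```

With skipping enabled, "[ 7, 9 ]" parses; with it disabled, the same text is rejected at the space after '[', while the space-free "[7,9]" still parses. That matches both remedies mentioned in the thread: drop the spaces from the default delimiters, or stop clearing `skipws`.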

Vladimir Prus wrote:
Roland Richter wrote:
the reverse "input" operation
boost::lexical_cast< std::vector<int> >( "[ 7, 9 ]" ); // throws bad_lexical_cast
fails. This is because lexical_stream turns off white space skipping, and outfmt uses "[ ", ", ", and " ]" (with all those nifty spaces) as default formatting.
[...]
I've made a noise about this in the past. I strongly believe that current lexical_cast behaviour does not play nice with existing stream operators (which are commonly written to assume whitespace is skipped). There's no good reason for that, and lexical_cast should not use the "noskipws" flag.
Do you happen to know why it was introduced then in the first place? - Roland

Roland Richter wrote:
I've made a noise about this in the past. I strongly believe that current lexical_cast behaviour does not play nice with existing stream operators (which are commonly written to assume whitespace is skipped). There's no good reason for that, and lexical_cast should not use the "noskipws" flag.
Do you happen to know why it was introduced then in the first place?
Not really. IIRC, Kevlin Henney said he finds it is more clear when lexical_cast is required to consume the entire input, without skipping whitespace at the beginning. I don't understand this argument at all, and the new behaviour only creates practical issues :-( - Volodya

"John Torjo" <john.lists@torjo.com> wrote in message news:414443E0.60008@torjo.com...
Dear boosters,
The FORMAL Review of "Output Formatter" library begins today, Sept 12, 2004.
1. What is your evaluation of the design?
It is simple, but seems designed so that more complex (what?) "Output Shapes" can be created, i.e. by nesting, from the basic building blocks.
2. What is your evaluation of the implementation?
Only a preliminary glance.. hence no comment. As long as it works and all that :-)
3. What is your evaluation of the documentation?
Full marks for style. The examples are informative and show off the potential of the library quite well, though I didn't read it in depth.
4. What is your evaluation of the potential usefulness of the library?
Having read the later posts about the library, a useful goal of the library would be XML and HTML formatting. Possibly it might be useful in rendering various text formats, e.g. DXF and CSV files, both for input and output... but trivial, I guess. I assume that the format object could be serialised somewhere along with the data, which might be useful to interrogate on input. However, the abstraction is interesting. The process of transforming a serial stream into 2 or more dimensions seems to have a generic flavour, though what other uses there may be I couldn't put my finger on. I guess parsing is the term I am looking for, though I haven't looked in enough to see where delimiting stops and parsing begins. Word processing?... I don't know.
5. Did you try to use the library? With what compiler? Did you have any problems?
I tried out and played around with some of the test/ example programs on VC7.1. No problems were encountered at all.
6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
No more than a superficial look really.
7. Are you knowledgeable about the problem domain?
No.
And most important, 8. Do you think the library should be accepted as a Boost library?
I abstain from voting. Another question is could I make use of the library. I do like the 2D 3D matrix examples, which I have only looked at briefly, but hopefully I might find these useful for generating matrices in source code. regards Andy Little

John Torjo wrote:
The FORMAL Review of "Output Formatter" library begins today, Sept 12, 2004.
Before going to the review itself, I'll list the cases where I want to use this library. The primary case is debugging. I want to either output small vectors to some log/dump file (in which case the output will be a short single line), or output some huge structures (vector<Function>, where Function has a lot of data). In the latter case, the output should be multiline, with nice indentation, or I won't understand anything. The second case is STL I/O itself, for example the csv files that Reece has mentioned.
A particularly interesting question is how the proposed library overlaps with serialization. When outputting vector<Function> I'd prefer the content of 'Function' to be output too, preferably by describing the members with the 'serialize' method. And the question is whether I can use the outfmt library, or have to use the serialization library. The serialization library is pretty large, so I'd like outfmt to be able to output UDTs which have 'serialize' defined. IMO, vector<some_UDT> is a very common case, maybe even more common than vector<pair<int, int> >.
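The multiline debugging case described above can be sketched as follows. This is only an illustration under stated assumptions: `write_item` and `multiline` are hypothetical names, not part of outfmt, and the layout is merely YAML-like (one "- " item per line, two spaces per nesting level).

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helpers for illustration; outfmt does not provide these.

// A leaf value becomes one "- value" line at the given indentation.
template <typename T>
void write_item(std::ostream& os, const T& value, std::size_t indent) {
    os << std::string(indent, ' ') << "- " << value << '\n';
}

// A nested vector becomes a bare "-" line followed by its indented items.
template <typename T>
void write_item(std::ostream& os, const std::vector<T>& items,
                std::size_t indent) {
    os << std::string(indent, ' ') << "-\n";
    for (std::size_t i = 0; i < items.size(); ++i) {
        write_item(os, items[i], indent + 2);
    }
}

// Top level: one item per element, starting at column zero.
template <typename T>
std::string multiline(const std::vector<T>& items) {
    std::ostringstream os;
    for (std::size_t i = 0; i < items.size(); ++i) {
        write_item(os, items[i], 0);
    }
    return os.str();
}

// Small usage check for the sketch.
inline void demo_multiline() {
    std::vector<int> a;
    a.push_back(1);
    a.push_back(2);
    std::vector<int> b;
    b.push_back(3);
    std::vector<std::vector<int> > nested;
    nested.push_back(a);
    nested.push_back(b);
    assert(multiline(a) == "- 1\n- 2\n");
    assert(multiline(nested) == "-\n  - 1\n  - 2\n-\n  - 3\n");
}
```

A user-defined type such as Function would need its own `write_item` overload (or an operator<<) in this sketch; hooking into a 'serialize' method is not shown.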
1. What is your evaluation of the design?
Unfortunately, I cannot comment much on this due to bad documentation. See below. There are some things I don't like. First, as many have mentioned, the naming is not optimal. For example:
boost::io::formatob(v).format(" | ");
The 'ob' suffix and the two "format" words are confusing. I'd suggest something like:
boost::io::stl(v).separator(" | ");
or
boost::io::separate_with(v, " | ");
and for braces something like
using namespace boost::io; braces("[", "]", separator(" | ", v))
or maybe even
("[" + io(v)/"," + "]")
and for nested formats, something like
("[" + io::format( "(" + io::format()/":" + ")" )/"," + "]" )
These are just ideas, though.
It's desirable that the library support some multiline output style out of the box, so that I could write:
os << io::multiline(v) << ...
Again, I'd suggest YAML as such a style, but any other readable indented style will be OK.
It's also desirable to use the 'serialize' method of classes to output them, instead of requiring operator<< to always be present.
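To make the naming suggestion concrete, here is a minimal sketch of how a `separate_with` / `braces` pair might behave. The names come from the suggestion above, but the types and implementation are assumptions made for illustration; nothing here is outfmt's API, and the argument order of `braces` simply follows the example in the review.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper: remembers the container and the delimiters to use.
template <typename Container>
struct separated {
    const Container& items;  // the caller keeps the container alive
    std::string sep;
    std::string open;
    std::string close;
};

// Wrap a container so that it prints with the given separator.
template <typename Container>
separated<Container> separate_with(const Container& c, std::string sep) {
    separated<Container> s = {c, sep, "", ""};
    return s;
}

// Add opening and closing delimiters to an already wrapped container.
template <typename Container>
separated<Container> braces(std::string open, std::string close,
                            separated<Container> s) {
    s.open = open;
    s.close = close;
    return s;
}

// Writes open, the elements joined by sep, then close.
template <typename Container>
std::ostream& operator<<(std::ostream& os, const separated<Container>& s) {
    os << s.open;
    bool first = true;
    for (typename Container::const_iterator it = s.items.begin();
         it != s.items.end(); ++it) {
        if (!first) os << s.sep;
        os << *it;
        first = false;
    }
    return os << s.close;
}

// Small usage check for the sketch.
inline void demo_separated() {
    std::vector<int> v;
    v.push_back(1);
    v.push_back(4);
    v.push_back(10);
    std::ostringstream os;
    os << braces("[", "]", separate_with(v, " | "));
    assert(os.str() == "[1 | 4 | 10]");
}
```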
2. What is your evaluation of the implementation?
There should be a <boost/outfmt.hpp> header that includes everything else in the library.
Some of the lines are longer than 80 characters (e.g. the template header of formatob_t has 111 characters).
Methods defined in the body of the class are implicitly inline; there's no need to put "inline" in front of them, as it's just one more word to read. I don't think defining methods in the class is a good idea -- this makes the class interface less obvious. Another issue is that all of the methods are inline, and I don't think that's necessary. The iostream operations will take much more time than a function call anyway, and inlining everything can lead to code bloat. For example, list_object::read is definitely a very large method.
Why is formatob_t necessary? It seems to work by delegating everything to the underlying formatter. Can't 'formatob' just return the appropriate formatter?
I might be missing something, but the mechanism for getting the type of formatter from the type to be output seems too complex. First, the type_deducer.hpp file is used, and 'select' computes a 'category'. Then format_deducer.hpp takes the category, and again uses 'select' to obtain the real type of formatter. Why is type_deducer.hpp needed?
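For comparison with the two-step deduction questioned above, the following sketch maps the type to be output straight to a formatter through partial specialization, with no intermediate category. `formatter_for` and `to_text` are hypothetical names, not outfmt's; the sketch also assumes a compiler with working partial specialization, which may not hold for every compiler the library targets.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Primary template: anything without a more specific match uses operator<<.
template <typename T>
struct formatter_for {
    static void write(std::ostream& os, const T& value) { os << value; }
};

// std::pair is written as "( first, second )".
template <typename A, typename B>
struct formatter_for<std::pair<A, B> > {
    static void write(std::ostream& os, const std::pair<A, B>& p) {
        os << "( ";
        formatter_for<A>::write(os, p.first);
        os << ", ";
        formatter_for<B>::write(os, p.second);
        os << " )";
    }
};

// std::vector is written as "[ a, b, c ]", recursing into the elements.
template <typename T, typename Alloc>
struct formatter_for<std::vector<T, Alloc> > {
    static void write(std::ostream& os, const std::vector<T, Alloc>& v) {
        os << "[ ";
        for (std::size_t i = 0; i < v.size(); ++i) {
            if (i != 0) os << ", ";
            formatter_for<T>::write(os, v[i]);
        }
        os << " ]";
    }
};

// Convenience wrapper: deduces T, picks the formatter, returns the text.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream os;
    formatter_for<T>::write(os, value);
    return os.str();
}

// Small usage check for the sketch.
inline void demo_formatter_for() {
    std::vector<std::pair<int, int> > v;
    v.push_back(std::make_pair(1, 2));
    v.push_back(std::make_pair(3, 4));
    assert(to_text(v) == "[ ( 1, 2 ), ( 3, 4 ) ]");
    assert(to_text(42) == "42");
}
```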
3. What is your evaluation of the documentation?
The documentation leaves much to be desired. I'll walk through some of the aspects (indented text is a quote from the docs).
First, about the structure. I'd prefer a more conventional introduction/detailed docs/reference structure. The current docs, IMO, mix everything, and when reading sequentially, I find both vague phrases and exact synopses, and very little overview or tutorial material.
    "providing an extensible framework that sits"
How is it extensible? I see only one section about extending, and it's just one paragraph long.
    It is often necessary to override the way a type is formatted (written to/read from) to a stream, or to format a subrange of a container or array. The manipulators in this library serve this purpose. For example:
    int a[] = { 5, 4, 3, 2, 1 };
    std::cout << boost::io::formatob( boost::io::range( a, a + 5 ));
    std::cin >> boost::io::formatob( boost::io::range( a, a + 5 ));
I don't think I see a conventional meaning of "manipulator" here. The formatting style is not changed at all. Just a 'range' function is used to create a container from an iterator range.
    boost::io::formatobex< DelimeterType >( const T & ob );
    This will format ob according to it's underlying type.
How is this different from using the 'formatob' function? What's the "underlying type"? The "This will format" phrase is very loose; it does not even say anything about a stream. I'd suggest: the returned object can be output to a stream, and will produce a textual representation of 'ob', or something equally explicit.
    boost::io::formatob( const T & ob );
    This will format ob according to it's underlying type.
I'd suggest that this is documented before 'formatobex'. I'd also suggest that formatobex use the "basic" prefix to indicate it's templated on the character type. E.g. stlio and basic_stlio, or something like that.
    std::cout << boost::io::formatobex< std::string >( v ); // output: [ 1.1, 2.2, 3.3 ]
    Note that the type construct is automatically deduced and the corresponding format object is constructed.
Again, this is very vague. What does it mean for the "type construct" to be automatically "deduced"? Why would I care about some "format object"? You might want to say that under the covers, the type of 'v' determines the type of format object that's created, and that the object is responsible for the actual formatting.
    This will format ob based on the format object it is passed (FormatObject). Here, the format type is taken from FormatObject::format_type. This allows the nested constructs to be formatted.
This is not clear at all. What's the 'format type' and how is it "taken"? How does it allow the nested constructs to be formatted? I think you need to either elaborate on this, or move this passage to a detailed docs section.
    If you need to specify a range or sub-range, boost::io::formatob will not recongnise it unless it is a container.
    boost::io::range( ForwardIterator first, ForwardIterator last );
This is more of a language problem. A range, by definition, is a pair (but not std::pair) of iterators. It's never a container.
    This creates the range [first, last) that can be used by boost::io::format.
The range [first, last) exists as long as you have the 'first' and 'last' objects. I think it's better to say: "Returns an object which will output the range [first, last) to the stream....". The remainder of the "range" function overloads leaves me wondering if this is reference or overview or what. For a reference, the code examples are not necessary. For a tutorial, you don't need to list every overload, just mention that there are others.
    A FormatTraits class is a class that provides the default values for the open, close and separator delimeters used when rendering the list to a stream. These can be overriden by the user as described in the following section. The class has the general form:
This sentence doesn't say why FormatTraits is useful for a user. I read it like this: the class provides default values, and the user can override those values when outputting a specific object. It does not seem that the user can specialize the class for his objects, so why should the user know about this class at all? Heh, both provided traits are in the detail namespace? Really, the user should not care.
    boost::io::openclose_formatter is a class that allows the user to change and access the format used for open and close delimeters
Here we're definitely in the reference docs already, while I have not got an overall picture yet. Then, what's "change and access the format"? If the format can be changed, it is stored somewhere. Where? The code block before this comment defines two classes, openclose_formatter_t and openclose_formatter. Is this a typo, or do you really have two classes?
    boost::io::openclose_formatter_t is used by format objects so that a reference to the format object is returned and not a reference to boost::io::openclose_formatter! This makes it possible to call the format function inline. For example:
It's completely unclear. Then you go on describing the openclose_formatter class, while I still don't know how I can use that class.
    Formatters are useful when you want to use a specific format in different places, for example:
This should be the first sentence in this section.
    The FormatObject class has to provide a write function having this syntax: FormatObject::write( os, fo.ob );
Maybe you mean "The FormatObject class has to be written such that the expression FormatObject::write(os, fo.ob) is well formed". My first impression was that you give a signature for the write method.
    The write function has the form:
    template< typename T, class OutputStream > inline OutputStream & write( OutputStream & os, const T & value ) const ( // ... return( os ); );
Do you mean that every FormatObject class should have this signature? Then the phrase about "FormatObject::write(os, fo.ob)" is not necessary.
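The wording proposed above for range() ("Returns an object which will output the range [first, last) to the stream") can be illustrated with a small sketch. `range_view` and `make_range` are hypothetical stand-ins, not the library's own `boost::io::range`, and the "[ a, b ]" layout is only assumed from the output shown in the review announcement.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the object returned by a range() helper:
// it stores the two iterators and nothing else.
template <typename ForwardIterator>
struct range_view {
    ForwardIterator first;
    ForwardIterator last;
};

// Returns an object which, when streamed, outputs [first, last).
template <typename ForwardIterator>
range_view<ForwardIterator> make_range(ForwardIterator first,
                                       ForwardIterator last) {
    range_view<ForwardIterator> r = {first, last};
    return r;
}

// Streams the elements as "[ a, b, c ]".
template <typename ForwardIterator>
std::ostream& operator<<(std::ostream& os,
                         const range_view<ForwardIterator>& r) {
    os << "[ ";
    for (ForwardIterator it = r.first; it != r.last; ++it) {
        if (it != r.first) os << ", ";
        os << *it;
    }
    return os << " ]";
}

// Small usage check for the sketch.
inline void demo_range() {
    int a[] = { 5, 4, 3, 2, 1 };
    std::ostringstream whole;
    whole << make_range(a, a + 5);
    assert(whole.str() == "[ 5, 4, 3, 2, 1 ]");
    std::ostringstream part;
    part << make_range(a + 1, a + 3);
    assert(part.str() == "[ 4, 3 ]");
}
```

Note that the returned object is not a container: it only refers to elements owned elsewhere, which is the distinction the review draws.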
4. What is your evaluation of the potential usefulness of the library?
Potentially very useful.
5. Did you try to use the library? With what compiler? Did you have any problems?
No.
6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
A couple of hours.
7. Are you knowledgeable about the problem domain?
Kind of. I wrote a similar library a few years ago, though much less simple.
And most important, 8. Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
It's a hard question. The basic idea of the library -- formatter objects that you can nest -- seems sound to me. However, the interface built on top of that is a bit suboptimal, and the state of the documentation is pretty bad. Because of that, I believe it would be better if the library were rejected this time, and reviewed again later.
- Volodya

"John Torjo" <john.lists@torjo.com> wrote in message news:414443E0.60008@torjo.com...
Dear boosters,
The FORMAL Review of "Output Formatter" library begins today, Sept 12, 2004.
Hi, I hope I didn't miss the deadline. Let me start by saying that this is a library I know and love (warts and all). I examined all the source code thoroughly at several points in its development, helped port the library to several compilers, and even contributed little bits. So I'm sorry I don't have time to write a really detailed review. The review came at a bad time for me -- right after the review of my iostreams library. My review is based on reading (most of) the current thread and on my previous experience with the library. I'll reorder the basic review questions:
4. What is your evaluation of the potential usefulness of the library?
Extremely useful.
3. What is your evaluation of the documentation?
I haven't read the current docs, but the last time I read them I found them insufficient, for reasons many have pointed out. There needs to be a general introduction explaining the scope of the library, lots of examples, and clear instructions on how to extend the library. In order to learn about the library, I had to ask Reece a lot of questions and read the source.
2. What is your evaluation of the implementation?
The implementation is very good, especially those little parts attributed to someone named Jonathan ;-)
5. Did you try to use the library?
Yes, I used it extensively when I was porting it to Borland 5.x, Metrowerks, Comeau, Intel and (with limited success) to VC6.
6. How much effort did you put into your evaluation?
I think I've already answered this one.
7. Are you knowledgeable about the problem domain?
It's hard to say exactly what the problem domain is. If it's logging or debugging, then no, I'm not an expert, but I do know more than a little. However I think the library potentially has much wider applications.
1. What is your evaluation of the design?
I have some serious issues with the design, which I raised with Reece at an early stage. I really don't have any right to complain about them now, however, since I offered to collaborate on the library some time ago but then got busy with other things.
I will explain what I think the purpose of the library should be, sketch some concepts and give some example uses. Let me note that I have implemented most of the following ideas, but only as a proof of concept.
I see the library as the inverse of Spirit. Spirit takes a linear text and builds complex objects, while the output formatting library takes complex objects and renders them as linear text. Just as an abstract syntax tree does not preserve all the information in the input text, in many cases it will be desirable to lose information when an object is formatted using the present library. For example, sometimes you might want a dog to be formatted as follows
[ Dog; name: rover; breed: terrier mix; weight: 80 lb; daily habits: unspecified ]
Other times, you might just want: [Dog: rover]. Therefore, I think the library should handle output only.
I see the library as consisting of three components:
I. Type classification for standard and Boost types, including
A) a system for classifying types as
1. variable-length sequences of objects of a single type (example: std::vector)
2. heterogeneous fixed-length sequences of objects (example: boost::tuple)
3. types with more elaborate structure, something like XML Schema content models -- but I never gave this part much thought, so ignore it ;-)
B) function templates for extracting the elements from instances of the types with the above structures
II. A system for allowing user-defined types to advertise their internal structure, so that they can be accessed like the types in I. For example, a Dog class might advertise that it consists of a string name and a float weight. There are a number of ways that this could be done, such as with member pointers, default-constructible functors which extract the information, etc. Any combination of these techniques should be allowed.
III. A framework of composable formatting objects (I'm using the term differently than the current library does) used to customize how complex types are output.
A. The main building block is the concept of a Formatter (sketched below). There will be a number of built-in formatters, such as
1. sequence_formatter, for formatting objects of a type I.A.1. using specified opening, closing and separator strings
2. nary_formatter<N>, for formatting objects of a type I.A.2. Nary formatters can be specified with expression templates -- e.g.,
str("[") << _2 << " : " << _1 << ")"
would format a pair (a, b) as [b : a). (Note the reversed order.) I've also experimented with the following notation, for formatting user-defined types:
str("Dog:") << member(dog_name) << "," << member(dog_height) << "]"
B. Styles will be composed from formatters. Formatters can be added to a style without qualification or with the stipulation that they apply only to objects of a given type or only to objects of types which satisfy a given mpl lambda expression. The order in which formatters are specified can create a cascading effect as in CSS.
C. A single function boost::io::format, which takes an arbitrary type and returns an object which can be output using operator<<. Examples:
cout << boost::io::format(obj); // Uses the default style
cout << boost::io::format(obj).with(dog_format()); // Doggy-style
cout << boost::io::format(obj) // Uses a complex style
.use< is_vector<_> >( sequence_format("[", ",", "]") )
.use< is_pair<_> >( str("(") << _1 << ":" << _2 << "]" );
In the last example, nested objects which are standard vectors will be formatted [a, b, c, d...], while std::pairs will be formatted (a:b]. So a pair of vectors will look like this:
([a,s,d,f,g]:[a,w,w,e,r]],
while a vector of pairs will look like this:
[(a:b],(c:d],(e:f],(g:h],(i:j]]
This last example suggests that it would be useful to compose formatters and store them so that they can be reused. Unfortunately, once the static type is lost, the complex formatting objects are useless in many cases. Ideally one would use 'auto':
auto style = cajun_style().use< is_string<_> >( ... ) .use< ... > .etc
With the current language, the best way to store styles is to define functions which return instances of them. This means you have to explicitly describe the return type, but only once.
[unspecified style type] cajun_style();
cout << boost::io::format(obj).with(cajun_style());
----------------------
Finally, let me describe what a formatter looks like. It is a class type with a templated member function format having the following signature
template<typename Ch, typename Tr, typename T, typename Context> basic_ostream<Ch, Tr>& format(basic_ostream<Ch, Tr>& out, const T& t, Context& ctx);
Here T is the type whose instance is to be formatted, and ctx contains the prevailing Style (a combination of formatters) as well as contextual information like depth of nesting and level of indentation. Formatters can specify that they are able to handle any type or only certain types (such as 3-ary types or types satisfying an mpl lambda expression).
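The Formatter concept described above can be sketched with two small classes that share the quoted `format(out, t, ctx)` signature. Everything beyond that signature is an assumption made for illustration: the `context` struct only tracks nesting depth, and this `sequence_formatter` takes its element formatter as a template parameter rather than looking it up in a Style.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical context: only tracks nesting depth for this sketch.
struct context {
    int depth;      // current nesting level
    int max_depth;  // deepest level reached so far
};

// Formats a single value with operator<<; ignores the context.
struct value_formatter {
    template <typename Ch, typename Tr, typename T, typename Context>
    std::basic_ostream<Ch, Tr>& format(std::basic_ostream<Ch, Tr>& out,
                                       const T& t, Context&) const {
        return out << t;
    }
};

// Formats a sequence with open/separator/close strings and hands each
// element to a nested formatter, updating the depth in the context.
template <typename ElementFormatter>
struct sequence_formatter {
    std::string open;
    std::string sep;
    std::string close;
    ElementFormatter element;

    template <typename Ch, typename Tr, typename T, typename Context>
    std::basic_ostream<Ch, Tr>& format(std::basic_ostream<Ch, Tr>& out,
                                       const T& seq, Context& ctx) const {
        ++ctx.depth;
        if (ctx.depth > ctx.max_depth) ctx.max_depth = ctx.depth;
        out << open.c_str();
        bool first = true;
        for (typename T::const_iterator it = seq.begin();
             it != seq.end(); ++it) {
            if (!first) out << sep.c_str();
            element.format(out, *it, ctx);
            first = false;
        }
        out << close.c_str();
        --ctx.depth;
        return out;
    }
};

// Small usage check: a vector of vectors with different inner delimiters.
inline void demo_formatter() {
    std::vector<int> a;
    a.push_back(1);
    a.push_back(2);
    std::vector<int> b;
    b.push_back(3);
    std::vector<std::vector<int> > data;
    data.push_back(a);
    data.push_back(b);

    sequence_formatter<value_formatter> inner;
    inner.open = "(";
    inner.sep = ":";
    inner.close = ")";
    sequence_formatter<sequence_formatter<value_formatter> > outer;
    outer.open = "[";
    outer.sep = ",";
    outer.close = "]";
    outer.element = inner;

    context ctx = {0, 0};
    std::ostringstream os;
    outer.format(os, data, ctx);
    assert(os.str() == "[(1:2),(3)]");
    assert(ctx.depth == 0);
    assert(ctx.max_depth == 2);
}
```

Because the composed formatter's type is spelled out in full here, the sketch also shows the storage problem mentioned above: without 'auto', the nested template type has to be written by hand.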
8. Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
This is difficult. But here goes ... I think the library should be ACCEPTED, but *only* if it can be done without major changes. I wouldn't mind some of the ideas that I or others have sketched being incorporated into a future version of the library. However, if any major redesign is to be done, I believe another review is crucial, since the various proposed changes by Reece and others have not been spelled out in sufficient detail for them to be scrutinized. Best Regards, Jonathan

Jonathan Turkanis wrote:
I see the library as the inverse of Spirit. Spirit takes a linear text and builds complex objects, while the output formatting library takes complex objects and renders them as linear text. Just as an abstract syntax tree does not preserve all the information in the input text, in many cases it will be desirable to lose information when an object is formatted using the present library. For example, sometimes you might want a dog to be formatted as follows
I 100% agree with that. Moreover, I have a first experimental implementation of such a library here, which is able to do formatted output controlled by a structure very similar to the Spirit grammar DSL. BTW, the name of this library is Tirips (reversed Spirit) :-). For instance you could write:
generate(str("abc") << char('d'), someoutputiter);
which will simply output "abcd". The different generator objects are parametrizable with lazy constructs:
char const *str = "abc"; char ch = 'd';
generate( str[phoenix::const_(str)] << char_[phoenix::const_(ch)], someoutputiter);
would output "abcd" as well. More sophisticated constructs like list_ would help in outputting container structures:
vector<int> v = ...; generate(list_(',', int_)[phoenix::const_(v)], someoutputiter);
outputs a comma separated list of integers, and so on. I'm currently at the early stages of such a library so there isn't very much code to show, but if anybody is interested I'm happy to collaborate on discussing and implementing this.
Regards Hartmut
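A rough idea of how such generator objects might compose is sketched below. This is not Hartmut's code: the `sketch` namespace, the CRTP base `generator`, and the names `str_gen`, `char_gen`, `seq_gen` and `chr` are all assumptions, and the lazy Phoenix parameters are left out. Only the shape `generate(str("abc") << ..., out)` is taken from the message above.

```cpp
#include <cassert>
#include <iterator>
#include <string>

namespace sketch {

// CRTP base so that operator<< only applies to generator types.
template <typename Derived>
struct generator {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Emits a fixed string, one character at a time.
struct str_gen : generator<str_gen> {
    std::string text;
    explicit str_gen(const std::string& s) : text(s) {}
    template <typename OutIt>
    void emit(OutIt& out) const {
        for (std::string::size_type i = 0; i < text.size(); ++i) {
            *out++ = text[i];
        }
    }
};

// Emits a single character.
struct char_gen : generator<char_gen> {
    char c;
    explicit char_gen(char ch) : c(ch) {}
    template <typename OutIt>
    void emit(OutIt& out) const { *out++ = c; }
};

// Emits the left generator, then the right one.
template <typename L, typename R>
struct seq_gen : generator<seq_gen<L, R> > {
    L left;
    R right;
    seq_gen(const L& l, const R& r) : left(l), right(r) {}
    template <typename OutIt>
    void emit(OutIt& out) const {
        left.emit(out);
        right.emit(out);
    }
};

inline str_gen str(const std::string& s) { return str_gen(s); }
inline char_gen chr(char c) { return char_gen(c); }

// Sequencing operator, the output-side analogue of Spirit's >>.
template <typename L, typename R>
seq_gen<L, R> operator<<(const generator<L>& l, const generator<R>& r) {
    return seq_gen<L, R>(l.self(), r.self());
}

// Drives a generator against an output iterator.
template <typename G, typename OutIt>
OutIt generate(const generator<G>& g, OutIt out) {
    g.self().emit(out);
    return out;
}

}  // namespace sketch

// Small usage check for the sketch.
inline void demo_generate() {
    std::string result;
    sketch::generate(sketch::str("abc") << sketch::chr('d'),
                     std::back_inserter(result));
    assert(result == "abcd");
}
```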

Hi Hartmut,
More sophisticated constructs like the list_ would help in outputting container structures:
vector<int> v = ...; generate(list_(',', int_)[phoenix::const_(v)], someoutputiter);
outputs a comma separated list of integers, and so on.
Do we need 'int_' here? With long class names, the expression will become huge. Is it required to specify the type of the object to be output?
- Volodya

Vladimir Prus wrote:
More sophisticated constructs like the list_ would help in outputting container structures:
vector<int> v = ...; generate(list_(',', int_)[phoenix::const_(v)], someoutputiter);
outputs a comma separated list of integers, and so on.
Do we need 'int_' here? With long class names, the expression will become huge. Is it required to specify the type of the object to be output?
Probably not in this case. Seems I gave a bad example ;-) As I've said, it's still in the stage of brainstorming, so any ideas and comments are very welcome. Regards Hartmut

Hartmut Kaiser wrote:
Jonathan Turkanis wrote:
I see the library as the inverse of Spirit. Spirit takes a linear text and builds complex objects, while the output formatting library takes complex objects and renders them as linear text. Just as an abstract syntax tree does not preserve all the information in the input text, in many cases it will be desirable to lose information when an object is formatted using the present library. For example, sometimes you might want a dog to be formatted as follows
This is an interesting idea. Boost.Spirit is designed to take a sequence (a pair of forward iterators stored in a "scanner") of data (characters or tokens) and apply a set of rules to that data. Boost.Spirit is also able to behave contextually depending on what the next element(s) are in the data sequence.
The Serialization library provides a way to save an object to a device at run-time and load it back at some point in the future, allowing it to be stored in a binary, text or XML format. The data written to the storage is not designed to be read by a human.
My library is aimed at this market: producing human-readable output to a standard stream. This is most useful in debugging and tracing, but has applications elsewhere. Roland has given an example that uses lexical_cast:
// initialize a std::list container:
std::list< int > l = lexical_cast< std::list< int > >( "[ 1, 2, 3 ]" );
For this to work, input facilities in my library are essential.
NOTE: There is an issue with lexical_cast turning off whitespace skipping, but it should be trivial for my library to set and restore this flag during a read operation.
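Setting and restoring the whitespace-skipping flag during a read, as mentioned in the note above, can be done with a small RAII guard. This is a sketch under assumptions: `skipws_guard` and `read_int_list` are hypothetical names, the reader only handles a non-empty list of ints, and it is not the library's actual parser.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <vector>

// Remembers the stream's format flags, turns skipws on, and restores the
// original flags when it goes out of scope.
class skipws_guard {
public:
    explicit skipws_guard(std::istream& is) : is_(is), saved_(is.flags()) {
        is_.setf(std::ios_base::skipws);
    }
    ~skipws_guard() { is_.flags(saved_); }

private:
    std::istream& is_;
    std::ios_base::fmtflags saved_;
    skipws_guard(const skipws_guard&);             // not copyable
    skipws_guard& operator=(const skipws_guard&);  // not assignable
};

// Reads a non-empty list such as "[ 1, 2, 3 ]" into out.
// Returns false (leaving out untouched) if the text does not match.
inline bool read_int_list(std::istream& is, std::vector<int>& out) {
    skipws_guard guard(is);
    char c = 0;
    if (!(is >> c) || c != '[') return false;
    std::vector<int> values;
    for (;;) {
        int value = 0;
        if (!(is >> value)) return false;
        values.push_back(value);
        if (!(is >> c)) return false;
        if (c == ']') break;
        if (c != ',') return false;
    }
    out.swap(values);
    return true;
}

// Small usage check: the stream arrives with skipws cleared, as it would
// inside lexical_cast, and leaves with the flag still cleared.
inline void demo_skipws() {
    std::istringstream in("[ 1, 2, 3 ]");
    in.unsetf(std::ios_base::skipws);
    std::vector<int> v;
    assert(read_int_list(in, v));
    assert(v.size() == 3 && v[0] == 1 && v[1] == 2 && v[2] == 3);
    assert(!(in.flags() & std::ios_base::skipws));
}
```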
I 100% agree with that. Moreover I have a first experimental implementation of such a library here, which is able to do formatted output controlled by a structure, which is very much similar to the Spirit grammar DSL. BTW, the name of this library is Tirips (reversed Spirit) :-).
[snip]
I'm currently at the early stages of such a library so there isn't very much code to show, but if anybody is interested I'm happy to collaborate on discussing and implementing this.
Maybe it would be a good idea to have several competing proposals. We could then see how they compare when generating certain output formats on different types of data. Maybe Gennadiy could implement his proposal so it can be compared as well. Thoughts? Regards, Reece

on different types of data. Maybe Gennadiy could implement his proposal so it can be compared as well.
Thoughts?
Regards, Reece
See here: http://lists.boost.org/MailArchives/boost/msg71956.php It's most definitely not a proposal, just a proof of concept, but it should do.
Gennadiy

Reece Dunn wrote:
I 100% agree with that. Moreover I have a first experimental implementation of such a library here, which is able to do formatted output controlled by a structure, which is very much similar to the Spirit grammar DSL. BTW, the name of this library is Tirips (reversed Spirit) :-).
[snip]
I'm currently at the early stages of such a library so there isn't very much code to show, but if anybody is interested I'm happy to collaborate on discussing and implementing this.
Maybe it would be a good idea to have several competing proposals. We could then see how they compare when generating certain output formats on different types of data. Maybe Gennadiy could implement his proposal so it can be compared as well.
Sounds sensible to me. Regards Hartmut

"Reece Dunn" <msclrhd@hotmail.com> wrote in message news:415317AD.7010604@hotmail.com...
Maybe it would be a good idea to have several competing proposals. We could then see how they compare when generating certain output formats on different types of data. Maybe Gennadiy could implement his proposal so it can be compared as well.
Thoughts?
I agree. My system was more-or-less complete last winter, but the implementation was poor. Eventually I realized I should be using fusion, but then I got too busy. I think I'll try to resurrect it when I get a chance, perhaps in a couple of weeks. I think it will be somewhat more flexible than your current submission (except that it's output-only), but simpler than Hartmut's idea. Best Regards, Jonathan

More sophisticated constructs like the list_ would help in outputting container structures:
vector<int> v = ...; generate(list_(',', int_)[phoenix::const_(v)], someoutputiter);
outputs a comma-separated list of integers, and so on.
Hmmm. I think the syntax is quite hard to understand/use. Best, John

John Torjo wrote:
More sophisticated constructs like the list_ would help in outputting container structures:
vector<int> v = ...; generate(list_(',', int_)[phoenix::const_(v)], someoutputiter);
outputs a comma-separated list of integers, and so on.
Hmmm. I think the syntax is quite hard to understand/use.
My point wasn't the concrete syntax; this is volatile at this stage anyway. My point was the concept of implementing a library which is 'dual' to Spirit, based on the idea that a grammar describes all possible matchable input sequences, so why not use this 'grammar' to specify how to generate these input sequences. Regards Hartmut

On 09/23/2004 02:35 PM, Hartmut Kaiser wrote: [snip]
My point wasn't the concrete syntax; this is volatile at this stage anyway. My point was the concept of implementing a library which is 'dual' to Spirit, based on the idea that a grammar describes all possible matchable input sequences, so why not use this 'grammar' to specify how to generate these input sequences.
Also, couldn't these 'dual' libraries be used to generate test cases for each other?

Larry Evans wrote:
On 09/23/2004 02:35 PM, Hartmut Kaiser wrote: [snip]
My point wasn't the concrete syntax; this is volatile at this stage anyway. My point was the concept of implementing a library which is 'dual' to Spirit, based on the idea that a grammar describes all possible matchable input sequences, so why not use this 'grammar' to specify how to generate these input sequences.
Also, couldn't these 'dual' libraries be used to generate test cases for each other?
Sure. This would be good for performing a feature-by-feature comparison. Hooking the machinery into Boost.Test would be a good way of getting the results. Naturally, the test cases for each library should pass their own tests on supported platforms; the interesting cases will be the test cases from the other libraries.
How do we implement the tests? I suggest that we split the information into:
[1] the data type (type) being formatted, e.g. std::map< std::string, std::string > with the details of its contents;
[2] for output -- the string (str) such that:
    type val = ...;
    ostringstream ss;
    ss << some_fn( val );
    BOOST_TEST( ss.str() == str );
[3] for input, the string (str) such that:
    type val = ...;
    type read;
    istringstream ss( str );
    ss >> some_fn( read );
    BOOST_TEST( is_equal( read, val ));
where is_equal( const T & a, const T & b ) returns true iff a == b.
This would then provide a basis for comparison between the libraries, including: does the library support formatting of this form? and what are the differences in the code used to perform the formatting? For example:
    // review-style outfmt:
    ss << io::formatob( vec, io::containerfmt().format( "< ", " >" ));
    // new-style outfmt:
    ss << io::object( vec, fmt::container().decorate( "< ", " >" ));
    // Volodya-style outfmt:
    ss << io::object( vec, "< " + fmt::container() + " >" );
    // Torjo-style outfmt:
    // note: $ is used because it would be infrequently used as
    // part of a decoration (escape = '\$')
    ss << io::object( vec, fmt::container()( "< $*$ >" ));
    // decoration-on-stream outfmt:
    ss << cdecorate( "< ", " >" ) << io::object( vec, fmt::container());
    // type deduction in outfmt:
    ss << io::object( vec ).decorate( "< ", " >" );
Regards, Reece
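As a rough illustration of the output half of this test layout, here is a self-contained sketch that uses plain assert instead of Boost.Test so that it compiles without Boost. decorated_vector and decorate are hypothetical stand-ins for some_fn( val ); they are not any of the proposed APIs listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for some_fn( val ): writes a vector<int> as
// open + comma-separated elements + close.
struct decorated_vector {
    const std::vector<int>& values;
    std::string open;
    std::string close;
};

inline std::ostream& operator<<(std::ostream& os, const decorated_vector& d) {
    os << d.open;
    for (std::size_t i = 0; i < d.values.size(); ++i) {
        if (i != 0) os << ", ";
        os << d.values[i];
    }
    return os << d.close;
}

inline decorated_vector decorate(const std::vector<int>& v,
                                 const std::string& open,
                                 const std::string& close) {
    return decorated_vector{v, open, close};
}

// Item [2]: format val and compare against the expected string str.
inline bool output_matches(const std::vector<int>& val, const std::string& str) {
    std::ostringstream ss;
    ss << decorate(val, "< ", " >");
    return ss.str() == str;
}

// A tiny case table; each competing library would supply its own some_fn
// and be run against the same (value, expected string) pairs.
inline void run_output_cases() {
    const std::vector<int> empty;
    const std::vector<int> three = {1, 4, 10};
    assert(output_matches(empty, "<  >"));
    assert(output_matches(three, "< 1, 4, 10 >"));
}
```

Keeping the expected strings separate from the formatting call is what would let the same cases be replayed against each library.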

Reece Dunn wrote:
Larry Evans wrote:
On 09/23/2004 02:35 PM, Hartmut Kaiser wrote: [snip]
My point wasn't the concrete syntax; this is volatile at this stage anyway. My point was the concept of implementing a library which is 'dual' to Spirit, based on the idea that a grammar describes all possible matchable input sequences, so why not use this 'grammar' to specify how to generate these input sequences.
Also, couldn't these 'dual' libraries be used to generate test cases for each other?
Sure. This would be good for performing a feature-by-feature comparison. Hooking the machinery into Boost.Test would be a good way of getting the results. Naturally, the test cases for each library should pass their own tests on supported platforms; the interesting cases will be the test cases from the other libraries.
How do we implement the tests? I suggest that we split the information into:
[1] the data type (type) being formatted, e.g. std::map< std::string, std::string > with the details of its contents;
[2] for output -- the string (str) such that: type val = ...; ostringstream ss; ss << some_fn( val ); BOOST_TEST( ss.str() == str );
I think there's a wrap_stringstream() in the sandbox. You could then do: BOOST_TEST( (wrap_stringstream() << some_fn(val)) == str); Best, John

[3] for input, the string (str) such that: type val = ...; type read; istringstream ss( str ); ss >> some_fn( read ); BOOST_TEST( is_equal( read, val )); where is_equal( const T & a, const T & b ) returns true iff a == b.
Boost.Test provides much more powerful tools for testing outputs. You could use BOOST_CHECK_EQUAL. But best of all you could use output_test_stream. Gennadiy.

"Hartmut Kaiser" <hartmutkaiser@t-online.de> wrote in message news:1CAMfg-25Uvzt0@afwd00.sul.t-online.com...
Jonathan Turkanis wrote:
I see the library as the inverse of Spirit. Spirit takes a linear text and builds complex objects, while the output formatting library takes complex objects and renders them as linear text. Just as an abstract syntax tree does not preserve all the information in the input text, in many cases it will be desirable to lose information when an object is formatted using the present library. For example, sometimes you might want a dog to be formatted as follows
I 100% agree with that. Moreover I have a first experimental implementation of such a library here, which is able to do formatted output controlled by a structure, which is very much similar to the Spirit grammar DSL. BTW, the name of this library is Tirips (reversed Spirit) :-).
For instance you could write:
generate(str("abc") << char('d'), someoutputiter);
Which will simply output "abcd". The different generator objects are parametrizable with lazy constructs:
<snip examples>
I'm currently at the early stages of such a library so there isn't very much code to show, but if anybody is interested I'm happy to collaborate on discussing and implementing this.
I'm definitely interested. Jonathan
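To make the generate(str("abc") << char('d'), someoutputiter) idea concrete, here is a minimal sketch of composable generator objects. It is not the Tirips code, which is not shown in this thread; all names are hypothetical, and ch stands in for the char(...) spelling because char is a keyword.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>

// Emits a fixed string.
struct str_gen {
    std::string text;
    template <typename Out>
    Out generate(Out out) const { return std::copy(text.begin(), text.end(), out); }
};

// Emits a single character.
struct char_gen {
    char value;
    template <typename Out>
    Out generate(Out out) const { *out = value; ++out; return out; }
};

// Emits the left generator's output followed by the right generator's.
template <typename L, typename R>
struct seq_gen {
    L left;
    R right;
    template <typename Out>
    Out generate(Out out) const { return right.generate(left.generate(out)); }
};

// Trait that keeps operator<< from matching unrelated types.
template <typename T> struct is_generator : std::false_type {};
template <> struct is_generator<str_gen> : std::true_type {};
template <> struct is_generator<char_gen> : std::true_type {};
template <typename L, typename R>
struct is_generator<seq_gen<L, R> > : std::true_type {};

template <typename L, typename R>
typename std::enable_if<is_generator<L>::value && is_generator<R>::value,
                        seq_gen<L, R> >::type
operator<<(const L& l, const R& r) { return seq_gen<L, R>{l, r}; }

inline str_gen str(const std::string& s) { return str_gen{s}; }
inline char_gen ch(char c) { return char_gen{c}; }

// Runs a generator against any output iterator.
template <typename G, typename Out>
Out generate(const G& g, Out out) { return g.generate(out); }

// Usage sketch mirroring the example in the message above.
inline std::string generate_abcd() {
    std::string result;
    generate(str("abc") << ch('d'), std::back_inserter(result));
    assert(result.size() == 4);
    return result;
}
```

The sequence operator builds a small expression tree at compile time, which is the same shape Spirit uses for parsers, only run in the output direction.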

Hi Jonathan,
I see the library as the inverse of Spirit. Spirit takes a linear text and builds complex objects, while the output formatting library takes complex objects and renders them as linear text.
Interesting! Hartmut Kaiser has expressed a similar view (off-list).
II. A system for allowing user-defined types to advertise their internal structure, so that they can be accessed like the types in I. For example, a Dog class might advertise that it consists of a string name and a float weight. There are a number of ways that this could be done, such as with member pointers, default-constructible functors which extract the information, etc. Any combination of these techniques should be allowed.
Isn't this close to using 'serialize' for extracting members, that I advocate?
III A framework of composable formatting objects (I'm using the term differently than the current library does) used to customize how complex types are output. A. The main building block is the concept of a Formatter (sketched below). There will be a number of built-in formatters, such as 1. sequence_formatter, for formatting objects of a type I.A.1. using specified opening, closing and separator strings 2. nary_formatter<N>, for formatting objects of a type I.A.2. Nary formatters can be specified with expression templates -- e.g.,
str("[") << _2 << " : " << _1 << ")"
would format a pair (a, b) as [b : a). (Note the reversed order.)
Wow, that's a nice coincidence, here's part of an email I sent to Hartmut yesterday: Maybe, 'list_' should accept two parameters? list_(';', str("(") << _1 << "," << _2)[phoenix::val(v)] There's still a question of where you specify the stream... but basically, the model that each formatter is just a function object is a very simple one, and that's good.
I've also experimented with the following notation, for formatting user-defined types:
str("Dog:") << member(dog_name) << "," << member(dog_height) << "]"
I'd still prefer 'serialize', just so that we have one method of describing members.
C. A single function boost::io::format, which takes an arbitrary type and returns an object which can be output using operator<<. Examples:
cout << boost::io::format(obj); // Uses the default style
cout << boost::io::format(obj).with(dog_format()); // Doggy-style
cout << boost::io::format(obj) // Uses a complex style
    .use< is_vector<_> >( sequence_format("[", ",", "]") )
    .use< is_pair<_> >( str("(") << _1 << ":" << _2 << "]" );
Oh, MPL lambda? That's a cool idea! But isn't this a bit too flexible? At all events, the smaller the size of the headers I need to include, the better.
Finally, let me describe what a formatter looks like. It is a class type with a templated member function format having the following signature
template<typename Ch, typename Tr, typename T, typename Context> basic_ostream<Ch, Tr>& format(basic_ostream<Ch, Tr>& out, const T& t, Context& ctx);
Here T is the type whose instance is to be formatted, and ctx contains the prevailing Style (a combination of formatters)
If 'format' is a member of the formatter, then why should the context store some other formatter? Or are they formatters for nested types? - Volodya

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:citomt$bmh$1@sea.gmane.org...
Hi Jonathan,
I see the library as the inverse of Spirit. Spirit takes a linear text and builds complex objects, while the output formatting library takes complex objects and renders them as linear text.
Interesting! Hartmut Kaiser has expressed a similar view (off-list).
II. A system for allowing user-defined types to advertise their internal structure, so that they can be accessed like the types in I. For example, a Dog class might advertise that it consists of a string name and a float weight. There are a number of ways that this could be done, such as with member pointers, default-constructible functors which extract the information, etc. Any combination of these techniques should be allowed.
Isn't this close to using 'serialize' for extracting members, that I advocate?
Could you repeat how this would work? I considered allowing user-defined types to provide a function (member or non-member) which returns a list of members for use in serialization; unfortunately this was very wasteful in the (common) case that you don't need to use all the information.
III A framework of composable formatting objects (I'm using the term differently than the current library does) used to customize how complex types are output. A. The main building block is the concept of a Formatter (sketched below). There will be a number of built-in formatters, such as 1. sequence_formatter, for formatting objects of a type I.A.1. using specified opening, closing and separator strings 2. nary_formatter<N>, for formatting objects of a type I.A.2. Nary formatters can be specified with expression templates -- e.g.,
str("[") << _2 << " : " << _1 << ")"
would format a pair (a, b) as [b : a). (Note the reversed order.)
Wow, that's a nice coincidence, here's part of an email I sent to Hartmut yesterday:
Maybe, 'list_' should accept two parameters?
list_(';', str("(") << _1 << "," << _2)[phoenix::val(v)]
Cool.
There's still a question of where you specify the stream... but basically, the model that each formatter is just a function object is a very simple one, and that's good.
I agree.
C. A single function boost::io::format, which takes an arbitrary type and returns an object which can be output using operator<<. Examples:
cout << boost::io::format(obj); // Uses the default style
cout << boost::io::format(obj).with(dog_format()); // Doggy-style
cout << boost::io::format(obj) // Uses a complex style
    .use< is_vector<_> >( sequence_format("[", ",", "]") )
    .use< is_pair<_> >( str("(") << _1 << ":" << _2 << "]" );
Oh, MPL lambda? That's a cool idea! But isn't this a bit too flexible? At all events, the smaller the size of the headers I need to include, the better.
Maybe lambda support could be enabled at user option. Anyway, soon compilers will have mpl built in, so it should be no problem ;-)
Finally, let me describe what a formatter looks like. It is a class type with a templated member function format having the following signature
template<typename Ch, typename Tr, typename T, typename Context> basic_ostream<Ch, Tr>& format(basic_ostream<Ch, Tr>& out, const T& t, Context& ctx);
Here T is the type whose instance is to be formatted, and ctx contains the prevailing Style (a combination of formatters)
If 'format' is a member of the formatter, then why should the context store some other formatter? Or are they formatters for nested types?
Right. A 3-ary formatter might do this (by the way, the return type should be void):
    template<typename Ch, typename Tr, typename T, typename Context>
    void format(basic_ostream<Ch, Tr>& out, const T& t, Context& ctx)
    {
        out << "<" << ctx::format(get<0>)(t) << ","
            << ctx::format(get<1>)(t) << ","
            << ctx::format(get<2>)(t) << ">";
    }
- Volodya
Jonathan
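The formatter/context split discussed above can be sketched as follows. This is only one reading of the idea, under simplifying assumptions: std::ostream replaces basic_ostream<Ch, Tr>, the context dispatches by overloading rather than by MPL lambda, and style_context, sequence_formatter and pair_formatter are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Formats a pair; nested values go back through the context so that the
// prevailing style also applies to them.
struct pair_formatter {
    template <typename A, typename B, typename Context>
    void format(std::ostream& out, const std::pair<A, B>& p, Context& ctx) const {
        out << "(";
        ctx.format(out, p.first);
        out << ":";
        ctx.format(out, p.second);
        out << ")";
    }
};

// Formats a sequence with opening, separator and closing strings.
struct sequence_formatter {
    std::string open, sep, close;
    template <typename T, typename Context>
    void format(std::ostream& out, const std::vector<T>& v, Context& ctx) const {
        out << open;
        for (std::size_t i = 0; i < v.size(); ++i) {
            if (i != 0) out << sep;
            ctx.format(out, v[i]);
        }
        out << close;
    }
};

// The context holds the prevailing style (a combination of formatters) and
// picks the formatter for each nested type.
struct style_context {
    sequence_formatter seq{"[", ", ", "]"};
    pair_formatter pr;

    template <typename T>
    void format(std::ostream& out, const T& t) { out << t; }  // leaf types

    template <typename T>
    void format(std::ostream& out, const std::vector<T>& v) { seq.format(out, v, *this); }

    template <typename A, typename B>
    void format(std::ostream& out, const std::pair<A, B>& p) { pr.format(out, p, *this); }
};

// Usage sketch: a vector of pairs is formatted by two cooperating formatters.
inline std::string format_dogs() {
    std::vector<std::pair<std::string, int> > dogs;
    dogs.push_back(std::make_pair(std::string("Rex"), 3));
    dogs.push_back(std::make_pair(std::string("Fido"), 7));
    style_context ctx;
    std::ostringstream out;
    ctx.format(out, dogs);
    assert(out.good());
    return out.str();
}
```

Each formatter only knows its own level of structure; the context is what answers the "formatters for nested types" question.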

Hi Jonathan,
II. A system for allowing user-defined type to advertise their internal
structure, ......
Isn't this close to using 'serialize' for extracting members, that I advocate?
Could you repeat how this would work? I considered allowing user-defined types to provide a function (member or non-member) which returns a list of members for use in serialization; unfortunately this was very wasteful in the (common) case that you don't need to use all the information.
I thought the system would just work by providing an object with an overloaded operator&:
    class outputter {
    public:
        template<class T>
        outputter& operator&(const boost::nvp<T>& nvp)
        {
            cout << nvp.name() << ":" << nvp.value() << "\n";
            return *this;
        }
    };
    class my {
        template<.....> void serialize(Archive& ar......) { ar & BOOST_SERIALIZATION_NVP(i); }
        int i;
    };
Why do you think it's common not to need all the information? Yes, you probably don't need names for many formatters, but then the operator& will be inline and the compiler can optimize away the passing of the name. - Volodya
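A compilable version of this serialize-based approach might look like the sketch below. It deliberately avoids Boost so that it stands alone: named_value and make_nv are hypothetical stand-ins for a name-value pair, not Boost.Serialization's own type or macro, and Dog is borrowed from earlier in the thread.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a name-value pair.
template <typename T>
struct named_value {
    const char* name;
    const T& value;
};

template <typename T>
named_value<T> make_nv(const char* name, const T& value) {
    return named_value<T>{name, value};
}

// An "archive" that formats instead of serializing: it prints each
// name-value pair it is handed.
class outputter {
public:
    explicit outputter(std::ostream& out) : out_(out) {}

    template <typename T>
    outputter& operator&(const named_value<T>& nv) {
        out_ << nv.name << ":" << nv.value << "\n";
        return *this;
    }

private:
    std::ostream& out_;
};

// The type describes its members once, in serialize(); any archive-like
// object with a matching operator& can consume that description.
struct Dog {
    std::string name;
    float weight;

    template <typename Archive>
    void serialize(Archive& ar) const {
        ar & make_nv("name", name) & make_nv("weight", weight);
    }
};

// Usage sketch.
inline std::string describe(const Dog& d) {
    std::ostringstream ss;
    outputter o(ss);
    d.serialize(o);
    assert(ss.good());
    return ss.str();
}
```

The appeal is that one member description serves both serialization and formatting; the cost is that every archive sees every member, whether it wants it or not.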

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:200409281058.35304.ghost@cs.msu.su...
Hi Jonathan,
II. A system for allowing user-defined type to advertise their internal
structure, ......
Isn't this close to using 'serialize' for extracting members, that I advocate?
Could you repeat how this would work? I considered allowing user-defined types to provide a function (member or non-member) which returns a list of members for use in serialization; unfortunately this was very wasteful in the (common) case that you don't need to use all the information.
I thought the system would just work by providing an object with an overloaded operator&:
class outputter { public: template<class T> outputter& operator&(const boost::nvp<T>& nvp) { cout << nvp.name() << ":" << nvp.value() << "\n"; return *this; } };
class my { template<.....> void serialize(Archive& ar......) { ar & BOOST_SERIALIZATION_NVP(i); } int i; };
Why do you think it's common not to need all the information? Yes, you probably don't need names for many formatters, but then the operator& will be inline and the compiler can optimize away the passing of the name.
I can imagine wanting to generate a report in xml which involves enumerating the employees working on a project. The employees may be represented by complex objects containing extraneous information such as work history, and only the employee name may be needed. In that case, using a serialize method would be wasteful. I don't see why a framework can't provide several options. Jonathan

Jonathan Turkanis wrote:
I thought the system would just work by providing an object with an overloaded operator&:
class outputter { public: template<class T> outputter& operator&(const boost::nvp<T>& nvp) { cout << nvp.name() << ":" << nvp.value() << "\n"; return *this; } };
class my { template<.....> void serialize(Archive& ar......) { ar & BOOST_SERIALIZATION_NVP(i); } int i; };
Why do you think it's common not to need all the information? Yes, you probably don't need names for many formatters, but then the operator& will be inline and the compiler can optimize away the passing of the name.
I can imagine wanting to generate a report in xml which involves enumerating the employees working on a project. The employees may be represented by complex objects containing extraneous information such as work history, and only the employee name may be needed. In that case, using a serialize method would be wasteful.
Yes, a bit. OTOH, it would be possible to use the same serialize method to build, once, a member name -> offset map, which can then be used. For the name case you can do:
    struct figure_out_name_offset {
        figure_out_name_offset& operator&(nvp<std::string>& p)
        {
            if (std::string(p.name()) == "name") { m_address = &p.value(); }
            return *this;
        }
        std::string* m_address;
    };
    Person p;
    figure_out_name_offset f;
    p.serialize(f);
    std::ptrdiff_t offset = (char*)f.m_address - (char*)&p;
I don't see why a framework can't provide several options.
I'd rather not see several different options when one is good enough. Many options will confuse users. - Volodya

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:cjdkpt$bgp$1@sea.gmane.org...
Jonathan Turkanis wrote:
I can imagine wanting to generate a report in xml which involves enumerating the employees working on a project. The employees may be represented by complex objects containing extraneous information such as work history, and only the employee name may be needed. In that case, using a serialize method would be wasteful.
Yes, a bit. OTOH, it would be possible to use the same serialize method to build, once, a member name -> offset map, which can then be used. For the name case you can do:
struct figure_out_name_offset { figure_out_name_offset& operator&(nvp<std::string>& p) { if (std::string(p.name()) == "name") { m_address = &p.value(); } return *this; } std::string* m_address; }; Person p; figure_out_name_offset f; p.serialize(f); std::ptrdiff_t offset = (char*)f.m_address - (char*)&p;
Does this force the class to represent its fields as strings? If so, it's contrary to my understanding of output formatting, according to which types simply advertise their structure and the end user has complete control over how subobjects are formatted. Jonathan

Jonathan Turkanis wrote:
struct figure_out_name_offset { figure_out_name_offset& operator&(nvp<std::string>& p) { if (std::string(p.name()) == "name") { m_address = &p.value(); } return *this; } std::string* m_address; }; Person p; figure_out_name_offset f; p.serialize(f); std::ptrdiff_t offset = (char*)f.m_address - (char*)&p;
Does this force the class to represent its fields as strings?
The above code -- yes, but I've used std::string for simplicity. I think it's possible to modify the example in such a way that you won't store an address of a variable but a formatter*, created from using address of object, address of &p.value() and depending on type of p.value() - Volodya

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:cje34t$f10$1@sea.gmane.org...
Jonathan Turkanis wrote:
struct figure_out_name_offset { figure_out_name_offset& operator&(nvp<std::string>& p) { if (std::string(p.name()) == "name") { m_address = &p.value(); } return *this; } std::string* m_address; }; Person p; figure_out_name_offset f; p.serialize(f); std::ptrdiff_t offset = (char*)f.m_address - (char*)&p;
Does this force the class to represent its fields as strings?
The above code -- yes, but I've used std::string for simplicity. I think it's possible to modify the example in such a way that you won't store an address of a variable but a formatter*, created from using address of object, address of &p.value() and depending on type of p.value()
The problem is that user-defined types have to be given a way to announce the static types of their subelements, or that information will be lost and can't be used by the formatting objects with templated member functions. With serialization, everything works (putting aside pointers to polymorphic objects) because the class itself manages its own deserialization and so knows the static types.
For example, suppose you pass an archive object of some sort, with an overloaded operator& (I'd rather use operator<<), to the format member function of a user-defined type:
    struct Dog {
        template<typename FormatHelper>
        void format(FormatHelper& fmt) { fmt & name; fmt & weight; }
        std::string name;
        float weight;
    };
Here, fmt knows the type of the member field name only during the execution of fmt & name. If it wants to format weight first, it has no way to store the name, unless Dog tells it, independently, that its fields have types (string, float). Then fmt can allocate a tuple<string, float> to store the values. This is still wasteful if fmt only wants to use the second field. Maybe I am overlooking some implementation technique. Jonathan
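One possible technique, offered only as a sketch and not as a resolution of the concern above, is a helper that counts the fields it is shown and writes just the one it wants, so nothing has to be stored. nth_field_writer is a hypothetical name. The trade-off is one pass over the members per field written, so reordering costs extra traversals.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

struct Dog {
    std::string name;
    float weight;

    template <typename FormatHelper>
    void format(FormatHelper& fmt) const {
        fmt & name;
        fmt & weight;
    }
};

// Hypothetical helper: sees every field in declaration order, but writes
// only the field at position `wanted`. The static type of that field is
// known inside operator&, so no tuple of stored values is needed.
class nth_field_writer {
public:
    nth_field_writer(std::ostream& out, std::size_t wanted)
        : out_(out), wanted_(wanted), seen_(0) {}

    template <typename T>
    nth_field_writer& operator&(const T& field) {
        if (seen_ == wanted_) out_ << field;
        ++seen_;
        return *this;
    }

private:
    std::ostream& out_;
    std::size_t wanted_;
    std::size_t seen_;
};

// Usage sketch: weight first, then name, using two passes over the fields.
inline std::string weight_then_name(const Dog& d) {
    std::ostringstream ss;
    nth_field_writer second(ss, 1);
    d.format(second);
    ss << " ";
    nth_field_writer first(ss, 0);
    d.format(first);
    assert(ss.good());
    return ss.str();
}
```

Skipped fields cost only a comparison and an increment, which a compiler may well inline away, but the repeated traversal is still a real cost for types with many members.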

Jonathan Turkanis wrote:
A. The main building block is the concept of a Formatter (sketched below). There will be a number of built-in formatters, such as 1. sequence_formatter, for formatting objects of a type I.A.1. using specified opening, closing and separator strings 2. nary_formatter<N>, for formatting objects of a type I.A.2. Nary formatters can be specified with expression templates -- e.g.,
str("[") << _2 << " : " << _1 << ")"
Forgot to mention in the previous email: I recall that when I was trying to write my own code for this task, I found it very useful to number the elements of a sequence. I still have handwritten code for dumping some data which numbers the elements. That's handy when elements are referred to by index. I'd like such a formatter to be a part of the library; I know there's such an example already, but I'd rather see a documented part of the library, not just an example. - Volodya

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:citpi1$bmh$3@sea.gmane.org...
Jonathan Turkanis wrote:
A. The main building block is the concept of a Formatter (sketched below). There will be a number of built-in formatters, such as 1. sequence_formatter, for formatting objects of a type I.A.1. using specified opening, closing and separator strings 2. nary_formatter<N>, for formatting objects of a type I.A.2. Nary formatters can be specified with expression templates -- e.g.,
str("[") << _2 << " : " << _1 << ")"
Forgot to mention in the previous email: I recall that when I was trying to write my own code for this task, I found it very useful to number the elements of a sequence. I still have handwritten code for dumping some data which numbers the elements. That's handy when elements are referred to by index.
I'd like such a formatter to be a part of the library; I know there's such an example already, but I'd rather see a documented part of the library, not just an example.
I think Reece has an example of this. Of course it's easy to do with my framework too. Jonathan
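For reference, an element-numbering formatter of the kind Volodya describes can be very small. The sketch below is neither Reece's example nor Jonathan's framework; write_indexed and the "index: value" layout are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes each element of [first, last) prefixed with its zero-based index.
template <typename Iter>
std::ostream& write_indexed(std::ostream& out, Iter first, Iter last) {
    out << "[ ";
    std::size_t index = 0;
    for (; first != last; ++first, ++index) {
        if (index != 0) out << ", ";
        out << index << ": " << *first;
    }
    return out << " ]";
}

// Usage sketch.
inline std::string indexed_to_string(const std::vector<std::string>& v) {
    std::ostringstream ss;
    write_indexed(ss, v.begin(), v.end());
    assert(ss.good());
    return ss.str();
}
```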
participants (15)
- Andy Little
- David Abrahams
- Gennadiy Rozental
- Hartmut Kaiser
- Jeff Flinn
- John Torjo
- Jonathan Turkanis
- Larry Evans
- Paul A Bristow
- Pavel Vozenilek
- Reece Dunn
- Roland Richter
- Thorsten Ottosen
- troy d.straszheim
- Vladimir Prus