[program_options] multi-line description patch

What do you think about the attached patch to enable correctly formatted multi-line option descriptions? It is not pretty, but it has the following features:

- Formats multi-line descriptions using first_column_width and a given line length (currently hardcoded).
- If set up for a shell with 80 chars per line it uses only 79, *but* it also formats correctly in larger shells.
- Handles '\n' in description strings.
- Trims leading *single* spaces in new lines (" that" gets trimmed, but "  leave my formatting" does not).
- Words are not chopped at line endings but moved to the next line (unless more than half the line would then be empty).

Known problems:

- If the current shell has too short lines the formatting is all messed up.
- Chokes on '\t'; tabs are currently not handled in any way.
- Sacrifices one char to be able to format correctly in larger shells.
- Code is an "iterator arithmetic" hell. :(
- The line length should be a parameter of options_description with a default value (probably 80). With this, users may query the current shell for its line length, if possible.

Patch compiles fine on CW 8.3 and VC 7.1.

Maybe having two modes would also be a good idea? One as a "safe mode" where one char is sacrificed but the line length may be longer, and another mode where the exact line length is known and fully used ...

The following code ...

----------------------------------------------------------
po::options_description options("Options");
options.add_options()
    ("help", "a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg ")
    ("well_formated",
     "As you can see this is a very well formatted option description.\n"
     "You can do this for example:\n\n"
     "Values:\n"
     " Value1: does this and that\n"
     " Value2: does something else\n"
     " Value3: has a very long\n description.")
    ;
std::cout << options << "\n";
----------------------------------------------------------

... gives this output.

----------------------------------------------------------
Options:
--help a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg
--well_formated As you can see this is a very well formatted option description.
You can do this for example:

Values:
 Value1: does this and that
 Value2: does something else
 Value3: has a very long
 description.
----------------------------------------------------------
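For readers without the attachment, the wrapping rules in the feature list above can be sketched roughly as follows. This is a minimal illustration, not the patch: the name wrap_description, its signature and the exact "half a line" threshold are assumptions, and first_column_width is left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of program_options): wraps `text` into lines
// of at most `width` characters, following the rules listed in the message.
std::vector<std::string> wrap_description(const std::string& text, std::size_t width)
{
    assert(width > 0);
    std::vector<std::string> lines;
    std::istringstream paragraphs(text);
    std::string par;
    while (std::getline(paragraphs, par)) {   // every '\n' starts a new line
        if (par.empty()) { lines.push_back(par); continue; }
        std::size_t pos = 0;
        bool first = true;
        while (pos < par.size()) {
            // On continuation lines drop a *single* leading space, but keep
            // runs of two or more spaces (deliberate formatting).
            if (!first && par[pos] == ' ' &&
                (pos + 1 == par.size() || par[pos + 1] != ' ')) {
                if (++pos == par.size()) break;
            }
            std::size_t len = std::min(width, par.size() - pos);
            // Would the cut fall inside a word?
            if (pos + len < par.size() &&
                par[pos + len - 1] != ' ' && par[pos + len] != ' ') {
                const std::size_t space = par.rfind(' ', pos + len - 1);
                // Break at the last space, unless that would leave more
                // than half of the line empty (then the word is chopped).
                if (space != std::string::npos && space > pos &&
                    space - pos >= width / 2)
                    len = space - pos;
            }
            lines.push_back(par.substr(pos, len));
            pos += len;
            first = false;
        }
    }
    return lines;
}
```

Passing 79 instead of 80 as `width` would correspond to the "sacrifice one char" behaviour described above.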

OK, I have rewritten the multi-line support. I hope it is much easier to read and understand now (but it still has some iterator arithmetic in it ...).

Now tabs ('\t') have a special function:

- An option description consists of one or more paragraphs.
- Paragraphs are separated by a '\n' and may be empty.
- Each paragraph has an independent indent relative to first_column_width.
- The indent is set as the position of the last tab ('\t') in a paragraph; other tabs are ignored. IMHO multiple indents per paragraph make no sense.
- Additionally, a first-line indent can be simulated with spaces (' ') at the beginning of the paragraph.

So the following code ...

-----------------------------------------------
po::options_description options("Options");
options.add_options()
    ("help", "a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg ")
    ("well_formated",
     "As you can see this is a very well formatted option description.\n"
     "You can do this for example:\n\n"
     "Values:\n"
     " Value1: \tdoes this and that, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla\n"
     " Value2: \tdoes something else, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla\n\n"
     " This paragraph has a first line indent only, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla")
    ;
std::cout << options << "\n";
-----------------------------------------------

... gives this output (formatted for 50 chars per line):

-----------------------------------------------
Options:
--help a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg
--well_formated As you can see this is a very well formatted option description.
You can do this for example:

Values:
 Value1: does this and that, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
 Value2: does something else, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla

 This paragraph has a first line indent only, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
-----------------------------------------------

Looks like there is not much interest in my patch!? In case someone is interested, here is a new version with the line length added as a parameter to the options_description ctor.

BTW: on Win32, getting the line length of the current console is as easy as:

    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    // GetStdHandle reports failure as INVALID_HANDLE_VALUE and a missing
    // standard handle as NULL, so check for both.
    if (console != NULL && console != INVALID_HANDLE_VALUE) {
        CONSOLE_SCREEN_BUFFER_INFO buffer_info;
        if (GetConsoleScreenBufferInfo(console, &buffer_info)) {
            line_length = buffer_info.dwSize.X;
        }
        // The standard handle belongs to the process, so it is not closed here.
    }

Bertolt

Bertolt Mildner wrote:
> Looks like there is not much interest in my patch!?
I think it's a wonderful idea, and the next time I write a program with command-line options I would surely use it.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

Hi Bertolt,
> Looks like there is not much interest in my patch!?
Quite the opposite, I'm interested in getting this into program_options. I only saw your first email on Saturday and generally have no time on weekends for any hacking :-(

Basically, the change is great, but I'd like to clarify some things.

1. You wrote:

> - The indent is set as the position of the last tab ('\t') in a paragraph, other tabs are ignored.

Could you clarify this? I could not understand this from the example you've given.

2. I'd prefer that your code be a separate function, which takes a string and outputs it (or returns a new string with proper line-breaks/indents).

3. Such a function really should have comments. It's very hard to understand the exact formatting rules from the code, and such documentation must be present in some form. If you provide code comments, I'll eventually update the documentation. You can just copy all the explanations you've given in your email to the comment and review it for clarity.

4.
    // we need to use one char less per line to work correctly if actual
    // console has longer lines

Could you clarify?

5.
    // trimm slice if out of bounds
    if (i + slice > e)
    {
        slice -= 1;
    }

Why do you do this?

6.
    // prevent chopped words
    if ((*(i + (slice - 1)) != ' ') &&
        (((i + slice) < e) && (*(i + slice) != ' ')))

This really needs a more verbose comment. Like 'check if current line ends in non-space, and the next character is non-space too'.

7.
    slice = min<string::difference_type>(line_length - indent, e - i);

Maybe this should be moved to the top of the loop. I was confused by the assignment which is not used below.

8. This is completely up to you, but I wonder if consistently using string indices everywhere would be clearer. Now you use both indices and iterators and need to convert between them.

Thanks,
Volodya

"Vladimir Prus" <ghost@cs.msu.su> wrote in message news:cp13no$2cr$1@sea.gmane.org...
I can try, but I don't know if I can get it much clearer than my example ...

My intentions behind this feature are:

- If tabs are output as tabs they may easily destroy the formatting.
- It would be nice to indent a paragraph relative to first_column_width.

So paragraph formatting works the following way:

- All tabs are removed before output.
- If the first line of a paragraph should be indented, this has to be done with one or more spaces. (There is no trimming done in the first line!)
- If a paragraph spans multiple lines, the following lines are indented (relative to first_column_width) using the position of the last tab in the paragraph. (Please note that other tabs do not add to the position of the last one!)

So a paragraph has two indentation levels (relative to first_column_width): a "first line indent" and an indent for succeeding lines, if there are any.

OK, this sounds very complex, but it isn't in practical usage. Some examples for paragraph formatting:

"bla bla bla bla bla bla bla bla bla bla"
bla bla bla bla bla bla bla bla bla bla

" bla bla bla bla bla bla bla bla bla bla"
bla bla bla bla bla bla bla bla bla bla

" \tbla bla bla bla bla bla bla bla bla bla"
bla bla bla bla bla bla bla bla bla bla
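As a rough illustration of the rules just described (tabs removed, first line untrimmed, continuation lines indented by the position of the last tab), here is a minimal sketch. It is not the patch: format_paragraph and its signature are made up for this example, first_column_width is left out, the word-break handling is simplified, and dropping an indent that does not fit the line is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the patch itself: formats one paragraph to `width`
// columns. Tabs are removed; the last tab marks the indent of continuation lines.
std::vector<std::string> format_paragraph(const std::string& raw, std::size_t width)
{
    assert(width > 0);
    const std::size_t last_tab = raw.rfind('\t');
    std::string text;
    std::size_t indent = 0;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] != '\t')
            text += raw[i];
        else if (i == last_tab)
            indent = text.size();     // earlier tabs do not add to the position
    }
    if (indent >= width) indent = 0;  // assumption: drop an indent that cannot fit

    std::vector<std::string> lines;
    std::size_t pos = 0;
    bool first = true;
    while (pos < text.size()) {
        if (!first && text[pos] == ' ') {   // no trimming on the first line
            ++pos;
            if (pos == text.size()) break;
        }
        const std::size_t room = first ? width : width - indent;
        std::size_t len = std::min(room, text.size() - pos);
        // If the cut would fall inside a word, break at the last space instead.
        if (pos + len < text.size() &&
            text[pos + len - 1] != ' ' && text[pos + len] != ' ') {
            const std::size_t space = text.rfind(' ', pos + len - 1);
            if (space != std::string::npos && space > pos) len = space - pos;
        }
        lines.push_back((first ? std::string() : std::string(indent, ' ')) +
                        text.substr(pos, len));
        pos += len;
        first = false;
    }
    return lines;
}
```

With a width of 25, the paragraph "  Value1: \tdoes this and that, bla bla bla" comes out as three lines, the second and third indented by ten spaces so that they line up under "does".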
> 2. I'd prefer that your code be a separate function, which takes a string and outputs it (or returns a new string with proper line-breaks/indents).
OK, should be easy.
OK, I'll try to improve on that.

At least in Win32 consoles the cursor gets moved to a new line after writing to the last possible location in a line. So you basically have two options:

- Use full console lines, *but* smash the formatting if the console happens to have longer lines.
- Use "manual newlines", i.e. sacrifice one char for the '\n', *but* the formatting stays intact in larger consoles.

That is why I wrote in a previous post that maybe two modes would be nice:

- One where the *exact* line length is known.
- One where line_length is just a guess, and formatting works in consoles with lines *at least* line_length long.

The current part (= line) of a paragraph is [i, i + slice]. So if i is increased, "i + slice" may move beyond the end of the paragraph. BTW: there might be a problem if slice becomes 0; I have to look into that.

More verbose than the comment in my last patch?

Yes, I think you are right.

I absolutely agree with you! I'm not at all happy with the current implementation myself. So it's on my todo list, too.

Thank you very much for your comments! Stay tuned for a new patch.

Bertolt

Hi Bertolt,
So, the position of the tab basically sets an indent level for the remaining lines in the paragraph. Now I understand this. However, I don't like the "last tab takes effect" rule. What if that last tab is on the third line of the paragraph? Maybe it's clearer (in code and in docs) to allow only one tab in a paragraph and throw otherwise? Or do you have a use case for multiple tabs?

Yea, I understand now.

Thanks, I understand the situation now. I think losing 1 character is not a big problem, so introducing two modes is not necessary.

Ok. I guess if "slice = min<....>" were at the top of the loop, I would not be confused by this code -- I thought you have to do clipping anyway, not only when there's leading whitespace. BTW, maybe you can put the "slice = " code after "if (!first_line)" and remove the clipping inside that "if"?

Sorry, the comment there is OK:

    // if (lastchar != ' ') &&
    // (exists(lastchar + 1) && (lastchar + 1 != ' ')

I misparsed it as old code you've commented out, not as a real comment ;-)

Thank you! I'm looking forward to it.

- Volodya

OK, throwing on multiple tabs makes perfect sense. (Any hint on what exception class is suitable?)

I also see the problem when the tab happens not to be on the first line. The problem is that in cases where line_length is queried from the console this may always happen: just set your console to 10-char lines -> bang. It's the same problem with (first_column_width >= line_length)! I would prefer to *not* throw in these situations, but only assert on them. I think it would be much better to leave a user with an ill-formatted option description than with only an error msg. And probably set the paragraph indent to something like "tab_pos % (line_length - indent)" to prevent problems in the format code. The formatting is smashed anyway ...

> Ok. I guess if "slice = min<....>" were at the top of the loop, I would not [...]

Yup, well spotted! Done.

Thanks,
Bertolt

Bertolt Mildner wrote:
I think program_options::error would work.
Asserting on a run-time condition does not seem reasonable to me ;-) Maybe the right solution is to silently ignore tabs which are not on the first line of a paragraph?

Yes, formatting is smashed anyway, so ignoring the tabs on the second line is a viable option.

- Volodya
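The policy the thread converges on here (throw on more than one tab, silently ignore a tab that is not on the first line) might be sketched like this. The names continuation_indent and format_error are hypothetical; format_error only stands in for program_options::error so that the example compiles without Boost, and first_column_width is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Stand-in for program_options::error, so the sketch needs no Boost headers.
struct format_error : std::logic_error {
    explicit format_error(const std::string& what) : std::logic_error(what) {}
};

// Hypothetical helper: returns the indent for the continuation lines of one
// paragraph, given the number of characters that fit on its first line.
std::size_t continuation_indent(const std::string& paragraph, std::size_t line_width)
{
    assert(line_width > 0);
    if (std::count(paragraph.begin(), paragraph.end(), '\t') > 1)
        throw format_error("only one tab per paragraph is allowed");

    const std::size_t tab = paragraph.find('\t');
    if (tab == std::string::npos)
        return 0;            // no tab: continuation lines are not indented
    if (tab >= line_width)
        return 0;            // tab is not on the first line: silently ignored
    return tab;              // the index of the tab is the indent
}
```

Returning 0 rather than asserting keeps a too-narrow console from aborting the program; whether an assert should be added on top is exactly the open question in the messages that follow.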

OK, so they are ignored, *but* I still think that asserting on it is vital, because in the other case (= fixed length set by the user or, even worse, the line length defaulting to m_default_line_length because the user did *not* set a line length at all) it would mean that the formatting simply fails silently. Not very nice from a user's point of view!

Bertolt

On Wed, 2004-12-08 at 12:13 +0300, Vladimir Prus wrote:
I am joining this discussion really late, but I have also written an (in-house) prettifier which reformats the option descriptions. Some thoughts:

One of the things I did was to align the description lines of different sets of options. (I usually have at least two sets of options, one of which is truly optional and one of which is mandatory, i.e. an exception is thrown if any are missing.)

I considered doing everything by tabs etc., but decided that it would be much better to embed a "line-wrap control code" in my description text. I just used "\0x01", which is a bit of a kludge but no worse than "the last tab".

Leo

"Leo Goodstadt" <leo.goodstadt@human-anatomy.oxford.ac.uk> wrote in message news:1102937091.31683.44.camel@fgu029.anat.ox.ac.uk...
Could you clarify this? Maybe give an example?
The "last tab" rule is already gone!

Bertolt

On Monday 13 December 2004 14:24, Leo Goodstadt wrote:
Hi Leo, it's surprising that two people independently did the same! Since the patch from Bertolt is already submitted and reviewed once, I think it will eventually be committed...
But I'm interested in aligning across sets of options, too. I suspect the code to compute alignment is pretty independent from formatting itself and can become a separate patch?
Actually, it's "first tab", now, IIRC. I've no opinion which one is better. - Volodya

A new version of the patch can be found in the Boost Sandbox File Vault (http://boost-sandbox.sourceforge.net/vault/) under /bmildner/program_optins. It has (hopefully) all the changes discussed so far, except for the detailed formatting description. I also took the liberty of adding a copyright notice. Hope that is OK?

BTW: In previous posts I always talked about the "position of the tab" that is used to set the paragraph indent. That is not correct; it is the "index of the tab"!

Bertolt

Hi Bertolt,
I've reviewed the patch and think it's almost ready. Remaining questions:

1. You use a 2-space indent while the rest of the file uses 4-space. Do you mind if I auto-reformat this?

2. I don't think that passing the line width as a constructor parameter to options_description is optimal. It's not really a property of the options description. A better design would be to add a line_width parameter only to the options_description::print method. What do you think?

3. Do you plan to add a detailed formatting description? If you don't have the time now, no problem, I'll commit the patch anyway.

- Volodya

But this would mean no more operator<< for options_description! Not a real problem, but existing code like

    os << desc;

would have to be changed to

    desc.print(os, line_length);

If you want I can make that change, but the question is: do you really want to? (Assuming a default line_length for operator<< does not really look right to me.)

> 3. Do you plan to add detailed formatting description. If you don't have the time now, no problem, I'll commit the patch anyway.

I plan to. But probably not before new year.
Bertolt

Hi Bertolt, sorry for a belated reply, but here it goes anyway.
I think a default line length will work for a large percentage of users, and to make output with a non-default line_length convenient we can introduce a new function:

    os << line_length(desc, 40)

which will create a special object that will call 'print' with the right parameters. What do you think?

Regardless of what we decide on the above, your patch is almost finished, so I've just committed it. Thanks for the work and your patience!

No pressure, but still would be nice ;-)

- Volodya
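The "special object that will call 'print'" idea could look roughly like the sketch below. Everything here is illustrative: `description` is a tiny stand-in for options_description, its print does naive hard wrapping only, and with_line_length and render are made-up names. Only the call shape `os << line_length(desc, 40)` comes from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for options_description: print() takes the line length as a
// parameter with a default, as proposed in the thread.
struct description {
    std::string text;
    void print(std::ostream& os, std::size_t width = 80) const {
        assert(width > 0);
        for (std::size_t pos = 0; pos < text.size(); pos += width)
            os << text.substr(pos, width) << '\n';   // naive hard wrap
    }
};

// The "special object": remembers the description and the requested length.
struct with_line_length {
    const description& desc;
    std::size_t length;
};

inline with_line_length line_length(const description& d, std::size_t n)
{
    return with_line_length{d, n};
}

// Plain streaming keeps working and uses the default line length ...
inline std::ostream& operator<<(std::ostream& os, const description& d)
{
    d.print(os);
    return os;
}

// ... while streaming the wrapper forwards the chosen length to print().
inline std::ostream& operator<<(std::ostream& os, const with_line_length& w)
{
    w.desc.print(os, w.length);
    return os;
}

// Convenience for the example: render through the wrapper into a string.
inline std::string render(const description& d, std::size_t n)
{
    std::ostringstream os;
    os << line_length(d, n);
    return os.str();
}
```

The wrapper stores only a reference, so it is meant to be used inside a single streaming expression, while the description is still alive.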

OK OK, I think I will be able to live with it :)
You are welcome!
I'm not sure if I can describe it in my own language in a way someone will understand :( Anyway, here is a first try:

-------------------------------
As the class options_description can be used to generate a help message that can be presented to a user, it is important to have some control over the formatting of the description of an option.

A description has one or more paragraphs. Paragraphs are separated by an explicit newline ('\n') and may be empty.

If a paragraph does not fit in one line it is spanned over multiple lines. The library tries to prevent chopped words and leading spaces in new lines. Words are chopped only if longer than half the available space. Leading spaces are only skipped if not followed by a space.

A paragraph has two independent indent levels: one for the first line and one for the following lines, if there are any. The first line indent is done by simply inserting spaces at the beginning of a paragraph. The indent for following lines can be specified by inserting a tabulator character ('\t'), where the index of the tabulator is the indent length. Before output the tabulator is removed. If the tabulator happens not to be on the first line of the paragraph, or is on the last possible position of the first line, it is ignored. Only one tabulator per paragraph is allowed; otherwise an exception of type program_options::error is thrown.

This way of specifying indents may seem overly complicated, but the following examples hopefully demonstrate that usage is rather intuitive.

po::options_description options("Options");
options.add_options()
    ("help", "a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg ")
    ("well_formated",
     "As you can see this is a very well formatted option description.\n"
     "You can do this for example:\n\n"
     "Values:\n"
     " Value1: \tdoes this and that, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla\n"
     " Value2: \tdoes something else, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla\n\n"
     " This paragraph has a first line indent only, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla");
std::cout << line_length(options, 50) << "\n";

... gives this output

Options:
--help a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg a long help msg
--well_formated As you can see this is a very well formatted option description.
You can do this for example:

Values:
 Value1: does this and that, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
 Value2: does something else, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla

 This paragraph has a first line indent only, bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
-------------------------------

Hope that what I wrote makes some sense!? There should be better examples; I just used the ones I already had.

Bertolt

PS: did you get my mail with the value_semantic patch?
participants (4)
- Bertolt Mildner
- David Abrahams
- Leo Goodstadt
- Vladimir Prus