Improving PDF generation - a common look and feel?

Folks,

I've been working on improving our ability to generate PDF's from Quickbook/Docbook source using a "torture-test" consisting of a subset of the Math lib docs. I have some questions/feedback required, but first some observations:

FO Generators:
~~~~~~~~~~~~~~

I've tried three FO processors:

Apache FOP-0.23: lots of flow control issues, and SVG rendering problems. Don't use unless you have to.

Apache FOP-0.93 (latest stable release): Some flow control issues, but hugely improved compared to 0.23. Doesn't render symbol characters (Greek characters for example) correctly without manually editing the FO output.

XEP from RenderX (www.renderx.com): Everything I've tried just plain works first time. It's command line compatible with FOP so it fits into Boost.Build with just a trivial change to your user_config.jam. Only downside is that it's a commercial product and the free personal edition puts a small logo at the bottom of each page.

Test outputs from each can be found in the vault:

http://boost-consulting.com/vault/index.php?&direction=0&order=&directory=PDF%20Test

Improving Our Stylesheets
~~~~~~~~~~~~~~~~~~~~~~~~~

I've made some changes (not committed yet) to our fo.xsl stylesheet to:

* Syntax highlight C++ code.
* Put a box around code blocks and admonishments.
* Improve the appearance of tables.
* Add some keep-together instructions to improve flow-control around tables/code/admonishments.

I've tried to mimic our HTML stylesheets as far as possible, and the effects can be seen in the test PDF's.

The Questions:
~~~~~~~~~~~~~~

1) How do folks feel about the look and feel of these: are these heading in the right direction?

2) Do we want a consistent look/feel across all Boost-PDF's? If yes, is it OK to commit the stylesheet changes (diff's attached)?

3) Many of the page layout options (margins etc) I'm using are expressed as xsl-params in my Jamfile, should these be in the fo.xsl as well? Currently I have:

   # PDF Options:
   # TOC Generation: this is needed for FOP-0.9 and later:
   # <xsl:param>fop1.extensions=1
   # TOC for XEP only:
   <xsl:param>xep.extensions=1
   # TOC generation: this is needed for FOP 0.2, but must be set to zero for FOP-0.9 or XEP!
   <xsl:param>fop.extensions=0
   # No indent on body text:
   <xsl:param>body.start.indent=0pt
   # Margin size:
   <xsl:param>page.margin.inner=0.5in
   # Margin size:
   <xsl:param>page.margin.outer=0.5in
   # Yes, we want graphics for admonishments:
   <xsl:param>admon.graphics=1
   # Set this one for PDF generation *only*:
   # default png graphics are awful in PDF form,
   # better use SVG's instead:
   <xsl:param>admon.graphics.extension=".svg"

4) Do we have a consistent location for PDF downloads: if not should we have?

Outstanding Issues:
~~~~~~~~~~~~~~~~~~~

The biggest one is that PDF generation is a lot harder than it should be with bjam: if we could get FO's and PDF's placed in a "pdf" subdirectory that would help enormously - currently they get generated in a directory of bjam's choosing under bin.v2 which results in all links to images breaking :-(

FO generation also fails if the FO file already exists.

So basically I need a Boost.Build expert to help with these: or at least explain how the existing rules work!

Anyway, thanks in advance for any feedback you may have,

John.

John Maddock wrote:
Outstanding Issues:
~~~~~~~~~~~~~~~~~~~
The biggest one is that PDF generation is a lot harder than it should be with bjam: if we could get FO's and PDF's placed in a "pdf" subdirectory that would help enormously - currently they get generated in a directory of bjam's choosing under bin.v2 which results in all links to images breaking
Will "bin.v2/pdf" be fine? I don't feel like creating directories in the source tree.
:-( FO generation also fails if the FO file already exists.
Which is FOP bug, it seems from your boost-build posting.
So basically I need a Boost.Build expert to help with these: or at least explain how the existing rules work!
For removing output fop, I think you can do it yourself: add

   $(RM) $(>)

to fop action, and somewhere in fop.jam add:

   RM = [ common.rm-command ] ;

As for path -- I suppose I can do it. How urgent is this? I doubt I'll have any time this week.

- Volodya
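A rough sketch of how that change might sit in fop.jam follows. Only the two RM lines come from the suggestion above; the action name render.pdf and the FOP command line are assumptions made for illustration, and should be checked against the real fop.jam before use.

```jam
# fop.jam (sketch only)
import common ;

# Delete command for the current platform, as suggested above.
RM = [ common.rm-command ] ;

# Hypothetical action name and command line: $(>) is the FO source,
# $(<) is the PDF target.
actions render.pdf
{
    fop "$(>)" "$(<)"
    $(RM) "$(>)"
}
```

Since $(>) is the FO source of the action, deleting it probably means the FO file gets regenerated on every build; that trades some build time for never tripping over a stale FO file.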
Anyway, thanks in advance for any feedback you may have,
John.

Vladimir Prus wrote:
The biggest one is that PDF generation is a lot harder than it should be with bjam: if we could get FO's and PDF's placed in a "pdf" subdirectory that would help enormously - currently they get generated in a directory of bjam's choosing under bin.v2 which results in all links to images breaking
Will "bin.v2/pdf" be fine? I don't feel like creating directories in the source tree.
We already put them in the source tree for HTML generation. But yes, it does need to be in the source tree, otherwise all external references to graphics are broken. Manually copying them to "some other place" is what I want to avoid. Folks already set their graphics paths so that generated HTML finds them in the right place, so if we could do the same thing with PDF's so that the FO processor can find the existing graphics that would be great. Docbook XML can stay where it is, it's just the PDF and FO files that need to move.
:-( FO generation also fails if the FO file already exists.
Which is FOP bug, it seems from your boost-build posting.
No, it happens before FOP is invoked: it's the XSLT stage that fails.
So basically
I need a Boost.Build expert to help with these: or at least explain how the existing rules work!
For removing output fop, I think you can do it yourself: add
$(RM) $(>)
to fop action, and somewhere in fop.jam add:
RM = [ common.rm-command ] ;
Thanks, I will try this out.
As for path -- I suppose I can do it. How urgent is this? I doubt I'll have any time this week.
Well it's not a release issue for sure. Just another of those things that we should fix at some point :-)

I've had one go at fixing this myself, but only managed to make things worse :-( If Doug G. or yourself can explain what needs changing I'll have another go.

Thanks, John.

John Maddock wrote:
The Questions:
~~~~~~~~~~~~~~
1) How do folks feel about the look and feel of these: are these heading in the right direction?
2) Do we want a consistent look/feel across all Boost-PDF's? If yes, is it OK to commit the stylesheet changes (diff's attached)?
3) Many of the page layout options (margins etc) I'm using are expressed as xsl-params in my Jamfile, should these be in the fo.xsl as well? Currently
One other question: I've had trouble finding decent SVG graphics for admonishments, those supplied with the Docbook stylesheets are really lame and much poorer than the png versions. Does anyone know of any other projects we can pinch these from? It must be a common Docbook problem.

Cheers, John.

John Maddock wrote:
Folks,
I've been working on improving our ability to generate PDF's from Quickbook/Docbook source using a "torture-test" consisting of a subset of the Math lib docs. I have some questions/feedback required, but first some observations:
Thx for doing this. Couple quick comments.
Improving Our Stylesheets ~~~~~~~~~~~~~~~~~~~~~~~~~
I've made some changes (not committed yet) to our fo.xsl stylesheet to:
* Syntax highlight C++ code.
* Put a box around code blocks and admonishments.
* Improve the appearance of tables.
* Add some keep-together instructions to improve flow-control around tables/code/admonishments.
I've tried to mimic our HTML stylesheets as far as possible, and the effects can be seen in the test PDF's.
Nice...
The Questions:
~~~~~~~~~~~~~~
1) How do folks feel about the look and feel of these: are these heading in the right direction?
I like the stylesheet so I think you should commit it.
2) Do we want a consistent look/feel across all Boost-PDF's? If yes, is it OK to commit the stylesheet changes (diff's attached)?
Right now we don't really deliver PDF's in the delivery, although maybe that will change in the future. I think it clearly needs to be one per library, as I think we've discussed before. I've been generating one for date-time and posting it on the web for each release -- I'd certainly adopt your new stylesheet. In any case, people can customize the stylesheet if they really want.

Jeff

-----Original Message-----
From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of John Maddock
Sent: 08 May 2007 10:54
To: Boost mailing list; boost-docs
Subject: [boost] Improving PDF generation - a common look and feel?
I've been working on improving our ability to generate PDF's from Quickbook/Docbook source using a "torture-test" consisting of a subset of the Math lib docs.
Improving Our Stylesheets ~~~~~~~~~~~~~~~~~~~~~~~~~
I've made some changes (not committed yet) to our fo.xsl stylesheet to:
* Syntax highlight C++ code.
* Put a box around code blocks and admonishments.
* Improve the appearance of tables.
* Add some keep-together instructions to improve flow-control around tables/code/admonishments.
I've tried to mimic our HTML stylesheets as far as possible, and the effects can be seen in the test PDF's.
The Questions:
~~~~~~~~~~~~~~
1) How do folks feel about the look and feel of these: are these heading in the right direction?
All contribute to it looking smart overall, but:

1] I note that the default page size is US letter. IMO the ISO Standard A4 would be a better default. But perhaps this is just my revenge for all those documents that I have printed with their sides chopped off!

2] My personal preference (and I believe it is a common colour scheme) is for comments in green and digits etc in red (keywords in blue is fine).

3] The code font is Courier. Could it be Lucida Console instead - I find this a much more readable fixed width font? Is this widely available on non-Windows systems?

4] Can some of the pdf properties - author, title, subject, keywords etc - be completed automatically?

5] I note that fast web view is not enabled - many will be reading on screen so this might be useful?

6] Should the document be digitally signed, probably automatically? Does this have copyright implications?
2) Do we want a consistent look/feel across all Boost-PDF's?
Definitely IMO.
If yes, is it OK to commit the stylesheet changes (diff's attached)?
Yes.
4) Do we have a consistent location for PDF downloads: if not should we have?
Sounds tidy - alongside html?

Would it make finding documentation easier? I still feel a TOC is no substitute for an index.

And more important is indexing it - one of the major problems with Boost documentation is *finding* what you want to know. Google provides a useful index, but it is often confusing to use (too many hits, or too few). Would a Google index of *just the documentation* be better? (Or is there too much documentation not in the right place?)

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow@hetp.u-net.com

Paul A Bristow wrote:
1] I note that the default page size is US letter. IMO the ISO Standard A4 would be a better default. But perhaps this is just my revenge for all those documents that I have printed with their sides chopped off!
Probably easily changed, what's the most "portable" option between US and EU sizes?
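For what it's worth, the DocBook XSL stylesheets expose the page size as the paper.type parameter (USletter by default), so switching the torture-test to A4 could be sketched as one more line alongside the Jamfile options listed in the first message. Where exactly it goes in a given Jamfile is an assumption; it is shown here as a bare fragment in the same style.

```jam
# Jamfile fragment (sketch): page size for FO/PDF output.
# paper.type is a DocBook XSL parameter; the stylesheets default to USletter.
<xsl:param>paper.type=A4
```

This only changes the default; it doesn't settle which size is the more portable choice.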
2] My personal preference (and I believe it is a common colour scheme) is for comments in green and digits etc in red (keywords in blue is fine).
Hmmm, I like green comments as well, but I've just copied the HTML stylesheets: and I would prefer to keep the html and PDF's using the same scheme. Anyone else have any strong feelings on color schemes?
3] The code font is Courier. Could it be Lucida Console instead - I find this a much more readable fixed width font? Is this widely available on non-Windows systems?
Harder to do: all PDF readers are required to support certain core fonts, and I've kept to those. Using other fonts gets you into all kinds of portability / font embedding / copyright issues.
4] Can some of pdf properties - author, title, subject keywords etc be completed automatically.
I don't know, I guess the information is there in the XML somewhere, I'll try and see if it's possible.
5] I note that fast web view is not enabled - many will be reading on screen so this might be useful?
Don't know how to do that, probably an FO processor option somewhere....
6] Should the document be digitally signed, probably automatically. Does this have copyright implications?
Don't know how to do that either :-(
4) Do we have a consistent location for PDF downloads: if not should we have?
Sounds tidy - alongside html?
Would it make finding documentation easier? I still feel a TOC is no substitute for an index.
And more important is indexing it - one of the major problems with Boost documentation is *finding* what you want to know. Google provides a useful index, but it is often confusing to use (too many hits, or too few). Would a Google index of *just the documentation* be better? (Or is there too much documentation not in the right place?)
You can easily search the web documentation using "site:www.boost.org" to restrict the Google search criteria, but I admit it shows up a lot of non-HTML pages: Google has an option to restrict the page type in a search as well, but apparently HTML isn't one of the available options :-(

Thanks for the comments, John.
participants (4)
- Jeff Garland
- John Maddock
- Paul A Bristow
- Vladimir Prus