[inspect] Contents in XML <legalnotice> sections not checked for L & C issues?

Hi Rene
It seems that the <legalnotice> sections in Boostbook xml files are not checked for L & C issues. Is there any chance that such a feature could be added to the inspect tool before 1.34 goes out? That would surely uncover many more L & C problems.
Thanks & Regards,
-- Andreas Huber
When replying by private email, please remove the words spam and trap from the address shown in the header.

Andreas Huber wrote:
Hi Rene
It seems that the <legalnotice> sections in Boostbook xml files are not checked for L & C issues. Is there any chance that such a feature could be added to the inspect tool before 1.34 goes out? That would surely uncover many more L & C problems.
Could you point out an example?
--
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Rene Rivera wrote:
Could you point out an example?
Sure, have a look at e.g.
http://www.boost-consulting.com/boost/libs/date_time/xmldoc/posix_time.xml
All by itself this file is ok, as there is a copyright notice in an XML comment. However, the generated HTML page ...
http://www.boost.org/regression-logs/cs-win32_metacomm/doc/html/date_time/po...
... only contains a copyright notice, license is absent. No inspect failures are reported for the date_time library. IIUC, inspect could detect this problem by checking the presence & contents of the <legalnotice> section in the boostbook xml.
Regards,
-- Andreas Huber

Andreas Huber wrote:
Sure, have a look at e.g.
http://www.boost-consulting.com/boost/libs/date_time/xmldoc/posix_time.xml
All by itself this file is ok, as there is a copyright notice in an XML comment. However, the generated HTML page ...
http://www.boost.org/regression-logs/cs-win32_metacomm/doc/html/date_time/po...
... only contains a copyright notice, license is absent. No inspect failures are reported for the date_time library. IIUC, inspect could detect this problem by checking the presence & contents of the <legalnotice> section in the boostbook xml.
The inspect program checks the entire contents of the files. In this case the "Subject to the Boost Software License..." at the top is the license, so it's not absent. The visible license notice in this case <http://engineering.meta-comm.com/resources/cs-win32_metacomm/doc/html/date_time.html> comes from the top xml file <http://www.boost-consulting.com/boost/libs/date_time/xmldoc/date_time.xml>. So the problem isn't with the inspection program but with the BoostBook+DocBook translation, which does not include the license info throughout the rest of the generated HTML files.

Rene Rivera wrote:
... only contains a copyright notice, license is absent. No inspect failures are reported for the date_time library. IIUC, inspect could detect this problem by checking the presence & contents of the <legalnotice> section in the boostbook xml.
The inspect program checks the entire contents of the files. In this case the "Subject to the Boost Software License..." at the top is the license, so it's not absent. The visible license notice in this case <http://engineering.meta-comm.com/resources/cs-win32_metacomm/doc/html/date_time.html> comes from the top xml file <http://www.boost-consulting.com/boost/libs/date_time/xmldoc/date_time.xml>. So the problem isn't with the inspection program but with the BoostBook+DocBook translation, which does not include the license info throughout the rest of the generated HTML files.
Right, the example I gave was a bad one. Sorry about that! However, searching through all files under boost/tools/inspect for the string "legalnotice" does not result in any matches. Looking at the code of license_check.cpp it seems that inspect only checks whether a file contains the regex boost[\\s\\W]+software[\\s\\W]+license. So my question is: How does inspect check that boostbook generated docs contain the necessary L & C?
- It seems it doesn't check the generated HTML (otherwise there would be lots of failures for date_time)
- It seems it doesn't check the boostbook xml (otherwise it would be looking for the legalnotice tag, wouldn't it?)
What am I missing?
Regards,
-- Andreas Huber
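To make the check described in this message concrete, here is a minimal Python sketch of a whole-file search for the quoted pattern. It is only an illustration: inspect itself is written in C++, the function name has_license is hypothetical, and case-insensitive matching is an assumption of this sketch rather than something stated in the thread.

```python
import re

# The pattern quoted in the message, written as a Python raw string.
# Case-insensitive matching is an assumption of this sketch.
LICENSE_RE = re.compile(r"boost[\s\W]+software[\s\W]+license", re.IGNORECASE)


def has_license(text: str) -> bool:
    """Return True if the license phrase appears anywhere in the text."""
    return LICENSE_RE.search(text) is not None


# A license mentioned only in an XML comment satisfies a whole-file search.
comment_only = (
    "<!-- Subject to the Boost Software License, Version 1.0. -->\n"
    "<section><para>Some documentation.</para></section>"
)

# A file that carries only a copyright notice does not.
copyright_only = "<section><para>Copyright 2006 Some Author</para></section>"

print(has_license(comment_only))    # True
print(has_license(copyright_only))  # False
```

Because the search runs over the raw file text, it cannot tell whether the phrase sits in a comment, in a <legalnotice> element, or anywhere else.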

Andreas Huber wrote:
Right, the example I gave was a bad one. Sorry about that!
I'm not so sure it was a bad example.
However, searching through all files under boost/tools/inspect for the string "legalnotice" does not result in any matches.
Correct.
Looking at the code of license_check.cpp it seems that inspect only checks whether a file contains the regex boost[\\s\\W]+software[\\s\\W]+license. So my question is: How does inspect check that boostbook generated docs contain the necessary L & C?
It doesn't, as I don't generate the docs before running inspect. Hence it only checks the CVS state. It relies on the doc generation to correctly follow the copyright+license guidelines when generating docs. And even if I did generate the docs before inspecting, it would be pointless, as there is nothing the authors could do about the generated doc files, other than someone fixing the doc tools.
- It seems it doesn't check the generated HTML (otherwise there would be lots of failures for date_time)
Correct, it doesn't.
- It seems it doesn't check the boostbook xml (otherwise it would be looking for the legalnotice tag, wouldn't it?)
Incorrect. Those legalnotice tags should contain the "boost[\\s\\W]+software[\\s\\W]+license" text and hence will be inspected. Hence why your example above is pertinent.
What am I missing?
That the doc tools are broken with regard to propagating license+copyright. :-)

Rene Rivera wrote:
- It seems it doesn't check the boostbook xml (otherwise it would be looking for the legalnotice tag, wouldn't it?)
Incorrect. Those legalnotice tags should contain the "boost[\\s\\W]+software[\\s\\W]+license" text and hence will be inspected. Hence why your example above is pertinent.
Ok, please have a look at:
http://www.boost-consulting.com/boost/libs/thread/doc/thread.xml
What if someone removes the <legalnotice> section in this file (or rewords it to whatever he pleases)? My guess is that inspect will not report a failure then because there's still the XML comment. I've not actually found an example for such a problem but then again I haven't looked that hard.
This problem, together with the broken license propagation of the doc tools, seems to suggest that inspect would best also check the generated HTML.
What am I missing?
That the doc tools are broken with regards to propagating license+copyright. :-)
No, I got that, at least after your second answer. The doc tools need some fixing too.
-- Andreas Huber

Andreas Huber wrote:
Ok, please have a look at:
http://www.boost-consulting.com/boost/libs/thread/doc/thread.xml
What if someone removes the <legalnotice> section in this file (or rewords it to whatever he pleases)? My guess is that inspect will not report a failure then because there's still the XML comment. I've not actually found an example for such a problem but then again I haven't looked that hard.
Yea, that's a correct guess. This is certainly a problem, read that as bug, in such documents. There should only be *one* license statement per file. I think the best approach would be to add a check to the inspect program that complains when it finds multiple license instances in a file.
This problem, together with the broken license propagation of the doc tools, seems to suggest that inspect would best also check the generated HTML.
Strange, it suggests something different to me :-) I find it better to detect errors close to the source of the problem, rather than at the farther-away point after generation.
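The multiple-license check proposed in this message could be sketched roughly as follows. This is a hypothetical Python illustration, not code from inspect: the function names are invented, and it assumes that every license statement contains the same phrase the existing regex looks for, so it counts phrase occurrences rather than true license statements.

```python
import re

# Same phrase the thread says inspect searches for; case-insensitive
# matching is an assumption of this sketch.
LICENSE_RE = re.compile(r"boost[\s\W]+software[\s\W]+license", re.IGNORECASE)


def count_license_statements(text: str) -> int:
    """Count occurrences of the license phrase in the raw file text."""
    return len(LICENSE_RE.findall(text))


def has_duplicate_license(text: str) -> bool:
    """Return True when the phrase occurs more than once in one file."""
    return count_license_statements(text) > 1


# Hypothetical file with the license in a comment and in <legalnotice>.
two_statements = (
    "<!-- Distributed under the Boost Software License, Version 1.0. -->\n"
    "<library><libraryinfo><legalnotice>\n"
    "<para>Subject to the Boost Software License, Version 1.0.</para>\n"
    "</legalnotice></libraryinfo></library>"
)

one_statement = (
    "<library><libraryinfo><legalnotice>\n"
    "<para>Subject to the Boost Software License, Version 1.0.</para>\n"
    "</legalnotice></libraryinfo></library>"
)

print(count_license_statements(two_statements))  # 2
print(has_duplicate_license(one_statement))      # False
```

A check like this would flag a file that mentions the license both in a comment and in visible text, but it would also flag a single statement that happens to repeat the phrase.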

Rene Rivera wrote:
Andreas Huber wrote: [snip]
Ok, please have a look at:
http://www.boost-consulting.com/boost/libs/thread/doc/thread.xml
What if someone removes the <legalnotice> section in this file (or rewords it to whatever he pleases)? My guess is that inspect will not report a failure then because there's still the XML comment. I've not actually found an example for such a problem but then again I haven't looked that hard.
Yea, that's a correct guess. This is certainly a problem, read that as bug, in such documents. There should only be *one* license statement per file.
For boostbook root xml files (like the one above) you'd also need to check that the license appears in the <legalnotice> section.
I think the best approach would be to add a check to the inspect program that complains when it finds multiple license instances in a file.
That's quite a hard problem to solve, especially when a license can appear in arbitrary places (like the XML comment as in the example above). Except of course when you assume that all licenses have some magic words in common.
This problem, together with the broken license propagation of the doc tools, seems to suggest that inspect would best also check the generated HTML.
Strange, it suggests something different to me :-) I find it better to detect errors close to the source of the problem, rather than at the farther-away point after generation.
I assume you suggest that inspect keep checking only the boostbook source xml, as opposed to also (or only) checking the generated HTML. While I agree with the general statement, in this case, by checking the sources only, you cannot detect when the doc tools do something wrong. That's why I suggested checking the generated HTML.
BTW, I do see that L & C checking is tricky business in general and can never be made completely foolproof. All I'm saying is that the current checks can be tightened with relatively little additional effort, by checking the presence and contents of <legalnotice> sections, by checking the generated HTML, or both. Then again, I don't know much about boostbook or inspect so I could well be wrong.
-- Andreas Huber
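As a rough illustration of what checking the presence and contents of <legalnotice> sections might look like, here is a hypothetical Python sketch using the standard library XML parser. It is not how inspect works. It assumes the file parses as standalone XML (real BoostBook sources may rely on DTDs, entities or XInclude that this parser will not resolve), that the element is not namespaced, and the function name legalnotice_has_license is invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Phrase taken from the regex quoted in the thread; case-insensitive
# matching is an assumption of this sketch.
LICENSE_RE = re.compile(r"boost[\s\W]+software[\s\W]+license", re.IGNORECASE)


def legalnotice_has_license(xml_text: str) -> bool:
    """Return True if any <legalnotice> element contains the license phrase.

    The default ElementTree parser drops XML comments, so a license that
    appears only in a comment does not count here.
    """
    root = ET.fromstring(xml_text)
    for notice in root.iter("legalnotice"):
        if LICENSE_RE.search("".join(notice.itertext())):
            return True
    return False


# Hypothetical documents: both carry the license in a comment, but only
# the first one repeats it inside <legalnotice>.
with_notice = """<!-- Distributed under the Boost Software License, Version 1.0. -->
<library name="Example">
  <libraryinfo>
    <legalnotice>
      <para>Subject to the Boost Software License, Version 1.0.</para>
    </legalnotice>
  </libraryinfo>
</library>"""

without_notice = """<!-- Distributed under the Boost Software License, Version 1.0. -->
<library name="Example">
  <libraryinfo>
    <legalnotice>
      <para>Copyright 2006 Some Author</para>
    </legalnotice>
  </libraryinfo>
</library>"""

print(legalnotice_has_license(with_notice))               # True
print(legalnotice_has_license(without_notice))            # False
print(LICENSE_RE.search(without_notice) is not None)      # True
```

The last line shows the gap discussed in the thread: a whole-file search still succeeds on the second document because of the comment, while the element-level check does not.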
participants (2): Andreas Huber, Rene Rivera