Boost regression testing status

Hi,

I'd like to point your attention toward the current status of our regression testing. I'll not express opinions, but these are some facts about 1.34:

- Huang-WinXP-x64 - graph - graphviz_test / msvc-8.0
  LINK : fatal error LNK1181: cannot open input file '[...]\bgl-viz-vc80-gd-1_34.lib'
- For some failures no error message is reported, so that it is pretty much impossible for the developer to fix them; e.g.: http://tinyurl.com/zmpwk
- metacomm-v2 - algorithm/minmax - minmax / msvc-7.0
  'vsvars32.bat' is not recognized as an internal or external command...
- OSL4-V2 - python - andreas_beyer / sun-5.8
  #error Python 2.2 or higher is required for this version of Boost.Python.
- RudbekAssociates-V2 - iostreams - bzip2_test / msvc-7.1
  LINK : fatal error LNK1181: cannot open input file '[...]\boost_iostreams-vc71-gd-1_34.lib'

The list is not exhaustive.

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota wrote:
> Hi,
> I'd like to point your attention toward the current status of our regression testing. I'll not express opinions, but these are some facts about 1.34:
> The list is not exhaustive.

Nope, my pet hate at present is http://tinyurl.com/jqe7j where the results from one test (an expected failure, marked up) are being listed under a completely different test :-(

John.

On Thu, 27 Jul 2006 18:48:12 +0100, "John Maddock" <john@johnmaddock.co.uk> wrote:
> Gennaro Prota wrote:
>> Hi,
>> I'd like to point your attention toward the current status of our regression testing. I'll not express opinions, but these are some facts about 1.34:
>> The list is not exhaustive.
>
> Nope, my pet hate at present is http://tinyurl.com/jqe7j where the results from one test (an expected failure, marked up) are being listed under a completely different test :-(

Argh :-(

John, while you are here, I saw that you didn't reply about identifying min/max guideline violations in comments being difficult via regexes. That made me think twice, as I supposed it was pretty easy to do with sub_matches or alternation. The basis seems to be "//.*$" for single-line comments and "/\*.*?\*/" for multi-line ones. What am I missing? :-)

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota wrote:
> On Thu, 27 Jul 2006 18:48:12 +0100, "John Maddock" <john@johnmaddock.co.uk> wrote:
>> Gennaro Prota wrote:
>>> Hi,
>>> I'd like to point your attention toward the current status of our regression testing. I'll not express opinions, but these are some facts about 1.34:
>>> The list is not exhaustive.
>>
>> Nope, my pet hate at present is http://tinyurl.com/jqe7j where the results from one test (an expected failure, marked up) are being listed under a completely different test :-(
>
> Argh :-(
>
> John, while you are here, I saw that you didn't reply about identifying min/max guideline violations in comments being difficult via regexes. That made me think twice, as I supposed it was pretty easy to do with sub_matches or alternation. The basis seems to be "//.*$" for single-line comments and "/\*.*?\*/" for multi-line ones. What am I missing? :-)

Comment-looking substrings embedded in string constants:

    const char *x = "/* something"; /* whatever */

And comments within comments:

    /**/ something /*/ another /**/
    /** something(); //*/ another(); /**/
    /* something /* another */

Probably more combinations possible :-)

--
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo
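
The first case can be reproduced with a minimal sketch. It assumes std::regex with its default ECMAScript grammar, not the engine or the expressions the inspect tool actually uses, and `strip_comments_naive` is a hypothetical helper invented for the illustration. `[\s\S]` stands in for `.` so that a block comment may span lines, because `.` in this grammar stops at line terminators.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Naive filter: anything that looks like a comment is treated as one,
// with no knowledge of string literals. Each match becomes one space.
std::string strip_comments_naive(const std::string& src) {
    static const std::regex comment(R"re(/\*[\s\S]*?\*/|//[^\r\n]*)re");
    return std::regex_replace(src, comment, " ");
}
```

Applied to the string-constant example, the match starts at the "/*" inside the literal and runs to the closing "*/" of the real comment, so the literal is cut open and the rest of the statement disappears with it.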

On Thu, 27 Jul 2006 13:38:46 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
> Gennaro Prota wrote:
>> On Thu, 27 Jul 2006 18:48:12 +0100, "John Maddock" <john@johnmaddock.co.uk> wrote:
>>> Gennaro Prota wrote:
>>>> Hi,
>>>> I'd like to point your attention toward the current status of our regression testing. I'll not express opinions, but these are some facts about 1.34:
>>>> The list is not exhaustive.
>>>
>>> Nope, my pet hate at present is http://tinyurl.com/jqe7j where the results from one test (an expected failure, marked up) are being listed under a completely different test :-(
>>
>> Argh :-(
>>
>> John, while you are here, I saw that you didn't reply about identifying min/max guideline violations in comments being difficult via regexes. That made me think twice, as I supposed it was pretty easy to do with sub_matches or alternation. The basis seems to be "//.*$" for single-line comments and "/\*.*?\*/" for multi-line ones. What am I missing? :-)
>
> Comment looking substring embedded in string constants:
>
>     const char *x = "/* something"; /* whatever */
>
> And comments within comments:
>
>     /**/ something /*/ another /**/
>     /** something(); //*/ another(); /**/
>     /* something /* another */
>
> Probably more combinations possible :-)

Ah sure. Also "my" "/\*.*?\*/" doesn't match /* */ style comments which actually span multiple lines (:-O). But the intent wasn't to be as accurate as a parser. We were talking about "tweaking" the regexes a bit to avoid false positives. In short we wanted them to be just more accurate than they are(n't) now. If there's no danger to let violations go unnoticed with this, it seems still better than the current code.

Example: this gives a false positive in boost/graph/maximum_cardinality_matching.hpp:

    //[the Tutte-Berge]
    //formula guarantees that
    //
    // 2 * M(G) = min ( |V(G)| + |U| + o(G - U) )
    //
    //where the minimum is taken over all subsets U of
    //V(G).

It wouldn't if we filtered out one-line comments. And if that were in a /* */ style comment, then simply adding "//" in front of the formula line would do the trick.

PS: did you read my suggestion about commit triggers?

--
[ Gennaro Prota, C++ developer for hire ]
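
The effect of filtering one-line comments on the Tutte-Berge example can be sketched as follows. This is an illustration under assumptions: it uses std::regex, and `flags_minmax` is a hypothetical, much simplified stand-in for the min/max check (it only looks for "min" or "max" followed by an opening parenthesis), not the expression the inspect tool really uses.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical simplified check: true if the text contains "min(" or "max(",
// with optional whitespace before the parenthesis.
bool flags_minmax(const std::string& text) {
    static const std::regex call(R"re(\b(?:min|max)\s*\()re");
    return std::regex_search(text, call);
}

// Remove "//" comments up to, but not including, the line terminator.
std::string strip_line_comments(const std::string& text) {
    static const std::regex line_comment(R"re(//[^\r\n]*)re");
    return std::regex_replace(text, line_comment, "");
}
```

With these two helpers, the formula line is flagged as it stands and is no longer flagged once `strip_line_comments` has run first, while a real call such as `int y = min(1, 2); // smallest` is still reported after filtering.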

Gennaro Prota wrote:
> Ah sure. Also "my" "/\*.*?\*/" doesn't match /* */ style comments which actually span multiple lines (:-O). But the intent wasn't to be as accurate as a parser. We were talking about "tweaking" the regexes a bit to avoid false positives.

It's the false negatives that are worrisome. For example:

    const char * x = "//foo"; int x = min(10,20);

Not that I think anyone would write such code. But strange things do happen :-)

> In short we wanted them to be just more accurate than they are(n't) now.

OK. So you're saying we can make the comments conform to something the inspection program will understand?

> If there's no danger to let violations go unnoticed with this, it seems still better than the current code. Example: this gives a false positive in boost/graph/maximum_cardinality_matching.hpp:
>
>     //[the Tutte-Berge]
>     //formula guarantees that
>     //
>     // 2 * M(G) = min ( |V(G)| + |U| + o(G - U) )
>     //
>     //where the minimum is taken over all subsets U of
>     //V(G).
>
> It wouldn't if we filtered out one-line comments. And if that were in a /* */ style comment, then simply adding "//" in front of the formula line would do the trick.

I guess you would have to be extremely careful, and especially limiting, with the regex to have some degree of confidence false negatives don't occur.

> PS: did you read my suggestion about commit triggers?

I don't remember reading it. But I do remember that was suggested some time ago regarding inspection. I think I mentioned some problems with that back then ;-)

--
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

On Thu, 27 Jul 2006 14:23:31 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
> It's the false negatives that are worrisome. For example:
>
>     const char * x = "//foo"; int x = min(10,20);
>
> Not that I think anyone would write such code. But strange things do happen :-)

Yep.

>> In short we wanted them to be just more accurate than they are(n't) now.
>
> OK. So you're saying we can make the comments conform to something the inspection program will understand?

I was basically thinking out loud, trying to find a solution. I considered implementing <boostinspect:nominmax>, </boostinspect:nominmax> for instance. But I find it a bit "heavy". Perhaps we can ask developers to write "Min" in comments, with an uppercase 'm'...

> [...] I guess you would have to be extremely careful, and especially limiting, with the regex to have some degree of confidence false negatives don't occur.

Yes, that's true, though the current implementation isn't free from problems in that regard.

    #define BOOST_MY_LIB_SMALLEST min
    BOOST_MY_LIB_SMALLEST(x, y) // no problem detected here

>> PS: did you read my suggestion about commit triggers?
>
> I don't remember reading it. But I do remember that was suggested some time ago regarding inspection. I think I mentioned some problems with that back then ;-)

Hmm... I'll search in the mailing list archives :-)

--
[ Gennaro Prota, C++ developer for hire ]
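
Both false negatives discussed in this exchange can be reproduced with a small sketch. The assumptions are the same kind as before: std::regex is used, and `detects_minmax` is a hypothetical check invented for the illustration (strip "//" comments, then look for "min(" or "max("), not the inspect tool's real logic.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical check: drop "//" comments first, then search for a
// "min(" or "max(" call in what is left.
bool detects_minmax(const std::string& src) {
    static const std::regex line_comment(R"re(//[^\r\n]*)re");
    static const std::regex call(R"re(\b(?:min|max)\s*\()re");
    const std::string filtered = std::regex_replace(src, line_comment, "");
    return std::regex_search(filtered, call);
}
```

For `const char * x = "//foo"; int x = min(10,20);` the comment filter swallows everything from the "//" inside the literal onwards, so the call is never seen. For the macro pair, no text of the form "min(" exists before preprocessing, so a purely textual check has nothing to match with or without the comment filter.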

On Thu, 27 Jul 2006 14:23:31 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
>> PS: did you read my suggestion about commit triggers?
>
> I don't remember reading it. But I do remember that was suggested some time ago regarding inspection. I think I mentioned some problems with that back then ;-)

Rene, please, summarize it here for this poor developer :-) Really, going through the archives was driving me crazy.

I can imagine one reason for objection: multifile commits; since CVS doesn't have a commit or rollback logic, if one of the files isn't checked in, say, because of tabs, you have to manually figure out the new status of the repository. Apart from that (and the suggestion to switch to SVN), any other issues?

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota wrote:
> On Thu, 27 Jul 2006 14:23:31 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
>>> PS: did you read my suggestion about commit triggers?
>>
>> I don't remember reading it. But I do remember that was suggested some time ago regarding inspection. I think I mentioned some problems with that back then ;-)
>
> Rene, please, summarize it here for this poor developer :-) Really, going through the archives was driving me crazy. I can imagine one reason for objection: multifile commits; since CVS doesn't have a commit or rollback logic, if one of the files isn't checked in, say, because of tabs, you have to manually figure out the new status of the repository. Apart from that (and the suggestion to switch to SVN), any other issues?

I can only remember one other right now. And it's a deployment issue. We would have to match the compiled inspect program to the CVS servers that SF uses. I don't know if they still do this, but this is made harder by SF using multiple servers for CVS, so it's not clear if they are all the same. Of course this is assuming SF even lets one have binaries running within the CVS server, as opposed to scripts.

Along with that is the possibility of totally locking ourselves out of CVS access if the program happens to not work, since we would be unable to fix it by checking in a replacement.

--
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

On Sat, 29 Jul 2006 14:30:16 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
> I can only remember one other right now. And it's a deployment issue. We would have to match the compiled inspect program to the CVS servers that SF uses. I don't know if they still do this, but this is made harder by SF using multiple servers for CVS, so it's not clear if they are all the same. Of course this is assuming SF even lets one have binaries running within the CVS server, as opposed to scripts.
>
> Along with that is the possibility of totally locking ourselves out of CVS access if the program happens to not work, since we would be unable to fix it by checking in a replacement.

Fair points. Off-hand I can't think of any solution to the multiple server deployment problem, but the latter issue looks approachable. For instance Inspector Rex could simply bark and let you check in anyway (after you throw it a bone). Or it could send a mail somewhere (to someone). Or both things.

--
[ Gennaro Prota, C++ developer for hire ]

On Sat, 29 Jul 2006 14:30:16 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
> Gennaro Prota wrote:
>> [...] I can imagine one reason for objection: multifile commits; since CVS doesn't have a commit or rollback logic, if one of the files isn't checked in, say, because of tabs you have to manually figure out the new status of the repository.

As far as I understand from the SF docs, this is a non-issue, as the script is applied before any file is committed.

> [...]
> I can only remember one other right now. And it's a deployment issue. We would have to match the compiled inspect program to the CVS servers that SF uses.

It looks like this is transparent to the user:

<http://sourceforge.net/docs/E04/#scripts>

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota wrote:
> On Sat, 29 Jul 2006 14:30:16 -0500, Rene Rivera <grafikrobot@gmail.com> wrote:
>> Gennaro Prota wrote:
>>> [...] I can imagine one reason for objection: multifile commits; since CVS doesn't have a commit or rollback logic, if one of the files isn't checked in, say, because of tabs you have to manually figure out the new status of the repository.
>
> As far as I understand from the SF docs, this is a non-issue, as the script is applied before any file is committed.

SF docs are terrible at explaining CVS issues ;-) Pre and post commit scripts in CVS are on a per directory basis. So it is possible that files in one dir would be OK while it would fail in the next. Note, I know this from experience, not from any CVS docs I've seen.

>> [...] I can only remember one other right now. And it's a deployment issue. We would have to match the compiled inspect program to the CVS servers that SF uses.
>
> It looks like this is transparent to the user:

Not sure what you mean by transparent in this case.

--
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Rene Rivera wrote:
> Comment looking substring embedded in string constants:
>
>     const char *x = "/* something"; /* whatever */
>
> And comments within comments:
>
>     /**/ something /*/ another /**/
>     /** something(); //*/ another(); /**/
>     /* something /* another */
>
> Probably more combinations possible :-)

Can't you deal with these by first replacing all the instances of a single regular expression that matches both strings and comments with something appropriate? Something like:

    /\*.*?\*/|//[^\r\n]*|"(?:[^\r\n"\\]|\\.)*"

Because the matches won't overlap it should get this right.
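
A minimal sketch of this single-pass idea, under stated assumptions: it uses std::regex (ECMAScript grammar), `[\s\S]` replaces `.` in the block-comment branch so comments may span lines, the string branch is assumed to repeat (a `*` after the group), and the comment branches are wrapped in a capture group so the caller can tell comments from strings. `blank_comments_keep_strings` is a hypothetical name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// One alternation matches comments (group 1) and string literals (no group).
// Comments are replaced by a single space; string literals are copied as-is.
std::string blank_comments_keep_strings(const std::string& src) {
    static const std::regex token(
        R"re((/\*[\s\S]*?\*/|//[^\r\n]*)|"(?:[^\r\n"\\]|\\.)*")re");
    std::string out;
    auto last = src.cbegin();
    for (std::sregex_iterator it(src.cbegin(), src.cend(), token), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);   // text between the previous match and this one
        if (m[1].matched) {
            out += ' ';                 // a comment: blank it out
        } else {
            out += m.str();             // a string literal: keep it untouched
        }
        last = m[0].second;
    }
    out.append(last, src.cend());       // trailing text after the last match
    return out;
}
```

Because the scan always takes the leftmost match, a "/*" or "//" inside a string literal is consumed as part of the string and never starts a comment. Character literals such as '"' and raw string literals are not handled by this sketch.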

On Sun, 30 Jul 2006 20:52:50 +0100, Daniel James <daniel_james@fmail.co.uk> wrote:
> Can't you deal with these by first replacing all the instances of a single regular expression that matches both strings and comments[...]?

Everything can be done. After all, the editor I'm currently using has no doubts in syntax coloring any of the indicated situations and does it correctly. And it is far from being a sophisticated editor.

The issue is, quite frankly, that we have probably done already too much to cope with the problem. If each library author had dealt with problems reported in his own libraries *only* then everything would already be solved. So the question arises spontaneously: why should I, or Rene, undertake what begins to be a project just to "cope with" (fight?) others' disinterest?

PS: sorry for being harsh, this isn't addressed in any way to you, of course (unless you are one of those authors and I haven't noticed ;-)).

--
[ Gennaro Prota, C++ developer for hire ]

>> Nope, my pet hate at present is http://tinyurl.com/jqe7j where the results from one test (an expected failure, marked up) are being listed under a completely different test :-(
>
> Argh :-(
>
> John, while you are here, I saw that you didn't reply about identifying min/max guideline violations in comments being difficult via regexes. That made me think twice, as I supposed it was pretty easy to do with sub_matches or alternation. The basis seems to be "//.*$" for single-line comments and "/\*.*?\*/" for multi-line ones. What am I missing? :-)

Actually not much, try "//[^\n\r]*" for single lines though.

John.

On Fri, 28 Jul 2006 09:53:39 +0100, "John Maddock" <john@johnmaddock.co.uk> wrote:
>>> Nope, my pet hate at present is http://tinyurl.com/jqe7j where the results from one test (an expected failure, marked up) are being listed under a completely different test :-(
>>
>> Argh :-(
>>
>> John, while you are here, I saw that you didn't reply about identifying min/max guideline violations in comments being difficult via regexes. That made me think twice, as I supposed it was pretty easy to do with sub_matches or alternation. The basis seems to be "//.*$" for single-line comments and "/\*.*?\*/" for multi-line ones. What am I missing? :-)
>
> Actually not much, try "//[^\n\r]*" for single lines though.

Thanks for the hint. After a night (or two? not sure :-)) of sleep I decided for the solution indicated in the thread "Inspect Tool Update (was: Boost regression testing status)". Maybe an Inspect V2 could be based on Wave to do full tokenization and preprocessing of the inspected source files.

About the regression test mess, dynamic_bitset is not listed in the report, but has some "red" failures on compilers which I don't think were tested before (which I've discovered by chance); and cases where the same compiler passes or not depending on the test runner (e.g. VC7). Honestly, even if all this has an explanation, are such tests useful?

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota writes:
> About the regression test mess, dynamic_bitset is not listed in the report, but has some "red" failures

A "red" failure indicates a regression from the last-known-good release, whether the toolset is marked as required or not, so it's perfectly normal to have these present on the library's page and absent from the Issues page.

> on compilers which I don't think were tested before (which I've discovered by chance);

Which compilers, and why do you think they weren't tested before?

> and cases where the same compiler passes or not depending on the test runner (e.g. VC7).

That happens, and when it does, it needs investigation.

> Honestly, even if all this has an explanation, are such tests useful?

Well, what is your definition of "such tests"?

--
Aleksey Gurtovoy
MetaCommunications Engineering

On Mon, 31 Jul 2006 14:11:58 -0500, Aleksey Gurtovoy <agurtovoy@meta-comm.com> wrote:
> Gennaro Prota writes:
>> About the regression test mess, dynamic_bitset is not listed in the report, but has some "red" failures
>
> A "red" failure indicates a regression from the last-known-good release, whether the toolset is marked as required or not, so it's perfectly normal to have these present on the library's page and absent from the Issues page.

Ok. I realized this afterwards.

>> on compilers which I don't think were tested before (which I've discovered by chance);
>
> Which compilers, and why do you think they weren't tested before?

Not sure, but I don't remember dynamic_bitset<> being regression tested on CW 8.3.

>> and cases where the same compiler passes or not depending on the test runner (e.g. VC7).
>
> That happens, and when it does, it needs investigation.

Aleksey, regression tests really need some attention. I understand that they are provided as a courtesy, that they don't come for free and all that, but definitely they need attention. I've just fixed a hardcoded cvs repository link which still referred to the old cvsroot, so that no link to source files actually worked. That's alarming, and it's not the test runner's fault: it either means no one follows the links or they take for granted that they do not work. Both are bad signals, IMHO.

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota writes:
> On Mon, 31 Jul 2006 14:11:58 -0500, Aleksey Gurtovoy <agurtovoy@meta-comm.com> wrote:
>> Gennaro Prota writes:
>>> About the regression test mess, dynamic_bitset is not listed in the report, but has some "red" failures
>>
>> A "red" failure indicates a regression from the last-known-good release, whether the toolset is marked as required or not, so it's perfectly normal to have these present on the library's page and absent from the Issues page.
>
> Ok. I realized this afterwards.
>
>>> on compilers which I don't think were tested before (which I've discovered by chance);
>>
>> Which compilers, and why do you think they weren't tested before?
>
> Not sure, but I don't remember dynamic_bitset<> being regression tested on CW 8.3.

Well, it was: http://engineering.meta-comm.com/boost-regression/1_33_1/developer/dynamic_b...

>>> and cases where the same compiler passes or not depending on the test runner (e.g. VC7).
>>
>> That happens, and when it does, it needs investigation.
>
> Aleksey, regression tests really need some attention. I understand that they are provided as a courtesy, that they don't come for free and all that, but definitely they need attention.

Surely there is a lot of room for improvement (as illustrated by http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Boost.Testing), but the above sounds like you think they are in flux, and if so, you should list the particular issues that led you to this conclusion, or otherwise it's FUD. For instance, surely you could check 1.33.1 report pages before making the claim that dynamic_bitset wasn't tested with CW 8.3 (or any other toolset)?

> I've just fixed a hardcoded cvs repository link which still referred to the old cvsroot,

That's much appreciated.

> so that no link to source files actually worked. That's alarming, and it's not the test runner's fault: it either means no one follows the links or they take for granted that they do not work.

The former.

> Both are bad signals, IMHO.

I don't think it's that bad, really: library maintainers don't follow the links because they know what their tests are, and have a working copy anyway, and most users don't follow them because they are not interested in that level of detail (or maybe they don't know where to report a broken link, which is a problem, but hardly an indication of the regression tests being in flux).

--
Aleksey Gurtovoy
MetaCommunications Engineering

On Mon, 31 Jul 2006 19:19:45 -0500, Aleksey Gurtovoy <agurtovoy@meta-comm.com> wrote:
> Surely there is a lot of room for improvement (as illustrated by http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Boost.Testing), but the above sounds like you think they are in flux, and if so, you should list the particular issues that led you to this conclusion, or otherwise it's FUD. For instance, surely you could check 1.33.1 report pages before making the claim that dynamic_bitset wasn't tested with CW 8.3 (or any other toolset)?

Yes. The "particular issues which led me to this conclusion" are listed in the initial post of this thread. Surely you could read a little above before making the claim that anyone spreads FUD, especially when it comes to people who have been here for years.

That said, I don't think I've much more to add to this thread.

--
[ Gennaro Prota, C++ developer for hire ]

Gennaro Prota writes:
> On Mon, 31 Jul 2006 19:19:45 -0500, Aleksey Gurtovoy <agurtovoy@meta-comm.com> wrote:
>> Surely there is a lot of room for improvement (as illustrated by http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Boost.Testing), but the above sounds like you think they are in flux, and if so, you should list the particular issues that led you to this conclusion, or otherwise it's FUD. For instance, surely you could check 1.33.1 report pages before making the claim that dynamic_bitset wasn't tested with CW 8.3 (or any other toolset)?
>
> Yes. The "particular issues which led me to this conclusion" are listed in the initial post of this thread.

All of them but one are perfectly ordinary environment/linking issues, no? OK, maybe they are not normal that far in the release cycle, but that's hardly a "regression test mess".

> - For some failures no error message is reported, so that it is pretty much impossible for the developer to fix them; e.g.: http://tinyurl.com/zmpwk

We will look into this one.

> Surely you could read a little above before making the claim that anyone spreads FUD, especially when it comes to people who have been here for years.

I did.

--
Aleksey Gurtovoy
MetaCommunications Engineering

participants (5)
- Aleksey Gurtovoy
- Daniel James
- Gennaro Prota
- John Maddock
- Rene Rivera