Boost Test Library, expected failures, unexpected success

Hi Folks,

We've been using the boost unit test library for over a year now, and we're very happy with it. There's one little nit, though, and I'm curious about what people think it means, and what people have been doing about it.

The issue is "expected failures". Expected failures are specified at the test case level, e.g.

BOOST_TEST_CASE( test_case_name, expected failures ) { ... }

So, one could have a test case with, say, four assertions (a.k.a. BOOST_CHECK), and specify that you expect two to fail. Fine. How do you know if the two that failed were the two you expected to fail?

One solution is "don't do that": have only one assertion per test case. We find that to be extremely cumbersome; if it takes 20 lines of code to set up an object and put it into a given state, we'd have to duplicate those 20 lines of code across multiple test cases (or, alternately, extract them into a helper function, which is annoying and not always possible).

I _think_ what I'd like to see is 'expected failures' marked, not at the test case level, but at the assertion level; that is, instead of specifying "N out of the following M assertions are expected to fail" (as is done now), specify "This assertion is expected to fail" on an assertion-by-assertion basis.

I can probably go hack this into the library, but I'd prefer not to. Any other thoughts or solutions? Thanks!!

----------------------------------------------------------------------
Dave Steffen, Ph.D., Software Engineer IV, Numerica Corporation
"Disobey this command!" - Douglas Hofstadter
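[For reference, with manual test registration the expected-failures count is passed as the second argument to test_suite::add(). A minimal sketch following the 1.33-era API; the exact signature may differ in other versions:]

#include <boost/test/unit_test.hpp>
using namespace boost::unit_test;

void my_test()
{
    BOOST_CHECK( 1 + 1 == 2 );  // passes
    BOOST_CHECK( 1 + 1 == 3 );  // the one failure we declare as expected
}

test_suite* init_unit_test_suite( int, char* [] )
{
    test_suite* suite = BOOST_TEST_SUITE( "master" );

    // Second argument to add(): the number of failures this
    // test case is expected to produce.
    suite->add( BOOST_TEST_CASE( &my_test ), 1 );

    return suite;
}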

Dave Steffen wrote:
Hi Folks,
We've been using the boost unit test library for over a year now, and we're very happy with it. There's one little nit, though, and I'm curious about what people think it means, and what people have been doing about it.
The issue is "expected failures".
Expected failures are specified at the test case level, e.g.
BOOST_TEST_CASE( test_case_name, expected failures ) { ... }
So, one could have a test case with, say, four assertions (a.k.a. BOOST_CHECK), and specify that you expect two to fail. Fine. How do you know if the two that failed were the two you expected to fail?
One solution is "don't do that": have only one assertion per test case. We find that to be extremely cumbersome; if it takes 20 lines of code to set up an object and put it into a given state, we'd have to duplicate those 20 lines of code across multiple test cases (or, alternately, extract them into a helper function, which is annoying and not always possible).
I _think_ what I'd like to see is 'expected failures' marked, not at the test case level, but at the assertion level; that is, instead of specifying "N out of the following M assertions are expected to fail" (as is done now), specify "This assertion is expected to fail" on an assertion-by-assertion basis.
I can probably go hack this in to the library, but I'd prefer not to. Any other thoughts or solutions?
I'm curious why simply inverting the test is not enough. Assuming a BOOST_CHECK_FAILS existed:

BOOST_CHECK_FAILS( someobject.doesntwork() );

could just as easily be written as:

BOOST_CHECK( !someobject.doesntwork() );

It is provable, using basic boolean algebra, that there is no check that could be written with one method but not the other, so I don't understand the requirement for adding the former.

Noah Roberts writes:
Dave Steffen wrote:
Hi Folks, [...] Expected failures are specified at the test case level, e.g.
BOOST_TEST_CASE( test_case_name, expected failures ) { ... } [...] I _think_ what I'd like to see is 'expected failures' marked, not at the test case level, but at the assertion level; that is, instead of specifying "N out of the following M assertions are expected to fail" (as is done now), specify "This assertion is expected to fail" on an assertion-by-assertion basis.
I can probably go hack this in to the library, but I'd prefer not to. Any other thoughts or solutions?
I'm curious why simply inverting the test is not enough. Assuming a BOOST_CHECK_FAILS existed:
BOOST_CHECK_FAILS(someobject.doesntwork());
could be just as easily written as:
BOOST_CHECK(!someobject.doesntwork());
Well, that looks like any other test assertion, and results in a pass or fail. What we're after here is something different: we're distinguishing between two different kinds of failures, and we want them reported as such. In contrast, what you've got above turns an "expected failure" into a "pass", which isn't what we want.

There is a larger question, that is probably better directed at Gennadiy, since he wrote the library: why support expected failures? What do "expected failures" mean? What's the use case? This is, however, a different discussion (that we can have if people are interested).

But the point is that failures and expected failures are reported differently. There are, in effect, three possible outcomes of a test: pass, fail, and fail (but we expected it to).

What I'm asking for is that the "expected failure" notion be specified, not at the test case level, but at the test assertion level... but with the same reporting scheme as is currently in use.

----------------------------------------------------------------------
Dave Steffen, Ph.D., Software Engineer IV, Numerica Corporation
"Disobey this command!" - Douglas Hofstadter
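[For concreteness, the assertion-level marking being asked for might look like the sketch below. BOOST_CHECK_EXPECTED_FAILURE is hypothetical (no such macro exists in Boost.Test), and `observed` is a placeholder for the state under test:]

BOOST_AUTO_TEST_CASE( widget_behaviour )
{
    // ... the 20 lines of shared setup, written once ...
    int observed = 2;  // placeholder for the object state being checked

    BOOST_CHECK( observed > 0 );  // ordinary assertion: pass or fail

    // Hypothetical: reported as an "expected failure" when it fails,
    // and flagged as an error if it unexpectedly passes.
    BOOST_CHECK_EXPECTED_FAILURE( observed == 3 );  // known bug
}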

"Dave Steffen" <dgsteffen@numerica.us> wrote in message news:17874.2615.76539.30581@yttrium.numerica.us...
BOOST_CHECK(!someobject.doesntwork());
Well, that looks like any other test assertion, and results in a pass or fail. What we're after here is something different: we're distinguishing between two different kinds of failures, and we want them reported as such. In contrast, what you've got above turns an "expected failure" into a "pass", which isn't what we want.
This is really the same as what the "expected failure" feature does: "temporarily" shut up a failing test case.
There is a larger question, that is probably better directed at Gennadiy, since he wrote the library: why support expected failures? What do "expected failures" mean? What's the use case? This is, however, a different discussion (that we can have if people are interested).
IMO it should be used primarily either as a temporary solution, in case you need to clean up your regression test charts before a release and don't have time to fix the failing assertion, or as a portability tool, when a particular assertion is expected to fail under some of the configurations you test against. There may be other uses, but you should not overuse it.
But the point is that failures and expected failures are reported differently. There are, in effect, three possible outcomes of a test: pass, fail, and fail (but we expected it to).
What I'm asking for is that the "expected failure" notion be specified, not at the test case level, but at the test assertion level... but with the same reporting scheme as is currently in use.
I don't really see a big advantage over just commenting out the line in question.

Gennadiy

"Dave Steffen" <dgsteffen@numerica.us> wrote in message news:17873.61503.991989.419018@yttrium.numerica.us...
Hi Folks,
We've been using the boost unit test library for over a year now, and we're very happy with it. There's one little nit, though, and I'm curious about what people think it means, and what people have been doing about it.
The issue is "expected failures".
Expected failures are specified at the test case level, e.g.
BOOST_TEST_CASE( test_case_name, expected failures ) { ... }

If you are using the "auto" facility, it's done like this:

BOOST_AUTO_TEST_CASE_EXPECTED_FAILURES( my_test1, 1 )
BOOST_AUTO_TEST_CASE( my_test1 )
{
    .....
}
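[A complete translation unit using the auto-registration facility, for anyone who wants to try this. BOOST_TEST_MODULE with the single-header variant is the modern entry point; the 1.34-era spelling used BOOST_AUTO_TEST_MAIN instead:]

#define BOOST_TEST_MODULE expected_failures_demo
#include <boost/test/included/unit_test.hpp>

// Declare, ahead of the definition, that my_test1 is expected
// to produce exactly one assertion failure.
BOOST_AUTO_TEST_CASE_EXPECTED_FAILURES( my_test1, 1 )
BOOST_AUTO_TEST_CASE( my_test1 )
{
    BOOST_CHECK( 2 + 2 == 4 );  // passes
    BOOST_CHECK( 2 + 2 == 5 );  // fails, but the failure is expected
}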
So, one could have a test case with, say, four assertions (a.k.a. BOOST_CHECK), and specify that you expect two to fail. Fine. How do you know if the two that failed were the two you expected to fail?
Yep. That's the reason expected failures usage should be limited. Note, though, that if the number of failures is less than expected, it's also treated as an error.
One solution is "don't do that": have only one assertion per test case. We find that to be extremely cumbersome; if it takes 20 lines of code to set up an object and put it into a given state, we'd have to duplicate those 20 lines of code across multiple test cases (or, alternately, extract them into a helper function, which is annoying and not always possible).
Fixtures are your friend. Here is an example of fixture usage:

struct F {
    F() : i( 0 ) { BOOST_TEST_MESSAGE( "setup fixture" ); }
    // here in the constructor you do your 20 lines of code to set up
    ~F()          { BOOST_TEST_MESSAGE( "teardown fixture" ); }

    int i;
};

//____________________________________________________________________________//

// this test case will use struct F as fixture
BOOST_FIXTURE_TEST_CASE( my_test1, F )
{
    // you have direct access to non-private members of the fixture structure
    BOOST_CHECK( i == 1 );
}

//____________________________________________________________________________//

// you could have any number of test cases with the same fixture
BOOST_FIXTURE_TEST_CASE( my_test2, F )
{
    BOOST_CHECK_EQUAL( i, 2 );
    BOOST_CHECK_EQUAL( i, 0 );
}

//____________________________________________________________________________//

Enjoy,
Gennadiy

P.S. You will need the 1.34 RC for the above example to work, I believe.
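[Tying the two suggestions in this thread together: with a fixture, the "one assertion per test case" workaround becomes cheap, and the expected-failure count can then be attached to exactly the case holding the known-bad assertion. A sketch, assuming the expected-failures decoration applies to fixture test cases the same way it does to plain auto-registered ones; F is the fixture from the example above:]

// The healthy assertion lives in its own case...
BOOST_FIXTURE_TEST_CASE( f_value_ok, F )
{
    BOOST_CHECK_EQUAL( i, 0 );
}

// ...and the known-bad assertion in another, marked as expected to fail.
BOOST_AUTO_TEST_CASE_EXPECTED_FAILURES( f_known_bug, 1 )
BOOST_FIXTURE_TEST_CASE( f_known_bug, F )
{
    BOOST_CHECK_EQUAL( i, 1 );  // known bug: fails until fixed
}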

Gennadiy Rozental writes:
"Dave Steffen" <dgsteffen@numerica.us> wrote in message news:17873.61503.991989.419018@yttrium.numerica.us...
[...]
The issue is "expected failures".
Expected failures are specified at the test case level, e.g.
BOOST_TEST_CASE( test_case_name, expected failures )
If you are using "auto" facility it's done like this:
BOOST_AUTO_TEST_CASE_EXPECTED_FAILURES( my_test1, 1 )
BOOST_AUTO_TEST_CASE( my_test1 ) { ..... }
Right, exactly.
So, one could have a test case with, say, four assertions (a.k.a. BOOST_CHECK), and specify that you expect two to fail. Fine. How do you know if the two that failed were the two you expected to fail?
Yep. That's the reason expected failures usage should be limited.
Well, yes. In fact, I'm curious about your use cases for expected failures. What do you think it means for a failure to be expected? Anyway...
Note, though, that if the number of failures is less than expected, it's also treated as an error.
Yes, right, absolutely.
One solution is "don't do that": have only one assertion per test case. We find that to be extremely cumbersome; if it takes 20 lines of code to set up an object and put it into a given state, we'd have to duplicate those 20 lines of code across multiple test cases (or, alternately, extract them into a helper function, which is annoying and not always possible).
Fixtures are your friend. [ ... snip example ... ]
Hey, that's nifty. :-)
P.S. You will need 1.34 RC for above example to work I believe.
Ah... we're in the process of moving to 1.33.1 right now. (We have to do regression tests, etc.) We'll go to 1.34 when it comes out, unless it's seriously delayed... hmm... [comments snipped] :-)

----------------------------------------------------------------------
Dave Steffen, Ph.D., Software Engineer IV, Numerica Corporation

"Dave Steffen" <dgsteffen@numerica.us> wrote in message news:17874.15059.105215.467627@yttrium.numerica.us...
Yep. That's the reason expected failures usage should be limited.
Well, yes. In fact, I'm curious about your use cases for expected failures. What do you think it means for a failure to be expected?
In my own test modules I don't use it at all. But then, here at work we don't have well-established regression testing.
Ah... we're in the process of moving to 1.33.1 right now. (We have to do regression tests, etc.) We'll go to 1.34 when it comes out, unless it's seriously delayed... hmm... [comments snipped] :-)
I recommend you try the 1.34 RC, even if it is for Boost.Test only. My guess is it should work even when compiled in a 1.33.1 environment.

Gennadiy

Gennadiy Rozental writes:
"Dave Steffen" <dgsteffen@numerica.us> wrote in message news:17874.15059.105215.467627@yttrium.numerica.us...
Yep. That's the reason expected failures usage should be limited.
Well, yes. In fact, I'm curious about your use cases for expected failures. What do you think it means for a failure to be expected?
In my own test modules I don't use it at all. But then, here at work we don't have well-established regression testing.
:-) Well, OK then. The only thing _we_ are using it for is to mark old (e.g. pre-Boost Test) hopelessly broken unit tests, and to mark code that should have unit tests but doesn't. I'm in charge of a new project, just starting up, which will get an entirely new code base; I don't plan to have any expected failures there either. It's good to know I'm not crazy, or missing something obvious.
Ah... we're in the process of moving to 1.33.1 right now. (We have to do regression tests, etc.) We'll go to 1.34 when it comes out, unless it's seriously delayed... hmm... [comments snipped] :-)
I recommend you try the 1.34 RC, even if it is for Boost.Test only. My guess is it should work even when compiled in a 1.33.1 environment.
I'll look into it. We don't have huge processes around here for putting different library versions into use, but there's a little futzing. :-) I do like the notion of fixtures...

----------------------------------------------------------------------
Dave Steffen, Ph.D., Software Engineer IV, Numerica Corporation
"Disobey this command!" - Douglas Hofstadter