
Mathieu Champlon <m.champlon <at> free.fr> writes:
Hi Mathieu, This is a big subject, and one riddled with a lot of confusion IMO.
I'm a bit puzzled by the kitchen_robot example (and not just because it grills chicken without any chicken! :p).
OK. Let me first reply to the comments specific to my example, and later on we can tackle the problem in general.
MockMicrowave::get_max_power looks hard-coded to return a value of 1000, so to me this looks like a stub rather than a mock object, or am I missing something?
1000 is indeed hardcoded in this particular mock example, but note that it is never tested against, so we are not testing the state. In a sense it is part of the mock behavior that this particular value is returned. We could have written the mock differently, with this value initialized in the constructor (or with multiple possible values). In many scenarios such mocks with hardcoded values will properly represent reality and satisfy our needs from an interaction testing standpoint.
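For illustration, a minimal sketch along these lines (the Microwave interface here is my assumption, not the exact one from the example):

    class Microwave {
    public:
        virtual ~Microwave() {}
        virtual int  get_max_power() const = 0;
        virtual void set_power_level( int level ) = 0;
    };

    class MockMicrowave : public Microwave {
    public:
        explicit MockMicrowave( int max_power ) : m_max_power( max_power ) {}

        virtual int  get_max_power() const { return m_max_power; }
        virtual void set_power_level( int ) {} // interactions get logged here
    private:
        int m_max_power;
    };

Different test scenarios can then construct MockMicrowave( 1000 ), MockMicrowave( 2000 ) and so on without touching the mock class itself.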
How would you test that the robot calls set_power_level properly with respect to different max power values?
We can actually implement the check in this method and throw an exception if it fails. I believe the framework may not log exceptions properly yet; it is something to be improved.
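Something like this (again only a sketch, reusing the hypothetical Microwave interface above):

    #include <stdexcept>

    class CheckingMockMicrowave : public Microwave {
    public:
        virtual int get_max_power() const { return 1000; }

        virtual void set_power_level( int level )
        {
            // the mock itself validates the contract and signals misuse
            if( level < 0 || level > get_max_power() )
                throw std::logic_error( "power level out of range" );
            // ... log the interaction as usual ...
        }
    };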
What if the max power could change at any time and you would like to 'program' the mock object to test the robot (first time return 1000, second time return 2000, etc.)? And what about a malfunctioning oven which would throw exceptions? Would you write a new MockMicrowave implementation for every test case?
1. I can indeed write a separate subclass with varying behavior in some methods. If this mock ends up being used in 10 different test scenarios, that is much better than repeating the mock setup with all this configuration in each test case.
2. We can write the mock class in such a way that it can be configured through some mock-specific interface to tailor it to the specific behavior you like (see the sketch below). This is the most flexible option: it lets you reuse the same class and at the same time implement arbitrary changes in behavior that no framework will ever be able to provide for you.
3. Finally, you could possibly implement support in the mock library for specifying some subset of possible behaviors through some template magic and compiler-specific hacks. I am somewhat doubtful one can provide a nice generic interface for this; moreover, I am not convinced it is worth the effort.
Most importantly though, none of these approaches implies specifying your expectations in a test case. It is all about specifying mock behavior, so either way it can be used within the bounds of my approach.
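To illustrate option 2, a hypothetical configurable mock might look like this (the names and configuration interface are made up for this sketch):

    #include <queue>
    #include <stdexcept>

    class ConfigurableMockMicrowave : public Microwave {
    public:
        ConfigurableMockMicrowave() : m_throw_on_set( false ) {}

        // mock-specific configuration interface, used by the test setup;
        // push at least one max power value before exercising the robot
        void will_return_max_power( int value ) { m_max_powers.push( value ); }
        void will_throw_on_set_power()          { m_throw_on_set = true; }

        virtual int get_max_power() const
        {
            int value = m_max_powers.front();
            if( m_max_powers.size() > 1 )   // keep returning the last value
                m_max_powers.pop();
            return value;
        }

        virtual void set_power_level( int )
        {
            if( m_throw_on_set )            // simulate a malfunctioning oven
                throw std::runtime_error( "oven malfunction" );
        }
    private:
        mutable std::queue<int> m_max_powers;
        bool                    m_throw_on_set;
    };

A test would then call mock.will_return_max_power( 1000 ); mock.will_return_max_power( 2000 ); in its setup, before exercising the robot.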
Then there seems to be some kind of trace logging involved, which if I understand correctly can be seen as describing the expectations a posteriori.
Yes. Indeed.
From my understanding this logs on the first run, then reloads the expectations and uses them as a base to validate new runs.
Yes. Basic record/replay approach.
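The core of the idea fits in a few lines; a rough sketch (the file handling and format here are simplified, this is not the actual framework code):

    #include <fstream>
    #include <string>

    class InteractionLog {
    public:
        explicit InteractionLog( std::string const& pattern_file )
        : m_in( pattern_file.c_str() )
        , m_recording( !m_in.is_open() )    // no pattern yet -> record mode
        {
            if( m_recording )
                m_out.open( pattern_file.c_str() );
        }

        // every mock reports each call here as a formatted string
        bool check( std::string const& call )
        {
            if( m_recording ) {
                m_out << call << '\n';      // first run: record the pattern
                return true;
            }
            std::string expected;
            std::getline( m_in, expected ); // later runs: replay and compare
            return expected == call;
        }
    private:
        std::ifstream m_in;
        bool          m_recording;
        std::ofstream m_out;
    };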
I see a number of problems with this (besides the serialization requirement on arguments to be checked), the major one being that it does not allow TDD.
Why is this?
Also another common usage of mock objects in tests is to document how the object under test reacts to the outside world. Moving the expectations outside the test makes this difficult
Actually IMO it is the other way around. Having these expectations in a plain text log file and not in the source code does not decrease the test's value from any perspective, and:
* the actual source code does not interfere with the log, so you can read it more easily
* there is no way an "unaware" person can peek into your test case and understand exactly what the interaction expectations are; looking at the log file is much easier. Look at any example of these test cases and tell me this is not the case.
(although not impossible, I suppose the test cases and expectations log files could be post-processed to produce sequence diagrams or something similar).
Yes. Indeed. And given the fixed format of these, you can generate proper documentation out of them, or vice versa: these "pattern" files can be generated during your requirements phase by people who are not developers at all. It is common to describe interaction expectations in requirements (not so much the specific behavior of your class, which you test using state expectations).
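For example, a pattern file for the kitchen_robot test might look something like this (the format shown is purely illustrative, not the framework's actual one):

    get_max_power() -> 1000
    set_power_level( 500 )
    start_grill( 60 )

A fixed line-oriented format like this is trivial to post-process into a sequence diagram, and just as easy for a non-developer to write up front.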
Actually we (my team and company) have attempted this approach in the past. It works nicely on small use cases, but quickly tends to get in the way of refactoring. When each code change fails a dozen test cases which then have to be manually checked only to discover that a small variation to the algorithm still produced a perfectly valid expected output but with a slightly different resolution path, it tends to be very counter-productive.
This only indicates that you may be testing the wrong thing. With interaction based testing, while it is useful in some scenarios, it is also very easy to fall into the trap of testing the implementation. This is the nature of the whole process. Either the algorithms above are better off being tested by checking the state of the produced values, or it is possible that your tests were valid; but then how can you be sure the algorithm changes are valid? At the very least your test cases notified you about these changes. All you need to do now (at least in my approach) is to go through the new logs and "accept" them as new pattern files. This is also a very common and valid approach to interaction based testing.
Therefore we started to add ways to relax the expectations in order to minimize the false positive test failures: the number of times and the order in which expectations happen, that some arguments sometimes are to be verified and sometimes not, etc.
The more complicated the behavior variations you accept, the less valuable your test becomes. Somehow you do not write your test cases like this: OK, this function sometimes returns 5, sometimes 6 and in some rare cases 10. You can do this, but this is a bad test. The same applies to interaction based testing. You are much better off with a specific interaction being expected given the same input.
In the end spelling out the expectations started to look more like a solution and less like a problem.
IMO spelling out non-trivial interaction expectations indicates tool misuse. Some behavior configuration of the mocks is fine. Non-deterministic behavior expectations are not.
Do you have any experience on this matter? Did you manage to overcome this issue?
Well, I can go as far as admitting that there is potential for improvement, where you could implement some relaxed rules for matching pattern files, but misuse of these will make your test case useless. As I said above, I still believe the record/replay approach is preferable to any other one, and I do not see big issues with it as is.

Regards,
Gennadiy