Re: [Boost-users] Library Interface Design

14 Sep 2006

Scott Meyers wrote:
> ...
> Jeff Garland wrote:
>> ...
>> Testing in total isolation is a myth.  To be a little softer -- it's really
>> going to depend on the type of class you are testing whether or not it can be
>> rationally tested in isolation.  If you haven't lately, you should re-read
>> Lakos's treatment of this subject in Large Scale C++ Software Design. This
>> book is 10 years old, but he breaks down testability in a way I've not seen
>> anyone else do since doing testing became all the rage.  Most of the 'test
>> first' stuff I've seen ignores the inherent untestability of some software.
> That's been my impression.  One of the things I've been trying to figure
> out wrt the whole testing hoopla is how well it translates to large
> projects and how it has to be adjusted when things move beyond toy
> examples.  And yes, I probably should go back and reread Lakos.

Well, the testing hoopla 'applies' to the extent that in my experience big 
systems that *don't* have significant testing discipline never see the light 
of day.  That is, they fail under an avalanche of integration and basic 
execution problems before ever being fielded.  As an aside, I always get a 
good laugh out of all the agonizing by various folks over how this and that 
testing technique that they've *recently discovered* on a 15 person project 
applies to large systems.  Big systems have been using these approaches for 
years...or they failed. Now, that's not to say that the level of rigor advised 
by many of the test-first proponents really happens on big projects either. 
Is it economical to spend time writing code to check a 'getter'/'setter' 
interface that will just obviously work?  The answer is no.  In fact, the 
testing you can avoid, just like the coding you can avoid, is really a big 
part of successful big system development.

From my experience the best-practice of testing depends on what the code is 
used for and what else depends on it.  If it's a widely used library (say 
date-time to pick one :) you want it to be very well unit tested because 
thousands of LOC will depend on it.  Every time you modify it you have to 
retest a large amount of code. It also turns out to be easy to unit test 
because it doesn't depend on much. On the other hand, take the case of a user 
interface which has no other code that depends on it -- my advice is to skip 
most of the unit and automated tests.  For one thing, it's very hard to write 
useful test code.  For another, a human can see in 1 second what a machine can 
never see (ugly layout, poor interaction usability, etc).  Since testing at 
the 'top level' of the architecture depends on basically all the other 
software in the system, it tends to change rapidly -- people can quickly adjust 
to the fact that the widgets moved around on the screen, but test programs tend 
to be fragile in the face of this sort of change. And finally, since no other 
code depends on this code, it isn't worth the time -- you can change it at will. 
Bottom line is that not all code is created equal w.r.t. the need for or ease 
of testing.

Of course the landscape isn't static either -- some good things have happened. 
One thing that's really changed is that the test first/XP/Agile folks have 
managed to convince developers that they actually need to execute their code 
before they deliver -- a good thing.  This often wasn't common practice 10 
years ago. Also, developers have more and more pre-tested code to pull off the 
shelf -- better libraries and less low level code to write and test.

Even with all that, I still say testing isn't enough because I know that even 
the stuff that's *easy* to test will have gaps.  There are literally thousands 
of Boost date-time tests (2319 'asserts' to be exact) that run in the 
regression every day, but I don't believe for a minute that the library is 
bug-free or can't be the source of bugs in other code.

As an example of the latter, initially the date class had no default 
constructor and it is built to guarantee that you can't construct an invalid 
date.  It's also an immutable type, so you can't set parts of a date to make 
an invalid one (you can assign, but you have to go thru checks to do that).  I 
wanted these properties so that I could pass dates around in interfaces and 
wouldn't have to 'check' the precondition that a date is valid when I go to 
use it.  All good, except that dates also allowed 'not_a_date_time', 
+infinity, and -infinity as valid values.  So if you call date::year() on 
something that's set to not_a_date_time the results are undefined.
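
The properties described above can be sketched in a few lines. This is a minimal illustration, not Boost date-time itself: SimpleDate, its Special enum, and the choice to throw from year() are all assumptions made for the sketch (in the real library the result of calling year() on a special value is, as noted, undefined).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a date type; NOT the Boost date-time API.
class SimpleDate {
public:
    enum class Special { none, not_a_date_time, pos_infinity, neg_infinity };

    // Regular dates are validated here, so an out-of-range date never exists.
    SimpleDate(int year, int month, int day)
        : year_(year), month_(month), day_(day), special_(Special::none)
    {
        if (month < 1 || month > 12 || day < 1 || day > days_in_month(year, month))
            throw std::out_of_range("invalid date");
    }

    // Special values are also "valid" objects, which is where the trouble starts.
    explicit SimpleDate(Special s) : year_(0), month_(0), day_(0), special_(s)
    {
        if (s == Special::none)
            throw std::invalid_argument("use the year/month/day constructor");
    }

    bool is_special() const { return special_ != Special::none; }

    // Sketch-only choice: throw for special values rather than leave it undefined.
    int year() const
    {
        if (is_special())
            throw std::logic_error("year() called on a special value");
        return year_;
    }

private:
    static bool is_leap(int y) { return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0; }

    static int days_in_month(int y, int m)
    {
        assert(m >= 1 && m <= 12);  // callers check the month range first
        static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
        return (m == 2 && is_leap(y)) ? 29 : days[m - 1];
    }

    int year_, month_, day_;  // no setters: only whole-object assignment can change a date
    Special special_;
};
```

With this sketch, SimpleDate(2006, 2, 30) throws at construction, while SimpleDate(SimpleDate::Special::not_a_date_time) constructs fine and only fails later, when some caller asks for year().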

Now it's trivial to write some 'incorrect' code and a bunch of tests that 
will always work:

void f(const date& d)
{
    int year = d.year(); //oops....fails in some cases
}

should really always be:

   if (d.is_special()) {
      // handle not_a_date_time, +infinity, or -infinity here
   }
   else {
      int year = d.year();  // safe: d is a regular date
      // ....
   }

So going back to the default constructor, I eventually added one that 
constructs to not_a_date_time after many users requested it, mostly for use 
in collections that need one. A very logical choice for default, but my worry 
all along was that people would make the mistake above.  That is, now instead 
of being forced to think about putting some sort of correct date value or 
using not_a_date_time explicitly:

    date d(not_a_date_time);

they can just say

    date d;

Aside from the obvious loss of readability, I worried that with just these few 
lines of code the correctness of a larger program can be undermined by failing 
to check the special states.

So far, I'm not aware of anyone having an issue with this in a large program, 
but I'd be shocked if someone didn't create a bug this way eventually.  It's 
trivial to write and test code that always uses 'valid dates', ship it, and 
everything will work fine.  Then one day someone else will unknowingly make a 
call using a default constructed date and 'boom' a function that's been 
working fine and is fully 'tested' will blow up with unexpected results.
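
The failure mode described above can be sketched as follows. MiniDate, years_since_2000_unchecked, and years_since_2000_checked are hypothetical names invented for illustration, not part of Boost date-time, and the sketch throws from year() only so that the failure is observable.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical minimal date type, NOT Boost's. Default construction yields
// not_a_date_time, as the post describes.
struct MiniDate {
    bool special = true;  // default-constructed => not_a_date_time
    int  y = 0;

    MiniDate() = default;
    explicit MiniDate(int year) : special(false), y(year) {}

    bool is_special() const { return special; }

    int year() const
    {
        // Sketch-only: throw so the problem is visible instead of undefined.
        if (special) throw std::logic_error("year() on not_a_date_time");
        return y;
    }
};

// "Incorrect" version: never checks for special values.
int years_since_2000_unchecked(const MiniDate& d)
{
    return d.year() - 2000;
}

// Guarded version: the special case is handled explicitly via a fallback.
int years_since_2000_checked(const MiniDate& d, int fallback)
{
    if (d.is_special()) return fallback;
    return d.year() - 2000;
}

// The kind of test suite that "always works": it only ever uses regular dates.
void run_unit_tests()
{
    assert(years_since_2000_unchecked(MiniDate(2006)) == 6);
    assert(years_since_2000_unchecked(MiniDate(2000)) == 0);
    assert(years_since_2000_unchecked(MiniDate(1999)) == -1);
}
```

Here run_unit_tests() passes every time, yet years_since_2000_unchecked(MiniDate()) throws. The checked version returns the caller's fallback instead, but nothing in the test suite forces anyone to write it.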

So, is it the right set of design decisions?  I don't know, but there's 
clearly a tension between correctness, 'ease of use', and overall 
applicability.  My take on the EventLogger example is that it's the wrong set 
of choices. There's very little valid use of the object without the stream. 
The stream is a stable, low-level library that all programmers should know 
anyway. The logger as designed is wide open to runtime errors that are not 
localized, and it's a low-level library that I would expect to be used all over 
a program. So I'd want the number of error modes to be as small as possible, 
because I'm certain its users won't be writing code to test all the cases....
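
One way to shrink the number of error modes, assuming the design under discussion lets an EventLogger exist before it has a stream, is to require the stream at construction. The class below is a hypothetical sketch: the actual EventLogger interface from the thread is not shown here, and the log() member is invented for illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical EventLogger; the real interface from the thread is not reproduced.
class EventLogger {
public:
    // Taking a reference means a logger without a stream cannot be constructed,
    // so there is no "stream not set yet" state to check for or to test.
    explicit EventLogger(std::ostream& out) : out_(out) {}

    void log(const std::string& event) { out_ << event << '\n'; }

private:
    std::ostream& out_;  // the caller must keep the stream alive while the logger is used
};

// Usage: an in-memory stream stands in for a file or std::cerr.
std::string demo_log()
{
    std::ostringstream sink;
    EventLogger logger(sink);
    logger.log("started");
    logger.log("stopped");
    return sink.str();
}
```

The trade-off is the same one as with the date default constructor: this version cannot be default-constructed into a collection, but it also cannot be used in a half-initialized state.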

Jeff