
Scott Meyers wrote:
Jeff Garland wrote:
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.
That's been my impression. One of the things I've been trying to figure out wrt the whole testing hoopla is how well it translates to large projects and how it has to be adjusted when things move beyond toy examples. And yes, I probably should go back and reread Lakos.
Well, the testing hoopla 'applies' to the extent that, in my experience, big systems that *don't* have significant testing discipline never see the light of day. That is, they fail under an avalanche of integration and basic execution problems before ever being fielded. As an aside, I always get a good laugh out of all the agonizing by various folks over how this and that testing technique that they've *recently discovered* on a 15 person project applies to large systems. Big systems have been using these approaches for years...or they failed.

Now, that's not to say that the level of rigor advised by many of the test-first proponents really happens on big projects either. Is it economical to spend time writing code to check a 'getter'/'setter' interface that will just obviously work? The answer is no. In fact, the testing you can avoid, just like the coding you can avoid, is really a big part of successful big system development.

From my experience the best practice for testing depends on what the code is used for and what else depends on it. If it's a widely used library (say date-time to pick one :) you want it to be very well unit tested because thousands of LOC will depend on it. Every time you modify it you have to retest a large amount of code. It also turns out to be easy to unit test because it doesn't depend on much.

On the other hand, take the case of a user interface which has no other code that depends on it -- my advice is to skip most of the unit and automated tests. For one thing, it's very hard to write useful test code. For another, a human can see in 1 second what a machine can never see (ugly layout, poor interaction usability, etc). Since testing at the 'top level' of the architecture depends on basically all the other software in the system, it tends to change rapidly. People can quickly adjust to the fact that the widgets moved around on the screen; test programs tend to be fragile to these sorts of changes.
And finally, since no other code depends on this code it isn't worth the time -- you can change it at will. Bottom line is that not all code is created equal w.r.t. the need for or ease of testing.

Of course the landscape isn't static either -- some good things have happened. One thing that's really changed is that the test first/XP/Agile folks have managed to convince developers that they actually need to execute their code before they deliver -- a good thing. This often wasn't common practice 10 years ago. Also, developers have more and more pre-tested code to pull off the shelf -- better libraries and less low level code to write and test.

Even with all that, I still say testing isn't enough because I know that even the stuff that's *easy* to test will have gaps. There are literally thousands of Boost date-time tests (2319 'asserts' to be exact) that run in the regression every day, but I don't believe for a minute that the library is bug-free or can't be the source of bugs in other code.

As an example of the latter, initially the date class had no default constructor and it is built to guarantee that you can't construct an invalid date. It's also an immutable type, so you can't set parts of a date to make an invalid one (you can assign, but you have to go thru checks to do that). I wanted these properties so that I could pass dates around in interfaces and wouldn't have to 'check' the precondition that a date is valid when I go to use it. All good, except that dates also allowed 'not_a_date_time', +infinity, and -infinity as valid values. So if you call date::year() on something that's set to not_a_date_time the results are undefined. Now it's trivial to write some 'incorrect' code and a bunch of tests that will always work:

    void f(const date& d)
    {
      int year = d.year(); // oops....fails in some cases
    }

should really always be:

    if (d.is_special()) {
      // do something here
    }
    else {
      ....
      int year = d.year();
    }

So going back to the default constructor, I eventually added one that constructs to not_a_date_time after many users requested it, mostly for use in collections that need a default constructor. A very logical choice for default, but my worry all along was that people would make the mistake above. That is, now instead of being forced to think about putting in some sort of correct date value or using not_a_date_time explicitly:

    date d(not_a_date_time);

they can just say

    date d;

Aside from the obvious loss of readability, I worried that with just these few lines of code the correctness of a larger program can be undermined by failing to check the special states. So far, I'm not aware of anyone having an issue with this in a large program, but I'd be shocked if someone didn't create a bug this way eventually. It's trivial to write and test code that always uses 'valid dates', ship it, and everything will work fine. Then one day someone else will unknowingly make a call using a default constructed date and 'boom' a function that's been working fine and is fully 'tested' will blow up with unexpected results.

So, is it the right set of design decisions? I don't know, but there's clearly a tension between correctness, 'ease of use', and overall applicability. My take on the EventLogger example is that it's the wrong set of choices. There's very little valid use of the object without the stream. The stream is a low-level stable library that all programmers should know anyway. It's wide open to creating runtime errors that are not localized, and it's a low-level library that I would expect to use all over in a program. So I'd want the number of error modes to be as small as possible, because I'm certain they won't be writing code to test all the cases....

Jeff
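
For anyone who wants to see the default-constructed date trap in compilable form, here is a rough, self-contained sketch. Note the assumptions: `Date`, `year_unchecked`, `year_checked` and `valid_date_tests` are made-up names for illustration and are not the Boost date-time API. Where the real library leaves year() on not_a_date_time undefined, this stand-in throws so the failure can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a date type with a special "not a date" state.
// This is NOT boost::gregorian::date; it only mimics the behavior discussed.
class Date {
public:
    Date() : special_(true), year_(0) {}                    // default: not-a-date
    explicit Date(int year) : special_(false), year_(year) {}

    bool is_special() const { return special_; }

    // Assumption: we throw here so the problem is visible. The post describes
    // the real result as undefined, which is harder to detect.
    int year() const {
        if (special_) throw std::logic_error("year() called on a special date");
        return year_;
    }

private:
    bool special_;
    int year_;
};

// The 'incorrect' version: works for every valid date.
int year_unchecked(const Date& d) {
    return d.year();
}

// The checked version: reports failure instead of touching a special date.
bool year_checked(const Date& d, int& out) {
    if (d.is_special()) return false;
    out = d.year();
    return true;
}

// A test suite that only ever uses valid dates: it passes for both versions
// and so never exposes the missing check in year_unchecked.
void valid_date_tests() {
    int y = 0;
    assert(year_unchecked(Date(2005)) == 2005);
    assert(year_checked(Date(2005), y) && y == 2005);
}
```

With this stand-in, valid_date_tests() passes, yet passing a default-constructed Date to year_unchecked throws, while year_checked simply returns false. The point is only that the tests look complete until someone introduces the special state.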