
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?

I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):

  EventLog::EventLog(std::ostream& logstream);  // an ostream must be
                                                // specified

I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)

Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters; they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument, but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used. In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):

  EventLog log;
  log.WriteEntry("Hello World");  // throws: no log stream was set

This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users. But then on the third hand I get mail like this:

  The .NET libraries have many objects with many constructors that leave
  the constructed object in a not-ready-to-use state. An example:
  System.Data.SqlClient.SqlParameter is a class that describes a bound
  parameter used in a database statement. Bound parameters are essential
  to prevent SQL injection attacks. They should be exceedingly easy to
  use, since the "competition" (string concatenation of parameters into
  the SQL statement) is easy, well understood, and dangerous.

  However, the SqlParameter class has six constructors. Only two of them
  create a SqlParameter object that can be immediately used. The others
  all require that you set additional properties (of course, which
  additional properties is unclear). Failure to prepare the SqlParameter
  object correctly typically generates an unhelpful database error when
  the SQL statement is executed. To add to the confusion, the first ctor
  shown by IntelliSense has 10 parameters (which, if set correctly, will
  instantiate a usable object). The last ctor shown by IntelliSense has
  only 2 parameters and is the most intuitive choice. The four in
  between are all half-baked. It's confusing, and even though I use it
  all the time, I still have to look at code snippets to remember how.

So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).

So, library users, what do you prefer, and why?

Thanks,

Scott

Scott Meyers wrote:
  EventLog::EventLog(std::ostream& logstream);  // an ostream must be
                                                // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
The key here is that mock/stub objects in C++ are somewhat of a pain. In your example, what are you going to test with your event log that won't need to be validated by checking the logstream? In my case I've used regex matching on the logstream; even in my constructor test case I verify that nothing is written to the stream. So I always supply a fake ostream - usually a string-based stream that I can validate. I know there was some degree of push back in the blog-o-sphere over the overuse of this form of inversion of control (see http://www.martinfowler.com/articles/injection.html for some of the alternatives).
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument,
Personally I'd prefer the use of Null objects, mostly because I'm objecting to code like this:

  value myclass::method()
  {
      if (member_variable == is_not_set)
          return NULL_OF_SOME_FORM;
      // do something with member_variable
  }

Slightly better might be:

  value myclass::method2()
  {
      if (member_variable == is_not_set)
          throw something;
      // do something with member_variable
  }

but even better is code that reads:

  value myclass::method3()
  {
      // do something with member_variable
  }

It fits better with the idea of value objects and makes me more likely to write code that has a stronger exception guarantee, but that's me :-)
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So I guess I'd add that a constructor IMO should be used to initialize an object into a well-defined state, including setup of the object's invariant. Two-phase construction runs the risk that you might forget something. My view is that an object with a large number of parameters passed into the constructor smells of a missing object/responsibility in the design; even if it's a data transfer object, by making such a class you're trying to direct the programmer not to forget something. Testability isn't really affected by single-phase construction. In my experience it helps, because all your setup code is in 'one line', not spread about. These are of course guidelines and not hard and fast rules.

Kevin
-- 
| Kevin Wheatley, Cinesite (Europe) Ltd | Nobody thinks this      |
| Senior Technology                     | My employer for certain |
| And Network Systems Architect         | Not even myself         |

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I find it essential that if something is truly a class, and not just a struct with some member functions, the constructor leaves the Foo as a Foo and not as an almost-Foo or sort-of-Foo. Arguments for partial construction of a class are almost always from the mentality of C programming, and ignore the concept of invariants - indicating that the one ignoring them has a weak grasp on the usefulness of user-constructed types.

Instead of giving you the same answers that have been presented so many times in books, lectures, on the net and even in the Boost mailing lists, let me use a bit of rhetoric to rephrase your question: how important is it that the items you use in everyday life arrive at the point of your using them in a fully constructed state? Say... a vehicle, a computer, a building, a dentist's knowledge... and the list goes on and on. If I'm given a Thing, then let it be a real Thing and not a "cardboard imitation".

In my experience, if I'm fixing a bug in some code and notice that an object is constructed in multiple phases, then I immediately start redesigning it so that each invariant is a distinct type. When I'm done I've spent about as much time "refactoring" it and testing it as I would have spent fixing the original bug. However, the original bug is gone and I've removed my need to look at the code again in the future for the "next behavioral anomaly".

So to me, the issue seems like a no-brainer. If I want to test, and seem to have a dependency, then the dependees should be tested first. If it seems that there is a natural circular dependency, then I haven't broken the types down properly.
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
Proper testing either has stub objects which properly emulate a presumption, or it has already-tested objects which properly represent their invariant. Exploratory programming is fine (I use it a lot myself), but if I use bubble gum in my exploratory building then I can't complain when unfortunate things happen.

As for the advice of the designers of a particular API... it sounds to me as if your description of their API indicates that you don't consider their API very highly. It sounds (again, from your description) to be hard to use, easy for a user to create a bunch of badly working objects, and hard to debug. I do hope that you've got an alternative to that kind of suffering. Or perhaps I misread your description?

regards,
Brian

On Tue, 12 Sep 2006 07:06:39 -0400, Brian Allison wrote:
In my experience, if I'm fixing a bug in some code and notice that an object is constructed in multiple phases, then I immediately start redesigning it so that each invariant is a distinct type.
I'm intrigued :-) Could you please provide an example with a minimal complete definition of the invariant types?

-- 
[ Gennaro Prota. C++ developer, Library designer. ]
[ For Hire http://gennaro-prota.50webs.com/      ]

Brian Allison wrote:
As for the advice of the designers of a particular API... it sounds to me as if your description of their API indicates that you don't consider their API very highly. It sounds (again, from your description) to be hard to use, easy for a user to create a bunch of badly working objects, and hard to debug. I do hope that you've got an alternative to that kind of suffering. Or perhaps I misread your description?
I find my intuitions and preferences in tension with advice from others who I believe are worth listening to. This suggests that my intuitions and preferences need refining, or that I've misunderstood or misevaluated the advice I find in tension. I'm suspicious of a design for an EventLog that seems to require a stream to be useful, yet still allows an EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET). I respect the people in that position, and for those who care about Amazon ratings, it's been well received there (http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756/sr=8-1/qid=1158206783/ref=pd_bbs_1/104-6620534-4167149?ie=UTF8&s=books).

I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:

  std::ofstream ofs;
  std::vector<int>::iterator i;
  std::string s;

In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have

  std::ofstream ofs;
  ofs << "Hello World";

Trouble ensues, just as it did in the EventLog example.

Scott

At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty. The other two - those, I'll give you ;-) -- -- Marshall Marshall Clow Idio Software mailto:marshall@idio.com It is by caffeine alone I set my mind in motion. It is by the beans of Java that thoughts acquire speed, the hands acquire shaking, the shaking becomes a warning. It is by caffeine alone I set my mind in motion.

Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them. On the string s, for example, invoking size is fine, but invoking operator[] is not fine at all. Scott

Scott Meyers wrote:
Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]

I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:

  std::ofstream ofs;
  std::vector<int>::iterator i;
  std::string s;

Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them. On the string s, for example, invoking size is fine, but invoking operator[] is not fine at all.
But that's true for string even with valid non-default construction:

  std::string s("");

And it is also true, in possibly invoking erroneous behavior, of all C++ arrays and array-like objects. It's not a "defect" of the default construction of string, but a "defect" of the string interface. A differently designed interface, one that did not have operator[], would not have that "defect".

Note, I'm not suggesting that this would, by inference, be extended to two-phase construction designs. But then again, this is a matter of perspective. If one considers the construction to determine the allowable interface, then your string example is apropos. To me that suggests implementing, when required, two-phase designs with double construction. For example:

  const_string s0;
  string s1(s0 + "something");

But that looks hellishly complicated to pull off design-wise.

-- 
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Scott Meyers writes:
Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them. On the string s, for example, invoking size is fine, but invoking operator[] is not fine at all.
The difference is that there are no special cases in the specification of std::string to cover the default-constructed case. The other two may fall within the invariants, but they're degenerate objects requiring special consideration. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them.
No, a singular iterator is not a valid object and it fulfills no invariants. You can't even copy it without UB. The ofstream can be made to fulfill its invariant if we define its invariant as "true", I guess, or something with a similar utility. A default-constructed std::string is fine. You are not breaking the invariant by invoking op[], you are supplying an invalid index to it. BTW, did you know that the const version of op[] is required to return 0 for s[0]? :-) How exactly it is supposed to return 0 when the return type is char const& is left as an exercise for the implementer.

David Abrahams wrote:
"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants.
That's arguable. From my POV, if it's in a state that a legal program can create, it's within the invariant by definition.
Then (if I read you correctly) even undefined behavior is within the invariant? Or have I been misunderstanding that legal programs can cause UB? I've always considered that UB is to be treated as "not maintaining the invariant but it's not my fault". But then, whether we consider UB within the invariant (and hence that an iterator is by definition always within the invariant), or whether we consider UB outside of the invariant but one which doesn't break correctness (and hence that unassigned iterators are not capable of being within/outside of an invariant)... is there any practical difference between those two points of view?

thanks,
Brian

Brian Allison writes:
David Abrahams wrote:
"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants.
That's arguable. From my POV, if it's in a state that a legal program can create, it's within the invariant by definition.
Then (if I read you correctly) even undefined behavior is within the invariant?
No, invariants are about state and UB is about behavior. Behaviors don't fall inside or outside of states.
Or have I been misunderstanding that legal programs can cause UB?
I don't understand the question, sorry. When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I've always considered that UB is to be treated as "not maintaining the invariant but it's not my fault".
By whom?
But then, whether we consider UB within the invariant (and hence that an iterator is by definition always within the invariant), or whether we consider UB outside of the invariant but one which doesn't break correctness (and hence that unassigned iterators are not capable of being within/outside of an invariant)....
... is there any practical difference between those two points of view?
Sorry, this whole behavior-within/without-the-invariant concept doesn't make any sense to me. It's probably just my too-literal mind at work, but if you could rephrase or explain maybe I could answer. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Brian Allison writes:
David Abrahams wrote:
"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants.
That's arguable. From my POV, if it's in a state that a legal program can create, it's within the invariant by definition.
Then (if I read you correctly) even undefined behavior is within the invariant?
No, invariants are about state and UB is about behavior. Behaviors don't fall inside or outside of states.
Or have I been misunderstanding that legal programs can cause UB?
I don't understand the question, sorry.
When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I misunderstood you - I thought you meant "illegal program" in the sense of any program which runs afoul of the standard. Thanks for the clarification.

Brian Allison writes:
When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I misunderstood you - I thought you meant "illegal program" in the sense of any program which runs afoul of the standard.
I also consider invoking undefined behavior to be "running afoul of the standard." :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Brian Allison writes:
When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I misunderstood you - I thought you meant "illegal program" in the sense of any program which runs afoul of the standard.
I also consider invoking undefined behavior to be "running afoul of the standard." :)
I realized that with your definition of "illegal program". That would seem to imply that you don't, in fact, only consider state as an indication of whether or not a program is illegal. On the one hand, it sometimes appears that something which goes outside of the algebra of invariants (forgive me my imprecision) is your 'illegal program'; then you throw in behaviorisms - which have nothing to do with invariants (if I did read you correctly... <shrug>). Yet you don't allow for the behaviorisms that are specified by the standard in your definition of 'illegal program'. Hm. Perhaps my curiosity->desire to grok Your Standard is no longer on-topic with Library Interface Design, and may even be off-topic for the mailing list. :/

"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants. You can't even copy it without UB. The ofstream can be made to fulfill its invariant if we define its invariant as "true", I guess, or something with a similar utility.
A singular iterator is a valid object. Its use is limited, however: you can only assign a valid value to it. It may be useful to distinguish between an invalid and an uninitialized object.
A default-constructed std::string is fine. You are not breaking the invariant with invoking op[], you are supplying an invalid index to it. BTW, did you know that the const version of op[] is required to return 0 for s[0] :-) How exactly is it supposed to return 0 when the return type is a char const& is left as an exercise for the implementer.
That's not true; see Section 21.3.4 of the 2003 C++ Standard for the definition of operator[] for std::basic_string<> (note that std::string is a typedef for std::basic_string<char>):

[quote]
  const_reference operator[](size_type pos) const;
  reference operator[](size_type pos);

  Returns: If pos < size(), returns data()[pos]. Otherwise, if pos ==
  size(), the const version returns charT(). Otherwise, the behavior is
  undefined.
[/quote]

This behaviour does not seem to be difficult to implement. The default constructor only needs to allocate an array of size 1 and initialize its single element with charT(). The constant version of operator[] can then return a constant reference to that element.

-- 
Matthias Hofmann
Anvil-Soft, CEO
http://www.anvil-soft.com - The Creators of Toilet Tycoon
http://www.anvil-soft.de - Die Macher des Klomanagers

Scott Meyers writes:
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
That doesn't make it a good practice in general, and...
std::ofstream ofs; std::vector<int>::iterator i;
...these two are examples (although at least the documentation about what that means is rigorous and complete)
std::string s;
...but this is not an example of it
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
Trouble ensues, just as it did in the EventLog example.
Yeah, nobody these days claims that the iostreams are an example of stellar, state-of-the-art C++ library design. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
I'm suspicious of a design for a EventLog that seems to require a stream to be useful, yet still allows a EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET).
I'm still not sure how you can test an EventLog without a stream. What do you test? The default constructor and the destructor? Isn't the whole purpose of the EventLog to output something into its stream?

Scott Meyers wrote:
Brian Allison wrote:
As for the advice of the designers of a particular API... it sounds to me as if your description of their API indicates that you don't consider their API very highly. It sounds (again, from your description) to be hard to use, easy for a user to create a bunch of badly working objects, and hard to debug. I do hope that you've got an alternative to that kind of suffering. Or perhaps I misread your description?
I find my intuitions and preferences in tension with advice from others who I believe are worth listening to. This suggests that my intuitions and preferences need refining or that I've misunderstood or misevaluated the advice I find in tension.
Or that the belief you hold about them being worth listening to is not, in fact, properly applied when considering good C++ library design from the user's point of view. Remember: you posted this to Boost-users, putting the question of "how important is..." into the context of the users of a library. If I were using a library and I were so easily able to create broken objects, then the first issue I would address is whether or not the library was well-designed. It wouldn't matter how much I liked the guys who wrote the library.
I'm suspicious of a design for a EventLog that seems to require a stream to be useful, yet still allows a EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET). I respect the people in that position, and for those who care about Amazon ratings, it's been well received there (http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756/sr=8-1/qid=1158206783/ref=pd_bbs_1/104-6620534-4167149?ie=UTF8&s=books).
Proof by popularity? [The #1 book since Gutenberg has always been The Bible.]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Uninitialized objects? The first two are examples of things which don't represent invariants but are interfaces to something else - rarely should an interface layer be used in isolation. [I go a step further and disbelieve that one should ever be *created* in isolation.] The last example is a fine object to create.

But we weren't talking about uninitialized objects, we were talking about partial construction, which is a different matter. While the first two are not usable in a default state, they're not meant to ever be created in a *partially constructed* state. Either they're placeholders that are empty and useless, or they're fully assigned and usable.
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
Trouble ensues, just as it did in the EventLog example.
Each operation which is valid for the ofstream and iterator is... assignment? I note that neither of them has a partial constructor - which is still the core of the original question, yes?

Or perhaps I misread your original poll question? I thought it was about partial construction, not about objects which have an unusable default state. If instead it was "how do you feel about objects which have an unusable default state?", then I change my answer to a shrug. After all, I don't use a char* unless I have assigned it, and if I can't put off its declaration and can't assign it, then I assign 0 to it.

But that's not the same as the antipattern of partial construction. My opinions, as ever, may be ignored at your leisure. They usually are. :) However, they were given in response to your question and not offered out of the blue.

regards,
Brian

I'm suspicious of a design for a EventLog that seems to require a stream to be useful, yet still allows a EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET). I respect the people in that position, and for those who care about Amazon ratings, it's been well received there (http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756/sr=8-1/qid=1158206783/ref=pd_bbs_1/104-6620534-4167149?ie=UTF8&s=books).
Proof by popularity? [The #1 book since Gutenberg has always been The Bible.]
Couldn't agree with you more here...
But we weren't talking about uninitialized objects, we were talking about partial construction, which is a different matter. While the first two are not usable in a default state, they're not meant to ever be created in a *partially constructed* state. Either they're placeholders that are empty and useless, or they're fully assigned and usable.
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
Trouble ensues, just as it did in the EventLog example.
Each operation which is valid for the ofstream and iterator is... assignment? I note that neither of them has a partial constructor - which is still the core of the original question, yes?
Or perhaps I misread your original poll question? I thought it was about partial construction, not about objects which have an unusable default state.
If instead it was "how do you feel about objects which have an unusable default state?", then I change my answer to a shrug. After all, I don't use char* unless I have assigned it, and if I can't put off its declaration and can't assign it then I assign to 0.
But that's not the same as the antipattern of partial construction.
But I don't agree with you here about the difference between "objects with uninitialized state" and "partial construction". I believe they are the same. As long as the constructor doesn't leave the object in a 100% constructed state, it doesn't matter if it's 0% or 50%. It's not 100% either way. So I think the default constructors for std::ofstream and std::container::iterator are a defect in the standard (which is unfortunately probably too late to fix).

If an uninitialized state is desired (which is sometimes the case, no doubt about it) then we can all thank God (and Fernando) for Boost.Optional. The usage of optional<> is good for two reasons:

1. It formalizes the uninitialized state, making its existence obvious both to the reader of the code (who doesn't have to look at the documentation to see there is such a state), and to the compiler, which can check the code for some bugs.

2. It allows having the uninitialized state only where desired - on a per-instance basis, rather than on a per-class basis. This means that the class itself can remain clean of the uninitialized state for any usage which doesn't need it (and would therefore suffer from its unwelcome existence).

You might say something like "but the standard doesn't have optional, so it had to resort to other means". Maybe so, but that doesn't make those default constructors not-a-defect; it only makes the lack of std::optional an additional defect alongside those default constructors.

Just my opinion...

Yuval

"Yuval Ronen"
Or perhaps I misread your original poll question? I thought it was about partial construction, not about objects which have an unusable default state.
If instead it was "how do you feel about objects which have an unusable default state?", then I change my answer to a shrug. After all, I don't use char* unless I have assigned it, and if I can't put off its declaration and can't assign it then I assign to 0.
But that's not the same as the antipattern of partial construction.
But I don't agree with you here about the difference between "objects with uninitialized state" and "partial construction". I believe they are the same. As long as the constructor doesn't leave the object in a 100% constructed state, it doesn't matter if it's 0% or 50%. It's not 100% either way. So I think the default constructors for std::ofstream and std::container::iterator are a defect in the standard (which is unfortunately probably too late to fix).
I agree wholeheartedly.
If an uninitialized state is desired (which is sometimes the case, no doubt about it) then we can all thank God (and Fernando) for Boost.Optional. The usage of optional<> is good for two reasons: ...
You might say something like "but the standard doesn't have optional, so it had to resort to other means". Maybe so, but that doesn't make those default constructors not-a-defect; it only makes the lack of std::optional an additional defect alongside those default constructors.
The Optional (zero-or-one) concept, also modeled by pointers (albeit with some performance implications due to heap allocation and indirection), has been available for some time, has it not? So even non-boosted C++ has not had a 'need' for partial or 2-phase construction. I think std::ofstream, the notions proposed in the book Scott referenced, and in fact most of MFC are all the result of misguided early optimization. Jeff

"Scott Meyers"
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs;
It's funny how this example is the one always chosen by the proponents of 2 phase construction. Search the newsgroups and you'll find this constructor is the root of many problems encountered by users of streams attempting to operate on an insufficiently initialized stream object.
std::vector<int>::iterator i;
I know, I hate this one too. ;)
std::string s;
This is a properly initialized empty string, analogous to an empty vector.
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
What better argument could there be against 2 phase construction? Jeff Flinn

Scott Meyers wrote:
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i;
The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object. That allows me to use the easier operator[] syntax instead of using insert and find. However, I've begun to view that as a defect in the interface of std::map rather than a need to have default constructible objects even when they are not usable. David

David Walthall
The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object.
No standard containers have that requirement, though.
That allows me to use the easier operator[] syntax instead of using insert and find. However, I've begun to view that as a defect in the interface of std::map rather than a need to have default constructible objects even when they are not usable.
Maybe it's a defect in your desire to avoid find/insert. :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

"David Abrahams"
David Walthall
writes: The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object.
No standard containers have that requirement, though.
This seems to be another defect, as the standard also defines the following constructor:

explicit vector(size_type n, const T& value = T(), const Allocator& = Allocator());

Now how is the following code supposed to compile:

#include <vector>

struct X
{
    X( int ) {}
};

void f()
{
    // Error: No default constructor for X.
    std::vector<X> v( 5 );
}

-- Matthias Hofmann Anvil-Soft, CEO http://www.anvil-soft.com - The Creators of Toilet Tycoon http://www.anvil-soft.de - Die Macher des Klomanagers

"Matthias Hofmann"
"David Abrahams"
wrote in message news:87bqphhxjl.fsf@pereiro.luannocracy.com... David Walthall
writes: The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object.
No standard containers have that requirement, though.
This seems to be another defect, as the standard also defines the following constructor:
explicit vector(size_type n, const T& value = T(), const Allocator& = Allocator());
Now how is the following code supposed to compile:
#include <vector>
struct X { X( int ) {} };
void f()
{
    // Error: No default constructor for X.
    std::vector<X> v( 5 );
}
That's not a defect; it's simply _not_ supposed to compile! -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
But then on the third hand I get mail like this:
The .NET libraries have many objects with many constructors that leave the constructed object in a not ready-to-use state. snip...
Regarding .NET and design: It is erroneous to take the .NET libraries as a general indication of C++ or OOP library design. I am sure you are aware that in .NET classes:

1) All data is zero-initialized before construction.
2) Overridden virtual functions can be called on an object BEFORE the object's constructor initialization code is run and AFTER the object's destructor code is run.
3) Default parameters for any member function, including constructors, are not allowed.

but if you are not, I think you can see why this model, which is not the C++ model and was taken, I believe, from Anders Hejlsberg's work with Delphi, influences ideas about construction of objects in .NET.

Furthermore .NET, and other component-oriented APIs, are heavily influenced, for the good I believe, by the ideas of "properties" and "events", both of which have been largely absent from C++ thinking. "Properties" are syntactic sugar for getters and setters of member data, and "events" have been implemented very nicely in Boost by the Signals library.

Without a visual design environment for setting up "property" values and "event" handlers (which .NET does have, BTW, with Visual Studio's designers), often an end user must instantiate an object, set the appropriate "properties" and "event" handlers, and only then can he use the functionality of that object. This leads to the alternative idea you have encountered of allowing default constructors which leave the object in a basically unusable state until the correct "properties" and "event" handlers have been set up.

OTOH, if the "properties" and "events" can be set up using a visual designer, often there is no need for anything but a default constructor, since no data needs to be passed to the constructor to set up an object for use.

Scott Meyers wrote :
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
You can construct safe SQL queries with streams or printf-like syntax easily:

sql << "select first_name, last_name, date_of_birth "
       "from persons where id = " << id

No need to put objects everywhere that complicate everything.

On 9/12/06, loufoque
Scott Meyers wrote :
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
You can construct safe SQL queries with streams or printf-like syntax easily
id = "2 ; delete from persons ;"

sql << "select first_name, last_name, date_of_birth "
       "from persons where id = " << id

Someone just deleted your persons table. Oops.
_______________________________________________
Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Paul Davis wrote :
id = "2 ; delete from persons ;"
sql << "select first_name, last_name, date_of_birth " "from persons where id = " << id
Someone just deleted your persons table. Oops.
In my example sql was supposed to be a special stream type overloaded to escape types correctly. I thought SOCI worked that way, but in fact it seems it is not the case. You could do this, though:

std::string name;
sql << "select phone from phonebook where name = :name", use(name);

I hope I summarized your pros and cons correctly here:

1) 'complete' constructors show the user what to do to use the class.
2) 'incomplete' constructors are easier to use, because they incur less (so-called) overhead.
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
3) Because of 2 and because (many) of the arguments are 'irrelevant', 'incomplete' constructors are easier in testing, because you get your test-object faster.
4) 'incomplete' constructors are easier when exploring classes.
5) .NET Framework designers know what they're doing.

My opinion is that 'complete' constructors are better than 'incomplete'. I read that most of you feel the same. I also am a big fan of having sufficient overloads to instantiate a class from different perspectives (i.e. with different arguments). Usually a constructor with many default arguments works very conveniently for me. A constructor should only set what is required to get the desired behaviour. Therefore, there *is* no overhead: all arguments are NECESSARY; if not, additional constructor overloads can be created. This makes items 2 and 3 IMHO less important.

The use of interfaces may greatly ease the construction of test-stubs. In C++, I of course mean multiple inheritance with abstract 'interface' base-classes.

As for item 1, I believe that it's very important to tell the user of a class what is expected of the user and what the class provides. One can do this in comments (XML comments in .NET are great for that, I think), but if given the choice I prefer 'self-documenting' compilable code, like argument names and types.

I'm not going to comment on item 5 :-)

In short, I think the advantages of 'complete' constructors outweigh the disadvantages, and the disadvantages can be reduced by proper design and interfaces.

my 2 cents,
agb

# usenet@aristeia.com / 2006-09-11 22:08:33 -0700:
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform.
My take on (2) is that an object which provides operations that don't depend on its state does too much. IOW such methods should be static or another object's responsibility. There /are/ valid cases of optional behavior, looks like that's not what you're talking about.
In such cases, offering a default constructor in addition to the above would make the class potentially easier to test.
As would moving the dissociated behavior elsewhere, no?
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
What does "play around with them" mean? I mean, if the constructor argument is essential for the object, the game is reduced to a list of exceptions thrown from the objects' methods. If the argument isn't essential for an instance's behavior, the presence of the unrelated behavior in the class is suspicious.
My gut instinct is not to have much sympathy for this argument,
Rhymes with "Me no like manual me want buttons me be pushing!" :)
but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
I'd like to see the rationale. What's the benefit? Constructors provide a universally understood, statically enforced, syntactically lightweight alternative to having no or parameterless constructors and most/all method bodies preceded with:

if (!initialized_) { throw new exception("This object is broken."); }

That's somewhat like OOP in C, and FMPOV a huge step back. I don't really see how this can benefit the library author, who needs to write all those if (!initialized_) checks, or the users (including testers), who need to ask themselves (and often dissect the code, the developers having lost track of it long ago): "What combinations of constructor arguments enable individual behaviors?" This hints at the core problem, which is the explosion in the number of states such a program contains, and *that* hampers testability. This is especially damaging in dynamic languages (or with unchecked exceptions).
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
I haven't read the "Framework Design Guidelines" document, does it contain any arguments in support of this guideline?
But then on the third hand I get mail like this:
However, the SqlParameter class has six constructors. Only two constructors create a SqlParameter object that can be immediately used. The others all require that you set additional properties (of course, which additional properties is unclear). Failure to prepare the SqlParameter object correctly typically generates an unhelpful database error when the SQL statement is executed.
Exactly my gripe.
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
Defunct constructors are IMO a code stink. They're a way to rape OOP (procedural code written in terms of classes is ugly and stiff). Surely they ease testing once you lump two or more classes into one, but they shouldn't be tolerated because such siamese twin classes are a basic design mistake (tightly coupled classes) driven to the extreme.
So, library users, what do you prefer, and why?
I prefer using code that's easy to understand, and that means code that has as few (nonproductive) states as possible. I dislike defunct constructors as a library user, author, and post-factum unit test author alike, because in all those roles I need to keep track of things the compiler / interpreter would cover for me. The only situation in which I would like classes with defunct constructors is if I wanted to ridicule their author by using unit tests to demonstrate how easy it is to produce a pile of bugs using that approach. -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991

Roman Neuhauser wrote:
I'd like to see the rationale. What's the benefit?
This is excerpted from pp. 19-21, where I'm not bothering to include ellipses to show where I've elided information. So this is a set of their words (mostly full sentences) knitted together to try to show their arguments, but it doesn't show all their text, so for the full story, you need to consult the book.

[Begin pseudoquote]

Many developers expect to learn the basics of a new framework very quickly, by experimenting with the framework on an ad hoc basis. The initial encounter with a badly designed API can leave a lasting impression of complexity and discourage some from using the framework. This is why it is very important for frameworks to provide a very low barrier for developers who just want to experiment. Many developers experiment with an API to discover what it does and then adjust their code to get their program to do what they really want.

There are several requirements that APIs must meet to be easy to experiment with:

- It has to be easy to start using an API, regardless of whether it does what the developer wants it to do. A framework that requires an extensive initialization or instantiating several types and hooking them together is not easy to experiment with.

- It has to be easy to find and fix mistakes resulting from incorrect usage of an API. For example, APIs should throw exceptions clearly describing what needs to be done to fix the problems.

Scott

Scott Meyers
- It has to be easy to find and fix mistakes resulting from incorrect usage of an API. For example, APIs should throw exceptions clearly describing what needs to be done to fix the problems.
It's always better to detect the problems at compile-time, at least in the world of statically-checked languages. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers
- It has to be easy to start using an API, regardless of whether it does what the developer wants it to do. A framework that requires an extensive initialization or instantiating several types and hooking them together is not easy to experiment with.
How hard is it to set up a new stringstream for testing a component? Or, heck, a file stream? -- Dave Abrahams Boost Consulting www.boost-consulting.com

"Scott Meyers"
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
I think the biggest argument for compile-time error detection has always been something like:

EventLog log;
if (some runtime condition)
    log.WriteEntry("Hello World");

Then, this "some runtime condition" may never be tested, and the code will throw in production.

It seems to me that the worlds of OO and C++ split some time ago, and are moving more and more apart. The new OO books now use Java and C# for their examples. OTOH, most of what's going on in the C++ world now is related to language-specific features. I got the impression that people from objectmentor are ready to substitute compile-time error detection with testing. Of course they are familiar with the argument above, so for them it's probably not a big deal.

I think neither approach is clearly superior; both have benefits and drawbacks, and it depends on each particular user and task which one would work better. But in general, I would vote for complete construction with default arguments and/or overloaded constructors for simplicity.

Regards,
Arkadiy

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this at on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. <snip>
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. <snip>
It seems to me that if I am the library developer, then I will care about the ease of testing individual objects in isolation. However, if I am a library user, I really don't care about testing individual objects. (The library will either come with a test suite that I can run, or I will trust it [at least initially] because of the source.) I want to use the library, and I want the public API to make it as hard as possible for me to make silly mistakes.

I have been in the situation where testing required a great amount of infrastructure. It's hard to deal with, but I don't think that providing dumbed-down object constructors is a very good idea because:

1) You can generally factor out the portion of your class that can be tested without the infrastructure overhead, either through inheritance or aggregation. Then testing that part is still *easy*.

2) If you're serious about testing, you're going to need all the infrastructure anyway. If the object requires an ostream to do its job, then you need to figure out how to give it an ostream.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument, but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used. In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
Okay, so I can construct a useless object and get an exception thrown if I use it. Why is this useful to me? If I go ahead and write my code as if the object were properly constructed, then my code won't run. If I add a bunch of fake code to catch the exceptions, then I've wasted a lot of time and effort writing useless code.

In a case like the example, where I know it might be hard to set up the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build. That way I could defer figuring out how to provide the real constructor arguments until I really needed the object to do what it normally does. (Sometimes you really do just want to get on with writing your code, knowing that you need to come back to solve these sorts of problems.) But if I forget to do that and never correctly construct the object, then my release build should throw in the constructor.

So my code would look like this:

EventLog log (true); // FIXME_rush - Use correct constructor!
log.WriteEntry ("Hello World"); // Just asserts the ptr arg

and the relevant constructor would look like this:

EventLog::EventLog (bool dummyarg)
{
    // This constructor sets us up in simulated success mode.
#ifdef DEBUG // or whatever you use
    m_simulateSuccess = true;
#else
    throw something useful
#endif
}

and all the other constructors initialize m_simulateSuccess to false. Lastly, WriteEntry looks like this:

void EventLog::WriteEntry (char const * const pLine)
{
    assert (pLine);
    if (! m_simulateSuccess)
    {
        // The real code is here
    }
}

Now that I've written all that and read it over, the default constructor could be the one that sets up simulated success mode, since it's really invalid for properly constructing the object. I'll leave that as an exercise for the reader. ;-)

Just one guy's opinion.

Best regards,
Rush

Rush Manbert wrote:
In a case like the example, where I know it might be hard to setup the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build.
I'm glad you mentioned this, because my recent knowledge of unit testing comes from books like Beck's "Test-Driven Development" and Feathers' "Working Effectively with Legacy Code" as well as countless breathless articles lauding the wonder of unit testing and showing how it's all oh-so-simple. My feeling is that it would often be convenient to have a special "test" interface, but I've never seen any discussion of doing that. I can imagine a couple of ways of doing this, one being to literally have a different interface for debug builds, another to have "test only" interface elements cordoned off somewhere by convention (similar to Boost's use of namespaces named "detail").
EventLog::EventLog (bool dummyarg)
{
    // This constructor sets us up in simulated success mode.
#ifdef DEBUG // or whatever you use
    m_simulateSuccess = true;
#else
    throw something useful
#endif
}
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,

class EventLog {
public:
#ifdef DEBUG
    EventLog(bool dummyarg);
#endif
    ...
};

Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.

Scott

Most large software systems have test event drivers that allow some functions/interfaces to be called only in debug environments, and in the code we have constructs like

DBG_ASSERT()
regards
divyank
--- Scott Meyers
Rush Manbert wrote:
In a case like the example, where I know it might be hard to set up the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build.
I'm glad you mentioned this, because my recent knowledge of unit testing comes from books like Beck's "Test-Driven Development" and Feathers' "Working Effectively with Legacy Code" as well as countless breathless articles lauding the wonder of unit testing and showing how it's all oh-so-simple. My feeling is that it would often be convenient to have a special "test" interface, but I've never seen any discussion of doing that. I can imagine a couple of ways of doing this, one being to literally have a different interface for debug builds, another to have "test only" interface elements cordoned off somewhere by convention (similar to Boost's use of namespaces named "detail").
EventLog::EventLog (bool dummyarg) { // This constructor sets us up in simulated success mode. #ifdef DEBUG // or whatever you use m_simulateSuccess = true; #else throw something useful #endif }
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,
class EventLog { public: #ifdef DEBUG EventLog(bool dummyarg); #endif
Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.
Scott
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Scott Meyers
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,
class EventLog { public: #ifdef DEBUG EventLog(bool dummyarg); #endif
Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.
I'd much rather develop a library of mock objects for testing. Okay, a conforming mock stream may be a little work to write, but you write it once and you're done.

-- Dave Abrahams
Boost Consulting
www.boost-consulting.com

David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.

My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:

class BigHonkinHairyClass {
    ...                          // expensive to construct
};

and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be

void f(BigHonkinHairyClass& bhhc);   // maybe const, doesn't matter

But now it's hard to drop in a mock. So I assume you'd modify the interface to be either

template<typename T>
void f(T& bhhc);                     // T must more or less model
                                     // BigHonkinHairyClass

or this:

class BigHonkinHairyBase {
    ...                              // interface to program against --
};                                   // uses virtuals

class BigHonkinHairyClass: public BigHonkinHairyBase { ... };

class MockHonkinHairyClass: public BigHonkinHairyBase { ... };

void f(BigHonkinHairyBase& bhhc);

Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?

Thanks,

Scott

On Thursday, September 14, 2006 at 17:21:17 (-0700) Scott Meyers writes:
... My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
That all depends: I think you reach a point where intrusive testing is rather costly (construction costs, etc.) and external testing (Tcl-based expect scripts, perhaps) is a better approach. This is essentially the approach I have taken over the years. At some point the unit tests become hopelessly complex, mock objects begin to weigh development down (if you change the interface, you have to change your mock objects), and a transition is made to higher-level integrated testing. However, I think your points above are essentially correct.

Bill

Bill Lear wrote:
This is essentially the approach I have taken over the years. At some point, the unit tests become hopelessly complex, Mock objects begin to weigh development down (if you change the interface, you have to change your mock objects) and a transition is made to higher-level integrated testing.
I assume you always need integrated testing in addition to unit tests, but it sounds like you're saying that at some point, maintaining the unit test framework stops paying for itself, so you abandon it. Is that correct? Scott

On Thursday, September 14, 2006 at 20:52:09 (-0700) Scott Meyers writes:
Bill Lear wrote:
This is essentially the approach I have taken over the years. At some point, the unit tests become hopelessly complex, Mock objects begin to weigh development down (if you change the interface, you have to change your mock objects) and a transition is made to higher-level integrated testing.
I assume you always need integrated testing in addition to unit tests, but it sounds like you're saying that at some point, maintaining the unit test framework stops paying for itself, so you abandon it. Is that correct?
No, just that at some point of complexity, we decide to switch some of the tests "upward". So, let's say you have a project divided into a hierarchy:

A
B
C

where A is the lowest level of complexity (basic library code), B the middle layer, and C the upper layer. We might have 95% unit test coverage at A, 85% at B, and 75% at C, with the remainder "covered" by non-unit, externally-driven tests.

Bill

Scott Meyers wrote:
David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.
My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
That's about what I've found best when building well-tested code, and fortunately it usually also leads to cleaner designs with little coupling that compile nicely and quickly, even given current slow C++ compilers.

-- James

On 9/14/06, Scott Meyers
My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
I design interfaces like that, but not for testing - it's called the Inversion Principle. How often do you really need a BigHonkinHairyClass? Often, you really just need a subset of its interface. So define that sub-interface. Sometimes use templates, sometimes inheritance, etc. Maybe that doesn't apply for every function f() (i.e., a different interface for each function!?), but I thought the Inversion Principle would be worth mentioning; YMMV.

Tony

Scott Meyers wrote:
David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.
My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
This situation should be rare in well-designed code. Either f requires a BigHonkinHairyClass in order to work - by this I mean the precise semantics of BHHC - and it has to be tested using a BHHC; you can't substitute a mock that emulates a BHHC perfectly, because it will be a BHHC itself. Or f doesn't really require a BHHC, and it should be rewritten in one of the two ways above. This allows client A to pass a BHHC and client B to pass something else. From the library design PoV, it doesn't matter whether client B is a test suite or just another module.

It is true that in many projects the lower layers aren't designed as a proper library, since they don't have to serve arbitrary client code. In such a case, having a test suite as a second client can indeed lead to problems like the above. :-)

Another angle is that tests should test the behavior that is exercised by the application. If the application uses f with a BHHC, the tests should test f with a BHHC. Testing f with a mock can find some errors, but may easily miss others. This can be substituted by defining a rigorous interface for BHHCs and testing both f and BHHC against that interface, of course, which gets us back to one of the two refactorings given above. (But the f + actual BHHC test should still be part of the suite, IMO.)

"Peter Dimov"
Scott Meyers wrote:
David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.
<snip>
This situation should be rare in well-designed code. Either f requires a <snip again>
What he said :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
Rush Manbert wrote:
In a case like the example, where I know it might be hard to set up the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build.
I'm glad you mentioned this, because my recent knowledge of unit testing comes from books like Beck's "Test-Driven Development" and Feathers' "Working Effectively with Legacy Code" as well as countless breathless articles lauding the wonder of unit testing and showing how it's all oh-so-simple.
LOL - "oh-so-simple" breaks down pretty quickly in any large system, doesn't it? My feeling is that it would often be convenient to have a
special "test" interface, but I've never seen any discussion of doing that. I can imagine a couple of ways of doing this, one being to literally have a different interface for debug builds, another to have "test only" interface elements cordoned off somewhere by convention (similar to Boost's use of namespaces named "detail").
I have done this in the past. Here's a real-world example: The application was for a storage virtualization controller. There were two separate processors running different software. One handled the virtualization and client services, while the other actually knew how to do I/O to the storage devices. My subsystem was a service that did copies of various flavors (entire logical units, etc.). Since it couldn't actually do any I/O, it sent messages to the other processor that created, started, stopped, etc. a copy utility task. In the real case, the utility would send events to my service to let it know how many blocks had been copied, if an error occurred, etc. There could be many instances of copy processes/utilities running concurrently and independently.

I had to test the top-level service without any support from the copy utility task on the other processor. (You may wonder why. Let's just say that there were two separate groups that developed software for the different processors, and we had somewhat different ideas regarding testability. My group also had a version of our code that ran on Windows, and I needed to be able to test it there.) This meant that I needed to simulate the event stream from the utility task. I also had to be able to force errors (again by simulating events).

Needless to say, this required a fairly extensive testing API, plus a notion within the objects that implemented the service that it could be running in "test mode". I think it took as much or more work to develop the test interface and the test drivers as it took to develop the service itself, but it was completely worth it. Especially when there was a problem and we needed to sort out whether it was "our" code or "their" code. ;-)

Also, since I had this capability, the Windows version of the service could be driven by our management UI, and the copy processes would appear to make progress, stop, start, and complete. The management UI couldn't tell that they were "fake", so that group could use it to test their software. It was really cool, but the reason it was even possible was that a testing subsystem was built into our code, we were required to have complete tests for our subsystems, and I had to consider how to test my subsystem from day one.

In fact, our system shipped with the test subsystem included. It was not readily accessible, of course, but was really useful in some cases where we needed to test something on an installed system. This sort of capability in the field can be a real saving grace in an embedded system.
EventLog::EventLog(bool dummyarg)
{
    // This constructor sets us up in simulated success mode.
#ifdef DEBUG   // or whatever you use
    m_simulateSuccess = true;
#else
    throw std::logic_error("test-only constructor");   // something useful
#endif
}
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,
class EventLog {
public:
#ifdef DEBUG
    EventLog(bool dummyarg);
#endif
    ...
};
Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.
Oh, sure. Get the error at compile time if you're going to go this way. The conditional compilation stuff is a definite problem to be avoided if possible. It's hard to avoid if you want to implement a really comprehensive test interface, but does seem less desirable in order to support "exploratory" programming (which I think was the original context here). - Rush

Rush Manbert wrote:
Needless to say, this required a fairly extensive testing API, plus a notion within the objects that implemented the service that it could be running in "test mode".
Was the ability to put an object into test mode part of the client-visible API, or did you somehow have a testing API that clients could not access?
In fact, our system shipped with the test subsystem included. It was not readily accessible, of course, but was really useful in some cases where we needed to test something on an installed system. This sort of capability in the field can be a real saving grace in an embedded system.
This makes it sound like the testing API was in fact visible to clients, but, by convention, it was never used for non-test apps. Is this correct? If so, that's different from, for example, test-only APIs that exist only for debug builds, i.e., a truly separate API that non-test clients can't get at. (I'm not arguing that such an approach is better than what you did, I'm simply trying to understand what you did.) To clarify my interest, I recently sent this to somebody who sent me private mail on this thread:
My interest here is not in testing, it's in good design, and good designs facilitate testing. Which means we need to be able to describe how testability affects other design desiderata, such as compile-time error detection, encapsulation, and overly general interfaces. There is a ton of recent literature on testing and testability, but virtually none of it addresses how making something more testable may be in tension with other characteristics we'd like. This thread in Boost is part of my attempt to figure out how the various pieces of the puzzle fit together -- or if they do at all.
Scott

Scott Meyers wrote:
Rush Manbert wrote:
Needless to say, this required a fairly extensive testing API, plus a notion within the objects that implemented the service that it could be running in "test mode".
Was the ability to put an object into test mode part of the client-visible API, or did you somehow have a testing API that clients could not access?
In this case, the only way to interact with the service was through a message interface. In essence, the service was passed a message object. It was also an embedded system, so we were not only the service provider, but we also wrote the client. (I realize that we have strayed from the original question of library design. I'm happy to take this off list if anyone is offended.)

The messages that could put the object in test mode were part of the client-visible API. Additionally, the service always operated in test mode when it was built as part of the Windows executable, because then there was no choice but to simulate the events that were normally generated by the lower-level code. This is why the management UI (our client) always thought that a copy process was real. It couldn't tell the difference between the real thing and the test mode behavior.
In fact, our system shipped with the test subsystem included. It was not readily accessible, of course, but was really useful in some cases where we needed to test something on an installed system. This sort of capability in the field can be a real saving grace in an embedded system.
This makes it sound like the testing API was in fact visible to clients, but, by convention, it was never used for non-test apps. Is this correct? If so, that's different from, for example, test-only APIs that exist only for debug builds, i.e., a truly separate API that non-test clients can't get at. (I'm not arguing that such an approach is better than what you did, I'm simply trying to understand what you did.)
To clarify my interest, I recently sent this to somebody who sent me private mail on this thread:
My interest here is not in testing, it's in good design, and good designs facilitate testing. Which means we need to be able to describe how testability affects other design desiderata, such as compile-time error detection, encapsulation, and overly general interfaces. There is a ton of recent literature on testing and testability, but virtually none of it addresses how making something more testable may be in tension with other characteristics we'd like. This thread in Boost is part of my attempt to figure out how the various pieces of the puzzle fit together -- or if they do at all.
I'm not sure how applicable this is to the design of a library such as Boost, but my experience has been that having test interfaces available at the subsystem level in the release code version is a very useful thing, even if they are visible to clients. You need to be careful with this, and you really need to protect clients from getting into test mode accidentally, but when you are debugging a large, complex system that may have very limited debugging capabilities (I spent 25 years developing embedded systems, so that's where this viewpoint comes from) it can be very, very useful to be able to isolate your subsystems. You can usually only do that if they have been designed so that they can be tested in isolation.

I also believe that you often need these sorts of interfaces so that you can force error conditions, especially in a heavily layered system. This lets me test that my subsystem handles errors correctly, but it also allows me to test the error propagation paths out of my subsystem and into the layers above it.

I made a little Xcode project that illustrates one way you could approach making objects that have a test mode. I have attached the code and header files to this email. In this case, MyObject has two constructors, each of which takes an initialization object as an argument. One of them takes a "normal" initializer object, while the other takes a "test" initializer. The object is in test mode if you construct it with the second form. There is also a public method that can force a test mode behavior, but only for a test mode object.

So my test API is visible to clients. However, if I don't distribute MyObjectInitializerForTest.h, then clients cannot construct a test initializer object, and therefore can't put the object in test mode.
Of course, they can see how MyObjectInitializer was declared, so they could figure out how to declare and define a MyObjectInitializerForTest object, but it seems to me that the barrier is sufficiently high that there won't be much of that going on. I took this approach in order to mimic the original case that I described. In that case, the initializer objects were the "normal" or "test" messages. I know that there are slicker ways to do this sort of thing, but this illustrates the basic idea.

- Rush

/*
 * MyObjectInitializerForTest.h
 */
#ifndef MyObjectInitializerForTest_H
#define MyObjectInitializerForTest_H

#include "MyObjectInitializerBase.h"

class MyObjectInitializerForTest : public MyObjectInitializerBase {
public:
    MyObjectInitializerForTest(int a, bool b) : MyObjectInitializerBase(a, b) {}
    ~MyObjectInitializerForTest() {}
};

#endif // MyObjectInitializerForTest_H

/*
 * MyObject.cpp
 */
#include <iostream>

#include "MyObject.h"
#include "MyObjectInitializerForTest.h"

MyObject::MyObject(MyObjectInitializer const initializer)
  : m_testMode(false)
  , m_a(initializer.m_a)
  , m_b(initializer.m_b)
{
    testingMemberDataInit();
}

MyObject::MyObject(MyObjectInitializerForTest const initializer)
  : m_testMode(true)
  , m_a(initializer.m_a)
  , m_b(initializer.m_b)
{
    testingMemberDataInit();
}

MyObject::~MyObject()
{
}

bool MyObject::methodWithTestModeBehavior()
{
    if (m_testMode) {
        std::cout << "MyObject::methodWithTestModeBehavior: test mode is enabled!";
        if (m_doForceResultOfCallToMethodWithTestModeBehavior) {
            // The return value for this call is forced this time only
            std::cout << " Forced return value: "
                      << (m_forcedResultOfCallToMethodWithTestModeBehavior ? "true" : "false")
                      << "\n";
            m_doForceResultOfCallToMethodWithTestModeBehavior = false;
            return m_forcedResultOfCallToMethodWithTestModeBehavior;
        }
        std::cout << "\n";
    } else {
        std::cout << "MyObject::methodWithTestModeBehavior: test mode is DISABLED.\n";
    }
    // Normal return here - whatever m_b contains
    return m_b;
}

void MyObject::forTestingSetResultOfNextCallToMethodWithTestBehavior(bool result)
{
    if (m_testMode) {
        m_doForceResultOfCallToMethodWithTestModeBehavior = true;
        m_forcedResultOfCallToMethodWithTestModeBehavior = result;
    } else {
        std::cout << "MyObject::forTestingSetResultOfNextCallToMethodWithTestBehavior: "
                     "Ignored in normal mode!\n";
    }
}

void MyObject::testingMemberDataInit()
{
    m_doForceResultOfCallToMethodWithTestModeBehavior = false;
    m_forcedResultOfCallToMethodWithTestModeBehavior = false;
}

#include <iostream>

#include "MyObject.h"
#include "MyObjectInitializerForTest.h"

int main(int argc, char* const argv[])
{
    std::cout << "Hello, World!\n\n";

    MyObject normalModeObj(MyObjectInitializer(1, true));
    MyObject testModeObject(MyObjectInitializerForTest(2, false));

    bool result;
    result = normalModeObj.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    result = testModeObject.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    // Try to force the next return value on normalModeObj (fails)
    normalModeObj.forTestingSetResultOfNextCallToMethodWithTestBehavior(false);
    result = normalModeObj.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    // Force the return value on testModeObject for the next call to
    // methodWithTestModeBehavior()
    testModeObject.forTestingSetResultOfNextCallToMethodWithTestBehavior(true);
    result = testModeObject.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    result = testModeObject.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    return 0;
}

/*
 * MyObjectInitializerBase.h
 */
#ifndef MyObjectInitializerBase_H
#define MyObjectInitializerBase_H

class MyObjectInitializerBase {
public:
    MyObjectInitializerBase(int a, bool b) : m_a(a), m_b(b) {}
    ~MyObjectInitializerBase() {}

    int  m_a;
    bool m_b;
};

#endif // MyObjectInitializerBase_H

/*
 * MyObject.h
 */
#ifndef MyObject_H
#define MyObject_H

#include "MyObjectInitializerBase.h"

class MyObjectInitializer : public MyObjectInitializerBase {
public:
    MyObjectInitializer(int a, bool b) : MyObjectInitializerBase(a, b) {}
    ~MyObjectInitializer() {}
};

class MyObjectInitializerForTest;

class MyObject {
public:
    // Normal constructor
    MyObject(MyObjectInitializer const initializer);
    // Test mode constructor
    MyObject(MyObjectInitializerForTest const initializer);
    ~MyObject();

    bool methodWithTestModeBehavior();

    // Testing API
    void forTestingSetResultOfNextCallToMethodWithTestBehavior(bool result);

private:
    void testingMemberDataInit();

    bool m_testMode;
    int  m_a;
    bool m_b;

    // Member data used to control testing
    bool m_doForceResultOfCallToMethodWithTestModeBehavior;
    bool m_forcedResultOfCallToMethodWithTestModeBehavior;
};

#endif // MyObject_H

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this at on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
Very. Back in the mists of programming history, which you certainly remember, on large-scale projects 'uninitialized variables' were a huge source of errors in programs. This is much less true today -- even though it is certainly still possible with the languages we are using. Of course, compilers warn you, so there's no excuse now. Having objects that construct completely is analogous, but maybe even one step up the food chain.
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes.
Exactly. And I believe if you give them this option so it is 'easier to learn' then you will be 'training them' to make mistakes.

The real issue, for me at least, is in being able to eliminate incorrect behavior of a 'partially constructed object' in a 10 million line of code program that I can't possibly understand fully. Let's do a small thought experiment. Suppose a program that has been running in production for years starts malfunctioning. I have a stack trace that shows me where something goes wrong, but it only happens infrequently -- that is, all tests pass and geez, it's been working for years in the field without problems. So I have to go into detective mode to figure out what's happening.

To get from 10 million LOC to, say, 100K LOC I simply look at the stack trace and see where the failure is happening. Then I can start looking at the objects involved and see what possible failure modes are consistent with the program behavior. Now, if any class in the trace supports a default constructor that can lead to exceptions on later access, I have to consider the possibility that the object isn't constructed correctly. This may lead me chasing across thousands of lines of code -- even to different programs if, say, one program generates data and another uses it. If I know that this is impossible because of the class design, it eliminates many failure modes and hence lines of code from consideration.

At the end of the story I believe this is much more important, because one latent bug like this can cost many thousands of dollars to track down.
So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.

In the EventLog case, I would be totally unconcerned about requiring the ostream -- it's an incredibly stable and well tested library. A 'level 0' component in Lakos's lexicon. The issue for testing is more serious when the dependency is on JoesCustomAndEverEvolving class. Here it's a 'problem' since not only do I need to use it, but if it's changing frequently it might break my tests. But depending on the component I'm building it might be unreasonable to build a stand-in -- in fact most of the time I think stubs are a waste. Anyway, I can't think of a case where losing the correctness benefit of complete construction will truly help simplify the overall testing effort.

One last point. Don't forget that you may have made EventLog testing harder, since you now have to add all the tests for the 'incomplete construction' cases to your test suite. And you still have to write tests against the 'full up' scenarios. At least if you are going to perform good coverage...
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument,
I have zero sympathy. If you want to build stable and large software systems then you need to get serious about correctness.
but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used. In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
No opinion on the framework, but when I'm 'exploring' I most often want to see how I would use a library to write real production code, because more than likely that's what I'm doing. My 'exploration' is more than likely a few hundred line program to sort out how the interfaces work. If I encountered the code example above, I'd abandon the library as unusable (assuming I had an option). If I couldn't abandon it, I'd write a wrapper with a default stream to initialize, to ensure I didn't make that mistake. And by the way, using defaults or writing initialization methods or classes that provide common defaults is a nice way of making exploring easier. The library can supply these in the example code.
But then on the third hand I get mail like this:
The .NET libraries have many objects with many constructors that leave the constructed object in a not ready-to-use state.
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
However, the SqlParameter class has six constructors. Only two constructors create a SqlParameter object that can be immediately used. The others all require that you set additional properties (of course, which additional properties is unclear). Failure to prepare the SqlParameter object correctly typically generates an unhelpful database error when the SQL statement is executed. To add to the confusion, the first ctor shown by IntelliSense has 10 parameters (which, if set correctly, will instantiate a usable object). The last ctor shown by IntelliSense has only 2 parameters and is the most intuitive choice. The four in between are all half-baked. It's confusing, and even though I use it all the time, I still have to look at code snippets to remember how.
This seems like a failure of design focus to me. If the 'big constructor' can actually detect a failure at the point of construction it should throw an exception then. Having said all this, the SqlParameter class might be an example of a 'GoF Builder' (I didn't look it up so I'm not sure) where the main purpose is to gradually build a more complex object. In which case, I would tend to eliminate all the constructors, making the initial state always 'null' or empty. Then the user would have to call a series of methods to build up the SQL command, and there might be an explicit call to 'validate' once that process is complete. This would be a case, as opposed to EventLogger, where 'full initialization' on construction might just confuse the purpose of the class.
So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW)
I don't agree that they are more loosely coupled, as ultimately you will still need to supply either a stub or the actual class to do something of use -- certainly in the EventLogger case you won't be able to write many tests without setting the I/O stream.
and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
Your original wisdom was correct -- in most cases I want construction to guarantee a complete object. Jeff

Jeff Garland wrote:
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.
That's been my impression. One of the things I've been trying to figure out wrt the whole testing hoopla is how well it translates to large projects and how it has to be adjusted when things move beyond toy examples. And yes, I probably should go back and reread Lakos. Scott

Scott Meyers wrote:
Jeff Garland wrote:
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.
That's been my impression. One of the things I've been trying to figure out wrt the whole testing hoopla is how well it translates to large projects and how it has to be adjusted when things move beyond toy examples. And yes, I probably should go back and reread Lakos.
Well, the testing hoopla 'applies' to the extent that in my experience big systems that *don't* have significant testing discipline never see the light of day. That is, they fail under an avalanche of integration and basic execution problems before ever being fielded. As an aside, I always get a good laugh out of all the agonizing by various folks over how this and that testing technique that they've *recently discovered* on a 15 person project applies to large systems. Big systems have been using these approaches for years...or they failed.

Now, that's not to say that the level of rigor advised by many of the test-first proponents really happens on big projects either. Is it economical to spend time writing code to check a 'getter'/'setter' interface that will just obviously work? The answer is no. In fact, the testing you can avoid, just like the coding you can avoid, is really a big part of successful big system development.

From my experience the best practice of testing depends on what the code is used for and what else depends on it. If it's a widely used library (say date-time to pick one :) you want it to be very well unit tested, because thousands of LOC will depend on it. Every time you modify it you have to retest a large amount of code. It also turns out to be easy to unit test because it doesn't depend on much. On the other hand, take the case of a user interface which has no other code that depends on it -- my advice is to skip most of the unit and automated tests. For one thing, it's very hard to write useful test code. For another, a human can see in 1 second what a machine can never see (ugly layout, poor interaction usability, etc). Since testing at the 'top level' of the architecture depends on basically all the other software in the system, it tends to change rapidly -- people can quickly adjust to the fact that the widgets moved around on the screen, but test programs tend to be fragile to these sorts of changes.
And finally, since no other code depends on this code it isn't worth the time -- you can change it at will. Bottom line is that not all code is created equal with respect to the need for or ease of testing.

Of course the landscape isn't static either -- some good things have happened. One thing that's really changed is that the test first/XP/Agile folks have managed to convince developers that they actually need to execute their code before they deliver -- a good thing. This often wasn't common practice 10 years ago. Also, developers have more and more pre-tested code to pull off the shelf -- better libraries and less low level code to write and test.

Even with all that, I still say testing isn't enough because I know that even the stuff that's *easy* to test will have gaps. There are literally thousands of Boost date-time tests (2319 'asserts' to be exact) that run in the regression every day, but I don't believe for a minute that the library is bug-free or can't be the source of bugs in other code. As an example of the latter, initially the date class had no default constructor and it is built to guarantee that you can't construct an invalid date. It's also an immutable type, so you can't set parts of a date to make an invalid one (you can assign, but you have to go thru checks to do that). I wanted these properties so that I could pass dates around in interfaces and wouldn't have to 'check' the precondition that a date is valid when I go to use it. All good, except that dates also allow 'not_a_date_time', +infinity, and -infinity as valid values. So if you call date::year() on something that's set to not_a_date_time the results are undefined. Now it's trivial to write some 'incorrect' code and a bunch of tests that will always work:

void f(const date& d) {
  int year = d.year(); // oops....fails in some cases
}

should really always be:

void f(const date& d) {
  if (d.is_special()) {
    // do something here
  } else {
    int year = d.year();
  }
}

So going back to the default constructor, I eventually added one that constructs to not_a_date_time after many users requested it. Mostly for use in collections that need this. A very logical choice for default, but my worry all along was that people would make the mistake above. That is, now instead of being forced to think about putting in some sort of correct date value or using not_a_date_time explicitly:

date d(not_a_date_time);

they can just say

date d;

Aside from the obvious loss of readability, I worried that with just these few lines of code the correctness of a larger program can be undermined by failing to check the special states. So far, I'm not aware of anyone having an issue with this in a large program, but I'd be shocked if someone didn't create a bug this way eventually. It's trivial to write and test code that always uses 'valid dates', ship it, and everything will work fine. Then one day someone else will unknowingly make a call using a default constructed date and 'boom' -- a function that's been working fine and is fully 'tested' will blow up with unexpected results.

So, is it the right set of design decisions? I don't know, but there's clearly a tension between correctness, 'ease of use', and overall applicability. My take on the EventLogger example is that it's the wrong set of choices. There's very little valid use of the object without the stream. The stream is a low-level stable library that all programmers should know anyway. It's wide open to creating runtime errors that are not localized, and it's a low level library that I would expect to use all over in a program. So I'd want the number of error modes to be as small as possible, because I'm certain they won't be writing code to test all the cases.... Jeff

Jeff Garland
Well, the testing hoopla 'applies' to the extent that in my experience big systems that *don't* have significant testing discipline never see the light of day. That is, they fail under an avalanche of integration and basic execution problems before ever being fielded. As an aside, I always get a
<snip great post> Great post, Jeff! I learned a lot from it; thanks. -- Dave Abrahams Boost Consulting www.boost-consulting.com

# jeff@crystalclearsoftware.com / 2006-09-14 10:57:36 -0700:
Is it economical to spend time writing code to check a 'getter'/'setter' interface that will just obviously work? The answer is no. In fact, the testing you can avoid, just like the coding you can avoid, is really a big part of successful big system development.
I often write such tests because I know I tend to make this kind of mistake (unfinished editing after copy/paste):

struct astruct {
  int x() { return x_; }
  int y() { return y_; }
  int z() { return x_; }
private:
  int x_, y_, z_;
};

Hoping that I'll spot these while checking the code after I add it proved to be hopeless, and these microscope tests pinpoint the problem. Otherwise I need to track the bug from the results of higher level tests, which usually takes (me) more time.

I am just curious: why has nobody considered policy-based design? It is possible to write a class template which accepts an initialization policy, and possibly a default constructor. In this case a call to a static member of the policy class is responsible for providing a valid ostream instance where the output is done. In my opinion this approach would work similarly to the Allocator parameter passed to all STL container types. It is also possible to pass a default ostream policy provider, just as the default allocator is passed. This approach forces users to supply a valid initialization interface, without additional constructor arguments. It can also be hidden by using some typedefs in library header files; the best example for this is std::string. I do not think that the approaches of .NET and Java are appropriate here: C++ relies on static typing, where .NET and Java postpone everything to runtime. With Kind Regards, Ovanes Markarian

Scott Meyers wrote:
So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
One-phase construction definitely! Testing in isolation is a different matter and should not degrade the interfaces. There are ways to allow isolated testing without degrading the interface. It all boils down to decoupling and isolating dependencies. My favorites:

1) Use a template where policies can be replaced by hooks to the testing engine that tests the expected results. In your example, I'd imagine this interface: template <class Printer> EventLog.

2) Use callbacks. In the example you provided, I'd imagine EventLog calls logstream to print. So, I'd use a constructor like:

EventLog::EventLog(boost::function

Joel de Guzman
One-phase construction definitely!
I'd like to ask you a question about your parser framework that is related to this. I probably get most of the following wrong, so please be patient. More often than not, a parser constructs objects in a hierarchical manner that reflects the grammar:

print(a + b);

might be expressed as

statement(plus_expression(symbol("a"), symbol("b")))

in some language. The latter objects are better designed without default constructors, in order to guarantee that we never end up with, say, a meaningless symbol() or plus_expression(). When I write a parser by hand, that's straightforward:

boost::optional< plus_expression > parse_plus_expression(input_tokens tokens) { ... }

The function will return boost::none if the expression can't be parsed, and no default plus_expressions will be necessary. With spirit, there appear to be two approaches: 1) semantic actions, which store away the parsed expression in some way, and 2) closures, which implicitly "return" their first member (member1). I don't like 1 for most purposes, because it's too imperative for my taste. I need to keep track of what my actions did and, if something fails to parse at a point and backtracking occurs, need to manually revert the changes. This seems error prone to me. 2 looks better, but, and this is the connection to the OP, it appears that the returned value, as all closure members, must be default constructible. In particular, I risk returning such default constructed values when I failed to assign to them. It's very likely that I missed something, as I haven't seriously tried using spirit. Can you shed some light on this? Jens

Jens Theisen wrote:
Joel de Guzman
writes: One-phase construction definitely!
I'd like to ask you a question about your parser framework that is related to this. I probably get most of the following wrong, so please be patient.
Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general Anyway...
With spirit, there appear to be two approaches:
1) semantic actions, which store away the parsed expression in some way
and
2) closures, which implicitly "return" their first member (member1).
I don't like 1 for most purposes, because it's too imperative for my taste. I need to keep track of what my actions did and, if something fails to parse at a point and backtracking occurs, need to manually revert the changes. This seems error prone to me.
2 looks better, but, and this is the connection to the OP, it appears that the returned value, as all closure members, must be default constructible. In particular, I risk returning such default constructed values when I failed to assign to them.
It's very likely that I missed something, as I didn't seriously tried using spirit. Can you shed some light on this?
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq "Sometimes, we need to initialize our closure variables upon entering a non-terminal (rule, subrule or grammar). Closure enabled non-terminals, by default, default-construct variables upon entering the parse member function. If this is not desirable, we can pass constructor arguments to the non-terminal."... Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman
Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general
It also fits very well into this thread.
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq
And, to stick with my example, what would be a good initialisation for a plus_expression? You don't want to replace default construction with a copy from what's conceptually a default. The topic question of this thread is about whether or not you should construct objects prior to when you have enough information to meaningfully do so, and for returning parsed values in spirit, this is neither at closure construction time nor at rule/grammar invocation time. I have the impression that spirit either forces you to do exactly that or else use actions, which is even more messy. Am I correct in that observation? Best regards, Jens

Jens Theisen
Joel de Guzman
writes: Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general
It also fits very well into this thread.
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq
And, to stick with my example, what would be a good initialisation for a plus_expression?
You don't want to replace default construction with a copy from what's conceptually a default. The topic question of this thread is about whether or not you should construct objects prior to when you have enough information to meaningfully do so, and for returning parsed values in spirit, this is neither at closure construction time nor at rule/grammar invocation time.
I have the impression that spirit either forces you to do exactly that or else use actions, which is even more messy.
Am I correct in that observation?
I may totally misinterpreting the question here, but in case I'm not... Spirit has to use "funny assignment semantics" for rules so that grammars can be recursive. There will always be some symbols whose identity needs to be established so they can be used on the RHS of other rules before they can really be initialized as rules. I think that's a reasonable trade-off to make in order to get syntax that's close to traditional EBNF. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams
I may be totally misinterpreting the question here, but in case I'm not...
And I might be misinterpreting the answer, but we probably mean different things. You're talking about the rules themselves, are you? What I want not to be default-constructed is the value that is constructed by the rule/grammar. The following is taken from the documentation:

factor = ureal_p[factor.val = arg1]
       | '(' >> expression[factor.val = arg1] >> ')'
       | ('-' >> factor[factor.val = -arg1])
       | ('+' >> factor[factor.val = arg1])
       ;

I'm not complaining that `factor' must have been default constructed. I'm worrying about the return value, which is represented by the `val' closure member. The point where we know what to return is where the assignments are, but by design we must have some value earlier than that. This piece of code is analogous to:

optional< value_t > factor(tokens_t tokens) {
  value_t ret;
  if(optional< value_t > temp = ureal_p(tokens))
    ret = temp;
  else if(optional< value_t > temp = expression(tokens))
    ret = temp;
  else
    ...
  return ret;
}

where one would like to have

optional< value_t > factor(tokens_t tokens) {
  if(optional< value_t > temp = ureal_p(tokens))
    return temp;
  else if(optional< value_t > temp = expression(tokens))
    return temp;
  else
    ...
}

What if value_t is not an int in a calculator example, but, say, a `binary_expression' object for some programming language parser? Any default is clearly completely bogus, and I will have to clutter my program with sanity checks that make sure that I really have a proper object, as I expect. Best regards, Jens

Jens Theisen
I'm not complaining that `factor' must have been default constructed. I'm worrying about the return value, which is represented by the `val' closure member. The point where we know what to return is where the assignments are, but by design we must have some value earlier than that.
Ah, I understand. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Jens Theisen
writes: I'm not complaining that `factor' must have been default constructed. I'm worrying about the return value, which is represented by the `val' closure member. The point where we know what to return is where the assignments are, but by design we must have some value earlier than that.
Ah, I understand.
I think Jens got it incorrectly. The var will *not* be created at all on an unsuccessful match. No, the value is not created prior to entering the rule. The value is created *lazily* after, and only after, a successful match is made. That's the beauty of lazy evaluation. Don't be misled by the syntax. On a no-match, the result is an optional<T>(), like in your example. See match class in match.hpp and notice the optional_type val; that's your attribute. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman
I think Jens got it incorrectly. The var will *not* be created at all on an unsuccessful match.
Which is cool, but misses my point. Take the example again, but this time I have made a mistake:

factor = ureal_p[factor.val = arg1]
       | '(' >> expression[factor.val = arg1] >> ')'
       | ('-' >> factor)
       | ('+' >> factor[factor.val = arg1])
       ;

In the third branch, I forgot to assign something. It will compile, though it had better not -- the parsed return value is bogus. How do I prevent such bugs? Hand-written parsers can be written in such a safe way:

optional< value_t > factor(tokens_t tokens) {
  // value_t is not default constructible
  if(optional< value_t > temp = ureal_p(tokens))
    return temp;
  else if(optional< value_t > temp = expression(tokens))
    return temp;
  else
    ...
  return none;
}

How to do it with spirit? Regards, Jens

Jens Theisen wrote:
Joel de Guzman
writes: I think Jens got it incorrectly. The var will *not* be created at all on an unsuccessful match.
Which is cool, but misses my point. Take the example again, but this time I have made a mistake:
[...]
How to do it with spirit?
I think this is going off topic from the thread. I suggest we continue this in the Spirit list: https://lists.sourceforge.net/lists/listinfo/spirit-general Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Jens Theisen wrote:
Joel de Guzman writes:
Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general
It also fits very well into this thread.
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq
And, to stick with my example, what would be a good initialisation for a plus_expression?
You don't want to replace default construction with a copy from what's conceptually a default. The topic question of this thread is about whether or not you should construct objects prior to when you have enough information to meaningfully do so, and for returning parsed values in Spirit, that point is neither at closure construction time nor at rule/grammar invocation time.
I have the impression that spirit either forces you to do exactly that or else use actions, which is even more messy.
Am I correct in that observation?
I'm sorry, you've lost me completely. Well, not completely, I may have a hint. But I'm not sure I fully understand you. Can you explain a bit more? Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman wrote:
One-phase construction definitely! Testing in isolation is a different matter and should not degrade the interfaces. There are ways to allow isolated testing without degrading the interface. It all boils down to decoupling and isolating dependencies. My favorites:
1) Use a template where policies can be replaced by hooks to the testing engine that tests the expected results. In your example, I'd imagine this interface: template <class Printer> class EventLog;
Or the more conventional OO approach of subclassing from ostream (in the example) and passing in a null or mock derived class object for testing purposes. Your approach has the drawback that it's now more difficult to have a container of all EventLog objects (because Printer is part of the type) and the OO approach has the drawback of requiring the introduction of a base class and virtuals in cases where they might otherwise not be necessary. To modify the example, suppose the EventLog constructor requires a Widget, and Widget is a large nonpolymorphic object with no virtuals. I'd still pass the Widget by reference, but subclassing it for testing would be ineffective, due to the lack of virtuals. Either way the desire to make the class testable affects the interface seen by users. This is not a complaint, just an observation. In another post, I noted that it seems like it'd be nice to be able to somehow create a "testing only" interface separate from the "normal" client interface.
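The subclassing approach Scott mentions can be sketched concretely. A common idiom (my illustration, not code from the thread) is to derive from std::streambuf rather than std::ostream itself, since all stream formatting funnels through the buffer; a discard-everything buffer lets any ostream-taking class be constructed in a test with no real I/O:

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>

// A streambuf that swallows all output. Overriding overflow() is enough:
// the stream will report success while writing nothing anywhere.
class null_buffer : public std::streambuf {
protected:
    int overflow(int c) override { return c; }  // claim success, discard c
};

// Usage with the hypothetical EventLog from this thread would look like:
//   null_buffer nb;
//   std::ostream null_stream(&nb);
//   EventLog log(null_stream);  // constructed for testing, no visible output
```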
2) Use callbacks. In the example you provided, I'd imagine EventLog calls logstream to print. So, I'd use a constructor like: EventLog::EventLog(boost::function<...> print) instead. So, instead of calling logstream << stuff directly, I'll call print(stuff). For the testing engine, I'll replace it with something that tests the expected results. All this falls under the "Hollywood Principle: Don't call us, we'll call you". IMO, with proper design, you can have both single phase construction *and* isolation testing.
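Joel's callback constructor can be sketched as follows (a hypothetical EventLog, with std::function standing in for the boost::function of the original discussion; the signature is my assumption, since the archive ate the template argument):

```cpp
#include <functional>
#include <string>
#include <vector>

// Single-phase construction: the print callback is supplied up front,
// so the object is fully usable the moment the constructor returns.
class EventLog {
public:
    explicit EventLog(std::function<void(const std::string&)> print)
        : print_(std::move(print)) {}

    void WriteEntry(const std::string& msg) { print_("LOG: " + msg); }

private:
    std::function<void(const std::string&)> print_;
};

// In a test, the callback simply records what would have been printed.
std::vector<std::string> test_event_log() {
    std::vector<std::string> captured;
    EventLog log([&](const std::string& s) { captured.push_back(s); });
    log.WriteEntry("Hello World");
    return captured;
}
```

The same object, constructed in production code, would be handed a callback that writes to the real log stream; the class itself never changes.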
But can you also have maximal inlining and, where needed by clients, runtime polymorphism? Templates preserve inlining but tend to sacrifice polymorphism (e.g., it's hard to have a container of (smart) pointers to EventLog<T> objects for all possible Ts), while base class interfaces and callbacks preserve polymorphism at the expense of easy inlining. Scott

Scott Meyers wrote:
Joel de Guzman wrote:
One-phase construction definitely! Testing in isolation is a different matter and should not degrade the interfaces. There are ways to allow isolated testing without degrading the interface. It all boils down to decoupling and isolating dependencies. My favorites:
1) Use a template where policies can be replaced by hooks to the testing engine that tests the expected results. In your example, I'd imagine this interface: template <class Printer> class EventLog;
Or the more conventional OO approach of subclassing from ostream (in the example) and passing in a null or mock derived class object for testing purposes.
Your approach has the drawback that it's now more difficult to have a container of all EventLog objects (because Printer is part of the type) and the OO approach has the drawback of requiring the introduction of a base class and virtuals in cases where they might otherwise not be necessary. To modify the example, suppose the EventLog constructor requires a Widget, and Widget is a large nonpolymorphic object with no virtuals. I'd still pass the Widget by reference, but subclassing it for testing would be ineffective, due to the lack of virtuals.
Either way the desire to make the class testable affects the interface seen by users. This is not a complaint, just an observation. In another post, I noted that it seems like it'd be nice to be able to somehow create a "testing only" interface separate from the "normal" client interface.
2) Use callbacks. In the example you provided, I'd imagine EventLog calls logstream to print. So, I'd use a constructor like: EventLog::EventLog(boost::function<...> print) instead. So, instead of calling logstream << stuff directly, I'll call print(stuff). For the testing engine, I'll replace it with something that tests the expected results. All this falls under the "Hollywood Principle: Don't call us, we'll call you". IMO, with proper design, you can have both single phase construction *and* isolation testing.
But can you also have maximal inlining and, where needed by clients, runtime polymorphism? Templates preserve inlining but tend to sacrifice polymorphism (e.g., it's hard to have a container of (smart) pointers to EventLog<T> objects for all possible Ts), while base class interfaces and callbacks preserve polymorphism at the expense of easy inlining.
How many Ts (for all EventLog<T>) do you need? For deployment in an application, surely the set of Ts is bounded. If there is a need to put them in a container, I'd place them in a tuple or a fusion::set, or, if you have more than one instance of each, a tuple or a fusion::set of std::vector(s). Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net
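Joel's "bounded set of Ts" idea can be sketched with a plain std::tuple (the class names are hypothetical; modern C++17 std::apply plays the role that Boost.Fusion iteration played at the time of this thread):

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <tuple>

// The policy-based EventLog from the discussion: the Printer type
// parameter replaces runtime polymorphism.
template <class Printer>
class EventLog {
public:
    explicit EventLog(Printer& p) : p_(p) {}
    void WriteEntry(const std::string& msg) { p_.print(msg); }
private:
    Printer& p_;
};

struct CoutPrinter {
    void print(const std::string& s) { std::cout << s << '\n'; }
};
struct StringPrinter {
    std::string text;
    void print(const std::string& s) { text += s; }
};

// Since the set of Printer types is bounded in any one application, a
// tuple can hold one log per instantiation, and std::apply visits them
// all despite the differing static types.
std::string broadcast_demo() {
    CoutPrinter cp;
    StringPrinter sp;
    std::tuple<EventLog<CoutPrinter>, EventLog<StringPrinter>> logs{
        EventLog<CoutPrinter>(cp), EventLog<StringPrinter>(sp)};
    std::apply([](auto&... log) { (log.WriteEntry("event"), ...); }, logs);
    return sp.text;  // what the StringPrinter captured
}
```

The trade-off Scott raises remains visible here: the tuple is a fixed, compile-time collection, not a runtime-extensible container of base-class pointers.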

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
very important in my view
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
Testing in isolation makes sense if you are the writer of the library, not as the user of it. Internal (non-public) interfaces are appropriate for such use-cases. As a library user I prefer to trust the library; if I conclude there is no reason to trust it, I prefer to find alternatives to using that particular library. Confusing interfaces are to me traps waiting to take away my trust. It does not help that intentions are good and that there may exist valid use-cases. Confusing interfaces are still confusing. Why mess up an otherwise good external interface of your component with confusing junk for the sake of testability? If you really need such an interface, make it a separate one. This "special needs" interface should not be the first that pops up in the face of library users. Hide it so only specialists looking for it will find it, and make sure they are aware what territory they are entering. Preferably, in my view, such interfaces should be internal to your component, hence supporting testing and other needs of the library writer without messing up the API.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it.
Ok, sounds good and perfectly reasonable,
In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
I cannot see how adding confusing constructors or other confusing methods to the interface would help anything. If I explore new territory I certainly would like to be able to make valid assumptions about the objects I use based on intuition. If I think I am test-driving a snow-mobile, and don't realize I have forgotten to add a belt to it, then I have no idea what I am exploring. A reasonable default behavior must be the better solution for exploring.
My gut instinct is not to have much sympathy for this argument, but
agreed
then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
I would much prefer build-time diagnostics if possible. That said, sometimes you need data-holder objects which potentially have costly default constructors. If in general the state of the object after default initialization is legal, but not very useful, then it may be a better idea to give the object a defined not_valid state into which default construction takes it. If that is done, throwing upon access of the object may be an OK solution. Users could also, by use of policies, decide whether exceptions are thrown or explicit checking of an is_valid() member should be used. Note that there is a clear distinction here between uninitialized as in undefined, and uninitialized as in a defined not_valid state. The latter is in my view only a good solution for classes used to hold data, either as an optimization or, more often, as a means to aid application logic. The former is never a good solution.
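Bjørn's distinction between "undefined" and "a defined not_valid state" can be sketched with a hypothetical data-holder class (the names are illustrative only):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Default construction yields a *defined* not_valid state, never an
// undefined one. Access to the payload either throws or is guarded
// by an explicit is_valid() check.
class Measurement {
public:
    Measurement() : valid_(false), celsius_(0) {}            // defined not_valid state
    explicit Measurement(double celsius) : valid_(true), celsius_(celsius) {}

    bool is_valid() const { return valid_; }

    double celsius() const {
        if (!valid_) throw std::logic_error("Measurement: not_valid");
        return celsius_;
    }

private:
    bool valid_;
    double celsius_;
};
```

A policy parameter could, as Bjørn suggests, let users choose between the throwing accessor and a checked is_valid() protocol; the essential point is that the not_valid state is a deliberate, documented part of the class invariant.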
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log; log.WriteEntry("Hello World"); // throws: no log stream was set
It does not make any sense to me why a reference to std::cout or something similar could not be used as the default stream here. I fail to see the benefit of the solution in the code example above.
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
I am not convinced they have thought so much about this; they may have - but I am not convinced. It may not be wise to assume the solution is good based on an assumption that some really smart people have thought long and hard about these aspects of API usability. Even if that were the case and this in fact is the best solution for C#, the reasoning behind it may not apply to C++. These types of interfaces are nothing new; maybe the part that throws on uninitialized use is of newer date. But except for that, this looks like patterns both library developers and users have been used to since long before OO and C++ caught fire. I am afraid we are so used to it that we miss the chance to see and call out how bad it looks. ------ Bjørn Roald

Scott Meyers wrote:
<snip>
So, library users, what do you prefer, and why?
I prefer complete construction 100% - if an object cannot operate, it should throw from its constructor. I think the design of the .NET libraries is influenced by the capabilities of all the languages it supports, and also because methods may be called _after_ an object is disposed via a direct call to the Dispose method. In .NET it is pretty common to create an object, set a bunch of properties and then use it (UI widgets, for example). However, even when some of the properties of an object may be set after construction, the object constructor should set default values for member variables so the object methods can be called safely. For example, if I create a button instance via a default constructor, I should be able to ask the button to draw itself before I set X, Y, width, height and color (even if it draws a black 1x1 box). Those are my 2 cents, Thanks, -delfin

[I think I'm mostly repeating what other people have said here, but decided to post anyway] Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
Essential, but there's no rule without an exception.
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation.
Actually, I think the above is an example of a class that's easy to test in isolation. In fact, close to perfect ;-) You're requiring a reference to a polymorphic object in the ctor, which is great from a testing point of view. It's possible to provide an ostringstream if you want to test the formatting, or some variety of "nullstream" if you want to test other stuff. Dependency injection via ctors is my preferred way of making (C++) objects testable. Also, in this very case the ostream isn't provided just for testability - it's an essential part of the interface. For your user requirement (~"needs an ostream on which to log events") there's really no good default either, as e.g. std::cout requires a console application. Perhaps a different example would be better to avoid this distraction?
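Johan's point can be made concrete with a short sketch (the EventLog here is the hypothetical class from this thread, not a real library): because the ctor takes std::ostream&, a test can inject an std::ostringstream and inspect exactly what was logged.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// One-phase construction: an ostream must be supplied up front.
class EventLog {
public:
    explicit EventLog(std::ostream& logstream) : out_(logstream) {}
    void WriteEntry(const std::string& msg) { out_ << msg << '\n'; }
private:
    std::ostream& out_;
};

// Dependency injection in action: the "log stream" under test is an
// in-memory string stream, so the formatting can be checked directly.
std::string formatting_test() {
    std::ostringstream capture;
    EventLog log(capture);
    log.WriteEntry("Hello World");
    return capture.str();
}
```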
The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
For the general case, that's true. For this specific case, I don't agree. I'm a bit curious, though, about what constitutes a test from your point of view. At times in your posting, I can't quite get a grip on whether you are talking about unit testing from the author's point of view, or exploratory testing.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
If the constructor arguments (or rather, their contributions to the functional state of the object) are essential to the functionality of the object, they shouldn't have unusable defaults - i.e. cause violations of later method call preconditions. As a developer I occasionally find myself adding extra arguments to ctors, or ctor overloads, just to make the dang thing testable without having to access an external resource, such as the file system, or the underlying OS API. For those cases, where the extra argument or overload is there just for the sake of testability, there always exists a reasonable default (or perhaps even only one real implementation). I just try to make sure that those extra arguments won't have to be provided by the casual user. I can agree that this last thing is a kind of interface pollution, but IMHO it is essential to be able to test as much as possible in isolation.
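Johan's "extra argument with a reasonable default" pattern might look like this (all names here are hypothetical illustrations): casual users never see the seam, while tests can substitute an in-memory reader for the real file system.

```cpp
#include <cassert>
#include <fstream>
#include <functional>
#include <sstream>
#include <string>

class ConfigLoader {
public:
    using Reader = std::function<std::string(const std::string& path)>;

    // The default argument is the one real implementation, so the casual
    // user writes simply: ConfigLoader loader;
    explicit ConfigLoader(Reader read = &ConfigLoader::read_file)
        : read_(std::move(read)) {}

    std::string load(const std::string& path) { return read_(path); }

private:
    static std::string read_file(const std::string& path) {
        std::ifstream in(path);
        std::ostringstream ss;
        ss << in.rdbuf();
        return ss.str();
    }

    Reader read_;
};

// A test passes a stub reader and never touches the file system:
//   ConfigLoader loader([](const std::string&) { return "key=1"; });
```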
My gut instinct is not to have much sympathy for this argument, but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
I think your gut instinct is correct. Also, taking design guidelines for .NET (which are perhaps absolutely appropriate there) and attempting to apply them to C++ programming might not be the right way to go.
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log; log.WriteEntry("Hello World"); // throws: no log stream was set
Yuk. If there was really a need for this lazy init, I think a null outputter would be better in this case. But it would depend on the application in question. [snip lots of .NET discussion]
So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder,
Again, if the ctor arguments are essential for the operation of the object and have no reasonable defaults, what's the alternative, really? If the arguments are non-essential, don't require them, but perhaps provide additional overloads to allow customized construction. An example of the latter would be the inclusion of a filter for the EventLog class, e.g. based on message contents or priority (as I believe someone else said also).
are arguably contrary to exploratory programming,
I might be saying the same thing over and over again, but how can you explore something unusable? As a side note, though, I often prefer using unit testing as an exploratory tool.
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW)
I don't understand how loose coupling and "sort-of-initialized" objects connect?
and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
As above. Regards, Johan Nilsson

Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception. C++/CLI follows the .Net philosophy and rules regarding the construction of objects, not the C++ philosophy and rules, as I pointed out in another post on this thread. Because of that it is not necessarily apropos to use .Net as a basis for discussion of constructor philosophy in C++, although .Net's reliance on properties and events does have an analogy to work that has been done with C++ lately, and does affect how one thinks about constructing and using objects.

Edward Diener
Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception. C++/CLI follows the .Net philosophy and rules regarding the construction of objects, not the C++ philosophy and rules, as I pointed out in another post on this thread. Because of that it is not necessarily apropos to use .Net as a basis for discussion of constructor philosophy in C++, although .Net's reliance on properties and events does have an analogy to work that has been done with C++ lately, and does affect how one thinks about constructing and using objects.
C++/CLI's default null-initialization of references doesn't in any way justify designs that pass out half-baked objects to users, any more than we would approve of passing out half-baked objects containing only shared_ptr<>s in C++. It's bad design. It makes the client of a class responsible for things that the class designer should have taken care of. That's a universal no-no, regardless of what language the code is written in. Default null-initialization of references does make it easier to not think about certain exception-safety and object state issues and "get away with it" some of the time, but that doesn't make code written that way correct. Usually, it just means that bugs may be masked or harder to detect. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Edward Diener writes:
Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API. Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception. C++/CLI follows the .Net philosophy and rules regarding the construction of objects, not the C++ philosophy and rules, as I pointed out in another post on this thread. Because of that it is not necessarily apropos to use .Net as a basis for discussion of constructor philosophy in C++, although .Net's reliance on properties and events does have an analogy to work that has been done with C++ lately, and does affect how one thinks about constructing and using objects.
C++/CLI's default null-initialization of references doesn't in any way justify designs that pass out half-baked objects to users, any more than we would approve of passing out half-baked objects containing only shared_ptr<>s in C++. It's bad design. It makes the client of a class responsible for things that the class designer should have taken care of. That's a universal no-no, regardless of what language the code is written in.
That's really .Net's default null-initialization of references, since C++/CLI plays by the .Net rules. Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are. This is an instance where programmers who are not cognizant of a particular technology jump to conclusions which are not true based on a similar technology which behaves differently. Standard C++ does not have a design-time interface which sets private data members from within a constructor, the equivalent of .Net's properties, so it appears to standard C++ programmers that the .Net default constructor, which is the common case, creates objects which are not ready to use whereas this is not the situation at all. In my initial response in this thread I may not have been clear about how .Net works, but I did say that the .Net default constructor methodology is not indicative of the issue discussed, where the OP thought it might be a good idea not to have his objects ready to use immediately upon construction and thought that the .Net default constructors were an indication of that case.
Default null-initialization of references does make it easier to not think about certain exception-safety and object state issues and "get away with it" some of the time, but that doesn't make code written that way correct. Usually, it just means that bugs may be masked or harder to detect.
See above vis-a-vis properties being set in the constructor by the .Net visual designer interface.

Edward Diener writes:
Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are.
Edward, I know little of .Net, so I'd appreciate it if you could help me out here. When you say "design-time interface," what are you talking about? Some kind of GUI? Are you saying that this GUI modifies the compiled binary, leaving no textual trace of member initialization values in the original source for the class? [for what it's worth, I don't think the answers to this question can affect my stance on Scott's question at all. Either the objects are fully-baked, or not, upon construction. How they come to be baked, and whether this "design-time interface" thing is a good idea, are separate issues] -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams a écrit :
Edward Diener writes:
Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are.
Edward, I know little of .Net, so I'd appreciate it if you could help me out here. When you say "design-time interface," what are you talking about? Some kind of GUI?
I think so.
Are you saying that this GUI modifies the compiled binary, leaving no textual trace of member initialization values in the original source for the class?
The GUI modifies the source code. -- Loïc

David Abrahams wrote:
Edward Diener writes:
Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are.
Edward, I know little of .Net, so I'd appreciate it if you could help me out here. When you say "design-time interface," what are you talking about? Some kind of GUI?
Yes. The Visual Studio designer allows one to drop components into other components and forms, and modifies the source code constructors which create the component and/or form. A component is also a component container, allowing other components to be nested inside it, and a form is the equivalent of a Windows window, which naturally can have components in it. A component is a particular kind of class, and controls, which people are used to seeing in forms (windows), are just visual components.
Are you saying that this GUI modifies the compiled binary, leaving no textual trace of member initialization values in the original source for the class?
The original source. The Visual Studio designer is actually modifying the constructors to set up the components in the source code, and then this gets compiled into the resulting binary. The areas being modified are marked off in the source so that the user does not change them but allows the Visual Studio designer to do it instead. The actual way it does this is better than it sounds. When you first create a component and/or form from menus in Visual Studio, the constructors get an InitializeComponents(); call added to them, with the appropriate definition later in the source code. It is in this InitializeComponents() definition that the Visual Studio designer manipulates code based on the properties and events of components added to a component and/or a form. There are two constructors: a default one which takes no parameters, and another one which is passed a container and in which code gets added automatically to add the component to the container.
[for what it's worth, I don't think the answers to this question can affect my stance on Scott's question at all. Either the objects are fully-baked, or not, upon construction. How they come to be baked, and whether this "design-time interface" thing is a good idea, are separate issues]
I agree with your analysis. I was only trying to point out that the default constructor ( or container constructor ) of .Net was erroneously causing others, including the OP, to think that .Net creates classes which are not ready to be used upon construction. They are, by and large, ready to use immediately because the necessary code to setup their properties and events has already been injected into the constructors when the programmer manipulated a form or a component in the visual designer. So having someone point out to the OP that .Net is an example of an environment in which two-phase construction is the rule is just plain wrong, and any inference about software design of classes following this idiom should not be made from that erroneous conclusion.

Edward Diener wrote:
Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception.
[snip] I guess my comment was somewhat ambiguous here. What I meant was that the C++ "platform" is different from the .Net platform, and that C++/CLI belonged in the .Net camp. // Johan

Hi Scott, Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be specified
One-phase construction that establishes the class invariant is one of the two main benefits of OO programming (the other being polymorphism). So I rate it as very important. As for testability, things like the strategy pattern help a lot. In your example above, the argument is passed by reference and so you could perhaps use a dummy class derived from std::ostream. For tightly coupled code (and otherwise), DbC is indispensable. Michael Feathers describes a range of techniques for writing tests for legacy code in "Working Effectively with Legacy Code". cheers Thorsten

Scott Meyers
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
Speaking as a library developer ;-) it's extremely beneficial and almost always the right thing to do.
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain
Just about any amount of pain in setting up a test (as long as it's possible) is justified in order to get the interface right for users.
and (2) be irrelevant for whatever test I want to perform.
How so?
In such cases, offering a default constructor in addition to the above would make the class potentially easier to test.
How so?
(In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
Sorry, that's too recursive for me to understand. A concrete example would help.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it.
Often true.
In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
Uh, yeah, but if you don't supply enough information to meaningfully construct the object, you don't have something you can play with. And encouraging users to "play" with dangerous, half-constructed objects isn't exactly friendly.
My gut instinct is not to have much sympathy for this argument,
Good gut ;-)
but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
That's terrible advice. It leads to a style of development where you're always checking to see if your program state is good. It's a horrible burden on maintainability; the "not good" path is almost never covered in any tests and so is probably broken anyway.
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
I do? Why?
But then on the third hand I get mail like this:
The .NET libraries have many objects with many constructors that leave the constructed object in a not ready-to-use state.
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
However, the SqlParameter class has six constructors. Only two constructors create a SqlParameter object that can be immediately used.
Omigosh.
The others all require that you set additional properties (of course, which additional properties is unclear).
Failure to write documentation that makes a component's requirements clear is not the mark of a library designer who has "thought about this kind of stuff a lot!"
Failure to prepare the SqlParameter object correctly typically generates an unhelpful database error when the SQL statement is executed. To add to the confusion, the first ctor shown by IntelliSense has 10 parameters (which, if set correctly, will instantiate a usable object). The last ctor shown by IntelliSense has only 2 parameters and is the most intuitive choice. The four in between are all half-baked. It's confusing, and even though I use it all the time, I still have to look at code snippets to remember how.
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation,
Yes.
but they make testing harder,
I've not heard a convincing argument that they do that.
are arguably contrary to exploratory programming,
Nor that.
and seem to contradict the advice of the designers of the .NET API.
Well, that should tell you something about the designers of the .NET API. If they find it necessary to sacrifice sound interfaces in order to get testability and explore-ability, they have weak design muscles and not nearly the experience you give them credit for.
Constructors that "sort of" initialize objects are more test-friendly
I'll believe it when I see it.
(also more loosely coupled, BTW)
How so?
and facilitate an exploratory programming style,
Exploratory programming with capriciously-broken parts is _not_ fun!
but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
As a library user, I prefer documentation that describes what I need to do to use a component correctly and designs that try not to let me use broken parts. Naturally there are design choices that can be made *within those constraints* that make testing and exploration easier, but there is absolutely no need to sacrifice static checking to reach these ends! -- Dave Abrahams Boost Consulting www.boost-consulting.com

It seems to me that what constitutes an initialised state of an object is largely library defined. One size does not fit all; some libraries will suit certain tasks better than others. As long as the state after construction is known and documented, it seems that even the EventLog with a default constructor would be a strictly correct (if not that useful) design.

Some of the main reasons for two-phase construction are to allow for relationships between long-lived objects, and often to allow complex operational properties to be assigned or adjusted dynamically through the life of the objects - without constraint from external (outside program control) objects. Whether it is called "two-phase construction" or whether it is just that we say the object is defined as having an active and inactive state does not matter much. AFAICT this is the principle that is being discussed. Mostly I am thinking of database and GUI applications, but we can demonstrate the principle using a file class (ignoring the availability of streams)...

MyFileClass f;             // an empty, unopen, unnamed file object
f.set_file_name("afile");  // a named but still unopen file
f.open(ForRead);           // now a file open for reading

Is it bad to allow MyFileClass to be constructed in this way? It would seem to partially violate the resource-acquisition-is-initialisation principle, although exactly what happens internally I have not defined, and it can still ensure appropriate release of resources when the object is destroyed.

The advantage of such a system is the ability to hook up potentially complex objects and use events (or signals or whatever you want to call them) to keep them in touch using observer patterns. For example we may connect an object to f and observe signals that it sends out about what file it represents, whether it is open, and perhaps the contents of the file. I.e. your typical document/view sort of thing.
Someone earlier mentioned vehicles (amongst other things) as an example of an object we expect to arrive in a fully constructed state. And this is a good example of what I am talking about here. I DO expect my car to be fully constructed when I get it - but I DON'T expect it to be continually in motion, or to have the engine running all the time. If I press the accelerator when it is not running then nothing happens. It is a fully constructed object; it is just not very useful until it has been turned on. (Cf. MyFileClass when not open. A car does not have to be turned on to exist, a file does not have to be open to exist - it is just that some things are not possible until the object is active.)

I may want to engage four wheel drive. I don't mind turning the car off while I do it (if I have to), but I don't expect to take the entire car to pieces and reassemble it in four-wheel-drive mode. (Cf. MyFileClass when open for read and I want to write.)

I have many applications where the same principles apply. The objects have complex relationships and may have many defining or dynamically adjustable attributes. It is not always convenient to destroy and recreate entire sets of objects every time I want to change some aspect of their operation.

That is my current take on it. Single-phase construction is obviously best where feasible - it keeps everything simpler and clearer. But at this time I can see no alternative but two-phase construction (objects with active/inactive state) for some situations. Are there practical alternatives? If so I'd love to hear about them. -- Geoff Worboys Telesis Computing

Geoff Worboys wrote:
That is my current take on it. Single phase construction is obviously best where feasible - it keeps everything simpler and clearer. But at this time I can see no alternative but two phase construction (objects with active/inactive state) for some situations.
Are there practical alternatives? If so I'd love to hear about them.
As I explained in another post in this thread, I think boost::optional is the ultimate solution to the active/inactive state problem, so the class itself can still have single phase construction. Yuval

Yuval Ronen wrote:
As I explained in another post in this thread, I think boost::optional is the ultimate solution to the active/inactive state problem, so the class itself can still have single phase construction.
There is a lot about boost::optional that I am not entirely happy about, but we don't really want to get into that on this thread; however, the general idea of using optional or a similar facility is still very interesting.

I imagine that the idea would be to refactor the current class with active/inactive state into two separate classes. The long-lived object (needed for complex structural reasons) would operate as normal, but all the active-dependent interface would be moved to the new/separate class and accessed through a member function returning an optional<T> (or similar).

I can see where this would make excellent sense in several of my applications. To optimise access inside tight loops you could assign the optional<T> to a T& - and so only be checking existence at the start... as opposed to my present problem of having to choose between assertion or exception checking at the top of each top-level class function.

Yes, I like the thought very much. Thank you. -- Geoff Worboys Telesis Computing

Geoff Worboys wrote:
I can see where this would make excellent sense in several of my applications. To optimise access inside tight loops you could assign the optional<T> to a T& - and so only be checking existence at the start... as opposed to my present problem of having to choose between assertion or exception checking at the top of each top-level class function.
Exactly.
Yes, I like the thought very much. Thank you.
My pleasure! :-)

"Scott Meyers"
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I can think of one example for not fully constructing objects:

I've got some multi-threaded programs where I have found that it is quite convenient for each of my threads to be a C++ object. The private thread variables are C++ private data members. These threads cooperate with one another by sending messages to each other through locking message queues. In this way private data members do not need to be locked, as each message is processed sequentially.

The problem is in starting up. If each one of these threads knows about each other via some method (pointer, or reference), then when a thread starts up its own business, whatever that is, then most likely it will start generating data and sending messages. When it tries to send a message to a thread that does not yet exist, BOOM!!! (yes, I have experienced this boom).

My solution to this is not to start the threads running in the constructor. I can create all of the thread objects in any order I like, then I can start them running as threads in any order that I like, through a member function call to each one. If a thread sends a message to another thread object not yet running, then the message simply waits in his queue, until he wakes up and begins processing messages.

Consequently, at program shutdown time, one cannot just start deleting thread objects. If you were to delete a thread object that other thread objects are messaging, BOOM!! (or as they say in this newsgroup, undefined behavior). My answer to this is to signal each thread to come to a stop on its own accord. Once all threads are stopped, then they can be deleted in any order. So, I don't have monolithic destructors, either.

Of course, you may consider a thread object that is not yet running is actually fully constructed. I am not sure of the definition. my .02, Robert Kindred

[snip] So, library users, what do you prefer, and why?
Thanks,
Scott

On Fri, 15 Sep 2006 11:16:21 -0300, Robert Kindred
I can think of one example for not fully constructing objects:
I've got some multi-threaded programs where I have found that it is quite convenient for each of my threads to be a C++ object. The private thread variables are C++ private data members. These threads cooperate with one another by sending messages to each other through locking message queues. In this way private data members do not need to be locked, as each message is processed sequentially.
The problem is in starting up. If each one of these threads knows about each other via some method (pointer, or reference), then when a thread starts up its own business, whatever that is, then most likely it will start generating data and sending messages. When it tries to send a message to a thread that does not yet exist, BOOM!!! (yes, I have experienced this boom).
My solution to this is not to start the threads running in the constructor. I can create all of the thread objects in any order I like, then I can start them running as threads in any order that I like, through a member function call to each one. If a thread sends a message to another thread object not yet running, then the message simply waits in his queue, until he wakes up and begins processing messages.
Consequently, at program shutdown time, one cannot just start deleting thread objects. If you were to delete a thread object that other thread objects are messaging, BOOM!! (or as they say in this newsgroup, undefined behavior). My answer to this is to signal each thread to come to a stop on its own accord. Once all threads are stopped, then they can be deleted in any order. So, I don't have monolithic destructors, either.
Of course, you may consider a thread object that is not yet running is actually fully constructed. I am not sure of the definition.
I find this situation sometimes too. My solution follows Stroustrup's advice. When I can't break a cycle I create an entity that stands for the collection of interrelated instances/classes. In your example each thread object maintains an invariant, but the collection of threads has a stronger invariant than the sum of the individual invariants. That can be seen in that both construction and destruction require special care.

In Haskell some cases of two-phase construction can be avoided because of its lazy nature. You can have references to objects that still don't exist, without having destructive assignments. It really rocks: http://www.haskell.org/hawiki/TyingTheKnot Bruno

I much prefer to know that all objects are always valid. Otherwise I have to constantly keep in my mind the state of the objects as I read through the program. A small piece of code is not "transparently correct" as it depends on a "hidden" internal state. This means I have to rely more on program testing and will have more hard-to-find bugs. As the program gets large (and they're always getting larger!) the problem just gets worse. In order for me to be confident that a given program is correct (or has few bugs), it must be composed of individually verifiably correct modules. So minimizing the number of internal states helps correctness.

Of course I realize that it is not always possible to eliminate "two-phase construction" - in C++ often due to the problems of handling exceptions in constructors - but I still prefer to minimise this.

"Exploratory programming" - I haven't heard of this but I can imagine what it might mean. It doesn't sound good to me. Though I must confess I have fallen into the habit of testing ideas - but, in my case, this usually means relying on compile-time syntax checking to help figure out how to use Boost libraries, which often isn't easy from the documentation. Robert Ramey

Scott Meyers wrote: ...
So, library users, what do you prefer, and why?
Thanks,
Scott

Uhh what? Throw exceptions from your constructor... Sorry for the TP. -----Original Message----- From: boost-users-bounces@lists.boost.org on behalf of Robert Ramey Of course I realize that it is not always possible to eliminate "two-phase construction" - in C++ often due to the problems of handling exceptions in constructors - but I still prefer to minimise this.
participants (37)
-
Anne-Gert Bultena
-
Arkadiy Vertleyb
-
Bill Lear
-
Bjorn Roald
-
Brian Allison
-
Bruno Martínez
-
David Abrahams
-
David Walthall
-
Delfin Rojas
-
divyank shukla
-
Edward Diener
-
Gennaro Prota
-
Geoff Worboys
-
Gottlob Frege
-
James Dennett
-
Jeff Flinn
-
Jeff Garland
-
Jens Theisen
-
Joel de Guzman
-
Johan Nilsson
-
Kevin Wheatley
-
loufoque
-
Loïc Joly
-
Marshall Clow
-
Matthias Hofmann
-
Ovanes Markarian
-
Paul Davis
-
Peter Dimov
-
Rene Rivera
-
Robert Kindred
-
Robert Ramey
-
Roman Neuhauser
-
Rush Manbert
-
Scott Meyers
-
Sohail Somani
-
Thorsten Ottosen
-
Yuval Ronen