
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?

I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):

  EventLog::EventLog(std::ostream& logstream);  // an ostream must be
                                                // specified

I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)

Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters; they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument, but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used. In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):

  EventLog log;
  log.WriteEntry("Hello World");  // throws: no log stream was set

This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users. But then on the third hand I get mail like this:

  The .NET libraries have many objects with many constructors that leave
  the constructed object in a not-ready-to-use state. An example:
  System.Data.SqlClient.SqlParameter is a class that describes a bound
  parameter used in a database statement. Bound parameters are essential
  to prevent SQL injection attacks. They should be exceedingly easy to
  use, since the "competition" (string concatenation of parameters into
  the SQL statement) is easy, well understood, and dangerous.

  However, the SqlParameter class has six constructors. Only two of them
  create a SqlParameter object that can be immediately used. The others
  all require that you set additional properties (of course, which
  additional properties is unclear). Failure to prepare the SqlParameter
  object correctly typically generates an unhelpful database error when
  the SQL statement is executed. To add to the confusion, the first ctor
  shown by IntelliSense has 10 parameters (which, if set correctly, will
  instantiate a usable object). The last ctor shown by IntelliSense has
  only 2 parameters and is the most intuitive choice. The four in
  between are all half-baked. It's confusing, and even though I use it
  all the time, I still have to look at code snippets to remember how.

So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).

So, library users, what do you prefer, and why?

Thanks,

Scott

Scott Meyers wrote:
  EventLog::EventLog(std::ostream& logstream);  // an ostream must be
                                                // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
The key here is that mock/stub objects in C++ are somewhat of a pain. In your example, what are you going to test with your event log that won't need to be validated by checking the logstream? In my case I've used regex matching on the logstream; even in my constructor test case I verify that nothing is written to the stream. So I always supply a fake ostream - usually a string-based stream that I can validate. I know there was some degree of push back in the blog-o-sphere over the overuse of this form of inversion of control (see http://www.martinfowler.com/articles/injection.html for some of the alternatives).
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument,
Personally I'd prefer the use of Null objects, mostly because I'm objecting to code like this:

  value myclass::method()
  {
      if (member_variable == is_not_set)
          return NULL_OF_SOME_FORM;
      // do something with member_variable
  }

Slightly better might be:

  value myclass::method2()
  {
      if (member_variable == is_not_set)
          throw something;
      // do something with member_variable
  }

but even better is code that reads:

  value myclass::method3()
  {
      // do something with member_variable
  }

It fits better with the idea of value objects and makes me more likely to write code that has a stronger exception guarantee, but that's me :-)
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So I guess I'd add that a constructor IMO should be used to initialize an object into a well-defined state, including setup of the object's invariant. Two-phase construction runs the risk that you might forget something. My view is that an object with a large number of parameters passed into the constructor smells of a missing object/responsibility in the design; even if it's a data transfer object, by making such a class you're trying to direct the programmer not to forget something. Testability isn't really affected by single-phase construction. In my experience it helps, because all your setup code is in 'one line', not spread about. These are of course guidelines and not hard and fast rules.

Kevin
-- 
| Kevin Wheatley, Cinesite (Europe) Ltd | Nobody thinks this      |
| Senior Technology                     | My employer for certain |
| And Network Systems Architect         | Not even myself         |

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I find it essential that if something is truly a class, and not just a struct with some member functions, the constructor leaves the Foo as a Foo and not as an almost-Foo or sort-of-Foo. Arguments for partial construction of a class are almost always from the mentality of C programming, and ignore the concept of invariants - indicating that the one ignoring them has a weak grasp on the usefulness of user-constructed types.

Instead of giving you the same answers that have been presented so many times in books, lectures, on the net and even in the Boost mailing lists, let me use a bit of rhetoric to rephrase your question: how important is it that the items you use in everyday life arrive at the point of your using them in a fully constructed state? Say... a vehicle, a computer, a building, a dentist's knowledge... and the list goes on and on. If I'm given a Thing, then let it be a real Thing and not a "cardboard imitation".

In my experience, if I'm fixing a bug in some code and notice that an object is constructed in multiple phases, then I immediately start redesigning it so that each invariant is a distinct type. When I'm done I've spent about as much time "refactoring" it and testing it as I would have spent fixing the original bug. However, the original bug is gone and I've removed my need to look at the code again in the future for the "next behavioral anomaly".

So to me, the issue seems like a no-brainer. If I want to test, and seem to have a dependency, then the dependees should be tested first. If it seems that there is a natural circular dependency, then I haven't broken the types down properly.
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
Proper testing either has stub objects which properly emulate a presumption, or it has already-tested objects which properly represent their invariant. Exploratory programming is fine (I use it a lot myself), but if I use bubble gum in my exploratory building then I can't complain when unfortunate things happen.

As for the advice of the designers of a particular API... it sounds to me as if your description of their API indicates that you don't consider their API very highly. It sounds (again, from your description) to be hard to use, easy for a user to create a bunch of badly working objects, and hard to debug. I do hope that you've got an alternative to that kind of suffering. Or perhaps I misread your description?

regards,
Brian

On Tue, 12 Sep 2006 07:06:39 -0400, Brian Allison wrote:
In my experience, if I'm fixing a bug in some code and notice that an object is constructed in multiple phases, then I immediately start redesigning it so that each invariant is a distinct type.
I'm intrigued :-) Could you please provide an example with a minimal complete definition of the invariant types?

-- 
[ Gennaro Prota. C++ developer, Library designer. ]
[ For Hire http://gennaro-prota.50webs.com/      ]

Brian Allison wrote:
As for the advice of the designers of a particular API... it sounds to me as if your description of their API indicates that you don't consider their API very highly. It sounds (again, from your description) to be hard to use, easy for a user to create a bunch of badly working objects, and hard to debug. I do hope that you've got an alternative to that kind of suffering. Or perhaps I misread your description?
I find my intuitions and preferences in tension with advice from others who I believe are worth listening to. This suggests that my intuitions and preferences need refining, or that I've misunderstood or misevaluated the advice I find in tension. I'm suspicious of a design for an EventLog that seems to require a stream to be useful, yet still allows an EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET). I respect the people in that position, and for those who care about Amazon ratings, it's been well received there (http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756/sr=8-1/qid=1158206783/ref=pd_bbs_1/104-6620534-4167149?ie=UTF8&s=books).

I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:

  std::ofstream ofs;
  std::vector<int>::iterator i;
  std::string s;

In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have

  std::ofstream ofs;
  ofs << "Hello World";

Trouble ensues, just as it did in the EventLog example.

Scott

At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty. The other two - those, I'll give you ;-) -- -- Marshall Marshall Clow Idio Software mailto:marshall@idio.com It is by caffeine alone I set my mind in motion. It is by the beans of Java that thoughts acquire speed, the hands acquire shaking, the shaking becomes a warning. It is by caffeine alone I set my mind in motion.

Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them. On the string s, for example, invoking size is fine, but invoking operator[] is not fine at all. Scott

Scott Meyers wrote:
Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]

I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:

  std::ofstream ofs;
  std::vector<int>::iterator i;
  std::string s;

Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them. On the string s, for example, invoking size is fine, but invoking operator[] is not fine at all.
But that's true for string even with valid non-default construction:

  std::string s("");

And it is also true, in possibly invoking erroneous behavior, of all C++ arrays and array-like objects. It's not a "defect" of the default construction of string, but a "defect" of the string interface. A differently designed interface, one that did not have operator[], would not have that "defect".

Note, I'm not suggesting that this would, by inference, be extended to two-phase construction designs. But then again, this is a matter of perspective. If one considers the construction to determine the allowable interface, then your string example is apropos. To me that suggests implementing, when required, two-phase designs with double construction. For example:

  const_string s0;
  string s1(s0 + "something");

But that looks hellishly complicated to pull off design-wise.

-- 
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Scott Meyers writes:
Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them. On the string s, for example, invoking size is fine, but invoking operator[] is not fine at all.
The difference is that there are no special cases in the specification of std::string to cover the default-constructed case. The other two may fall within the invariants, but they're degenerate objects requiring special consideration. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
Marshall Clow wrote:
At 9:06 PM -0700 9/13/06, Scott Meyers wrote: [ snip ]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Just a nit - I think that your third example is not like the others. A std::string, AFAIK, constructed with the default constructor, is perfectly valid - just empty.
In each case, a default constructor is invoked. They're all valid objects that presumably fulfill their invariants, it's just that you can't safely invoke very many operations on them.
No, a singular iterator is not a valid object and it fulfills no invariants. You can't even copy it without UB. The ofstream can be made to fulfill its invariant if we define its invariant as "true", I guess, or something with a similar utility. A default-constructed std::string is fine. You are not breaking the invariant by invoking op[], you are supplying an invalid index to it. BTW, did you know that the const version of op[] is required to return 0 for s[0]? :-) How exactly it is supposed to return 0 when the return type is char const& is left as an exercise for the implementer.

David Abrahams wrote:
"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants.
That's arguable. From my POV, if it's in a state that a legal program can create, it's within the invariant by definition.
Then (if I read you correctly) even undefined behavior is within the invariant? Or have I been misunderstanding that legal programs can cause UB? I've always considered that UB is to be treated as "not maintaining the invariant but it's not my fault". But then, whether we consider UB within the invariant (and hence that an iterator is by definition always within the invariant), or whether we consider UB outside of the invariant but one which doesn't break correctness (and hence that unassigned iterators are not capable of being within/outside of an invariant)... is there any practical difference between those two points of view?

thanks,
Brian

Brian Allison writes:
David Abrahams wrote:
"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants.
That's arguable. From my POV, if it's in a state that a legal program can create, it's within the invariant by definition.
Then (if I read you correctly) even undefined behavior is within the invariant?
No, invariants are about state and UB is about behavior. Behaviors don't fall inside or outside of states.
Or have I been misunderstanding that legal programs can cause UB?
I don't understand the question, sorry. When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I've always considered that UB is to be treated as "not maintaining the invariant but it's not my fault".
By whom?
But then, whether we consider UB within the invariant (and hence that an iterator is by definition always within the invariant), or whether we consider UB outside of the invariant but one which doesn't break correctness (and hence that unassigned iterators are not capable of being within/outside of an invariant)....
... is there any practical difference between those two points of view?
Sorry, this whole behavior-within/without-the-invariant concept doesn't make any sense to me. It's probably just my too-literal mind at work, but if you could rephrase or explain maybe I could answer. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Brian Allison writes:
David Abrahams wrote:
"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants.
That's arguable. From my POV, if it's in a state that a legal program can create, it's within the invariant by definition.
Then (if I read you correctly) even undefined behavior is within the invariant?
No, invariants are about state and UB is about behavior. Behaviors don't fall inside or outside of states.
Or have I been misunderstanding that legal programs can cause UB?
I don't understand the question, sorry.
When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I misunderstood you - I thought you meant "illegal program" in the sense of any program which runs afoul of the standard. Thanks for the clarification.

Brian Allison writes:
When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I misunderstood you - I thought you meant "illegal program" in the sense of any program which runs afoul of the standard.
I also consider invoking undefined behavior to be "running afoul of the standard." :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Brian Allison writes:
When a program causes undefined behavior, that falls into the category I'm calling "illegal program." I don't just mean those programs that can be diagnosed as illegal by the compiler.
I misunderstood you - I thought you meant "illegal program" in the sense of any program which runs afoul of the standard.
I also consider invoking undefined behavior to be "running afoul of the standard." :)
I realized that with your definition of "illegal program". That would seem to imply that you don't, in fact, only consider state as an indication of whether or not a program is illegal. On the one hand, it sometimes appears that something which goes outside of the algebra of invariants (forgive me my imprecision) is your 'illegal program'; then you throw in behaviorisms - which have nothing to do with invariants (if I did read you correctly... <shrug>). Yet you don't allow for the behaviorisms that are specified by the standard in your definition of 'illegal program'. Hm. Perhaps my curiosity->desire to grok Your Standard is no longer on-topic with Library Interface Design, and may even be off-topic for the mailing list. :/

"Peter Dimov" writes:
No, a singular iterator is not a valid object and it fulfills no invariants. You can't even copy it without UB. The ofstream can be made to fulfill its invariant if we define its invariant as "true", I guess, or something with a similar utility.
A singular iterator is a valid object. Its use is limited, however: you can only assign a valid value to it. It may be useful to distinguish between an invalid and an uninitialized object.
A default-constructed std::string is fine. You are not breaking the invariant with invoking op[], you are supplying an invalid index to it. BTW, did you know that the const version of op[] is required to return 0 for s[0] :-) How exactly is it supposed to return 0 when the return type is a char const& is left as an exercise for the implementer.
That's not true; see Section 21.3.4 of the 2003 C++ Standard for the definition of operator[] for std::basic_string<> (note that std::string is a typedef for std::basic_string<char>):

[quote]
  const_reference operator[](size_type pos) const;
  reference operator[](size_type pos);

  Returns: If pos < size(), returns data()[pos]. Otherwise, if pos ==
  size(), the const version returns charT(). Otherwise, the behavior is
  undefined.
[/quote]

This behaviour does not seem to be difficult to implement. The default constructor only needs to allocate an array of size 1 and initialize its single element with charT(). The constant version of operator[] can then return a constant reference to that element.

-- 
Matthias Hofmann
Anvil-Soft, CEO
http://www.anvil-soft.com - The Creators of Toilet Tycoon
http://www.anvil-soft.de - Die Macher des Klomanagers

Scott Meyers writes:
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
That doesn't make it a good practice in general, and...
std::ofstream ofs; std::vector<int>::iterator i;
...these two are examples (although at least the documentation about what that means is rigorous and complete)
std::string s;
...but this is not an example of it
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
Trouble ensues, just as it did in the EventLog example.
Yeah, nobody these days claims that the iostreams are an example of stellar, state-of-the-art C++ library design. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
I'm suspicious of a design for a EventLog that seems to require a stream to be useful, yet still allows a EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET).
I'm still not sure how you can test an EventLog without a stream. What do you test? The default constructor and the destructor? Isn't the whole purpose of the EventLog to output something into its stream?

Scott Meyers wrote:
Brian Allison wrote:
As for the advice of the designers of a particular API... it sounds to me as if your description of their API indicates that you don't consider their API very highly. It sounds (again, from your description) to be hard to use, easy for a user to create a bunch of badly working objects, and hard to debug. I do hope that you've got an alternative to that kind of suffering. Or perhaps I misread your description?
I find my intuitions and preferences in tension with advice from others who I believe are worth listening to. This suggests that my intuitions and preferences need refining or that I've misunderstood or misevaluated the advice I find in tension.
Or that the belief you hold about them being worth listening to is not, in fact, properly applied when considering good C++ library design from the user's point of view. Remember: you posted this to Boost-users, putting the question of "how important is..." into the context of the users of a library. If I were using a library and I were so easily able to create broken objects, then the first issue I would address is whether or not the library was well-designed. It wouldn't matter how much I liked the guys who wrote the library.
I'm suspicious of a design for a EventLog that seems to require a stream to be useful, yet still allows a EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET). I respect the people in that position, and for those who care about Amazon ratings, it's been well received there (http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756/sr=8-1/qid=1158206783/ref=pd_bbs_1/104-6620534-4167149?ie=UTF8&s=books).
Proof by popularity? [The #1 book since Gutenberg has always been The Bible.]
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i; std::string s;
Uninitialized objects? The first two are examples of things which don't represent invariants but are interfaces to something else - rarely should an interface layer be used in isolation. [I go a step further and disbelieve that one should ever be *created* in isolation.] The last example is a fine object to create.

But we weren't talking about uninitialized objects, we were talking about partial construction, which is a different matter. While the first two are not usable in a default state, they're not meant to ever be created in a *partially constructed* state. Either they're placeholders that are empty and useless, or they're fully assigned and usable.
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
Trouble ensues, just as it did in the EventLog example.
Each operation which is valid for the ofstream and iterator is... assignment? I note that neither of them has a partial constructor - which is still the core of the original question, yes?

Or perhaps I misread your original poll question? I thought it was about partial construction, not about objects which have an unusable default state. If instead it was "how do you feel about objects which have an unusable default state?", then I change my answer to a shrug. After all, I don't use a char* unless I have assigned it, and if I can't put off its declaration and can't assign it, then I assign 0 to it.

But that's not the same as the antipattern of partial construction. My opinions, as ever, may be ignored at your leisure. They usually are. :) However, they were given in response to your question and not offered out of the blue.

regards,
Brian

I'm suspicious of a design for a EventLog that seems to require a stream to be useful, yet still allows a EventLog to be created without one, but this seems to be contrary to the advice of a book on how to design an application framework from the people who are responsible for designing one of the most used APIs in existence (.NET). I respect the people in that position, and for those who care about Amazon ratings, it's been well received there (http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756/sr=8-1/qid=1158206783/ref=pd_bbs_1/104-6620534-4167149?ie=UTF8&s=books).
Proof by popularity? [The #1 book since Gutenberg has always been The Bible.]
Couldn't agree with you more here...
But we weren't talking about uninitialized objects, we were talking about partial construction, which is a different matter. While the first two are not usable in a default state, they're not meant to ever be created in a *partially constructed* state. Either they're placeholders that are empty and useless, or they're fully assigned and usable.
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
Trouble ensues, just as it did in the EventLog example.
Each operation which is valid for the ofstream and iterator is... assignment? I note that neither of them has a partial constructor - which is still the core of the original question, yes?
Or perhaps I misread your original poll question? I thought it was about partial construction, not about objects which have an unusable default state.
If instead it was "how do you feel about objects which have an unusable default state?", then I change my answer to a shrug. After all, I don't use char* unless I have assigned it, and if I can't put off its declaration and can't assign it then I assign to 0.
But that's not the same as the antipattern of partial construction.
But I don't agree with you here about the difference between "objects with uninitialized state" and "partial construction". I believe they are the same. As long as the constructor doesn't leave the object in a 100% constructed state, it doesn't matter if it's 0% or 50%. It's not 100% either way. So I think the default constructors for std::ofstream and std::container::iterator are a defect in the standard (which is unfortunately probably too late to fix).

If an uninitialized state is desired (which is sometimes the case, no doubt about it) then we can all thank God (and Fernando) for Boost.Optional. The usage of optional<> is good for two reasons:

1. It formalizes the uninitialized state, making its existence obvious both to the reader of the code (who doesn't have to look at the documentation to see there is such a state), and to the compiler, which can check the code for some bugs.

2. It allows having the uninitialized state only where desired - on a per-instance basis, rather than on a per-class basis. This means that the class itself can remain clean of the uninitialized state for any usage which doesn't need it (and would therefore suffer from its unwelcome existence).

You might say something like "but the standard doesn't have optional, so it had to resort to other means". Maybe so, but that doesn't make those default constructors not-a-defect; it only makes the lack of std::optional an additional defect alongside those default constructors.

Just my opinion...

Yuval

"Yuval Ronen"
Or perhaps I misread your original poll question? I thought it was about partial construction, not about objects which have an unusable default state.
If instead it was "how do you feel about objects which have an unusable default state?", then I change my answer to a shrug. After all, I don't use char* unless I have assigned it, and if I can't put off its declaration and can't assign it then I assign to 0.
But that's not the same as the antipattern of partial construction.
But I don't agree with you here about the difference between "objects with uninitialized state" and "partial construction". I believe they are the same. As long as the constructor doesn't leave the object in a 100% constructed state, it doesn't matter if it's 0% or 50%. It's not 100% either way. So I think the default constructors for std::ofstream and std::container::iterator are a defect in the standard (which is unfortunately probably too late to fix).
I agree wholeheartedly.
If an uninitialized state is desired (which is sometimes the case, no doubt about it) then we can all thank God (and Fernando) for Boost.Optional. The usage of optional<> is good for two reasons: ...
You might say something like "but the standard doesn't have optional, so it had to resort to other means". Maybe so, but that doesn't make those default constructors not-a-defect; it only makes the lack of std::optional an additional defect alongside those default constructors.
The Optional (zero-or-one) concept, also modeled by pointers (albeit with some performance implications due to heap allocation and indirection), has been available for some time, has it not? So even non-boosted C++ has not had a 'need' for partial or 2-phase construction. I think std::ofstream, the notions proposed in the book Scott referenced, and in fact most of MFC are all the result of misguided early optimization. Jeff

"Scott Meyers"
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs;
It's funny how this example is the one always chosen by the proponents of 2 phase construction. Search the newsgroups and you'll find this constructor is the root of many problems encountered by users of streams attempting to operate on an insufficiently initialized stream object.
std::vector<int>::iterator i;
I know, I hate this one too. ;)
std::string s;
This is a properly initialized empty string, analogous to an empty vector.
In each case, there are a few operations that can legitimately be performed on such objects, but many operations lead to UB. Is this fundamentally different from the EventLog example? For example, replace EventLog in my example with ofstream, and you have
std::ofstream ofs; ofs << "Hello World";
What better argument could there be against 2 phase construction? Jeff Flinn

Scott Meyers wrote:
I'll note that C++ itself allows "uninitialized" objects with constructors to be created all the time:
std::ofstream ofs; std::vector<int>::iterator i;
The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object. That allows me to use the easier operator[] syntax instead of using insert and find. However, I've begun to view that as a defect in the interface of std::map rather than a need to have default constructible objects even when they are not usable. David

David Walthall
The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object.
No standard containers have that requirement, though.
That allows me to use the easier operator[] syntax instead of using insert and find. However, I've begun to view that as a defect in the interface of std::map rather than a need to have default constructible objects even when they are not usable.
Maybe it's a defect in your desire to avoid find/insert. :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

"David Abrahams"
David Walthall
writes: The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object.
No standard containers have that requirement, though.
This seems to be another defect, as the standard also defines the following constructor:

explicit vector(size_type n, const T& value = T(), const Allocator& = Allocator());

Now how is the following code supposed to compile:

#include <vector>

struct X
{
    X( int ) {}
};

void f()
{
    // Error: No default constructor for X.
    std::vector<X> v( 5 );
}

-- Matthias Hofmann Anvil-Soft, CEO http://www.anvil-soft.com - The Creators of Toilet Tycoon http://www.anvil-soft.de - Die Macher des Klomanagers

"Matthias Hofmann"
"David Abrahams"
wrote in message news:87bqphhxjl.fsf@pereiro.luannocracy.com... David Walthall
writes: The only time I have found it useful to have default-constructed objects that are otherwise unusable is when I need to store them in a map or other container that requires a default constructible object.
No standard containers have that requirement, though.
This seems to be another defect, as the standard also defines the following constructor:
explicit vector(size_type n, const T& value = T(), const Allocator& = Allocator());
Now how is the following code supposed to compile:
#include <vector>
struct X { X( int ) {} };
void f()
{
    // Error: No default constructor for X.
    std::vector<X> v( 5 );
}
That's not a defect; it's simply _not_ supposed to compile! -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
But then on the third hand I get mail like this:
The .NET libraries have many objects with many constructors that leave the constructed object in a not ready-to-use state. snip...
Regarding .NET and design: It is erroneous to take the .NET libraries as a general indication of C++ or OOP library design. I am sure you are aware that in .NET classes:

1) All data is zero-initialized before construction.
2) Overridden virtual functions can be called on an object BEFORE the object's constructor initialization code is run and AFTER the object's destructor code is run.
3) Default parameters for any member function, including constructors, are not allowed.

but if you are not, I think you can see why this model, which is not the C++ model and was taken, I believe, from Anders Hejlsberg's work with Delphi, influences ideas about construction of objects in .NET.

Furthermore .NET, and other component-oriented APIs, are heavily influenced, for the good I believe, by the ideas of "properties" and "events", both of which have been largely absent from C++ thinking. "Properties" are syntactic sugar for getters and setters of member data, and "events" have been implemented very nicely in Boost by the Signals library.

Without a visual design environment for setting up "property" values and "event" handlers (which .NET does have, BTW, with Visual Studio's designers), often an end user must instantiate an object, set the appropriate "properties" and "event" handlers, and only then can he use the functionality of that object. This leads to the alternative idea you have encountered of allowing default constructors which leave the object in a basically unusable state until the correct "properties" and "event" handlers have been set up.

OTOH, if the "properties" and "events" can be set up using a visual designer, often there is no need for anything but a default constructor, since no data needs to be passed to the constructor to set up an object for use.

Scott Meyers wrote :
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
You can construct safe SQL queries with streams or printf-like syntax easily:

sql << "select first_name, last_name, date_of_birth "
       "from persons where id = " << id

No need to put objects everywhere that complicate everything.

On 9/12/06, loufoque
Scott Meyers wrote :
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
You can construct safe SQL queries with streams or printf-like syntax easily
id = "2 ; delete from persons ;"

sql << "select first_name, last_name, date_of_birth "
       "from persons where id = " << id

Someone just deleted your persons table. Oops.
_______________________________________________
Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Paul Davis wrote :
id = "2 ; delete from persons ;"
sql << "select first_name, last_name, date_of_birth " "from persons where id = " << id
Someone just deleted your persons table. Oops.
In my example sql was supposed to be a special stream type overloaded to escape types correctly. I thought SOCI worked that way, but in fact it seems it is not the case. You could do this, though:

std::string name;
sql << "select phone from phonebook where name = :name", use(name);

I hope I summarized your pros and cons correctly here:

1) 'complete' constructors show the user what to do to use the class.
2) 'incomplete' constructors are easier to use, because they incur less (so-called) overhead.
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
3) Because of 2 and because (many) of the arguments are 'irrelevant', 'incomplete' constructors are easier in testing, because you get your test-object faster.
4) 'incomplete' constructors are easier when exploring classes.
5) .NET Framework designers know what they're doing.

My opinion is that 'complete' constructors are better than 'incomplete'. I read that most of you feel the same. I also am a big fan of having sufficient overloads to instantiate a class from different perspectives (i.e. with different arguments). Usually a constructor with many default arguments works very conveniently for me. A constructor should only set what is required to get the desired behaviour. Therefore, there *is* no overhead: all arguments are NECESSARY; if not, additional constructor overloads can be created. This makes items 2 and 3 IMHO less important.

The use of interfaces may greatly ease the construction of test-stubs. In C++, I of course mean multiple inheritance with abstract 'interface' base-classes.

As for item 1, I believe that it's very important to tell the user of a class what is expected of the user and what the class provides. One can do this in comments (XML comments in .NET are great for that, I think), but if given the choice I prefer 'self-documenting' compilable code, like argument names and types.

I'm not going to comment on item 5 :-)

In short, I think the advantages of 'complete' constructors outweigh the disadvantages, and the disadvantages can be reduced by proper design and interfaces.

my 2 cents,
agb

# usenet@aristeia.com / 2006-09-11 22:08:33 -0700:
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform.
My take on (2) is that an object which provides operations that don't depend on its state does too much. IOW such methods should be static or another object's responsibility. There /are/ valid cases of optional behavior, looks like that's not what you're talking about.
In such cases, offering a default constructor in addition to the above would make the class potentially easier to test.
As would moving the dissociated behavior elsewhere, no?
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
What does "play around with them" mean? I mean, if the constructor argument is essential for the object, the game is reduced to a list of exceptions thrown from the objects' methods. If the argument isn't essential for an instance's behavior, the presence of the unrelated behavior in the class is suspicious.
My gut instinct is not to have much sympathy for this argument,
Rhymes with "Me no like manual me want buttons me be pushing!" :)
but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
I'd like to see the rationale. What's the benefit? Constructors provide a universally understood, statically enforced, syntactically lightweight alternative to having no or parameterless constructors and most/all method bodies preceded with:

if (!initialized_) { throw new exception("This object is broken."); }

That's somewhat like OOP in C, and FMPOV a huge step back. I don't really see how this can benefit the library author, who needs to write all those if (!initialized_) checks, or the users (including testers), who need to ask themselves (and often dissect the code, the developers having lost track of it long ago): "What combinations of constructor arguments enable individual behaviors?" This hints at the core problem, which is the explosion in the number of states such a program contains, and *that* hampers testability. This is especially damaging in dynamic languages (or with unchecked exceptions).
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
I haven't read the "Framework Design Guidelines" document, does it contain any arguments in support of this guideline?
But then on the third hand I get mail like this:
However, the SqlParameter class has six constructors. Only two constructors create a SqlParameter object that can be immediately used. The others all require that you set additional properties (of course, which additional properties is unclear). Failure to prepare the SqlParameter object correctly typically generates an unhelpful database error when the SQL statement is executed.
Exactly my gripe.
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
Defunct constructors are IMO a code stink. They're a way to rape OOP (procedural code written in terms of classes is ugly and stiff). Surely they ease testing once you lump two or more classes into one, but they shouldn't be tolerated because such siamese twin classes are a basic design mistake (tightly coupled classes) driven to the extreme.
So, library users, what do you prefer, and why?
I prefer using code that's easy to understand, and that means code that has as few (nonproductive) states as possible. I dislike defunct constructors as a library user, author, and post-factum unit test author alike, because in all those roles I need to keep track of things the compiler / interpreter would cover for me. The only situation in which I would like classes with defunct constructors is if I wanted to ridicule their author by using unit tests to demonstrate how easy it is to produce a pile of bugs using that approach. -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991

Roman Neuhauser wrote:
I'd like to see the rationale. What's the benefit?
This is excerpted from pp. 19-21, where I'm not bothering to include ellipses to show where I've elided information. So this is a set of their words (mostly full sentences) knitted together to try to show their arguments, but it doesn't show all their text, so for the full story, you need to consult the book.

[Begin pseudoquote]

Many developers expect to learn the basics of a new framework very quickly, by experimenting with the framework on an ad hoc basis. The initial encounter with a badly designed API can leave a lasting impression of complexity and discourage some from using the framework. This is why it is very important for frameworks to provide a very low barrier for developers who just want to experiment. Many developers experiment with an API to discover what it does and then adjust their code to get their program to do what they really want.

There are several requirements that APIs must meet to be easy to experiment with:

- It has to be easy to start using an API, regardless of whether it does what the developer wants it to do. A framework that requires an extensive initialization or instantiating several types and hooking them together is not easy to experiment with.

- It has to be easy to find and fix mistakes resulting from incorrect usage of an API. For example, APIs should throw exceptions clearly describing what needs to be done to fix the problems.

Scott

Scott Meyers
- It has to be easy to find and fix mistakes resulting from incorrect usage of an API. For example, APIs should throw exceptions clearly describing what needs to be done to fix the problems.
It's always better to detect the problems at compile-time, at least in the world of statically-checked languages. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers
- It has to be easy to start using an API, regardless of whether it does what the developer wants it to do. A framework that requires an extensive initialization or instantiating several types and hooking them together is not easy to experiment with.
How hard is it to set up a new stringstream for testing a component? Or, heck, a file stream? -- Dave Abrahams Boost Consulting www.boost-consulting.com

"Scott Meyers"
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
I think the biggest argument for compile-time error detection has always been something like:

EventLog log;
if (some runtime condition)
    log.WriteEntry("Hello World");

Then, this "some runtime condition" may never be tested, and the code will throw in production.

It seems to me that the worlds of OO and C++ split some time ago, and are moving more and more apart. The new OO books now use Java and C# for their examples. OTOH, most of what's going on in the C++ world now is related to language-specific features. I got the impression that people from objectmentor are ready to substitute compile-time error detection with testing. Of course they are familiar with the argument above, so for them it's probably not a big deal.

I think neither approach is clearly superior; both have benefits and drawbacks, and it depends on each particular user and task which one would work better. But in general, I would vote for complete construction with default arguments and/or overloaded constructors for simplicity.

Regards,
Arkadiy

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this at on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. <snip>
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. <snip>
It seems to me that if I am the library developer, then I will care about the ease of testing individual objects in isolation. However, if I am a library user, I really don't care about testing individual objects. (The library will either come with a test suite that I can run, or I will trust it [at least initially] because of the source.) I want to use the library, and I want the public API to make it as hard as possible for me to make silly mistakes.

I have been in the situation where testing required a great amount of infrastructure. It's hard to deal with, but I don't think that providing dumbed-down object constructors is a very good idea because:

1) You can generally factor out the portion of your class that can be tested without the infrastructure overhead, either through inheritance or aggregation. Then testing that part is still *easy*.

2) If you're serious about testing, you're going to need all the infrastructure anyway. If the object requires an ostream to do its job, then you need to figure out how to give it an ostream.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument, but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used. In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
Okay, so I can construct a useless object and get an exception thrown if I use it. Why is this useful to me? If I go ahead and write my code as if the object were properly constructed, then my code won't run. If I add a bunch of fake code to catch the exceptions, then I've wasted a lot of time and effort writing useless code.

In a case like the example, where I know it might be hard to set up the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build. That way I could defer figuring out how to provide the real constructor arguments until I really needed the object to do what it normally does. (Sometimes you really do just want to get on with writing your code, knowing that you need to come back to solve these sorts of problems.) But if I forget to do that and never correctly construct the object, then my release build should throw in the constructor.

So my code would look like this:

EventLog log (true); // FIXME_rush - Use correct constructor!
log.WriteEntry ("Hello World"); // Just asserts the ptr arg

and the relevant constructor would look like this:

EventLog::EventLog (bool dummyarg)
{
    // This constructor sets us up in simulated success mode.
#ifdef DEBUG // or whatever you use
    m_simulateSuccess = true;
#else
    throw something useful
#endif
}

and all the other constructors initialize m_simulateSuccess to false. Lastly, WriteEntry looks like this:

void EventLog::WriteEntry (char const * const pLine)
{
    assert (pLine);
    if (! m_simulateSuccess)
    {
        // The real code is here
    }
}

Now that I've written all that and read it over, the default constructor could be the one that sets up simulated success mode, since it's really invalid for properly constructing the object. I'll leave that as an exercise for the reader. ;-)

Just one guy's opinion.

Best regards,
Rush

Rush Manbert wrote:
In a case like the example, where I know it might be hard to setup the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build.
I'm glad you mentioned this, because my recent knowledge of unit testing comes from books like Beck's "Test-Driven Development" and Feathers' "Working Effectively with Legacy Code" as well as countless breathless articles lauding the wonder of unit testing and showing how it's all oh-so-simple. My feeling is that it would often be convenient to have a special "test" interface, but I've never seen any discussion of doing that. I can imagine a couple of ways of doing this, one being to literally have a different interface for debug builds, another to have "test only" interface elements cordoned off somewhere by convention (similar to Boost's use of namespaces named "detail").
EventLog::EventLog (bool dummyarg)
{
    // This constructor sets us up in simulated success mode.
#ifdef DEBUG // or whatever you use
    m_simulateSuccess = true;
#else
    throw something useful
#endif
}
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,

class EventLog {
public:
#ifdef DEBUG
    EventLog(bool dummyarg);
#endif
    ...
};

Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.

Scott

Most large software systems have test event drivers that allow some functions/interfaces to be called only in debug environments, and in the code we have constructs like

DBG_ASSERT()
regards
divyank
--- Scott Meyers
Rush Manbert wrote:
In a case like the example, where I know it might be hard to set up the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build.
I'm glad you mentioned this, because my recent knowledge of unit testing comes from books like Beck's "Test-Driven Development" and Feathers' "Working Effectively with Legacy Code" as well as countless breathless articles lauding the wonder of unit testing and showing how it's all oh-so-simple. My feeling is that it would often be convenient to have a special "test" interface, but I've never seen any discussion of doing that. I can imagine a couple of ways of doing this, one being to literally have a different interface for debug builds, another to have "test only" interface elements cordoned off somewhere by convention (similar to Boost's use of namespaces named "detail").
EventLog::EventLog (bool dummyarg) { // This constructor sets us up in simulated success mode. #ifdef DEBUG // or whatever you use m_simulateSuccess = true; #else throw something useful #endif }
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,
class EventLog { public: #ifdef DEBUG EventLog(bool dummyarg); #endif
Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.
Scott
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Scott Meyers
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,
class EventLog { public: #ifdef DEBUG EventLog(bool dummyarg); #endif
Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.
I'd much rather develop a library of mock objects for testing. Okay, a conforming mock stream may be a little work to write, but you write it once and you're done.

-- Dave Abrahams
Boost Consulting
www.boost-consulting.com

David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.

My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:

class BigHonkinHairyClass {
    ...                          // expensive to construct
};

and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be

void f(BigHonkinHairyClass& bhhc);   // maybe const, doesn't matter

But now it's hard to drop in a mock. So I assume you'd modify the interface to be either

template<typename T>
void f(T& bhhc);                     // T must more or less model
                                     // BigHonkinHairyClass

or this:

class BigHonkinHairyBase {
    ...                              // interface to program against --
};                                   // uses virtuals

class BigHonkinHairyClass: public BigHonkinHairyBase { ... };

class MockHonkinHairyClass: public BigHonkinHairyBase { ... };

void f(BigHonkinHairyBase& bhhc);

Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?

Thanks,

Scott

On Thursday, September 14, 2006 at 17:21:17 (-0700) Scott Meyers writes:
... My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
That all depends: I think you reach a point where intrusive testing is rather costly (construction costs, etc.) and external testing (Tcl-based expect scripts, perhaps) is a better approach. This is essentially the approach I have taken over the years. At some point the unit tests become hopelessly complex, mock objects begin to weigh development down (if you change the interface, you have to change your mock objects), and a transition is made to higher-level integrated testing. However, I think your points above are essentially correct.

Bill

Bill Lear wrote:
This is essentially the approach I have taken over the years. At some point, the unit tests become hopelessly complex, Mock objects begin to weigh development down (if you change the interface, you have to change your mock objects) and a transition is made to higher-level integrated testing.
I assume you always need integrated testing in addition to unit tests, but it sounds like you're saying that at some point, maintaining the unit test framework stops paying for itself, so you abandon it. Is that correct? Scott

On Thursday, September 14, 2006 at 20:52:09 (-0700) Scott Meyers writes:
Bill Lear wrote:
This is essentially the approach I have taken over the years. At some point, the unit tests become hopelessly complex, Mock objects begin to weigh development down (if you change the interface, you have to change your mock objects) and a transition is made to higher-level integrated testing.
I assume you always need integrated testing in addition to unit tests, but it sounds like you're saying that at some point, maintaining the unit test framework stops paying for itself, so you abandon it. Is that correct?
No, just that at some point of complexity, we decide to switch some of the tests "upward". So, let's say you have a project divided into a hierarchy:

A
B
C

where A is the lowest level of complexity (basic library code), B the middle layer, and C the upper layer. We might have 95% unit test coverage at A, 85% at B, and 75% at C, with the remainder "covered" by non-unit, externally-driven tests.

Bill

Scott Meyers wrote:
David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.
My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
That's about what I've found best when building well-tested code, and fortunately it usually also leads to cleaner designs with little coupling that compile nicely and quickly, even given current slow C++ compilers.

-- James

On 9/14/06, Scott Meyers
My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
I design interfaces like that, but not for testing - it's called the Inversion Principle. How often do you really need a BigHonkinHairyClass? Often, you really just need a subset of its interface. So define that sub-interface. Sometimes use templates, sometimes inheritance, etc. Maybe that doesn't apply for every function f() (i.e., a different interface for each function!?), but I thought the Inversion Principle would be worth mentioning; YMMV.

Tony

Scott Meyers wrote:
David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.
My understanding is that the ability to drop in mock objects requires programming to an interface that can be either a "real" object or a mock object, which in turn suggests using either a template parameter or a base class. Suppose, for example, you have this:
class BigHonkinHairyClass {
    ...                          // expensive to construct
};
and you want to implement some function f that takes a BigHonkinHairyClass as a parameter. The straightforward declaration would be
void f(BigHonkinHairyClass& bhhc); // maybe const, doesn't matter
But now it's hard to drop in a mock. So I assume you'd modify the interface to be either
template<typename T>
void f(T& bhhc);                 // T must more or less model
                                 // BigHonkinHairyClass
or this:
class BigHonkinHairyBase {
    ...                          // interface to program against --
};                               // uses virtuals
class BigHonkinHairyClass: public BigHonkinHairyBase { ... };
class MockHonkinHairyClass: public BigHonkinHairyBase { ... };
void f(BigHonkinHairyBase& bhhc);
Is that correct? In other words, you'd come up with an interface that let clients do what they wanted but that was also mock-friendly for testing purposes? That is, you'd take testability into account when designing your client interface?
This situation should be rare in well-designed code. Either f requires a BigHonkinHairyClass in order to work - by this I mean the precise semantics of BHHC - and it has to be tested using a BHHC; you can't substitute a mock that emulates a BHHC perfectly, because it will be a BHHC itself. Or f doesn't really require a BHHC, and it should be rewritten in one of the two ways above. This allows client A to pass a BHHC and client B to pass something else. From the library design PoV, it doesn't matter whether client B is a test suite or just another module.

It is true that in many projects the lower layers aren't designed as a proper library, since they don't have to serve arbitrary client code. In such a case, having a test suite as a second client can indeed lead to problems like the above. :-)

Another angle is that tests should test the behavior that is exercised by the application. If the application uses f with a BHHC, the tests should test f with a BHHC. Testing f with a mock can find some errors, but may easily miss others. This can be substituted by defining a rigorous interface for BHHCs and testing both f and BHHC against that interface, of course, which gets us back to one of the two refactorings given above. (But the f + actual BHHC test should still be part of the suite, IMO.)

"Peter Dimov"
Scott Meyers wrote:
David Abrahams wrote:
I'd much rather develop a library of mock objects for testing. Okay a conforming mock stream may be a little work to write, but you write it once and you're done.
I take this to mean that you have not found it necessary/desirable/useful to create separate test and client interfaces. Okay.
<snip>
This situation should be rare in well-designed code. Either f requires a <snip again>
What he said :) -- Dave Abrahams Boost Consulting www.boost-consulting.com

Scott Meyers wrote:
Rush Manbert wrote:
In a case like the example, where I know it might be hard to set up the object correctly, I'd rather have a constructor that took some bogus argument type that set the object up in "simulated success" mode, but only in my debug build.
I'm glad you mentioned this, because my recent knowledge of unit testing comes from books like Beck's "Test-Driven Development" and Feathers' "Working Effectively with Legacy Code" as well as countless breathless articles lauding the wonder of unit testing and showing how it's all oh-so-simple.
LOL - "oh-so-simple" breaks down pretty quickly in any large system, doesn't it? My feeling is that it would often be convenient to have a
special "test" interface, but I've never seen any discussion of doing that. I can imagine a couple of ways of doing this, one being to literally have a different interface for debug builds, another to have "test only" interface elements cordoned off somewhere by convention (similar to Boost's use of namespaces named "detail").
I have done this in the past. Here's a real-world example: The application was for a storage virtualization controller. There were two separate processors running different software. One handled the virtualization and client services, while the other actually knew how to do I/O to the storage devices. My subsystem was a service that did copies of various flavors (entire logical units, etc.). Since it couldn't actually do any I/O, it sent messages to the other processor that created, started, stopped, etc. a copy utility task. In the real case, the utility would send events to my service to let it know how many blocks had been copied, if an error occurred, etc. There could be many instances of copy processes/utilities running concurrently and independently.

I had to test the top-level service without any support from the copy utility task on the other processor. (You may wonder why. Let's just say that there were two separate groups that developed software for the different processors, and we had somewhat different ideas regarding testability. My group also had a version of our code that ran on Windows, and I needed to be able to test it there.) This meant that I needed to simulate the event stream from the utility task. I also had to be able to force errors (again by simulating events).

Needless to say, this required a fairly extensive testing API, plus a notion within the objects that implemented the service that it could be running in "test mode". I think it took as much or more work to develop the test interface and the test drivers as it took to develop the service itself, but it was completely worth it. Especially when there was a problem and we needed to sort out whether it was "our" code or "their" code. ;-)

Also, since I had this capability, the Windows version of the service could be driven by our management UI, and the copy processes would appear to make progress, stop, start, and complete. The management UI couldn't tell that they were "fake", so that group could use it to test their software. It was really cool, but the reason it was even possible was that a testing subsystem was built into our code, we were required to have complete tests for our subsystems, and I had to consider how to test my subsystem from day one.

In fact, our system shipped with the test subsystem included. It was not readily accessible, of course, but was really useful in some cases where we needed to test something on an installed system. This sort of capability in the field can be a real saving grace in an embedded system.
EventLog::EventLog(bool dummyarg)
{
    // This constructor sets us up in simulated success mode.
#ifdef DEBUG   // or whatever you use
    m_simulateSuccess = true;
#else
    throw std::logic_error("test-only constructor");   // something useful
#endif
}
Hmmm, I'd think this entire constructor would exist only in debug builds, e.g.,
class EventLog {
public:
#ifdef DEBUG
    EventLog(bool dummyarg);
#endif
    ...
};
Unfortunately, conditional compilation is not without its own resultant spaghetti code and concomitant maintenance headaches.
Oh, sure. Get the error at compile time if you're going to go this way. The conditional compilation stuff is a definite problem to be avoided if possible. It's hard to avoid if you want to implement a really comprehensive test interface, but does seem less desirable in order to support "exploratory" programming (which I think was the original context here). - Rush

Rush Manbert wrote:
Needless to say, this required a fairly extensive testing API, plus a notion within the objects that implemented the service that it could be running in "test mode".
Was the ability to put an object into test mode part of the client-visible API, or did you somehow have a testing API that clients could not access?
In fact, our system shipped with the test subsystem included. It was not readily accessible, of course, but was really useful in some cases where we needed to test something on an installed system. This sort of capability in the field can be a real saving grace in an embedded system.
This makes it sound like the testing API was in fact visible to clients, but, by convention, it was never used for non-test apps. Is this correct? If so, that's different from, for example, test-only APIs that exist only for debug builds, i.e., a truly separate API that non-test clients can't get at. (I'm not arguing that such an approach is better than what you did, I'm simply trying to understand what you did.) To clarify my interest, I recently sent this to somebody who sent me private mail on this thread:
My interest here is not in testing, it's in good design, and good designs facilitate testing. Which means we need to be able to describe how testability affects other design desiderata, such as compile-time error detection, encapsulation, and overly general interfaces. There is a ton of recent literature on testing and testability, but virtually none of it addresses how making something more testable may be in tension with other characteristics we'd like. This thread in Boost is part of my attempt to figure out how the various pieces of the puzzle fit together -- or if they do at all.
Scott

Scott Meyers wrote:
Rush Manbert wrote:
Needless to say, this required a fairly extensive testing API, plus a notion within the objects that implemented the service that it could be running in "test mode".
Was the ability to put an object into test mode part of the client-visible API, or did you somehow have a testing API that clients could not access?
In this case, the only way to interact with the service was through a message interface. In essence, the service was passed a message object. It was also an embedded system, so we were not only the service provider, but we also wrote the client. (I realize that we have strayed from the original question of library design. I'm happy to take this off list if anyone is offended.)

The messages that could put the object in test mode were part of the client-visible API. Additionally, the service always operated in test mode when it was built as part of the Windows executable, because then there was no choice but to simulate the events that were normally generated by the lower-level code. This is why the management UI (our client) always thought that a copy process was real. It couldn't tell the difference between the real thing and the test mode behavior.
In fact, our system shipped with the test subsystem included. It was not readily accessible, of course, but was really useful in some cases where we needed to test something on an installed system. This sort of capability in the field can be a real saving grace in an embedded system.
This makes it sound like the testing API was in fact visible to clients, but, by convention, it was never used for non-test apps. Is this correct? If so, that's different from, for example, test-only APIs that exist only for debug builds, i.e., a truly separate API that non-test clients can't get at. (I'm not arguing that such an approach is better than what you did, I'm simply trying to understand what you did.)
To clarify my interest, I recently sent this to somebody who sent me private mail on this thread:
My interest here is not in testing, it's in good design, and good designs facilitate testing. Which means we need to be able to describe how testability affects other design desiderata, such as compile-time error detection, encapsulation, and overly general interfaces. There is a ton of recent literature on testing and testability, but virtually none of it addresses how making something more testable may be in tension with other characteristics we'd like. This thread in Boost is part of my attempt to figure out how the various pieces of the puzzle fit together -- or if they do at all.
I'm not sure how applicable this is to the design of a library such as Boost, but my experience has been that having test interfaces available at the subsystem level in the release code version is a very useful thing, even if they are visible to clients. You need to be careful with this, and you really need to protect clients from getting into test mode accidentally, but when you are debugging a large, complex system that may have very limited debugging capabilities (I spent 25 years developing embedded systems, so that's where this viewpoint comes from) it can be very, very useful to be able to isolate your subsystems. You can usually only do that if they have been designed so that they can be tested in isolation.

I also believe that you often need these sorts of interfaces so that you can force error conditions, especially in a heavily layered system. This lets me test that my subsystem handles errors correctly, but it also allows me to test the error propagation paths out of my subsystem and into the layers above it.

I made a little Xcode project that illustrates one way you could approach making objects that have a test mode. I have attached the code and header files to this email. In this case, MyObject has two constructors, each of which takes an initialization object as an argument. One of them takes a "normal" initializer object, while the other takes a "test" initializer. The object is in test mode if you construct it with the second form. There is also a public method that can force a test mode behavior, but only for a test mode object.

So my test API is visible to clients. However, if I don't distribute MyObjectInitializerForTest.h, then clients cannot construct a test initializer object, and therefore can't put the object in test mode.
Of course, they can see how MyObjectInitializer was declared, so they could figure out how to declare and define a MyObjectInitializerForTest object, but it seems to me that the barrier is sufficiently high that there won't be much of that going on. I took this approach in order to mimic the original case that I described. In that case, the initializer objects were the "normal" or "test" messages. I know that there are slicker ways to do this sort of thing, but this illustrates the basic idea.

- Rush

/*
 * MyObjectInitializerForTest.h
 */
#ifndef MyObjectInitializerForTest_H
#define MyObjectInitializerForTest_H

#include "MyObjectInitializerBase.h"

class MyObjectInitializerForTest : public MyObjectInitializerBase {
public:
    MyObjectInitializerForTest(int a, bool b) : MyObjectInitializerBase(a, b) {}
    ~MyObjectInitializerForTest() {}
};

#endif // MyObjectInitializerForTest_H

/*
 * MyObject.cpp
 */
#include <iostream>

#include "MyObject.h"
#include "MyObjectInitializerForTest.h"

MyObject::MyObject(MyObjectInitializer const initializer)
  : m_testMode(false)
  , m_a(initializer.m_a)
  , m_b(initializer.m_b)
{
    testingMemberDataInit();
}

MyObject::MyObject(MyObjectInitializerForTest const initializer)
  : m_testMode(true)
  , m_a(initializer.m_a)
  , m_b(initializer.m_b)
{
    testingMemberDataInit();
}

MyObject::~MyObject()
{
}

bool MyObject::methodWithTestModeBehavior()
{
    if (m_testMode) {
        std::cout << "MyObject::methodWithTestModeBehavior: test mode is enabled!";
        if (m_doForceResultOfCallToMethodWithTestModeBehavior) {
            // The return value for this call is forced this time only
            std::cout << " Forced return value: "
                      << (m_forcedResultOfCallToMethodWithTestModeBehavior ? "true" : "false")
                      << "\n";
            m_doForceResultOfCallToMethodWithTestModeBehavior = false;
            return m_forcedResultOfCallToMethodWithTestModeBehavior;
        }
        std::cout << "\n";
    } else {
        std::cout << "MyObject::methodWithTestModeBehavior: test mode is DISABLED.\n";
    }
    // Normal return here - whatever m_b contains
    return m_b;
}

void MyObject::forTestingSetResultOfNextCallToMethodWithTestBehavior(bool result)
{
    if (m_testMode) {
        m_doForceResultOfCallToMethodWithTestModeBehavior = true;
        m_forcedResultOfCallToMethodWithTestModeBehavior = result;
    } else {
        std::cout << "MyObject::forTestingSetResultOfNextCallToMethodWithTestBehavior: "
                     "Ignored in normal mode!\n";
    }
}

void MyObject::testingMemberDataInit()
{
    m_doForceResultOfCallToMethodWithTestModeBehavior = false;
    m_forcedResultOfCallToMethodWithTestModeBehavior = false;
}

#include <iostream>

#include "MyObject.h"
#include "MyObjectInitializerForTest.h"

int main(int argc, char* const argv[])
{
    std::cout << "Hello, World!\n\n";

    MyObject normalModeObj(MyObjectInitializer(1, true));
    MyObject testModeObject(MyObjectInitializerForTest(2, false));

    bool result;
    result = normalModeObj.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    result = testModeObject.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    // Try to force the next return value on normalModeObj (fails)
    normalModeObj.forTestingSetResultOfNextCallToMethodWithTestBehavior(false);
    result = normalModeObj.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    // Force the return value on testModeObject for the next call to
    // methodWithTestModeBehavior()
    testModeObject.forTestingSetResultOfNextCallToMethodWithTestBehavior(true);
    result = testModeObject.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    result = testModeObject.methodWithTestModeBehavior();
    std::cout << "Call returned: " << (result ? "true" : "false") << "\n\n";

    return 0;
}

/*
 * MyObjectInitializerBase.h
 */
#ifndef MyObjectInitializerBase_H
#define MyObjectInitializerBase_H

class MyObjectInitializerBase {
public:
    MyObjectInitializerBase(int a, bool b) : m_a(a), m_b(b) {}
    ~MyObjectInitializerBase() {}

    int  m_a;
    bool m_b;
};

#endif // MyObjectInitializerBase_H

/*
 * MyObject.h
 */
#ifndef MyObject_H
#define MyObject_H

#include "MyObjectInitializerBase.h"

class MyObjectInitializer : public MyObjectInitializerBase {
public:
    MyObjectInitializer(int a, bool b) : MyObjectInitializerBase(a, b) {}
    ~MyObjectInitializer() {}
};

class MyObjectInitializerForTest;

class MyObject {
public:
    // Normal constructor
    MyObject(MyObjectInitializer const initializer);
    // Test mode constructor
    MyObject(MyObjectInitializerForTest const initializer);
    ~MyObject();

    bool methodWithTestModeBehavior();

    // Testing API
    void forTestingSetResultOfNextCallToMethodWithTestBehavior(bool result);

private:
    void testingMemberDataInit();

    bool m_testMode;
    int  m_a;
    bool m_b;

    // Member data used to control testing
    bool m_doForceResultOfCallToMethodWithTestModeBehavior;
    bool m_forcedResultOfCallToMethodWithTestModeBehavior;
};

#endif // MyObject_H

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this at on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
Very. Back in the mists of programming history, which you certainly remember, on large-scale projects 'uninitialized variables' were a huge source of errors in programs. This is much less true today -- even though it is certainly still possible with the languages we are using. Of course, compilers warn you, so there's no excuse now. Having objects that construct completely is analogous, but maybe even one step up the food chain.
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes.
Exactly. And I believe if you give them this option so it is 'easier to learn' then you will be 'training them' to make mistakes.

The real issue, for me at least, is in being able to eliminate incorrect behavior of a 'partially constructed object' in a 10 million line of code program that I can't possibly understand fully. Let's do a small thought experiment. Suppose a program that has been running in production for years starts malfunctioning. I have a stack trace that shows me where something goes wrong, but it only happens infrequently -- that is, all tests pass and geez, it's been working for years in the field without problems. So I have to go into detective mode to figure out what's happening.

To get from 10 million LOC to, say, 100K LOC I simply look at the stack trace and see where the failure is happening. Then I can start looking at the objects involved and see what possible failure modes are consistent with the program behavior. Now, if any class in the trace supports a default constructor that can lead to exceptions on later access, I have to consider the possibility that the object isn't constructed correctly. This may lead me chasing across thousands of lines of code -- even to different programs if, say, one program generates data and another uses it. If I know that this is impossible because of the class design, it eliminates many failure modes and hence lines of code from consideration.

At the end of the story I believe this is much more important, because one latent bug like this can cost many thousands of dollars to track down.
So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.

In the EventLog case, I would be totally unconcerned about requiring the ostream -- it's an incredibly stable and well tested library. A 'level 0' component in Lakos's lexicon. The issue for testing is more serious when the dependency is on JoesCustomAndEverEvolving class. Here it's a 'problem' since not only do I need to use it, but if it's changing frequently it might break my tests. But depending on the component I'm building it might be unreasonable to build a stand-in -- in fact most of the time I think stubs are a waste. Anyway, I can't think of a case where losing the correctness benefit of complete construction will truly help simplify the overall testing effort.

One last point. Don't forget that you may have made EventLog testing harder, since you now have to add all the tests for the 'incomplete construction' cases to your test suite. And you still have to write tests against the 'full up' scenarios. At least if you are going to perform good coverage...
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them. My gut instinct is not to have much sympathy for this argument,
I have zero sympathy. If you want to build stable and large software systems then you need to get serious about correctness.
but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used. In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
No opinion on the framework, but when I'm 'exploring' I most often want to see how I would use a library to write real production code, because more than likely that's what I'm doing. My 'exploration' is more than likely a few hundred line program to sort out how the interfaces work. If I encountered the code example above, I'd abandon the library as unusable (assuming I had an option). If I couldn't abandon it, I'd write a wrapper with a default stream to initialize, to ensure I didn't make that mistake. And by the way, using defaults or writing initialization methods or classes that provide common defaults is a nice way of making exploring easier. The library can supply these in the example code.
But then on the third hand I get mail like this:
The .NET libraries have many objects with many constructors that leave the constructed object in a not ready-to-use state.
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
However, the SqlParameter class has six constructors. Only two constructors create a SqlParameter object that can be immediately used. The others all require that you set additional properties (of course, which additional properties is unclear). Failure to prepare the SqlParameter object correctly typically generates an unhelpful database error when the SQL statement is executed. To add to the confusion, the first ctor shown by IntelliSense has 10 parameters (which, if set correctly, will instantiate a usable object). The last ctor shown by IntelliSense has only 2 parameters and is the most intuitive choice. The four in between are all half-baked. It's confusing, and even though I use it all the time, I still have to look at code snippets to remember how.
This seems like a failure of design focus to me. If the 'big constructor' can actually detect a failure at the point of construction it should throw an exception then. Having said all this, the SqlParameter class might be an example of a 'GoF Builder' (I didn't look it up so I'm not sure) where the main purpose is to gradually build a more complex object. In which case, I would tend to eliminate all the constructors, making the initial state always 'null' or empty. Then the user would have to call a series of methods to build up the SQL command, and there might be an explicit call to 'validate' once that process is complete. This would be a case, as opposed to EventLogger, where 'full initialization' on construction might just confuse the purpose of the class.
So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW)
I don't agree that they are more loosely coupled, as ultimately you will still need to supply either a stub or the actual class to do something of use -- certainly in the EventLogger case you won't be able to write many tests without setting the I/O stream.
and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
Your original wisdom was correct -- in most cases I want construction to guarantee a complete object. Jeff

Jeff Garland wrote:
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.
That's been my impression. One of the things I've been trying to figure out wrt the whole testing hoopla is how well it translates to large projects and how it has to be adjusted when things move beyond toy examples. And yes, I probably should go back and reread Lakos. Scott

Scott Meyers wrote:
Jeff Garland wrote:
Testing in total isolation is a myth. To be a little softer -- it's really going to depend on the type of class you are testing whether or not it can be rationally tested in isolation. If you haven't lately, you should re-read Lakos's treatment of this subject in Large Scale C++ Software Design. This book is 10 years old, but he breaks down testability in a way I've not seen anyone else do since testing became all the rage. Most of the 'test first' stuff I've seen ignores the inherent untestability of some software.
That's been my impression. One of the things I've been trying to figure out wrt the whole testing hoopla is how well it translates to large projects and how it has to be adjusted when things move beyond toy examples. And yes, I probably should go back and reread Lakos.
Well, the testing hoopla 'applies' to the extent that in my experience big systems that *don't* have significant testing discipline never see the light of day. That is, they fail under an avalanche of integration and basic execution problems before ever being fielded. As an aside, I always get a good laugh out of all the agonizing by various folks over how this and that testing technique that they've *recently discovered* on a 15 person project applies to large systems. Big systems have been using these approaches for years...or they failed.

Now, that's not to say that the level of rigor advised by many of the test-first proponents really happens on big projects either. Is it economical to spend time writing code to check a 'getter'/'setter' interface that will just obviously work? The answer is no. In fact, the testing you can avoid, just like the coding you can avoid, is really a big part of successful big system development.

From my experience the best practice of testing depends on what the code is used for and what else depends on it. If it's a widely used library (say date-time to pick one :) you want it to be very well unit tested, because thousands of LOC will depend on it. Every time you modify it you have to retest a large amount of code. It also turns out to be easy to unit test because it doesn't depend on much. On the other hand, take the case of a user interface which has no other code that depends on it -- my advice is to skip most of the unit and automated tests. For one thing, it's very hard to write useful test code. For another, a human can see in 1 second what a machine can never see (ugly layout, poor interaction usability, etc). Since testing at the 'top level' of the architecture depends on basically all the other software in the system, it tends to change rapidly -- people can quickly adjust to the fact that the widgets moved around on the screen, but test programs tend to be fragile to these sorts of changes.
And finally, since no other code depends on this code it isn't worth the time -- you can change it at will. Bottom line is that not all code is created equal with respect to the need for or ease of testing.

Of course the landscape isn't static either -- some good things have happened. One thing that's really changed is that the test first/XP/Agile folks have managed to convince developers that they actually need to execute their code before they deliver -- a good thing. This often wasn't common practice 10 years ago. Also, developers have more and more pre-tested code to pull off the shelf -- better libraries and less low level code to write and test.

Even with all that, I still say testing isn't enough because I know that even the stuff that's *easy* to test will have gaps. There are literally thousands of Boost date-time tests (2319 'asserts' to be exact) that run in the regression every day, but I don't believe for a minute that the library is bug-free or can't be the source of bugs in other code. As an example of the latter, initially the date class had no default constructor and it is built to guarantee that you can't construct an invalid date. It's also an immutable type, so you can't set parts of a date to make an invalid one (you can assign, but you have to go thru checks to do that). I wanted these properties so that I could pass dates around in interfaces and wouldn't have to 'check' the precondition that a date is valid when I go to use it. All good, except that dates also allow 'not_a_date_time', +infinity, and -infinity as valid values. So if you call date::year() on something that's set to not_a_date_time the results are undefined. Now it's trivial to write some 'incorrect' code and a bunch of tests that will always work:

void f(const date& d) {
  int year = d.year(); // oops....fails in some cases
}

should really always be:

void f(const date& d) {
  if (d.is_special()) {
    // do something here
  } else {
    int year = d.year();
  }
}

So going back to the default constructor, I eventually added one that constructs to not_a_date_time after many users requested it. Mostly for use in collections that need this. A very logical choice for default, but my worry all along was that people would make the mistake above. That is, now instead of being forced to think about putting in some sort of correct date value or using not_a_date_time explicitly:

date d(not_a_date_time);

they can just say

date d;

Aside from the obvious loss of readability, I worried that with just these few lines of code the correctness of a larger program can be undermined by failing to check the special states. So far, I'm not aware of anyone having an issue with this in a large program, but I'd be shocked if someone didn't create a bug this way eventually. It's trivial to write and test code that always uses 'valid dates', ship it, and everything will work fine. Then one day someone else will unknowingly make a call using a default constructed date and 'boom' -- a function that's been working fine and is fully 'tested' will blow up with unexpected results.

So, is it the right set of design decisions? I don't know, but there's clearly a tension between correctness, 'ease of use', and overall applicability. My take on the EventLogger example is that it's the wrong set of choices. There's very little valid use of the object without the stream. The stream is a low-level stable library that all programmers should know anyway. It's wide open to creating runtime errors that are not localized, and it's a low level library that I would expect to use all over in a program. So I'd want the number of error modes to be as small as possible, because I'm certain they won't be writing code to test all the cases.... Jeff

Jeff Garland
Well, the testing hoopla 'applies' to the extent that in my experience big systems that *don't* have significant testing discipline never see the light of day. That is, they fail under an avalanche of integration and basic execution problems before ever being fielded. As an aside, I always get a
<snip great post> Great post, Jeff! I learned a lot from it; thanks. -- Dave Abrahams Boost Consulting www.boost-consulting.com

# jeff@crystalclearsoftware.com / 2006-09-14 10:57:36 -0700:
Is it economical to spend time writing code to check a 'getter'/'setter' interface that will just obviously work? The answer is no. In fact, the testing you can avoid, just like the coding you can avoid, is really a big part of successful big system development.
I often write such tests because I know I tend to make this kind of mistake (unfinished editing after copy/paste):

struct astruct {
  int x() { return x_; }
  int y() { return y_; }
  int z() { return x_; }
private:
  int x_, y_, z_;
};

Hoping that I'll spot these while checking the code after I add it proved to be hopeless, and these microscope tests pinpoint the problem. Otherwise I need to track the bug from the results of higher level tests, which usually takes (me) more time.

I am just curious: why has nobody considered policy-based design? It is possible to write a class template which accepts an initialization policy, and possibly a default constructor. In this case a call to a static member of the policy class is responsible for providing a valid ostream instance where the output is done. In my opinion this approach would work similarly to the Allocator parameter passed to all STL container types. It is also possible to pass a default ostream policy provider, just as the default allocator is passed. This approach forces users to supply a valid initialization interface, without additional constructor arguments. It can also be hidden by using some typedefs in library header files; the best example for this is std::string. I do not think that the approaches of .NET and Java are appropriate here: C++ relies on static typing, where .NET and Java postpone everything to runtime. With Kind Regards, Ovanes Markarian

Scott Meyers wrote:
So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder, are arguably contrary to exploratory programming, and seem to contradict the advice of the designers of the .NET API. Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW) and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
One-phase construction definitely! Testing in isolation is a different matter and should not degrade the interfaces. There are ways to allow isolated testing without degrading the interface. It all boils down to decoupling and isolating dependencies. My favorites:

1) Use a template where policies can be replaced by hooks to the testing engine that tests the expected results. In your example, I'd imagine this interface: template <class Printer> EventLog.

2) Use callbacks. In the example you provided, I'd imagine EventLog calls logstream to print. So, I'd use a constructor like:

EventLog::EventLog(boost::function

Joel de Guzman
One-phase construction definitely!
I'd like to ask you a question about your parser framework that is related to this. I probably get most of the following wrong, so please be patient. More often than not, a parser constructs objects in a hierarchical manner that reflects the grammar:

print(a + b);

might be expressed as

statement(plus_expression(symbol("a"), symbol("b")))

in some language. The latter objects are better designed without default constructors, in order to guarantee that we never end up with, say, a meaningless symbol() or plus_expression(). When I write a parser by hand, that's straightforward:

boost::optional< plus_expression > parse_plus_expression(input_tokens tokens) { ... }

The function will return boost::none if the expression can't be parsed, and no default plus_expressions will be necessary. With spirit, there appear to be two approaches: 1) semantic actions, which store away the parsed expression in some way, and 2) closures, which implicitly "return" their first member (member1). I don't like 1 for most purposes, because it's too imperative for my taste. I need to keep track of what my actions did and, if something fails to parse at a point and backtracking occurs, need to manually revert the changes. This seems error prone to me. 2 looks better, but, and this is the connection to the OP, it appears that the returned value, as all closure members, must be default constructible. In particular, I risk returning such default constructed values when I failed to assign to them. It's very likely that I missed something, as I haven't seriously tried using spirit. Can you shed some light on this? Jens

Jens Theisen wrote:
Joel de Guzman
writes: One-phase construction definitely!
I'd like to ask you a question about your parser framework that is related to this. I probably get most of the following wrong, so please be patient.
Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general Anyway...
With spirit, there appear to be two approaches:
1) semantic actions, which store away the parsed expression in some way
and
2) closures, which implicitly "return" their first member (member1).
I don't like 1 for most purposes, because it's too imperative for my taste. I need to keep track of what my actions did and, if something fails to parse at a point and backtracking occurs, need to manually revert the changes. This seems error prone to me.
2 looks better, but, and this is the connection to the OP, it appears that the returned value, as all closure members, must be default constructible. In particular, I risk returning such default constructed values when I failed to assign to them.
It's very likely that I missed something, as I didn't seriously tried using spirit. Can you shed some light on this?
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq "Sometimes, we need to initialize our closure variables upon entering a non-terminal (rule, subrule or grammar). Closure enabled non-terminals, by default, default-construct variables upon entering the parse member function. If this is not desirable, we can pass constructor arguments to the non-terminal."... Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman
Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general
It also fits very well into this thread.
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq
And, to stick with my example, what would be a good initialisation for a plus_expression? You don't want to replace default construction with a copy from what's conceptually a default. The topic question of this thread is about whether or not you should construct objects prior to when you have enough information to meaningfully do so, and for returning parsed values in spirit, this is neither at closure construction time nor at rule/grammar invocation time. I have the impression that spirit either forces you to do exactly that or else use actions, which is even more messy. Am I correct in that observation? Best regards, Jens

Jens Theisen
Joel de Guzman
writes: Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general
It also fits very well into this thread.
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq
And, to stick with my example, what would be a good initialisation for a plus_expression?
You don't want to replace default construction with a copy from what's conceptually a default. The topic question of this thread is about whether or not you should construct objects prior to when you have enough information to meaningfully do so, and for returning parsed values in spirit, this is neither at closure construction time nor at rule/grammar invocation time.
I have the impression that spirit either forces you to do exactly that or else use actions, which is even more messy.
Am I correct in that observation?
I may totally misinterpreting the question here, but in case I'm not... Spirit has to use "funny assignment semantics" for rules so that grammars can be recursive. There will always be some symbols whose identity needs to be established so they can be used on the RHS of other rules before they can really be initialized as rules. I think that's a reasonable trade-off to make in order to get syntax that's close to traditional EBNF. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams
I may be totally misinterpreting the question here, but in case I'm not...
And I might be misinterpreting the answer, but we probably mean different things. You're talking about the rules themselves, are you? What I want not to be default-constructed is the value that is constructed by the rule/grammar. The following is taken from the documentation:

factor = ureal_p[factor.val = arg1]
       | '(' >> expression[factor.val = arg1] >> ')'
       | ('-' >> factor[factor.val = -arg1])
       | ('+' >> factor[factor.val = arg1])
       ;

I'm not complaining that `factor' must have been default constructed. I'm worrying about the return value, which is represented by the `val' closure member. The point where we know what to return is where the assignments are, but by design we must have some value earlier than that. This piece of code is analogous to:

optional< value_t > factor(tokens_t tokens) {
  value_t ret;
  if(optional< value_t > temp = ureal_p(tokens))
    ret = temp;
  else if(optional< value_t > temp = expression(tokens))
    ret = temp;
  else
    ...
  return ret;
}

where one would like to have

optional< value_t > factor(tokens_t tokens) {
  if(optional< value_t > temp = ureal_p(tokens))
    return temp;
  else if(optional< value_t > temp = expression(tokens))
    return temp;
  else
    ...
}

What if value_t is not an int in a calculator example, but, say, a `binary_expression' object for some programming language parser? Any default is clearly completely bogus, and I will have to clutter my program with sanity checks that make sure that I really have a proper object, as I expect. Best regards, Jens

Jens Theisen
I'm not complaining that `factor' must have been default constructed. I'm worrying about the return value, which is represented by the `val' closure member. The point where we know what to return is where the assignments are, but by design we must have some value earlier than that.
Ah, I understand. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Jens Theisen
writes: I'm not complaining that `factor' must have been default constructed. I'm worrying about the return value, which is represented by the `val' closure member. The point where we know what to return is where the assignments are, but by design we must have some value earlier than that.
Ah, I understand.
I think Jens got it incorrectly. The var will *not* be created at all on an unsuccessful match. No, the value is not created prior to entering the rule. The value is created *lazily* after, and only after, a successful match is made. That's the beauty of lazy evaluation. Don't be misled by the syntax. On a no-match, the result is an optional<T>(), like in your example. See match class in match.hpp and notice the optional_type val; that's your attribute. Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman
I think Jens got it incorrectly. The var will *not* be created at all on an unsuccessful match.
Which is cool, but misses my point. Take the example again, but this time I have made a mistake:

factor = ureal_p[factor.val = arg1]
       | '(' >> expression[factor.val = arg1] >> ')'
       | ('-' >> factor)
       | ('+' >> factor[factor.val = arg1])
       ;

In the third branch, I forgot to assign something. It will compile, though it had better not -- the parsed return value is bogus. How do I prevent such bugs? Hand-written parsers can be written in such a safe way:

optional< value_t > factor(tokens_t tokens) {
  // value_t is not default constructible
  if(optional< value_t > temp = ureal_p(tokens))
    return temp;
  else if(optional< value_t > temp = expression(tokens))
    return temp;
  else
    ...
  return none;
}

How to do it with spirit? Regards, Jens

Jens Theisen wrote:
Joel de Guzman
writes: I think Jens got it incorrectly. The var will *not* be created at all on an unsuccessful match.
Which is cool, but misses my point. Take the example again, but this time I have made a mistake:
[...]
How to do it with spirit?
I think this is going off topic from the thread. I suggest we continue this in the Spirit list: https://lists.sourceforge.net/lists/listinfo/spirit-general Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Jens Theisen wrote:
Joel de Guzman writes:
Well, the proper mailing list is: https://lists.sourceforge.net/lists/listinfo/spirit-general
It also fits very well into this thread.
Yeah. You did miss something. The closure members can be initialized. See "Initializing closure variables" section in http://tinyurl.com/85hbq
And, to stick with my example, what would be a good initialisation for a plus_expression?
You don't want to replace default construction with a copy from what's conceptually a default. The topic question of this thread is about whether or not you should construct objects prior to when you have enough information to meaningfully do so, and for returning parsed values in Spirit, that point is neither at closure construction time nor at rule/grammar invocation time.
I have the impression that spirit either forces you to do exactly that or else use actions, which is even more messy.
Am I correct in that observation?
I'm sorry, you've lost me completely. Well, not completely, I may have a hint. But I'm not sure I fully understand you. Can you explain a bit more? Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman wrote:
One-phase construction definitely! Testing in isolation is a different matter and should not degrade the interfaces. There are ways to allow isolated testing without degrading the interface. It all boils down to decoupling and isolating dependencies. My favorites:
1) Use a template where policies can be replaced by hooks to the testing engine that tests the expected results. In your example, I'd imagine this interface: template <class Printer> class EventLog;
Or the more conventional OO approach of subclassing from ostream (in the example) and passing in a null or mock derived class object for testing purposes. Your approach has the drawback that it's now more difficult to have a container of all EventLog objects (because Printer is part of the type) and the OO approach has the drawback of requiring the introduction of a base class and virtuals in cases where they might otherwise not be necessary. To modify the example, suppose the EventLog constructor requires a Widget, and Widget is a large nonpolymorphic object with no virtuals. I'd still pass the Widget by reference, but subclassing it for testing would be ineffective, due to the lack of virtuals. Either way the desire to make the class testable affects the interface seen by users. This is not a complaint, just an observation. In another post, I noted that it seems like it'd be nice to be able to somehow create a "testing only" interface separate from the "normal" client interface.
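The subclassing approach Scott mentions can be sketched concretely. A common idiom (my illustration, not code from the thread) is to derive from std::streambuf rather than std::ostream itself, since all stream formatting funnels through the buffer; a discard-everything buffer lets any ostream-taking class be constructed in a test with no real I/O:

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>

// A streambuf that swallows all output. Overriding overflow() is enough:
// the stream will report success while writing nothing anywhere.
class null_buffer : public std::streambuf {
protected:
    int overflow(int c) override { return c; }  // claim success, discard c
};

// Usage with the hypothetical EventLog from this thread would look like:
//   null_buffer nb;
//   std::ostream null_stream(&nb);
//   EventLog log(null_stream);  // constructed for testing, no visible output
```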
2) Use callbacks. In the example you provided, I'd imagine EventLog calls logstream to print. So, I'd use a constructor like: EventLog::EventLog(boost::function<...> print) instead. So, instead of calling logstream << stuff directly, I'll call print(stuff). For the testing engine, I'll replace it with something that tests the expected results. All this falls under the "Hollywood Principle: Don't call us, we'll call you". IMO, with proper design, you can have both single phase construction *and* isolation testing.
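Joel's callback constructor can be sketched as follows (a hypothetical EventLog, with std::function standing in for the boost::function of the original discussion; the signature is my assumption, since the archive ate the template argument):

```cpp
#include <functional>
#include <string>
#include <vector>

// Single-phase construction: the print callback is supplied up front,
// so the object is fully usable the moment the constructor returns.
class EventLog {
public:
    explicit EventLog(std::function<void(const std::string&)> print)
        : print_(std::move(print)) {}

    void WriteEntry(const std::string& msg) { print_("LOG: " + msg); }

private:
    std::function<void(const std::string&)> print_;
};

// In a test, the callback simply records what would have been printed.
std::vector<std::string> test_event_log() {
    std::vector<std::string> captured;
    EventLog log([&](const std::string& s) { captured.push_back(s); });
    log.WriteEntry("Hello World");
    return captured;
}
```

The same object, constructed in production code, would be handed a callback that writes to the real log stream; the class itself never changes.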
But can you also have maximal inlining and, where needed by clients, runtime polymorphism? Templates preserve inlining but tend to sacrifice polymorphism (e.g., it's hard to have a container of (smart) pointers to EventLog<T> objects for all possible Ts), while base class interfaces and callbacks preserve polymorphism at the expense of easy inlining. Scott

Scott Meyers wrote:
Joel de Guzman wrote:
One-phase construction definitely! Testing in isolation is a different matter and should not degrade the interfaces. There are ways to allow isolated testing without degrading the interface. It all boils down to decoupling and isolating dependencies. My favorites:
1) Use a template where policies can be replaced by hooks to the testing engine that tests the expected results. In your example, I'd imagine this interface: template <class Printer> class EventLog;
Or the more conventional OO approach of subclassing from ostream (in the example) and passing in a null or mock derived class object for testing purposes.
Your approach has the drawback that it's now more difficult to have a container of all EventLog objects (because Printer is part of the type) and the OO approach has the drawback of requiring the introduction of a base class and virtuals in cases where they might otherwise not be necessary. To modify the example, suppose the EventLog constructor requires a Widget, and Widget is a large nonpolymorphic object with no virtuals. I'd still pass the Widget by reference, but subclassing it for testing would be ineffective, due to the lack of virtuals.
Either way the desire to make the class testable affects the interface seen by users. This is not a complaint, just an observation. In another post, I noted that it seems like it'd be nice to be able to somehow create a "testing only" interface separate from the "normal" client interface.
2) Use callbacks. In the example you provided, I'd imagine EventLog calls logstream to print. So, I'd use a constructor like: EventLog::EventLog(boost::function<...> print) instead. So, instead of calling logstream << stuff directly, I'll call print(stuff). For the testing engine, I'll replace it with something that tests the expected results. All this falls under the "Hollywood Principle: Don't call us, we'll call you". IMO, with proper design, you can have both single phase construction *and* isolation testing.
But can you also have maximal inlining and, where needed by clients, runtime polymorphism? Templates preserve inlining but tend to sacrifice polymorphism (e.g., it's hard to have a container of (smart) pointers to EventLog<T> objects for all possible Ts), while base class interfaces and callbacks preserve polymorphism at the expense of easy inlining.
How many Ts (for all EventLog<T>) do you need? For deployment in an application, surely the set of Ts is bounded. If there is a need to put them in a container, I'd place them in a tuple or a fusion::set, or, if you have more than one instance of each, a tuple or a fusion::set of std::vector(s). Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net
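Joel's "bounded set of Ts" idea can be sketched with a plain std::tuple (the class names are hypothetical; modern C++17 std::apply plays the role that Boost.Fusion iteration played at the time of this thread):

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <tuple>

// The policy-based EventLog from the discussion: the Printer type
// parameter replaces runtime polymorphism.
template <class Printer>
class EventLog {
public:
    explicit EventLog(Printer& p) : p_(p) {}
    void WriteEntry(const std::string& msg) { p_.print(msg); }
private:
    Printer& p_;
};

struct CoutPrinter {
    void print(const std::string& s) { std::cout << s << '\n'; }
};
struct StringPrinter {
    std::string text;
    void print(const std::string& s) { text += s; }
};

// Since the set of Printer types is bounded in any one application, a
// tuple can hold one log per instantiation, and std::apply visits them
// all despite the differing static types.
std::string broadcast_demo() {
    CoutPrinter cp;
    StringPrinter sp;
    std::tuple<EventLog<CoutPrinter>, EventLog<StringPrinter>> logs{
        EventLog<CoutPrinter>(cp), EventLog<StringPrinter>(sp)};
    std::apply([](auto&... log) { (log.WriteEntry("event"), ...); }, logs);
    return sp.text;  // what the StringPrinter captured
}
```

The trade-off Scott raises remains visible here: the tuple is a fixed, compile-time collection, not a runtime-extensible container of base-class pointers.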

Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
very important in my view
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
Testing in isolation makes sense if you are the writer of the library, not as the user of it. Internal (non-public) interfaces are appropriate for such use-cases. As a library user I prefer to trust the library; if I conclude there is no reason to trust it, I prefer to find alternatives to using that particular library. Confusing interfaces are to me traps waiting to take away my trust. It does not help that intentions are good and that there may exist valid use-cases. Confusing interfaces are still confusing. Why mess up an otherwise good external interface of your component with confusing junk for the sake of testability? If you really need such an interface, make it a separate one. This "special needs" interface should not be the first that pops up in the face of library users. Hide it so only specialists looking for it will find it, and make sure they are aware what territory they are entering. Preferably, in my view, such interfaces should be internal to your component, hence supporting testing and other needs of the library writer without messing up the API.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it.
Ok, sounds good and perfectly reasonable,
In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
I cannot see how adding confusing constructors or other confusing methods to the interface would help anything. If I explore new territory I certainly would like to be able to make valid assumptions about the objects I use based on intuition. If I think I am test-driving a snow-mobile, and don't realize I have forgotten to add a belt to it, then I have no idea what I am exploring. A reasonable default behavior must be the better solution for exploring.
My gut instinct is not to have much sympathy for this argument, but
agreed
then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
I would much prefer build-time diagnostics if possible. That said, sometimes you need data-holder objects which potentially have costly default constructors. If in general the state of the object after default initialization is legal, but not very useful, then it may be a better idea to give the object a defined not_valid state into which default construction takes it. If that is done, throwing upon access of the object may be an OK solution. Users could also, by use of policies, decide whether exceptions are thrown or explicit checking of an is_valid() member should be used. Note that there is a clear distinction here between uninitialized as in undefined, and uninitialized as in a defined not_valid state. The latter is in my view only a good solution for classes used to hold data, either as an optimization or, more often, as a means to aid application logic. The former is never a good solution.
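Bjørn's distinction between "undefined" and "a defined not_valid state" can be sketched with a hypothetical data-holder class (the names are illustrative only):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Default construction yields a *defined* not_valid state, never an
// undefined one. Access to the payload either throws or is guarded
// by an explicit is_valid() check.
class Measurement {
public:
    Measurement() : valid_(false), celsius_(0) {}            // defined not_valid state
    explicit Measurement(double celsius) : valid_(true), celsius_(celsius) {}

    bool is_valid() const { return valid_; }

    double celsius() const {
        if (!valid_) throw std::logic_error("Measurement: not_valid");
        return celsius_;
    }

private:
    bool valid_;
    double celsius_;
};
```

A policy parameter could, as Bjørn suggests, let users choose between the throwing accessor and a checked is_valid() protocol; the essential point is that the not_valid state is a deliberate, documented part of the class invariant.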
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log; log.WriteEntry("Hello World"); // throws: no log stream was set
It does not make any sense to me why a reference to std::cout or something similar could not be used as the default stream here. I fail to see the benefit of the solution in the code example above.
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
I am not convinced they have thought so much about this; they may have - but I am not convinced. It may not be wise to assume the solution is good based on an assumption that some really smart people have thought long and hard about these aspects of API usability. Even if that were the case and this in fact is the best solution for C#, the reasoning behind it may not apply to C++. These types of interfaces are nothing new; maybe the part that throws on uninitialized use is of newer date. But except for that, this looks like patterns both library developers and users have been used to since long before OO and C++ caught fire. I am afraid we are so used to it that we miss the chance to see and call out how bad it looks. ------ Bjørn Roald

Scott Meyers wrote:
<snip>
So, library users, what do you prefer, and why?
I prefer complete construction 100% - if an object cannot operate, it should throw from its constructor. I think the design of the .NET libraries is influenced by the capabilities of all the languages it supports, and also because methods may be called _after_ an object is disposed via a direct call to the Dispose method. In .NET it is pretty common to create an object, set a bunch of properties and then use it (UI widgets, for example). However, even when some of the properties of an object may be set after construction, the object constructor should set default values for member variables so the object methods can be called safely. For example, if I create a button instance via a default constructor, I should be able to ask the button to draw itself before I set X, Y, width, height and color (even if it draws a black 1x1 box). Those are my 2 cents, Thanks, -delfin

[I think I'm mostly repeating what other people have said here, but decided to post anyway] Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
Essential, but there's no rule without an exception.
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be // specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation.
Actually, I think the above is an example of a class that's easy to test in isolation. In fact, close to perfect ;-) You're requiring a reference to a polymorphic object in the ctor, which is great from a testing point of view. It's possible to provide an ostringstream if you want to test the formatting, or some variety of "nullstream" if you want to test other stuff. Dependency injection via ctors is my preferred way of making (C++) objects testable. Also, in this very case the ostream isn't provided just for testability - it's an essential part of the interface. For your user requirement (~"needs an ostream on which to log events") there's really no good default either, as e.g. std::cout requires a console application. Perhaps a different example would be better to avoid this distraction?
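Johan's point can be made concrete with a short sketch (the EventLog here is the hypothetical class from this thread, not a real library): because the ctor takes std::ostream&, a test can inject an std::ostringstream and inspect exactly what was logged.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// One-phase construction: an ostream must be supplied up front.
class EventLog {
public:
    explicit EventLog(std::ostream& logstream) : out_(logstream) {}
    void WriteEntry(const std::string& msg) { out_ << msg << '\n'; }
private:
    std::ostream& out_;
};

// Dependency injection in action: the "log stream" under test is an
// in-memory string stream, so the formatting can be checked directly.
std::string formatting_test() {
    std::ostringstream capture;
    EventLog log(capture);
    log.WriteEntry("Hello World");
    return capture.str();
}
```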
The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain and (2) be irrelevant for whatever test I want to perform. In such cases, offering a default constructor in addition to the above would make the class potentially easier to test. (In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
For the general case, that's true. For this specific case, I don't agree. I'm a bit curious, though, about what constitutes a test from your point of view. At times in your posting, I can't quite get a grip on whether you are talking about unit testing from the author's point of view, or exploratory testing.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it. In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
If the constructor arguments (or rather, their contributions to the functional state of the object) are essential to the functionality of the object, they shouldn't have unusable defaults - i.e. cause violations of later method call preconditions. As a developer I occasionally find myself adding extra arguments to ctors, or ctor overloads, just to make the dang thing testable without having to access an external resource, such as the file system, or the underlying OS API. For those cases, where the extra argument or overload is there just for the sake of testability, there always exists a reasonable default (or perhaps even only one real implementation). I just try to make sure that those extra arguments won't have to be provided by the casual user. I can agree that this last thing is a kind of interface pollution, but IMHO it is essential to be able to test as much as possible in isolation.
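Johan's "extra argument with a reasonable default" pattern might look like this (all names here are hypothetical illustrations): casual users never see the seam, while tests can substitute an in-memory reader for the real file system.

```cpp
#include <cassert>
#include <fstream>
#include <functional>
#include <sstream>
#include <string>

class ConfigLoader {
public:
    using Reader = std::function<std::string(const std::string& path)>;

    // The default argument is the one real implementation, so the casual
    // user writes simply: ConfigLoader loader;
    explicit ConfigLoader(Reader read = &ConfigLoader::read_file)
        : read_(std::move(read)) {}

    std::string load(const std::string& path) { return read_(path); }

private:
    static std::string read_file(const std::string& path) {
        std::ifstream in(path);
        std::ostringstream ss;
        ss << in.rdbuf();
        return ss.str();
    }

    Reader read_;
};

// A test passes a stub reader and never touches the file system:
//   ConfigLoader loader([](const std::string&) { return "key=1"; });
```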
My gut instinct is not to have much sympathy for this argument, but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
I think your gut instinct is correct. Also, taking design guidelines for .NET (which are perhaps absolutely appropriate there) and attempting to apply them to C++ programming might not be the right way to go.
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log; log.WriteEntry("Hello World"); // throws: no log stream was set
Yuk. If there was really a need for this lazy init, I think a null outputter would be better in this case. But it would depend on the application in question. [snip lots of .NET discussion]
So I'm confused. Constructors that "really" initialize objects detect some kinds of errors during compilation, but they make testing harder,
Again, if the ctor arguments are essential for the operation of the object and have no reasonable defaults, what's the alternative, really? If the arguments are non-essential, don't require them, but perhaps provide additional overloads to allow customized construction. An example of the latter would be the inclusion of a filter for the EventLog class, e.g. based on message contents or priority (as I believe someone else said also).
are arguably contrary to exploratory programming,
I might be saying the same thing over and over again, but how can you explore something unusable? As a side note, though, I often prefer using unit testing as an exploratory tool.
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
Constructors that "sort of" initialize objects are more test-friendly (also more loosely coupled, BTW)
I don't understand how loose coupling and "sort-of-initialized" objects connect?
and facilitate an exploratory programming style, but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
As above. Regards, Johan Nilsson

Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception. C++/CLI follows the .Net philosophy and rules regarding the construction of objects, not the C++ philosophy and rules, as I pointed out in another post on this thread. Because of that it is not necessarily apropos to use .Net as a basis for discussion of constructor philosophy in C++, although .Net's reliance on properties and events does have an analogy to work that has been done with C++ lately, and does affect how one thinks about constructing and using objects.

Edward Diener
Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception. C++/CLI follows the .Net philosophy and rules regarding the construction of objects, not the C++ philosophy and rules, as I pointed out in another post on this thread. Because of that it is not necessarily apropos to use .Net as a basis for discussion of constructor philosophy in C++, although .Net's reliance on properties and events does have an analogy to work that has been done with C++ lately, and does affect how one thinks about constructing and using objects.
C++/CLI's default null-initialization of references doesn't in any way justify designs that pass out half-baked objects to users, any more than we would approve of passing out half-baked objects containing only shared_ptr<>s in C++. It's bad design. It makes the client of a class responsible for things that the class designer should have taken care of. That's a universal no-no, regardless of what language the code is written in. Default null-initialization of references does make it easier to not think about certain exception-safety and object state issues and "get away with it" some of the time, but that doesn't make code written that way correct. Usually, it just means that bugs may be masked or harder to detect. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Edward Diener writes:
Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API. Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception. C++/CLI follows the .Net philosophy and rules regarding the construction of objects, not the C++ philosophy and rules, as I pointed out in another post on this thread. Because of that it is not necessarily apropos to use .Net as a basis for discussion of constructor philosophy in C++, although .Net's reliance on properties and events does have an analogy to work that has been done with C++ lately, and does affect how one thinks about constructing and using objects.
C++/CLI's default null-initialization of references doesn't in any way justify designs that pass out half-baked objects to users, any more than we would approve of passing out half-baked objects containing only shared_ptr<>s in C++. It's bad design. It makes the client of a class responsible for things that the class designer should have taken care of. That's a universal no-no, regardless of what language the code is written in.
That's really .Net's default null-initialization of references, since C++/CLI plays by the .Net rules. Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are. This is an instance where programmers who are not cognizant of a particular technology jump to conclusions which are not true based on a similar technology which behaves differently. Standard C++ does not have a design-time interface which sets private data members from within a constructor, the equivalent of .Net's properties, so it appears to standard C++ programmers that the .Net default constructor, which is the common case, creates objects which are not ready to use whereas this is not the situation at all. In my initial response in this thread I may not have been clear about how .Net works, but I did say that the .Net default constructor methodology is not indicative of the issue discussed, where the OP thought it might be a good idea not to have his objects ready to use immediately upon construction and thought that the .Net default constructors were an indication of that case.
Default null-initialization of references does make it easier to not think about certain exception-safety and object state issues and "get away with it" some of the time, but that doesn't make code written that way correct. Usually, it just means that bugs may be masked or harder to detect.
See above vis-a-vis properties being set in the constructor by the .Net visual designer interface.

Edward Diener writes:
Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are.
Edward, I know little of .Net, so I'd appreciate it if you could help me out here. When you say "design-time interface," what are you talking about? Some kind of GUI? Are you saying that this GUI modifies the compiled binary, leaving no textual trace of member initialization values in the original source for the class? [for what it's worth, I don't think the answers to this question can affect my stance on Scott's question at all. Either the objects are fully-baked, or not, upon construction. How they come to be baked, and whether this "design-time interface" thing is a good idea, are separate issues] -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams a écrit :
Edward Diener writes:
Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are.
Edward, I know little of .Net, so I'd appreciate it if you could help me out here. When you say "design-time interface," what are you talking about? Some kind of GUI?
I think so.
Are you saying that this GUI modifies the compiled binary, leaving no textual trace of member initialization values in the original source for the class?
The GUI modifies the source code. -- Loïc

David Abrahams wrote:
Edward Diener writes:
Most .Net objects aren't half-baked objects to use, but they are objects which are created with a default constructor, which gives people the idea that they are not ready to use. They rely quite a bit on properties being set in order to use, and those properties are set via the design-time interface in Visual Studio, which then injects code in the default constructor to set the properties. So while it looks like these objects are not ready to use, they really are.
Edward, I know little of .Net, so I'd appreciate it if you could help me out here. When you say "design-time interface," what are you talking about? Some kind of GUI?
Yes. The Visual Studio designer allows one to drop components into other components and forms, and modifies the source code constructors which create the component and/or form. A component is also a component container, allowing other components to be nested inside it, and a form is the equivalent of a Windows window, which naturally can have components in it. A component is a particular kind of class, and controls, which people are used to seeing in forms (windows), are just visual components.
Are you saying that this GUI modifies the compiled binary, leaving no textual trace of member initialization values in the original source for the class?
The original source. The Visual Studio designer is actually modifying the constructors to set up the components in the source code, and then this gets compiled into the resulting binary. The areas being modified are marked off in the source so that the user does not change them but allows the Visual Studio designer to do it instead. The actual way it does this is better than it sounds. When you first create a component and/or form from menus in Visual Studio, the constructors get an InitializeComponents(); call added to them, with the appropriate definition later in the source code. It is in this InitializeComponents() definition that the Visual Studio designer manipulates code based on the properties and events of components added to a component and/or a form. There are two constructors: a default one which takes no parameters, and another one which is passed a container and in which code gets added automatically to add the component to the container.
[for what it's worth, I don't think the answers to this question can affect my stance on Scott's question at all. Either the objects are fully-baked, or not, upon construction. How they come to be baked, and whether this "design-time interface" thing is a good idea, are separate issues]
I agree with your analysis. I was only trying to point out that the default constructor ( or container constructor ) of .Net was erroneously causing others, including the OP, to think that .Net creates classes which are not ready to be used upon construction. They are, by and large, ready to use immediately because the necessary code to setup their properties and events has already been injected into the constructors when the programmer manipulated a form or a component in the visual designer. So having someone point out to the OP that .Net is an example of an environment in which two-phase construction is the rule is just plain wrong, and any inference about software design of classes following this idiom should not be made from that erroneous conclusion.

Edward Diener wrote:
Johan Nilsson wrote:
and seem to contradict the advice of the designers of the .NET API.
Different platform and philosophy, unless you're talking about C++/CLI.
C++/CLI is not an exception.
[snip] I guess my comment was somewhat ambiguous here. What I meant was that the C++ "platform" is different from the .Net platform, and that C++/CLI belonged in the .Net camp. // Johan

Hi Scott, Scott Meyers wrote:
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be specified
One-phase construction that establishes the class invariant is one of the two main benefits of OO programming (the other being polymorphism). So I rate it as very important. As for testability, things like the strategy pattern help a lot. In your example above, the argument is passed by reference and so you could perhaps use a dummy class derived from std::ostream. For tightly coupled code (and otherwise), DbC is indispensable. Michael Feathers describes a range of techniques for writing tests for legacy code in "Working Effectively with Legacy Code". cheers Thorsten

Scott Meyers
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
Speaking as a library developer ;-) it's extremely beneficial and almost always the right thing to do.
I've always been a big fan of the "don't let users create objects they can't really use" philosophy. So, in general, I don't like the idea of "empty" objects or two-phase construction. It's too easy for people to make mistakes. So, for example, if I were to design an interface for an event log that required a user-specified ostream on which to log events, I'd declare the constructor taking an ostream parameter (possibly with a default, but that's a separate question, so don't worry about it):
EventLog::EventLog(std::ostream& logstream); // an ostream must be specified
I've slept soundly with this philosophy for many years, but lately I've noticed that this idea runs headlong into one of the ideas of testability: that classes should be easy to test in isolation. The above constructor requires that an ostream be set up before an EventLog can be tested, and this might (1) be a pain
Just about any amount of pain in setting up a test (as long as it's possible) is justified in order to get the interface right for users.
and (2) be irrelevant for whatever test I want to perform.
How so?
In such cases, offering a default constructor in addition to the above would make the class potentially easier to test.
How so?
(In the general case, there might be many parameters, and they might themselves be hard to instantiate for a test due to required parameters that their constructors require....)
Sorry, that's too recursive for me to understand. A concrete example would help.
Another thing I've noticed is that some users prefer a more exploratory style of development: they want to get something to compile ASAP and then play around with it.
Often true.
In particular, they don't want to be bothered with having to look up a bunch of constructor parameters in some documentation and then find ways to instantiate the parameters, they just want to create the objects they want and then play around with them.
Uh, yeah, but if you don't supply enough information to meaningfully construct the object, you don't have something you can play with. And encouraging users to "play" with dangerous, half-constructed objects isn't exactly friendly.
My gut instinct is not to have much sympathy for this argument,
Good gut ;-)
but then I read in "Framework Design Guidelines" that it's typically better to let people create "uninitialized" objects and throw exceptions if the objects are then used.
That's terrible advice. It leads to a style of development where you're always checking to see if your program state is good. It's a horrible burden on maintainability; the "not good" path is almost never covered in any tests and so is probably broken anyway.
In fact, I took the EventLog example from page 27 of that book, where they make clear that this code will compile and then throw at runtime (I've translated from C# to C++, because the fact that the example is in C# is not relevant):
EventLog log;
log.WriteEntry("Hello World"); // throws: no log stream was set
This book is by the designers of the .NET library, so regardless of your feelings about .NET, you have to admit that they have thought about this kind of stuff a lot and also have a lot of experience with library users.
I do? Why?
But then on the third hand I get mail like this:
The .NET libraries have many objects with many constructors that leave the constructed object in a not ready-to-use state.
An example: System.Data.SqlClient.SqlParameter is a class that describes a bound parameter used in a database statement. Bound parameters are essential to prevent SQL injection attacks. They should be exceedingly easy to use since the "competition" (string concatenation of parameters into the SQL statement) is easy, well understood, and dangerous.
However, the SqlParameter class has six constructors. Only two constructors create a SqlParameter object that can be immediately used.
Omigosh.
The others all require that you set additional properties (of course, which additional properties is unclear).
Failure to write documentation that makes a component's requirements clear is not the mark of a library designer who has "thought about this kind of stuff a lot!"
Failure to prepare the SqlParameter object correctly typically generates an unhelpful database error when the SQL statement is executed. To add to the confusion, the first ctor shown by IntelliSense has 10 parameters (which, if set correctly, will instantiate a usable object). The last ctor shown by IntelliSense has only 2 parameters and is the most intuitive choice. The four in between are all half-baked. It's confusing, and even though I use it all the time, I still have to look at code snippets to remember how.
So I'm confused. Constructors that "really" initialize objects detect some kind of errors during compilation,
Yes.
but they make testing harder,
I've not heard a convincing argument that they do that.
are arguably contrary to exploratory programming,
Nor that.
and seem to contradict the advice of the designers of the .NET API.
Well, that should tell you something about the designers of the .NET API. If they find it necessary to sacrifice sound interfaces in order to get testability and explore-ability, they have weak design muscles and not nearly the experience you give them credit for.
Constructors that "sort of" initialize objects are more test-friendly
I'll believe it when I see it.
(also more loosely coupled, BTW)
How so?
and facilitate an exploratory programming style,
Exploratory programming with capriciously-broken parts is _not_ fun!
but defer some kinds of error detection to runtime (as well as incur the runtime time/space costs of such detection and ensuing actions).
So, library users, what do you prefer, and why?
As a library user, I prefer documentation that describes what I need to do to use a component correctly and designs that try not to let me use broken parts. Naturally there are design choices that can be made *within those constraints* that make testing and exploration easier, but there is absolutely no need to sacrifice static checking to reach these ends! -- Dave Abrahams Boost Consulting www.boost-consulting.com

It seems to me that what constitutes an initialised state of an object is largely library defined. One size does not fit all; some libraries will suit certain tasks better than others. As long as the state after construction is known and documented, it seems that even the EventLog with a default constructor would be a strictly correct (if not that useful) design.

Some of the main reasons for two-phase construction are to allow for relationships between long-lived objects, and often to allow complex operational properties to be assigned or adjusted dynamically through the life of the objects - without constraint from external (outside program control) objects. Whether it is called "two-phase construction" or whether it is just that we say the object is defined as having an active and inactive state does not matter much. AFAICT this is the principle that is being discussed. Mostly I am thinking of database and GUI applications, but we can demonstrate the principle using a file class (ignoring the availability of streams)...

MyFileClass f;             // an empty, unopen, unnamed file object
f.set_file_name("afile");  // a named but still unopen file
f.open(ForRead);           // now a file open for reading

Is it bad to allow MyFileClass to be constructed in this way? It would seem to partially violate the resource-acquisition-is-initialisation principle, although exactly what happens internally I have not defined, and it can still ensure appropriate release of resources when the object is destroyed.

The advantage of such a system is the ability to hook up potentially complex objects and use events (or signals or whatever you want to call them) to keep them in touch using observer patterns. For example we may connect an object to f and observe signals that it sends out about what file it represents, whether it is open, and perhaps the contents of the file. I.e. your typical document/view sort of thing.
Someone earlier mentioned vehicles (amongst other things) as an example of an object we expect to arrive in a fully constructed state. And this is a good example of what I am talking about here. I DO expect my car to be fully constructed when I get it - but I DON'T expect it to be continually in motion, or to have the engine running all the time. If I press the accelerator when it is not running then nothing happens. It is a fully constructed object; it is just not very useful until it has been turned on. (Cf. MyFileClass when not open. A car does not have to be turned on to exist, a file does not have to be open to exist - it is just that some things are not possible until the object is active.)

I may want to engage four wheel drive. I don't mind turning the car off while I do it (if I have to), but I don't expect to take the entire car to pieces and reassemble it in four-wheel-drive mode. (Cf. MyFileClass when open for read and I want to write.)

I have many applications where the same principles apply. The objects have complex relationships and may have many defining or dynamically adjustable attributes. It is not always convenient to destroy and recreate entire sets of objects every time I want to change some aspect of their operation.

That is my current take on it. Single-phase construction is obviously best where feasible - it keeps everything simpler and clearer. But at this time I can see no alternative but two-phase construction (objects with active/inactive state) for some situations. Are there practical alternatives? If so I'd love to hear about them. -- Geoff Worboys Telesis Computing

Geoff Worboys wrote:
That is my current take on it. Single phase construction is obviously best where feasible - it keeps everything simpler and clearer. But at this time I can see no alternative but two phase construction (objects with active/inactive state) for some situations.
Are there practical alternatives? If so I'd love to hear about them.
As I explained in another post in this thread, I think boost::optional is the ultimate solution to the active/inactive state problem, so the class itself can still have single phase construction. Yuval

Yuval Ronen wrote:
As I explained in another post in this thread, I think boost::optional is the ultimate solution to the active/inactive state problem, so the class itself can still have single phase construction.
There is a lot about boost::optional that I am not entirely happy about, but we don't really want to get into that on this thread; however, the general idea of using optional or a similar facility is still very interesting.

I imagine that the idea would be to refactor the current class with active/inactive state into two separate classes. The long-lived object (needed for complex structural reasons) would operate as normal, but all the active-dependent interface would be moved to the new/separate class and accessed through a member function returning an optional<T> (or similar).

I can see where this would make excellent sense in several of my applications. To optimise access inside tight loops you could assign the optional<T> to a T& - and so only be checking existence at the start... as opposed to my present problem of having to choose between assertion or exception checking at the top of each top-level class function.

Yes, I like the thought very much. Thank you. -- Geoff Worboys Telesis Computing

Geoff Worboys wrote:
I can see where this would make excellent sense in several of my applications. To optimise access inside tight loops you could assign the optional<T> to a T& - and so only be checking existence at the start... as opposed to my present problem of having to choose between assertion or exception checking at the top of each top-level class function.
Exactly.
Yes, I like the thought very much. Thank you.
My pleasure! :-)

"Scott Meyers"
I have a question about library interface design in general with a strong interest in its application to C++, so I hope the moderators will view this as on-topic for a list devoted to the users' view of a set of widely used C++ libraries. In its most naked form, the question is this: how important is it that class constructors ensure that their objects are in a "truly" initialized state?
I can think of one example for not fully constructing objects:

I've got some multi-threaded programs where I have found that it is quite convenient for each of my threads to be a C++ object. The private thread variables are C++ private data members. These threads cooperate with one another by sending messages to each other through locking message queues. In this way private data members do not need to be locked, as each message is processed sequentially.

The problem is in starting up. If each one of these threads knows about each other via some method (pointer, or reference), then when a thread starts up its own business, whatever that is, then most likely it will start generating data and sending messages. When it tries to send a message to a thread that does not yet exist, BOOM!!! (yes, I have experienced this boom).

My solution to this is not to start the threads running in the constructor. I can create all of the thread objects in any order I like, then I can start them running as threads in any order that I like, through a member function call to each one. If a thread sends a message to another thread object not yet running, then the message simply waits in his queue, until he wakes up and begins processing messages.

Consequently, at program shutdown time, one cannot just start deleting thread objects. If you were to delete a thread object that other thread objects are messaging, BOOM!! (or as they say in this newsgroup, undefined behavior). My answer to this is to signal each thread to come to a stop on its own accord. Once all threads are stopped, then they can be deleted in any order. So, I don't have monolithic destructors, either.

Of course, you may consider a thread object that is not yet running is actually fully constructed. I am not sure of the definition. my .02, Robert Kindred

[snip] So, library users, what do you prefer, and why?
Thanks,
Scott

On Fri, 15 Sep 2006 11:16:21 -0300, Robert Kindred
I can think of one example for not fully constructing objects:
I've got some multi-threaded programs where I have found that it is quite convenient for each of my threads to be a C++ object. The private thread variables are C++ private data members. These threads cooperate with one another by sending messages to each other through locking message queues. In this way private data members do not need to be locked, as each message is processed sequentially.
The problem is in starting up. If each one of these threads knows about each other via some method (pointer, or reference), then when a thread starts up its own business, whatever that is, then most likely it will start generating data and sending messages. When it tries to send a message to a thread that does not yet exist, BOOM!!! (yes, I have experienced this boom).
My solution to this is not to start the threads running in the constructor. I can create all of the thread objects in any order I like, then I can start them running as threads in any order that I like, through a member function call to each one. If a thread sends a message to another thread object not yet running, then the message simply waits in his queue, until he wakes up and begins processing messages.
Consequently, at program shutdown time, one cannot just start deleting thread objects. If you were to delete a thread object that other thread objects are messaging, BOOM!! (or as they say in this newsgroup, undefined behavior). My answer to this is to signal each thread to come to a stop on its own accord. Once all threads are stopped, then they can be deleted in any order. So, I don't have monolithic destructors, either.
Of course, you may consider a thread object that is not yet running is actually fully constructed. I am not sure of the definition.
I find this situation sometimes too. My solution follows Stroustrup's advice. When I can't break a cycle I create an entity that stands for the collection of interrelated instances/classes. In your example each thread object maintains an invariant, but the collection of threads has a stronger invariant than the sum of the individual invariants. That can be seen in that both construction and destruction require special care.

In Haskell some cases of two-phase construction can be avoided because of its lazy nature. You can have references to objects that still don't exist, without having destructive assignments. It really rocks: http://www.haskell.org/hawiki/TyingTheKnot Bruno

I much prefer to know that all objects are always valid. Otherwise I have to constantly keep in my mind the state of the objects as I read through the program. A small piece of code is not "transparently correct" as it depends on a "hidden" internal state. This means I have to rely more on program testing and will have more hard-to-find bugs. As the program gets large (and they're always getting larger!) the problem just gets worse. In order for me to be confident that a given program is correct (or has few bugs), it must be composed of individually verifiably correct modules. So minimizing the number of internal states helps correctness.

Of course I realize that it is not always possible to eliminate "two-phase construction" - in C++ often due to the problems of handling exceptions in constructors - but I still prefer to minimise this.

"Exploratory programming" - I haven't heard of this but I can imagine what it might mean. It doesn't sound good to me. Though I must confess I have fallen into the habit of testing ideas - but, in my case, this usually means relying on compile-time syntax checking to help figure out how to use Boost libraries, which often isn't easy from the documentation. Robert Ramey

Scott Meyers wrote: ...
So, library users, what do you prefer, and why?
Thanks,
Scott

Uhh what? Throw exceptions from your constructor... Sorry for the TP. -----Original Message----- From: boost-users-bounces@lists.boost.org on behalf of Robert Ramey Of course I realize that it is not always possible to eliminate "two-phase construction" - in C++ often due to the problems of handling exceptions in constructors - but I still prefer to minimise this.
participants (37)
-
Anne-Gert Bultena
-
Arkadiy Vertleyb
-
Bill Lear
-
Bjorn Roald
-
Brian Allison
-
Bruno Martínez
-
David Abrahams
-
David Walthall
-
Delfin Rojas
-
divyank shukla
-
Edward Diener
-
Gennaro Prota
-
Geoff Worboys
-
Gottlob Frege
-
James Dennett
-
Jeff Flinn
-
Jeff Garland
-
Jens Theisen
-
Joel de Guzman
-
Johan Nilsson
-
Kevin Wheatley
-
loufoque
-
Loïc Joly
-
Marshall Clow
-
Matthias Hofmann
-
Ovanes Markarian
-
Paul Davis
-
Peter Dimov
-
Rene Rivera
-
Robert Kindred
-
Robert Ramey
-
Roman Neuhauser
-
Rush Manbert
-
Scott Meyers
-
Sohail Somani
-
Thorsten Ottosen
-
Yuval Ronen