
Over the past few years, the style of our abstract interfaces for system-dependent features has bothered me more and more. Here’s my challenge to those happy with the status quo: Find me an interface in Boost or the C++ standard that represents an observable inter-program interface, such that, for any popular operating system, I can’t find a feature of the operating system that is impossible to express using that interface without relying on undefined behavior, undocumented behavior, or circumventing access specifiers.

First, a realistic example is in order. We all know and love C++ standard iostreams (“iostreams library,” <http://www.cplusplus.com/ref/iostream>), which is almost certainly the most widespread operating system abstraction in the C++ world. So we create a std::ifstream, and use it to open some file. Now we want to map this file to a region of memory. Do what?! Suddenly, everyone breaks out in laughter, with tears rolling down their cheeks, little globules of spittle escaping from the corners of their mouths, ridiculing the very notion of any standard library being involved in such a reasonable and common operation.

But why? The operating system can do this, and it’s not incompatible in the slightest with the concept or interface of std::ifstream. In fact, if we had some way of stealing the secret file descriptor stashed away inside the filebuf, we could do it. But programmers, deep down, feel that this sort of access just isn’t right. Grab some popcorn and watch programmers sweat and squirm in their own cognitive dissonance as they try to find an elegant solution to this unsolvable problem (“Legalize access to file descriptors now!” <http://gcc.gnu.org/ml/libstdc++/2005-02/msg00090.html>).

Let me get straight to the heart of the problem: we love our concrete interfaces.
Every programmer loves being able to manipulate a class, knowing its complete interface, exactly what sort of animal it is, and exactly what it’s going to do, and knowing all of this before the program ever runs for the first time. I hope I’m not bursting anyone’s bubble here, but when we’re talking about components that abstract system interfaces, this just isn’t the way the world works. It’s time we grow up, get out of the playpen, and realize that there really is a big scary world of polymorphism out there.

My core belief is this. We don’t fully know the capabilities of the system components we manipulate, not at compile time, not at load time, and perhaps not even by the time the program halts. Any interface that pretends otherwise is a lie, and most of the time, a pretty bad one at that.

Here’s another one of my favorite examples, in Boost.Threads (“Boost.Threads,” <http://www.boost.org/doc/html/threads.html>). boost::thread lacks a method to forcefully terminate a thread, despite the fact that many threading systems have one. However, we can’t add one, because there exists at least one threading system that doesn’t have this feature. Well, we *could*, but then we’d be playing a game of chicken with the user, daring them to call a function that has completely undecidable behavior.

Now let’s say the user was really determined to get this feature, and so she decided to write her own thread class. Nope, she loses again! Because her class is not named boost::thread, the class is incompatible with all of the rest of the thread manipulation functions, and so is entirely unusable with Boost.Threads.

Do you like scary movies? I have an idea for one. It’s about a future C++ standard that includes a threading library that I can’t use if I want forced termination semantics, or any other feature that any operating system has that the library lacks.

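To make the lock-in concrete, here is a minimal sketch in modern C++. Everything in it is hypothetical: lib::thread merely stands in for a concrete library class and is not the Boost.Threads API, and terminate() is a placeholder that calls no real system facility.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for a concrete library thread class.
// This is NOT the Boost.Threads API, only an illustration.
namespace lib {

class thread {
public:
    void join() { joined_ = true; }
    bool joined() const { return joined_; }
private:
    bool joined_ = false;
};

// A library helper written against the concrete type and nothing else.
inline std::string join_and_describe(thread& t) {
    t.join();
    return t.joined() ? "joined" : "running";
}

} // namespace lib

// The user's replacement offers the same operations plus forced termination.
class terminable_thread {
public:
    void join() { joined_ = true; }
    bool joined() const { return joined_; }
    // Placeholder only: a real version would call whatever the OS provides.
    void terminate() { terminated_ = true; }
    bool terminated() const { return terminated_; }
private:
    bool joined_ = false;
    bool terminated_ = false;
};

// True only if code taking lib::thread& would also accept the user's class.
constexpr bool user_class_is_accepted =
    std::is_convertible<terminable_thread&, lib::thread&>::value;

static_assert(!user_class_is_accepted,
              "a class that merely looks like lib::thread is a different type");
```

A call such as lib::join_and_describe(my_terminable_thread) would not compile, even though the two classes offer the same join(). With a concrete interface, the type's name is the contract.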
Besides the inability of our concrete interfaces to support a capability set that varies, and the lack of ability to reimplement these interfaces compatibly, concrete schemes also can’t represent what I call /multi-interface systems/. Some environments, such as Cygwin (“Cygwin Information and Installation,” <http://www.cygwin.com>), have two (or more) entirely distinct sets of system interfaces that may be used as underlying primitives for a particular concept. In the case of Cygwin and threads, the environment supports both the POSIX and Windows thread interfaces. Both may be used simultaneously, and as each system has its own unique characteristics, it may be entirely reasonable to do so. But Boost.Threads can’t do this, and really couldn’t possibly be expected to, in its current form.

Note that the total set of capabilities is not decidable at compile time, or even load time. For instance, let’s say we’re using a process class on a System V-style system. We’re implementing a debugger, and so we’d like to get access to the process’s core memory through the /proc interface. However, up until we actually try to do so, we really have no way of knowing whether this is supported, as the /proc filesystem may well not be mounted.

I’m calling for polymorphism. These interfaces really are conceptually polymorphic; let’s reflect that in our language. Let’s give the user the tools she needs to be able to write her own compatible classes when the ones we write prove insufficient.

For the sake of exposition, let me propose a sketch of a possible design for a process class. A generic process is represented by an abstract base class. Derived from it are classes for the major types of processes: POSIX, Windows, DCE, and whatever else. Derived from each of those are specific variants of these types, with additional or extended capabilities. For example, a child of the POSIX class might be a class implementing the ptrace() process debugging interface.

Naturally, we’d instantiate objects of these classes with some factory. Since we might not know at creation time exactly what capabilities the system has, or what capabilities are needed, we need a copying mechanism to construct a new process from an old one. This copy might, for example, copy the POSIX base, so as to get the process identifier, and slice the rest off, as unneeded. Clearly there would need to be significant design effort put into this area. This and other implementation issues are mostly tangential to my primary concerns.

Finally, pointers are a pain, so we can wrap pointers to a process in your favorite smart pointer. This smart pointer might have additional associated mechanics to enable it to automatically select, as far as it is able, the best kind of process creation machinery. I could see this basic process class (which is practically concrete) as having a very pleasant and desirable syntax.

Lastly, performance needs to be addressed. I’m not at all worried about making operations that would be normal function calls into virtual function calls, and you shouldn’t be, either. An indirect call will often cost an order of magnitude less than the actual underlying system operation, which might involve many more indirect calls, context switches, and synchronization. A more significant concern is the additional machinery needed to support virtual inheritance. RTTI may also be necessary to fully exercise the class hierarchy’s capabilities. However, when using the subset of features that would be available to an equivalent concrete implementation, RTTI and similar machinery should not be needed, so a user shouldn’t have to pay (too much) for what she’s not using.

So here’s my question to the Boost community. How many people have similar concerns and experiences? How often, in real code, do concrete classes prove insufficient? Who here has to entirely reimplement libraries like Boost.Threads for relatively silly reasons?
Are there any alternative solutions for the problem I describe? What unforeseen problems might there be with this polymorphic style? Would you use a library such as Boost.Threads if it had been rewritten in this manner?

Submitted with Love,
Aaron W. LaFramboise