The Learning C/C++urve

The Learning C/C++urve: Reflective C++

Bobby Schmidt

Bobby discusses various forms of nothing, from NULL pointers that never amount to anything to little pictures of nothing in particular.

Copyright © 1998 Robert H. Schmidt

Last month I promised I was done with pointers. I lied. Like Ellen Ripley in the latest Alien flick, pointers in this column just refuse to die. If this troubles you, blame the author of two obscenely influential C++ books for much of your woe.

You Say NULL, I Say nil

The author in question is Diligent Reader Scott Meyers, author of the seminal tomes Effective C++ [1] and More Effective C++ [2]. Scott once told me he considered himself a human kill file. One of his undead ideas wandered into my inbox:
"Regarding your column in the December CUJ, I think the approach I describe on pp. 111-112 of [Effective C++, Second Edition] is better than your nil, overlooking for the moment that my design requires support for member templates. What do you think?"
What did I think? At first I thought he was bored writing his talk for the Software Development (SD) conference, and needed a diversion. Later I thought he may have been serious, and took a gander at his approach. As the text on page 111 of Effective C++ shows, Scott is trying to solve the familiar problem of overloading in the presence of NULL:
#define NULL 0
void f(int);
void f(void *);
 
// surprise -- this calls f(int), not f(void *)
f(NULL);
Scott wants a NULL replacement that can't be confused with an int, but still has NULL's property of turning itself into any null pointer type. His solution
class NullClass
    {
public:
    template<class T>
    operator T *() const
        {
        return 0;
        }
    };
 
const NullClass NULL;
relies on a comparatively recent addition to C++: member function templates. The template operator T *() lets NullClass act as if it has valid conversions for every possible pointer type:
class NullClass
    {
public:
    operator char *() const;
    operator int *() const;
    operator unsigned *() const;
    // ... ad infinitum
    };
with the compiler picking the correct conversion as needed. If we extend the example to
int *p = NULL;
the compiler treats the initialization effectively as
int *p = NULL.operator int *();
and instantiates the function NullClass::operator int *(). That function returns the integer 0, which the compiler converts to the null pointer of type int *.

Unlike the real macro NULL, this NULL is not an integer, and does not turn itself into an integer, preventing code like
int i = NULL; // error
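The behavior described above can be tried out with a minimal sketch. The names NullSketch, null_value, and demo_overloads are hypothetical, chosen so the example does not collide with the reserved name NULL; the two f overloads return tags only so the chosen overload can be observed.

```cpp
#include <cassert>

// Hypothetical stand-in for the NullClass shown above.
class NullSketch
    {
public:
    template<class T>
    operator T *() const    // one template covers every pointer type
        {
        return 0;
        }
    };

const NullSketch null_value = NullSketch();

int f(int)    { return 1; }   // tag for the integer overload
int f(void *) { return 2; }   // tag for the pointer overload

void demo_overloads()
    {
    int *p = null_value;          // instantiates operator int *()
    assert(p == 0);
    assert(f(0) == 1);            // a literal 0 selects f(int)
    assert(f(null_value) == 2);   // null_value can only become a pointer
    // int i = null_value;        // would not compile: no conversion to int
    }
```

Because the object converts only to pointer types, the call f(null_value) has exactly one viable overload.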
On page 112, Scott's embellished NullClass:

is anonymous, preventing the manufacture of multiple NULL objects,

declares a public pointer-to-member conversion,

declares a private (unimplemented) operator&, preventing NULL's address from being taken. As Scott notes, my nil can have its address taken, which he correctly calls an "unexpected side effect."
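The three embellishments listed above can be sketched as follows. This is an illustration rather than a transcription of page 112: the object name null_sketch, the Widget type, and the = {} initializer are assumptions made so the example stands alone.

```cpp
#include <cassert>

// An unnamed class: no one else can name the type to declare more objects.
const class
    {
public:
    template<class T>
    operator T *() const            // any ordinary pointer
        {
        return 0;
        }
    template<class C, class T>
    operator T C::*() const         // any pointer to member
        {
        return 0;
        }
private:
    void operator&() const;         // declared, never defined
    } null_sketch = {};

struct Widget                       // hypothetical class used for the demo
    {
    int value;
    void touch() {}
    };

bool converts_to_null_member_pointers()
    {
    int Widget::*pm = null_sketch;          // pointer to data member
    void (Widget::*pmf)() = null_sketch;    // pointer to member function
    int *p = null_sketch;                   // ordinary pointer still works
    // &null_sketch;                        // would not compile: operator& is private
    return pm == 0 && pmf == 0 && p == 0;
    }
```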

My Take

Scott is trying to solve a problem different from mine. I want a replacement for NULL that allows more fine-tuned interaction with my auto_pointer class, while Scott wants a plug-and-play replacement for the real NULL. In my response to Scott I noted that:

Anonymous class names make me nervous, especially in light of type equivalence rules and the One Definition Rule. But I also acknowledge his method pollutes the global name space less.

Pointers to members are so seldom used, I hadn't even considered them for auto_pointer. In fact, the Standard Template Library (STL) auto_ptr class — from which I took my original inspiration — makes no attempt to handle pointers to members. For a complete NULL replacement like Scott's, pointers to members make sense.

I bet a lot of compilers don't support member templates correctly, if at all; I bring this point home shortly.

The smarts in Scott's design lie within the pointer itself. The smarts in mine lie in auto_pointer: when passed a dummy argument of type class nil, auto_pointer knows to treat itself as a conceptual NULL pointer. The nil object is completely unremarkable and unaware.

I don't want all of NULL's features. Instead, I want to selectively enable or disable certain pointer behavior. Scott's NULL replicates most of the real NULL's behavior, desirable and otherwise.

Scott admits in print that users can still mix the raw symbol 0 with pointers, completely ignoring his NULL. This is an advantage I see in my auto_pointer/nil combination: by declaring certain auto_pointer members private, I prevent users from mixing 0 and void * with auto_pointer. Scott's NULL, on the other hand, is at the mercy of C++'s built-in behavior for pointers.
As Dan Saks reminds me, using the name NULL renders a program non-portable and ill-formed, since the name is reserved for the C++ Standard Library. That's why I chose the name nil for my replacement; but again, Scott is looking to make a silent replacement that leaves all NULL client code intact.
In the end, we are solving related but distinct problems. If your goal is to use real C++ pointers but replace NULL, favor Scott's design. If your goal is to replace pointers, consider my design.

Member Templates, Briefly

In a followup mail, Scott noted that his solution probably doesn't allow
if (NULL)
which the real NULL (and my nil) does allow. I say probably because we are both puzzled by the behavior we saw in MSVC 5.0 [3]. Given Scott's definition of NULL, the code
if (NULL)
    *NULL;
compiles in MSVC — but neither of us can figure out why. We think this should fail to compile.

The controlling expression in an if statement is of type bool. But there is no conversion from NULL to bool; instead, NULL would have to first transform itself into one of its "infinite" possible pointer types, which would then convert to bool. Since no one pointer-to-bool conversion is best, the compiler should find the conversion ambiguous.
I'm guessing MSVC is implicitly turning NULL into void *; to support this theory, I found that adding
private:
    operator void *() const;
to the class definition kept the example from compiling. I have temporarily lost access to MSVC, so I can't investigate further.
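For readers with a newer compiler, the expectation that if (NULL) should be rejected can be probed at compile time. The sketch below assumes a C++11 or later compiler (for static_assert and the type_traits header) and reuses the hypothetical name NullSketch. As far as can be told from current deduction rules, a bool target gives the compiler nothing from which to deduce T, so the conversion template never becomes a candidate; std::is_convertible tests implicit conversion, which serves here as a close proxy for what an if statement needs.

```cpp
#include <type_traits>

// Hypothetical stand-in for the NullClass discussed above.
class NullSketch
    {
public:
    template<class T>
    operator T *() const
        {
        return 0;
        }
    };

// Pointer targets let the compiler deduce T ...
static_assert(std::is_convertible<NullSketch, int *>::value,
              "converts to int *");
static_assert(std::is_convertible<NullSketch, void *>::value,
              "converts to void *");

// ... but non-pointer targets leave T undeduced.
static_assert(!std::is_convertible<NullSketch, bool>::value,
              "no implicit conversion to bool");
static_assert(!std::is_convertible<NullSketch, int>::value,
              "no implicit conversion to int");
```

If all four assertions hold on a given compiler, that compiler agrees that the object has no route to bool on its own.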

As for *NULL, I haven't a clue to what the compiler is doing. It apparently thinks it's turning NULL into some pointer type, but I don't know which one. I believe this is a bug in MSVC, but again I am not able to research more. As Scott so charmingly puts it:
"MSVC 5 takes an interesting approach to member templates. It parses them, but it doesn't instantiate them unless you explicitly tell it to, and of course that defeats the purpose. Sigh. Maybe they'll fix this in the next release. Then again, maybe they'll follow their lead on operator new[], which has been in the standard since 1992 and still isn't recognized by MSVC..."
(And I thought I had an attitude.)
If you don't have at least one of Scott's books, turn in your Diligent Reader pledge pin until you get one. They are among the few books that I consistently put on the Recommended Reading page of my course handouts.

Mind Your Ps and Qs and ==s

When I publish coding examples, I often leave them purposely incomplete, hoping that you will discover where the design can be improved. Diligent Reader Keith Davies made just such a discovery in my December 1997 column. At the end of that column, I sketched ways that an auto_pointer equality operator could work:
auto_pointer<int> p;
int *q = NULL;
 
p == p; // #1, OK, no conversion
p == q; // #2, OK, converts from int *
p == nil; // #3, OK, converts from class nil
Keith's concern is with example #2:
p == q;
which is really translated as
p.operator==(q);
Since there is no auto_pointer<int>::operator== that takes an int *, q converts to a type that operator== does take:
p.operator==(auto_pointer<int>(q));
This requires the construction of a temporary auto_pointer<int> object, as I mentioned in print with the original example:

You may also object to constructing temporary objects this way. You could create specialized operator== and operator!= overloads tuned to real pointers and nil, but I leave that as an exercise for the student.

But Keith goes further, correctly noting that example #2 is not commutative; that is, the expression p == q is not always equivalent to q == p. While the former calls an auto_pointer member (as shown above), the latter actually calls the language's built-in == operator for pointers:
q == p.operator int *();
As long as you mix auto_pointers and NULL real pointers (what Keith calls "mundane" pointers), these two permutations of == will net to the same result. The trouble comes if you mix auto_pointer with a non-NULL real pointer:
auto_pointer<int> p;
static int n;
int *q = &n; // q is a non-NULL real pointer
 
p == q; // really p.operator==(auto_pointer<int>(q))
q == p; // really q == p.operator int *()
While q == p works as expected, p == q does not. When the temporary object auto_pointer<int>(q) is destroyed, the auto_pointer destructor effectively calls delete on q. For a NULL q, this is no problem, since delete NULL is well-defined and benign. But deleting a non-NULL q — especially one that was never dynamically allocated to start with — is disastrous.
Keith found the simple solution: add one more operator== overload to auto_pointer:
bool operator==(T const *);
Now the expression p == q translates as
p.operator==(q);
requiring no construction (and dangerous destruction) of a temporary auto_pointer object.
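Keith's fix can be checked with a small stand-in class. Everything here is hypothetical: auto_pointer_sketch is not the auto_pointer from the earlier columns, the constructed counter exists only to make temporaries visible, the operators are declared const as an assumption, and the sketch deliberately leaves out the conversion to T * so that only the member operators compete.

```cpp
#include <cassert>

template<class T>
class auto_pointer_sketch
    {
public:
    static int constructed;                 // counts every object built

    auto_pointer_sketch(T *p = 0) : p_(p)   // converting constructor
        {
        ++constructed;
        }
    ~auto_pointer_sketch()
        {
        delete p_;                          // fatal for a temporary built from &n
        }
    bool operator==(auto_pointer_sketch const &that) const
        {
        return p_ == that.p_;
        }
    bool operator==(T const *that) const    // the added overload
        {
        return p_ == that;
        }
private:
    auto_pointer_sketch(auto_pointer_sketch const &);             // not copyable
    auto_pointer_sketch &operator=(auto_pointer_sketch const &);
    T *p_;
    };

template<class T>
int auto_pointer_sketch<T>::constructed = 0;

// Returns how many auto_pointer_sketch objects the comparison p == q built.
int temporaries_built_by_comparison()
    {
    static int n;
    int *q = &n;                            // non-NULL, never dynamically allocated
    auto_pointer_sketch<int> p;
    int const before = auto_pointer_sketch<int>::constructed;
    bool const equal = (p == q);            // binds to operator==(T const *)
    assert(!equal);
    return auto_pointer_sketch<int>::constructed - before;
    }
```

With the extra overload in place the count stays at zero, so no destructor ever sees q. A class that also keeps its conversion to T * may find that some compilers consider p == q ambiguous against the built-in pointer comparison when q is a plain int *; that interaction is outside this sketch.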

Hip Hip Array!

Well, fun is fun, but we really must get on with our array abstraction, lest you lose all faith in me. Our path will be much like the one we took with pointers:

Identify properties of real arrays.

Among those properties, separate interface from implementation.

Create a class definition that emulates the array interface's semantics and (where possible) syntax.

Incrementally omit array interface properties we don't want, and add non-array interface properties we do want. This is typically the hardest (and most interesting) part of the design.

At every step, hide the implementation behind the class's encapsulation barrier.

Arrays are a primitive form of container, or object that holds a collection of other objects. While the C++ STL makes available all kinds of containers, and the C Standard Library has long supported containers like FILE, arrays remain the simplest container mechanism built into these languages directly.
C and C++ array containers share several fundamental properties:

A given array a contains a fixed number of other objects, called a's elements. The number of elements N is called a's length. Once a is created, N cannot change.
sizeof(a) yields the total number of bytes occupied by a. Each a element is of the same static type and occupies sizeof(a) / N bytes.

a has value semantics. Storage for a's elements exists within a itself.

a supports random access of its elements, referenced by successive whole number indices ranging from 0 to N - 1. Those elements are stored contiguously in their indexed order.

The name a represents a non-modifiable lvalue, even if a is declared non-const.

In expressions, a often converts or "decays" to a pointer. The expression a[i] is equivalent to *(a + i).

(We'll run into other properties, especially for C++ arrays, as we go along.)
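Several of the properties listed above can be confirmed directly. The sketch below is only a sanity check on a four-element int array; the names N, a, and demo_array_properties are chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

std::size_t const N = 4;               // a's length, fixed at creation
int a[N] = {10, 20, 30, 40};

void demo_array_properties()
    {
    // sizeof(a) covers the whole array; each element takes sizeof(a) / N bytes.
    assert(sizeof(a) == N * sizeof(int));
    assert(sizeof(a) / N == sizeof(a[0]));

    int *p = a;                        // a decays to a pointer to its first element
    assert(p == &a[0]);

    for (std::size_t i = 0; i < N; ++i)
        {
        assert(a[i] == *(a + i));      // subscripting is pointer arithmetic
        assert(&a[i] == p + i);        // elements are contiguous, in index order
        }

    // a = p;                          // would not compile: a is not modifiable
    }
```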
Next month, we'll partition the interface from the implementation, mapping both into a C++ class. Until then, I leave you with this exercise: if you were writing the next column, how would you partition these properties? Why? Really be aware of your rationale, and remember Heisenberg: the presence of implementation properties can affect interface properties (and vice-versa).

Artistic Expressions

Around the time you read this, I'll be pontificating at SD West. One of my talks features a discussion of lvalues vs. rvalues, a topic potentially confusing to C++ programmers — the simple C model of "lvalues are objects, rvalues are values" breaks down in the presence of class objects, and requires careful construction of user-defined operators.
As we get more into the array class design, we'll find lvalues and rvalues to be vital design elements. As a teaser of sorts, I want to mention a related parlor trick sure to thrill C++ newbies.
Consider the simple C++ expression
++i
where i is a modifiable lvalue of some built-in type. Because the result of ++i is itself a modifiable lvalue, it can in turn be the operand of another pre-increment operation [4]:
++++i
The same notion extends to pre-decrement:
++--++i
To make the pattern even more fun, toss in unary plus and minus:
+-++--++i
identifiers with all underscores:
+-++--++__
and binary + and -:
+-++--++___+___-___
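The first three expressions can be evaluated as well as compiled, with a caveat tied to note [4]: the sketch below assumes the sequencing rules of C++11 and later, under which chained pre-increments of a built-in are well-defined. The longer expressions that also read the operand through binary + and - are left out on purpose, since their evaluation order is not pinned down. The function names are hypothetical.

```cpp
// Each function takes i by value, so the caller's variable is untouched.
int double_increment(int i)
    {
    return ++++i;        // two increments: i + 2
    }

int up_down_up(int i)
    {
    return ++--++i;      // up, down, up: i + 1
    }

int signed_version(int i)
    {
    return +-++--++i;    // unary minus, then unary plus: -(i + 1)
    }
```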
If you possess a pulsing right brain, try crafting ASCII art. For instance, the opaque expression
+ + + ++ ++ _ - _
rendered in two dimensions becomes a charming little tree:
        +
       + +
      ++ ++
       _-_
With the addition of other operators, the possibilities frankly boggle this writer's mind. [Gee, Bobby, are you trying to tell us you're running out of work? We can remedy that. mb] If you have especially clever or aesthetic examples of such artistic expressions, send them to me; I may publish a couple, depending on how distracted my editors are that day [5].

WG14 Follies

I am writing this in mid-December 1997. The C9X Committee Draft has just hit the virtual streets for public review. You can find a copy at <http://www.dkuug.dk/JTC1/SC22/open/n2620/>. Also read more about it on CUJ's website (http://www.cuj.com).
The page <http://www.x3.org/press/1997/pr97157.htm> has guidelines for public comments on the new Draft. The comment period closes on 3 March 1998, so if you have suggestions or complaints, submit them now. Any comments missing this launch window will be considered for the second public review.

Erratica

In response to my January 1998 column, Diligent Readers Gary Powell and the now-ubiquitous Scott Meyers recommend an article in the July/August 1997 issue of the C++ Report. The article, by Carlo Pescio, is called "Template Metaprogramming." According to Gary:
"[The article] uses templates to generate types which are at least a specified number of bits. I think it would do exactly what you want less the use of the restricted keywords unsigned<bits> and signed<bits>."
I'm trying to get a copy of the article in question; once I do, I'll summarize my reaction here. [For more on template metaprogramming, also see Pescio's "Binary Template Metaprogramming," in the February 1997 issue of CUJ. mb]
And finally, I have fan mail from some bloke named Stan Kelly-Bootle:
"Bobby: you are correct in saying that log(-2) is not a real number (CUJ, December 97) but wrong with: "...taking the log of a negative number is a mathematical no-no." Not at all! log(-n) has an infinite number of values, the principal value being simply [sic] log(n) + i*pi (where n>0; i is sqrt(-1) and the logs are Napierian [base e]). The better C++ math packages have a ComplexNumber class and a log function that takes arbitrary arguments."
Stan caught me making a silent assumption. I was implicitly working within the mathematical real number system, since I was discussing floating-point representations of those real numbers. My mistake was not making this assumption explicit. Now that complex types are a standard part of ISO C9X and C++, I suppose we all need to stop making such unspoken assumptions.
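Stan's formula for the principal value, log(-n) = log(n) + i*pi for n > 0, is easy to check with the standard complex template. This is a minimal sketch; the function names and the 1e-12 tolerance are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Principal value of log(-n); assumes n > 0.
std::complex<double> principal_log_of_negative(double n)
    {
    return std::log(std::complex<double>(-n, 0.0));
    }

// Compares the library's answer against log(n) + i*pi.
bool matches_formula(double n)
    {
    double const pi = std::acos(-1.0);
    std::complex<double> const z = principal_log_of_negative(n);
    return std::abs(z.real() - std::log(n)) < 1e-12
        && std::abs(z.imag() - pi) < 1e-12;
    }
```

The real-valued std::log(double) has no such answer to give for a negative argument, which is the unspoken assumption the letter points out.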

Notes

[1] Scott Meyers. Effective C++ Second Edition (Addison-Wesley, 1997). ISBN 0201924889.
[2] Scott Meyers. More Effective C++ (Addison-Wesley, 1996). ISBN 020163371X.
[3] Neither of my Mac OS compilers supports member templates, so I can't compare their behaviors.
[4] For this trick, I'm purposely ignoring the implications of expression evaluation ordering and sequence points.
[5] Using identifiers and operators only — no comments or quoted char/string literals. And of course, all examples must compile.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also a member of the ANSI/ISO C standards committee, an alumnus of Microsoft, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him at 14518 104th Ave NE Bothell WA 98011; by phone at +1-425-488-7696, or via Internet e-mail as rschmidt@netcom.com.