C++ Theory and Practice

Columns

C++ Theory and Practice: Basing Style on Design Principles

Dan Saks

In honor of the completed C++ Standard, Dan revisits his first themes from seven years back.

Copyright © 1998 by Dan Saks

As reported in last month's CUJ, the C++ standard is nearly a reality. After eight years of work, the C++ standards committee agreed that the draft C++ Standard was complete enough and correct enough to call it "done." They voted to submit the draft as a Final Draft International Standard (FDIS). That FDIS becomes an official International Standard (IS) after approval by a formal vote among various national standards organizations. If all goes well, the FDIS might be approved by the time this article appears in print.
It is now seven years (to the month) since the first of my "Stepping Up to C++" columns appeared in what was then the "just plain C" Users Journal. Back then I was the C++ columnist, and my columns ran only every other month. Mine was the only C++ article in the March 1991 issue, and it was entitled "Writing Your First Class." It was the first of a series in which I transformed a C program into an object-oriented C++ program.
In the years since I wrote that series, I've done considerable work with that programming example. I've used variations of that program in other CUJ articles to illustrate C++ language features and programming techniques. I've used other versions in seminars on software design and programming style. Now, in looking back at that early series, I find there are a number of things in the program that I'd do differently. There are also other things which, while I'd do them the same, I'd explain them differently.
I like to think that I've learned a thing or two over the years, and the things that I would do differently are things that I can do better. C++ has also grown considerably over the years. The availability of exceptions, namespaces, templates, and an extensive standard library have all changed the way many of us write C++.
In the coming months, I will revisit the programming example that I used in that series seven years ago. That example is a program that generates an alphabetized cross-reference listing of the identifiers that appear in the input stream. As I convert the program from C to object-oriented C++, I will explain how some of the newer language and library facilities lead to some nice improvements. I'll also try to steer you away from some misuses of those facilities.
Converting a C program to an object-oriented C++ is a good exercise for C programmers who are just learning C++, and I certainly hope that those of you who are beginners will stick around for this. However, I really am aiming this particular example at experienced C++ programmers. I will assume familiarity with C++ features such as classes, access control, member functions, constructors, destructors, inlining, and maybe more. I will also draw on my recent explanation of storage duration and linkage (See "C++ Theory and Practice: Storage Classes and Language Linkage," CUJ, December 1997).
What then does this programming example have to offer experienced programmers? First, building on my recent explanation of namespaces (see "C++ Theory and Practice: An Introduction to Namespaces," CUJ, January 1998), I will use the example as a context for describing namespaces, using-declarations, and using-directives in greater detail. Secondly, I will to use the example as a forum for presenting numerous C++ programming style guidelines.

Programming Style

My Editor-in-Chief doesn't like me to use the phrase "programming style." I guess he's concerned that that "programming style" conjures up images of heated debates over the more superficial aspects of programming, such as where to place curly braces and whether to use spaces or tabs for indenting. Indeed, people often use the word "style" in contrast to "substance." Maybe he has a legitimate concern.
Let's look up "style" in a dictionary and see what it says. Ah, here it is. It says "An enclosure for pigs." Oops, that's "sty." Under "style" it says, among other things, "A customary manner of presenting printed material, including usage, punctuation, spelling, typography, and arrangement"[1]. Style does indeed include issues that are purely matters of form, such as typography and arrangement, but it also includes "usage." "Usage" has a pretty broad meaning. It covers just about any choices you make regarding the words or phrases you use to express ideas. Those choices can affect how others interpret the substance of your ideas. I'm inclined to apply the term "programming style" just as broadly. Here's my take on it. The source code you write has two audiences:

translation tools, such as compilers, interpreters and even analyzers such as lint

programmers, most likely humans

Translation tools impose some pretty unforgiving requirements on source code, in the form of syntactic and semantic rules. Your code must conform to those rules or risk being rejected by the tools. Beyond satisfying the language rules, I believe that what you put into your source code and how you organize it into files is largely a matter of programming style.

Two of the best books on programming style deal seem to share this view. Although it is over twenty years old, and the programming examples are all in Fortran and PL/I, Kernighan and Plauger [2] is a classic and well worth reading. Most of their recommendations still apply in C++ programming today. (You can ignore that stuff they wrote about Fortran's arithmetic-if.) Cargill [3] continues in the spirit of Kernighan and Plauger. Although his recommendations focus specifically on C++, many of his suggestions apply to other object-oriented languages as well. Both of these books deal almost exclusively with issues of language usage rather than presentation format. As I work through my programming example, I will draw upon these and possibly other sources as appropriate.
Here, for example, is a style issue that addresses C++ language usage. In C++, a struct is just a class in which all bases and members are public by default. For example,
struct D : B
    {
    void f();
    T m;
    };
has exactly the same meaning as
class D : public B
    {
public:
    void f();
    T m;
    };
This raises a question: is there a reason to use both keywords (class and struct) in a C++ program? If you are rewriting a C program into C++, the original code will almost certainly use struct. Should you change all occurrences of struct to class? If you are writing new C++ code, is there ever a reason to declare a type using the keyword struct?
I think of this as a programming style issue. I don't know what else to call it. Interesting as they are, I'm not going to try to answer these questions now, out of context. I will to address them when they arise in the course of rewriting the cross-reference program.
To further whet your appetite, here are other questions that I also consider style issues:

On what basis do you distribute declarations among source and header files?

What declarations belong in namespaces, and what names should still be global?

When should you use incomplete types?

When should you declare a function inline?

I intend to address all of these in the coming months.

A Basis for [Dis]Agreement

As I rewrite the cross-reference program, I will be making judgements about which C++ features to use. Although I may not say so explicitly at each juncture, in effect I will be implying that one way of writing the program is in some sense "better" than another. Many times the reasons will be self-evident to experienced programmers. At other times they won't be so obvious, and you may find yourself disagreeing with me. But hey, that's cool.
Although a good style rule has wide applicability, few (if any) rules apply in every situation. That's why there are hardly any style rules, but possibly many guidelines. If you understand the rationale for a guideline, you can determine when it's appropriate to deviate from it.
Any rational programming style guideline should support your notion of "better." But "better" could mean different things to different people, or even to one person working on different projects. Among other things, "better" could be:

more nearly correct (having fewer bugs)

more maintainable (taking less time or money to fix bugs and add enhancements)

more functionality for less effort

more efficient (executing in less time or with fewer physical resources such as memory or disk space)

more portable (available on a wider variety of software and hardware platforms)

Some criteria for "better" may conflict with others. For example, you might reasonably opt to sacrifice maintainability for portability in software you would like to sell to a wider market. Whatever you decide, your programming style should reflect your sense of priorities.
So that you'll know where I'm coming from when I say "this is better," I'll tell you that I think correctness and maintainability are far-and-away the most important qualities of software. I believe firmly that, although increasing portability and efficiency may reduce maintainability, it should never be allowed to compromise correctness.

The Illusion of Simplicity

Twenty years ago, structured analysis and design were in vogue. Methodologists preached that analysis, design, and programming should be discrete stages of software development. Developers paid lip service to this view, even if few actually practiced it.

Object-oriented analysis and design are now the fashion of the day. It's now socially acceptable for developers to openly admit that software development is an iterative process in which we use what we learn from programming to refine the results of earlier analysis and design.
Thus, with current programming languages and design methods, it's often hard to tell where design leaves off and programming begins. Good programming style reinforces good designs, while it exposes flaws in poorer designs. On the other hand, poor programming style can undermine even the best designs.
If you had a tool that could generate complete and correct programs from design specifications, it really wouldn't matter if the source code that the tool generated were incomprehensible. If the resulting programs did what you wanted them to do, then you wouldn't have to look at the code. In that case, then programming style wouldn't really matter.
Unfortunately, such tools are rare. Where they are available, they can handle only a fraction of what we want of them. Most of us still have to develop most of our code the old-fashioned way — we write it. Style does matter.
You can improve the odds that your code will be correct and maintainable if you design and implement it so that it's as simple as it can be. There's no brilliant insight here. Simple programs are easier to create and maintain than complicated programs.
The best way to simplify a piece of software is to eliminate unnecessary features. I don't recall where I saw this, but I believe Gordon Bell (who, among other things, was the architect of Digital's PDP-11 minicomputers) wrote that:

The cheapest, fastest and most reliable components of a computer system are those that aren't there.

This is just as true for software as it is for hardware.

This issue is a bit of a hot button for me. I am very disturbed by the overly elaborate functionality of much of the software I must use. All those bells and whistles come at the price of reliability, as well as usability. I'd gladly give up animated pop-up help and integrated Web browsing in my word processor if it wouldn't lock up while printing large documents. I'd gladly give up plug and pray (sic) hardware if I didn't have to completely reinstall the operating system on my notebook two or three times a year. (You don't have to write to me and tell me to switch to Linux. I'm already thinking seriously about it.)
Being realistic, even if you were to strip software systems to the bone, many of them would still be pretty complex. The best way to cope with complex software is to make it look simpler than it really is. I'm particularly fond of Grady Booch's line:

The task of the software development team is to engineer the illusion of simplicity[4].

The basic technique for engineering the illusion of simplicity is to decompose systems into simpler abstractions. An abstraction is a view of something that sets aside (hides) those aspects of that something that are irrelevant for current purposes. Not all decompositions lead to good abstractions. That is, just dividing up a big piece of software into units (functions or classes) doesn't necessarily make the overall structure simpler. It does only if you find the appropriate boundaries between the units.
Object-oriented techniques are supposed to simplify an application's design by using abstract data types to model parts of the application. If done properly, an abstract type describes the outward behavior of an object, yet hides the underlying implementation details. This creates the illusion of simplicity by (1) organizing a program so you can view it from various angles, and (2) selectively hiding details from each view. In the coming months, I will try to show how good programming style reinforces that illusion.

On Being Strict

I'll get to specific style issues next month. In the meantime, I'd like to address this thought-provoking piece of mail from the ever-vigilant Eric Nagler. (Eric is a prominent C++ teacher and the author of Learning C++.) He wrote:
Hi there. Here is my question in regards to your article ('C++ Theory and Practice: An Introduction to Namespaces,' CUJ, January 1998):
"A namespace cannot have a trailing semicolon either"
Why not? Isn't this the same as any other needless semicolon, for instance, as in a null statement? C++ compilers never complain about a semicolon at the end of a function body:
class X
{
    X() {}; // needless semicolon
};
nor in the global namespace scope:
int x = 1;
;
;
For what it's worth, neither Borland nor Microsoft complains about the code above. So are you sure that it's a syntax error?
EriC++
The EBNF grammar for a simple declaration is
simple-declaration =
    [ decl-specifier-seq ]
       [ init-declarator-list ] ";" .
In Nagler's example,
int x = 1;
int is the optional decl-specifier-seq and x = 1 is the optional init-declarator-list.

The grammar clearly says that a simple-declaration can be just a semicolon, so it appears that
int x = 1;;;
should be okay, as should
class X
{
    X() {}; // needless semicolon
};
Although the grammar for namespace-definition does not allow it to end in a semicolon, it seems that
namespace N
    {
    ...
    };      // needless semicolon
should also be okay because the trailing semicolon is just another empty declaration.

However, the C++ Standard adds the following constraint (in paragraph 3 of clause 7):

In a simple-declaration, the optional init-declarator-list can be omitted only when declaring a class or enumeration, that is, when the decl-specifier-seq contains either a class-specifier, an elaborated-type-specifier with a class-key, or an enum-specifier.

This says that a declaration such as
int ;
is an error because the init-declarator-list is empty and the declaration declares neither a class nor enumeration. It also says that
;
is an erroneous declaration for the same reason.

Neither the Borland nor Microsoft compiler considers an empty declaration at namespace scope as an error, even when compiling in strict standard-conforming mode. (With Borland, you use the -A option to compile in "ANSI" mode. With Microsoft, the -Za option compiles with language extensions disabled.) I must admit, this made me question my interpretation of the standard. However, I got the support I needed from the Edison Design Group (EDG) C++ front end, which considers an empty declaration as an error when compiling in strict mode. I also discovered that the Borland compiler considers an empty declaration in class scope as an error when compiling in strict mode. So, I stand by my interpretation.
Interestingly, a needless semicolon immediately after a member-function definition is not an error, as in
class X
{
    X() { };  // needless semicolon
    ;         // error
};
The C++ grammar includes this production:
member-declaration =
    function-definition [ ";" ] .
This allows an extraneous semicolon only in this particular case.

In summary, my understanding is that, regardless of what your compiler currently accepts, Standard C++ does not allow empty declarations. It allows an extraneous semicolon only after a function definition in class scope.o

References

[1] Peter Davies, ed. The American Heritage Dictionary (Dell, 1970).
[2] Brian Kernighan and P.J. Plauger. The Elements of Programming Style (Prentice Hall, 1974).
[3] Tom Cargill. C++ Programming Style (Addison-Wesley, 1992).
[4] Grady Booch. Object-Oriented Design with Applications (Benjamin/Cummings, 1991).
Dan Saks is the president of Saks & Associates, which offers training and consulting in C++ and C. He is active in C++ standards, having served nearly seven years as secretary of the ANSI and ISO C++ standards committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906 USA, by phone at +1-937-324-3601, or electronically at dsaks@wittenberg.edu.