C++ Theory and Practice

Columns

C++ Theory and Practice

Dan Saks

Storage Classes and Linkage

Function names aren't the only thing overloaded in C++. Storage class keywords really carry a lot of freight.

Copyright © 1997 by Dan Saks

Over the past couple of years, I've written a number of articles on the structure of declarations in C++. (See "The Column That Needs a Name: Understanding C++ Declarations," CUJ, December 1995, and "The Column That Needs a Name: Understanding C++ Declarators," CUJ, January 1996.) In those articles, I focused on those parts of a declaration that combine to form a data type. In so doing, I mostly ignored the other parts of a declaration, namely, storage class specifiers (such as extern and static) and initializers.

Storage class specifiers add little to the syntax of declarations but they add a lot to the semantics. For example, the presence or absence of the extern specifier in a global data declaration can determine whether that declaration is also a definition, and therefore allocates storage. The static storage class specifier is especially interesting because it has multiple personalities. In some contexts, the keyword static affects the way a program allocates storage. It can also affect how separately-compiled source files link together. In other contexts, it restricts the behavior of member functions.

This month, I'll try to make up for my neglect of storage class specifiers. My focus will be on how they distinguish definitions from plain declarations, and how they affect program linkage. But first, I do have few remarks about the syntax.

Storage Class Specifiers

C++ has five storage class specifiers: the keywords auto, register, extern, static, and mutable. The storage class specifiers are one of three categories of decl-specifiers. The other categories are type specifiers (such as float, int, unsigned, or an identifier naming a type) and function specifiers such as inline or virtual. (For a more thorough discussion of decl-specifier, see the articles listed above.)

Nearly every object and function declaration has two parts: a decl-specifier-seq (a sequence of decl-specifiers), followed by a declarator-list (a comma-separated list of one or more declarators). For example, in the declaration
static char const *table[N];
static, char, and const are the decl-specifiers. Of these, static is a storage class specifier, while char and const are type specifiers. *table[N] is a declarator-list with just one declarator.

The order of the decl-specifiers doesn't matter to the compiler. The storage class specifier can appear first, last, or sandwiched between type specifiers. Thus,
const static char *table[N];
has the same meaning as the previous declaration.

A declarator is a declarator-id (the name being declared) along with any operators (*, &, (), and []) that might surround it. For instance, the declarator-id in *table[N] is the identifier table. The operators in the declarator combine with the type specifiers to specify the type in a declaration. For example, the *table[N] specifies that table is an "array with N elements of type pointer to ..." Pointer to what? To the type described by the type specifiers which, in this case, are const char. Thus, table has type "array with N elements of type pointer to const char".

The storage class specifier in a declaration does not contribute to the type. Rather, it applies directly to the declarator-id. For example, in
const static char *table[N];
the attributes associated with the keyword static apply to the object named table.

The decl-specifier-seq in a declaration can contain more than one type specifier. For example, in
unsigned long int *p;   // OK
unsigned, long, and int are all type specifiers. However, the decl-specifier-seq can contain at most one storage class specifier. For example,
static mutable int counter; // error
is an error because it contains two storage class specifiers: static and mutable.

The keyword typedef is also a decl-specifier. However, a given decl-specifier-seq cannot contain both typedef and a storage class specifier. For example,
typedef register int i; // error
is an error.

In C, typedef is a storage class specifier, so the prohibition against combining typedef with a storage class specifier is merely a consequence of the more general prohibition against more than one storage class specifier. In C++, typedef is not a storage class specifier.

The storage class specifiers extern and static can occur in object and function declarations in either namespace scope or in block scope, except that static cannot appear in a function declaration in block scope. (Namespace scope includes file scope, which is outside class and block scope.) For example,
static void f()     // OK
    {
    static int g(); // error
    ...
    }
Here, the static specifier in f's declaration is okay because it occurs at namespace scope. The static specifier in g's declaration is an error because it occurs at block scope. [We realize that we have not discussed namespaces very much to date in CUJ; however, Dan is laying the groundwork to discuss them at length in a future article — mb]

The static specifier can occur in declarations in class scope.

auto and register can occur in declarations in block scope or in a formal parameter list, but not in namespace scope. For example:
register char *p;       // error
     
void f(register int i); // OK
C does not allow auto in a parameter declaration. C++ does allow it, but it has no effect.

The mutable specifier can occur only in object declarations at class scope.

Declarations vs. Definitions

Earlier I mentioned that the presence or absence of a storage class specifier can determine whether a declaration is also a definition. Let's take a moment to review the distinction between declarations and definitions.

In C++, as in C, a declaration introduces a name into a program, and specifies attributes for that name. If the declarator-id in the declaration designates an object or function, and the declaration reserves storage for that object or function, then that declaration is also a definition.

For example,
int abs(int n);
is a function declaration. It merely establishes that a function named abs with the declared type exists somewhere in the program. In contrast,
int abs(int n)
    {
    return n > 0 ? n : -n;
    }
is a function definition as well as a declaration. It not only establishes that the function exists somewhere, it reserves storage by supplying the code for the function body.

As in C, a C++ program can repeat certain object and function declarations. The repeated declarations need not be identical, but they must be compatible. For example, repeated declarations of a function need not use the same formal parameter names, as long as the function's signature and return type remain the same. That is, given
void f(int a, int b);
then
void f(int i, int j);
void f(int, int);
are valid redeclarations of f.

A program can declare a given object or function more than once, but define it only once. The draft C++ Standard refers to this requirement as the One Definition Rule (ODR). C also requires that programs define each object and function exactly once, but it muddies the waters by allowing tentative object definitions. A tentative definition might be a bona fide definition, or it might be just a declaration, depending on the other declarations in the program.

For example, when it occurs at file scope in a C program
int i;
is a tentative definition. If
int i = 0;
occurs at file scope later in the same translation unit, then it is i's definition, and the previous tentative definition is just a declaration after all.

Thus, in C, you can't always tell whether a declaration is also a definition just by looking at it. In C++, you can. In particular:

A function declaration is a definition if it includes the function's body.

An object declaration is a definition unless it contains the extern specifier and no initializer, or it declares a static data member in a class declaration.

For example,
extern int total;       // declaration
is a declaration, while all of
int total;              // definition
int total = 0;          // definition
extern int total = 0;   // definition
are definitions.

Linkage

A name declared in one part of a program might refer to a name defined elsewhere in the program. The declaration and definition might be in different scopes, or in completely different translation units. The linkage of a name is the extent to which a name might refer to a name declared elsewhere. C and C++ provide for three levels of linkage:

A name with external linkage denotes an entity that can be referenced via names declared in other scopes in the same or different translation units.

A name with internal linkage denotes an entity that can be referenced via names declared in other scopes in the same translation unit.

A name with no linkage denotes an entity that cannot be referenced via names from other scopes.

The linkage of an object or function is determined by a combination of factors: its scope, its storage class specifier, and the linkage established in a prior declaration. The linkage of an object also depends on whether or not that object is const. I don't think I can explain in one pass how all these factors play against each other. So I'll begin with some overly general rules, and gradually refine them.

Linkage at Namespace Scope

In general, the name of an object or function declared at namespace scope has external linkage if it is declared with the extern specifier or no storage class specifier. It has internal linkage if it is declared with the static specifier. For example, if the following declarations appear at file scope in file1.cpp,
// file1.cpp
     
int i = 0;          // external
extern int j;       // external
static int k;       // internal
then i and j have external linkage, and k has internal linkage. Now, suppose file2.cpp contains the following declarations at file scope
// file2.cpp
     
extern int i;       // external
int j;              // external
static int k;       // internal
When you compile and link the files together, then the program will have one object named i with external linkage defined in file1 and referenced in file2. It will have another object named j with external linkage defined in file2 and referenced in file1. The program will have two objects named k, each with internal linkage in each file.

Actually, an object declared const at namespace scope with no storage class specifier has internal linkage by default. For example, if declared at namespace scope as
int const N = 100;  // internal
then N has internal linkage. C and C++ are different in this regard. In C, a const object declared at file scope has external linkage by default. Of course, in either language you can always override the default with an explicit storage class specifier, as in
extern int const N = 100;   // external
In C++, you can omit the initializer from the declaration of a const object only if you explicitly declare it extern. For example,
extern int const M; // OK, external
int const N;        // error
Here, the declaration for M is just a declaration because it has an extern specifier and no initializer. The declaration for N is a definition, but it's in error because the initializer is missing.

Earlier I said that the name of a (non-const) object or function declared at namespace scope has external linkage if it is declared with the extern specifier or no storage class specifier. Actually, it can have internal linkage if a previous declaration gave it internal linkage. For example,
static void f(); // internal
int const N = 100;  // internal
...
void f();           // still internal
extern int const N; // still internal
Apparently, the extern specifier (or no storage class specifier) doesn't mean just "external linkage." It means "external linkage if not previously declared with internal linkage." Interestingly, if you reverse the order of the declarations from the previous example, const object N gets external (instead of internal) linkage and the declaration for f with the static specifier becomes an error:
extern int const N; // external
void f();           // external
...
int const N = 100;  // still external
static void f();    // error
Here now is a more precise statement of the rules for the linkage of objects and functions in namespace scope:

An object or function has internal linkage if it is explicitly declared static or previously declared with internal linkage.

A const object has internal linkage if it is neither explicitly declared extern nor previously declared with external linkage.

Otherwise, an object or function has external linkage.

Linkage at Block Scope

The name of an object has no linkage if it is declared either in block scope and without the extern specifier, or in a parameter declaration. For example, in
void f(int i)       // f: external
    {               // i: no
    int j;          // j: no
    static int k;   // k: no
    ...
    }
i, j, and k all have no linkage.

The linkage of an object declared at block scope with the extern specifier depends on declarations that appear in enclosing scopes. When it encounters an object declared extern in block scope, the compiler looks outward through enclosing scopes to find the declaration of an object with the same name as the object in block scope. The compiler will not look past the first namespace scope that it encounters.

If the compiler finds a declaration in an enclosing scope for a name with linkage and the same type as the object in block scope, then the object declared in block scope gets the same linkage as the name found by the lookup. If the compiler finds a declaration for an object with a different type or for something other than an object, the program is in error. Otherwise, the object at block scope gets external linkage. For example,
static int i = 0;   // internal
extern int j;       // external
     
void f()
    {
    extern int i;   // internal
    extern float j; // error
    extern int k;   // external
    ...
    }
Here, the declaration for i at block scope has an extern specifier, so the compiler looks in outer scopes for an object declaration for i with type int. The declaration it finds at file scope declares i with internal linkage, so the i at block scope also has internal linkage. The declaration for j at block scope is an error because it has a different type from j declared at file scope. k at block scope has external linkage because there is no declaration for k in an enclosing scope.

A function declared in block scope can have an extern specifier or no storage class specifier at all. (Again, it cannot have a static specifier.) However, its linkage does not depend on the presence or absence of the extern specifier. Rather, it depends on the linkage of a matching declaration in an enclosing scope. That is, the linkage of a function declared at block scope is determined by lookup rules similar to those that determine the linkage of an object declared extern at block scope.

If the compiler finds a declaration in an enclosing scope for a function with the same name and type as the function in block scope, then the function in block scope gets the same linkage as the name found by the lookup. If the compiler finds a declaration for something other than a function, the program is in error. Otherwise, the function in block scope gets external linkage. For example,
static int f();     // internal
extern int g;       // external
     
void f()
    {
    extern int f(); // internal
    int g();        // error
    ...
    }
Here, the declaration for f at block scope refers to the declaration for f at file scope, and therefore has internal linkage. The declaration for function g at block scope conflicts the declaration for object g at file scope, and so is an error.

Linkage in Class Scope

Classes declared in namespace scope have external linkage. What this really means is that their member functions have external linkage. Even static member functions and static data members have external linkage. This is also true for member allocation and deallocation functions (operators new and delete), which are static members even if not declared with the keyword static.

This is what I meant when I said that the keyword static has multiple personalities. When applied to names at namespace scope, the keyword static declares internal linkage. At block scope, it declares no linkage. And at class scope, it declares external linkage. o

Dan Saks is the president of Saks & Associates, which offers training and consulting in C++ and C. He is active in C++ standards, having served nearly seven years as secretary of the ANSI and ISO C++ standards committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906 USA, by phone at +1-937-324-3601, or electronically at dsaks@wittenberg.edu.