Questions & Answers

Pete Becker

Hiding Passwords in C

Pete reveals how not to send a character when it's typed, and how not to delete elements of an array.

To ask Pete a question about C or C++, send e-mail to petebecker@acm.org, use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

Q

I'm developing a UNIX C program with embedded SQL to report database activity. I am having a problem hiding, or replacing with an '*', the keystrokes when the user types in his/her Sybase password. Do you have any ideas?

— Leon Thornton

A

This is a fairly common question, and unfortunately, neither C nor C++ has a good answer for it. Both languages are designed to be usable on a broad variety of systems, some of which may not be capable of hiding keystrokes. There are non-portable solutions, though; as usual, be sure that you isolate such non-portable code so that it can be easily replaced if you need to move your application to a different system.

Many C and C++ compilers provide a function named getpass that will do exactly what you need. Here's Borland's description of getpass, from the Borland C++ 5.0 Programmer's Guide:

Syntax:
#include <conio.h>
  char *getpass(const char *prompt);
Description:
Reads a password.

getpass reads a password from the system console after prompting with the null-terminated string prompt and disabling the echo. A pointer is returned to a null-terminated string of up to eight characters (not counting the null-terminator).

Note: Do not use this function for Win32s or Win32 GUI applications.

Return Value: The return value is a pointer to a static string which is overwritten with each call.

The Borland documentation indicates that this function is also available on many UNIX compilers.

Here's a short example of how to use getpass:
#include <conio.h>
#include <stdio.h>
#include <string.h>
     
int main()
{
     char password[9];
     strcpy( password, getpass("Password: ") );
     printf( "\nYou typed '%s'\n", password );
     return 0;
}
If you compile and run this program, you'll see that it doesn't echo the characters that you type in. That approach seems to have gone out of fashion these days, so if you prefer to show an asterisk for each character as it's typed in, you need to write your own function. This requires handling each character as it comes in, saving the character in an internal array, and echoing the asterisk:
char *getpass1( const char *prompt )
{
     static char pass[9];
     char *current = pass;
     int ch;
     printf( "%s", prompt );
     for(;;)
          {
          ch = getch();
          if( ch == '\r' )
               break;
          *current++ = (char)ch;
          putch('*');
          if( current == pass+8)
               break;
          }
     *current = '\0';
     return pass;
}
There are two non-portable constructs in this code. First, there's the function getch, which reads a character from the keyboard but doesn't echo it. That's critical to making our password function work. Second, there's the comparison of the input character with '\r' in order to terminate the loop. Despite its appearance, this loop actually exits when the user hits ENTER. That's because getch is reading raw input from the keyboard, and doesn't translate it into the usual C conventions. On a PC, ENTER generates a keystroke with the same value as '\r'.

You may have also noticed that if you press backspace while typing in your password it shows up as another asterisk and counts as one of the input characters. In fact, you can type eight backspaces, and getpass1 will happily return them all as the password. That's because getpass1 handles backspaces in exactly the same way as all other characters: it simply stores them in the password buffer. However, when you use printf to display the resulting password as a string, it looks like the backspaces have done their job: the character before the backspace does not appear in the output. A little more digging shows that this deletion happens at output time: printf sends each backspace to the console, which backs up over the previous character. The backspaces are still in the password itself. To see this, add the following to the main function above (with getpass1 in place of getpass), immediately after the call to printf:
i = 0;
while( password[i] != '\0' )
     printf( " %02X", (unsigned char)password[i++] );
printf( "\n" );
You'll also have to add a definition of i at the top of the function. Once you've added this code, try typing a password that contains a backspace or two.

Handling backspaces is straightforward. Let's rewrite getpass1 to fix this problem and to make sure that we handle all non-text characters correctly. getpass2 appears in Listing 1.
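
Listing 1 is not reproduced here. As a stand-in, the following is a minimal sketch of the kind of loop the text describes; it is not the column's getpass2. The names read_password, KeySource, and EchoSink are hypothetical, and the sketch assumes that ENTER arrives as '\r' or '\n' and backspace as '\b' or 0x7F. The key source and the echo routine are passed in as function pointers, so that getch and putch (or their replacements on another system) can be plugged in without touching the loop.

```cpp
#include <cstddef>

// Hypothetical helper types: a function that returns the next raw key
// (or a negative value at end of input) and a function that echoes one character.
typedef int  (*KeySource)();
typedef void (*EchoSink)(int);

// Reads at most capacity-1 characters into buf, echoing '*' for each one stored.
// Assumes ENTER arrives as '\r' or '\n' and backspace as '\b' or 0x7F.
// Returns the number of characters stored (not counting the terminator).
std::size_t read_password(char *buf, std::size_t capacity,
                          KeySource next_key, EchoSink echo)
{
    if (capacity == 0)
        return 0;
    std::size_t len = 0;
    for (;;) {
        int ch = next_key();
        if (ch < 0 || ch == '\r' || ch == '\n')
            break;                              // end of input or ENTER
        if (ch == '\b' || ch == 0x7F) {         // backspace: drop the last character
            if (len > 0) {
                --len;
                echo('\b'); echo(' '); echo('\b');  // erase the asterisk on screen
            }
            continue;
        }
        if (ch < 0x20 || ch > 0x7E)
            continue;                           // ignore other non-text keys
        if (len + 1 < capacity) {               // keep room for the terminator
            buf[len++] = static_cast<char>(ch);
            echo('*');
        }
    }
    buf[len] = '\0';
    return len;
}
```

Unlike getpass1, this sketch does not return as soon as the buffer fills; it ignores further text keys and waits for ENTER, so a backspace still works at the limit. Extended keys that arrive as two-byte sequences are not handled.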

Now we have a robust function that requests a password from the user. Remember, though, that it's not portable. If you move this code to a different system you'll need to replace getch with code that gets a single character and doesn't echo it; and you'll need to check what character code is produced by the ENTER key. Neither task is hard to do, but if you forget them your code won't work.
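
Since the question came from a UNIX system, here is a sketch of what such a getch replacement might look like there. It assumes a POSIX system with the termios interface; the function name read_key_no_echo and its file-descriptor parameter are hypothetical. Treat it as an illustration under those assumptions rather than a drop-in routine.

```cpp
#include <termios.h>
#include <unistd.h>

// Hypothetical stand-in for getch on a POSIX terminal.
// Reads one key from fd with echo and line buffering switched off.
// Returns the key as an unsigned char value, or -1 on error or end of input.
int read_key_no_echo(int fd)
{
    struct termios saved;
    if (tcgetattr(fd, &saved) != 0)
        return -1;                              // fd is not a terminal

    struct termios raw = saved;
    raw.c_lflag &= ~static_cast<tcflag_t>(ECHO | ICANON);  // no echo, no line buffering
    raw.c_cc[VMIN]  = 1;                        // block until one byte arrives
    raw.c_cc[VTIME] = 0;                        // no read timeout
    if (tcsetattr(fd, TCSANOW, &raw) != 0)
        return -1;

    unsigned char ch = 0;
    ssize_t got = read(fd, &ch, 1);

    tcsetattr(fd, TCSANOW, &saved);             // restore the previous settings
    return got == 1 ? ch : -1;
}
```

Called as read_key_no_echo(STDIN_FILENO), this returns one keystroke at a time. With typical terminal settings the ENTER key is delivered as '\n' rather than '\r', which is exactly the kind of difference the loop's termination test has to account for.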

Q

We've been self-teaching C++ for about 1 year. We use Borland C++ (ver. 3.1 and 4.5). So far we haven't had any serious problems. But recently we wrote the following code:
#include<iostream.h>
class A
{
 public:
          A() { cout << "A::A()\n";}
          virtual ~A() { cout << "A::~A()\n";}
          int i;
};
class B : public A
{
 public:
          B() : A() { cout << "B::B()\n";}
          virtual ~B()  { cout << "B::~B()\n";}
          int j;
};
int main(void)
{
  A* tab = new B[3];
  delete[] tab;
}
After running this code we saw:
A::A()
B::B()
A::A()
B::B()
A::A()
B::B()
A::~A()
A::~A()
A::~A()
We haven't found any reasons why it had worked in that way. Why hasn't the destructor of class B been called? Is it a compiler bug or are we doing anything wrong? Any suggestions would be appreciated.

— Janusz Perek & Adam Orcholski

A

It's not a compiler bug. You're trying to do something that the language does not support.

Arrays often confuse programmers who are new to C. C++ adds polymorphism to the picture, and this makes arrays even more confusing, because people expect arrays to somehow act polymorphically. The fact is that arrays in C++ act almost exactly the same as they do in C. The designers of C++ decided not to try to fix the problems that arrays have in C, but to encourage the use of classes to provide the features that people want in arrays.

The usual explanation of the fundamental problem with arrays is that they are not first class objects. This term has no technical meaning in the C++ language — if you read the current ANSI/ISO working paper you will not find the term "first class object" anywhere. However, this term probably gives you a sense of the general problem: arrays do not have the properties of C++ classes.

The problem that you're seeing arises because there is no distinct type for a pointer to an element of an array. When you use the name of an array you get a pointer to its first element. This is simply a pointer to an object, and the compiler does not know that this object happens to be part of an array. So in this code fragment:
B *bptr = new B;
B *arrptr = new B[3];
the two pointers, bptr and arrptr, are treated in exactly the same way by the compiler. The compiler does not know that one points to the first element of an array and that the other points to a single object. All it knows is that each pointer points to an object of type B. It's up to you to know that arrptr in fact points to the first element of an array, and that the expression arrptr[2] is meaningful; but that bptr points to a single object and that the expression bptr[2] is meaningless. The compiler will allow you to use either of these expressions in your program. The first will behave sensibly. The second will not. Further, the compiler also does not know how to delete these things properly. That's why we have to know to use array delete when we have a pointer to an array and to not use it when we don't.
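
To make this concrete, the sketch below checks at compile time that the two forms of new produce the same pointer type, and then pairs each allocation with its matching form of delete. The class Widget and the function matching_deletes are hypothetical names used only for illustration.

```cpp
#include <type_traits>

struct Widget { int value; };   // hypothetical class standing in for B

// Both kinds of new-expression yield exactly the same pointer type,
// so nothing in the type records whether an array is behind the pointer.
static_assert(std::is_same<decltype(new Widget), decltype(new Widget[3])>::value,
              "single-object new and array new both produce Widget*");

void matching_deletes()
{
    Widget *single = new Widget;
    Widget *array  = new Widget[3];

    array[2].value = 7;         // meaningful: there really are three elements
    // single[2].value = 7;     // would also compile, but writes past a lone object

    delete single;              // scalar delete for scalar new
    delete[] array;             // array delete for array new
}
```

Only the programmer's bookkeeping distinguishes the two pointers; the compiler accepts either form of delete on either one.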

When you add polymorphism into the mix the opportunities for mistakes multiply. The expression new B[3] returns a pointer to the first element of the newly allocated array. Since that pointer is simply a pointer to an object of type B, it can be converted into a pointer to its base class, A, without requiring a cast. That is, it's perfectly valid to assign this value to a pointer to A:
A *aptr = new B[3];
The compiler doesn't know that aptr points to an array of B's. All that it knows is that aptr points to an object of type A, and it will generate code accordingly.

This means that you can use aptr in any context in which you could use a pointer to an ordinary A object and expect aptr to work correctly. In particular, you can call virtual functions through it and they will behave normally. However, since the rules involving arrays have not been changed from C, pointer arithmetic with aptr simply will not work. All the compiler knows about aptr is that it points to an object of type A. In an expression such as aptr+1 the compiler does the arithmetic by computing the address of the next object of type A that would follow the object pointed to by aptr. That is, it increases the value of aptr by the size of an object of type A, just as in C. Since aptr in fact points to an object of type B, unless B and A happen to be exactly the same size, this pointer arithmetic will not produce a useful pointer. The compiler will produce a pointer that points into the middle of the first object instead of pointing to the beginning of the second object in the array, and any use of this pointer will probably be disastrous.
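
The mismatch can be seen without performing the bad arithmetic at all, simply by comparing the two strides. The sketch below uses hypothetical classes Base and Derived; Derived is deliberately padded with an array so that it is certain to be larger than Base (with the A and B from the letter, some compilers may fit j into A's padding and make the two sizes coincide).

```cpp
#include <cstddef>

struct Base    { virtual ~Base() {} int i; };
struct Derived : Base { int extra[8]; };    // deliberately much larger than Base

// Distance in bytes between element 0 and element 1 of a real Derived array.
std::ptrdiff_t actual_stride()
{
    Derived arr[2];
    return reinterpret_cast<char *>(&arr[1]) - reinterpret_cast<char *>(&arr[0]);
}

// Distance the compiler steps for "pointer to Base, plus one": always sizeof(Base).
std::size_t assumed_stride()
{
    return sizeof(Base);
}
```

Whenever assumed_stride() is smaller than actual_stride(), a Base pointer advanced by one lands inside the first Derived element rather than at the start of the second.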

It would have been possible to require the compiler to get this pointer arithmetic right. In fact, several people have made suggestions that this be done. There's a straightforward mechanism for fixing this problem; it would work like this: the compiler would not automatically use the size of A when doing pointer arithmetic. Instead, whenever a class had virtual functions the compiler would add a virtual function that returned the actual size of the object, and it would call this function when deciding how to increment the pointer. This method would make pointer arithmetic work correctly for arrays of polymorphic types. However, this would make pointer arithmetic considerably slower than necessary in the common case when a pointer points to an array of objects of its actual type. That is, in the expression
B *arrptr = new B[3];
the code generated by the C approach works just fine, and the added overhead of a virtual call to get the size of the object that arrptr points to would be entirely unnecessary.
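
The cost of that proposal can be sketched by hand. The version below swaps in a slightly different mechanism from the one described: instead of a hidden virtual function that returns a size, each class overrides an explicit virtual next() that does the step itself, so the arithmetic is always done on a pointer of the right type. The names Shape, Circle, next, and sum_tags are hypothetical.

```cpp
struct Shape {
    virtual ~Shape() {}
    virtual Shape *next() { return this + 1; }      // steps by sizeof(Shape)
    int tag = 0;
};

struct Circle : Shape {
    double radius[4] = {};                          // makes Circle larger than Shape
    Shape *next() override { return this + 1; }     // 'this' is Circle*: steps by sizeof(Circle)
};

// Adds up the tags of 'count' array elements reached through a base pointer.
// Every step costs a virtual call, which is the overhead the text describes.
int sum_tags(Shape *first, int count)
{
    int total = 0;
    Shape *p = first;
    for (int n = 0; n < count; ++n) {
        total += p->tag;
        if (n + 1 < count)
            p = p->next();                          // dynamic stride, not sizeof(Shape)
    }
    return total;
}
```

This walks an array of Circle correctly through a Shape pointer, but an array of plain Shape objects now pays for a virtual call on every step as well, even though ordinary pointer arithmetic would have been correct there.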

In your example, at the point where the delete operator is invoked, the compiler knows that it has a pointer to an A object. What you've seen is that the compiler simply calls A's destructor, despite the fact that this destructor is virtual. That's permissible, because the language definition says that the effect of your delete expression is undefined. Here's a paragraph from clause 5.3.5 of the ANSI/ISO working paper:

In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand's dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.

The last sentence is the key here. Since you allocated an array of B's, you must invoke delete with a B*. If you do anything else the behavior is undefined. That is, anything at all can happen.
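
For comparison, here is a sketch of the well-defined version of the program from the letter, with the array held and deleted through a B*. The counters a_dtor_calls and b_dtor_calls and the function allocate_and_release are hypothetical additions that replace the cout tracing, so the result can be checked rather than read off the screen.

```cpp
static int a_dtor_calls = 0;    // hypothetical counters, replacing the cout tracing
static int b_dtor_calls = 0;

class A {
public:
    virtual ~A() { ++a_dtor_calls; }
    int i;
};

class B : public A {
public:
    ~B() override { ++b_dtor_calls; }
    int j;
};

void allocate_and_release()
{
    B *tab = new B[3];  // static type matches the dynamic type of the elements
    delete[] tab;       // runs ~B and then ~A for each of the three elements
}
```

After one call to allocate_and_release, both counters should read 3. If the code needs to treat the objects as A's in between, it can pass individual elements by pointer or reference to A; it is only the array delete that has to see the original type.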

Once again, the language definition could have required that this work the way you expect, and it is possible to implement that behavior. Such an implementation would make all array deletes of polymorphic types a bit slower and more complicated, but it could be done. However, that still doesn't address all of the problems that arrays present. Rather than add a couple of ad hoc solutions to particular problems, the decision was made to leave arrays alone. Don't expect arrays to behave correctly in contexts other than ordinary C usage.

Pete Becker is a consultant who specializes in teaching C and C++ to experienced programmers. He spent eight years in the C++ group at Borland International, as both a developer and a manager. He is a member of the ANSI/ISO C++ standardization committee. He can be reached by e-mail at petebecker@acm.org.