SecureProgramming.com

Truncating Data Carefully in C and C++
Category: Input Validation
Language: C, C++, and Objective-C
Posted by John Viega on Mon, Sep 08, 2003 (03:28 AM) GMT

Problem
Truncating data to avoid buffer overflows can introduce new problems. Additionally, one should watch out for situations where an attacker can truncate data in a way the program doesn't expect.

Solution
In general, be clear about the semantics you'd like to have, and be very careful that truncation doesn't thwart those semantics. In particular, make sure you completely understand the semantics of the calls that you use to truncate data, particularly strncat() and strncpy().

Discussion
It may seem counterintuitive, but all too often, security "fixes" introduce new security problems themselves. This happens regularly when using truncation as a solution for preventing buffer overflows.

There are plenty of common ways to mess up data truncation in C. First, data truncation generally requires the programmer to manage buffer lengths. It's easy to accidentally truncate to a size that is too large. This commonly happens when using a function like strncat(), which has the following signature:
char *strncat(char *s, const char *append, size_t count);
The final parameter specifies the maximum number of characters from append to use, but many programmers assume that it specifies the maximum size of s. Therefore, the following construct is not generally secure:
strncat(buf, somestr, sizeof(buf));
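In particular, it is susceptible to overflow any time buf already contains data and somestr is long enough. Even when buf is empty, a somestr of sizeof(buf) or more characters overflows the buffer by one byte, because strncat() always writes a terminating NUL after the characters it appends.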
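One way to get the count right is to compute the space remaining in the buffer before appending. The following is a minimal sketch, not a definitive implementation; the helper name append_truncating() is hypothetical, and it assumes the caller passes the true size of the destination buffer.

```c
#include <string.h>

/* Hypothetical helper: append src to the string already in dst without
 * writing past dst[dstsize - 1].
 * Returns 1 if all of src fit, 0 if it was truncated or dst was unusable. */
int append_truncating(char *dst, size_t dstsize, const char *src)
{
    /* Find the existing terminator without reading past the buffer. */
    const char *end = memchr(dst, '\0', dstsize);
    if (end == NULL)
        return 0;                       /* dst is not terminated within bounds */

    size_t used = (size_t)(end - dst);  /* characters already stored */
    size_t room = dstsize - used - 1;   /* space left, reserving one byte for NUL */

    /* The count limits how much of src is used, so pass the remaining room. */
    strncat(dst, src, room);
    return strlen(src) <= room;
}
```

With an 8-byte buffer holding "abc", appending "defghij" leaves "abcdefg" in the buffer and returns 0 so that the caller can tell truncation took place.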

Another common truncation problem is that some truncating calls do not guarantee NUL termination. In particular, strncpy() does not write a trailing zero byte when the source is too long for the buffer, instead filling up the buffer with as much data as possible. If subsequent calls assume that there's always a valid NUL terminator within the bounds of a buffer, disasters can happen. When using strncpy(), one can avoid this problem by always forcing the last character of the buffer to be a NUL, no matter what happens.
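The forced-termination idiom for strncpy() can be sketched as follows. The helper name copy_truncating() is hypothetical; the sketch assumes dstsize is the real size of the destination buffer.

```c
#include <string.h>

/* Hypothetical helper: copy src into dst, truncating if necessary, and
 * always leave dst NUL-terminated.
 * Returns 1 if all of src fit, 0 if it was truncated. */
int copy_truncating(char *dst, size_t dstsize, const char *src)
{
    if (dstsize == 0)
        return 0;                     /* no room even for a terminator */

    strncpy(dst, src, dstsize - 1);   /* may leave dst unterminated */
    dst[dstsize - 1] = '\0';          /* force termination no matter what */
    return strlen(src) < dstsize;
}
```

Copying "abcdef" into a 4-byte buffer leaves "abc" and returns 0.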

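On some platforms, the problems with the two previous calls can be alleviated by using strlcat() and strlcpy() instead. These calls are similar to the two discussed above, but have more intuitive semantics: the size argument is the full size of the destination buffer, and the result is always NUL-terminated when that size is nonzero. See Recipe 3.3 in the Secure Programming Cookbook for C and C++ for portable implementations of these calls.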
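For platforms that lack these calls, the semantics of strlcpy() can be approximated as in the sketch below. This is an illustration written for this discussion, not the Cookbook's implementation from Recipe 3.3, and the name sketch_strlcpy() is hypothetical.

```c
#include <string.h>

/* Sketch of strlcpy()-style semantics: size is the full size of dst, the
 * result is NUL-terminated whenever size is nonzero, and the return value
 * is strlen(src), so a result >= size means the copy was truncated. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = (srclen < size - 1) ? srclen : size - 1;
        memcpy(dst, src, n);   /* copy only what fits */
        dst[n] = '\0';         /* always terminate */
    }
    return srclen;
}
```

A caller can detect truncation by comparing the return value against the buffer size, for example: if (sketch_strlcpy(buf, input, sizeof(buf)) >= sizeof(buf)), the input did not fit.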

Sometimes the attacker can force truncation of important data. For example, a bad but common practice is to allow people to write to any file on the file system using a particular application-specific extension. Consider what might happen in the following case:
snprintf(buf, sizeof(buf), "/%s/%s.spc", path, filename);
Even if the attacker controls only the filename variable, she can still write to any file under the path simply by adding enough forward slashes to the beginning of the filename to force snprintf() to truncate the last four characters, that is, the ".spc" extension (note that "/etc//////password" refers to the same file as "/etc/password" on any Unix system). Of course, performing path canonicalization can help solve this problem in particular instances, but it's generally a better idea to truncate the path to some length smaller than the buffer size and then write in the file extension at the end.
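The approach of truncating the path to a shorter length and then writing the extension might look like the sketch below. The function name build_spc_path() and the SPC_EXT macro are hypothetical; the sketch also reports truncation so the caller may choose to reject the request outright.

```c
#include <stdio.h>
#include <string.h>

#define SPC_EXT ".spc"   /* hypothetical name for the application's extension */

/* Build "/<path>/<filename>.spc" in buf so that the extension is always
 * present, truncating only the part before it.
 * Returns 0 if everything fit, 1 if the stem was truncated, -1 on error. */
int build_spc_path(char *buf, size_t bufsize,
                   const char *path, const char *filename)
{
    size_t extlen = strlen(SPC_EXT);
    if (bufsize < extlen + 1)
        return -1;                       /* cannot hold even the extension */

    /* Let snprintf() use only the space not reserved for the extension. */
    size_t stemsize = bufsize - extlen;
    int n = snprintf(buf, stemsize, "/%s/%s", path, filename);
    if (n < 0)
        return -1;

    /* Safe: strlen(buf) <= stemsize - 1, so extension plus NUL still fit. */
    strcat(buf, SPC_EXT);
    return (size_t)n >= stemsize;
}
```

With a 16-byte buffer, a path of "etc", and a filename padded with many leading slashes, the result is cut short but still ends in ".spc", and the function returns 1.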

A similar problem occurs when an untrusted user can introduce a NUL terminator (or sometimes a newline or carriage return) into input where it isn't expected. This is often possible when the on-the-wire protocol encodes strings as a length followed by data of the requisite length, because C string functions then stop at the embedded NUL while the protocol layer still treats the whole field as valid.
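A length-prefixed field can be checked for such characters before it is ever treated as a C string. The following is a minimal sketch; the function name field_is_clean() is hypothetical, and which characters to reject is an assumption that depends on the protocol.

```c
#include <stddef.h>

/* Hypothetical check: returns 1 if the len bytes at data contain no NUL,
 * carriage return, or line feed; returns 0 otherwise. */
int field_is_clean(const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] == '\0' || data[i] == '\r' || data[i] == '\n')
            return 0;   /* an embedded terminator would truncate the string */
    }
    return 1;
}
```

A five-byte field containing "ab", a zero byte, and "cd" is rejected, whereas strlen() on the same bytes would silently report a length of 2.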

Another similar scenario that an attacker can subvert occurs when one is parsing an Internet protocol that uses CRLF separators, and calculates the length of the string by counting characters to the first carriage return (\r) and then later treats the line feed (\n) as the end of the line (perhaps for printing purposes). The problem here is that, if one doesn't do adequate checking, the attacker can insert arbitrary data between the carriage return and the line feed.
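One way to perform the needed check is to insist that the first carriage return is immediately followed by a line feed, so that both parts of the program agree on where the line ends. The sketch below assumes the input is a buffer with a known length; the function name crlf_line_length() is hypothetical.

```c
#include <stddef.h>

/* Hypothetical parser step: returns the length of the first line in buf,
 * excluding the CRLF, or -1 if the line is not terminated by a CR that is
 * immediately followed by an LF. */
long crlf_line_length(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n')
            return -1;                  /* bare LF before any CR */
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n')
                return (long)i;         /* well-formed CRLF */
            return -1;                  /* data smuggled after the CR */
        }
    }
    return -1;                          /* no terminator found */
}
```

Rejecting a malformed line outright, rather than trying to repair it, leaves no gap between the carriage return and the line feed for an attacker to fill.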

