Previous: 11.2.1 String I/O: gets() and puts()
Up: 11.2 Library String Functions
Next: 11.2.3 String Operations: strcmp() and strcat()

11.2.2 String Manipulation: strlen() and strcpy()

As our next task, let us consider reading lines of text and finding the longest line in the input:

STRSAVE: Read text lines until end of file; save the longest line and print it.

Our approach is similar to the algorithm for finding the largest integer in a list of integers. We save the current "guess" at the longest line in a string, and, as each new line is read, we compare the length of the new line with that of the current longest line. If the new line is longer than the current longest, we save the new line into the longest and proceed. To begin, we initialize the longest line to an empty string, the shortest of all strings. Here is the algorithm:

     initialize longest to an empty string
     while not EOF, read a line
          if length of new line > length of current longest
               save new line into longest
     print longest

To implement this algorithm, we must consider how we can perform the required operations on the strings holding the new line and the current longest line. We already know how to read and write strings; we also need the operations of finding the length of a string and saving a string. For the former task, the standard library provides a function:

int strlen(STRING s);

which returns the length of a string, s, i.e. the number of characters in s excluding the terminating NULL (the null character, '\0').

For the second operation, we can consider the implementation of the maximum integer algorithm and how we saved the new maximum value - we used an assignment operator. However, this will not work for strings. Remember, the string is implemented as a character pointer. If we simply assigned one string variable to another, we would only be saving the pointer to the first string, not the string characters themselves. Then, when we read the next input line, we would overwrite the current string as well. Instead we need to copy the new line string into the current longest string. The standard library provides a function for this operation:

STRING strcpy(STRING dest, STRING source);

which copies a string pointed to by source into a location pointed to by dest. The function returns the destination pointer, dest. This is the equivalent of an assignment operation for data type, STRING.
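The difference between assigning a pointer and copying characters can be seen in a small sketch. The function name copy_survives_overwrite() and the buffer size SIZE are assumptions made for illustration; they do not come from the text.

```c
#include <string.h>

#define SIZE 100   /* assumed buffer size, not from the text */

/* Illustrative helper: returns 1 if the strcpy() copy keeps the old
   line after the input buffer changes, while the assigned pointer
   simply follows the buffer. */
int copy_survives_overwrite(void)
{
    char line[SIZE];
    char saved[SIZE];
    char *alias;

    strcpy(line, "first line");
    alias = line;            /* assignment: saves only the pointer */
    strcpy(saved, line);     /* strcpy: saves the characters themselves */

    strcpy(line, "second");  /* stands in for reading the next input line */

    /* alias now sees "second"; saved still holds the 10 characters
       of "first line" */
    return strcmp(alias, "second") == 0 &&
           strcmp(saved, "first line") == 0 &&
           strlen(saved) == 10;
}
```

The pointer alias and the array line name the same memory, so overwriting line changes what alias refers to; saved is a separate array and is unaffected.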

The prototypes for these and other standard library string functions are in a header file, string.h. We can now write the program implementing our algorithm as shown in Figure 11.3.

Notice, we initialize the current longest string by using strcpy() to copy an empty string into longest. It is also possible to initialize it as follows:

*longest = '\0';
or,
longest[0] = '\0';
Use of strcpy() makes it clear that an empty string is copied into longest. It has the flavor of assigning a string constant to another string, the same way longest is updated to the new string, s, within the loop body. Thus, we are sticking with our concept of an abstract data type by only using the defined functions to perform operations on data of the type, STRING. A sample session is shown below:

Remember that assignments cannot be used to store strings into arrays. When a string is to be stored into a specified character array, use strcpy() to copy one string to another; do NOT use an assignment operator.
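The algorithm above might be sketched as follows. This is not the program of Figure 11.3; it is a minimal version under stated assumptions: the function name save_longest(), the limit MAXLINE, and the use of fgets() in place of gets() (which has been removed from the current C standard) are all choices made for this sketch.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 256   /* assumed maximum line length */

/* Reads lines from in until end of file and leaves the longest one
   in longest, which must have room for MAXLINE characters. */
void save_longest(FILE *in, char *longest)
{
    char s[MAXLINE];

    strcpy(longest, "");                 /* start with the empty string */
    while (fgets(s, MAXLINE, in) != NULL) {
        s[strcspn(s, "\n")] = '\0';      /* drop the newline fgets() keeps */
        if (strlen(s) > strlen(longest))
            strcpy(longest, s);          /* copy the characters, do not assign */
    }
}
```

A caller could declare char longest[MAXLINE], call save_longest(stdin, longest), and print the result with puts(longest). Input lines longer than MAXLINE - 1 characters are read by fgets() in pieces, so this sketch treats each piece as a separate line.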

Implementing strcpy()

The standard library provides the function strcpy() for us to use; however, it is instructive to look at how such a function can be written. Let us write our version of strcpy() to copy string, t, into string, s:

/* File: str.c */
     /* Function copies t into s */

#include "strtype.h"

     STRING our_strcpy(STRING s, STRING t)
     {
         while (*t != '\0') {
              *s = *t;
              s++;
              t++;
         }
         *s = '\0';
         return s;
     }

The arguments passed to formal parameters, s and t, are of type STRING, i.e. character pointers. The loop is executed as long as *t is not NULL. In each iteration, a character is copied into (the string pointed to by) s from (the string pointed to by) t by the assignment of *t to *s. The pointers s and t are then incremented so they point to the next character positions in the two arrays. If t does not point to a NULL, the loop repeats and copies the next character, etc. If t points to a NULL, the loop terminates. After the loop terminates, a terminating NULL is appended to s. The function returns the pointer, s.

Notice, there is a problem with this implementation. The function returns the value of s; however, this is no longer a pointer to the start of the destination string. As the string was copied, s was incremented, and it now points to the terminating NULL at the end of the destination string. We leave the repair of this function as an exercise (see Problem 11).
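For readers who want to check their answer, one possible repair is sketched below. The name our_strcpy_fixed() and the local variable start are assumptions of this sketch, and the typedef is assumed to match the definition of STRING in strtype.h.

```c
typedef char *STRING;   /* assumed to match the definition in strtype.h */

/* Copies t into s and returns a pointer to the start of s. */
STRING our_strcpy_fixed(STRING s, STRING t)
{
    STRING start = s;   /* remember where the destination begins */

    while (*t != '\0') {
        *s = *t;
        s++;
        t++;
    }
    *s = '\0';          /* append the terminating null character */
    return start;       /* not s, which now points at the terminator */
}
```

Saving the original pointer before the loop is only one approach; indexing with an integer, as in the array version later in this section, leaves s unchanged and avoids the problem as well.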

Several alternate versions of our_strcpy() can be written as follows (Note: these versions return void rather than a STRING):

/* File: str.c - continued */
     void our_strcpy2(STRING s, STRING t)
     {
         while ((*s = *t) != '\0') {
              s++;
              t++;
         }
     }
In the above, the while condition uses the assignment expression whose value is the character assigned to check against NULL. If the value is NULL, the loop is terminated; however, the assignment places the terminating NULL character before the loop is terminated. Here is another variation:
/* File: str.c - continued */
     void our_strcpy3(STRING s, STRING t)
     {
         while (*s = *t) {
              s++;
              t++;
         }
     }
In the while loop, when the assigned character is '\0', the value of the expression is zero, and therefore false. Otherwise, the character assigned is not NULL, and the value of the expression is true. The loop terminates correctly when it should. It is also possible to include increments in the while expression:
while (*s++ = *t++)
         ;
Here, *t is assigned to *s, and then s and t are incremented. The next version uses array indexing; otherwise, it is identical to the last version:
/* File: str.c - continued */
     void our_strcpy4(STRING s, STRING t)
     {
         int i;

         i = 0;
         while (s[i] = t[i]) {
              i++;
         }
     }

Memory Allocation for Strings

When a function is used to put values into an array, it is important that memory for the array be allocated by the calling function. Consider the following possible error:

/* COMMON BUG */
    char *s;            /* should be: char s[SIZE]; */

strcpy(s, "Hello, good morning to all");

The pointer variable, s, can store only a pointer value; no memory is allocated for a string of characters. Nor is the pointer variable s initialized. The function, strcpy(), assumes that s points to memory where a string can be stored. No such memory has been allocated, nor does s point to any valid location, so the behavior is undefined: the program will most likely crash or silently corrupt other data.

A second type of error can occur if the calling function does not allocate memory for a string, but instead depends on the called function to do so. Let us consider an example in which a string copy function allocates memory for the copied string and returns a pointer to it, and see where the error leads us. Here is the function:

/* File: allocerr.c */
     #include <stdio.h>
     #include "strtype.h"

     /* COMMON ERROR */
     STRING scopy(STRING t)
     {
         char s[100];
         int i = 0;

         while (s[i] = t[i])
              i++;

         return s;
     }

The function copies a string into an (automatic) array variable defined in the function, and returns a pointer to the array. When the function returns to the calling function, the memory for the array, s, is freed automatically. The value of s is returned, but s now points to garbage. Of course, the compiler does not flag an error, since the value of s can be legitimately returned. The fact that it now points to garbage is a program logic error.

Let us see what happens when we use this function in a program. We declare a STRING variable, p, which is assigned the value of the pointer returned by the above function, scopy().

/* File: allocerr.c - continued */
     /* PROGRAM BUG */
     main()
     {
         STRING p, scopy(STRING t);

         p = scopy("hello");
         puts(p);
     }

The function, scopy(), returns a pointer to an array which has already been freed for other uses. The now freed memory, previously holding the array, must be assumed to have garbage value. The pointer to this garbage is assigned to p. The function, puts(), assumes p is a valid string and will print whatever garbage p points to, not the original meaningful string. Without a clear understanding, the above type of error is hard to pinpoint. The freed memory holding the array may or may not be immediately used for other purposes; thus, sometimes, puts() in the above example may print a (partly) meaningful string. At other times, it will print out all garbage.

The straightforward solution is to declare all the needed arrays in the calling function, main(), and pass them as arguments to the called functions. The called functions can then put strings in these arrays, and the calling function, main(), can later use these strings without any problem. The correct structure is as follows:

...
void scopy (STRING s, STRING t);
main()
{
    char s[SIZE], t[SIZE];

    scopy(s, t);
    ...
}
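A fuller sketch of this structure is given below. The body of scopy() and the helper show_copy() are assumptions written for illustration, and the typedef is assumed to match strtype.h. The sketch does no bounds checking, so the caller must make sure the destination array is large enough.

```c
#include <stdio.h>

#define SIZE 100        /* assumed array size */
typedef char *STRING;   /* assumed to match the definition in strtype.h */

/* Copies t into s; the memory for s is supplied by the caller. */
void scopy(STRING s, STRING t)
{
    int i = 0;

    while ((s[i] = t[i]) != '\0')
        i++;
}

/* Hypothetical caller: the array lives here, so it is still valid
   after scopy() returns. */
void show_copy(void)
{
    char s[SIZE];

    scopy(s, "hello");
    puts(s);
}
```

Because s is an array in the calling function, it remains allocated for as long as that function is active, and the string placed in it by scopy() can be used safely.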

Using String Functions with Substrings

The function, strcpy(), is given two character pointers, one to the destination array and one to the source string. These pointers may point to any character position within an array which corresponds to a substring beginning at that position, continuing to the next NULL in the array. We can call our string functions with arguments that are substrings of other strings. For example, we can copy a substring of t into any location in s:

/*  File: partstr.c
    Program shows overwriting part of a string with part of another.
*/
#include <stdio.h>
#include <string.h>
#define SIZE 100

main()
{
    char s[SIZE], t[SIZE];

    printf("***Partial Strings***\n");
    strcpy(s, "This can be trouble");
    strcpy(t, "Insert string");

    printf("Old s: ");
    puts(s);
    printf("Old t: ");
    puts(t);

    strcpy(s + 3, t + 5);
    printf("New s: ");
    puts(s);
}

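Sample Session:

***Partial Strings***
Old s: This can be trouble
Old t: Insert string
New s: Thit string

The program copies the substring starting at t + 5 ("t string") into the location pointed to by s + 3, giving "Thit string". The copy ends with a terminating NULL; any characters remaining in the array s after that first NULL are not part of the string.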
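The point about characters left over after the first NULL can be checked with a short sketch. The function name overwrite_part() is hypothetical; it repeats the copies made by partstr.c on an array supplied by the caller.

```c
#include <string.h>

#define SIZE 100

/* Performs the same copies as partstr.c on the caller's array s,
   which must hold at least SIZE characters. */
void overwrite_part(char *s)
{
    char t[SIZE];

    strcpy(s, "This can be trouble");
    strcpy(t, "Insert string");
    strcpy(s + 3, t + 5);   /* s now reads "Thit string" */
}
```

After the call, s[11] holds the new terminating null character, while the characters of "trouble" from the original contents are still stored from s + 12 onward. They are in the array but are no longer part of the string s.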

We can even use strcpy() to copy part of a string to a different location in the string itself. As always, we must be sure that we are dealing with NULL terminated strings and must also take care that the copy process does not overwrite useful data. For example, here is a loop that eliminates leading white space from a string, s:

strcpy(s, "    Aloha");
while (isspace(*s))
    strcpy(s, s + 1);
The function, isspace(), is a library routine, declared in ctype.h, that returns a nonzero (true) value if its argument is a white space character. (We have indicated white space explicitly by a *). The loop is executed as long as *s, the first character of s, is white space. In the loop, the string starting at s + 1 is copied into s, character by character. Each time the loop is executed, one leading white space is removed from s. Here are the successive strings starting with the original (again we use white space indicator *).

****Aloha
***Aloha
**Aloha
*Aloha
Aloha

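When a string is copied into itself in this way, the direction matters. With a simple character-by-character copy that moves forward, such as our_strcpy(), a destination index less than the source index is safe: each character is read before its position is overwritten. If the destination index is greater than the source index, source characters are overwritten before they have been copied. Note also that the C standard leaves the behavior of the library strcpy() undefined whenever source and destination overlap, so neither case is portable; the library function memmove() is defined for overlapping regions. For example: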

strcpy(s, "abcdef");
strcpy(s+1, s);
With a simple forward copy, the second call copies s[0], i.e. 'a', into s[1]; then copies s[1] (now 'a') into s[2]; then copies s[2] into s[3]; etc. Every element of s is overwritten with 'a', including the terminating NULL, so the copy never finds a NULL to stop at and runs on past the end of the string: a logic error.
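Since source and destination overlap in these self-copies, a portable alternative is memmove(), which is defined for overlapping regions. The sketch below removes leading white space with a single move; the function name trim_leading() is an assumption of this sketch, not a library routine.

```c
#include <ctype.h>
#include <string.h>

/* Removes leading white space from s in place. */
void trim_leading(char *s)
{
    size_t n = 0;

    /* count the leading white space characters; the cast keeps the
       argument of isspace() in the range it requires */
    while (isspace((unsigned char) s[n]))
        n++;

    if (n > 0)
        memmove(s, s + n, strlen(s + n) + 1);   /* +1 moves the '\0' too */
}
```

Counting first and moving once also avoids copying the whole remaining string for every white space character removed.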

Next, let us consider moving the NULL position. Since the first NULL terminates a string, we can move the NULL to squeeze out unneeded trailing characters. Here is a loop that eliminates trailing white space:

while (strlen(s) > 0 && isspace(s[strlen(s) - 1]))
         s[strlen(s) - 1] = '\0';
Starting with the original, successive strings are shown below with an explicit terminating NULL (again, we use a * as a white space indicator):
Aloha****\0
Aloha***\0
Aloha**\0
Aloha*\0
Aloha\0
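The same idea can be packaged as a function. The following is a minimal sketch; the name trim_trailing() is hypothetical, and the function computes the length once instead of calling strlen() on every pass.

```c
#include <ctype.h>
#include <string.h>

/* Removes trailing white space from s by moving the terminator left. */
void trim_trailing(char *s)
{
    size_t len = strlen(s);

    /* the len > 0 test guards against an empty or all-blank string */
    while (len > 0 && isspace((unsigned char) s[len - 1]))
        len--;

    s[len] = '\0';   /* the first null character now ends the string here */
}
```

The characters beyond the new terminator remain in the array but, as before, are not part of the string.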




tep@wiliki.eng.hawaii.edu
Sat Sep 3 07:04:57 HST 1994