Lab: A Review of C

CSC 213 - Operating Systems and Parallel Algorithms - Weinman



Summary:
You will review some important facts about C pointers, strings, I/O, and memory management.
Assigned:
Tuesday 4 September
Due:
10:30 PM Monday 10 September
Objectives:
 
Collaboration:
You will complete this lab in teams as assigned by the instructor. Review the syllabus for guidelines on lab-related discussions with non-partners.
Resources:
 

Introduction

The purpose of this assignment is to help you create, debug, and extend C programs that run within a shell environment and use basic I/O and string manipulation functions.
The scenario: Your well-intentioned-but-inexperienced pair-programming buddy has just written some code for the first assignment. Unfortunately, they dropped the class/were abducted by aliens, and it is now up to you to pick up where they left off.
The accompanying program (Makefile and main.c) contains examples of good and bad programming practices and includes deliberate errors. Your job is to find and fix the errors, implement missing features, and learn some tricks of the trade in the process.
Luckily, a test program, similar to the one that will be used to evaluate your submissions, is included to help you along. You can (and should) use it to check your work. See the section on testing your code for details.
There are three parts to this laboratory: a crash course in C (Part A), fixing the bugs (Part B), and enhancements (Part C).

Exercises

Preliminaries

Do this laboratory on any MathLAN workstation. Copy the starter files to somewhere in your home directory.
cp -R ~weinman/public_html/courses/CSC213/2012F/labs/code/c_review ~/somewhere/

Part A: Crash course in C

Answer the following questions. Try to identify the key to each problem and keep your answers concise and to the point; 2-3 sentences should suffice. These questions bring up important points about pointer usage and control flow in C. Keep these in mind when working on the remainder of the lab.
  1. Consider the following C program.
    #include <string.h>
    int main(int argc, char *argv[])
    {
            char *temp;
            strcpy(temp, argv[0]);
            return 0; 
    }
    Why is the above code incorrect (i.e., likely to crash)?
  2. Consider the following C program.
    #include <string.h>
    int main(int argc, char *argv[])
    {
            char temp[9];
            int i = 0;
            strcpy(temp, argv[0]);
            return i;
    }
    A buffer overflow occurs when the program name is 9 characters long (e.g., "12345.exe"). Why?
  3. Consider the following C program.
    #include <string.h>
    int main(int argc, char *argv[])
    {
            char *buffer = "Hello";
            strcpy(buffer, "World");
            return 0;
    }
    Why does this program crash?
  4. Consider the following C snippet.
    void myfunc()
    {
            char b[100];
            char *buffer = &b[0];
            strcpy(buffer, "World");
    }
    Is this correct? What's a simpler expression for &b[0]?
  5. Consider the following C program.
    #include <stdio.h>
    int main(int argc, char* argv[])
    {
            printf("%s %s %s\n",*argv, (*(argv+1)) + 2, *(argv+2));
            return 0;
    }
    If this code is executed using the following line, what will be the output?
    program1 -n5 abc
  6. Consider the following C program.
    #include <stdio.h>
    #include <string.h>
    char *myfunc(char **argv)
    {
            char buffer[100];
            strcpy(buffer, "hello");
            return buffer;
    }
     
    int main(int argc, char *argv[])
    {
            char *s = myfunc(argv);
            printf("%s\n", s);
    }
    What's wrong with this?
  7. This question is not about pointers, but about control flow.
    switch (c) {
        case 'a':
        case 'e':
        case 'i':
        case 'o':
        case 'u':
          printf("%c is a vowel.\n", c);
          break;
        case 'y':
          printf("y is sometimes a vowel.\n");
        default:
          printf("%c is a consonant.\n", c);
    }
    Why are two messages printed if the variable c refers to the value 'y'?

Part B: Fixing the bugs

Examine the provided program main.c. The purpose of this program is to count words specified as command-line arguments. Read the description of the program and its functionality in the comment at the top of main.c. Now read through the rest of main.c and the Makefile and understand what each part does.
Finally, compile and run the program from the shell:
make
(ignore the compiler warning for now) 
./main
The program compiles and links... so it must work! But is it really doing what it is supposed to do?
Answer the following questions, fixing the bugs in main.c as you go along.
  1. Explain why this program uses the exclamation operator with the strcmp() function.
  2. Explain why the 'LENGTH' macro returns the length of an array. Would it work with a pointer to a dynamically allocated array? (Hint: understand sizeof).
  3. Explain and fix the logical flow bug within the switch statement. (What happens when the -h option is used?)
  4. Explain and fix the argument parsing error. (Why is entry_count never zero?)
  5. Fix print_result() to print results correctly and in the same order as the words were specified on the command line. Explain your solution.

Part C: Enhancements

Now that the bugs have been ironed out, it's time to add some functionality to our word counting program. Follow the instructions below to create the best word counter program this side of the Mississippi.
  1. Alter the program such that only the correct output is sent to the standard output stream (stdout), while error and help messages are sent to the standard error stream (stderr). (Hint: use fprintf.) See the expected output listed in the comment at the top of main.c for an example of what should go to stdout.
  2. Implement an optional command-line switch '-fFILENAME' that sends program output to a file named FILENAME (i.e., filename specified as a command line argument).
  3. Add support for matching arbitrary numbers of words, not just 5. (Hint: Use malloc. It's ok if you allocate a bit more memory than is actually used.)
  4. Safeguard the program from buffer overflow attacks, in which data is written beyond the end of a memory allocation. (Hint 1: gets(3) is BAD. Use fgets(3) instead, which specifies the maximum number of characters to be read in. Hint 2: Be careful about the newline character '\n' at the end of the line; gets and fgets handle it differently.)
  5. Allow multiple words to be specified per line. Words may be separated by spaces or by any punctuation, including slashes and quotation marks. (Hint 1: Understand strtok(3). Hint 2: Recall how to escape special characters in strings.)
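
The hints in items 4 and 5 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the intended solution for main.c: the names DELIMS, strip_newline, read_line, and split_words are hypothetical, and the delimiter set is a guess at what "any punctuation" might include.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical delimiter set: whitespace plus common punctuation.
   The backslash and double quote must be escaped inside the literal. */
static const char DELIMS[] = " \t\r\n.,;:!?/\\\"'()";

/* Remove the trailing newline that fgets keeps in the buffer. */
void strip_newline(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}

/* Read at most size-1 characters of one line into buf.
   Returns 1 if a line was read, 0 on end of file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    strip_newline(buf);
    return 1;
}

/* Split line in place and store up to max_words token pointers in words.
   Returns the number stored. strtok overwrites each delimiter it
   consumes with '\0', so line must be writable. */
size_t split_words(char *line, char *words[], size_t max_words)
{
    size_t n = 0;
    char *tok = strtok(line, DELIMS);
    while (tok != NULL && n < max_words) {
        words[n++] = tok;
        tok = strtok(NULL, DELIMS);
    }
    return n;
}
```

For example, given a writable buffer holding `to be, or/not "to" be`, split_words would store the six tokens to, be, or, not, to, be. Because strtok modifies its argument and keeps hidden state between calls, the buffer cannot be a string literal and calls on two different lines cannot be interleaved.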

Testing your code

The testrunner program is included so that you can check your progress as you implement different parts of this laboratory. It has been tested on the MathLAN, but may not work as intended in other environments.
To run all tests:
$ make test 
To run a specific test, e.g., stderr_output:
$ ./main -test stderr_output
As you can see, testrunner is implemented entirely in C. If you completed this lab quickly, or are just plain curious, feel free to look at the implementation and see if you can figure out how everything works. The relevant files are: c_review_tests.* and testrunner.*. (This is entirely optional.)
Note: You should remove or disable any additional debugging output you may have created before running the tests or submitting your lab. One way to do this easily is through the use of the preprocessor directive #ifdef:
#ifdef DEBUG
fprintf(stderr,"My string %s %d\n",var1,var2);
#endif
Then add -DDEBUG to the CCOPTS line in the Makefile during development, and remove it before testing or submission. Make sure that your code still compiles and runs without the debug output!
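
If you have many debugging statements, one possible variation is to wrap the #ifdef in a macro so each call site stays on one line. The sketch below assumes a C99 compiler (for variadic macros); DEBUG_LOG and word_length are hypothetical names and are not part of the starter code.

```c
#include <stdio.h>

/* DEBUG_LOG is a hypothetical helper, not part of the starter code.
   With -DDEBUG it forwards its arguments to fprintf on stderr;
   without it, each call compiles to a no-op. */
#ifdef DEBUG
#define DEBUG_LOG(...) fprintf(stderr, __VA_ARGS__)
#else
#define DEBUG_LOG(...) ((void)0)
#endif

/* Count the characters in word, tracing the result when DEBUG is defined. */
int word_length(const char *word)
{
    int len = 0;
    while (word[len] != '\0')
        len++;
    DEBUG_LOG("word_length(\"%s\") = %d\n", word, len);
    return len;
}
```

Since the trace goes to stderr and disappears entirely without -DDEBUG, the function returns the same value in both builds and nothing extra reaches stdout.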

What to turn in

Acknowledgment

This assignment was adapted by Janet Davis from operating system programming problems by Lawrence Angrave at the University of Illinois at Urbana-Champaign (UIUC).

Footnotes:

1. The chapters date to the era when the UNIX Systems Programmer's manual actually sat on your shelf. You can find a copy in the MathLAN library.