Lab: IO Buffering

CSC 213 - Operating Systems and Parallel Algorithms - Weinman



Summary:
You will measure the effects of system calls and buffering on file I/O.
Assigned:
Tuesday 27 November
Due:
10:30 PM Monday 3 December
Objectives:
 
Collaboration:
You will complete this lab in teams as assigned by the instructor.

Background

*nix provides two primary ways to do I/O. The system calls open(2), read(2), write(2), and close(2) perform input and output on files and other devices. Because each call traps to the kernel, they are typically referred to as unbuffered I/O routines. The standard C library also provides buffered I/O routines, including fopen(3), getc(3), putc(3), fread(3), and fclose(3), which are a further abstraction built on the system calls above.

Preliminaries

Exercises

Part A: Reading files

  1. Write a program unbufread.c with a function
    long unbufread( int fd, size_t count )
    using the unbuffered I/O operation read(2) to read through all the data in the input file associated with the file descriptor fd. In particular, your function should request count bytes on each call to read (i.e., count is the buffer size). The return value should be the total number of calls made to read, or -1 if an error occurs, in which case you should print an appropriate error message to stderr using fprintf(3) or perror(3).
    Note that the integral numeric type size_t is defined in the standard header <sys/types.h>, which is included with <unistd.h>, which read(2) requires.
  2. Your command-line program should then take a buffer size count and a filename as arguments, call the function above, and report
    You do not need to fork any separate processes. For example,
    ./unbufread 10 /tmp/amt3.log
    2415   0.001778   0.000712   0.002383
    ./unbufread 16 /etc/ssh_host_dsa_key
    Unable to open input file: Permission denied
    ./unbufread 16 /blah
    Unable to open input file: No such file or directory
  3. Write an analogous program bufread.c with a function
    long bufread( FILE* stream, size_t size )
    using the buffered library call fread(3) to read through the input file associated with the file stream pointer stream. In particular, your function should request size bytes on each call to fread. The return value should be the total number of calls made to fread, or -1 if an error occurs, in which case you should print an appropriate error message to stderr using fprintf(3) or perror(3).
  4. Your command-line program should function the same as before, but call bufread. For example,
    ./bufread 10 /tmp/amt3.log
    2414   0.000202   0.000798   0.001681

Part B: Empirical analysis

  1. To test the performance of these programs effectively, we need a relatively large file. To avoid swamping the home directory server and measuring network effects rather than filesystem and kernel effects, you will create this file in the /tmp directory on your workstation's local disk. Create a 100,000,000-byte file using the following command:
    head -c100000000 /dev/urandom > /tmp/bigfile 
    Explain how this command works. You will need to do a bit of research on shell commands and Linux pseudo-devices.
  2. Run each of your programs with read buffer sizes of 2^0 through 2^20 bytes. That is: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, and 1048576. Collect the output data in a text file or spreadsheet.
  3. Using your favorite graphing program (a spreadsheet, gnuplot, MATLAB, etc.), create graphs of the (three) times per read and times per byte for each method as the read size changes. You will need to decide whether to use linear or logarithmic axes.
  4. Organize your findings in a short report that (like your image restoration analysis) should:
    Note that performing the experiment and summarizing the results are separate steps and both come before you draw conclusions. To present honest and understandable results, we must present the basic data first (so the reader can draw their own conclusions) before we insert our bias.
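The arithmetic behind the sweep and the normalized quantities can be sketched as below. The helper names sweep_size and time_per_unit are hypothetical, and the sketch assumes each reported time is a total in seconds for the whole run, which the handout does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the k-th buffer size in the sweep, 2^k bytes,
 * for k = 0 through 20. */
size_t sweep_size(unsigned k)
{
    assert(k <= 20);
    return (size_t)1 << k;
}

/* Hypothetical helper: average a measured total time over a number of
 * units (read calls or bytes). Assumes seconds is a total for the run. */
double time_per_unit(double seconds, long units)
{
    assert(units > 0);
    return seconds / (double)units;
}
```

For each k, time per read would be time_per_unit(t, calls) using the call count the program reports, and time per byte would be time_per_unit(t, 100000000L) for the file created above.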

What to turn in

Extra Credit

The following are a few extra credit possibilities.

Acknowledgment:

This assignment was inspired by Figure 3.1 of Advanced Programming in the UNIX Environment, by W. Richard Stevens. It is based on a version by Jerod Weinman given in a prior offering of CSC 213, with some text in Part B from a similar assignment by Janet Davis, which she credits as "adapted from one given by Barton Miller at the University of Wisconsin," Madison, and the extra credit "borrowed from Fred Kuhn at Washington University in St. Louis."
The background (including head-1.c and head-2.c) and Part A are:
Copyright © 2012 Jerod Weinman.
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.