CSC 213, Fall 2008 : Schedule : Lab 3
Goals: To understand and appreciate device abstractions and, in particular, I/O buffering.
Reading:
Collaboration: You will complete this lab in teams of 2, as assigned by the instructor. (You may, of course, consult with other classmates on design and debugging.)
Discussion: Unix provides two primary ways to do file I/O.
Because Unix presents devices through the same interface it uses for files, the usual functions open(2), read(2), write(2), and close(2) work for device I/O as well as file I/O. These are all system calls handled by the kernel, and are typically referred to as unbuffered I/O because no buffering is done in user space on the program's behalf.
The C standard library also provides buffered I/O routines -- fopen(3), getc(3), putc(3), and fclose(3) (among others) -- that are a further abstraction of the system calls above.
Overview: In this lab, you will write three programs and collect some statistics to accomplish the following:
Unbuffered I/O: Write a procedure
int unbufcp(int bufSz, int fdsrc, int fddst) that copies the input file associated with the file descriptor fdsrc to an output file associated with the file descriptor fddst using the unbuffered I/O operations read(2) and write(2). In particular, your function should use bufSz as the number of bytes to be read each time a call to read(2) is made. The return value should be the total number of calls made to read(2) or -1 if an error occurs. If possible, print an appropriate error message to stderr (using fprintf(3) or perror(3)). (Note the Unix diff(1) command will reveal any differences between files.)
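One possible shape for the copy loop is sketched below, as a starting point rather than a reference solution. It assumes that the final read(2) call, the one that returns 0 at end-of-file, counts toward the total, and that a short write(2) should be retried until the whole chunk is written. Neither choice is dictated by the lab text.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Copy fdsrc to fddst, asking read(2) for at most bufSz bytes per call.
 * Returns the number of read(2) calls made, or -1 on error.
 * Assumption: the call that reports end-of-file is included in the count. */
int unbufcp(int bufSz, int fdsrc, int fddst)
{
    if (bufSz <= 0) {
        fprintf(stderr, "unbufcp: buffer size must be positive\n");
        return -1;
    }

    char *buf = malloc((size_t)bufSz);
    if (buf == NULL) {
        perror("unbufcp: malloc");
        return -1;
    }

    int calls = 0;
    for (;;) {
        ssize_t n = read(fdsrc, buf, (size_t)bufSz);
        calls++;
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            perror("unbufcp: read");
            free(buf);
            return -1;
        }

        /* write(2) may accept fewer bytes than requested, so loop
         * until the whole chunk has been written. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(fddst, buf + off, (size_t)(n - off));
            if (w < 0) {
                perror("unbufcp: write");
                free(buf);
                return -1;
            }
            off += w;
        }
    }

    free(buf);
    return calls;
}
```

With an 11-byte regular file and bufSz of 4, this version makes four read(2) calls: three that return data (4, 4, and 3 bytes) and one that returns 0.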
Buffered I/O: Write a procedure
int bufcp(FILE *src, FILE *dst) that copies the input file associated with the file stream src to an output file associated with the file stream dst using the buffered I/O operations getc(3) and putc(3). The return value should be the total number of calls made to getc(3) or -1 if an error occurs. As above, print an appropriate error message to stderr when possible.
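A minimal sketch of the buffered version follows. As with the unbuffered case, it assumes the getc(3) call that returns EOF is included in the count. The fflush(3) at the end is an added precaution so that write errors surface before the function returns; the lab text does not require it.

```c
#include <stdio.h>

/* Copy src to dst one character at a time with getc(3) and putc(3).
 * Returns the number of getc(3) calls made, or -1 on error.
 * Assumption: the call that returns EOF is included in the count. */
int bufcp(FILE *src, FILE *dst)
{
    int calls = 0;
    int c;              /* int, not char, so EOF is distinguishable */

    for (;;) {
        c = getc(src);
        calls++;
        if (c == EOF)
            break;
        if (putc(c, dst) == EOF) {
            perror("bufcp: putc");
            return -1;
        }
    }

    /* EOF is returned for both end-of-file and read errors. */
    if (ferror(src)) {
        perror("bufcp: getc");
        return -1;
    }

    /* Push any buffered output to the kernel so errors are reported here. */
    if (fflush(dst) == EOF) {
        perror("bufcp: fflush");
        return -1;
    }
    return calls;
}
```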
Resource Usage: Write one or two programs that use the system call getrusage(2) to query the resource usage caused by your copy functions. This system call can be used by any process to examine its own resource usage or that of its children. Use fork(2) to create a child process that calls the appropriate copy function, but open any files before the fork. The parent process should query the resource usage of the child after the copy. Your program(s) should report:
The return value from the copy function
The total user time (in seconds)
The total system time (in seconds)
Other values you think might be interesting from struct rusage
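The fork-then-measure pattern described above might look roughly like the sketch below. The names measure_child, report, tv_seconds, and demo_work are hypothetical helpers invented for illustration; demo_work merely stands in for a call to one of your copy functions. The child passes its return value back through a pipe because an exit status only carries 8 bits, which is one of several workable designs.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Convert a struct timeval to seconds. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Placeholder workload: spins for *arg iterations and returns that count.
 * In the lab, this is where a copy function would be called. */
static int demo_work(void *arg)
{
    int n = *(int *)arg;
    volatile long sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return n;
}

/* Run work(arg) in a child process. On success, store the child's return
 * value in *result and the resource usage of waited-for children in *ru,
 * then return 0. Returns -1 on failure.
 * Note: RUSAGE_CHILDREN is cumulative over all children waited for so far. */
int measure_child(int (*work)(void *), void *arg, int *result,
                  struct rusage *ru)
{
    int fds[2];
    if (pipe(fds) < 0) {
        perror("pipe");
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                     /* child */
        close(fds[0]);
        int r = work(arg);
        ssize_t w = write(fds[1], &r, sizeof r);
        _exit(w == (ssize_t)sizeof r ? 0 : 1);
    }

    close(fds[1]);                      /* parent */
    ssize_t n = read(fds[0], result, sizeof *result);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    if (n != (ssize_t)sizeof *result) {
        fprintf(stderr, "measure_child: no result from child\n");
        return -1;
    }
    if (getrusage(RUSAGE_CHILDREN, ru) < 0) {
        perror("getrusage");
        return -1;
    }
    return 0;
}

/* Print the values requested in the list above. */
void report(int result, const struct rusage *ru)
{
    printf("return value: %d\n", result);
    printf("user time:    %.6f s\n", tv_seconds(ru->ru_utime));
    printf("system time:  %.6f s\n", tv_seconds(ru->ru_stime));
    printf("block inputs: %ld\n", ru->ru_inblock);
}
```

Because the child leaves through _exit(2), it does not flush stdio buffers, so a buffered copy function needs to flush its output stream itself before returning.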
Empirical Analysis: Use your program(s) above to copy a large file (>50MB) from your system's local file system to /dev/null. (On most systems, /tmp is a local fs, while your home directory is not.) Collect results for buffer sizes of 2^0 ... 2^20. That is: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, and 1048576.
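Rather than typing the 21 sizes by hand, they can be generated with a shift. The helper name power_of_two_sizes below is hypothetical, a small sketch of one way to drive the experiment loop.

```c
#include <stddef.h>

/* Fill sizes[0..max_exp] with 2^0 .. 2^max_exp.
 * Returns the number of entries written, or 0 if max_exp is out of range
 * or the array is too small. */
size_t power_of_two_sizes(int *sizes, size_t capacity, int max_exp)
{
    if (max_exp < 0 || max_exp > 30 || capacity < (size_t)max_exp + 1)
        return 0;
    for (int e = 0; e <= max_exp; e++)
        sizes[e] = 1 << e;              /* 1 << e equals 2^e */
    return (size_t)max_exp + 1;
}
```

Calling it with an array of 21 ints and max_exp of 20 yields the list above, which can then be passed one value at a time as the buffer size.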
Also acquire resource usage results by running the same copy task using the buffered I/O functions.
If times are very small, you may wish to run the experiment multiple (e.g. 5-10) times and record averages for each buffer size. Organize your results in a table or several graphs, and use them as the basis to answer the following questions.
What is the performance trend in terms of time, and other measures, for different buffer sizes?
How do the times of unbuffered (optimal and otherwise) and buffered I/O compare?
How do you suppose the number of system calls compares between unbuffered and buffered I/O?
Why might an application programmer prefer buffered or unbuffered I/O? (Consider both program performance and programming effort.)
How can you explain the system time curve for unbuffered I/O?
Hint: You may wish to write a very short program using stat(2) on your source file and examine the struct stat field st_blksize to help answer this question.
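The stat(2) query mentioned in the hint can be sketched as follows. The function name preferred_blksize is hypothetical; st_blksize itself is the standard field giving the preferred block size for file system I/O.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return the preferred I/O block size reported for path,
 * or -1 if stat(2) fails. */
long preferred_blksize(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) < 0) {
        perror(path);
        return -1;
    }
    return (long)sb.st_blksize;
}
```

Printing the result for your source file, for example with printf("%ld\n", preferred_blksize(path)), gives a value to compare against the buffer sizes in your timing data.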
Are there any values in struct rusage that don't seem to be filled in by the OS (i.e., are always 0) but you think would be interesting? Why?
In your write-up, be sure to report the details of the file you used, and anything you may have learned about the filesystem on the machine used.