Lab: Scheduling Simulation
CSC 213 - Operating Systems and Parallel Algorithms - Weinman
At the heart of quantitative reasoning is a single question: Compared
to what?
-- Edward R. Tufte (1990), Envisioning Information,
p. 67.
- Summary:
- You will implement several scheduling algorithms, comparing
their performance in a discrete-event simulation.
- Assigned:
- Tuesday 16 September
- Due:
- 11:30 PM Monday 22 September
- Objectives:
-
- Learn about event-driven simulation for modeling the system behavior
over time.
- Understand scheduling algorithms via implementation.
- Comparatively analyze throughput, response, and turnaround times for
different scheduling algorithms and overheads.
- Practice using list structures in C.
- Collaboration:
- You will complete this lab in teams
as assigned by the instructor.
Background
Discrete-event
simulation is an approach to modeling a sequence of events that
occur at specific points in time. The times of some events are known
by the modeler before running the simulation (e.g., times at which
jobs are created by users). Some events arise in response to other
events, and are unknown before the simulation starts (e.g., times
at which jobs actually begin running).
Discrete-event simulation depends on maintaining a current simulation
time, which typically starts at zero. Like real-world time, simulation
time increases monotonically; it can only move forward. In addition,
an event queue keeps track of events that will happen in the future.
The event queue is ordered by event time, so that the events are processed
in the correct order. As each event is processed, new events may be
added to the event queue. Functions that process events are called
event handlers.
The body of the simulation is a simple loop:
-
time = 0;
initialize_simulation_events();
while (events_remain())
{
time = time_of_next_event();
process_next_event();
}
The hard part is deciding what events are needed and how they should
be processed to model the real-world system. A simulation model is
necessarily an abstraction and simplification of the real world. Simulation
results suggest what can be expected of real-world performance, but
their use is limited when relevant complications of the real world
are abstracted away. You will be asked to reflect on ways in which
this simulation is a simplification of a real system.
Exercises
Preliminaries
Do this laboratory on any MathLAN workstation. Copy the starter
files to somewhere in your home directory.
-
$ cp -R ~weinman/public_html/courses/CSC213/2014F/labs/code/scheduling ~/somewhere/
You will use these files in the exercises below. Here is a brief overview:
- eventq.h contains the eventq_node_t data structure,
out of which the event queue is built, along with prototypes of functions
for manipulating the queue of simulation events.
- eventq.c implements the event queue.
- scheduler_simulation.c contains the main()
function which drives the simulation. It maintains the simulation
time as a global variable.
- stats.h and stats.c define a data structure for
tracking statistics on the scheduler's performance and define functions
for initializing and printing it.
- scheduler.h defines data types and prototypes functions related
to job descriptions and the ready queue.
- scheduler_fcfs.c provides an implementation of the First-Come
First-Served scheduler algorithm.
- scheduler_job_data.txt provides information about the jobs
to be run in the simulation.
- Makefile contains instructions to build the program sim_fcfs
and other programs that you will write.
Part A: Understanding the code
The following questions get you to dive into the starter files. Note
that you can use the command-line tool grep(1)
to find where functions and data types are defined and used, e.g.,
-
$ grep eventq_next_event_time *.c *.h
- Read the Makefile, then build the program sim_fcfs
using the appropriate make target.
- Run the simulator to verify that it works:
-
$ ./sim_fcfs
- This program produces a lot of debugging output. Figure out how to
reduce the amount of debugging output. (Hint: Read scheduler_simulation.c
and the Makefile).
-
What is the purpose of the __EVENTQ_H__
and __SCHEDULER_H__ macros?
- The simululation depends on three major data structures:
- the ready queue (job_queue_node_t* ready),
- the event queue (eventq_node_t* eventq), and
- the performance statistics (performance_data_t* stats).
For each data structure, explain whether and how code in scheduler_simulation.c
can change it.
- Each struct eventq_node includes a pointer to an event handler
function. A different design would have been to include a numeric
flag indicating the event type, and by implication, which function
should be used to handle the event. What are the advantages and disadvantages
of this other approach?
- How is the function eventq_next_event_time used?
That is, explain its purpose and the context(s) where it is called.
- How does this simulation model represent the scheduler overhead (i.e,
the time consumed by running the scheduler)? What assumptions are
being made? Are these assumptions appropriate?
-
Think of at least one thing that happens in
a real system and is relevant to job (or process) completion time,
but is not represented in this simulation model. Why do you think
it has been left out?
Part B: Non-preemptive scheduling
-
Although the scheduler overhead is compiled into
the program, you can easily change its definition at compile-time
using a make command such as the following:
-
$ make clean -B OVERHEAD=0.01 sim_fcfs ; ./sim_fcfs
Run the existing program to get average waiting (aka response) times,
turnaround times, and throughputs for the FCFS scheduling algorithm,
using values of 0.0, 0.01, 0.02, 0.03, 0.04, and 0.05 for the scheduler
overhead. In your text file or spreadsheet, record your measurements
in an organized fashion.
- Make a copy of scheduler_fcfs.c named scheduler_sjn.c.
Use it as a template for implementing the SJN scheduling algorithm.
Remember to cite the source file in your derivative.
-
Run the simulation with SJN for the same cases as
in step B.1.
-
Evaluate your results as follows.
- Compare the magnitudes of the average wait times, turnaround times,
and throughputs for FCFS and SJN. What can you conclude about the
relative effectiveness of FCFS and SJN?
- Rather than comparing statistics' magnitudes, compare their relative
changes with respect to overhead changes. (That is, if you plotted
them as curves, how would the derivatives compare?) What can you conclude
about the relative impact of the scheduler overhead on FCFS compared
to SJN?
Part C: Simple preemptive scheduling
- A simple round-robin (RR) scheduling algorithm can be obtained
by using the FCFS scheduler you are given, but changing the simulate_job
event handler in scheduler_simulation.c as indicated by
/* YOUR CODE HERE */. If the constant
QUANTUM is positive, a job is allowed to run continuously
only for that time slice. If the job will not finish within the time
slice, the time remaining for the job is reduced by the quantum, and
the scheduler is invoked again.
-
Run the RR scheduling simulation for the same
values for the scheduler overhead as you did in step B.1.
Record your measurements in an organized fashion in your text file
or a spreadsheet. (You may wish to note the default value of QUANTUM
given in the Makefile.)
-
Fix the overhead to 0.03 and test four different
time quanta: 0.05, 0.1, 0.15, 0.2, and 0.25. For example,
-
$ make clean -B OVERHEAD=0.03 QUANTUM=0.05 sim_rr ; ./sim_rr
In your text file or spreadsheet, record your measurements in an organized
fashion. Your format should allow the reader to make comparisons easily.
(See the quote from Tufte at the top.)
-
Evaluate your new results comparing preemptive
and non-preemptive scheduling as follows.
- Using the data from B.3 and C.2,
compare the magnitudes of the average wait time, and average turnaround
time, and throughput for SJN and RR. What can you conclude about the
relative effectiveness of the nonpreemptive and preemptive with respect
to each metric?
- What can you conclude about the relative impact of the scheduler overhead
on RR compared to SJN?
- Using the data from C.3, what can you conclude
about the impact of the quantum on each of the statistics?
Part D: Priority preemptive scheduling
- Make a copy of scheduler_fcfs.c named scheduler_mlq.c.
Remember to cite the source file in your derivative. Use it as a template
for implementing a multi-level queue scheduling algorithm with the
following rules (adapted from Arpaci-Dusseau2)
- Rule 1:
- If Priority(A) > Priority(B),
A runs (B doesn't).
- Rule 2:
- If Priority(A)=Priority(B),
A and B run in round-robin with queue-specific quanta
-
quantum = QUANTUM / 2p-1
1 <= p <= NUM_PRIORITY_LEVELS is the priority of the job.
Note that this implies twice the processing time for each successively
lower priority;
- Rule 3:
- Jobs are assigned an initial priority as given in the
job file.
- Rule 4:
- Once a job uses a time slice, its priority is reduced.
Hint: Use an array of ready queues, but be careful to index
them without off-by-one-errors (p does not start at zero).
Use NUM_PRIORITY_LEVELS=5 levels of priority; the value
NUM_PRIORITY_LEVELS is defined in both scheduler_mlq.c
and scheduler_simulation.c for you via the Makefile;
it may be overridden with the -B option.
Note that Rules 2 and 4 will require additional modifications to scheduler_simulation.c,
but these modifications should not change the results of your round-robin
scheduler.
-
Benchmark the MLQ priority scheduler using the same
parameters as in B.1 and C.3. In
your text file or spreadsheet, record your measurements in an organized
fashion. Again, use a format that allows allow the reader to make
comparisons easily.
-
Evaluate your new results comparing round-robin
and priority preemptive scheduling.
- Using data from C.2, C.3,
and D.2, compare the average wait time, and average turnaround
time, and throughput for RR and MLQ. What conclusions can you draw
about the relative effectiveness of these algorithms, with respect
to each metric?
- Using the three statistics, what can you conclude about the relative
impact of the scheduler overhead on MLQ compared to RR?
- Using the three statistics, what can you conclude about the relative
impact of the quantum on MLQ compared to RR?
What to turn in
A single tar file containing:
- the simulator C files and Makefile.
(Please make clean and turn off debugging before you package!)
- A single pdf containing (merged)
- Your enscripted scheduler_*.c
- Transcript of compilation and runs of all programs
-
make clean all
./sim_fcfs
./sim_sjn
./sim_rr
./sim_mlq
- Answers to your questions in A.4-A.9
- Formatted data from B.1, B.3, C.2,
C.3, and D.2.
- Analysis (answers to the questions) from B.4,
C.4, and D.3.
Extra Credit
Change your MLQ to implement the last rule:
- Rule 5:
- After some time period S, move all the jobs in the
system to the highest-priority queue.
Note this will involve some minor additional machinery in the scheduler.
How does this change impact the metrics? What is the relative impact
of the "voodoo constant" S?
Completing the implementation and analysis will be worth an additional
grade point (i.e., 25%) on this assignment.
Acknowledgment:
This assignment is adapted from assignments given by Henry Walker,
Janet Davis, and Jerod Weinman in prior offerings of CSC 213 and from
Lab Exercise 7.1 in the third edition of Operating Systems,
by Gary Nutt