Lab: Reliable Data Transfer
CSC 364 - Computer Networks - Weinman
- Summary:
- We implement a reliable data transfer layer protocol
- Assigned:
- Friday 21 February
- Due:
- 11:59 PM Tuesday 4 March
- Objectives:
-
- Internalize principles of reliable data transfer
- Restore C programming skills
- Learn to verify correct protocol operation
- Background
- Section 3.4 in Kurose and Ross (Computer Networks)
explains the principles of reliable data transfer, including details
of the alternating bit protocol rdt3.0 (3.4.1) and the Go-Back-N
protocol (3.4.3)
- Collaboration:
- You will complete this lab in pairs assigned by
the instructor.
Overview
In this laboratory programming assignment, you will be writing the
sending and receiving transport-level code for implementing a simple
reliable data transfer protocol. There are two versions of this lab,
the alternating-bit protocol version and the Go-Back-N version. This
lab should be fun since your implementation will differ very
little from what would be required in a real-world situation.
Since you don't have standalone machines (with an OS that you can
modify), your code will have to execute in a simulated hardware/software
environment. However, the programming interface provided to your routines,
i.e., the code that would call your entities from above and from below
is very close to what is done in an actual UNIX environment. (Indeed,
the software interfaces described in this programming assignment are
much more realistic that the infinite loop senders and receivers that
many texts describe). Stopping/starting of timers are also simulated,
and timer interrupts will cause your timer handling routine(s) to
be activated.
The routines you will write
The procedures you will write are for the sending entity (A) and the
receiving entity (B). Only unidirectional transfer of data (from A
to B) is required. Of course, the B side will have to send packets
to A to acknowledge (positively or negatively) receipt of data. Your
routines are to be implemented in the form of the procedures described
below. These procedures will be called by (and will call) procedures
emulating a network environment. The overall structure of the environment
is shown in Figure 1.
Figure 1:
Structure of the emulated environment.
The unit of data passed between the upper layers and your protocols
is a message, which is declared as:
-
struct msg {
char data[DATA_LENGTH];
};
This declaration, and all other data structure and emulator routines,
as well as stub routines (i.e., those you are to complete) are in
the file, rdt.c, described later. Your sending entity will
thus receive data in DATA_LENGTH byte chunks from layer5;
your receiving entity should deliver DATA_LENGTH byte chunks
of correctly received data to layer5 at the receiving side.
The unit of data passed between your routines and the network layer
is the packet, which is declared as:
-
struct pkt {
int seqnum;
int acknum;
int checksum;
char payload[DATA_LENGTH];
};
Your routines will fill in the payload field from the message
data passed down from layer5-easy to do with memcpy(3).
The other packet fields will be used by your protocols to insure reliable
delivery, as we've studied.
The routines you will write are detailed below. As noted above, such
procedures in real-life would be part of the operating system, and
they would be called by other procedures in the operating system.
- A_output(message),
- where
message is a structure of type msg, containing data
to be sent to the B-side. This routine will be called whenever the
upper layer at the sending side (A) has a message to send. It is the
job of your protocol to insure that the data in such a message is
delivered in-order, and correctly, to the receiving side upper layer.
- A_input(packet),
- where
packet is a structure of type pkt. This routine
will be called whenever a packet sent from the B-side (i.e., as a
result of a tolayer3() being done by a B-side procedure)
arrives at the A-side. packet is the (possibly corrupted)
packet sent from the B-side.
- A_timerinterrupt()
- This routine will be called when
A's timer expires (thus generating a timer interrupt). You'll probably
want to use this routine to control the retransmission of packets.
See starttimer() and stoptimer() below for how the
timer is started and stopped.
- A_init()
- This routine will be called once by the simulator,
before any of your other A-side routines are called. It can be used
to do any required initialization.
- B_input(packet),
- where
packet is a structure of type pkt. This routine will be called
whenever a packet sent from the A-side (i.e., as a result of a tolayer3()
being done by a A-side procedure) arrives at the B-side. packet
is the (possibly corrupted) packet sent from the A-side.
- B_init()
- This routine will also be called once by the
simulator, before any of your other B-side routines are called. It
can be used to do any required initialization.
Software Interfaces
The procedures described above are the ones that you will write. The
emulator provides the following routines which can be called by your
routines:
- starttimer(calling_entity, increment),
- where
calling_entity is either 0 (for starting
the A-side timer) or 1 (for starting the B side timer), and
increment is a float value indicating the
amount of time that will pass before the timer interrupts. A's timer
should only be started (or stopped) by A-side routines, and similarly
for the B-side timer. To give you an idea of the appropriate increment
value to use: a packet sent into the network takes an average of 5
time units to arrive at the other side when there are no other messages
in the medium.
- stoptimer(calling_entity),
- where
calling_entity is either 0 (for stopping
the A-side timer) or 1 (for stopping the B side timer).
- tolayer3(calling_entity, packet),
- where
calling_entity is either 0 (for the A-side
send) or 1 (for the B side send), and packet is a structure
of type pkt. Calling this routine will cause the packet to
be sent into the network, destined for the other entity.
- tolayer5(calling_entity, message),
- where
calling_entity is either 0 (for A-side delivery
to layer 5) or 1 (for B-side delivery to layer 5), and message
is a structure of type msg. With unidirectional data transfer,
you would only be calling this with calling_entity
equal to 1 (delivery to the B-side). Calling this routine
will cause data to be passed up to layer 5.
The simulated network environment
A call to procedure tolayer3() sends packets into the medium
(i.e., into the network layer). Your procedures A_input()
and B_input() are called when a packet is to be delivered
from the medium to your protocol layer.
The medium is capable of corrupting and losing packets. It will not
reorder packets. When you compile your procedures and the emulator
procedures together and run the resulting program, you will be asked
to specify values regarding the simulated network environment:
- Number of messages to simulate. The emulator (and your routines)
will stop as soon as this number of messages have been passed down
from layer 5, regardless of whether or not all of the messages have
been correctly delivered. Thus, you need not worry about undelivered
or unACK'ed messages still in your sender when the emulator stops.
Note that if you set this value to 1, your program will terminate
immediately, before the message is delivered to the other side. Thus,
this value should always be greater than 1.
- Loss. You are asked to specify a packet loss probability.
A value of 0.1 would mean that one in ten packets (on average) are
lost.
- Corruption. You are asked to specify a packet loss probability.
A value of 0.2 would mean that one in five packets (on average) are
corrupted. Note that the contents of payload, sequence, ack, or checksum
fields can be corrupted. Your checksum should thus include
the data, sequence, and ack fields.
- Tracing. Setting a tracing value of 1 or 2 will print out
useful information about what is going on inside the emulation (e.g.,
what's happening to packets and timers). A tracing value of 0 will
turn this off. A tracing value greater than 2 will display all sorts
of odd messages that are for my own emulator-debugging purposes. A
tracing value of 2 may be helpful to you in debugging your code. You
should keep in mind that real implementors do not have underlying
networks that provide such nice information about what is going to
happen to their packets!
- Average time between messages from sender's layer5. You can
set this value to any non-zero, positive value. Note that the smaller
the value you choose, the faster packets will be be arriving to your
sender.
Assignment
Alternating Bit Protocol
Design
Write a design statement for your alternating bit protocol using the
routines above. Your protocol must send both ACK and NAK messages.
How you accomplish that is up to you; due to the alternating bit,
you could implement NAKs explicitly or implicitly in a fashion similar
to the rdt2.2 protocol. Note that receiving a NAK will allow
the sender to take corrective action more quickly.
In particular, you should clarify the meaning of the fields in struct
pkt as you are using them (i.e., with respect to ACK and NAK) and
draw or very clearly describe the FSMs for sender and receiver. (Professor
Weinman completed this lab without doing so, which caused unnecessary
grief.)
Implementation
Write the procedures, A_output(), A_input(),
A_timerinterrupt(), A_init(), B_input(), and
B_init() which together will implement a stop-and-wait (i.e.,
the alternating bit protocol, which we referred to as rdt3.0
in the text) unidirectional transfer of data from the A-side to the
B-side.
You should perform a check in your sender to make sure that when A_output()
is called, there is no message currently in transit. If there is,
you can simply ignore (drop) the data being passed to the A_output()
routine.
To prevent such dropped outgoing packet, you can choose a very large
value for the average time between messages from sender's layer5,
so that your sender is never called while it still has an outstanding,
unacknowledged message it is trying to send to the receiver. I'd suggest
you choose a value of 1000.
Put your procedures in a file called abp.c. You will need
the initial version of this file, containing the emulation routines
and the stubs for your procedures. You can obtain this program from
the MathLAN:
-
$ cp /home/weinman/courses/CSC364/labs/rdt/rdt.c somewhere/abp.c
Your procedures should print a message whenever an event occurs at
your sender or receiver (i.e., a message/packet arrival, or a timer
interrupt) indicating the procedure invoked, the event interpretation,
as well as any action taken in response.
As you proceed to test your implementation, if you find you missed
something in your design, be sure to update and revise your statement
to reflect what you plan to do (and end up doing) in your code.
Go-Back-N Protocol
"Instead" of a stop-and-wait protocol, you can implement a pipelined
Go-Back-N protocol. Some new considerations (which do not apply to
the alternating bit protocol) for the procedures you'd write include:
- A_output(message),
- where message
is a structure of type msg, containing data to be sent to
the B-side.
Your A_output() routine will now almost certainly be called
when there are outstanding, unacknowledged messages in the medium-implying
that you will have to buffer multiple messages in your sender. Also,
you'll also need buffering in your sender because of the nature of
Go-Back-N: sometimes your sender will be called but it won't be able
to send the new message because the new message falls outside of the
window (which is full).
Rather than have you worry about buffering an arbitrary number of
messages, it will be OK for you to have some finite, maximum number
of buffers available at your sender (say for 50 messages) and have
your sender simply drop the request should all 50 buffers be in use
at one point (Note: using the values given below, this should never
happen!) In the "real-world," of course, one would have to come
up with a more elegant solution to the finite buffer problem!
- A_timerinterrupt()
- This routine will be called when
A's timer expires (thus generating a timer interrupt). Remember that
you've only got one timer, and may have many outstanding, unacknowledged
packets in the medium, so you'll have to think about how to use this
single timer.
Design
Write a design statement documenting and explaining your choices in
how to implement the Go-Back-N protocol (and why they are correct).
Your protocol must send both ACK and NAK messages. How exactly you
accomplish this is up to you. In particular, you should clarify the
meaning of the fields in struct pkt as you are using them
(i.e., with respect to ACK and NAK messages) and draw or very clearly
describe the FSMs for sender and receiver.
Implementation
Write the procedures, A_output(), A_input(), A_timerinterrupt(),
A_init(), B_input(), and B_init() which
together will implement a Go-Back-N unidirectional transfer of data
from the A-side to the B-side, with a window size of WINDOW_SIZE.
Copy a new version of the network emulator into a file called gbn.c:
-
$ cp /home/weinman/courses/CSC364/labs/rdt/rdt.c somewhere/gbn.c
- Note:
- To "wrap around" to the beginning of of an array, use
modular arithmetic:
-
index = (index + 1) % BUFFER_LENGTH; // Increment index with wrap-around to zero
Extra Credit
For extra credit, you can implement bidirectional transfer of messages
for either protocol. In this case, entities A and B operate as both
a sender and receiver. You may piggyback acknowledgments on data packets,
though it is simpler not to do so. To get the emulator to deliver
messages from layer 5 to your B_output() routine, you will
need to change the declared value of BIDIRECTIONAL from 0
to 1.
Be sure to update your design statement to explain your implementation
of bidirectional transfer.
Evaluation
Assuming the general guidelines for lab exercises are met, completing
the alternating bit protocol will be considered "Good" (a B),
while completing the Go-Back-N protocol would be considered "Excellent"
(an A).
Although the grading for the two versions is not cumulative, the learning
is. I therefore strongly recommend you complete the alternating
bit protocol before attempting the Go-Back-N protocol.
Completing the extra credit correctly for either protocol would raise
a grade by one-third grade point (i.e., a B to B+).
Suggestions
- Checksums
- You can use whatever approach for checksumming you
want. Remember that the sequence number and ack field can also be
corrupted. We would suggest a TCP-like checksum, which consists of
the sum of the (integer) sequence and ack field values, added to a
character-by-character sum of the payload field of the packet (i.e.,
treat each character as if it were an 8 bit integer and just add them
together).
- Globals
- Note that any shared state
among your routines needs to be in the form of global variables. Note
also that any information that your procedures need to save from one
invocation to the next must also be a global (or static) variable.
For example, your routines will need to keep a copy of a packet for
possible retransmission. It would probably be a good idea for such
a data structure to be a global variable in your code. Note, however,
that if one of your global variables is used by your sender side,
that variable should NOT be accessed by the receiving side entity,
since in real life, communicating entities connected only by a communication
channel can not share global variables.
Please locate any global variable declarations together in an easy
to find place (preferably just above your routines).
- Time
- There is a float global variable called time that
you can access from within your code to help you out with your diagnostics,
should you find it necessary.
- START SIMPLE
- Set the probabilities of loss and corruption to
zero and test out your routines. Better yet, design and implement
your procedures for the case of no loss and no corruption, and get
them working first. Then handle the case of one of these
probabilities being non-zero, and then finally both being non-zero.
- Debugging
- We recommend that you set the tracing level to 2 and
put LOTS of printfs in your own code while your debugging
your procedures.
What to turn in
- Your abp.c or gbn.c file
- A single PDF containing (merged):
- Your design statement
- Enscripted version of your abp.c or gbn.c file
- A transcript of the compilation of your abp.c or gbn.c
file
- A transcript of running your program
- For abp, use a loss probability of 0.1, and a corruption
probability of 0.3, and a trace level of 2.
- For gbn, use a loss probability of 0.2, and a corruption
probability of 0.2, a trace level of 2, and a mean time between arrivals
of 10.
- A brief statement explaining (via your transcript) how your protocol
correctly recovered from packet loss and corruption. If completing
the extra credit, be sure to document and explain correct bidirectional
communication with loss and corruption recovery.
Submissions missing the PDF or files in any other formats will not
be graded. C files lacking compilation and run transcripts will not
be graded.
Acknowledgments
This lab is a barely modified version of Programming Assignment 5: Implementing a Reliable Transport Protocol,
by James F. Kurose and Keith W. Ross.