CSC 213, Fall 2008 : Schedule : Lab 12


Lab 12: Sensor Network

Goals: Learn about using RPCs by implementing a "push"-based sensor network, like those used for stock quotes, environmental monitoring, weather updates, sports scores, etc. You'll also gain further practice with threads and synchronization.

Background reading:

Collaboration: You will complete this lab in pairs of your choosing.

Overview: You will implement a sensor network where a set of independent "sources" each pushes its current value at a specified period to a proxy server, which stores them. A "sink" may connect to the proxy server and request a subscription to a subset of the sources. The proxy will then push the sum of the subscribed source values to the sink at a specified period. To ensure consistency, a synchronization strategy must be employed in the proxy. Threads will be needed to handle all the processing in the proxy.


Network Description

sensor network

As indicated by the picture above, there are three main components to this sensor network, sources, the proxy, and sinks. An arbitrary number of sources should be able to register with the proxy and begin sending updates of their values. This will be done with RPC calls, where a source acts as a client and the proxy as the server.

A sink will make a subscription request to the proxy, sending its hostname and a list of sources it would like to receive updates from. The proxy will then make RPC calls to the sink, pushing updates.

In our network, the proxy's update to the sink is the synchronized sum of the values of the sources in a sink's subscription. This means that, while calculating the sum, none of the already-added values should change until the total sum has been calculated. This is a consistency requirement that ensures the sum represents a total of values actually held at some moment in time--even if this comes at the expense of making updates wait.

The proxy should accept updates from sources in parallel and process the update asynchronously. This means that when a proxy receives an update, it should spawn a new thread for the update and immediately return (from the RPC call) to the source. Within this thread, the proxy stores the new value in a synchronized fashion, so as not to disrupt any sum being calculated for a sink's subscription. (Be sure to distinguish the semantics of an asynchronous function return, and the synchronous, i.e., mutexed, modification of the sensor value in the proxy.

The proxy should also process sink subscriptions in parallel. Thus, when a sink requests a subscription, the proxy should spawn a new thread to handle the periodic sum and push.

The source updates should be "timestamped" so that the proxy only sets its local value to the most recent update. (Note that no "time" is actually necessary to guarantee this.)

To minimize overhead, network connections should not be repeatedly opened and closed, but only opened once when they are needed. For example, a source should only need to open a connection once to the proxy, and a proxy should only open a connection once to a sink.

Part A

(Re-)Read everything for part A before you begin. It is likely that your thinking and the design process will benefit from knowing about your end goals. You may even want to have another look at parts B and C.

  1. Design the interface to the proxy needed for the sources. This should be in the form of a well-commented IDL (Interface Description Language) .x file. Be sure to also explain how the proxy will handle source registration requests and keep track of updates from the sources (i.e., who an update is coming from). Your protocol need not be "secure" with an unforgeable identity.
  2. Design the interface to a sink needed for the proxy. You should assume that there is only one sink running an RPC server on any given host. (Again, an IDL file will do).
  3. Design the interface to the proxy needed for a sink to request subscriptions. (Again, by creating an IDL file). Note that the proxy will need to know how to connect to the sink (via RPC); you are welcome to explore/propose alternatives, but one way is to have the sink provide its hostname to the proxy (see gethostname(2)). You may assume that a hostname is no longer than 255 characters (plus a terminating null-byte).
  4. You will use the POSIX thread library's mutex type (pthread_mutex_t) to perform synchronization. Carefully describe an algorithm for calculating the source updates and subscribed sum values in a synchronized fashion that will prevent deadlock. Your pseudo-code algorithm(s) should be accompanied by a sufficiently convincing explanation of why deadlock cannot occur (a formal proof is not required).
  5. Why do we need to time stamp updates from the sources? (What could go wrong if we didn't and why/how could that happen?)

Part B

(Re-)Read everything for part B before you begin. The implementations will be dependent upon one another, and like above, your thinking and the design process will benefit from a more global perspective.

  1. Implement the source component. The command-line usage might be something like:
    ./source proxyHost period
    where proxyHost would be the name of the server running the proxy process (i.e., localhost, euler.cs.grinnell.edu) and period is the time between updates. To look for interesting behavior later, you may consider having the period be measured in microseconds (see usleep(3)). Your source program should:

  2. Implement the parts of the proxy component that are needed to receive registrations and updates from sources. The command-line usage will probably be something very simple, like:
    ./proxy
    Remember that if you compile with RPC_SVC_FG defined, your server process will not fork and background itself. This will make it much easier to kill and restart, using shell process management, since you won't have to hunt for the PID. Your proxy program should:

  3. Fully test the running of your source and proxy programs. Explain what tests you have conducted and why they are sufficient to demonstrate the correct behavior of this portion of the system.

    You may run these all on the same host. However, if you run things on different hosts and get strange errors when you try to start your proxy or connect the source to the proxy, be sure your RPC program number does not conflict with a classmate's.

  4. Demonstrate the operation of this half of your sensor network by including some diagnostics in the code so that:

    Turn in the results of one run (output from each source and the proxy) with at least 3 sources, each with a different period so that it is clear things are behaving as specified. Note the periods used for each source (and include units!) if it is not clear. Twenty update requests received by the proxy is sufficient. Do not include more than fifty.

Part C

  1. Implement the sink component. The command-line usage might be something like:

    ./sink proxyHost period src ...
    where proxyHost would be the name of the server running the proxy process and period is the requested time between updates from the proxy. A variable number of sources would be specified, using whatever scheme you have determined to identify them (integers are simple and sufficient; see strtol(3)).

    Since the sink is BOTH an RPC client to the proxy (when the subscription request is made) and an RPC server to the proxy (as it receives updates), some special care is in order. When you compile the .x file for the sink's interface to the proxy, it will generate a _svc.c file that already has a main(int,char*[]) function. This is the function that is called when the server (e.g., sink) starts up and gets ready to receive RPC requests from clients (e.g. the proxy).

    I suggest writing a function start(int,char*[]) in your own .c file (with the server-side RPC functions for the sink) that takes care of

    After you have used rpcgen to create the _svc.c file, you can insert a call to your start routine with the command-line arguments at the very beginning of the automatically-generated main function.

    Alternatively, with less head-ache but slightly more effort when you want to run the system, you could start the sink for receiving updates first, and write a separate program (with its own main doing what start does above), to send the subscription request to the proxy.

  2. Implement the proxy component that handles subscription requests from sinks and periodically pushes updates to them. Your proxy program should:

    Depending on when the connection to the sink is made from the proxy (with a CLIENT*), you may need to add a delay (see sleep(3)) while handling the sink's request to allow the sink's main routine to finish setting up the RPC server after your added start call.

    (Hint: Think about what should be concurrent, what should happen before the thread is created or in the thread, and when the proxy's RPC function should return to the sink. The right order of events must take place otherwise the reverse connection cannot be established.)

  3. Fully test the running of your source, proxy, and sink programs. Explain what tests you have conducted and why they are sufficient to demonstrate the correct behavior of this portion of the system. Be sure, also to test for memory leaks! (See the program top(1) for monitoring memory usage.)

    You may run these all on the same host. The same caveat about RPC program numbers (cf., B.3) applies if you run them on different hosts.

  4. Demonstrate the operation of your complete sensor network by adding more diagnostics in the code so that:

    Turn in the results of two runs (output from each source, the proxy, and a sink) with at least 4 sources and one sink. Vary the number of sources in the subscription for the sink between the two runs. Note the periods used for each source and sink (including units!) if it is not clear. Try to pick periods that show "interesting" behavior (for your own definition of "interesting"). Again, at least 20 and no more than 50 source-to-proxy updates for each run is sufficient.
  5. Create your own experiment to test the scaleability of the proxy by varying some of the following:

    You might measure:

    The point is to be creative and look for something interesting, guided by your knowledge of the network or how you expect it to behave.

Details to Remember

Work To Be Turned In

This project is the largest and most comprehensive of the term. It is highly recommended that you do not fall behind on the milestones below, and that you continue working toward the next milestone as soon as you have completed one. Do not wait until the last minute to work on this project! By now, you will hopefully not be surprised at the attention to detail required in systems programming. In addition to the material requested above, follow the instructions for submitting programs.

Jerod Weinman

Created on June 17, 2008