Build a Datapath, part III

Due: 10:30 pm Thursday 1 May 2025
Summary: You will complete the datapath for a simple Logisim-based ISA, implement additional pseudoinstructions, and add a keyboard for input.
Collaboration: You will continue work during lab in the same pairs of your choice. You must complete the work together in these groups, whether during or after the scheduled lab time.
Submitting: Submit datapath-part3.zip and references.txt to Gradescope. Be sure to follow the submission guidelines and use precisely these file names. (The archive must contain at least datapath.circ, rules.py, microprogram.hex, programs/fibonacci.s, programs/string_reverse.s, programs/string_uppercase.s, and programs/calculator.s).

Overview

For this lab you will finish implementing the PIPS instruction set, add some useful pseudoinstructions with assembler rules, incorporate keyboard input, and write three useful programs using input and output devices.

It may be dangerous to proceed without having finished the previous portions of the lab. Please complete all previous parts before working on this lab.

One thing to point out is that the PIPS assembler supports character constants. While the previous lab asked you to manually convert characters to ASCII codes, you can actually type characters in single quotes. Before moving on to the lab, make sure this rewritten hello_world.s program works with your assembler and datapath:

# Print "Hello World!" to the PIPS terminal
# Charlie Curtsinger
.constant HALT 0xff00
.constant TERM 0xff10

nop
main:
  li $s0, TERM
  li $t0, 'H'
  sb $t0, 0($s0)
  li $t0, 'e'
  sb $t0, 0($s0)
  li $t0, 'l'
  sb $t0, 0($s0)
  sb $t0, 0($s0)
  li $t0, 'o'
  sb $t0, 0($s0)
  li $t0, ' '
  sb $t0, 0($s0)
  li $t0, 'W'
  sb $t0, 0($s0)
  li $t0, 'o'
  sb $t0, 0($s0)
  li $t0, 'r'
  sb $t0, 0($s0)
  li $t0, 'l'
  sb $t0, 0($s0)
  li $t0, 'd'
  sb $t0, 0($s0)
  li $t0, '!'
  sb $t0, 0($s0)
  j  HALT

Grading Guidelines

Your datapath will ultimately be assessed by a test suite, and the final grade mostly determined by how many tests your datapath passes.

In addition to verifying circuit and program correctness, other elements of style will also be considered.

For circuits: wires, gates, and pins should be laid out in a fashion that facilitates comprehension.
For programs: clarity, organization, and overall efficiency with respect to the number of instructions required to complete any given task. In part that means
- providing documentation for each procedure that indicates its purpose, parameters (with any preconditions), and return values or postconditions,
- ensuring comments clearly indicate the purpose of each instruction (or short set of instructions), rather than the literal interpretation, AND
- nicely aligning the various labels, fields, and comments in your code.

Part A: Bit Shifting

There are three useful arithmetic instructions we did not implement in the first lab for the datapath: logical left shift (sll), logical right shift (srl), and arithmetic right shift (sra). PIPS does not implement shift instructions using separate opcodes; instead, there are two fields in any rformat instruction that control shifting:

shift type (bits 10–11): These bits indicate whether the datapath should perform no shifting (00), shift left (01), logically shift right (10), or arithmetically shift right (11).
shift amount (bits 0–3): These bits tell the datapath how many bits to shift left or right.

The control inputs tell the datapath how to shift a value, but what value are we shifting? In PIPS, the datapath shifts the value read from the register file’s second read port, $R_{d1}$. Your datapath should take the value from $R_{d1}$ and send it to three places, and not all of these should be shifted. Here are all three places $R_{d1}$ is connected to:

$R_{d1}$ should connect to a multiplexor that chooses a new value for the program counter. This is used for the jr instruction. This connection should pass in $R_{d1}$ without shifting it. (You’ve made this connection already in Datapath lab 2 part A.)
$R_{d1}$ should connect to the memory controller’s $W_d$ input. This is used to store values from a register into data memory. This connection should pass in $R_{d1}$ without shifting it. (You’ve made this connection already in Datapath lab 2, part C.)
$R_{d1}$ should connect to a multiplexor that chooses between this value and the immediate for the ALU’s $B$ input. This is the only connection that should receive a shifted value of $R_{d1}$. (You’ve made this connection already in Datapath lab 1, part A.)

You will need to break the connection between the Register File and the multiplexor that chooses the ALU’s $B$ input, and instead send the Register File data off to a part of the datapath that performs the instruction’s requested shift operation. The shifting circuit can get a little messy, so use Logisim’s Project menu option “Add Circuit” to create a new subcircuit named shifter. (For grading purposes, you must use exactly this name.) This subcircuit should have the following inputs and outputs (use exactly these names on your pins):

shift_type (2-bit input): This should come directly from the instruction’s corresponding field.
shift_amount (4-bit input): This should also come directly from the instruction field.
data_in (16-bit input): This input will connect to the register file’s $R_{d1}$ output. This input carries the value we need to shift.
data_out (16-bit output): This output will send the shifted value out to the multiplexor that chooses the ALU’s $B$ input.

Inside the subcircuit you will need to add a four-input multiplexor for 16-bit data lines. This multiplexor will choose whether to pass the unshifted, left shifted, right logically shifted, or right arithmetically shifted value from data in to data out. You will need to add three different shifter components from Logisim’s Arithmetic component section—one for each type of shift. Under the component settings, choose 16-bit values and select the type of shift you want each shifter to perform. As with the ALU we built earlier in the semester, we will ask the shifter to perform all of the different types of shifting, then use our multiplexor to choose the shifted value the instruction has requested.

Connect your shifter subcircuit to your datapath and verify that all of your previous instructions still work (your well-documented test programs should now come in handy). If you accidentally pass a shifted value of $R_{d1}$ to data memory you will see some strange behavior there. Once you have verified that your datapath still works with all of your existing instructions you can move on to add assembler rules for shift instructions.

Adding `sll`, `srl`, and `sra` Instructions

Now that you have added shifting to your datapath we can create assembler rules for the PIPS shifting instructions. Remember that these instructions do not use a dedicated opcode; we’ll build their functionality using an existing opcode with some clever instruction encoding.

Consider the following PIPS assembly instruction:

sll $t0, $t1, 4

This instruction should take the value in $t1, shift it left by four bits, and store the result in $t0. Even though the third operand is an immediate, this must be an rformat instruction because only rformat instructions support shifting. Let’s look at the encoding for this instruction field-by-field:

$opcode$: We want to shift values, but we don’t have a shifting opcode. We need an operation that asks the ALU to do nothing. Adding zero is a reasonable choice, so we’ll set our opcode to add.
$r0$: This field holds the register we want to write to, so for our example instruction is should be set to the register $t0.
$r1$: This field specifies the register we read from the register file’s first port. This value isn’t shifted, so it shouldn’t be $t1. Since we want to add zero to our shifted value, we should choose the $zero register.
$r2$: This is the register we’ll read from $R_{d1}$ and then shift, so this should be set to $t1.
$link$: This isn’t a jal instruction, so we’ll leave link set to False.
$shift\_type$: We want a left shift, which is the binary value 01. The PIPS module has a constant we can use here, so we’ll pass in pips.SHIFT_LEFT.
$shift\_amt$: This is the amount of shifting we want to perform, so we can pass in the third operand here.

To summarize, the call to pips.rformat for the specific instruction above will look like this:

pips.rformat(opcode='add', r0='$t0', r1='$zero', r2='$t1', shift_type=pips.SHIFT_LEFT, shift_amt='4')

Using this example encoding as a guide, add a general rule for the sll instruction to your rules.py and then write a simple program to test it with your datapath. Once you’ve verified that sll works, add rules for srl and sra as well. (Note that pips.py features similar constant definitions for the two right shifts.)

Make sure you test right shifting negative values to verify that srl and sra have the expected (distinct) behavior; logically right-shifting a negative number fills the left binary digits in with zeros, whereas arithmetically right-shifting the negative number should fill in the left digits with ones. Thinking about the numerical result of shifting is helpful as well. Arithmetically right-shifting a number by one divides that number by two, preserving its signed representation, but logically right-shifting that number divides it by two in unsigned representation.

Part B: Useful Pseudoinstructions

The starter code for the first lab included a translation rule for the li pseudoinstruction, but there are quite a few other useful pseudoinstructions that we’d like to have for PIPS.

The `not` Pseudoinstruction

We’ll start with a simple one: the not instruction. This instruction has two operands, an input register, and an output register. The instruction not $t0, $t1 should read the value of the $t1 register, flip all of its bits, and store the result in $t0.

There is no opcode for not, but you can implement it using another opcode in a single instruction. You will need to figure out how to implement a not operation using the instructions your assembler already supports.

Add the following python code to your rules.py file and fill in the translation rule for not.

@assembler.instruction('not #, #', 1)
def not_instr(dest, src):
  # Fill in the translation rule here

Because you are implementing not based on existing instructions, you should use the translation functions you already wrote as building blocks, rather than calling pips.rformat(...). As an example, look at the li pseudoinstruction; this translation rule calls the addi_instr function to encode an addi instruction that implements an li operation.

Once you have not working you can move on to two other categories of pseudoinstructions.

The `push` and `pop` Pseudoinstructions

One of the trickiest aspects of MIPS assembly programming is dealing with stack variables. PIPS shares its stack discipline with MIPS. However, because we control both the architecture and assembly language, we can introduce new pseudoinstructions to make stack operations easier.

Some architectures, such as x86, have push and pop assembly instructions. These instructions simplify stack usage. The instruction push $t0 would move the stack down by two bytes and save the value of $t0 to the newly-allocated stack space. A corresponding pop $t0 instruction would load the value from the current stack pointer into $t0, then move the stack up two bytes. Notice that we move the stack by two bytes rather than four because PIPS registers hold only 16 bits.

This pseudoinstruction makes it easy to quickly save a value to the stack by pushing it, then restore it later by popping into the same register. It is important that you remember the order you pushed values to the stack; when you pop them, they will come off in the reverse order they were pushed; this is a stack data structure, even if it is a bit different than how you’ve implemented stacks in the past.

While PIPS does not support these instructions natively, we can implement them as pseudoinstructions. Let’s first look at the push pseudoinstruction. When we see an instruction such as push $t0, this should translate to two PIPS instructions:

  addi $sp, $sp, -2
  sw   $t0, 0($sp)

We can write a PIPS assembler rule to generate this sequence; the assembler rule must specify that it will generate two instructions, translate them, and then return the two instructions as follows:

@assembler.instruction('push #', 2) # <- notice the 2 here. This tells the assembler that we will emit two instructions for this rule
def push_instr(reg):
  return addi_instr('$sp', '$sp', '-2') + sw_instr(reg, '0', '$sp')

As you can see above, we generate two instructions by translating them individually, then “adding” them together (which is really concatenating two strings in Python). The assembler rule must report that it intends to generate two instructions or the assembler will report an error when you run this rule, so make sure to keep the 2 in the assembler.instruction decorator.

Write a similar pseudoinstruction rule for the pop instruction. Test your implementation by modifying your fibonacci.s program from the second datapath lab to use only push and pop instructions instead of explicit stack accesses. (The only allowable instance of $sp in your program should be the initialization.) Once your modified program works you may move on.

Branch Pseudoinstructions

MIPS has useful branch pseudoinstructions that make it a bit easier to write conditional jumps. These pseudoinstructions—blt, ble, bgt, and bge—are implemented using a sequence of two instructions: an slt and either bne or beq. You will have to implement these instructions for PIPS as well, but there is a slight complication; MIPS has a special register called $at (for assembler temporary) where it can save intermediate values from slt instructions. PIPS does not have an $at register, so we’re going to have to modify some other register. The easiest one to choose is the first operand to our pseudoinstructions. For example, we can translate the pseudoinstruction blt $t0, $t1, label to:

  slt $t0, $t0, $t1
  bne $t0, $zero, label

To make it clear that blt will modify one of its input registers, we will borrow Scheme’s naming scheme and add a ! to the end of the pseudoinstruction.

Implement the blt!, ble!, bgt!, and bge! pseudoinstructions by adding translation rules to your rules.py file. You must implement each of the four pseudoinstructions using just two PIPS instructions. Don’t forget to tell the assembler that these pseudoinstructions will be translated to two PIPS instructions in the assembler.instruction decorator.

Important caveat: The $zero register cannot be the first argument to these pseudoinstructions, because it is read-only.

Part C: Keyboard Support

You added a terminal to your datapath in the previous lab, but most computers also give you the ability to provide input to your programs as well. We’ll use the same strategy of memory-mapped I/O to connect Logisim’s keyboard component to the datapath.

Add a keyboard device to your datapath somewhere near your memory controller. You can find this device in Logisim’s Input/Output section of components.

The keyboard device in Logisim has a short buffer. When you press a key (after selecting the keyboard with the hand tool) it places a character in the buffer. When the keyboard is enabled and its clock input rises, it deletes the first value from its buffer. The keyboard continuously outputs the value of the first character in the buffer to the $\text{Data}$ output, located on the lower right corner of the component.

Connect the keyboard’s clock input to a clock component. You should not use an inverted clock here; we do not want the keyboard buffer to change state at the same time its value is being written to a register.

As with the terminal, we need to choose a memory address that corresponds to the keyboard device. We’ll use 0xff20, since this is the next available word in the upper memory region after the terminal device. We want to allow the keyboard to advance to the next character in its buffer when we the program is reading from the address 0xff20. Add a comparator and constant to check if the memory address is equal to 0xff20.

Remember that the memory controller’s $A$ input is connected to the ALU output. We want to enable the keyboard only when the current instruction accesses memory at the address 0xff20, not just when its ALU output is 0xff20. Use an AND gate to turn on the keyboard’s enable pin (located in the lower left corner) when the comparator’s $=$ output is true and the datapath is configured to read from memory. Warning: Be careful not to connect to the keyboard’s $\text{clear}$ input!

Now, hook the keyboard’s $\text{Data}$ output to a bit extender the convert the 7-bit value to a 16-bit value. This is the value we want to send back to the register file instead of the data value read from memory. Most of the time we will pass the memory controller’s $R_d$ output to the register file, but when the program is loading from address 0xff20 it should send the keyboard buffer output back instead. Add a multiplexor to choose between the memory controller’s $R_d$ output and the keyboard’s $\text{Data}$ output based on the result of the comparator you added earlier in this part of the lab.

Now that you have the keyboard connected, save and assemble this simple PIPS program that echoes keyboard input to the TTY output:

# Echo characters typed to the terminal until a newline is read
# Charlie Curtsinger
.constant HALT      0xff00
.constant TERM      0xff10
.constant KBD       0xff20
.constant STACK_TOP 0xf800

nop
main:
  li   $s0, TERM       # Address of the terminal output for memory-mapped I/O
  li   $s1, KBD        # Address of the keyboard input for memory-mapped I/O
  li   $sp, STACK_TOP  # Predefined initial stack address
  
loop_top:
  lb  $t0, 0($s1)           # Load a character from the keyboard
  beq $t0, $zero, loop_top  # If it is the null terminator, start over
  sb  $t0, 0($s0)           # Write the character to the terminal
  li  $t1, '\n'             # Load a newline constant
  beq $t0, $t1, end         # If the character was the newline, stop looping
  j   loop_top              # Start over

end:
  j   HALT                  # Enable halt pin and stop the PC incrementing

Load the assembled program into instruction memory. If you haven’t already, select the fastest tick speed from the “Tick Frequency” section of the “Simulate” menu. Start this program by choosing “Ticks Enabled” from the “Simulate” menu.

Using the hand tool, click on the keyboard buffer and type. You should see the program transfer characters from the keyboard buffer to the terminal output. (Note your keyboard press will need to span a clock tick.) The program should halt once you hit enter. Once this simple test is working you can move on to the next part of the lab.

Warnings:

Running this program from the command line will result in slightly different behavior. Logisim waits for a newline character before passing inputs to a keyboard buffer in TTY mode, so you won’t see any output until you hit enter.
If you type a line that is longer than the keyboard’s buffer it will discard trailing characters.
Command line runs also seem to trigger a bug that causes Logisim to skip the first input character on some runs; it’s fine if this happens in the command line, but it should not happen when running the processor inside the Logisim window.
Running in the Logisim window with a high clock speed can also trigger erratic behavior; you can pre-load keyboard input before enabling the clock and/or use a sufficiently low clock rate to avoid experiencing this bug (1-4 Hz often seems to work).

Part D: Interesting Programs

In this final part of the datapath lab sequence, you will write three interesting programs that use input and output.

String Reverse

Write a program that reads in a line of text and prints it out in reverse. The program should continue reading until it hits a newline, then print and halt.

Hint: You might be thinking you need to allocate memory to store the line you read. While you could do this by choosing a reasonable address somewhere in the middle of data memory, there is a far simpler solution that uses recursion (i.e., without appealing to your procedure reverse from the homework.

Save your program in the file string_reverse.s.

Convert to Uppercase

Write a program that reads in a line of text and prints it out with all lowercase letters converted to uppercase. Your program should leave numbers, punctuation, and uppercase letters as-is.

Save your program in the file string_uppercase.s.

One-Function Calculator

Write a program that allows the user to type in a one-digit number, a plus sign, and another one-digit number. After typing these inputs followed by a newline character, the calculator should convert the typed digits to numbers, add them, and print the result (also followed by a newline).

You are welcome to reuse the number printing code you wrote for the previous lab. Run the whole program in a loop so it will continue accepting and evaluating user input until the user hits enter without typing a command. Your program does not need to handle invalid inputs; type carefully!

Save your program in the file calculator.s

Package to Submit

When you are finished, please create an archive of your required datapath directory elements using the following command, starting from inside the datapath directory:

$ cd ..
$ zip datapath-part3.zip datapath/rules.py datapath/microprogram.hex datapath/datapath.circ datapath/programs/fibonacci.s datapath/programs/string_reverse.s datapath/programs/string_uppercase.s datapath/programs/calculator.s

Note that although the command line beginning with zip may wrap in your browser, you should enter (copy and paste) the entire command (through calculator.s) all at once.

This should create a datapath-part3.zip file one directory up from your datapath. Submit this archive to complete the lab.

CC-BY-NC-SA This work is licensed under a Creative Commons Attribution NonCommercial ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.