Algorithms and Computer Programming

CSC 105 - The Digital Age - Weinman



Summary:
We discuss the general properties of algorithms and how they are expressed in computer programming languages, such as Python.

Algorithms

Recall that an algorithm is an ordered sequence of instructions for solving a problem. There are certain elements that often arise in a wide variety of algorithms; I like to think of these as the fundamental "ingredients" of algorithms.
One of my favorite cookbooks (Essentials of Classic Italian Cooking by Marcella Hazan) has an entire chapter devoted entirely to basic ingredients, because it is so important for cooks to know and understand them. For instance, there are multiple pages devoted solely to garlic, another several pages on olive oil, and many on other fundamentals. Having a solid understanding of these fundamentals helps us put these ingredients together in recipes (algorithms) that result in fantastic creations.
These algorithmic "ingredients" are similarly worthwhile to think about. After all, computer scientists regularly rely on them when writing new algorithms. In the coming weeks, you also will be expressing algorithms in a programming language, so spending a little bit of time on them will be similarly valuable. We will cover five basic ingredients: variables, subroutines, parameters, conditionals, and repetition.

Variables: Named values

We like to name things. You have a name. Perhaps you've named a dog or other family pet. We give things names because it often helps us specify who or what we want to refer to in a clear, compact manner. Whereas our tendency in natural language is to refer to things ambiguously with pronouns like "it," variables give us the ability to refer to values unambiguously.
Another way to think of a variable is like a named box that stores a value. For instance, think of pi. This is a name that a well-known number, 3.14159. Despite being called a variable, the value of pi doesn't vary; it's just a clear name for a value.
The table below demonstrates two variable names and values those names might refer to.
Variable name Value stored
weekday    "Monday"
day 23
pi 3.14159

Conditionals: Handling different situations

Sometimes we like our algorithms to do different things depending on whether certain things are true or not. For instance, an algorithm for a robot might move forward if the path is free; otherwise it might turn slightly.
Conditionals often have the form:
if something is true: 
  do something
As in our robot example above, we might have a richer construction that gives an alternative behavior:
if something is true:
 do something
else:
 do something else
Sometimes there are more than two possibilities. In that case, there will often be a longer sequence of tests and actions.

Repetition

Sometimes we need our algorithm to do things again and again. For example, when a cell phone receives a call, its software might be programmed to play the ring tone several times (maybe even until the user picks up). It can be tedious or even impossible to explicitly write out repeated commands multiple times, so computer scientists have devised ways to express repeated actions more compactly.
Repetition often takes one of the following forms:
  1. While some condition is met, repeat action(s) .
  2. Repeat action(s) a fixed number of times.
As an example of the first case, while you are hungry you might continue eating food at the dining hall. This algorithm might look like
while hungry:
  get food
  eat food
The second case is fairly straightforward. "Turn right three times" might be expressed as
for step in range(1,3):
  turn right

Subroutines: Named helper algorithms

Many algorithms require the use of some common, or more basic, operations. In fact, most algorithms we write refer to some other algorithms. Whether it's adding numbers, reading input from a keyboard, or searching for a word in a block of text, computer scientists write algorithms for common actions and use them in other algorithms. This allows us to understand algorithms more intuitively at a higher, more abstract level than if every minute detail were explicitly written out at length.
For example, we might want to choreograph a beautiful and intricate robot dance. Writing out all of the individual movements in one long series of commands is certainly possible, but it would not be easy for humans to understand how the pieces collect into related parts. It would be nearly impossible to modify the algorithm in a sensible way. Instead, we might collect the movements into larger wholes, giving those parts names, so the whole process becomes easier to read and think about. Suppose one part of the dance called for wiggling. We might name a subroutine "wiggle" with the requisite steps:
def wiggle(numTimes):
# Wiggle (rotate left, rotate right) the given number of times
    for step in range(1,numTimes):
        turnLeft(1,1) # Turns at full speed, for one second
        turnRight(1,1)
The program portions beginning with the hash character # are called "comments"; the computer ignores any text following the hash, which signals that what is written is strictly for the human audience and is not to be interpreted as a precise instruction. With this wiggle subroutine defined, Thus a larger part of the dance may now read
forward(0.5, 2)
wiggle(2)
backward(1, 2)
wiggle(3)
You might see that in addition to making this simple dance easier to read, we have also been able to reuse the wiggle instructions instead of explicitly rewriting similar for loops to do the repetition.

Parameters: Named inputs

You may have noticed from our wiggle example above that for many subroutines, it is useful to provide them inputs that help guide their behavior. For instance, numTimes is an input to wiggle specifying how many turn pairs should be done. We call these inputs parameters. Perhaps you recognize this term from mathematics. When you saw the mathematical expression f(x)=2x-1-1, you might say that x is the parameter to the function f (which, incidentally, tells you the value of the largest twos-complement number you can store with x bits). In many programming languages, subroutines are also thought of as functions, and the inputs, like x, are parameters that tell us where (or really, how) to evaluate the function/subroutine. The parameters then influence what value or action the algorithm produces. Giving the value 16 as x to the function f would produce the familiar result 32767; or we might simply say f(16)=32767. In any case, x is the parameter name, which has the value 16 in this example.
For example, another very useful subroutine called input might handle all the details of prompting a program's user for input and reading data from the keyboard. Many programming languages indicate a subroutine and its parameters the same way f(x) indicates a mathematical function f and its parameter x. For example:
start = input("Enter a number: ")
In the expression above, we use the symbol = to assign a value to the variable named start. The procedure input takes some text as a parameter, which is used as a prompt to the program user who presumably types in a value, which is then given to the variable start.

What is a computer program?

We have now covered some of the main ingredients for writing down algorithms. Because the chief purpose of an algorithm is to compute the solution to some problem, it is typically useful to express algorithms in a way that can be understood by computers.

Programming languages

Programming languages are designed by computer scientists to express algorithms. Most of these can be "understood" by computers. We need them because English (and other "natural" languages) can be ambiguous, as in Groucho Marx's famous line:
Time flies like an arrow; fruit flies like a banana.
To a programmer, it often seems that computers will misinterpret whatever you write. However, the specification of a particular programming language tells us what computers "hear" when we program them. The syntax of a language tells us its grammar rules. When it comes to programming, computers have no judgment. They cannot figure out what you want. Instead, they give error messages for seemingly small mistakes (like your tutorial professor, who really cares about commas). Perhaps similarly, these messages can be somewhat cryptic, especially to the beginner.1

Program files

We can think of computer programs at several levels. Over the next few weeks, we'll be studying all of them. There are two (or three) levels for thinking about a program and accompanying methods for translating from higher, human intelligible versions to the lower, computer-ready versions. We'll look at the levels first and then briefly describe the "translation" process.
The "highest" level (as in, farthest from the computer) of a program is called the source code. Typically this is a text file written by a programmer in a formal language. Just like the informal algorithms we have discussed in class, the file would contain a sequence of instructions expressed more formally. Just like natural languages, there are many varieties of formal languages, each with their relative strengths. Programs in source code form are not executed directly by the computer. Instead, they must be transformed into a lower level. However, this source code form is much easier for humans to understand, write, and reason about.
The next lower level of a program is called assembly code. This form of language expresses an algorithm in the simplest steps that can be directly implemented by a computer. Assembly code is still a text file (i.e., a sequence of bits interpreted as ASCII). Thus, humans can still read it, but assembly programs are more difficult to interpret because they have very little abstraction. Consider an algorithm for throwing a ball. Source code is like the instruction to "swing your arm over your head", while assembly code might be like describing the sequence of neurons in your brain that must fire to accomplish the same task.
Assembly code is primarily a mnemonic for machine code, which is also called object or executable code. Machine code is largely the same sequence of instructions, but written in a binary format that the central processor of a computer can understand directly. As you might guess, these are very detailed instructions and they are generally not interpretable by humans.
During World War II, programmers for the earliest computers essentially wrote their programs in machine code. Fortunately, we now have programs that automatically tackle the conversion of high level source code to low level assembly and further to machine code. This translation happens in one of two different ways, compiling and interpreting.
Compiling a source code program is a complete translation to object code in advance of executing the code. One writes the source code, runs a compiler (which is just another computer program) to generate the corresponding object code. We can then execute that object code directly on the central processor whenever it is needed.
In contrast, we could interpret a computer program "on the fly". In this framework, we translate an instruction from source code to machine code just before it is run on the processor. Another program called the interpreter is run, and it translates instructions from source to machine code for the processor as needed. The translations are not saved.
Interpreted languages are often easier to understand and quickly write programs in. Because the translations happen on the fly, they are also often used interactively. However, because the intermediate translation step is required, they may not be as fast as using a compiled program.
In this class, we'll be using the interpreted programming language Python for writing some high level source code. While it is easy to learn, it is still of course a formal language and therefore picky about things. All the code you have seen so far is actual Python. Hopefully you have found it to be somewhat easy to read.

Acknowledgments

"What is a computer program?" adapted from materials by Marge Coahran. "Algorithms" adapted from materials by Samuel A. Rebelsky. Used by permission.
Copyright © 2011, 2014, 2015 Jerod Weinman.
cc-by-nc-sa.png This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

Footnotes:

1Because these messages are themselves authored by programmers, this confusion simply emphasizes the importance of good writing - especially writing that considers the audience and its needs.