Data is all around us and the amount of data stored increases every single day. In today’s world, decisions must be data-driven and so it is imperative that we be able to process, analyze, and understand the data we collect. Other important factors include the security and privacy of data. Businesses and governments need to answer important questions such as “Where should this data be stored?”; “How should this data be stored?”; and even, “Should this data be stored at all?”. The answers to these questions for Health Canada and personal health data is very different from the answers Nintendo might come up with for the next Animal Crossing game.
We begin our study of computer science by developing definitions for different categories of data. A data type is a way of categorizing data. A description of a data type conveys two important pieces of information: the set of values that belong to the type, and the operations that can be performed on those values.
For example, we could say that a person’s age is a natural number, which would tell us that values like 25 and 100 would be expected, while an age of -2 or “David” would be nonsensical. Knowing that a person’s age is a natural number also tells us what operations we could perform (e.g., “add 1 to the age”), and rules out other operations (e.g., “sort these ages alphabetically”).
In this section, we’ll review the common data types that we’ll make great use of in this course: numeric data, boolean data, textual data, and various forms of collections of data. Many terms and definitions may be review from your past studies, but be careful—they may differ slightly from what you’ve learned before, and it will be important to get these definitions exactly right.
Here are some types of numeric data, represented as familiar sets of numbers.
All numeric data types support the standard arithmetic operations (addition, subtraction, multiplication, division, and exponentiation), as well as the standard comparisons for equality (using \(=\)) and inequality (\(<\), \(\leq\), \(>\), \(\geq\)). And of course, you are familiar with many more numeric functions, like log and sin; these will come up throughout the course.
One additional arithmetic operation that may be less familiar to you is the modulo operator, which produces the remainder when one integer is divided by another. We’ll use the percent symbol \(\%\) to denote the modulo operator, writing \(a \% b\) to mean “the remainder when \(a\) is divided by \(b\)”. For example, \(10 \% 4 = 2\) and \(30 \% 3 = 0\).
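To see where the remainder comes from, it can help to write the dividend in quotient-remainder form (the specific numbers here are our own illustration):

\[ 17 = 3 \times 5 + 2, \qquad \text{so} \qquad 17 \% 5 = 2. \]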
Some arithmetic operations are undefined for particular numbers; for example, we can’t divide by zero, and we can’t take the square root of a negative number.
A boolean is a value from the set \(\{\text{True}, \text{False}\}\). Think of a boolean value as an answer to a Yes/No question, e.g. “Is this person old enough to vote?”, “Is this country land-locked?”, and “Is this service free?”.
Booleans can be combined using logical operators. The three most common ones are not, and, and or.
Next week, we’ll discuss these logical operators in more detail and introduce a few others.
A string is an ordered sequence of characters, and is used to represent text. A character can be more than just an English letter (\(a\), \(b\), \(c\), etc.): number digits, punctuation marks, spaces, glyphs from non-English alphabets, and even emojis are all considered characters, and can be part of strings. Examples include a person’s name, your chat log, and the script of Shakespeare’s Romeo and Juliet.
We typically will surround strings with single-quotes to differentiate them from any surrounding text, e.g., ‘David’. We can also use double-quotes (“David”) to surround a string, but in this course we will generally prefer single-quotes for a reason we’ll discuss in Section 1.3.
A string can have zero characters; this string is called the empty string, and is denoted by '' (two quote marks with nothing between them) or the symbol \(\epsilon\).
Here are some common operations on strings. \(s\), \(s_1\), and \(s_2\) are all variables representing strings.
\(|s|\): string length/size. Returns the number of characters in \(s\).
\(s_1 = s_2\): string equality. Returns whether \(s_1\) and \(s_2\) have the same characters, in the same order.
\(s_1 + s_2\): string concatenation. Returns a new string consisting of the characters of \(s_1\) followed by the characters of \(s_2\). For example, if \(s_1\) represents the string ‘Hello’ and \(s_2\) represents the string ‘Goodbye’, then \(s_1 + s_2\) is the string ‘HelloGoodbye’.
\(s[i]\): string indexing. Returns the \(i\)-th character of \(s\), where indexing starts at 0. (So \(s[0]\) returns the first character of \(s\), \(s[1]\) returns the second, etc.) For example, if \(s\) represents the string ‘Hello’, then \(s[0]\) is ‘H’ and \(s[4]\) is ‘o’.
A set is an unordered collection of zero or more distinct values, called its elements. Examples include: the set of all people in Toronto; the set of words of the English language; and the set of all countries on Earth.
We write sets using curly braces in two different ways: by explicitly listing the elements (e.g., \(\{1, 2, 3\}\)), or by using set builder notation to describe the elements in terms of a condition they satisfy.
A set can have zero elements; this set is called the empty set, and is denoted by \(\{\}\) or the symbol \(\emptyset\).
Here are some common set operations. \(A\) and \(B\) represent sets.
\(|A|\): returns the size of set \(A\), i.e., the number of elements in \(A\).
\(x \in A\): returns True when \(x\) is an element of \(A\); \(y \notin A\) returns True when \(y\) is not an element of \(A\).
\(A \subseteq B\): returns True when every element of \(A\) is also in \(B\). We say in this case that \(A\) is a subset of \(B\).
A set \(A\) is a subset of itself, and the empty set is a subset of every set: \(A \subseteq A\) and \(\emptyset \subseteq A\) are always True.
\(A = B\): returns True when \(A\) and \(B\) contain the exact same elements.
The following operations return sets:
\(A \cup B\), the union of \(A\) and \(B\). Returns the set consisting of all elements that occur in \(A\), in \(B\), or in both.
Using set builder notation: \(A \cup B = \{x \mid x \in A \text{ or } x \in B\}\).
\(A \cap B\), the intersection of \(A\) and \(B\). Returns the set consisting of all elements that occur in both \(A\) and \(B\).
Using set builder notation: \(A \cap B = \{x \mid x \in A \text{ and } x \in B\}\).
\(A \setminus B\), the difference of \(A\) and \(B\). Returns the set consisting of all elements that are in \(A\) but that are not in \(B\).
Using set builder notation: \(A \setminus B = \{x \mid x \in A \text{ and } x \notin B\}.\)
\(A \times B\), the (Cartesian) product of \(A\) and \(B\). Returns the set consisting of all pairs \((a, b)\) where \(a\) is an element of \(A\) and \(b\) is an element of \(B\).
Using set builder notation: \(A \times B = \{(x, y) \mid x \in A \text{ and } y \in B\}.\)
\(\cP(A)\), the power set of \(A\), returns the set consisting of all subsets of \(A\). (Food for thought: what is the relationship between \(|A|\) and \(|\cP(A)|\)?) For example, if \(A = \{1,2,3\}\), then \[\cP(A) = \big\{ \emptyset, \{1\},\{2\},\{3\},\{1,2\},\{1,3\},\{2,3\},\{1,2,3\}\big\}.\]
Using set builder notation: \(\cP(A) = \{S \mid S \subseteq A\}\).
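As a small worked example (the sets here are our own), suppose \(A = \{1, 2\}\) and \(B = \{2, 3\}\). Then:

\[ A \cup B = \{1, 2, 3\}, \qquad A \cap B = \{2\}, \qquad A \setminus B = \{1\}, \]
\[ A \times B = \{(1, 2), (1, 3), (2, 2), (2, 3)\}. \]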
A list is an ordered collection of zero or more (possibly duplicated) values, called its elements. List data is used instead of a set when the elements of the collection should be in a specified order, or when the collection may contain duplicates. Examples include: the list of all people in Toronto, ordered by age; the list of words of the English language, ordered alphabetically; and the list of names of students at U of T (two students may have the same name!), ordered alphabetically.
Lists are written with square brackets enclosing zero or more values separated by commas. For example, \([1, 2, 3]\).
A list can have zero elements; this list is called the empty list, and is denoted by \([]\).
Here are some common list operations. \(A\) and \(B\) represent lists.
\(|A|\): returns the size of \(A\), i.e., the number of elements in \(A\) (counting all duplicates).
\(x \in A\): same meaning as for sets.
\(A = B\): \(A\) and \(B\) have the same elements in the same order.
\(A[i]\): list indexing. Returns the \(i\)-th element of \(A\), where the indexing starts at 0. So \(A[0]\) returns the first element of \(A\), \(A[1]\) returns the second, etc.
\(A + B\): list concatenation. Returns a new list consisting of the elements of \(A\) followed by the elements of \(B\). This is similar to set union, but duplicates are kept, and order is preserved.
For example, \([1, 2, 3] + [2, 4, 6] = [1, 2, 3, 2, 4, 6]\).
Finally, a mapping is an unordered collection of pairs of values. Each pair consists of a key and an associated value; the keys must be unique in the mapping, but the values can be duplicated. A key cannot exist in the mapping without a corresponding value.
Mappings are used to represent associations between two collections of data. For example: a mapping from the name of a country to its GDP; a mapping from student number to name; and a mapping from food item to price.
We use curly braces to represent a mapping. This is similar to sets, because mappings are quite similar to sets. Both data types are unordered, and both have a uniqueness constraint (a set’s elements are unique; a mapping’s keys are unique). Each key-value pair in a mapping is written using a colon, with the key on the left side of the colon and its associated value on the right. For example, here is how we could write a mapping representing the menu items of a restaurant: \[\{\text{`fries'}: 5.99, \text{`steak'}: 25.99, \text{`soup'}: 8.99\}.\]
Here are some common mapping operations. \(M\) and \(N\) represent mappings.
The data types we’ve studied so far are not the only kinds of data that we encounter in the real world, but they do form a basis for representing all kinds of more complex data. We’ll study how to represent more complex forms of data later in this course, but here’s one teaser: representing image data.

Images can be represented as a list of integers. Each element in the list corresponds to a very tiny dot on your screen—a pixel. For each dot, three integer values are used to represent three colour channels: red, green, and blue. We can then add these channels together to get a very wide range of colours (this is called the RGB colour model). Somehow, our computers are able to take these sequences of integers and translate them into a sequence of visible lights, and if these lights are arranged in a particular way, well, a cat appears!
For the thousands of years of human history before the mid-twentieth century, humans collected, analysed, and created data by hand. Digital computers were a revolution not just in technology but in civilization because of their ability to store more data than could fit on all the sheets of paper in the world, and to perform computations on this data faster and more reliably than an army of humans. Today, we rely on complex computer programs to operate on data in a variety of ways, from sending messages back and forth with loved ones, organizing data in documents and media, to running simulations of physical, social, and biological systems.
Yet for all their computation power, computers have one fundamental limitation: they have no agency, no inherent ability to make decisions about what to do. All they can do is take a set of (possibly very complex!) instructions, what we call a computer program, and execute them—no more, and no less. And so if we, as computer scientists, want to harness the awesome power of computers, we need to learn how to give these instructions in a way that a computer understands. We need to learn how to speak to a computer.
A programming language is a way of communicating a set of instructions to a computer. Like human languages such as English, a programming language consists of a set of allowed words and the rules for putting those words together to form phrases with a coherent meaning. In your past learning of a (human) language, you’ve likely referred to these rules as the grammar of a language. Unlike human languages, a programming language must be precise enough to be understood by a computer, and so operates with a relatively small set of words and very structured rules for putting them together. Learning a programming language can be frustrating at first, because even a slight deviation from these rules results in the computer being unable to comprehend what we’ve written. But our time and efforts spent mastering the rules of a programming language yield a wonderful reward: the computer will not just understand our instructions, but faithfully execute them.
A program is simply the text of the instructions we wish to instruct the computer to execute; we call this text program code to differentiate it from other forms of text. To write programs in a particular language, we need to understand two key properties of the language. The first is the syntax of a programming language, which is the name we give to the rules governing what constitutes a valid program in the language. Before a computer can execute a program, it must read the instructions for the program; the syntax of the programming language specifies the format of these instructions. The second concept is the semantics of a programming language, which refers to the rules governing the meaning of different instructions in the language. Once the computer has read the instructions in a program, it begins executing them. The language semantics specifies what the computer should do for each instruction.
Just as there are thousands of human languages in the world today, each with their own vocabulary, grammar, and stylistic conventions, so too is there a plethora of programming languages that we can choose from. In this course, we’ll use the Python programming language, which offers a simple, beginner-friendly syntax and a set of language instructions whose semantics are both powerful and accessible.
Now, neither our computer hardware nor operating system understand the Python programming language. Instead, the creators of the Python language developed a program called the Python interpreter, whose job is to take programs written in the Python language and execute the instructions. So when you “download Python”, what you’re actually downloading and installing is this Python interpreter software. You can think of the Python interpreter as a mediator between you, the programmer, communicating in Python, and the computer hardware that actually executes instructions.
There are two ways of writing code in the Python language to be understood by the interpreter. The first is to write full programs in the Python language, saving them as text files (Python programs use the .py file extension to distinguish them from other text files), and then running them through the Python interpreter. This is the standard way of writing programs: write the instructions, and then run them with the interpreter. The second way is to run the Python interpreter in an interactive mode, which we call the Python console or Python shell. In this mode, we can write small fragments of Python code and ask the Python interpreter to execute each fragment one at a time. The Python console is useful for experimenting and exploring the language, as you get feedback after every single instruction. The drawback is that interactions with the interpreter in the Python console are ephemeral, lost every time you restart the console. So we’ll use the following approach throughout the course: use the Python console to learn about and experiment with the Python language, and write full programs in .py files.
Data is all around us, but so are computers. If decisions must be data-driven, then computers are an excellent tool for processing that data, especially since they are several orders of magnitude faster at processing data than any human. The problem is that computers need to be told exactly how to process the data, and we can do so using one of several programming languages. In this section, we see how data types are represented in Python and how we can use Python to perform operations for us. We’ll learn about some subtle, but crucial, differences between our theoretical definitions of data types from Section 1.1 and what Python can actually represent. But first, we’ll introduce some general terminology for using the interactive Python console.
When we first start the Python console, we see the following:
The text >>> is called the Python prompt: the console is “prompting” us to type in some Python code to execute. If we type in a simple arithmetic expression like 4 + 5 and press Enter, the console displays the output 9.
The interpreter took our bit of code, 4 + 5, and calculated its value, 9. A piece of Python code that produces a value is called an expression, and the act of calculating the value of an expression is called evaluating the expression.
The expression 4 + 5 looks simple enough, but technically it is formed from two smaller expressions—the numbers 4 and 5 themselves. We can ask Python to evaluate each of these, though the result is not very interesting.
A Python literal is the simplest kind of Python expression: it is a piece of code that represents the exact value as written. For example, 4 is an integer literal representing the number 4.
To sum up (the pun was not originally intended, but we are pointing it out), the expression 4 + 5 consists of two smaller expressions, the literals 4 and 5, joined together with the arithmetic operator +, representing addition. We’ll devote the rest of this section to exploring the different kinds of data types we can use in Python: both how to write their literals, and what operations we can perform on them.
Numeric data (int, float)

Python has two data types for representing numeric data: int and float. Let’s start with int, which stands for “integer”, and is the data type that Python uses to represent integers.
An int literal is simply the number as a sequence of digits with an optional - sign, like 110 or -3421.
Python supports all of the arithmetic operations we discussed in Section 1.1. Here are some examples; try typing them into the Python console yourself to follow along!
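For example, one might try the following (shown here as print calls; at the console you would type just the expressions, and their values would be echoed back):

```python
# Trying some arithmetic; each value shown in the comment is what
# the console would display for the bare expression.
print(4 + 5)    # 9
print(10 - 3)   # 7
print(4 * 5)    # 20
print(2 ** 5)   # This is "2 to the power of 5"
```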
In the last prompt, we included some additional text—# This is "2 to the power of 5". In Python, we use the character # in code to begin a comment, which is code that is ignored by the Python interpreter. Comments are only meant for humans to read, and are a useful way of providing additional information about some Python code. We used it above to explain the meaning of the ** operator in our demo.
Python supports the standard precedence rules for arithmetic operations (sometimes referred to as “BEDMAS” or “PEMDAS”), performing exponentiation before multiplication, and multiplication before addition and subtraction:
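For instance, the following expression mixes all three levels; Python evaluates ** first, then *, then +:

```python
# 2 ** 3 is evaluated first (8), then 8 * 5 (40), then 1 + 40.
print(1 + 2 ** 3 * 5)  # 41
```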
Just like in mathematics, long expressions like this one can be hard to read. So Python also allows you to use parentheses to group expressions together:
>>> 1 + ((2 ** 3) * 5) # Equivalent to the previous expression
41
>>> (1 + 2) ** (3 * 5) # Different grouping: "(1 plus 2) to the power of (3 times 5)"
14348907

When we add, subtract, multiply, and use exponentiation on two integers, the result is always an integer, and so Python always produces an int value for these operations. But dividing two integers certainly doesn’t always produce an integer—what does Python do in this case? It turns out that Python has two different forms of division. The first is the operator //, and is called floor division (or sometimes integer division). For two integers x and y, the result of x // y is equal to the quotient \(\frac{\texttt{x}}{\texttt{y}}\), rounded down to the nearest integer. Here are some examples:
>>> 6 // 2
3
>>> 15 // 2 # 15 ÷ 2 = 7.5, and // rounds down
7
>>> -15 // 2 # Careful! -15 ÷ 2 = -7.5, which rounds down to -8
-8

But what about “real” division? This is done using the division operator /:
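For example (the specific operands are our own; any pair that doesn't divide evenly shows the same behaviour):

```python
# / performs "real" division, producing the exact quotient.
print(5 / 2)  # 2.5
```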
This returns 2.5, which is a value of Python’s float type, which Python uses to represent arbitrary real numbers. A float literal is written as a sequence of digits followed by a decimal point (.) and then another sequence of digits. 2.5, .123, and 1000.00000001 are all examples of float literals.
All of the arithmetic operations we’ve looked at so far work with float values too:
>>> 3.5 + 2.4
5.9
>>> 3.5 - 20.9
-17.4
>>> 3.5 * 2.5
8.75
>>> 3.5 / 2.5
1.4
>>> 2 ** 0.5
1.4142135623730951

The last expression, 2 ** 0.5, calculates the square root of 2. However, this actually poses a problem for Python: since \(\sqrt 2\) is an irrational number, its decimal expansion is infinite, and so it cannot be represented in any finite amount of computer memory. More precisely, computers use a binary system where all data, including numbers, are represented as a sequence of 0s and 1s. This sequence of 0s and 1s is finite since computer memory is finite, and so cannot exactly represent \(\sqrt 2\). We will discuss this binary representation of numbers later this year.
The float value that’s produced, 1.4142135623730951, is an approximation of \(\sqrt 2\), but is not equal to it. Let’s see what happens if we try to square it:
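A quick check in code:

```python
approx = 2 ** 0.5
print(approx)       # 1.4142135623730951
print(approx ** 2)  # 2.0000000000000004 -- very close to 2, but not equal!
```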
This illustrates a fundamental limitation of float: this data type is used to represent real numbers, but cannot always represent them exactly. Rather, a float value approximates the value of the real number; sometimes that approximation is exact, like 2.5, but most of the time it isn’t.
3 vs. 3.0

Here’s an oddity:
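In the console, the two divisions display differently:

```python
print(6 // 2)  # 3
print(6 / 2)   # 3.0
```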
Even though \(\frac{6}{2}\) is mathematically an integer, the results of the division using // and / are subtly different in Python. When x and y are ints, x // y always evaluates to an int, and x / y always evaluates to a float, even if the value of \(\frac{\texttt{x}}{\texttt{y}}\) is an integer! So 6 // 2 has value 3, but 6 / 2 has value 3.0. These two values represent the same mathematical quantity—the number 3—but are stored as different data types in Python, something we’ll explore more later in this course when we study how ints and floats actually work in Python.
ints and floats

So to summarize: for two ints x and y, the operations x + y, x - y, x * y, x // y, and x ** y all produce ints, while x / y always produces a float. For two floats, it’s even simpler: all six of these arithmetic operations produce a float (even //; try it!).
But what happens when we mix these two types? An arithmetic operation that is given one int and one float always produces a float. You can think of a float as a parasite—even in long arithmetic expressions where only one value is a float, the whole expression will evaluate to a float.
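A few illustrations of this "parasite" behaviour (the specific expressions are our own):

```python
print(1 + 2.0)      # 3.0 -- one float operand makes the result a float
print(2 * 3 + 0.5)  # 6.5 -- the float "infects" the whole expression
print(7.0 // 2)     # 3.0 -- even floor division produces a float here
```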
| Operation | Description |
|---|---|
| a + b | Returns the sum of a and b |
| a - b | Returns the result of subtracting b from a |
| a * b | Returns the result of multiplying a by b |
| a / b | Returns the result of dividing a by b |
| a % b | Returns the remainder when a is divided by b |
| a ** b | Returns the result of raising a to the power of b |
| a // b | Returns the floor division of a by b |
Finally, the numeric comparison operators are represented in Python as follows:
| Operation | Description |
|---|---|
| a == b | Returns whether a and b are equal |
| a != b | Returns whether a and b are not equal (the opposite of ==) |
| a > b | Returns whether a is greater than b |
| a < b | Returns whether a is less than b |
| a >= b | Returns whether a is greater than or equal to b |
| a <= b | Returns whether a is less than or equal to b |
Here are a few examples:
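For instance (the operand values are our own):

```python
print(3 == 3)   # True
print(3 != 3)   # False
print(5 > 2)    # True
print(5 <= 2)   # False
```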
And returning to our discussion earlier, we can see that even though ints and floats are different types, Python can recognize when their values represent the exact same number:
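A sketch of this, revisiting the 6 // 2 versus 6 / 2 example:

```python
print(3 == 3.0)         # True -- an int and a float can be equal
print(6 / 2 == 6 // 2)  # True -- 3.0 and 3 represent the same number
```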
In these examples, we’ve seen the values True and False produced as a result of these comparison expressions. You can probably tell exactly what they mean, but let’s take a moment to introduce them formally.
Booleans (bool)

In Python, boolean data is represented using the data type bool. Unlike the broad range of numbers we just saw, there are only two literal values of type bool: True and False.
There are three boolean operators we can perform on boolean values: not, and, and or.
>>> not True
False
>>> True and True
True
>>> True and False
False
>>> False or True
True
>>> False or False
False

One note about the or operator in Python: it is the inclusive or, meaning it produces True even when both of its operand expressions are True.
Just as we saw how arithmetic operator expressions can be nested within each other, we can combine boolean operator expressions, and even the arithmetic comparison operators:
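For example (the specific expressions are our own illustration):

```python
print(not (1 > 2))          # True  -- 1 > 2 is False, and not False is True
print((5 > 3) and (2 < 1))  # False -- only one operand is True
print((5 > 3) or (2 < 1))   # True  -- at least one operand is True
```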
Strings (str)

All Python code is text that we type into the computer, so how do we distinguish between text that’s code and text that’s data, like a person’s name? Python uses the str (short for “string”) data type to represent textual data. A str literal is a sequence of characters surrounded by single-quotes ('). Python allows string literals to be written using either single-quotes or double-quotes ("). We’ll tend to use single-quotes in this course to match how Python displays strings, as we’ll see in this section. For example, we could write this course’s name in Python as the string literal 'Foundations of Computer Science I'.
Now let’s see what kinds of operations we can perform on strings. First, we can compare strings using ==, just like we can for ints and floats:
Python supports string indexing to extract a single character from a string. Remember, string indexing starts at 0. s[0] represents the first character in the string s.
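A sketch of both operations (the example strings are our own):

```python
print('David' == 'David')  # True
print('David' == 'Mario')  # False
s = 'Hello'
print(s[0])  # H  (the console would display 'H')
print(s[4])  # o
```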
And Python supports concatenation using the familiar + operator:
>>> 'One string' + 'to rule them all.'
'One stringto rule them all.'
>>> 'One string ' + 'to rule them all.' # Note the extra space!
'One string to rule them all.'

One operation that we did not cover in Section 1.1 is a fun quirk of Python: string repetition.
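Repetition uses the * operator: a string times an int n produces the string repeated n times. A quick sketch (the example strings are our own):

```python
print('la' * 3)  # lalala  (the console would display 'lalala')
print(3 * 'la')  # lalala  -- the order of the operands doesn't matter
```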
And of course, all of these string operation expressions can be nested within each other:
Sets (set)

Python uses the set data type to store set data. A set literal matches the notation we use in mathematics: the literal begins with a { and ends with a }, and each element of the set is written inside the braces, separated from each other by commas. For example, {1, 2, 3} is a set of ints, and {1, 2.0, 'three'} is a set of elements of mixed types.
Like other data types, sets can be compared for equality using ==. Remember that element order does not matter when comparing sets!
Python also supports the “element of” (\(\in\)) set operation using the in operator.
Python also allows not and in to be combined to form an operator that corresponds to the set operation \(\notin\):
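A sketch of all three operations (the example sets are our own):

```python
print({1, 2, 3} == {3, 2, 1})  # True -- element order doesn't matter
print(2 in {1, 2, 3})          # True
print(4 in {1, 2, 3})          # False
print(4 not in {1, 2, 3})      # True
```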
We’ll see in the next chapter how other set operations such as union and intersection are supported in Python.
Lists (list, tuple)

Python uses two different data types to store list data: list and tuple. list literals are written the same way as set literals, except using square brackets instead of curly braces. Lists support the same operations we saw for strings and sets earlier:
>>> [1, 2, 3] == [1, 2, 3] # List equality comparison; order matters!
True
>>> [1, 2, 3] == [3, 2, 1]
False
>>> (['David', 'Mario', 'Jacqueline', 'Diane'])[0] # List indexing
'David'
>>> ['David', 'Mario'] + ['Jacqueline', 'Diane'] # List concatenation
['David', 'Mario', 'Jacqueline', 'Diane']
>>> 1 in [1, 2, 3] # List "element of" operation
True

tuple literals are written using regular parentheses instead, but otherwise support the above operations as well.
>>> (1, 2, 3) == (1, 2, 3) # Tuple equality comparison
True
>>> (1, 2, 3) == (3, 2, 1)
False
>>> ('David', 'Mario', 'Jacqueline', 'Diane')[0] # Tuple indexing
'David'
>>> ('David', 'Mario') + ('Jacqueline', 'Diane') # Tuple concatenation
('David', 'Mario', 'Jacqueline', 'Diane')
>>> 1 in (1, 2, 3) # Tuple "element of" operation
True

So why does Python have two different data types that represent the same kind of data? There is an important technical distinction between list and tuple that we’ll learn about later in this course, but for now we’ll generally stick with list.
Mappings (dict)

Python stores mapping data using a data type called dict, short for “dictionary”. dict literals are written similarly to sets, with the key and value of each pair separated by a colon. For example, we can represent the mapping from the previous section with the dictionary literal {'fries': 5.99, 'steak': 25.99, 'soup': 8.99}. In this dictionary, the keys are strings, and the values are floats.
But if both sets and dictionaries use curly braces, then does the literal {} represent an empty set or an empty dictionary? The answer (for historical reasons) is an empty dictionary—Python has no literal to represent an empty set. Instead, we represent an empty set with set(), which is syntax we haven’t yet seen and will explore later.
Dictionaries also support equality comparison using ==. They support key lookup using the same syntax as string and list indexing:
And finally, they support checking whether a key is present in a dictionary using the in operator:
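A sketch using the menu dictionary from above:

```python
menu = {'fries': 5.99, 'steak': 25.99, 'soup': 8.99}
print(menu == {'soup': 8.99, 'steak': 25.99, 'fries': 5.99})  # True -- pair order doesn't matter
print(menu['fries'])   # 5.99 -- key lookup, like string/list indexing
print('soup' in menu)  # True -- in checks the keys...
print(8.99 in menu)    # False -- ...not the values
```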
So far, we’ve been writing expressions in the Python console using only literals and operators. But as the computations we want to perform get more complex, relying on just literals and operators is very cumbersome. We can write very complex nested expressions, but this makes our code very hard to understand.
For example, suppose we’re given three points in the Cartesian plane \((1, 3)\), \((2, 5)\), \((10, -1)\) that form a path, and we want to find the length of this path.

We’d like to use this formula for the distance \(d\) between two points \((x_1, y_1)\) and \((x_2, y_2)\):
\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}. \]
We could write this as a single arithmetic expression and have Python evaluate it:
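For instance, the whole computation could be written as one (unwieldy) expression, using ** 0.5 for the square root:

```python
# Distance from (1, 3) to (2, 5), plus distance from (2, 5) to (10, -1).
total = ((1 - 2) ** 2 + (3 - 5) ** 2) ** 0.5 + ((2 - 10) ** 2 + (5 - (-1)) ** 2) ** 0.5
print(total)  # 12.23606797749979
```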
But typing in this expression is quite error-prone, and hard to understand. Just like in mathematics, we can improve our code by breaking down this problem into intermediate steps. Python (like all other programming languages) gives us a way to bind values to names, so that we can refer to those values later on in subsequent calculations.
A variable is a name that refers to a value. We create variables in Python using the syntax <variable> = <expression>,
which is a form of Python code called an assignment statement. You might wonder why we use the term “statement” rather than “expression” for assignment. An expression is a piece of Python code that is evaluated to produce a value. When we execute an assignment statement, it doesn’t produce a value—it instead defines a variable.
Python executes an assignment statement in two steps:

1. The expression on the right side of the = is evaluated, producing a value.
2. The variable name on the left side of the = is bound to that value.

After the assignment statement is executed, the variable may be used to refer to the value. Here’s how we can use variables to simplify the calculation above:
>>> distance1 = ((1 - 2) ** 2 + (3 - 5) ** 2) ** 0.5 # Distance between (1, 3) and (2, 5)
>>> distance2 = ((2 - 10) ** 2 + (5 + 1) ** 2) ** 0.5 # Distance between (2, 5) and (10, -1)
>>> distance1 # A variable is an expression; evaluating it produces the value it refers to
2.23606797749979
>>> distance2
10.0
>>> distance1 + distance2 # The total distance
12.23606797749979

Because variables are used to store intermediate values in computations, it is important to choose good variable names so that you can remember the purpose of each variable. This might not seem that important in our example above because there were only two variables, but as you start writing larger programs, you’ll have to grapple with dozens, if not hundreds, of variables, and choosing good names will be paramount.
For now, we’ll introduce a few simple rules that you should follow when choosing variable names:
All variable names should use only lowercase letters, digits, and underscores. So distance1, not Distance1.
When a variable name consists of multiple words, write each word in lowercase and separate them with an underscore. You aren’t allowed to use spaces in variable names. For example, we might create a variable to refer to the total distance by writing total_distance = distance1 + distance2.
We use the name total_distance rather than totaldistance or totalDistance (the latter is a naming style used in other programming languages, but not here).
Avoid single-letter variable names and non-standard acronyms/abbreviations, outside of some mathematical contexts.
For example, we might have used d1 and d2 instead of distance1 and distance2 because d is the variable we used for distance in our above formula. However, we should not use td instead of total_distance, because a second person wouldn’t immediately understand what td stands for.
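Putting these rules together, here’s a sketch of the earlier distance calculation with a well-named result variable (the name total_distance follows the multi-word rule above):

```python
distance1 = ((1 - 2) ** 2 + (3 - 5) ** 2) ** 0.5
distance2 = ((2 - 10) ** 2 + (5 + 1) ** 2) ** 0.5

# A multi-word name: lowercase words separated by underscores
total_distance = distance1 + distance2   # 12.23606797749979
```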
As our programs get larger, it is useful to have a principled way to keep track of the variables and data used by the programs. A memory model is a structured way of representing variables and data in a program. The term “memory” here refers to the computer memory used to actually store the data. For the next few weeks, we’re going to use the value-based Python memory model, which simply uses a table to represent the associations between variables and their associated values. For example, the value-based memory model for our above example is the following:
| Variable | Value |
|---|---|
| distance1 | 2.23606797749979 |
| distance2 | 10.0 |
To wrap up our introduction to data in Python, we’re going to learn about one last kind of expression that allows us to build up and transform large collections of data in Python.
Recall set builder notation, which is a concise way of defining a mathematical set by specifying the values of the elements in terms of a larger domain. For example, suppose we have a set \(S = \{1, 2, 3, 4, 5\}\). We can express a set of squares of the elements of \(S\): \[\{ x^2 \mid x \in S \}.\]
It turns out that this notation translates naturally to Python! To start, let’s go into the Python Console and create a variable that refers to a set of numbers:
Now, we introduce a new kind of expression called a set comprehension, which has the following syntax: {<expr> for <variable> in <collection>}. Careful with this: even though set comprehensions also use curly braces, they are not the same as set literals. We aren’t writing out the individual elements separated by commas.
Evaluating a set comprehension is done by taking the <expr> and evaluating it once for each value in <collection> assigned to the <variable>. This is exactly analogous to set builder notation, except using for instead of \(|\) and in instead of \(\in\). Here’s how we can repeat our initial example in Python using a set comprehension:
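For example, assuming we created a variable numbers referring to the set {1, 2, 3, 4, 5}, the comprehension might look like this sketch:

```python
numbers = {1, 2, 3, 4, 5}

# Evaluate x ** 2 once for each element of numbers
{x ** 2 for x in numbers}   # the set {1, 4, 9, 16, 25}
```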
Pretty cool, eh? If you aren’t sure exactly what happened here, it’s useful to write out the expanded form of the set comprehension:
{x ** 2 for x in numbers}
== {1 ** 2, 2 ** 2, 3 ** 2, 4 ** 2, 5 ** 2} # Replacing x with 1, 2, 3, 4, and 5.

It goes even further—we can use set comprehensions with a Python list as well.
In fact, as we’ll see later in this course, set comprehensions can be used with any “collection” data type in Python, not just sets and lists.
Even though set comprehensions draw their inspiration from set builder notation in mathematics, Python has extended them to other data types.
A list comprehension is very similar to a set comprehension, except its syntax uses square brackets instead of curly braces: [<expr> for <variable> in <collection>].
Once again, <collection> can be a set or a list:
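Here’s a sketch showing a list comprehension over both kinds of collection (the particular numbers are just examples):

```python
# With a list, the order of the results matches the input order
[x ** 2 for x in [1, 2, 3, 4, 5]]   # [1, 4, 9, 16, 25]

# With a set, don't rely on any particular order of the results
[x * 10 for x in {1, 2, 3}]
```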
One word of warning: because sets are unordered but lists are ordered, you should not assume a particular ordering of the elements when a list comprehension generates elements from a set—the results can be unexpected!
A dictionary comprehension is again similar to a set comprehension, but specifies both an expression to generate keys and an expression to generate their associated values: {<key_expr>: <value_expr> for <variable> in <collection>}.
Out of all three comprehension types, dictionary comprehensions are the most complex, because the left-hand side (before the for) consists of two expressions instead of one. Here is one example of a dictionary comprehension that creates a “table of values” for the function \(f(x) = x^2 + 1\).
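Here’s a sketch of that table of values; the particular inputs 1, 2, 3 are just an example:

```python
# Keys are x, values are f(x) = x ** 2 + 1
{x: x ** 2 + 1 for x in [1, 2, 3]}   # {1: 2, 2: 5, 3: 10}
```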
Our last example in this section will be to illustrate how multiple variables are used within the same comprehension expression. First, recall how we defined the Cartesian product of two sets using set builder notation: \[ A \times B = \{ (x, y) \mid x \in A \text{ and } y \in B \}.\] In this expression, the expression \((x, y)\) is evaluated once for every possible combination of elements \(x\) of \(A\) and elements \(y\) of \(B\).
The same holds for set, list, and dictionary comprehensions. We can specify additional variables in a comprehension by adding extra for <variable> in <collection> clauses to the comprehension. For example, if we define the following sets:
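For concreteness, here is one possible pair of definitions, chosen to match the output shown below:

```python
nums1 = {1, 2, 3}
nums2 = {10, 20, 30}
```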
then we can calculate their Cartesian product using the following set comprehension. (Remember, sets are unordered! Don’t get hung up on the unusual order in the output.)
>>> {(x, y) for x in nums1 for y in nums2}
{(3, 30), (2, 20), (2, 10), (1, 30), (3, 20), (1, 20), (3, 10), (1, 10), (2, 30)}

In general, if we have a comprehension with clauses for v1 in collection1, for v2 in collection2, etc., then the comprehension’s inner expression is evaluated once for each combination of values for the variables. This illustrates yet another pretty impressive power of Python: the ability to combine different collections of data together in a short amount of code.
The physics behind how we perceive colour is incredibly interesting, but also complex. Humans have developed a broad range of colour names to identify categories like “red” in everyday language (although the names we use for colours vary widely from language to language!). Yet these categories can be fairly broad and imprecise: useful for everyday communication, but not for computer graphics and design. So in this section, we’ll learn about how computers represent colour data.
Mathematics can help us represent colours by a combination of numbers; the set of rules for how numbers map to colours is called a colour model. Many colour models exist, but one of the most common is the RGB colour model. At some point in your youth, you may have discovered that mixing two colours together (i.e., with paint, crayons, etc.) produces a different colour. The RGB colour model is based on the same idea: each colour is represented by three numbers, one for the “amount” of red, green, and blue to be mixed together.
A common form of the RGB colour model in a computer is called the RGB24 colour model, which allows each of the red, green, and blue amounts to be a number between 0 and 255, inclusive. Though RGB24 is quite common, software like Photoshop allows for a larger range of numbers, enabling more granularity in their colour representations. (You can look up the term deep colour to find out more about more sophisticated colour models.) Formally, we can define the set \(S = \{0, 1, \dots, 255\}\) and \(\mathcal{C}\) to be the set of all possible colours in the universe. Then the RGB colour model is a function \(RGB_{24}: S \times S \times S \to \mathcal{C}\) that takes in red, green, and blue values from \(S\) and returns a colour. This \(RGB_{24}\) function is one-to-one, as every combination of (red, green, blue) values produces a different colour.
| RGB Value | Colour |
|---|---|
| (0, 0, 0) | black |
| (255, 0, 0) | red |
| (0, 255, 0) | green |
| (0, 0, 255) | blue |
| (181, 57, 173) | a shade of purple |
| (255, 255, 255) | white |
The RGB24 colour model translates naturally to Python: we represent a colour value as a tuple of three integers, where each integer is between 0 and 255, inclusive. For example, we can use (0, 0, 0) to represent a pure black, and (181, 57, 173) to represent a shade of purple. Of course, just representing these values as tuples doesn’t automatically make them colours:
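A small sketch illustrating this point (the variable names black and purple are just for illustration):

```python
black = (0, 0, 0)          # pure black
purple = (181, 57, 173)    # a shade of purple

type(purple)   # <class 'tuple'> -- still just a tuple of ints, not a colour
```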
But as you’ll see in your first tutorial this year, we can pass these tuples to operations that expect colour values, and get remarkable results.

In the previous chapter, we began our study of programming in Python by studying three main ingredients: literals, operators, and variables. We can express complex computations using just these forms of Python code, but as the tasks we want to perform grow more complex, so too does the code we need to write. In this chapter, we’ll learn about using functions in Python to organize our code into useful logical blocks that can be worked on separately and reused again and again in our programs.
Before looking at functions in Python, we’ll first review some of the mathematical definitions related to functions from the First-Year CS Summer Prep.
Let \(A\) and \(B\) be sets. A function \(f : A \to B\) is a mapping from elements in \(A\) to elements in \(B\). \(A\) is called the domain of the function, and \(B\) is called the codomain of the function.
Functions can have more than one input. For sets \(A_1, A_2, \dots, A_k\) and \(B\), a \(k\)-ary function \(f: A_1 \times A_2 \times \dots \times A_k \to B\) is a function that takes \(k\) arguments, where for each \(i\) between \(1\) and \(k\), the \(i\)-th argument of \(f\) must be an element of \(A_i\), and where \(f\) returns an element of \(B\). We have common English terms for small values of \(k\): unary, binary, and ternary functions take one, two, and three inputs, respectively. For example, the addition operator \(+ : \R \times \R \to \R\) is a binary function that takes two real numbers and returns their sum. For readability, we usually write this function as \(x+y\) instead of \(+(x,y)\).
We’ve seen that Python has many operators, like +, that can be used on various data types. These operators are actually functions represented by symbols (e.g., addition through the + symbol). But there aren’t enough symbols to represent every function we could ever want. So Python also defines several functions that we can use to perform additional operations; these functions are called built-in functions, as they are made automatically available to us anywhere in a Python program. For example, the Python function abs takes a single numeric input and returns its absolute value. But how do we actually use it?
A Python expression that uses a function to operate on a given input is called a function call, and has the same syntax as mathematics: <function_name>(<argument>, <argument>, ...). For example, here are two examples of function call expressions that call abs:
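In the console, these calls might look like this sketch (the particular arguments are just examples):

```python
abs(-10)    # 10
abs(13.5)   # 13.5 (already non-negative)
```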
Function calls are central to programming, and come with some new terminology that we’ll introduce now and use throughout the next year.
- In the function call abs(-10), the -10 is the argument of the function call.
- When we evaluate abs(-10), we say that -10 is passed to abs.
- The value returned by the call abs(-10) is 10.

In your mathematical studies so far, you’ve mainly studied unary numeric functions, i.e., functions that take in just one numeric argument and return another number. Examples include the sin and log functions. In programming, however, it is very common to work with functions that operate on a wide variety of data types, and that take varying numbers of arguments. Here are a few examples of built-in functions that go beyond taking a single numeric argument:
The len function takes a string or collection data type (e.g., set, list) and returns the size of its input. While we defined “size” of these data types back in Section 1.1, we didn’t cover them in Python in the last chapter because we were waiting to get to functions.
The sum function takes a collection of numbers (e.g., a set or list whose elements are all numbers) and returns the sum of the numbers.
The sorted function takes a collection and returns a list that contains the same elements as the input collection, sorted in ascending order.
The max function is a bit special, because there are two ways it can be used. When it is called with two or more inputs, those inputs must be numeric, and in this case max returns the largest one.
But max can also be called with just a single argument, a non-empty collection of numbers. In this case, max returns the largest number in the collection.
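Here’s a sketch showing each of these built-in functions in action; the particular argument values are just examples:

```python
len('David')          # 5 (the number of characters)
len({10, 20, 30})     # 3 (the number of elements)
sum([1.5, 2.5])       # 4.0
sorted({3, 1, 2})     # [1, 2, 3]
max(4, 7, 2)          # 7 (two or more numeric arguments)
max([4, 7, 2])        # 7 (a single non-empty collection)
```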
The range function we saw in the last chapter takes in two integers start and stop and returns a value representing the range of consecutive numbers from start to stop - 1, inclusive. For example, range(5, 10) represents the sequence of numbers 5, 6, 7, 8, 9. If start >= stop, then range(start, stop) represents an empty sequence.
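One way to see exactly which numbers a range represents is to convert it to a list, as in this sketch:

```python
list(range(5, 10))   # [5, 6, 7, 8, 9]
list(range(5, 5))    # [] (empty when start >= stop)
```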
type

The last built-in function we’ll cover in this section is type, which takes any Python value and returns its type. (The term class that you see in the output below is the name Python uses to mean “data type”. More on this later.) Let’s check it out:
>>> type(3)
<class 'int'>
>>> type(3.0)
<class 'float'>
>>> type('David')
<class 'str'>
>>> type([1, 2, 3])
<class 'list'>
>>> type({'a': 1, 'b': 2})
<class 'dict'>

If you’re ever unsure about the type of a particular value or variable, you can always call type on it to check!
Just like other Python expressions, you can write function calls within each other, or mix them with other kinds of expressions.
However, just as we saw with deeply nested arithmetic expressions earlier, too much nesting can make Python expressions difficult to read and understand, and so it is a good practice to break down a complex series of function calls into intermediate steps using variables:
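Here’s an illustrative sketch of this practice; the particular functions and variable names are just examples:

```python
# Nested version: harder to read at a glance
result = len(sorted([3, 1, 2])) + abs(-4)

# The same computation, broken into named intermediate steps
length = len(sorted([3, 1, 2]))   # 3
absolute = abs(-4)                # 4
result = length + absolute       # 7
```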
The built-in functions we’ve studied so far all have one interesting property in common: they can all be given arguments of at least two different data types. For example, abs works with both int and float, len and sorted work with sets and lists (and other collections), and type works with values of absolutely any data type. In fact, this is true for almost all built-in functions in Python, as part of the design of the language itself.
However, Python’s data types also support operations that are specific to that particular data type: for example, there are many operations we can perform on strings that are specific to textual data, and that wouldn’t make sense for other data types.
Python comes with many functions that perform these operations, but handles them a bit differently than the built-in functions we’ve seen so far. A function that is defined as part of a data type is called a method. The terms function and method are sometimes blurred in programming, particularly from language to language, but for us these terms have precise and distinct meanings! All methods are functions, but not all functions are methods. For example, the built-in functions we looked at above are all not methods. We refer to functions that are not methods as top-level functions. We’ll see later how we define functions and methods in Python, but for now let’s look at a few examples of methods.
One str method in Python is called lower, and has the effect of taking a string like 'David' and returning a new string with all uppercase letters turned into lowercase: 'david'. To call this method, we refer to it by first specifying the name of the data type it belongs to (str), followed by a period (.) and then the name of the method.
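For example, a call to this method might look like:

```python
str.lower('David')   # 'david'
```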
Here are a few other examples of methods for different data types, just to give you a sense of the kinds of operations that are allowed.
>>> str.split('David wuz hear') # str.split splits a string into words
['David', 'wuz', 'hear']
>>> set.union({1, 2, 3}, {2, 10, 20}) # set.union performs the set union operation
{1, 2, 3, 20, 10}
>>> list.count([1, 2, 3, 1, 2, 4, 2], 2) # list.count counts the number of times a value appears in a list
3 # (remember, a list can have duplicates!)

Python provides many built-in top-level functions and methods for us, but as we start writing more code, it is essential for us to be able to create our own functions specific to the problem we are solving. In this section, we’ll learn how to define our own top-level functions in Python. And later on, we’ll study how to define our own data types and methods as well.
First, let’s recall how we define a function in mathematics. We first specify the function name, domain, and codomain: for example, \(f: \R \to \R\). Then, we write the function header and body, usually in a single line: for example, \(f(x) = x^2\). We do this so often in mathematics that we often take parts of this for granted, for example leaving out the domain/codomain specification, and usually choosing \(f\) as the function name and \(x\) as the parameter name. However, the functions we’ll implement in Python are much more diverse, and so it will be important to be explicit in every part of this process.
Here is the complete definition of a “squaring” function in Python. Take a moment to read through the whole definition, and then continue reading to learn about this definition’s different parts.
def square(x: float) -> float:
    """Return x squared.

    >>> square(3.0)
    9.0
    >>> square(2.5)
    6.25
    """
    return x ** 2

This function definition is the most complex form of Python code we’ve seen so far, so let’s break this down part by part.
The first line, def square(x: float) -> float: is called the function header. Its purpose is to convey the following pieces of information:
- The name of the function (square).
- The name and type of each parameter; here, there is a single parameter with name x and type float.
- The return type of the function, which appears after the ->: float.

In this example, the function’s parameter and return type are the same, but this won’t always be the case. The syntax for a function header for a unary function is: def <function_name>(<parameter>: <parameter_type>) -> <return_type>:
Compared to our mathematical version, there are two main differences. First, we chose the name square rather than f as the function name; in Python, we will always pick descriptive names for our functions rather than relying on the conventional “\(f\)”. And second, we use data types to specify the function domain and codomain: the code x: float specifies that the parameter x must be a float value, and the code -> float specifies that this function always returns a float value.
We can express this restriction in an analogous way to \(f: \R \to \R\) by writing float -> float; we call float -> float the type contract of the square function.
The next several lines, which start and end with triple quotes ("""), are called the function docstring. This is another way of writing a comment in Python: text that is meant to be read by humans, but not executed as Python code. The goal of the function docstring is to communicate what the function does.
The first part of the docstring, Return x squared., is an English description of the function. The second part might look a bit funny at first, since it seems like Python code (or, more precisely, like the Python console!).
This part of the docstring shows example uses of the function, just like the examples we showed of built-in functions in the previous section. You can read the first example literally as “when you type square(3.0) into the Python console, 9.0 is returned” and the second as “when you type square(2.5) into the Python console, 6.25 is returned”. These examples are called doctest examples, for a reason we’ll see in a future section. While an English description may technically be enough to specify the function’s behaviour, doctest examples are invaluable for aiding understanding of the function’s behaviour (which is why we use them in teaching as well!).
The function docstring is indented inside the function header, as a visual indicator that it is part of the overall function definition. Unlike many other programming languages, this kind of indentation in Python is mandatory rather than merely recommended. Python’s designers felt strongly enough about indentation improving readability of Python programs that they put indentation requirements like this into the language itself.
The final line, return x ** 2, is called the body of the function, and is the code that is executed when the function is called. Like the function docstring, the function body is also indented so that it is “inside” the function definition.
This code uses another keyword, return, which signals a new kind of statement: the return statement, which has the form return <expression>.
When a return statement is executed, the following happens:
1. The <expression> is evaluated, producing a value.
2. The function call ends, and this value is returned to where the function was called.

In the previous section, we called built-in functions and took for granted that they worked properly, without worrying about how they work. Now that we’re able to define our own functions, we are ready to fully understand what happens when a function is called.
As an example, suppose we’ve defined square as above, and then call it in the Python console by entering square(2.5):
When we press Enter, the Python interpreter evaluates the function call by doing the following:
1. Evaluate the argument: 2.5, and then assign 2.5 to the function parameter x.
2. Execute the body of the square function, by doing the following:
   1. Evaluate the return expression x ** 2, which is 6.25 (since x refers to the value 2.5).
   2. Return 6.25 back to the Python console.
3. The function call square(2.5) evaluates to 6.25, and this is displayed on the screen.

As we observed in the previous section, we can combine multiple function calls within a single expression. What happens when we call square twice in the same expression? For example: square(2.5) + square(-1.0).
We can step through this as well; notice how we’ve duplicated the text from before to illustrate the similarities between calling square(2.5) and square(-1.0).
1. Python evaluates the operands of + in left-to-right order, so square(2.5) is evaluated first.
   1. Evaluate the argument: 2.5, and then assign 2.5 to the function parameter x.
   2. Execute the body of the square function, by doing the following:
      1. Evaluate the return expression x ** 2, which is 6.25 (since x refers to 2.5).
      2. Return 6.25 back to the Python console.
2. Next, square(-1.0) is evaluated.
   1. Evaluate the argument: -1.0, and then assign -1.0 to the function parameter x.
   2. Execute the body of the square function, by doing the following:
      1. Evaluate the return expression x ** 2, which is 1.0 (since x refers to -1.0).
      2. Return 1.0 back to the Python console.
3. Finally, the console evaluates 6.25 + 1.0, which evaluates to 7.25. This value is displayed on the screen.

While it is possible to define functions directly in the Python console, this isn’t a good approach: every time we restart the Python console, we lose all our previous definitions. So instead, we save functions in files so that we can reuse them across multiple sessions in the Python console (and in other files).
For example, suppose we have the following file called my_functions.py:
def square(x: float) -> float:
    """Return x squared.

    >>> square(3.0)
    9.0
    >>> square(2.5)
    6.25
    """
    return x ** 2

In PyCharm, we can right-click and select “Run File in Python Console”. This will start the Python console and run our file, which then allows us to call our function square just like any built-in function:
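For example, the console session might then look like this sketch (square is repeated here so the snippet is self-contained):

```python
# The same square function saved in my_functions.py
def square(x: float) -> float:
    """Return x squared."""
    return x ** 2

# After "Run File in Python Console", we can call it directly:
square(3.0)   # 9.0
square(2.5)   # 6.25
```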
Let’s now look at a more complex example that will illustrate a function definition that takes in more than one parameter.
Recall the distance formula from Section 1.4 to calculate the distance between two points \((x_1, y_1), (x_2, y_2)\) in the Cartesian plane: \[d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}\]
We’ll now write a function in Python that calculates this formula. This function will take two inputs, where each input is a tuple of two floats, representing the \(x\)- and \(y\)-coordinates of each point. When we define a function with multiple parameters, we write the name and type of each parameter using the same format we saw earlier, with parameters separated by commas from each other. Here is the function header and docstring:
def calculate_distance(p1: tuple, p2: tuple) -> float:
    """Return the distance between points p1 and p2.

    p1 and p2 are tuples of the form (x, y), where x and y are the coordinates of the point.

    >>> calculate_distance((0, 0), (3.0, 4.0))
    5.0
    """

In order to use the above formula, we need to extract the coordinates from each point. This is a good reminder of tuple indexing, and the fact that function bodies can consist of more than one statement. Remember: the function body’s statements are executed one at a time until a return statement is executed.
Now that we have the four coordinates, we can apply the above formula and return the result.
Putting this all together, we have:
def calculate_distance(p1: tuple, p2: tuple) -> float:
    """Return the distance between points p1 and p2.

    p1 and p2 are tuples of the form (x, y), where x and y are the coordinates of the point.

    >>> calculate_distance((0, 0), (3.0, 4.0))
    5.0
    """
    x1 = p1[0]
    y1 = p1[1]
    x2 = p2[0]
    y2 = p2[1]
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

Our above function body is perfectly correct, but you might notice that the ** 2 expressions exactly mimic the body of the first function we defined in this section: square. And so we can reuse the square function inside the body of calculate_distance:
def calculate_distance(p1: tuple, p2: tuple) -> float:
    """Return the distance between points p1 and p2.

    p1 and p2 are tuples of the form (x, y), where x and y are the coordinates of the point.

    >>> calculate_distance((0, 0), (3.0, 4.0))
    5.0
    """
    x1 = p1[0]
    y1 = p1[1]
    x2 = p2[0]
    y2 = p2[1]
    return (square(x1 - x2) + square(y1 - y2)) ** 0.5

This example of function reuse is quite small, but as our programs grow larger, it will be essential to organize our code into different functions. We’ll explore this idea in more detail, and other principles of good function and program design, throughout this course.
One of the key purposes of functions is to separate different computations in a program, so that we don’t have to worry about them all at once. When we write our code in separate functions, we can focus on working with just a single function, and ignore the rest of the code in other functions.
One way in which Python supports this way of designing programs is by separating the variables in each function, so that a function call can only access its own variables, but not variables defined within other functions. In this section, we’ll explore how this works, learning more about how Python keeps track of function calls and variables.
Consider the example from the previous section:
def square(x: float) -> float:
    """Return x squared.

    >>> square(3.0)
    9.0
    >>> square(2.5)
    6.25
    """
    return x ** 2

The parameter x is a variable that is assigned a value when the function is called. Because this variable is only useful inside the function body, Python does not allow it to be accessed from outside the body. We say that x is a local variable of square because it is limited to the function body. Here is another way to put it, using an important new definition. The scope of a variable is the places in the code where that variable can be accessed. A local variable of a function is a variable whose scope is the body of that function.
Let’s illustrate by first creating a variable in the Python console, and then calling square.
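Here’s a sketch of that console session (square is repeated so the snippet is self-contained; the names n and result match the discussion below):

```python
def square(x: float) -> float:
    """Return x squared."""
    return x ** 2

n = 10.0
result = square(n + 3.5)   # the argument n + 3.5 evaluates to 13.5
result                      # 182.25
```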
We know that when square is called, its argument expression n + 3.5 is evaluated first, producing the value 13.5, which is then assigned to the parameter x. Now let’s consider what the memory model looks like when the return statement is evaluated. A naive diagram would simply show the two variables n and x and their corresponding values. (We do not show result because it hasn’t been assigned a value yet; this only happens after square returns.)
| Variable | Value |
|---|---|
| n | 10.0 |
| x | 13.5 |
But this is very misleading! In our memory model diagrams, we group the variables together based on whether they are introduced in the Python console or inside a function:
__main__

| Variable | Value |
|---|---|
| n | 10.0 |

square

| Variable | Value |
|---|---|
| x | 13.5 |
We use the name __main__ to label the table for variables defined in the Python console. This is a special name in Python—more on this later. Inside the body of square, the only variable that can be used is x, and outside, in the Python console, the only variable that can be used is n. This may seem a bit tricky at first, but these memory model diagrams are a good way to visualize what’s going on. At the point that the body of square is evaluated, only the “square” table in the memory model is active:
__main__

| Variable | Value |
|---|---|
| n | 10.0 |

square

| Variable | Value |
|---|---|
| x | 13.5 |
But after square returns and we’re back to the Python console, the “square” table is no longer accessible, and only the __main__ table is active:
__main__

| Variable | Value |
|---|---|
| n | 10.0 |
| result | 182.25 |

square

| Variable | Value |
|---|---|
| x | 13.5 |
Trying to access variable x from the Python console results in an error:
>>> n = 10.0
>>> square(n + 3.5)
182.25
>>> x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined

The principle of “separate tables” in our memory model applies even when we use the same variable name in two different places. Suppose we modify our example above to use x instead of n in the Python console:
Following the same reasoning as above, the argument expression x + 3.5 is evaluated to produce 13.5, which is then assigned to the parameter x. Does this modify the x variable in the Python console? No! They are different variables even though they share the same name.
__main__

| Variable | Value |
|---|---|
| x | 10.0 |

square

| Variable | Value |
|---|---|
| x | 13.5 |
We can confirm this after the function call is evaluated by checking the value of the original x.
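Here’s a sketch of that check (square is repeated so the snippet is self-contained):

```python
def square(x: float) -> float:
    """Return x squared."""
    return x ** 2

x = 10.0
square(x + 3.5)   # 182.25
x                  # 10.0 -- the console's x was not modified by the call
```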
Here is what our memory model looks like after square has returned:
__main__

| Variable | Value |
|---|---|
| x | 10.0 |
| result | 182.25 |

square

| Variable | Value |
|---|---|
| x | 13.5 |
Our last example in this section involves two functions, one of which calls the other:
def square(x: float) -> float:
    """Return x squared.

    >>> square(3.0)
    9.0
    >>> square(2.5)
    6.25
    """
    return x ** 2


def square_of_sum(numbers: list) -> float:
    """Return the square of the sum of the given numbers."""
    total = sum(numbers)
    return square(total)

Let’s first call our new function square_of_sum in the Python console:
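For example, such a call might look like this sketch (both definitions are repeated so the snippet is self-contained):

```python
def square(x: float) -> float:
    """Return x squared."""
    return x ** 2

def square_of_sum(numbers: list) -> float:
    """Return the square of the sum of the given numbers."""
    total = sum(numbers)
    return square(total)

square_of_sum([1.5, 2.5])   # 16.0 (the sum is 4.0, and 4.0 squared is 16.0)
```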
We can trace what happens at three points when we call square_of_sum:
1. Right before square_of_sum is called (from the console)
2. Right before square is called (from square_of_sum)
3. Right before square returns
Tracing these three points, we see how the list [1.5, 2.5] is passed from the console to square_of_sum, and how the number 4.0 is passed from square_of_sum to square.
Now suppose we wanted to do something a bit silly: have square access total instead of x. We know from our memory model that these variables should be assigned the same value, so the program’s behaviour shouldn’t change, right?
def square(x: float) -> float:
    """Return x squared.

    >>> square(3.0)
    9.0
    >>> square(2.5)
    6.25
    """
    return total ** 2  # Now we're using total instead of x


def square_of_sum(numbers: list) -> float:
    """Return the square of the sum of the given numbers."""
    total = sum(numbers)
    return square(total)

Let’s see what happens when we try to call square_of_sum in the Python console now:
>>> nums = [1.5, 2.5]
>>> square_of_sum(nums)
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "<input>", line 15, in square_of_sum
File "<input>", line 9, in square
NameError: name 'total' is not defined

An error occurs! Let’s take a look at the state of memory when square is called (this is the same as above):
__main__

| Variable | Value |
|---|---|
| nums | [1.5, 2.5] |

square_of_sum

| Variable | Value |
|---|---|
| numbers | [1.5, 2.5] |
| total | 4.0 |

square

| Variable | Value |
|---|---|
| x | 4.0 |
Well, there is indeed both a total variable and an x variable with the same value, 4.0. So why are we getting this error? The reason is Python’s rule for local scope: a local variable can only be accessed in the body of the function in which it is defined. Here, the statement return total ** 2 is in the body of square, but attempts to access the local variable of a different function (square_of_sum). When the Python interpreter attempts to retrieve the value of total, it looks only in the scope of square, and doesn’t find total, resulting in a NameError.
The somewhat non-intuitive point about this behaviour is that this happens even when square_of_sum is still active. In our example, square is called from within square_of_sum, and so the variable total does exist in Python’s memory—it just isn’t accessible. While this might seem like a limitation of the language, it’s actually a good thing: this prevents you from accidentally using a variable from a completely different function when working on a function.
In this section, we learned about how Python handles local variables, by making them accessible only from within the function that they are defined. Though we hope this makes intuitive sense, some of the details and diagrams we presented here were fairly technical. We recommend coming back to this section in a few days and reviewing this material, perhaps by explaining in your own words what’s happening in each example. You can also practice drawing this style of memory model diagram for future code that you write.
So far we have learned about Python’s built-in functions and various data type methods. But these form a small fraction of all the functions that the Python programming language comes with. Python’s other functions (and even other data types) are separated into various modules, which is another name we give to Python code files. Unlike the functions and data types we’ve seen so far, these modules are not automatically loaded when we run the Python interpreter, as they contain more specialized functions and data types. So in this section, we’re going to learn how to load one of these modules and use its definitions.
To load a Python module, we use a piece of code called an import statement, which has the following syntax:
For example, here is how we could load the math module in the Python console:
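Here is a sketch of what that might look like (the exact console display may vary by Python version):

```python
# Importing the math module makes its definitions available
# under the new variable `math`.
import math

# The name `math` now refers to the module object itself.
print(type(math))  # <class 'module'>
```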
Like the other statements we’ve seen so far, import statements do not produce a value, but they do have an important effect. An import statement introduces a new variable (the name of the module being imported) that can be used to refer to all definitions from that module.
For example, the math module defines a function log2 which computes the base-2 logarithm of a number. To access this function, we use dot notation. This notation is the same as the one for accessing data type methods, but log2 is not a method: it’s a top-level function, just one that happens to be defined in the math module.
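A call to log2 using dot notation might look like this:

```python
import math

# <module>.<function>: access log2 through the math module
result = math.log2(8)
print(result)  # 3.0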
What other functions are contained in the math module? We’ll make use of a few others later in this course, but if you’re curious you can call the special built-in function dir on the module (or any other module) to see a list of functions and other variables defined in it:
>>> dir(math)
['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']
Ignoring the first few names with the double underscores, we see some familiar-looking names, like ceil, floor, pi, and sin. We’ve linked to the documentation for the math module in the References section below.
The datetime module

Python comes with far more modules than we’ll have time to learn about in this course. However, just to illustrate the breadth of these modules, we’ll briefly introduce one more that will be useful occasionally throughout the course.
The datetime module provides not just functions but new data types for representing time-based data. The first data type we’ll study here is date, which is a data type that represents a specific date.
>>> import datetime
>>> canada_day = datetime.date(1867, 7, 1) # Create a new date
>>> type(canada_day)
<class 'datetime.date'>
>>> term_start = datetime.date(2020, 9, 10)
>>> datetime.date.weekday(term_start) # Return the day of the week of the date
3 # 0 = Monday, 1 = Tuesday, etc.
Note the double use of dot notation in that last expression. datetime.date is the data type being accessed, and .weekday accesses a method of that data type.
We can compare dates for equality using == and in chronological order (e.g., using < to check whether one date comes before another). We can also subtract dates, which is pretty cool:
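Here is a sketch of these operations, reusing the canada_day and term_start dates from above (output shown as comments):

```python
import datetime

canada_day = datetime.date(1867, 7, 1)
term_start = datetime.date(2020, 9, 10)

# < compares two dates chronologically
print(canada_day < term_start)   # True

# Subtracting one date from another produces a time interval
print(term_start - canada_day)   # 55954 days, 0:00:00
```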
The difference between two dates is an instance of the datetime.timedelta data type, which is used to represent an interval of time. What the above expression tells us is that 55,954 days have passed between the first day of the fall semester and the day of Canada’s confederation. Fun fact: Canada’s confederation first consisted of only four provinces: Ontario, Quebec, Nova Scotia, and New Brunswick.
Up to this point, we’ve covered several different data types, functions, methods, and now modules in Python. It might be starting to feel a bit daunting, and we wanted to take a moment to pause and look at the bigger picture. Our goal in showing you these elements of Python is not to overwhelm you, but instead to give you a taste of the language’s powerful computational capabilities. But this course is not about memorizing different functions, data types, and modules in Python! All throughout this course, you’ll have access to references and documentation that describe the functionality of these different elements, and will have lots of opportunities to practice using them. For now, all we want you to know is simply that these capabilities exist, how to experiment with them in the Python console, and how to look up information about them.
Often when beginners are tasked with writing a program to solve a problem, they jump immediately to writing code. It doesn’t matter whether the code is correct or not, or even whether they fully understand the problem: somehow the allure of filling up the screen with text is too tempting. So before we go further in our study of the Python programming language, we’ll introduce the Function Design Recipe, a structured process for taking a problem description and designing and implementing a function in Python to solve the problem.
Consider the following example problem: write a function to determine whether or not a number is even. We’ll use this example to illustrate the five steps of the Function Design Recipe.
1. Write example uses. Pick a name for the function (often a verb or verb phrase). Sometimes a good name is a short answer to the question “What does your function do?” Write one or two examples of calls to your function and the expected return values. Include an example of a standard case (as opposed to a tricky case). Put the examples inside a triple-quoted string that you’ve indented, since it will be the beginning of the docstring.
2. Write the function header. Write the function header above the docstring (not indented). Choose a meaningful name for each parameter (often nouns). Include the type contract (the types of the parameters and return value).
3. Write the function description. Before the examples, add a description of what the function does and mention each parameter by name, or otherwise make sure the purpose of each parameter is clear. Describe the return value.
4. Implement the function body. Write the body of the function and indent it to match the docstring. To help yourself write the body, review your examples from the first step and consider how you determined the return values. You may find it helpful to write a few more example calls.
5. Test the function. Test your function on all your example cases, including any additional cases you created in the previous step. Additionally, try it on extra tricky or corner cases. One simple way to test your function is by calling it in the Python console. In the next section, we’ll discuss more powerful ways of testing your code. If you encounter any errors or incorrect return values, first make sure that your tests are correct, and then go back to Step 4 and try to identify and fix any possible errors in your code. This is called debugging your code, a process we’ll discuss throughout this course.
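To make the recipe concrete, here is one possible end result for the even-number problem; the name is_even and the exact wording of the docstring are our own choices, not the only valid ones:

```python
def is_even(value: int) -> bool:
    """Return whether value is divisible by 2.

    >>> is_even(10)
    True
    >>> is_even(3)
    False
    """
    return value % 2 == 0
```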
The Function Design Recipe places a large emphasis on developing a precise and detailed function header and docstring before writing any code for the function body. There are two main benefits to doing this.
First, when you are given a programming task—“Write a function to do X”—you want to make sure you fully understand the goal of that function before trying to solve it. Forcing yourself to write out the function header and docstring, with examples, is an excellent way to reinforce your understanding about what you need to do.
Second, as you begin to work on larger projects and write dozens or hundreds of functions, it is easy to lose track of what each function does. The function header and docstring serve as documentation for the function, communicating to others (and to your future self) what that function is supposed to do. Your choices for the function’s name, its parameter names, its type contract, its docstring examples, and its description can make the difference between code that is easy to work on and maintain, and code that is undecipherable.
So the bottom line is you should follow this process for all of the functions you’ll write in this course, and beyond—trust us, it will save you lots of time and headaches!
doctest and pytest

The last step of the Function Design Recipe is to test your code—but how? In this section, we’ll discuss the different strategies for testing code that you’ll use during the term, and beyond. As you write more and more complex programs in this course, it will be vital to maintain good habits to support you in your programming. One of these habits is developing good tests that will ensure your code is correct, and—often overlooked—using good tools to make those tests as easy to run as possible. You want to get in the habit of writing tests early in the process of programming, and running them as often as possible to detect coding errors as soon as you make them.
By following the Function Design Recipe, you naturally create a few tests for each function in the form of doctest examples, the examples you write in the function docstring. The simplest way to test your function is to import it into the Python console, and then manually evaluate each doctest example one at a time, comparing the output with the expected output in the docstring. This is a form of manual testing, as it requires human interaction to complete. Manual testing is often tedious and error-prone, so while it may be good for a quick check, we can certainly do better.
Our first improvement is to use the Python library doctest, which can automatically extract doctest examples from docstrings and convert them into runnable tests. To use doctest, you can add the following code to the very bottom of any Python file. (Don’t worry about the if __name__ == '__main__' part for now; we will discuss this later on.)
if __name__ == '__main__':
    import doctest       # import the doctest library
    doctest.testmod()    # run the tests
Then when you run the file, all of the doctest examples are automatically run, and you receive a report about which tests failed.
One warning: in order to use doctest, your docstring examples must be correctly formatted and valid Python code. For more information about the doctest module, check out Appendix B.1 doctest.
pytest

Though doctest is an extremely useful module, the examples we write in docstrings are only simple cases meant to illustrate typical uses of the function. As functions get more complex, we’ll require more extensive tests to verify that they are correct. We could put all these tests into the function docstrings, but that would make the docstrings far too long.
So instead, we will use another Python library, pytest, to write our tests in a separate file, and so include an exhaustive set of tests without cluttering our code files. Let’s illustrate this with an example. Suppose we have defined the following function in a file trues.py. (We’ve not included the body of this function, as we do not need to know how a function is implemented in order to write tests for it!)
# In file trues.py
def has_more_trues(booleans: list) -> bool:
    """Return whether booleans contains more True values than False values.

    >>> has_more_trues([True, False, True])
    True
    >>> has_more_trues([True, False, False])
    False
    """
    # Function body omitted
Now, we’ll see how to write tests for this function in a new file, which we’ll call test_trues.py. By convention, all Python modules which contain tests are named with the prefix test_. Now let us introduce some terminology. A unit test is a block of code that checks for the correct behaviour of a function for one specific input. A test suite is a collection of tests that check the behaviour of a function or (usually small) set of functions. Every test file contains a test suite.
In Python, we express a unit test as a function whose name starts with the prefix test_. The body of the function contains an assert statement, which is a new form of Python statement used to check whether some boolean expression is True or False. Here are two examples of unit tests we could write that are direct translations of the doctest examples from above:
# In file test_trues.py
from trues import has_more_trues
def test_mixture_one_more_true() -> None:
    """Test has_more_trues on a list with a mixture of True and False,
    with one more True than False.
    """
    assert has_more_trues([True, False, True])
def test_mixture_one_more_false() -> None:
    """Test has_more_trues on a list with a mixture of True and False,
    with one more False than True.
    """
    assert not has_more_trues([True, False, False])
These unit test functions are similar to the functions we’ve defined previously, with a few differences:
The return type annotation is None, which is a special type that indicates that no value at all is returned by the function. Python’s None is a bit special, and we’ll see more of it later in the course. In the body of the test function, there is indeed no return statement; instead, there’s an assert.
So what exactly does an assert statement do? In Python, an assert statement has the form assert <expression>, and when executed it does the following:
First, it evaluates <expression>, which should produce a boolean value.
If the value is True, nothing else happens, and the program continues onto the next statement.
But if the value is False, an AssertionError is raised. This signals to pytest that the test has failed.
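A quick sketch of both cases:

```python
# A True expression: nothing visible happens, execution continues
assert 1 + 1 == 2

# A False expression raises an AssertionError
# (caught here only to demonstrate the behaviour)
try:
    assert 1 + 1 == 3
except AssertionError:
    print('AssertionError was raised')
```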
So when pytest “runs” a unit test, what’s actually going on is it calls a test function like test_mixture_one_more_true. If the function call ends without raising an AssertionError, the test passes; if the function call does raise an AssertionError, the test fails. A single unit test function can contain multiple assert statements; the test passes if all of the assert statements pass, and fails if any of the assert statements raise an error.
Finally, how do we use pytest to actually run our unit test functions? Similar to doctest, we need to first import pytest and then call a specific function from the library.
# At the bottom of test_trues.py
if __name__ == '__main__':
    import pytest
    pytest.main(['test_trues.py'])
Now if we run this file, we see that our two unit test functions are run:
There is another useful set of built-in functions that we have not yet discussed: functions that allow us to convert values between different data types. For example, given a string '10', can we convert it into the integer 10? Or given a list [1, 2, 3], can we convert it into a set {1, 2, 3}?
The answer to these questions is yes, and the way to do so in Python is quite elegant. Each data type that we have learned about so far, from int to dict, is also a function that takes an argument and attempts to convert it to a value of that data type.
Here are some examples: Some of these are more “obvious” than others. Don’t worry about the exact rules for conversions between types, as you won’t be expected to memorize them. Instead, we just want you to know that these conversions are possible using data types as functions.
>>> int('10')
10
>>> float('10')
10.0
>>> bool(1000)
True
>>> bool(0)
False
>>> list({1, 2, 3})
[1, 2, 3]
>>> set([1, 2, 3])
{1, 2, 3}
>>> set() # Giving set no arguments results in the empty set
set()
>>> dict([('a', 1), ('b', 2), ('c', 3)])
{'a': 1, 'b': 2, 'c': 3}
In particular, str is the most versatile of these data types. Every value of the data types we’ve studied so far has a string representation which corresponds directly to how you would write the value as a Python literal.
>>> str(10)
'10'
>>> str(-5.5)
'-5.5'
>>> str(True)
'True'
>>> str({1, 2, 3})
'{1, 2, 3}'
>>> str([1, 2, 3])
'[1, 2, 3]'
>>> str({'a': 1, 'b': 2})
"{'a': 1, 'b': 2}"
You often have to be careful when attempting to convert between different data types, as not all values of one type can be converted into another. Attempting to convert an “invalid” value often causes a Python exception to be raised. (These exceptions typically have type ValueError or TypeError.)
>>> int('David')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'David'
>>> list(1000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
The ability to create values of a given type by calling the data type as a function is not unique to the built-in data types in this section. We’ve actually seen two examples of doing this so far in the course!
range revisited

Earlier, we saw that we could call range to create a sequence of numbers. But if you just try calling range by itself in the Python console, you see something kind of funny:
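That console experiment might look like this (output shown as comments):

```python
# Evaluating a range doesn't display a list of numbers...
print(range(5, 10))        # range(5, 10)

# ...but converting it with list reveals the sequence
print(list(range(5, 10)))  # [5, 6, 7, 8, 9]
```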
Whereas you might have expected to see a list ([5, 6, 7, 8, 9]), in the Python console output it looks like nothing happened at all! This is because range is actually a type conversion function: Python also has a range data type that is distinct from lists (or other collection data types).
>>> five_to_nine = range(5, 10)
>>> type(five_to_nine)
<class 'range'>
>>> five_to_nine == [5, 6, 7, 8, 9]
False
datetime.date revisited

Recall an example from Section 2.4, Importing Modules:
>>> import datetime
>>> canada_day = datetime.date(1867, 7, 1) # Create a new date
>>> type(canada_day)
<class 'datetime.date'>
In this case, the data type is datetime.date, and it is called on three arguments instead of one. In this context, datetime.date is called to create a new date value given three arguments (the year, month, and day). This is a more general form of the type “conversions” above, which created a new value of a data type given a single argument. And of course, this behaviour isn’t unique to datetime.date either. As we’ll see a bit later in this course, you’ll be able to take any data type—even ones you define yourself—and create values of that type by calling the data type as a function.
We have mentioned that computers use a series of 0s and 1s to store data. These 0s and 1s represent numbers. So then, how can numbers represent textual data (i.e., a string)? The answer is functions.
Once upon a time, humans interacted with computers through punched paper tape (or simply punched tape). A hole (or the lack of a hole) at a particular location on the tape represented a 0 or a 1 (i.e., binary). Today we would call each 0 or 1 a bit. Obviously, this is much more tedious than using our modern input peripherals: keyboards, mice, touch screens, etc. Eventually, a standard for representing characters (e.g., letters, numbers) with holes was settled on. Using only 7 locations on the tape, 128 different characters could be represented (\(2^7 = 128\)).
The standard was called ASCII (pronounced ass-key) and it persists to this day. You can think of the ASCII standard as a function with domain \(\{0, 1, \dots, 127\}\), whose codomain is the set of all possible characters. This function is one-to-one, meaning no two numbers map to the same character—this would be redundant for the purpose of encoding the characters. This standard covered all English letters (lowercase and uppercase), digits, punctuation, and various others (e.g., to communicate a new line). For example, the number 65 mapped to the letter 'A' and the number 126 mapped to the punctuation mark '~'.
But what about other languages? Computer scientists extended ASCII from length-7 to length-8 sequences of bits, and hence its domain increased to size 256 (\(\{0, 1, \dots, 255\}\)). This allowed “extended ASCII” to support some other characters used in similar Latin-based languages, such as 'é' (233), 'ö' (246), '€' (128), and other useful symbols like '©' (169) and '½' (189). But what about characters used in very different languages (e.g., Greek, Mandarin, Arabic)?
The latest standard, Unicode, uses up to 32 bits, which gives us a domain of \(\{0, 1, \dots, 2^{32} - 1\}\): over 4 billion different numbers. This number is in fact larger than the number of distinct characters in use across all different languages! Because of this, there are several unused numbers in the domain, and so Unicode is not technically a function defined over all of \(\{0, 1, \dots, 2^{32} - 1\}\).
But with the pervasiveness of the Internet, these unused numbers are being used to map to emojis. Of course, this can cause some lost-in-translation issues. The palm tree emoji may appear different on your device than on a friend’s. In extreme cases, your friend’s device may not see a palm tree at all, or may see a completely different emoji. Adding a new emoji starts with submitting a proposal; once a new emoji is approved, computer scientists need to support it by updating their software. And, of course, in order to do that computer scientists need to have a firm understanding of functions!
Python has two built-in functions that implement the (partial) mapping between characters and their Unicode number. The first is ord, which takes a single-character string and returns its Unicode number as an int.
The second is chr, which computes the inverse of ord: given an integer representing a Unicode number, chr returns a string containing the corresponding character.
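Using the mappings mentioned above, these two functions might be used like this:

```python
# ord: single-character string -> Unicode number
print(ord('A'))  # 65
print(ord('~'))  # 126

# chr: Unicode number -> single-character string
print(chr(65))   # A

# chr inverts ord
print(chr(ord('z')) == 'z')  # True
```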
Unicode representations are one common source of surprise for Python programmers: string ordering comparisons (<, >) are based on Unicode numeric values! For example, the Unicode value of 'Z' is 90 and the Unicode value of 'a' is 97, and so the following holds:
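That comparison, checked in code:

```python
# Uppercase letters have smaller Unicode values than lowercase ones
print('Z' < 'a')          # True

# Strings compare character-by-character by Unicode value, so a
# capitalized word can come "before" a lowercase one
print('apple' < 'Zebra')  # False
```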
This means that sorting a collection of strings can seem alphabetical, but treats lowercase and uppercase letters differently:
>>> sorted({'David', 'Mario', 'Jacqueline'})
['David', 'Jacqueline', 'Mario']
>>> sorted({'david', 'Mario', 'Jacqueline'})
['Jacqueline', 'Mario', 'david']
As we get ready to write larger and more complex programs, we’re going to take a pause from programming to study formal mathematical logic. You might wonder what logic has to do with software development. As we’ll see over the course of this chapter, a firm understanding of logic allows us to precisely identify, define, and write boolean expressions and use them in our programs.
It might seem counter-intuitive to spend a whole chapter on logic, as bool is the simplest data type in Python. But writing boolean expressions that correctly capture definitions and conditions in a given problem domain can be tricky as these definitions and conditions grow in complexity. It will turn out to be very useful to have a formal mathematical language—logic—to express these complex boolean expressions before turning them into code.
We will start our study in this chapter with propositional logic, an elementary system of logic that is a crucial building block underlying other, more expressive systems of logic that we will need in this course.
A proposition is a statement that is either True or False. Examples of propositions are:
list.sort is correct on every input list.
We use propositional variables to represent propositions; by convention, propositional variable names are lowercase letters starting at \(p\). The concept of a propositional variable is different from other forms of variables you have seen before, and even ones that we will see later in this chapter. Here’s a rule of thumb: if you read an expression involving a propositional variable \(p\), you should be able to replace \(p\) with the statement “CSC110 is cool” and still have the expression make sense.
A propositional/logical operator is an operator whose arguments must all be either True or False. Finally, a propositional formula is an expression that is built up from propositional variables by applying these operators.
In the following sections, we describe the various operators we will use in this course. It is important to keep in mind when reading that these operators inform both the structure of formulas (what they look like) as well as the truth value of these formulas (what they mean: whether the formula is True or False based on the truth values of the individual propositional variables).
We have seen these operators earlier when discussing different types of data. The fact that Python has specific keywords dedicated to these operators should at least hint that they are frequently used. Here, we spend some time introducing the operators more formally and developing our first truth tables.
| \(p\) | \(\lnot p\) |
|---|---|
| False | True |
| True | False |
The unary operator NOT (also called “negation”) is denoted by the symbol \(\lnot\). It negates the truth value of its input. So if \(p\) is True, then \(\lnot p\) is False, and vice versa. This is shown in the truth table at the side. In Python, we use the not keyword to represent this operation.
| \(p\) | \(q\) | \(p \land q\) |
|---|---|---|
| False | False | False |
| False | True | False |
| True | False | False |
| True | True | True |
The binary operator AND (also called “conjunction”) is denoted by the symbol \(\land\). It returns True when both its arguments are True. In Python, we use the and keyword to represent this operation.
The binary operator OR (also called “disjunction”) is denoted by the symbol \(\lor\), and returns True if one or both of its arguments are True. In Python, we use the or keyword to represent this operation.
| \(p\) | \(q\) | \(p \lor q\) |
|---|---|---|
| False | False | False |
| False | True | True |
| True | False | True |
| True | True | True |
The truth tables for AND and NOT agree with the popular English usage of the terms; however, the operator OR may seem somewhat different from your intuition, because the word “or” has two different meanings to most English speakers. Consider the English statement “You can have cake or ice cream.” From a nutritionist, this might be an exclusive or: you can have cake or you can have ice cream, but not both. But from a kindly relative at a family reunion, this might be an inclusive or: you can have both cake and ice cream if you want! The study of mathematical logic is meant to eliminate this ambiguity by picking one meaning of OR and sticking with it. In our case, we will always use OR to mean the inclusive or, as illustrated in the last row of its truth table. (The symbol \(\oplus\) is often used to represent the exclusive or operator, but we will not use it in this course.) This is also the behaviour of the or operator in Python, which evaluates to True even when both of its operands are True.
AND and OR are similar in that they are both binary operators on propositional variables. However, the distinction between AND and OR is very important. Consider for example a rental agreement that reads “first and last months’ rent and a $1000 deposit” versus a rental agreement that reads “first and last months’ rent or a $1000 deposit.” The second contract is fulfilled with much less money down than the first contract.
One of the most subtle and powerful relationships between two propositions is implication, which is represented by the symbol \(\Rightarrow\). The implication \(p \Rightarrow q\) asserts that whenever \(p\) is True, \(q\) must also be True. An example of logical implication in English is the statement: “If you push that button, then the fire alarm will go off.” In some contexts, we think of logical implication as the temporal relationship that \(q\) is inevitable if \(p\) occurs. But this is not always the case! Be careful not to confuse implication with causation. Implications are so important that the parts have been given names. The statement \(p\) is called the hypothesis of the implication and the statement \(q\) is called the conclusion of the implication.
| \(p\) | \(q\) | \(p \Rightarrow q\) |
|---|---|---|
| False | False | True |
| False | True | True |
| True | False | False |
| True | True | True |
How should the truth table be defined for \(p \Rightarrow q\)? First, when both \(p\) and \(q\) are True, then \(p \Rightarrow q\) should be True, since when \(p\) occurs, \(q\) also occurs. Similarly, it is clear that when \(p\) is True and \(q\) is False, then \(p \Rightarrow q\) is False (since then \(q\) is not inevitably True when \(p\) is True). But what about the other two cases, when \(p\) is False and \(q\) is either True or False? This is another case where our intuition from the English language is a little unclear. Perhaps somewhat surprisingly, in both of these remaining cases, we will still define \(p \Rightarrow q\) to be True.
The two cases when \(p\) is False but \(p \Rightarrow q\) is True are called the vacuous truth cases. How do we justify this assignment of truth values? The key intuition is that because the statement doesn’t say anything about whether or not \(q\) should occur when \(p\) is False, it cannot be disproven when \(p\) is False. In our example above, if the alarm button is not pushed, then the statement is not saying anything about whether or not the fire alarm will go off. It is entirely consistent with this statement that if the button is not pushed, the fire alarm can still go off, or may not go off.
The formula \(p \Rightarrow q\) has two equivalent formulas which are often useful. (Here, “equivalent” means that the two formulas have the same truth values: for any setting of their propositional variables to True and False, the formulas will either both be True or both be False.) To make this concrete, we’ll use our example “If you are a Pittsburgh Pens fan, then you are not a Flyers fan” from the introduction.
The following two formulas are equivalent to \(p \Rightarrow q\):
\(\lnot p \lor q\). On our example: “You are not a Pittsburgh Pens fan, or you are not a Flyers fan.” This makes use of the vacuous truth cases of implication, in that if \(p\) is False then \(p \Rightarrow q\) is True, and if \(p\) is True then \(q\) must be True as well.
\(\lnot q \Rightarrow \lnot p\). On our example: “If you are a Flyers fan, then you are not a Pittsburgh Pens fan.” Intuitively, this says that if \(q\) doesn’t occur, then \(p\) cannot have occurred either.
This equivalent formula is in fact so common that we give it a special name: the contrapositive of the implication \(p \Rightarrow q\).
There is one more related formula that we will discuss before moving on. If we take \(p \Rightarrow q\) and switch the hypothesis and conclusion, we obtain the implication \(q \Rightarrow p\), which is called the converse of the original implication.
Unlike the two formulas in the list above, the converse of an implication is not logically equivalent to the original implication. Consider the statement “If you can solve any problem in this course, then you will get an A.” Its converse is “If you will get an A, then you can solve any problem in this course.” These two statements certainly don’t mean the same thing!
In Python, there is no operator or keyword that represents implication directly. If you do want to express an implication as a Python expression, you can use the first equivalent form from above, writing \(p \Rightarrow q\) as \(\lnot p \lor q\). Implications are less common in Python programs; however, implication has other uses in manipulating data and expressing algorithms that we’ll explore later in this course.
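As an illustration, here is a small helper function (our own, not built into Python) that expresses implication this way:

```python
def implies(p: bool, q: bool) -> bool:
    """Return the truth value of the implication p => q.

    Uses the equivalent formula (not p) or q.
    """
    return not p or q


# The implication truth table, including the vacuous truth cases
print(implies(False, False))  # True
print(implies(False, True))   # True
print(implies(True, False))   # False
print(implies(True, True))    # True
```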
The final logical operator that we will consider is the biconditional, denoted by \(p \Leftrightarrow q\). This operator returns True when the implication \(p \Rightarrow q\) and its converse \(q \Rightarrow p\) are both True.
| \(p\) | \(q\) | \(p \Leftrightarrow q\) |
|---|---|---|
| False | False | True |
| False | True | False |
| True | False | False |
| True | True | True |
In other words, \(p \Leftrightarrow q\) is an abbreviation for \((p \Rightarrow q) \land (q \Rightarrow p)\). A nice way of thinking about the biconditional is that it asserts that its two arguments have the same truth value.
While we could use the natural translation of \(\Rightarrow\) and \(\land\) into English to also translate \(\Leftrightarrow\), the result is a little clunky: \(p \Leftrightarrow q\) becomes “if \(p\) then \(q\), and if \(q\) then \(p\).” Instead, we often shorten this using a quite nice turn of phrase: “\(p\) if and only if \(q\),” which is abbreviated to “\(p\) iff \(q\).”
In Python, we don’t need a separate operator to represent \(\Leftrightarrow\), since we can simply use == to determine whether two boolean values are the same!
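A quick check that == reproduces the biconditional’s truth table:

```python
# p <=> q is True exactly on the rows where p and q agree
for p in [False, True]:
    for q in [False, True]:
        print(p, q, p == q)
```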
We have now seen all five propositional operators that we will use in this course. Now is an excellent time to review these and make sure you understand the notation, meaning, and English words used to indicate each one.
| operator | notation | English | Python operation |
|---|---|---|---|
| NOT | \(\lnot p\) | \(p\) is not true | not p |
| AND | \(p \land q\) | \(p\) and \(q\) | p and q |
| OR | \(p \lor q\) | \(p\) or \(q\) (or both!) | p or q |
| implication | \(p \Rightarrow q\) | if \(p\), then \(q\) | not p or q |
| biconditional | \(p \Leftrightarrow q\) | \(p\) if and only if \(q\) | p == q |
While propositional logic is a good starting point, most interesting statements in mathematics contain variables over domains larger than simply \(\{\TRUE, \FALSE\}\). For example, the statement “\(x\) is a power of 2” is not a proposition because its truth value depends on the value of \(x\). It is only after we substitute a value for \(x\) that we may determine whether the resulting statement is True or False. For example, if \(x = 8\), then the statement becomes “8 is a power of 2”, which is True. But if \(x = 7\), then the statement becomes “7 is a power of 2”, which is False.
A statement whose truth value depends on one or more variables from any set is a predicate: a function whose codomain is \(\{\TRUE, \FALSE\}\). We typically use uppercase letters starting from \(P\) to represent predicates, differentiating them from propositional variables. For example, if \(P(x)\) is defined to be the statement “\(x\) is a power of \(2\)”, then \(P(8)\) is True and \(P(7)\) is False. Thus a predicate is like a proposition except that it contains one or more variables; when we substitute particular values for the variables, we obtain a proposition.
As with all functions, predicates can depend on more than one variable. For example, if we define the predicate \(Q(x,y)\) to mean “\(x^2 = y\),” then \(Q(5,25)\) is True since \(5^2 = 25\), but \(Q(5,24)\) is False. Just as common arithmetic operators like \(+\) are really binary functions, the common comparison operators like \(=\) and \(<\) are binary predicates, taking two numbers and returning True or False.
We usually define a predicate by giving the statement that involves the variables, e.g., “\(P(x)\) is the statement ‘\(x\) is a power of 2.’” However, there is another component which is crucial to the definition of a predicate: the domain that each of the predicate’s variable(s) belong to. You must always give the domain of a predicate as part of its definition. So we would complete the definition of \(P(x)\) as follows:
\[P(x): \text{``$x$ is a power of 2,'' where $x \in \N$.}\]
Unlike propositional formulas, a predicate by itself does not have a truth value: as we discussed earlier, “\(x\) is a power of 2” is neither True nor False, since we don’t know the value of \(x\). We have seen one way to obtain a truth value: substituting a concrete element of the predicate’s domain for its input, e.g., setting \(x = 8\) in the statement “\(x\) is a power of 2,” which makes it True.
However, we often don’t care about whether a specific value satisfies a predicate, but rather some aggregation of the predicate’s truth values over all elements of its domain. For example, the statement “every real number \(x\) satisfies the inequality \(x^2 - 2x + 1 \geq 0\)” doesn’t make a claim about a specific real number like 5 or \(\pi\), but rather all possible values of \(x\)!
There are two types of “truth value aggregation” we want to express; each type is represented by a quantifier that modifies a predicate by specifying how a certain variable should be interpreted.
The existential quantifier is written as \(\exists\), and represents the concept of “there exists an element in the domain that satisfies the given predicate.”
For example, the statement \(\exists x \in \N,~ x \geq 0\) can be translated as “there exists a natural number \(x\) that is greater than or equal to zero.” This statement is True since (for example) when \(x=1\), we know that \(x \geq 0\).
Note that there are many more natural numbers that are greater than or equal to \(0\). The existential quantifier says only that there has to be at least one element of the domain satisfying the predicate, but it doesn’t say exactly how many elements do so.
One should think of \(\exists x \in S\) as an abbreviation for a big OR that runs through all possible values for \(x\) from the domain \(S\). For the previous example, we can expand it by substituting all possible natural numbers for \(x\) (in this case, the OR expression is technically infinite, since there are infinitely many natural numbers): \[(0 \geq 0) \lor (1 \geq 0) \lor (2 \geq 0) \lor (3 \geq 0) \lor \cdots\]
The universal quantifier is written as \(\forall\), and represents the concept that “every element in the domain satisfies the given predicate.”
For example, the statement \(\forall x \in \N,~ x \geq 0\) can be translated as “every natural number \(x\) is greater than or equal to zero.” This statement is True since the smallest natural number is zero itself. However, the statement \(\forall x \in \N,~ x \geq 10\) is False, since not every natural number is greater than or equal to 10.
One should think of \(\forall x \in S\) as an abbreviation for a big AND that runs through all possible values of \(x\) from \(S\). Thus, \(\forall x \in \N,~ x \geq 0\) is the same as \[(0 \geq 0) \land (1 \geq 0) \land (2 \geq 0) \land (3 \geq 0) \land \cdots\]
Let us look at a simple example of these quantifiers. Suppose we define \(Loves(a,b)\) to be a binary predicate that is \(\TRUE\) whenever person \(a\) loves person \(b\).
For example, the diagram below defines the relation “Loves” for two collections of people: \(A\) = {Ella, Patrick, Malena, Breanna}, and \(B\) = {Laura, Stanley, Thelonious, Sophia}. A line between two people indicates that the person on the left loves the person on the right.

Consider the following statements.
any and all

In Python, the built-in function any allows us to represent logical statements using the existential quantifier. The function any takes a collection of boolean values and returns True when there exists a True value in the collection:
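As a minimal illustration (the example values here are our own):

```python
# any returns True when at least one element of the collection is True
print(any([False, True, False]))   # True
print(any([False, False, False]))  # False
print(any([]))                     # False: an empty collection has no True value
```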
This might not seem useful by itself, but remember that we can use comprehensions to transform one collection of data into another. For example, suppose we are given a set of strings \(S\) and wish to determine whether any of them start with the letter 'D'. In predicate logic, we could write this as the statement \(\exists s \in S,~ s[0] = \text{‘D'}\). And in Python, we could do the following:
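Here is a sketch of how this might look (the collection strings and its values are hypothetical stand-ins for \(S\)):

```python
strings = {'David', 'Mario', 'Diane'}  # a hypothetical collection standing in for S

# "there exists s in S such that s[0] = 'D'" becomes: transform each string
# into a boolean, then ask whether any of those booleans is True
result = any({s[0] == 'D' for s in strings})
print(result)  # True, since 'David' (and 'Diane') starts with 'D'
```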
This example serves to highlight several elegant parallels between our mathematical statement and the equivalent Python expression:

- The existential quantifier \(\exists\) corresponds to the any function.
- The quantification \(s \in S\) corresponds to the comprehension part for s in strings. The naming conventions are a bit different, however: in mathematics, we tend to represent collections using capital letters, whereas in Python all variables are lower-case words.
- The predicate \(s[0] = \text{‘D'}\) corresponds to the expression s[0] == 'D'.

Similar to any, Python includes another built-in function all that can be used as a universal quantifier. The all function is given a collection of values and evaluates to True when every element has the value True. For example, if we wanted to express \(\forall s \in S,~ s[0] = \text{‘D'}\) in Python, we could write:
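A minimal sketch of the all version (again using a hypothetical collection named strings):

```python
strings = {'David', 'Diane', 'Dan'}  # hypothetical example values

# "for all s in S, s[0] = 'D'" becomes an all over the transformed collection
result = all({s[0] == 'D' for s in strings})
print(result)  # True: every string here starts with 'D'

print(all({s[0] == 'D' for s in {'David', 'Mario'}}))  # False
```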
Of course, Python is more limited than mathematics: there are limits on the size of collections, and so we cannot easily express statements quantified over infinite domains like \(\N\) or \(\R\). We’ll discuss this in more detail in a later section.
Now that we have introduced the existential and universal quantifiers, we have a complete set of tools needed to represent all statements we’ll see in this course. A general formula in predicate logic is built up using the existential and universal quantifiers, the propositional operators \(\lnot\), \(\land\), \(\lor\), \(\Rightarrow\), and \(\Leftrightarrow\), and arbitrary predicates. To ensure that the formula has a fixed truth value, we will require every variable in the formula to be quantified. (Other texts often refer to quantified variables as bound variables, and to unquantified variables as free variables.) We call a formula with no unquantified variables a sentence. So for example, the formula \[\forall x \in \N,~ x^2 > y\] is not a sentence: even though \(x\) is quantified, \(y\) is not, and so we cannot determine the truth value of this formula. If we quantify \(y\) as well, we get a sentence: \[\forall x, y \in \N,~ x^2 > y.\]
However, don’t confuse a formula being a sentence with a formula being True! As we’ll see repeatedly throughout the course, it is quite possible to express both True and False sentences, and part of our job will be to determine whether a given sentence is True or False, and to prove it.
Here is a common question from students who are first learning symbolic logic: “does the comma mean ‘and’ or ‘then’?” As we discussed at the start of the course, we study predicate logic to provide us with an unambiguous way of representing ideas. The English language is filled with ambiguities that can make it hard to express even relatively simple ideas, much less the complex definitions and concepts used in many fields of computer science. We have seen one example of this ambiguity in the English word “or,” which can be inclusive or exclusive, and often requires additional words of clarification to make precise. In everyday communication, these ambiguous aspects of the English language contribute to its richness of expression. But in a technical context, ambiguity is undesirable: it is much more useful to limit the possible meanings to make them unambiguous and precise.
There is another, more insidious example of ambiguity with which you are probably more familiar: the comma, a tiny, easily-glazed-over symbol that people often infuse with different meanings. Consider the following statements:
Our intuitions tell us very different things about what the commas mean in each case. In the first, the comma means then, separating the hypothesis and conclusion of an implication. But in the second, the comma is used to mean and, implicitly joining two separate sentences. (Grammar-savvy folks will recognize this as a comma splice, which is often frowned upon but informs our reading nonetheless.) The fact that we are all fluent in English means that our prior intuition hides the ambiguity in this symbol, but it is quite obvious when we put this into the more unfamiliar context of predicate logic, as in the formula: \[P(x), Q(x)\]
This, of course, is where the confusion lies, and is the origin of the question posed at the beginning of this section. Because of this ambiguity, never use the comma to connect propositions. We already have a rich enough set of symbols—including \(\land\) and \(\Rightarrow\)—that we do not need another one that is ambiguous and adds nothing new!
That said, keep in mind that commas do have two valid uses in predicate formulas:
You can see both of these usages illustrated below, but please do remember that these are the only valid places for the comma within symbolic notation! \[\forall x, y \in \N,~ \forall z \in \R,~ P(x, y) \Rightarrow Q(x, y, z)\]
We have already seen some equivalences among logical formulas, such as the equivalence of \(p \Rightarrow q\) and \(\lnot p \lor q\). While there are many such equivalences, the only other major type that is important for this course are the ones used to simplify negated formulas. Taking the negation of a statement is extremely common, because often when we are trying to decide if a statement is True, it is useful to know exactly what its negation means and decide whether the negation is more plausible than the original.
Given any formula, we can state its negation simply by preceding it by a \(\lnot\) symbol: \[\lnot \big( \forall x \in \N,~ \exists y \in \N,~ x \geq 5 \lor x^2 - y \geq 30 \big).\] However, such a statement is rather hard to understand if you try to transliterate each part separately: “Not for every natural number \(x\), there exists a natural number \(y\), such that \(x\) is greater than or equal to \(5\) or \(x^2 - y\) is greater than or equal to 30.”
Instead, given a formula using negations, we apply some simplification rules to “push” the negation symbol to the right, closer to the individual predicates. Each simplification rule shows how to “move the negation inside” by one step, giving a pair of equivalent formulas: one where the negation is applied to one of the logical operators or quantifiers, and one where the negation is applied to the inner subexpressions.
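For reference, these equivalences can be written out in the notation we have been using (each pair lists the negated form on the left and the pushed-inside form on the right):

\[
\begin{aligned}
\lnot (\lnot p) &\quad\text{is equivalent to}\quad p \\
\lnot (p \land q) &\quad\text{is equivalent to}\quad \lnot p \lor \lnot q \\
\lnot (p \lor q) &\quad\text{is equivalent to}\quad \lnot p \land \lnot q \\
\lnot (p \Rightarrow q) &\quad\text{is equivalent to}\quad p \land \lnot q \\
\lnot (p \Leftrightarrow q) &\quad\text{is equivalent to}\quad (p \land \lnot q) \lor (\lnot p \land q) \\
\lnot \big(\exists x \in S,~ P(x)\big) &\quad\text{is equivalent to}\quad \forall x \in S,~ \lnot P(x) \\
\lnot \big(\forall x \in S,~ P(x)\big) &\quad\text{is equivalent to}\quad \exists x \in S,~ \lnot P(x)
\end{aligned}
\]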
It is usually easy to remember the simplification rules for \(\land\), \(\lor\), \(\forall\), and \(\exists\), since you simply “flip” them when moving the negation inside. The intuition for the negation of \(p \Rightarrow q\) is that there is only one case where this is False: when \(p\) has occurred but \(q\) does not. The intuition for the negation of \(p \Leftrightarrow q\) is to remember that \(\Leftrightarrow\) can be replaced with “have the same truth value,” so the negation is “have different truth values.”
What about the quantifiers? Consider a statement of the form \(\lnot (\exists x \in S,~ P(x))\), which says “there does not exist an element \(x\) of \(S\) that satisfies \(P\).” The only way this could be true is for every element of \(S\) to not satisfy \(P\): “every element \(x\) of \(S\) does not satisfy \(P\).” A similar line of reasoning applies to \(\lnot (\forall x \in S,~ P(x))\).
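Applying these steps to the negated formula from earlier in this section, the negation can be pushed all the way inward:

\[
\lnot \big( \forall x \in \N,~ \exists y \in \N,~ x \geq 5 \lor x^2 - y \geq 30 \big)
\quad\text{is equivalent to}\quad
\exists x \in \N,~ \forall y \in \N,~ x < 5 \land x^2 - y < 30.
\]

The simplified version reads much more naturally: “there is a natural number \(x\) such that for every natural number \(y\), \(x\) is less than 5 and \(x^2 - y\) is less than 30.”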
Now we’re going to take a look at one of the most common steps in expressing statements in predicate logic and in processing large collections of data. At first glance these two might not appear all that related, but after going through this section you should be able to appreciate the elegant connection between predicate logic and data processing.
We saw in the last section that the universal quantifier \(\forall\) is used to express a statement of the form “every element of set \(S\) satisfies ____”. This works well when we use a predefined set for \(S\) (like the numeric sets \(\N\) or \(\R\)), but does not work well when we want to narrow the scope of our statement to a smaller set.
For example, consider the following statement: “Every natural number \(n\) greater than 3 satisfies the inequality \(n^2 + n \geq 20\).” The phrase “greater than 3” is a condition that modifies the statement, limiting the original domain of \(n\) (the natural numbers) to a smaller subset (the natural numbers greater than 3).
There are two ways we can represent such conditions in predicate logic. The first is to define a new set; for example, we could define a set \(S_1 = \{n \mid n \in \N \text{ and } n > 3\}\), and then simply write \(\forall n \in S_1,~ n^2 + n \geq 20\).
The second approach is to use an implication to express the condition. To see how this works, first we can rewrite the original statement using an “if … then …” structure as follows: “For every natural number \(n\), if \(n\) is greater than 3 then \(n\) satisfies the inequality \(n^2 + n \geq 20\).” We can translate this into predicate logic as \(\forall n \in \N,~ n > 3 \Rightarrow n^2 + n \geq 20\).
This works because the hypothesis \(n > 3\) has a filtering effect, due to the vacuous truth case of implication. For the values \(n \in \{0, 1, 2, 3\}\), the hypothesis of the implication, \(n > 3\), is False, and so for these values the implication itself is True. And since the overall statement is universally quantified, these vacuous truth cases don’t affect the truth value of the statement.
The “forall-implies” structure is one of the most common forms of statements we’ll encounter in this course. Such statements arise naturally any time a statement is universally quantified, but there are conditions that limit the domain to which the statement applies.
Now let’s turn our attention back to Python. Last chapter, we learned about several aggregation functions (like sum, max), and we’ve just learned about two more, any and all. Sometimes, however, we want to limit the scope of one of these functions to certain values in the input collection. For example, “find the sum of only the even numbers in a collection of numbers”, or “find the length of the longest string in a collection that starts with a 'D'”. For these problems, we can quickly identify which aggregation function is necessary, but the problem is in choosing the right argument to pass in.
This is where filtering appears. In programming, a filter operation is an operation that takes a collection of data and returns a new collection consisting of the elements in the original collection that satisfy some predicate (which can vary from one filter operation to the next).
There are different ways of accomplishing a filter operation in Python. The simplest one builds on what we’ve learned so far by adding a syntactic variation to comprehensions. We’ll use as our example a set comprehension here, but what we’ll discuss applies to list and dictionary comprehensions as well.
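The general form of a filtering set comprehension, in the placeholder notation used in this chapter, is:

```
{<expression> for <variable> in <collection> if <condition>}
```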
The new part, if <condition>, is a boolean expression involving the <variable>. This form of set comprehension behaves the same way as the ones we studied last chapter, except that <expression> only gets evaluated for the values of the variable that make the condition evaluate to True. Here are some examples to illustrate this:
>>> numbers = {1, 2, 3, 4, 5} # Initial collection
>>> {n for n in numbers if n > 3} # Pure filtering: only keep elements > 3
{4, 5}
>>> {n * n for n in numbers if n > 3} # Filtering with a data transformation
{16, 25}

By combining these filtering comprehensions with aggregation functions, we can now achieve our goal of limiting the scope of an aggregation.
>>> numbers = {1, 2, 3, 4, 5}
>>> sum({n for n in numbers if n % 2 == 0}) # Sum of only the even numbers
6

The keyword if used in this syntax for filtering comprehensions is directly connected to our use of implication above. Just as we used the hypothesis \(n > 3\) to limit the scope of the universal quantifier to a subset of the natural numbers, here we use if n % 2 == 0 to limit the scope of the sum to just a subset of numbers.
Our final example in this section should make this connection even more explicit. Here’s how we could translate the statement \(\forall n \in S,~ n > 3 \Rightarrow n^2 + n \geq 20\) into a Python expression:
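A sketch of this translation, using a small hypothetical finite set in place of \(S\):

```python
S = {1, 2, 3, 4, 5, 10}  # a hypothetical finite set standing in for S

# "for all n in S, n > 3 implies n^2 + n >= 20" becomes an all over
# a filtered comprehension: the if clause plays the role of the hypothesis
result = all({n ** 2 + n >= 20 for n in S if n > 3})
print(result)  # True: 4, 5, and 10 all satisfy the inequality
```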
So far, all of the function bodies we’ve written have consisted of a sequence of statements that always execute one after the other. This kind of code block is sometimes called a “straight line program”, since the statements form a linear path from one to the next. But sometimes we want to execute a statement or block of statements only some of the time, based on some condition.
This is similar to the implication operator we saw when discussing propositional logic. The implication \(p \Rightarrow q\) states that whenever \(p\) is True, \(q\) must also be True. In Python, what we would like to express is something of the form “Whenever \(p\) is True, the block of code block1 must be executed.” To do so, we’ll introduce a new type of Python statement that plays a role analogous to \(\Rightarrow\) in propositional logic.
Python uses the if statement to express conditional execution of code. An if statement is a compound statement, meaning it contains other statements within it. (Analogously, an expression like 3 + 4 is a compound expression, since it consists of smaller expressions, 3 and 4.) Here is our first syntax for an if statement:
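In the placeholder notation used elsewhere in this chapter, the basic form is:

```
if <condition>:
    <statement>
    ...
else:
    <statement>
    ...
```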
The if statement uses two keywords, if and else. Careful: we saw the if keyword used earlier to express conditions in comprehensions. The use of if here is logically similar, but quite different in how Python interprets it. The <condition> following if must be an expression that evaluates to a boolean, called the if condition. This expression plays a role analogous to the hypothesis of an implication.
The statements on the lines after the if and else are indented to indicate that they are part of the if statement, similar to how a function docstring and body are indented relative to the function header. We call the statements under the if the if branch and the statements under the else the else branch.
When an if statement is executed, the following happens:

1. The if condition is evaluated, producing a boolean value.
2. If the condition evaluates to True, then the statements in the if branch are executed. If the condition evaluates to False, then the statements in the else branch are executed instead.

Let us consider an example. Suppose Toronto Pearson Airport (YYZ) has hired us to develop some software. The first feature they want is to show their clients whether a flight is on time or delayed. The airport will provide us with both the time a flight is scheduled to depart and an estimated departure time based on the plane’s current GPS location. Our task is to report a status to display, as a string. Here is the function header and docstring:
def get_status(scheduled: int, estimated: int) -> str:
    """Return the flight status for the given scheduled and estimated departure times.

    The times are given as integers between 0 and 23 inclusive, representing
    the hour of the day.

    The status is either 'On time' or 'Delayed'.

    >>> get_status(10, 10)
    'On time'
    >>> get_status(10, 12)
    'Delayed'
    """

Now, if we only needed to calculate a bool for whether the flight is delayed, this function would be very straightforward: simply return estimated <= scheduled, i.e., whether the estimated departure time is before or at the scheduled departure time. Boolean expressions like this are often useful first steps in implementing functions to determine different “cases” of inputs, but they aren’t the only step.
Instead, we use if statements to execute different code based on these cases. Here’s our implementation of get_status:
def get_status(scheduled: int, estimated: int) -> str:
    """..."""
    if estimated <= scheduled:
        return 'On time'
    else:
        return 'Delayed'

Our if statement uses the boolean expression we identified earlier (estimated <= scheduled) to trigger different return statements to return the correct string.
One useful tool for understanding if statements is drawing control flow diagrams to visualize the order in which statements execute. For example, here is a simple diagram for our get_status function above:

An if statement introduces a “fork in the path” of a function’s control flow, and this is why we use the term branch for each of the if and else blocks of code.
Now suppose Toronto Pearson Airport has changed the requirements of our feature. They’ve noticed that whenever a flight is delayed by more than four hours, the airline cancels the flight. They would like our get_status function to accommodate this change, so that the set of possible outputs is now {'On time', 'Delayed', 'Cancelled'}.
def get_status_v2(scheduled: int, estimated: int) -> str:
    """Return the flight status for the given scheduled and estimated departure times.

    The times are given as integers between 0 and 23 inclusive, representing
    the hour of the day.

    The status is 'On time', 'Delayed', or 'Cancelled'.

    >>> get_status_v2(10, 10)
    'On time'
    >>> get_status_v2(10, 12)
    'Delayed'
    >>> get_status_v2(10, 15)
    'Cancelled'
    """

Let’s consider what’s changed between this version and our previous one. If the estimated time is before the scheduled time, nothing’s changed, and 'On time' should still be returned. But when the estimated time is after the scheduled time, we now need to distinguish between two separate subcases, based on the difference in time. We can express these subcases using nested if statements, i.e., one if statement contained in a branch of another:
def get_status_v2(scheduled: int, estimated: int) -> str:
    """..."""
    if estimated <= scheduled:
        return 'On time'
    else:
        if estimated - scheduled <= 4:
            return 'Delayed'
        else:
            return 'Cancelled'

This function body is correct, but just like with expressions, excessive nesting of statements can make code difficult to read and understand. So instead of using a nested if statement, we’ll introduce a new form of if statement that makes use of the elif keyword, which is short for “else if”.
if <condition1>:
    <statement>
    ...
elif <condition2>:
    <statement>
    ...
...  # [any number of elif conditions and branches]
else:
    <statement>
    ...

When this form of if statement is executed, the following happens:
1. The first if condition (<condition1>) is evaluated, producing a boolean value.
2. If the condition evaluates to True, then the statements in the if branch are executed. If the condition evaluates to False, then the next elif condition is evaluated, producing a boolean.
3. If that condition evaluates to True, then the statements in that elif branch are executed. If that condition evaluates to False, then the next elif condition is evaluated. This step repeats until either one of the elif conditions evaluates to True, or all of the elif conditions have evaluated to False.
4. If all of the conditions evaluated to False, then the else branch executes.

Here is how we can use elif to rewrite get_status without nested if statements.
def get_status_v3(scheduled: int, estimated: int) -> str:
    """Return the flight status for the given scheduled and estimated departure times.

    The times are given as integers between 0 and 23 inclusive, representing
    the hour of the day.

    The status is 'On time', 'Delayed', or 'Cancelled'.

    >>> get_status_v3(10, 10)
    'On time'
    >>> get_status_v3(10, 12)
    'Delayed'
    >>> get_status_v3(10, 15)
    'Cancelled'
    """
    if estimated <= scheduled:
        return 'On time'
    elif estimated - scheduled <= 4:
        return 'Delayed'
    else:
        return 'Cancelled'

This code is logically equivalent to the previous version, but it’s easier to read because there’s no more nesting! Now it is clear exactly what the three possible branches of execution for this function are.
Adding branching to our control flow makes our functions more complex, and so we need to pay attention to how we test our code. With functions that contain if statements, any one particular input we give can only test one possible execution path, so we need to design our unit tests so that each possible execution path is used at least once. This form of test design is called white box testing, because we “see through the box” and can therefore design tests based on the source code itself. In contrast, black box tests are created without any knowledge of the source code (and so without knowledge of the different paths the code can take).
In our doctests for get_status_v3, we chose three different examples, each corresponding to a different possible case of the if statement. This was pretty straightforward because the code is relatively simple, but we’ll study later examples of more complex control flow where it won’t be so simple to design test cases to cover each branch.

In fact, the percentage of lines of program code that are executed when a set of tests for that program is run is called code coverage, and it is a metric used to assess the quality of tests. While a set of tests may strive for 100% code coverage, this does not always occur as our programs grow in complexity. The concept of code coverage and other metrics used to evaluate tests is something we’ll only touch on in this course, but in future courses you’ll learn about this in more detail and even use some automated tools for calculating these metrics. In particular, even though code coverage is a commonly used metric, it is also criticized for giving a false sense of the quality of a test suite. Just because all lines of code are executed at least once does not mean that the tests chosen cover all possible cases to consider for a program. We’ll see a simple example of this in the following section.
Toronto Pearson Airport is beginning to trust us with more data, and is requesting more complex features as a result. They now want us to write a function that determines how many flights are cancelled in a day. The airport will provide us with the data as a dictionary (i.e., dict), where the keys are unique flight numbers and the value for each flight number is a two-element list: the first element is the scheduled time and the second element is the estimated time. More succinctly, the data is a mapping of the form { flight_number: [scheduled, estimated] }.
Unlike earlier, when our function input was only two integers, we are now working with a collection of data. Before we start trying to solve the problem, let’s create some example data in the Python console. Specifically, we’ll create a dictionary with values for three different Air Canada flight numbers.
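A sketch of what this might look like (the flight numbers and times below are illustrative, chosen to match the doctest that appears later in this section):

```python
# Hypothetical example data: three Air Canada flight numbers mapped to
# [scheduled, estimated] departure hours
flights = {'AC110': [10, 12], 'AC321': [12, 19], 'AC999': [1, 1]}
print(flights['AC110'])  # [10, 12]
```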
We know that we can query the dictionary by providing an existing key. The value associated with a key is a list of integers, and we can index the list to retrieve those integers. Index 0 of the list refers to the flight number’s scheduled time, while index 1 refers to the estimated time. Let us call our get_status_v3 function for flight 'AC110':
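Here is a sketch of that call as plain statements (get_status_v3 is condensed from the version defined earlier, and flights is the same hypothetical example data):

```python
def get_status_v3(scheduled: int, estimated: int) -> str:
    """Return the flight status (condensed from the full version above)."""
    if estimated <= scheduled:
        return 'On time'
    elif estimated - scheduled <= 4:
        return 'Delayed'
    else:
        return 'Cancelled'


flights = {'AC110': [10, 12], 'AC321': [12, 19], 'AC999': [1, 1]}

# Index 0 is the scheduled time, index 1 is the estimated time
status = get_status_v3(flights['AC110'][0], flights['AC110'][1])
print(status)  # 'Delayed': the estimate (12) is 2 hours after the scheduled time (10)
```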
We’re making great progress! Instead of specifying the flight number ourselves (i.e., 'AC110'), we would instead like to substitute in different flight numbers based on the data we receive from the airport. We can do that using comprehensions. Let’s explore and see what we can get:
>>> {k for k in flights}
{'AC999', 'AC110', 'AC321'}
>>> {get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled' for k in flights}
{False, True}
>>> [get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled' for k in flights]
[False, True, False]

Our first set comprehension can get us the set of flight numbers, but that doesn’t tell us whether a given flight was cancelled or not. When we created our second set comprehension we could see that there was at least one flight cancelled: remember that sets only contain unique elements, and this set consists of all possible boolean values. When we create a list comprehension, we can see that exactly one out of three flights was cancelled (there is one True value). But remember that the airport only wants to know how many flights were cancelled: a single integer value. Currently, we have a list of boolean values.
Let us now try to combine the first set comprehension with the second, using the filtering we learned in the last section.
>>> {k for k in flights if get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled'}
{'AC321'}
>>> [k for k in flights if get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled']
['AC321']

Excellent! We now have a set of flight numbers that were cancelled. To convert this into an integer, we can use the built-in len function on the set. (Something to think about: does it matter whether we use the list or set comprehension here?) Let’s see what all this looks like in a function:
def count_cancelled(flights: dict) -> int:
    """Return the number of cancelled flights for the given flight data.

    flights is a dictionary where each key is a flight ID,
    and whose corresponding value is a list of two numbers, where the first is
    the scheduled departure time and the second is the estimated departure time.

    >>> count_cancelled({'AC110': [10, 12], 'AC321': [12, 19], 'AC999': [1, 1]})
    1
    """
    cancelled_flights = {k for k in flights
                         if get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled'}
    return len(cancelled_flights)

Let’s review what we learned in this example:
- We can use existing functions (like get_status_v3) to help implement other functions.

In the last section we introduced if statements, a powerful Python structure that allowed us to perform conditional execution of blocks of code. But as we’ll see again and again in this course, expressive power comes with a cost: as our toolkit gets larger and the programming language features we use get more advanced, our programs also get larger and more complex, and harder to read and reason about.
So every time we introduce a new part of the Python programming language, we’ll also take some time to discuss not just what it can do, but also how to use it in structured ways that minimize the complexity we create by using it, and how to reason about its behaviour formally using tools from mathematical logic.
As our first example, consider the following function:
def is_even(n: int) -> bool:
    """Return whether n is even (divisible by 2)."""
    if n % 2 == 0:
        return True
    else:
        return False

When we first learn about if statements, it is tempting to use them whenever we think of different “cases” of inputs, like even vs. odd numbers in this example. But remember that if statements are fundamentally about taking boolean values and conditionally executing code (usually to generate other values). In cases where all we need is a boolean value, it is often simpler to write an expression that calculates the value directly, rather than using if statements.
In our example, the if statement is redundant and can be simplified just by returning the value of the condition:
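Applying this idea to is_even, the whole if statement collapses to returning the condition itself:

```python
def is_even(n: int) -> bool:
    """Return whether n is even (divisible by 2)."""
    # The comparison already produces the boolean we want to return.
    return n % 2 == 0
```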
Indeed, our earlier study of propositional logic should make us comfortable with the idea of treating booleans just like any other kind of value, and we should make full use of Python’s logical operators and, or, and not to combine them.
Consider this more complex example with nested if statements:
def mystery(x: list, y: list) -> bool:
    if x == []:
        if y == []:
            return True
        else:
            return False
    else:
        if y == []:
            return False
        else:
            return True

Here is a control flow diagram for this function, showing the four different possible execution paths.

To simplify this, we start with the first inner if statement on lines 3-6. This follows the same structure as our first example, and can be simplified to just return y == [].
The second inner if statement on lines 8-11 follows a similar structure, except that now the boolean that’s returned is the negation of the if condition. So we can simplify this as return not y == [], which we can simplify further using the != operator: return y != [].
So now we have this simplification of the function body:
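With both inner if statements replaced by their boolean expressions, the function now reads:

```python
def mystery(x: list, y: list) -> bool:
    if x == []:
        return y == []
    else:
        return y != []
```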
But now how do we simplify this further? The idea here is to focus on the possible ways that mystery could return True. The if statement divides the inputs into two cases: when x == [] and the if branch executes, and when x != [] and the else branch executes. In the first case, when x == [], mystery returns the value of y == []. So one case for mystery returning True is when x == [] and y == []. Similarly, in the second case, when x != [], mystery returns y != [], and so the other case for mystery returning True is x != [] and y != [].
How should we combine these two cases? Because these are different cases, either one of them could occur, but we don’t expect both of them to occur (since x == [] and x != [] can’t both be true), and so we combine them using or:
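Combining the two cases with or gives a version with no if statement at all:

```python
def mystery(x: list, y: list) -> bool:
    # True exactly when both lists are empty, or both are non-empty.
    return (x == [] and y == []) or (x != [] and y != [])
```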
This simplification took a bit of work, but as a result we have a clearer picture of what this function does. We can illustrate this further by breaking up the nested expression using local variables with meaningful names.
def mystery(x: list, y: list) -> bool:
    both_empty = x == [] and y == []
    both_non_empty = x != [] and y != []
    return both_empty or both_non_empty

To check your understanding, try writing a docstring description for this function. You’ll probably find it at least a little easier to do for this version than the original. And while this is still a relatively small example, the same principle will often apply in the future, so be on the lookout for if statements that you can simplify in this way. That said, this simplification won’t always apply or be appropriate, depending on the complexity of the branches of the statement. We’ll discuss this in more detail later.
if statements create branches in our code, allowing us to create more advanced functions. But more branches means more complexity because there are many possible paths that our function could take when called. To mitigate the complexity that comes with branching, we recommend two principles when working with if statements:
- Don’t use an if statement when a single boolean expression will do, as we saw in the examples above.
- Use elifs rather than nested if statements. Overuse of nesting makes your code harder to understand, and can make the visual structure of your code more complex than necessary.

if __name__ == '__main__'

One small application of if statements that we’ve already taken for granted in this course is writing certain “boilerplate” code for running certain libraries on our file. For example, we saw in 2.6 Testing Functions I: doctest and pytest that we add the following code to our Python file to run the doctests in that file:
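That boilerplate looks like the following. Here it is shown at the bottom of a small module sketch (the double function is a hypothetical example, not from the course):

```python
def double(x: int) -> int:
    """Return x multiplied by 2.

    >>> double(5)
    10
    """
    return x * 2


if __name__ == '__main__':
    import doctest
    doctest.testmod()  # Run the doctests in this module.
```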
Now that we’ve learned about if statements, we are ready to understand that first line, if __name__ == '__main__'.
In 2.4 Importing Modules, we learned that an import statement is an instruction to the Python interpreter to find a Python module with a specified name and run it. This allows the program that executes the import statement to access the functions and data types defined within that module.
One consequence of this behaviour, though, is that by default all statements in the imported module are executed, not just function and data type definitions.
For example, suppose we had the following file useful.py:
# useful.py
def useful_function1(x: int) -> int:
"""...
>>> useful_function1(1)
110
"""
# Body omitted
def useful_function2(s: str) -> str:
"""...
>>> useful_function1('Hello')
'Hello David'
"""
# Body omitted
import doctest
doctest.testmod()Note that here, the code to run doctest is not indented inside an if statement. It turns out that we can still run this file in the Python console, and the doctests will be run. However, these statements will also be executed every time useful.py is imported by another Python program. In other words, any time another program writes import useful, the doctests inside useful.py will be run, even though the doctests are not relevant for a program that just wants to use useful.py!
__name__

To fix this problem, the Python interpreter creates a special variable called __name__ for each module when a program is run. Python uses the “double underscore” naming convention to denote special variable or function names. We’ll encounter a few more of these throughout the course. By default, the __name__ variable is set to the name of the module: the __name__ of useful.py is 'useful', and the __name__ of math is 'math'.
However, when you run a module (e.g., right-click and select “Run File in Python Console”), the Python interpreter overrides the default module __name__ and instead sets it to the special string '__main__'. And so checking the __name__ variable is a way to determine if the current module is being run, or whether it’s being imported by another module!
When we write if __name__ == '__main__', we are really saying, “Execute the following code if this module is being run, and ignore the following code if this module is being imported by another module”. The boolean expression __name__ == '__main__' evaluates to True in the former case, and False in the latter, and the conditional execution happens because of the if statement.
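To make the two cases concrete, here is a small sketch (describe_module is a hypothetical function, just for illustration) that captures the check Python performs:

```python
def describe_module(name: str) -> str:
    """Return a description of how a module with the given __name__ is being used.

    This mirrors the if __name__ == '__main__' check: a module being run
    directly has the __name__ value '__main__', while an imported module's
    __name__ is its own module name.
    """
    if name == '__main__':
        return 'being run directly'
    else:
        return 'imported as ' + name
```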
We call this if branch (all the code under if __name__ == '__main__') the module’s main block. Here are some important conventions to follow for organizing a Python file with a main block:
- The top of the file contains import statements (import ...), constant definitions (MY_CONSTANT = ...), function definitions (def ...), and data type definitions (class ...), which we will see in the next chapter.
- Any other code, such as the code to run doctest or pytest, goes inside the main block so that it is only executed when the module is run, and not when it is imported.

One of the most central questions in software development is, “How do we know that the software we write is correct?” Certainly, writing test cases will ensure that our functions produce the expected output for specific situations. But as our programs increase in complexity, how confident can we be that our test cases are sufficient?
Before we address this question, we will formalize what it means for a program to be correct in the first place. Because functions are the primary way we organize programs, we’ll focus on what it means for an individual function to be correct.
A specification for a function consists of two parts:
With these two parts, a function’s specification defines what we expect the function to do. The job of an implementation of the function is to provide the Python code in the function body that meets this specification. We say that a function implementation is correct when the following holds: For all inputs that satisfy the specification’s preconditions, the function implementation’s return value satisfies the specification’s postconditions.
A function specification acts as a contract or agreement between the person who implements the function and the person who calls it. The implementer’s responsibility is to make sure their code correctly returns or does what the specification says. When writing this code, they do not need to worry about exactly how the function is called, and may assume that the function’s input is always valid. (In fact, we have already seen several preconditions in this course: every time a function description said “assume X about the input(s)”, that was a precondition.) The caller’s responsibility is to make sure they call the function with valid inputs. When they make this call, they do not need to worry about exactly how the function is implemented, and may assume that the function works correctly.
The concept of a function specification is a very powerful one, as it spreads the responsibility of function correctness across two parties that do their parts separately—as long as they both know what the function specification is. As a result, these specifications must be very precise. Outside of software, lawyers are hired to draft and review contracts to make sure that they are defensible in the eyes of the law. Similarly, programmers must behave as lawyers when designing software to write ironclad contracts that leave no ambiguity in what is expected of the user or how the software will behave. In this section, we introduce some new tools and terminology that can help our functions be more explicit in their requirements and behaviour.
Even though we haven’t formally introduced the notion of a function specification until this section, you’ve been writing specifications all along simply by following the Function Design Recipe. Let’s take a look at an early example:
def is_even(n: int) -> bool:
    """Return whether n is even.

    >>> is_even(1)
    False
    >>> is_even(2)
    True
    """
    # Body omitted.

Here, the type contract and description actually form a complete specification of this function’s behaviour:
- The type annotation for n tells us that the valid inputs to is_even are int values. The type annotation int is itself a precondition of the function.
- The return type annotation tells us that the function returns a bool. In addition, the description “Return whether n is even.” specifies the relationship between the function’s return value and its input. The doctest examples aid understanding, but are not strictly required to specify what this function does. The function description and return type annotation specify the postconditions of the function.

From this alone, we know what it means for this function to be implemented correctly, even if we can’t see the implementation.
is_even is implemented correctly when, for all ints n, is_even(n) returns a bool that is True when n is even, and False when n is not even.
For example, suppose David has implemented this function. Mario loads this function implementation into the Python console and calls it:

>>> is_even(4)
False

In this case, 4 is an int, so Mario held up his end of the contract when he called the function. But the False return value is inconsistent with the function description, and so we know there must be an error in the implementation—David is at fault, not Mario.
Suppose David fixes his implementation, and asks Mario to try another call. Mario types in:

>>> is_even(4)
True

Okay, pretty good. Now Mario tries:
>>> is_even([1, 2, 3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in is_even
TypeError: unsupported operand type(s) for %: 'list' and 'int'

In this case, the function did not produce a return value but rather an error (a TypeError). Is David at fault again? No! Mario violated the function’s precondition by passing in a list rather than an int, and so he should have no expectation that is_even will meet its postcondition. Therefore, Mario (the caller of the function) caused the error.
All parameter type annotations are preconditions for a function. But often these type annotations are not precise enough to specify the exact set of valid inputs. Consider this function:
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    >>> max_length({'Hello', 'Mario', 'David Liu'})
    9
    """
    return max({len(s) for s in strings})

What happens when the set is empty? Let’s try it out in the console:
>>> empty_set = set()
>>> max_length(empty_set)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 7, in max_length
ValueError: max() arg is an empty sequence

We’ve obtained an error rather than an int; this makes logical sense, because it is impossible to find the maximum value in a set that contains no values at all. But from the standpoint of the formal function specification, who is to blame: the function’s caller or the function’s implementer?
As it stands, the implementer is at fault because the only description of “valid inputs” given is the type annotation set; the empty set is still a set. So we need to update the specification to rule out this possibility, but how? You may recall that we’ve been adding extra “assumptions” on inputs for programming exercises in this course for the past few weeks already. What we’re learning here is how to formalize these assumptions into function docstrings. We encountered this issue in 3.3 Filtering Collections, when we wanted to restrict a statement to apply to a subset of our domain. Here we’re doing the same thing: making the set of valid function inputs more specific, because we only want to guarantee our implementation works correctly on those inputs. We add a precondition to the function docstring as follows:
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
        - len(strings) > 0
    """
    return max({len(s) for s in strings})

Whenever possible, we’ll express these general preconditions as valid Python expressions involving the function’s parameters. Sometimes we’ll encounter a precondition that is extremely complex, in which case we can write it in English. In English, the full specification of max_length’s valid inputs is “strings is a set, and len(strings) > 0”. As functions get more complex, we can add additional preconditions by listing them under the header Preconditions: in the docstring. A function input is valid when it satisfies the type annotations and all general precondition expressions.
Note that adding the precondition to the docstring does not change the behaviour of the function. If an empty set is passed into the function by the user, the function will still produce the ValueError we saw above. However, now that the precondition has been documented in the function specification, if we call max_length(empty_set), we know that the error is entirely our fault because we violated a precondition.
While our previous example illustrates how to document preconditions as part of a function specification, it has one drawback: it relies on whoever is calling the function to read the documentation! Of course, reading documentation is an important skill for any computer scientist, but despite our best intentions we sometimes miss things. It would be nice if we could turn our preconditions into executable Python code so that the Python interpreter checks them every time we call the function.
One way to do this is to use an assert statement, just like we do in unit tests. Because we’ve written the precondition as a Python expression, we can convert this to an assertion by copy-and-pasting it at the top of the function body.
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
        - len(strings) > 0
    """
    assert len(strings) > 0, 'Precondition violated: max_length called on an empty set.'
    return max({len(s) for s in strings})

Now, the precondition is checked every time the function is called, with a meaningful error message when the precondition is violated:
>>> empty_set = set()
>>> max_length(empty_set)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 7, in max_length
AssertionError: Precondition violated: max_length called on an empty set.

However, this approach is annoying and error-prone. First, we have to duplicate the precondition in two places. And second, we have increased the size of the function body with extra code. The python_ta library we use in this course has a way to automatically check preconditions for all functions in a given file. Here is an example:
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
        - len(strings) > 0
    """
    return max({len(s) for s in strings})


if __name__ == '__main__':
    import python_ta.contracts
    python_ta.contracts.DEBUG_CONTRACTS = False  # Disable contract debug messages
    python_ta.contracts.check_all_contracts()

    max_length(set())

Notice that we’ve kept the function docstring the same, but removed the assertion. The function we call, python_ta.contracts.check_all_contracts, modifies our max_length function. That is, python_ta takes the function’s type contract and the preconditions it finds in the function docstring, and causes the function to check these preconditions every time the function is called! Let’s see what happens when we run this file:
Traceback (most recent call last):
  ...
AssertionError: max_length precondition "len(strings) > 0" violated for arguments {'strings': set()}.

Pretty cool! We’ll be using check_all_contracts for the rest of this course to help us make sure we’re sticking to the specifications we’ve written in our function headers and docstrings when we call our functions. Moreover, check_all_contracts checks the return type of each function, so it’ll also work as a check when we’re implementing our functions to make sure the return value is of the correct type.
Preconditions allow the implementer of a function to specify assumptions about the function’s inputs, and so simplify the work of the implementer. On the other hand, preconditions place restrictions on the user of the function; the onus is on them to respect these preconditions every time the function is called. This often increases the complexity of the code that calls the function. For example, in our max_length function, the calling code might need an if statement to first check whether a set is empty before passing it to max_length.
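For instance, calling code might look like the following sketch, where report_max_length is a hypothetical caller of the precondition version of max_length from above:

```python
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
        - len(strings) > 0
    """
    return max({len(s) for s in strings})


def report_max_length(strings: set) -> str:
    """Return a message reporting the longest string length in strings.

    A hypothetical caller of max_length: it must check for the empty set
    itself before calling, to respect max_length's precondition.
    """
    if len(strings) > 0:
        return 'Longest length: ' + str(max_length(strings))
    else:
        return 'No strings given'
```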
When confronted with an “invalid input”, there is another strategy other than simply ruling out the invalid input with a precondition: explicitly defining some alternate function behaviour for this input. Here is another way we could define max_length:
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Return 0 if strings is empty.
    """
    if strings == set():
        return 0
    else:
        return max({len(s) for s in strings})

Here, we picked a reasonable default value for max_length when given an empty set (this is very similar to how we define empty sums and products by mathematical convention), and then handled that as an explicit case in our implementation by using an if statement. Our function implementation is more complex than before, but now another person can call our function on an empty set without producing an error:

>>> max_length(set())
0
You’re probably wondering: is this version of max_length better or worse than our original one with the precondition? This version resulted in a longer description and function body, but it also removed a possible error we might encounter when calling the function. On the other hand, is 0 really a “reasonable” value for the behaviour of this function? Because this is ultimately a design decision, there is no clear “right answer”—there are always trade-offs to be made. Rather than sticking with a particular rule (i.e., “always/never use preconditions”), it’s better to use broader principles to evaluate different choices. How much complexity is added by handling an additional input in a function implementation? Are there “reasonable” behaviours defined for a larger set of inputs than what you originally intended? The trade-offs are rarely clear cut.
It turns out that with either of the “precondition” or “reasonable default” strategies, our specification of max_length is still incomplete. Before moving onto the next section, take a moment to study these implementations and try to guess what the gap might be!
Recall our definition of max_length from the previous section:
def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
        - len(strings) > 0
    """
    return max({len(s) for s in strings})

Let us introduce another issue:
>>> max_length({1, 2, 3})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <setcomp>
TypeError: object of type 'int' has no len()

Once again, our specification of valid inputs has failed us. The parameter type for max_length is set, and in Python sets can contain values of many different types. It is not until the function description that we see that the parameter is not just any set, but specifically a set of strings. We could make this requirement more explicit by introducing another precondition, but there is a better approach. In this section, we’ll learn how to increase the specificity of our type annotations.
There are four collection types that we have seen so far: set, list, tuple, and dict. These are analogous to the data types we’ve been using so far, with one key difference: we can specify the types of the values they contain by writing those types in square brackets. The table below shows these types and some examples; note that T, T1, etc. are variables that could be replaced with any data type.
| Type | Description |
|---|---|
| set[T] | A set whose elements all have type T |
| list[T] | A list whose elements all have type T |
| tuple[T1, T2, ...] | A tuple whose first element has type T1, second element has type T2, etc. |
| dict[T1, T2] | A dictionary whose keys are of type T1 and whose values are of type T2 |
For example:
- {'hi', 'bye'} has type set[str]
- [1, 2, 3] has type list[int]
- ('hello', True, 3.4) has type tuple[str, bool, float]
- {'a': 1, 'b': 2, 'c': 3} has type dict[str, int]

Here is how we can improve the type contract for max_length:
def max_length(strings: set[str]) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
        - len(strings) > 0
    """
    return max({len(s) for s in strings})

Though indicating the type of the values inside a collection is useful, it is not always necessary. Sometimes we want to be flexible and say that a value must be a list, but we don’t care what’s in the list (it could be a list of strings, a list of integers, or a list of strings mixed with integers). Or, we might want a list (or other collection) with elements of different types. In such cases, we will continue using the built-in types set, list, tuple, and dict for these type annotations, without additional information.
Let us revisit a function we designed when discussing if statements:
def get_status_v3(scheduled: int, estimated: int) -> str:
    """Return the flight status for the given scheduled and estimated departure times.

    The times are given as integers between 0 and 23 inclusive, representing
    the hour of the day.

    The status is 'On time', 'Delayed', or 'Cancelled'.
    """

How can we improve the specification of this function? Looking at the type annotations, we see that, since none are collection types, we cannot make them any more specific than they already are. Looking at the docstring, we see the potential for some preconditions. In the revised version below, we keep the English description of what the times represent, but move the Python-checkable part into formal preconditions:
def get_status_v3(scheduled: int, estimated: int) -> str:
    """Return the flight status for the given scheduled and estimated departure times.

    The times given represent the hour of the day.

    Preconditions:
        - 0 <= scheduled <= 23
        - 0 <= estimated <= 23
    """

Next let us revisit the count_cancelled function we designed:
def count_cancelled(flights: dict) -> int:
    """Return the number of cancelled flights for the given flight data.

    flights is a dictionary where each key is a flight ID,
    and whose corresponding value is a list of two numbers, where the first is
    the scheduled departure time and the second is the estimated departure time.

    >>> count_cancelled({'AC110': [10, 12], 'AC321': [12, 19], 'AC999': [1, 1]})
    1
    """
    cancelled_flights = {k for k in flights
                         if get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled'}
    return len(cancelled_flights)

Here we can improve the type annotations. The first parameter is not just a dict, but a dict[str, list[int]]—that is, its keys are strings (the flight IDs), and the corresponding values are lists of integers. Does this type annotation mean that the documentation describing the dictionary is now irrelevant? No: while the type annotation gives some insight into the structure of the data, it does not provide domain-specific context, like the fact that the str keys represent flight IDs, or that the list values represent scheduled and estimated departure times.
There is one more precondition that we can formalize, though: the length of each list in our dictionary. Every list should have length two, which translates naturally into a use of Python’s all function:
def count_cancelled(flights: dict[str, list[int]]) -> int:
    """Return the number of cancelled flights for the given flight data.

    flights is a dictionary where each key is a flight ID,
    and whose corresponding value is a list of two numbers, where the first is
    the scheduled departure time and the second is the estimated departure time.

    Preconditions:
        - all(len(flights[k]) == 2 for k in flights)

    >>> count_cancelled({'AC110': [10, 12], 'AC321': [12, 19], 'AC999': [1, 1]})
    1
    """
    cancelled_flights = {k for k in flights
                         if get_status_v3(flights[k][0], flights[k][1]) == 'Cancelled'}
    return len(cancelled_flights)

Throughout this course, we will study various mathematical objects that play key roles in computer science. As these objects become more complex, so too will our statements about them, to the point where if we try to write out everything using just basic set and arithmetic operations, our formulas won’t fit on a single line! To avoid this problem, we create definitions, which let us express a long idea using a single term. (This is analogous to using local variables or helper functions in programming to express part of an overall value or computation.)
In this section, we’ll look at one extended example of defining our own predicates mathematically and in Python, and using them in our statements. Let us take some familiar terminology and make it precise using the languages of predicate logic and Python.
Let \(n, d \in \Z\). (You may be used to defining divisibility for just the natural numbers, but it will be helpful to allow for negative numbers in our work.) We say that \(d\) divides \(n\), or \(n\) is divisible by \(d\), when there exists a \(k \in \Z\) such that \(n = dk\). In this case, we use the notation \(d \DIV n\) to represent “\(d\) divides \(n\).”
Note that just like the equals sign \(=\) is a binary predicate, so too is \(\DIV\). For example, the statement \(3 \DIV 6\) is True, while the statement \(4 \DIV 10\) is False. (Students often confuse the divisibility predicate with the horizontal fraction bar. The former is a predicate that returns a boolean; the latter is a function that returns a number. So \(4 \DIV 10\) is \(False\), while \(\frac{10}{4}\) is \(2.5\).)
This definition also permits \(d = 0\), which may be a bit surprising! According to this definition, \(0 \mid 0\), and for any non-zero \(n \in \Z\), \(0 \nmid n\). (Exercise: why are these two statements true?) In other words, when \(d = 0\), \(d \mid n\) if and only if \(n = 0\).
Let’s express the statement “For every integer \(x\), if \(x\) divides 10, then it also divides 100” in two ways: with the divisibility predicate \(d \DIV n\), and without it.
With the predicate: this is a universal quantification over all possible integers, and contains a logical implication. So we can write \[\forall x \in \Z,~ x \DIV 10 \IMP x \DIV 100.\]
Without the predicate: the same structure is there, except we unpack the definition of divisibility, replacing every instance of \(d \DIV n\) with \(\exists k \in \Z,~ n = dk\). \[\forall x \in \Z,~ \big(\exists k \in \Z,~ 10 = kx\big) \IMP \big(\exists k \in \Z,~ 100 = kx\big).\]
Note that each subformula in the parentheses has its own \(k\) variable, whose scope is limited by the parentheses. That is, the \(k\) in the hypothesis of the implication is different from the \(k\) in the conclusion: they can take on different values, though they can also take on the same value. However, even though this is technically correct, it’s often confusing for beginners. So instead, we’ll tweak the variable names to emphasize their distinctness: \[\forall x \in \Z,~ \big(\exists k_1 \in \Z,~ 10 = k_1x\big) \IMP \big(\exists k_2 \in \Z,~ 100 = k_2x\big).\]
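We can even check this statement empirically in Python. The sketch below uses a definition-based divides predicate (matching the one developed later in this section), together with the fact that any divisor of 10 must lie between \(-10\) and \(10\), which turns the universal quantification into a finite all:

```python
def divides(d: int, n: int) -> bool:
    """Return whether d divides n (i.e., n == k * d for some integer k)."""
    possible_divisors = range(-abs(n), abs(n) + 1)
    return any(n == k * d for k in possible_divisors)


# "For every integer x, if x divides 10, then x also divides 100."
# Divisors of 10 must lie in the range -10..10, so a finite check suffices.
statement = all(divides(x, 100) for x in range(-10, 11) if divides(x, 10))
```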
As you can see, using this new predicate makes our formula quite a bit more concise! But the usefulness of our definitions doesn’t stop here: we can, of course, use our terms and predicates in further definitions.
Let \(p \in \Z\). We say \(p\) is prime when it is greater than \(1\) and the only natural numbers that divide it are \(1\) and itself.
Let’s define a predicate \(IsPrime(p)\) to express the statement that “\(p\) is a prime number,” with and without using the divisibility predicate.
The first part of the definition, “greater than \(1\),” is straightforward. The second part is a bit trickier, but a good insight is that we can enforce constraints on values through implication: if a number \(d\) divides \(p\), then \(d = 1\) or \(d = p\). We can put these two ideas together to create a formula: \[IsPrime(p): p > 1 \AND \big( \forall d \in \N,~ d \DIV p \IMP d = 1 \OR d = p \big), \qquad \text{where $p \in \Z$}.\]
To express this idea without using the divisibility predicate, we substitute in the definition of divisibility. The underline shows the changed part. \[IsPrime(p): p > 1 \AND \big( \forall d \in \N,~ \underline{\left(\exists k \in \Z,~ p = kd\right)} \IMP d = 1 \OR d = p \big), \quad \text{where $p \in \Z$}.\]
As we just saw, in mathematics we can often express definitions as predicates, where an element of the domain (e.g., an integer) satisfies the predicate if it fits the definition. Because predicates are just functions, we can express these in programs as well. For example, let’s consider the divisibility predicate \(\mid\), where \(d \mid n\) means \(\exists k \in \Z,~ n = kd\) (for \(d, n \in \Z\)). Here is the start of a function design in Python:

def divides(d: int, n: int) -> bool:
    """Return whether d divides n."""
While we can use the modulo operator % to implement this function (more on this later), we’ll remain as faithful to the mathematical definition as possible. Unfortunately, there is one challenge with translating the mathematical definition of divisibility precisely into a Python function. In mathematics we have no trouble at all representing an infinite set of numbers with the symbol \(\Z\); but in a computer program, we cannot represent infinite sets in the same way. Instead, we’ll use a property of divisibility to restrict the set of numbers to quantify over: when \(n \neq 0\), every number that divides \(n\) must lie in the range \(\{-|n|, -|n| + 1, \dots, |n| - 1, |n|\}\). We’ll actually prove this property later on!
But the next question is, how do we represent the set \(\{-|n|, -|n| + 1, \dots, |n| - 1, |n|\}\) in Python, when \(n\) is given as a parameter? We can use the range data type:

possible_divisors = range(- abs(n), abs(n) + 1)

(Remember the asymmetry here: the start argument is inclusive, but the end argument is exclusive, which is why we add 1 to abs(n).)
And then we can replace \(\Z\) by this variable in the definition of divisibility to obtain \(\exists k \in possible\_divisors,~ n = kd\). We can now translate this directly into Python code using what we learned earlier this chapter:
def divides(d: int, n: int) -> bool:
    """Return whether d divides n."""
    possible_divisors = range(- abs(n), abs(n) + 1)
    return any({n == k * d for k in possible_divisors})

Now let’s turn our attention to the definition of \(\mathit{IsPrime}\):
\[\mathit{IsPrime}(p): p > 1 \AND \big( \forall d \in \N,~ d \DIV p \IMP d = 1 \OR d = p \big), \qquad \text{where $p \in \Z$}.\]
Here’s a start for translating this definition into a Python function:
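Presumably (mirroring the divides example above), this start is just the header and docstring of the implementation completed below:

```python
def is_prime(p: int) -> bool:
    """Return whether p is prime."""
```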
Once again, we have a problem of an infinite set: \(\forall d \in \N\). We can use the same property of divisibility as above and note that the possible natural numbers that are divisors of \(p\) are in the set \(\{1, 2, \dots, p\}\). This is simpler than the version above because \(p \geq 1\). The quantified statement is a bit harder to translate because it contains an implication, so we recall from 3.3 Filtering Collections that we can use the if keyword in a comprehension to model implications. Here is our complete implementation of is_prime:
def is_prime(p: int) -> bool:
    """Return whether p is prime."""
    possible_divisors = range(1, p + 1)
    return (
        p > 1 and
        all({d == 1 or d == p for d in possible_divisors if divides(d, p)})
    )

Notice that just like the mathematical definition, in Python our implementation of is_prime uses the divides function. This is a great example of how useful it can be to divide our work into functions that build on each other, rather than writing all of our code in a single function. As we learn about more complex domains in this course, we’ll see this pattern repeat itself: definitions will build on top of one another, and you should expect that your functions will build on one another as well.
You might have noticed that our definition of divides, though faithful to the mathematical definition, is not the same as how we’ve previously determined whether a number is divisible by 2 (i.e., is even).
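Recall that implementation (it also appears in my_functions.py later in this section):

```python
def is_even(value: int) -> bool:
    """Return whether value is divisible by 2."""
    return value % 2 == 0
```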
In this case, we check whether n is divisible by 2 by checking whether the remainder when n is divided by 2 is 0 or not. It turns out that for non-zero \(d \in \Z\), checking remainders is equivalent to the original definition of divisibility:
\[\forall n, d \in \Z,~ d \neq 0 \Rightarrow (d \mid n \Leftrightarrow n~\%~d = 0).\]
Note that when \(d = 0\), the remainder \(n~\%~d\) is undefined, and so we really do need the \(d \neq 0\) condition in the above statement.
We can use this observation to write an alternate implementation of the divides function:
def divides2(d: int, n: int) -> bool:
    """Return whether d divides n."""
    if d == 0:
        # This is the original definition.
        possible_divisors = range(-abs(n), abs(n) + 1)
        return any({n == k * d for k in possible_divisors})
    else:
        # This is a new but equivalent check.
        return n % d == 0

You might also notice that the d == 0 case is quite special: according to our definition of divisibility, when d == 0 we know that d divides n if and only if n == 0:
\[\forall n, d \in \Z,~ d = 0 \Rightarrow (d \mid n \Leftrightarrow n = 0)\]
We can use this to greatly simplify the if branch in our divides2 function:
def divides3(d: int, n: int) -> bool:
    """Return whether d divides n."""
    if d == 0:
        # This is another new, equivalent check.
        return n == 0
    else:
        # This is a new but equivalent check.
        return n % d == 0

Our implementation in divides3 meets the same function specification as the original divides, but has a much simpler implementation! It is also much more efficient than the original divides, meaning it performs fewer calculations (or computational “steps”) and takes less time to compute its result. Intuitively, this is because the original divides function used the value range(-abs(n), abs(n) + 1) in a comprehension, and so the number of expressions evaluated gets larger as n grows. This is not the case for divides3, which does not use a single range or comprehension in its body!
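As a quick sanity check of this equivalence (our own check, not from the text), we can verify that divides and divides3 agree on a range of small sample inputs:

```python
def divides(d: int, n: int) -> bool:
    """Return whether d divides n (definition-based version)."""
    possible_divisors = range(-abs(n), abs(n) + 1)
    return any({n == k * d for k in possible_divisors})


def divides3(d: int, n: int) -> bool:
    """Return whether d divides n (simplified version)."""
    if d == 0:
        return n == 0
    else:
        return n % d == 0


# The two implementations agree on every pair of small integers,
# including the special cases where d or n is zero or negative.
assert all(divides(d, n) == divides3(d, n)
           for d in range(-10, 11)
           for n in range(-10, 11))
```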
What this also means is that we can speed up our implementation of is_prime simply by calling divides3 instead of divides:
def is_prime(p: int) -> bool:
    """Return whether p is prime."""
    possible_divisors = range(1, p + 1)
    return (
        p > 1 and
        all({d == 1 or d == p for d in possible_divisors if divides3(d, p)})  # <-- Note the "divides3"
    )

This is a very powerful idea: we started out with one implementation of the divisibility predicate (divides), and then through some mathematical reasoning wrote a second implementation (divides3) that was logically equivalent to the first, but simpler and faster. And because divides and divides3 were logically equivalent, we could safely replace divides with divides3 in the implementation of is_prime to make it run faster, without worrying about introducing new errors!
The idea of swapping out implementations will come up again and again in this course. As the functions and programs you write grow larger, efficiency will be an important consideration for your code, and so it will be common to start with one function implementation and eventually replace it with another. We’re only touching on these ideas here with a relatively simple example, but we will talk formally about program efficiency in a later chapter.
hypothesis

When we introduced if statements in Section 3.4, we discussed how unit tests could be used to perform white box testing, where the goal is to “cover” all possible execution paths with unit tests. Unit tests really excel in this scenario because we can determine what the inputs of a function should be to reach a particular branch.
But choosing unit test inputs also imposes challenges on the programmer writing those tests. How do we know we have “enough” inputs? What properties of the inputs should we consider? For example, if our function takes a list[int], how long should our input lists be, should they contain duplicates, and what should the values inside the list be? For each choice of answers to these questions, we then need to choose a specific input and calculate the expected output to write a unit test.
In this section, we introduce a different form of testing called property-based testing, using the Python module hypothesis. The main advantage of property-based testing with hypothesis is that we can write one test case that calls the function being tested on multiple inputs that the hypothesis library chooses for us automatically. Property-based tests are not intended to replace unit tests—both have their role in testing and both are important.
The kinds of tests we’ve discussed so far involve defining input-output pairs: for each test, we write a specific input to the function we’re testing, and then use assert statements to verify the correctness of the corresponding output. These tests have the advantage that writing any one individual test is usually straightforward, but the disadvantage that choosing and implementing test cases can be challenging and time-consuming.
There is another way of constructing tests that we will explore here: property-based testing, in which a single test typically consists of a large set of possible inputs that is generated in a programmatic way. Such tests have the advantage that it is usually straightforward to cover a broad range of inputs in a short amount of code; but it isn’t always easy to specify exactly what the corresponding outputs should be. If we were to write code to compute the correct answer, how would we know that that code is correct?
So instead, property-based tests use assert statements to check for properties that the function being tested should satisfy. In the simplest case, these are properties that every output of the function should satisfy, regardless of what the input was. For example:
- “str should always return a string.”
- “len should always return an integer that is greater than or equal to zero.”
- “max(x, y) should return something that is greater than or equal to both x and y.”
- “For any two lists nums1 and nums2, we know that sum(nums1 + nums2) == sum(nums1) + sum(nums2).”

These properties may seem a little strange, because they do not capture precisely what each function does; for example, str should not just return any string, but a string that represents its input. This is the trade-off that comes with property-based testing: in exchange for being able to run our code on a much larger range of inputs, we write tests which are imprecise characterizations of the function’s behaviour. The challenge, then, with property-based testing is to come up with good properties that narrow down as much as possible the behaviour of the function being tested.
As a first example, let’s consider our familiar is_even function, which we define in a file called my_functions.py. (You can follow along in this section by creating your own files!)
# Suppose we've saved this in my_functions.py

def is_even(value: int) -> bool:
    """Return whether value is divisible by 2.

    >>> is_even(2)
    True
    >>> is_even(17)
    False
    """
    return value % 2 == 0

Rather than choosing specific inputs to test is_even on, we’re going to test the following two properties:
- is_even always returns True when given an int of the form 2 * x (where x is an int)
- is_even always returns False when given an int of the form 2 * x + 1 (where x is an int)

One of the benefits of our previous study of predicate logic is that we can express both of these properties clearly and unambiguously using symbolic notation:
\[\begin{align*} \forall x \in \Z,~ \text{is_even}(2x) \\ \forall x \in \Z,~ \lnot \text{is_even}(2x + 1) \end{align*}\]
Now let’s see how to express these properties as test cases using hypothesis. First, we create a new file called test_my_functions.py, and include the following “test” function. (Make sure that my_functions.py and test_my_functions.py are in the same directory.)
# In file test_my_functions.py
from my_functions import is_even


def test_is_even_2x(x: int) -> None:
    """Test that is_even returns True when given a number of the form 2*x."""
    assert is_even(2 * x)

Note that unlike previous tests we’ve written, we have not chosen a specific input value for is_even! Instead, our test function test_is_even_2x takes an integer for x, and calls is_even on 2 * x. This is a more general form of test because now x could be any integer.
So now the question is, how do we actually call test_is_even_2x on many different integer values? You could run this file in the Python console and call it manually on different arguments, but there must be a better way! This is where hypothesis comes in. In order to generate a range of inputs, the hypothesis module offers a set of strategies that we can use. These strategies are able to generate several values of a specific type of input. For example, to generate int data types, we can use the integers strategy. To start, we add these two lines to the top of our test file:
# In file test_my_functions.py
from hypothesis import given                # NEW
from hypothesis.strategies import integers  # NEW
from my_functions import is_even


def test_is_even_2x(x: int) -> None:
    """Test that is_even returns True when given a number of the form 2*x."""
    assert is_even(2 * x)

Just importing given and integers isn’t enough, of course. We need to somehow “attach” them to our test function so that hypothesis knows to generate integer inputs for the test. To do so, we use a new piece of Python syntax called a decorator, which is specified by using the @ symbol with an expression in the line immediately before a function definition. Here is the use of a decorator in action:
# In file test_my_functions.py
from hypothesis import given
from hypothesis.strategies import integers
from my_functions import is_even


@given(x=integers())  # NEW
def test_is_even_2x(x: int) -> None:
    """Test that is_even returns True when given a number of the form 2*x."""
    assert is_even(2 * x)

The line @given(x=integers()) is a bit tricky, so let’s unpack it. First, integers is a hypothesis function that returns a special data type called a strategy, which is what hypothesis uses to generate a range of possible inputs. In this case, calling integers() returns a strategy that simply generates ints.
Second, given is a hypothesis function that takes in arguments in the form <param>=<strategy>, which acts as a mapping for the test parameter name to a strategy that hypothesis should use for generating arguments for that parameter.
We say that the line @given(x=integers()) decorates the test function, so that when we run the test function, hypothesis will call the test several times, using int values for x as specified by the strategy integers(). Essentially, @given helps automate the process of “run the test on different int values” for us!
And finally, to actually run the test, we use pytest, just like before:
# In file test_my_functions.py
from hypothesis import given
from hypothesis.strategies import integers
from my_functions import is_even


@given(x=integers())
def test_is_even_2x(x: int) -> None:
    """Test that is_even returns True when given a number of the form 2*x."""
    assert is_even(2 * x)


if __name__ == '__main__':
    import pytest

    pytest.main(['test_my_functions.py', '-v'])

Just like with unit tests, we can write multiple property-based tests in the same file and have pytest run each of them. Here is our final version of test_my_functions.py for this example, which adds a second test for numbers of the form \(2x + 1\).
# In file test_my_functions.py
from hypothesis import given
from hypothesis.strategies import integers
from my_functions import is_even


@given(x=integers())
def test_is_even_2x(x: int) -> None:
    """Test that is_even returns True when given a number of the form 2*x."""
    assert is_even(2 * x)


@given(x=integers())
def test_is_even_2x_plus_1(x: int) -> None:
    """Test that is_even returns False when given a number of the form 2*x + 1."""
    assert not is_even(2 * x + 1)


if __name__ == '__main__':
    import pytest

    pytest.main(['test_my_functions.py', '-v'])

hypothesis with collections

Now let’s consider a more complicated example, this time involving lists of integers. Let’s add the following function to my_functions.py:
# In my_functions.py

def num_evens(nums: list[int]) -> int:
    """Return the number of even elements in nums."""
    return len([n for n in nums if is_even(n)])

Let’s look at one example of a property-based test for num_evens. For practice, we’ll express this property in predicate logic first. Let \(\mathcal{L}_{\text{int}}\) be the set of lists of integers. The property we’ll express is:
\[ \forall \text{nums} \in \mathcal{L}_{\text{int}},~ \forall x \in \Z,~ \text{num_evens}(\text{nums} + [2x]) = \text{num_evens}(\text{nums}) + 1 \]
Translated into English: for any list of integers \(nums\) and any integer \(x\), the number of even elements of nums + [2 * x] is one more than the number of even elements of nums.
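To make the property concrete before writing the test, here is one instance of it with sample values of our own choosing, nums = [1, 2, 3] and x = 5:

```python
def is_even(value: int) -> bool:
    """Return whether value is divisible by 2."""
    return value % 2 == 0


def num_evens(nums: list[int]) -> int:
    """Return the number of even elements in nums."""
    return len([n for n in nums if is_even(n)])


# [1, 2, 3] has one even element; appending 2 * 5 = 10 adds exactly one more.
assert num_evens([1, 2, 3]) == 1
assert num_evens([1, 2, 3] + [2 * 5]) == num_evens([1, 2, 3]) + 1
```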
We can start using the same idea as our is_even example, by writing the test function in test_my_functions.py.
# In test_my_functions.py

def test_num_evens_one_more_even(nums: list[int], x: int) -> None:
    """Test num_evens when you add one more even element."""
    assert num_evens(nums + [2 * x]) == num_evens(nums) + 1

Now we need to use @given again to tell hypothesis to generate inputs for this test function. Because this function takes two arguments, we know that we’ll need a decorator expression of the form @given(nums=<strategy>, x=<strategy>).
We can reuse the same integers() strategy for x, but what about nums? Not surprisingly, we can import the lists function from hypothesis.strategies to create strategies for generating lists! The lists function takes in a single argument, which is a strategy for generating the elements of the list. In our example, we can use lists(integers()) to return a strategy for generating lists of integers.
Here is our full test file (with the is_even tests omitted):
# In file test_my_functions.py
from hypothesis import given
from hypothesis.strategies import integers, lists  # NEW lists import
from my_functions import is_even, num_evens


@given(nums=lists(integers()), x=integers())  # NEW given call
def test_num_evens_one_more_even(nums: list[int], x: int) -> None:
    """Test num_evens when you add one more even element."""
    assert num_evens(nums + [2 * x]) == num_evens(nums) + 1


if __name__ == '__main__':
    import pytest

    pytest.main(['test_my_functions.py', '-v'])

The property test expressed in test_num_evens_one_more_even is pretty neat, but by itself it is not sufficient to verify the correctness of the num_evens function. For example, this property would also hold if num_evens simply returned the length of the list, rather than the number of even elements.
This is a drawback with property-based tests: even though we can now check some property for very many inputs automatically, a single property alone does not guarantee that a function is correct. The ideal goal of property-based testing, then, is choosing properties to verify, so that if all of the properties are verified, then the function must be correct. This sounds too good to be true, and it often is: as functions get more complex, it is challenging or even impossible to find such a set of properties.
But for num_evens, a relatively simple function, it is actually possible to formally prove the following statement, which tells us exactly which properties we need to check.
Theorem (correctness for num_evens). An implementation of num_evens is correct (i.e., returns the number of even elements for any list of numbers) if and only if it satisfies all three of the following:

- num_evens([]) == 0
- For every list of integers nums and every integer x, num_evens(nums + [2 * x]) == num_evens(nums) + 1
- For every list of integers nums and every integer x, num_evens(nums + [2 * x + 1]) == num_evens(nums)
Proving such a statement is beyond the scope of this chapter, but if you’re curious it is closely related to the proof technique of induction, which we will cover formally later this year. But the actual statement is pretty amazing: it tells us that with just one unit test (for nums = []) and two property tests, we can be certain that our num_evens function is correct!
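To illustrate, here is an informal check of the theorem’s three properties on a small sample of hand-picked inputs. (A real test suite would use hypothesis strategies as shown above; this loop-based sketch is our own.)

```python
def is_even(value: int) -> bool:
    """Return whether value is divisible by 2."""
    return value % 2 == 0


def num_evens(nums: list[int]) -> int:
    """Return the number of even elements in nums."""
    return len([n for n in nums if is_even(n)])


# Property 1 (a single unit test): the empty list has no even elements.
assert num_evens([]) == 0

# Properties 2 and 3, checked on a sample of lists and integers.
samples = [[], [1], [2, 4], [1, 2, 3], [-2, 7, 0]]
for nums in samples:
    for x in range(-3, 4):
        # Appending an even number increases the count by one...
        assert num_evens(nums + [2 * x]) == num_evens(nums) + 1
        # ...while appending an odd number leaves the count unchanged.
        assert num_evens(nums + [2 * x + 1]) == num_evens(nums)
```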
Expressions in predicate logic with a single quantifier can generally be translated into English as either “there exists an element \(x\) of set \(S\) that satisfies \(P(x)\)” (existential quantifier) or “every element \(x\) of set \(S\) satisfies \(P(x)\)” (universal quantifier). However, there are situations where multiple variables are quantified, and we need to pay special attention to what such statements are actually saying. Let us revisit our \(Loves\) predicate from earlier this chapter. In particular, recall the following relationships regarding who loves whom:
| Sophia | Thelonious | Stanley | Laura | |
|---|---|---|---|---|
| Breanna | False | True | True | False |
| Malena | False | True | True | True |
| Patrick | False | False | True | False |
| Ella | False | False | True | True |
Remember that our \(Loves\) predicate is binary—what if we wanted to quantify both of its inputs? Consider the formula: \[\forall a \in A,~\forall b \in B,~Loves(a,b).\]
We translate this as “for every person \(a\) in \(A\), for every person \(b\) in \(B\), \(a\) loves \(b\).” After some thought, we notice that the order in which we quantified \(a\) and \(b\) doesn’t matter; the statement “for every person \(b\) in \(B\), for every person \(a\) in \(A\), \(a\) loves \(b\)” means exactly the same thing! In both cases, we are considering all possible pairs of people (one from \(A\) and one from \(B\)).
In general, when we have two consecutive universal quantifiers, the order does not matter. That is, the following two formulas are equivalent: \[\forall a \in A,~ \forall b \in B,~ P(a, b) \qquad \text{and} \qquad \forall b \in B,~ \forall a \in A,~ P(a, b).\] (Tip: when the domains of the two variables are the same, we typically combine the quantifications, e.g., \(\forall x \in S,~ \forall y \in S,~ P(x, y)\) into \(\forall x,y \in S,~ P(x, y)\).)
The same is true of two consecutive existential quantifiers. Consider the statements “there exist an \(a\) in \(A\) and \(b\) in \(B\) such that \(a\) loves \(b\)” and “there exist a \(b\) in \(B\) and \(a\) in \(A\) such that \(a\) loves \(b\).” Again, they mean the same thing: in this case, we only care about one particular pair of people (one from \(A\) and one from \(B\)), so the order in which we pick the particular \(a\) and \(b\) doesn’t matter. In general, the following two formulas are equivalent: \[\exists a \in A,~ \exists b \in B,~ P(a, b) \qquad \text{and} \qquad \exists b \in B,~ \exists a \in A,~ P(a, b).\]
But even though consecutive quantifiers of the same type behave very nicely, this is not the case for a pair of alternating quantifiers. First, consider \[\forall a \in A,~ \exists b \in B,~ Loves(a,b).\] This can be translated as “For every person \(a\) in \(A\), there exists a person \(b\) in \(B\), such that \(a\) loves \(b\).” Or put a bit more naturally, “For every person \(a\) in \(A\), \(a\) loves someone in \(B\),” which can be shortened even further to “Everyone in \(A\) loves someone in \(B\).” This is true: every person in \(A\) loves at least one person.
| \(a\) (from \(A\)) | \(b\) (a person in \(B\) who \(a\) loves) |
|---|---|
| Breanna | Thelonious |
| Malena | Laura |
| Patrick | Stanley |
| Ella | Stanley |
Note that the choice of person who \(a\) loves depends on \(a\): this is consistent with the latter part of the English translation, “\(a\) loves someone in \(B\).”
Let us contrast this with the similar-looking formula, where the order of the quantifiers has changed: \[\exists b \in B,~ \forall a \in A,~ Loves(a,b).\] This formula’s meaning is quite different: “there exists a person \(b\) in \(B\), where for every person \(a\) in \(A\), \(a\) loves \(b\).” Put more naturally, “there is a person \(b\) in \(B\) who is loved by everyone in \(A\)” or “someone in \(B\) is loved by everyone in \(A\)”.
| \(b\) (from \(B\)) | Loved by everyone in \(A\)? |
|---|---|
| Sophia | No |
| Thelonious | No |
| Stanley | Yes |
| Laura | No |
This happens to be True because everyone in \(A\) loves Stanley. But it would not be True if we, for example, removed the love connection between Malena and Stanley. In this case, Stanley would no longer be loved by everyone, and so no one in \(B\) would be loved by everyone in \(A\). But notice that even if Malena no longer loves Stanley, the previous statement (“everyone in \(A\) loves someone”) remains True!
So we have a case where switching the order of quantifiers changes the meaning of a formula! In both cases, the existential quantifier \(\exists b \in B\) involves making a choice of person from \(B\). But in the first case, this quantifier occurs after \(a\) is quantified, so the choice of \(b\) is allowed to depend on the choice of \(a\). In the second case, this quantifier occurs before \(a\), and so the choice of \(b\) must be independent of the choice of \(a\).
When reading a nested quantified expression, you should read it from left to right, and pay attention to the order of the quantifiers. In order to see if the statement is True, whenever you come across a universal quantifier, you must verify the statement for every single value that this variable can take on. Whenever you see an existential quantifier, you only need to exhibit one value for that variable such that the statement is True, and this value can depend on the variables to the left of it, but not on the variables to the right of it.
Now let’s see how we could represent this example in Python. First, recall the table of who loves whom from above:
| Sophia | Thelonious | Stanley | Laura | |
|---|---|---|---|---|
| Breanna | False | True | True | False |
| Malena | False | True | True | True |
| Patrick | False | False | True | False |
| Ella | False | False | True | True |
And we can represent this table of who loves whom in Python as a list of lists or, more precisely, using a list[list[bool]].
[
    [False, True, True, False],
    [False, True, True, True],
    [False, False, True, False],
    [False, False, True, True]
]

Our list is the same as the table above, except with the people’s names removed. Each row of the table represents a person from set \(A\), while each column in the table represents a person from set \(B\). We’ve kept the order the same; so the first row represents \(Breanna\), while the third column represents \(Stanley\).
Now, how are we going to access the data from this table? For this section we’re going to put all of our work into a new file called loves.py, and so we’ll start by defining a new variable in this file:
# In loves.py
LOVES_TABLE = [
    [False, True, True, False],
    [False, True, True, True],
    [False, False, True, False],
    [False, False, True, True]
]

This is the first time we’ve defined a variable within a Python file (rather than the Python console) that is not in a function definition. Variables defined in this way are called global constants, to distinguish them from the local variables defined within functions. The term “constant” is not important right now, but will become important later in the course. Global constants are called “global” because their scope is the entire Python module in which they are defined: they can be accessed anywhere in the file, including all function bodies. They can also be imported and used in other Python modules, and are available when we run the file in the Python console.
LOVES_TABLE

To start, let’s run our loves.py file in the Python console so we can play around with the LOVES_TABLE value. Because LOVES_TABLE is a list of lists, where each inner list represents a row of the table, it’s easy to access a single row with list indexing:
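For example (a sketch of such a console session; row 0 is Breanna’s row):

```python
LOVES_TABLE = [
    [False, True, True, False],
    [False, True, True, True],
    [False, False, True, False],
    [False, False, True, True]
]

# Indexing with 0 gives the first inner list, i.e., Breanna's row of the table.
print(LOVES_TABLE[0])  # [False, True, True, False]
```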
From here, we can access individual elements of the table, which represent an individual value of the \(Loves(a, b)\) predicate.
>>> LOVES_TABLE[0][1]  # This is the (0, 1) entry in the table
True
>>> LOVES_TABLE[2][3]  # This is the (2, 3) entry in the table
False

In general, LOVES_TABLE[i][j] evaluates to the entry in row i and column j of the table. Finally, since the data is stored by rows, accessing columns is a little more work. To access column j, we can use a list comprehension to access the j-th element in each row:
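For example, taking j = 2 (Stanley’s column) as our own sample index:

```python
LOVES_TABLE = [
    [False, True, True, False],
    [False, True, True, True],
    [False, False, True, False],
    [False, False, True, True]
]

# Collect the element at index 2 from each row to form column 2.
column_2 = [row[2] for row in LOVES_TABLE]
print(column_2)  # [True, True, True, True] -- everyone loves Stanley!
```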
Now, let’s return to our Python file loves.py and define a version of our \(Loves\) predicate. First, we add two more constants to represent the sets \(A\) and \(B\), but using a dictionary to map names to their corresponding indices in LOVES_TABLE.
# In loves.py
LOVES_TABLE = [
    [False, True, True, False],
    [False, True, True, True],
    [False, False, True, False],
    [False, False, True, True]
]

A = {
    'Breanna': 0,
    'Malena': 1,
    'Patrick': 2,
    'Ella': 3
}

B = {
    'Sophia': 0,
    'Thelonious': 1,
    'Stanley': 2,
    'Laura': 3
}

Next, we define a loves predicate, which takes in two strings (note the preconditions) and returns whether person a loves person b. Note that because this function is defined in the same file as LOVES_TABLE, it can access that global constant in its body.
def loves(a: str, b: str) -> bool:
    """Return whether a loves b.

    Preconditions:
    - a in A
    - b in B

    >>> loves('Breanna', 'Sophia')
    False
    """
    a_index = A[a]
    b_index = B[b]
    return LOVES_TABLE[a_index][b_index]

Now that we’ve seen how to access individual entries, rows, and columns from the table, let’s turn to how we would represent the statements in predicate logic we’ve written in this section. First, we can express \(\forall a \in A,~ \forall b \in B,~ Loves(a, b)\) as the expression:

>>> all({loves(a, b) for a in A for b in B})
False
And similarly, we can express \(\exists a \in A,~ \exists b \in B,~ Loves(a, b)\) as the expression:

>>> any({loves(a, b) for a in A for b in B})
True
These two examples illustrate how Python’s all and any functions naturally enable us to express multiple quantifiers of the same type. But what about the expressions we looked at with alternating quantifiers? Consider \(\forall a \in A,~ \exists b \in B,~ Loves(a,b)\). It is possible to construct a nested expression that represents this one as well:

>>> all({any({loves(a, b) for b in B}) for a in A})
True
Though this is structurally equivalent to the statement in predicate logic, it’s syntactically longer and a bit harder to read. In general we try to avoid lots of nesting in expressions in programming, and a rule of thumb we’ll try to follow in this course is to never nest all/any calls. Instead, we can pull out the inner any into its own function, which not only reduces the nesting but makes it clearer what’s going on:
def loves_someone(a: str) -> bool:
    """Return whether a loves at least one person in B.

    Preconditions:
    - a in A
    """
    return any({loves(a, b) for b in B})

>>> all({loves_someone(a) for a in A})
True

Similarly, we can express the statement \(\exists b \in B,~ \forall a \in A,~ Loves(a,b)\) in two different ways. With a nested any/all:

>>> any({all({loves(a, b) for a in A}) for b in B})
True
And by pulling out the inner all expression into a named function:
def loved_by_everyone(b: str) -> bool:
    """Return whether b is loved by everyone in A.

    Preconditions:
    - b in B
    """
    return all({loves(a, b) for a in A})

>>> any({loved_by_everyone(b) for b in B})
True

Here is our final loves.py file, for you to play around with:
# In loves.py
LOVES_TABLE = [
    [False, True, True, False],
    [False, True, True, True],
    [False, False, True, False],
    [False, False, True, True]
]

A = {
    'Breanna': 0,
    'Malena': 1,
    'Patrick': 2,
    'Ella': 3
}

B = {
    'Sophia': 0,
    'Thelonious': 1,
    'Stanley': 2,
    'Laura': 3
}


def loves(a: str, b: str) -> bool:
    """Return whether a loves b.

    Preconditions:
    - a in A
    - b in B

    >>> loves('Breanna', 'Sophia')
    False
    """
    a_index = A[a]
    b_index = B[b]
    return LOVES_TABLE[a_index][b_index]


def loves_someone(a: str) -> bool:
    """Return whether a loves at least one person in B.

    Preconditions:
    - a in A
    """
    return any({loves(a, b) for b in B})


def loved_by_everyone(b: str) -> bool:
    """Return whether b is loved by everyone in A.

    Preconditions:
    - b in B
    """
    return all({loves(a, b) for a in A})

Before we wrap up, let us use our understanding of multiple quantifiers to express one of the more famous properties about prime numbers: “there are infinitely many primes.” Later on, we’ll actually prove this statement!
In Section 2.9, we saw how to express the fact that a single number \(p\) is a prime number, but how do we capture “infinitely many”? The key idea is that because primes are natural numbers, if there are infinitely many of them, then they have to keep growing bigger and bigger. Another way to think about this is to consider the statement “every prime number is less than 9000.” If this statement were True, then there could only be finitely many primes. So we can express the original statement as “every natural number has a prime number larger than it,” or in symbolic notation: \[\forall n \in \N,~ \exists p \in \N,~ p > n \land IsPrime(p).\]
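This statement also suggests a computation we can try (a sketch of our own; next_prime_after is a hypothetical helper, not a function from the text): using is_prime from earlier in this chapter, search for a prime larger than a given n.

```python
def divides3(d: int, n: int) -> bool:
    """Return whether d divides n."""
    if d == 0:
        return n == 0
    else:
        return n % d == 0


def is_prime(p: int) -> bool:
    """Return whether p is prime."""
    possible_divisors = range(1, p + 1)
    return (
        p > 1 and
        all({d == 1 or d == p for d in possible_divisors if divides3(d, p)})
    )


def next_prime_after(n: int) -> int:
    """Return the smallest prime greater than n (hypothetical helper).

    The statement above guarantees that such a prime always exists.
    """
    p = n + 1
    while not is_prime(p):
        p = p + 1
    return p
```

For instance, next_prime_after(10) evaluates to 11, and next_prime_after(0) to 2.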
If we wanted to express this statement without either the \(IsPrime\) or divisibility predicates, we would end up with an extremely cumbersome statement: \[\forall n \in \N,~ \exists p \in \N,~ p > n \land p > 1 \land \Big( \forall d \in \N,~ \left(\exists k \in \Z,~ p = kd\right) \Rightarrow d = 1 \lor d = p \Big).\]
This statement is terribly ugly, which is why we define our own predicates! Keep this in mind throughout the course: when you are given a statement to express, make sure you are aware of all of the relevant definitions, and make use of them to simplify your expression.
In this section, we’ve introduced the notion of lists within lists to represent tables of values for binary predicates. In the next chapter, we’ll start looking at tabular data and other forms of nested collections of data in more detail, and see how these more complex structures can be used to represent real-world data for our programs.
We’ve seen how Python can store collections of data, such as lists, sets, and dictionaries. Mostly, we’ve focused on collections of integers or strings. But what about collections of collections? We’ve actually encountered this already: our count_cancelled function had a parameter flights that was a dictionary whose values were lists, and we represented the \(Loves\) predicate as a list of lists, storing a two-dimensional table of booleans. In this section, we’ll look at using lists of lists to store more complex forms of tabular data, like a table from a spreadsheet, and at writing functions to perform computations on this data.
Let’s consider a real data set from the city of Toronto. This data shows information about how many marriage licenses were issued in Toronto at a particular location and month. The data is in a tabular format with four columns: id, civic centre, number of marriage licenses issued, and time period. Each row of the table tells us how many marriage licenses were issued by a civic centre in a specific time period; the id is simply a unique numerical identifier for each row. Suppose we wanted to answer the following question: What is the average number of marriage licenses issued by each civic centre?
| ID | Civic Centre | Marriage Licenses Issued | Time Period |
|---|---|---|---|
| 1657 | ET | 80 | January 1, 2011 |
| 1658 | NY | 136 | January 1, 2011 |
| 1659 | SC | 159 | January 1, 2011 |
| 1660 | TO | 367 | January 1, 2011 |
| 1661 | ET | 109 | February 1, 2011 |
| 1662 | NY | 150 | February 1, 2011 |
| 1663 | SC | 154 | February 1, 2011 |
| 1664 | TO | 383 | February 1, 2011 |
To write a program that uses this data, we must first decide on a way to store it. As we did with our \(Loves\) table of values, we’ll store this table as a list of lists, where each inner list represents one row of the table. Unlike our previous example, these lists won’t just store boolean values, so we need to determine what data type to use for each column, based on the sample data we have.
Based on the sample data, the id and marriage licenses columns contain integers, so we can use the int data type for them. The civic centre names are text, so we can use the str data type. Finally, the time period is a date, which we can represent using the date data type from Python’s datetime module. To review this date data type, check out 2.4 Importing Modules. With this in mind, let us see how we can store our data as a nested list. (In tutorial, you will explore how to load the data from a file into a nested list.)
>>> import datetime
>>> marriage_data = [
... [1657, 'ET', 80, datetime.date(2011, 1, 1)],
... [1658, 'NY', 136, datetime.date(2011, 1, 1)],
... [1659, 'SC', 159, datetime.date(2011, 1, 1)],
... [1660, 'TO', 367, datetime.date(2011, 1, 1)],
... [1661, 'ET', 109, datetime.date(2011, 2, 1)],
... [1662, 'NY', 150, datetime.date(2011, 2, 1)],
... [1663, 'SC', 154, datetime.date(2011, 2, 1)],
... [1664, 'TO', 383, datetime.date(2011, 2, 1)]
... ]
>>> len(marriage_data) # There are eight rows of data
8
>>> len(marriage_data[0]) # The first row has four elements
4
>>> [len(row) for row in marriage_data] # Every row has four elements
[4, 4, 4, 4, 4, 4, 4, 4]
>>> marriage_data[0]
[1657, 'ET', 80, datetime.date(2011, 1, 1)]
>>> marriage_data[1]
[1658, 'NY', 136, datetime.date(2011, 1, 1)]

We can see that by indexing the nested list marriage_data, a list is returned. Specifically, this list represents a row from our table. For each row, we can then access its id via index 0, its civic centre via index 1, and so on.
>>> marriage_data[0][0]
1657
>>> marriage_data[0][1]
'ET'
>>> marriage_data[0][2]
80
>>> marriage_data[0][3]
datetime.date(2011, 1, 1)

Suppose we want to see all of the different values from a single column of this table (e.g., all civic centres or marriage license numbers). We can retrieve a column by using a list comprehension:
>>> [row[1] for row in marriage_data] # The civic centre column
['ET', 'NY', 'SC', 'TO', 'ET', 'NY', 'SC', 'TO']

Or, using an identically-structured set comprehension, we can obtain all unique values in a column.
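For instance, here is a small standalone sketch of that set comprehension, using a three-row slice of the table:

```python
import datetime

# A small slice of the marriage license table, as a list of lists.
marriage_data = [
    [1657, 'ET', 80, datetime.date(2011, 1, 1)],
    [1658, 'NY', 136, datetime.date(2011, 1, 1)],
    [1661, 'ET', 109, datetime.date(2011, 2, 1)],
]

# Same structure as the list comprehension, but with braces:
# duplicate civic centre names collapse into a single set element.
unique_centres = {row[1] for row in marriage_data}  # {'ET', 'NY'}
```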
Using our knowledge of filtering using if conditions in comprehensions, we can retrieve all rows corresponding to a specific civic centre.
>>> [row for row in marriage_data if row[1] == 'TO']
[[1660, 'TO', 367, datetime.date(2011, 1, 1)], [1664, 'TO', 383, datetime.date(2011, 2, 1)]]

Or we can filter rows based on a threshold for the number of marriage licenses issued:
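For example, a sketch of such a filter (the cutoff of 150 here is our own illustrative choice, not from the data set):

```python
import datetime

# A small slice of the marriage license table.
marriage_data = [
    [1657, 'ET', 80, datetime.date(2011, 1, 1)],
    [1659, 'SC', 159, datetime.date(2011, 1, 1)],
    [1660, 'TO', 367, datetime.date(2011, 1, 1)],
]

# Keep only rows where at least 150 licenses were issued (column index 2).
busy_rows = [row for row in marriage_data if row[2] >= 150]
```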
Earlier, we asked the question: What is the average number of marriage licenses issued by each civic centre? The question implies a mapping of civic centre names to numbers (i.e., the average). This means we need to create a dictionary comprehension. Let’s start exploring in the Python console. Remember, we saw earlier that we can get all unique civic centre names in the data through a set comprehension.
>>> names = {row[1] for row in marriage_data}
>>> names
{'NY', 'TO', 'ET', 'SC'}
>>> {key: 0 for key in names}
{'NY': 0, 'TO': 0, 'ET': 0, 'SC': 0}

So far, we’ve created a dictionary where each key is a civic centre name and they all map to the value 0. To proceed, we need to be able to calculate the average number of marriage licenses issued per month by each civic centre.
Let’s try to do this just for the 'TO' civic centre first. We saw earlier how to get all rows for a specific civic centre, and to extract the values for a specific column. We’ll first combine these two operations to retrieve the number of marriage licenses issued by 'TO' each month.
>>> [row for row in marriage_data if row[1] == 'TO'] # The 'TO' rows
[[1660, 'TO', 367, datetime.date(2011, 1, 1)], [1664, 'TO', 383, datetime.date(2011, 2, 1)]]
>>> [row[2] for row in marriage_data if row[1] == 'TO'] # The 'TO' marriages issued
[367, 383]
>>> issued_by_TO = [row[2] for row in marriage_data if row[1] == 'TO']

So issued_by_TO is now a list containing the number of marriage licenses issued by the 'TO' civic centre. We can now calculate its average by dividing its sum by its length:
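Concretely, the calculation would look like this (using the two 'TO' values extracted above):

```python
issued_by_TO = [367, 383]  # the 'TO' license counts extracted above

# The average is the sum of the values divided by how many there are.
average = sum(issued_by_TO) / len(issued_by_TO)  # 375.0
```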
Excellent! Through our exploration, we managed to find the average number of marriage licenses issued by one specific civic centre. How can we merge this with our earlier dictionary comprehension? It’s quite a bit to keep in our head at once, and looks like it will quickly get messy. At this point, we should design a function to help us. Specifically, let’s design a function that calculates the average for only one civic centre. As input, we will need the dataset as well as the name of the civic centre we are querying.
def average_licenses_issued(data: list[list], civic_centre: str) -> float:
    """Return the average number of marriage licenses issued by civic_centre in data.

    Return 0.0 if civic_centre does not appear in the given data.

    Preconditions:
    - all({len(row) == 4 for row in data})
    - data is in the format described in Section 4.1
    """
    issued_by_civic_centre = [row[2] for row in data if row[1] == civic_centre]
    if issued_by_civic_centre == []:
        return 0.0
    else:
        total = sum(issued_by_civic_centre)
        count = len(issued_by_civic_centre)
        return total / count

Let’s test it to make sure we get the same result as before:
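That check might look like the following sketch, with the function from above and a two-row slice of the data:

```python
import datetime

def average_licenses_issued(data: list[list], civic_centre: str) -> float:
    """Return the average number of marriage licenses issued by civic_centre in data.

    Return 0.0 if civic_centre does not appear in the given data.
    """
    issued_by_civic_centre = [row[2] for row in data if row[1] == civic_centre]
    if issued_by_civic_centre == []:
        return 0.0
    else:
        return sum(issued_by_civic_centre) / len(issued_by_civic_centre)

marriage_data = [
    [1660, 'TO', 367, datetime.date(2011, 1, 1)],
    [1664, 'TO', 383, datetime.date(2011, 2, 1)],
]

result = average_licenses_issued(marriage_data, 'TO')   # 375.0, matching our console work
missing = average_licenses_issued(marriage_data, 'XX')  # 0.0: 'XX' is not in the data
```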
Finally, we can combine it with our previous dictionary comprehension by observing that 'TO' can be replaced with the key that is changing:
>>> {key: 0 for key in names}
{'NY': 0, 'TO': 0, 'ET': 0, 'SC': 0}
>>> {key: average_licenses_issued(marriage_data, key) for key in names}
{'NY': 143.0, 'TO': 375.0, 'ET': 94.5, 'SC': 156.5}

Now that we’ve done this exploration in the Python console, we can save our work by writing this as a function:
def average_licenses_by_centre(marriage_data: list[list]) -> dict[str, float]:
    """Return a mapping of the average number of marriage licenses issued at each civic centre.

    In the returned mapping:
    - Each key is the name of a civic centre
    - Each corresponding value is the average number of marriage licenses issued at
      that centre.

    Preconditions:
    - marriage_data is in the format described in Section 4.1
    """
    names = {'TO', 'NY', 'ET', 'SC'}
    return {key: average_licenses_issued(marriage_data, key) for key in names}

Up to this point, all the data we’ve worked with in Python have been stored in objects that are instances of the built-in types that come with Python, like ints and lists. Python’s built-in data types are powerful, but are not always the most intuitive way to store data. For example, we saw in 4.1 Tabular Data that we could use a list of lists to represent tabular data. One of the downsides of this approach is that when working with this data, the onus is on us to remember which list element corresponds to which component of the data.
>>> import datetime
>>> row = [1657, 'ET', 80, datetime.date(2011, 1, 1)]
>>> row[0] # The id
1657
>>> row[1] # The name of the civic centre
'ET'
>>> row[2] # The number of marriage licenses issued
80
>>> row[3] # The time period
datetime.date(2011, 1, 1)

You can imagine how error prone this might be. A simple “off by one” error for an index might retrieve a completely different data type. It also makes our code difficult to read; the reader must know what each index of the list represents. And, as more experienced programmers will tell you, readable code is crucial. “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” –Martin Fowler
So a row in our marriage license data set is made up of four data elements. It would be nice if, instead of indices, we could use a name that was reflective of each element. Certainly, we could use a dictionary (instead of a list) where the keys are strings. But there is a more robust option we’ll learn about in this section: creating our own data types.
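For comparison, that dictionary approach might look like this (a sketch; the key names are our own illustrative choices):

```python
import datetime

# One row of the table as a dictionary with descriptive string keys
# instead of positional indices.
row = {
    'id': 1657,
    'civic_centre': 'ET',
    'num_licenses': 80,
    'month': datetime.date(2011, 1, 1),
}

centre = row['civic_centre']   # no index position to remember
licenses = row['num_licenses']
```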
You might remember from Chapter 1 that in Python, another term for data type is a class. This is why type(3) evaluates to <class 'int'> in Python. The built-in data types we’ve studied so far illustrate how rich and complex data types can be. So for creating our own data types, we will first learn about the simplest kind of data type: a data class, which is a kind of class whose purpose is to bundle individual pieces of data into a single Python object.
For example, suppose we want to represent a “person” consisting of a given name, family name, age, and home address. We already know how to represent each individual piece of data: the given name, family name, and address could be strings, and the age could be a natural number. To bundle these values together, we could use a list or other built-in collection data type, but that approach would run into the issues we discussed above.
So instead, we define our own data class to create a new data type consisting of these four values. Here is the way to create a data class in Python:
from dataclasses import dataclass

@dataclass
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

Let’s unpack this definition.
from dataclasses import dataclass is a Python import statement that lets us use dataclass below.
@dataclass is a Python decorator. We’ve seen decorators before for function definitions; a decorator for a class definition works in the same way, acting as a modifier for our definition. In this case, @dataclass tells Python that the data type we’re defining is a data class, which we’ll explore the benefits of down below.
The line class Person: signals the start of a class definition. This is similar to function definitions, except we use the class keyword instead of def. The name of the class is Person.
The rest of the code is indented to put it inside of the class body.
The next line is a docstring that describes the purpose of the class.
Each remaining line (starting with given_name: str) defines a piece of data associated with the class; each piece of data is called an instance attribute of the class.
For each instance attribute, we write a name and a type annotation. This is similar to defining parameter names and types for functions, though of course the purposes are different.
In general, a data class definition in Python has the following syntax:
@dataclass
class <ClassName>:
    """Description of data class.
    """
    <attribute1>: <type1>
    <attribute2>: <type2>
    ...

Now that we’ve seen how to define a data class, we’re ready to put it to use. To create an instance of our Person data class, we write a Python expression that calls the data class, passing in as arguments the values for each instance attribute:
Pretty cool! That line of code creates a new Person object whose given name is 'David', family name is 'Liu', age is 100, and address is '40 St. George Street', and stores the object in the variable david. The type of this new value is, as we’d expect, Person:
If we ask Python to evaluate the Person object, we see the different pieces of data that have been bundled together:
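A self-contained sketch of these steps (creation, type, and evaluation), reusing the Person class defined above:

```python
from dataclasses import dataclass

@dataclass
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

# Calling the data class creates a new Person object.
david = Person('David', 'Liu', 100, '40 St. George Street')

kind = type(david).__name__  # 'Person'
# Evaluating the object shows each bundled piece of data.
display = repr(david)
```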
But from a Person object, how do we extract the individual values we bundled together? If we were using lists, we’d simply do list indexing: david[0], david[1], etc. The syntax for Python classes improves this because we can use the names of the instance attributes together with dot notation to access these values:
>>> david.given_name
'David'
>>> david.family_name
'Liu'
>>> david.age
100
>>> david.address
'40 St. George Street'

This is much more readable than list indexing, and this is one of the major advantages of using data classes over lists to represent custom data in Python.
One challenge when creating instances of our data classes is keeping track of which arguments correspond to which instance attributes. In the expression Person('David', 'Liu', 100, '40 St. George Street'), the order of the arguments must match the order the instance attributes are listed in the definition of the data class—and it’s our responsibility to remember this order. Think about how easy it would be for us to write Person('Liu', 'David', 100, '40 St. George Street'), only to discover much later in our program that we accidentally switched this poor fellow’s given and family names!
To solve this issue, Python enables us to create data class instances using keyword arguments to explicitly name which argument corresponds to which instance attribute, using the exact same format as the Person representation we saw above:
Not only is this more explicit, but using keyword arguments allows us to pass the values in any order we want:
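Here is a sketch of both points, reusing the Person class from above; data classes automatically support ==, which lets us confirm the two orderings build the same object:

```python
from dataclasses import dataclass

@dataclass
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

# Keyword arguments name each attribute explicitly...
david = Person(given_name='David', family_name='Liu',
               age=100, address='40 St. George Street')

# ...and may be passed in any order we want.
also_david = Person(age=100, address='40 St. George Street',
                    given_name='David', family_name='Liu')
```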
This is a great improvement for the readability of our code when we use data classes, especially as they grow larger. One potential downside that comes with this (and in general when being more explicit) is that this requires a bit more typing, and makes our code a little longer. You can get around the first issue by using auto-completion features (e.g., in PyCharm), and for the second issue you can put the different arguments on separate lines:
>>> david = Person(
... family_name='Liu',
... given_name='David',
... address='40 St. George Street',
... age=100
... )

Now that we have the ability to define our own data types, we need to decide how these data types will fit into our memory model. We’ll do this by using the representation that Python displays, formatted to show each instance attribute on a new line. For example, we would represent the david variable in a memory model as follows:
In the previous section, we learned about data classes, a way to define our own data types in Python. In this section, we’re going to study some more details about defining and designing data classes in our programs, and apply what we’ve learned to simplify some of the work we did with tabular data in 4.1 Tabular Data.
Before we begin, please take a moment to review the Person data class we developed in the previous section.
from dataclasses import dataclass

@dataclass
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

In our Person data class definition, we specify the type of each instance attribute. By doing so, we constrain the possible values that can be stored for these attributes. However, just as we saw with function type contracts, we don’t always want to allow every possible value of a given type for an attribute value.
For example, the age attribute for Person has a type annotation int, but we certainly would not allow negative integers to be stored here! Somehow, we’d like to record a second piece of information about this attribute: that age >= 0. This kind of constraint is called a representation invariant, since it is a predicate describing a condition on how we represent a person that must always be true—this condition never varies. The term invariant is used in a few different contexts in computer science; we’ll explore one other kind of invariant a bit later in this chapter. All attribute type annotations, like age: int, are representation invariants. However, we can express more general representation invariants as well, by adding them to the class docstring. Whenever possible, we write these as Python expressions rather than English, for a reason we’ll see in the next section.
Here is how we add non-type-annotation representation invariants in a class docstring:
@dataclass
class Person:
    """A custom data type that represents data for a person.

    Representation Invariants:
    - self.age >= 0
    """
    given_name: str
    family_name: str
    age: int
    address: str

One oddity with this definition is that we use self.age instead of age to refer to the instance attribute. This mimics how we access data type attributes using dot notation:
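For example, with a concrete Person instance (the same class as above, plus its invariant):

```python
from dataclasses import dataclass

@dataclass
class Person:
    """A custom data type that represents data for a person.

    Representation Invariants:
    - self.age >= 0
    """
    given_name: str
    family_name: str
    age: int
    address: str

david = Person('David', 'Liu', 100, '40 St. George Street')

# self.age in the docstring mirrors the dot notation we use on an instance:
satisfies_invariant = david.age >= 0
```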
In the class docstring, we use the variable name self to refer to a generic instance of the data class. Keep in mind that self here is used just in the class docstring. In the above example, the variable david would appear in our memory model, but self would not. This use of self is a strong Python convention, and we’ll return to other uses of self later on in this course.
Just as we saw with preconditions in 3.7 Function Specification, representation invariants are useful pieces of documentation for how a data class should be used. Like preconditions, representation invariants are assumptions that we make about values of a data type; for example, we can assume that every Person instance has an age that’s greater than or equal to zero.
Representation invariants are also constraints on how we can create a data class instance. Because it can be easy to miss or ignore a representation invariant buried in a class docstring, python_ta.contracts supports checking all representation invariants, just like it does with preconditions! Let’s add a check_all_contracts call to our Person example:
from dataclasses import dataclass

@dataclass
class Person:
    """A person with some basic demographic information.

    Representation Invariants:
    - self.age >= 0
    """
    given_name: str
    family_name: str
    age: int
    address: str

if __name__ == '__main__':
    import python_ta.contracts
    python_ta.contracts.DEBUG_CONTRACTS = False
    python_ta.contracts.check_all_contracts()

If we run the above file in the Python console, we’ll obtain an error whenever we attempt to instantiate a Person with invalid attributes.
>>> david = Person(
... given_name='David',
... family_name='Liu',
... age=-100,
... address='40 St. George Street')
Traceback (most recent call last):
File "<input>", line 1, in <module>
...
AssertionError: Representation invariant "self.age >= 0" violated.

Note: currently, python_ta is strict with the header Representation Invariants:. In particular, both “Representation” and “Invariants” must be capitalized (and spelled correctly). Please watch out for this, as otherwise any representation invariants you add will not be checked!
Just as how functions give us a way of organizing blocks of code to represent a computation, data classes give us a way of organizing pieces of data to represent an entity. In 2.5 The Function Design Recipe, we learned a structured approach to designing and implementing functions. There is an analogous Data Class Design Recipe, which you should use every time you want to create a new data type for a program. Note the similarities between the two recipes, such as the importance of naming and documentation.
To wrap up our introduction of data classes, let’s see how to apply data classes to the marriage license data set we studied in 4.1 Tabular Data.
| ID | Civic Centre | Marriage Licenses Issued | Time Period |
|---|---|---|---|
| 1657 | ET | 80 | January 1, 2011 |
| 1658 | NY | 136 | January 1, 2011 |
| 1659 | SC | 159 | January 1, 2011 |
| 1660 | TO | 367 | January 1, 2011 |
| 1661 | ET | 109 | February 1, 2011 |
| 1662 | NY | 150 | February 1, 2011 |
| 1663 | SC | 154 | February 1, 2011 |
| 1664 | TO | 383 | February 1, 2011 |
Recall that we represented the data as a list of lists:
>>> marriage_data = [
... [1657, 'ET', 80, datetime.date(2011, 1, 1)],
... [1658, 'NY', 136, datetime.date(2011, 1, 1)],
... [1659, 'SC', 159, datetime.date(2011, 1, 1)],
... [1660, 'TO', 367, datetime.date(2011, 1, 1)],
... [1661, 'ET', 109, datetime.date(2011, 2, 1)],
... [1662, 'NY', 150, datetime.date(2011, 2, 1)],
... [1663, 'SC', 154, datetime.date(2011, 2, 1)],
... [1664, 'TO', 383, datetime.date(2011, 2, 1)]
... ]

We implemented the following function to calculate the average number of marriage licenses issued by a particular civic centre:
def average_licenses_issued(data: list[list], civic_centre: str) -> float:
    """Return the average number of marriage licenses issued by civic_centre in data.

    Preconditions:
    - all({len(row) == 4 for row in data})
    - any({row[1] == civic_centre for row in data})
    """
    issued_by_civic_centre = [row[2] for row in data if row[1] == civic_centre]
    total = sum(issued_by_civic_centre)
    count = len(issued_by_civic_centre)
    return total / count

Here is how we will use data classes to simplify this approach. Rather than storing each row in the table as a list, we can instead introduce a new data class to store this information:
from dataclasses import dataclass
from datetime import date

@dataclass
class MarriageData:
    """A record of the number of marriage licenses issued in a civic centre in a given month.

    Instance Attributes:
    - id: a unique identifier for the record
    - civic_centre: the name of the civic centre
    - num_licenses: the number of licenses issued
    - month: the month these licenses were issued
    """
    id: int
    civic_centre: str
    num_licenses: int
    month: date

Then using this data class, we can represent tabular data as a list of MarriageData instances rather than a list of lists. Not much has changed! The values representing each entry in the table are the same, but how we “bundle” each row of data into a single entity is different.
>>> marriage_data = [
... MarriageData(1657, 'ET', 80, datetime.date(2011, 1, 1)),
... MarriageData(1658, 'NY', 136, datetime.date(2011, 1, 1)),
... MarriageData(1659, 'SC', 159, datetime.date(2011, 1, 1)),
... MarriageData(1660, 'TO', 367, datetime.date(2011, 1, 1)),
... MarriageData(1661, 'ET', 109, datetime.date(2011, 2, 1)),
... MarriageData(1662, 'NY', 150, datetime.date(2011, 2, 1)),
... MarriageData(1663, 'SC', 154, datetime.date(2011, 2, 1)),
... MarriageData(1664, 'TO', 383, datetime.date(2011, 2, 1))
... ]

And here is how we could modify our average_licenses_issued function.
def average_licenses_issued(data: list[MarriageData], civic_centre: str) -> float:
    """Return the average number of marriage licenses issued by civic_centre in data.

    Precondition:
    - any({row.civic_centre == civic_centre for row in data})
    """
    issued_by_civic_centre = [
        row.num_licenses for row in data if row.civic_centre == civic_centre
    ]
    total = sum(issued_by_civic_centre)
    count = len(issued_by_civic_centre)
    return total / count

Again, not much has changed: instead of writing row[1] and row[2], we instead write row.civic_centre and row.num_licenses. This is longer to write, but also more explicit in what attributes of the data are accessed. And to quote from the Zen of Python, explicit is better than implicit.
Earlier, we claimed that a data class is a better way of representing a bundle of data than a list. Let’s review a few reasons why:
- Attribute names like row.civic_centre make our code far more readable than positional indices like row[1].
- python_ta understands data class definitions, and will warn us if we try to create malformed Person values (e.g., wrong arguments to Person), or access invalid attributes.

Collections in Python can be used in many ways. We have already seen how we can use built-in aggregation functions (e.g., any, all, max) to perform computations across all elements of a collection (e.g., list, set).
But right now, we’re limited by what aggregation functions Python makes available to us: for example, there’s a built-in sum function, but no product function. That’s not exactly true: there is a math.prod function, but let’s ignore that here. :) So in this section, we’ll learn about the for loop, a compound statement that will allow us to implement our own custom aggregation functions across different types of collection data.
Suppose we wanted to write a function that computes the sum of a list of numbers, without using the built-in sum function.
def my_sum(numbers: list[int]) -> int:
    """Return the sum of the given numbers.

    >>> my_sum([10, 20, 30])
    60
    """

If we knew the size of numbers in advance, we could write a single expression to do this. For example, here is how we could implement my_sum if we knew that numbers always contained three elements:
def my_sum(numbers: list[int]) -> int:
    """Return the sum of the given numbers.

    >>> my_sum([10, 20, 30])
    60
    """
    return numbers[0] + numbers[1] + numbers[2]

But of course, this approach doesn’t work for general lists, when we don’t know ahead of time how many elements the input will have. We need a way to repeat the “+ numbers[_]” for an arbitrary number of list elements. Here is another way of writing our three-element code to pull out the exact statement that is repeated.
def my_sum(numbers: list[int]) -> int:
    """Return the sum of the given numbers.

    >>> my_sum([10, 20, 30])
    60
    """
    sum_so_far = 0
    sum_so_far = sum_so_far + numbers[0]
    sum_so_far = sum_so_far + numbers[1]
    sum_so_far = sum_so_far + numbers[2]
    return sum_so_far

This implementation follows how a human might add up the numbers in the list. First, we start a counter at 0 (using a variable called sum_so_far). Then, we use three assignment statements to update the value of sum_so_far by adding another element of numbers. Let’s look at the first such statement:

sum_so_far = sum_so_far + numbers[0]
This looks fairly straightforward, but is actually a big leap from the assignment statements we’ve studied before! What’s unusual about it is that for the first time, we are assigning a value to a variable that has already been given a value. This type of assignment statement is called a variable reassignment statement. This statement is especially tricky because the variable sum_so_far appears on both sides of the =. We can make sense of this statement by reviewing the evaluation order that Python follows when executing an assignment statement:
1. First, the expression on the right-hand side (sum_so_far + numbers[0]) is evaluated.
2. Then, the resulting value is assigned to the variable on the left-hand side (sum_so_far).

We can visualize how the three assignment statements work by tracing through an example. Let’s consider calling our doctest example, my_sum([10, 20, 30]). What happens to the value of sum_so_far?
| Statement | sum_so_far after executing statement | Notes |
|---|---|---|
| sum_so_far = 0 | 0 | |
| sum_so_far = sum_so_far + numbers[0] | 10 (0 + 10) | When evaluating the right-hand side, sum_so_far is 0 and numbers[0] is 10. |
| sum_so_far = sum_so_far + numbers[1] | 30 (10 + 20) | When evaluating the right-hand side, sum_so_far is 10 and numbers[1] is 20. |
| sum_so_far = sum_so_far + numbers[2] | 60 (30 + 30) | When evaluating the right-hand side, sum_so_far is 30 and numbers[2] is 30. |
Now that we understand this implementation, we can see that the statement sum_so_far = sum_so_far + numbers[_] is exactly what needs to be repeated for every element of the input list. So now, let’s learn how to perform repeated execution of Python statements.
In Python, the for loop is a compound statement that repeats a block of code once for each element in a collection. Here is the syntax of a for loop:

for <loop_variable> in <collection>:
    <body>

Notice that the syntax is very similar to a comprehension. The key difference is that a comprehension evaluates an expression once for each element in a collection, but a for loop evaluates a sequence of statements once per element.
There are three parts:
<collection> is an expression for a Python collection (e.g., a list or set).
<loop_variable> is a name for the loop variable that will refer to an element in the collection.
<body> is a sequence of one or more statements that will be repeatedly executed. This is called the body of the for loop. The statements within the loop body may refer to the loop variable to access the “current” element in the collection.
Just as we saw with if statements, the body of a for loop must be indented relative to the for keyword.
When a for loop is executed, the following happens:
The loop variable is assigned to the first element in the collection.
The loop body is executed, using the current value of the loop variable.
Steps 1 and 2 repeat for the second element of the collection, then the third, etc. until all elements of the collection have been assigned to the loop variable exactly once.
Each individual execution of the loop body is called a loop iteration.
As with if statements, for loops are a control flow structure in Python because they modify the order in which statements are executed—in this case, by repeating a block of code multiple times. The reason we use the term loop is because after the last statement in the loop body is executed, the Python interpreter “loops back” to the beginning of the for loop, assigning the loop variable to the next element in the collection.
my_sum and the accumulator pattern

Now let us see how to use a for loop to implement my_sum. We left off with the following block of repeated code:
sum_so_far = sum_so_far + numbers[0]
sum_so_far = sum_so_far + numbers[1]
sum_so_far = sum_so_far + numbers[2]

We can now move the repeated sum_so_far = sum_so_far + _ part into a for loop. (Notice our loop variable name! A good convention to follow is that collections have a pluralized name (numbers), and loop variables have the singular version of that name (number).)
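In code, that for loop looks like this (shown standalone with a sample list):

```python
numbers = [10, 20, 30]

sum_so_far = 0
# The repeated statement becomes the loop body, executed once per element.
for number in numbers:
    sum_so_far = sum_so_far + number
```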
One important thing to note is that we no longer need to use list indexing (numbers[_]) to access individual list elements. The for loop in Python handles the extracting of individual elements for us, so that our loop body can focus just on what to do with each element.
With this, we can now write our complete implementation of my_sum.
def my_sum(numbers: list[int]) -> int:
    """Return the sum of the given numbers.

    >>> my_sum([10, 20, 30])
    60
    """
    sum_so_far = 0
    for number in numbers:
        sum_so_far = sum_so_far + number
    return sum_so_far

Now, no matter how many elements numbers has, the loop body sum_so_far = sum_so_far + number will repeat once for each element. The ability to write a small amount of code that processes an arbitrary amount of data is one of the truly remarkable feats of computer science.
Because of the variable reassignment, sum_so_far is more complex than every other variable we have used so far in this course. And because this reassignment happens inside the loop body, it happens once for each element in the collection, not just once or twice. This frequent reassignment can make loops hard to reason about, especially as our loop bodies grow more complex, and so we will take some time now to introduce a formal process you can use to reason about loops in your code.
First, some terminology. We call the variable sum_so_far the loop accumulator. The purpose of a loop accumulator is to store an aggregated result based on the elements of the collection that have been previously visited by the loop. In the case of my_sum, the loop accumulator sum_so_far stores, well, the sum of the elements that we have seen so far in the loop. We can keep track of the execution of the different iterations of the loop in a tracing table consisting of three columns: how many iterations have occurred so far, the value of the loop variable for that iteration, and the value of the loop accumulator at the end of that iteration. We call this table a loop accumulation table. Here is the loop accumulation table for a call to my_sum([10, 20, 30]):
| Iteration | Loop variable (number) | Loop accumulator (sum_so_far) |
|---|---|---|
| 0 | N/A | 0 |
| 1 | 10 | 10 |
| 2 | 20 | 30 |
| 3 | 30 | 60 |
Almost every for loop has an accumulator variable, and later we’ll even see loops with more than one. To distinguish these from other variables, we recommend using the _so_far suffix in the variable name, and optionally adding a comment in your code explaining the purpose of the variable.
def my_sum(numbers: list[int]) -> int:
    """Return the sum of the numbers in numbers.

    >>> my_sum([10, 20, 30])
    60
    """
    # ACCUMULATOR sum_so_far: keep track of the running sum of the elements in numbers.
    sum_so_far = 0
    for number in numbers:
        sum_so_far = sum_so_far + number
    return sum_so_far

What happens if we call my_sum on an empty list?
Why does this happen? The key to understanding this is that when we loop over an empty collection, zero iterations occur and the loop body never executes. So when we call my_sum([]), first sum_so_far is assigned to 0, and then the for loop does not execute any code, and so 0 is returned. A key observation here is that when the collection is empty, the initial value of sum_so_far is returned.
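We can confirm this behaviour directly:

```python
def my_sum(numbers: list) -> int:
    """Return the sum of the given numbers."""
    sum_so_far = 0
    for number in numbers:
        sum_so_far = sum_so_far + number
    return sum_so_far

# Zero loop iterations occur, so the initial accumulator value comes back.
result = my_sum([])  # 0
```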
Our implementation of my_sum illustrates a more general pattern that we’ll employ when we use loops to perform an aggregation computation. Here is the accumulator pattern:
1. Before the loop, initialize an accumulator variable to a default value. Name it with the suffix _so_far to remind yourself that this is an accumulator.
2. Inside the loop body, update the accumulator by combining it with the current loop variable.
3. After the loop ends, return or use the accumulator.

Here is a code template to illustrate this pattern.
<x>_so_far = <default_value>
for element in <collection>:
<x>_so_far = ... <x>_so_far ... element ... # Somehow combine loop variable and accumulator
return <x>_so_far

Code templates are helpful when learning about programming techniques, as they give you a natural starting point in your code with “places to fill in”. However, as we’ll see over the next few sections, we should not blindly follow code templates either. Part of mastering a code template is deciding when to use it and when to modify it to solve the problem at hand.
Let’s use the accumulator pattern to implement the function product:
def product(numbers: list[int]) -> int:
"""Return the product of the given numbers.
>>> product([10, 20])
200
>>> product([-5, 4])
-20
"""
# ACCUMULATOR product_so_far: keep track of the product of the
# elements in numbers seen so far in the loop.
product_so_far = 1
for number in numbers:
product_so_far = product_so_far * number
return product_so_far

Notice how similar the code for product is to my_sum. In fact, disregarding the changes in variable names, the only changes are:

- the initial value of the accumulator (0 versus 1)
- the operation used to update the accumulator (+ versus *)

Because sets are collections, we can use for loops to iterate through the elements of a set as well. However, because sets are unordered, we cannot assume a particular order that the for loop will visit the elements in. So for loops over sets should only be used when the same result would be obtained regardless of the order of the elements. The aggregation functions we’ve looked at so far like sum satisfy this property.
Strings are very similar to lists because they are considered ordered sequences of data. Python treats a string as an ordered collection of characters (strings of length one), and so we can use a for loop with a string to iterate over its characters one at a time.
Here is an example of using a for loop to count the number of characters in a string.
def my_len(s: str) -> int:
"""Return the number of characters in s.
>>> my_len('David')
5
"""
# ACCUMULATOR len_so_far: keep track of the number of
# characters in s seen so far in the loop.
len_so_far = 0
for character in s:
len_so_far = len_so_far + 1
return len_so_far

Unlike my_sum, here we do not use the loop variable to update the accumulator len_so_far. This is because we don’t care what the actual value of character is; we are only counting iterations. In these scenarios, we can use an underscore _ in place of the name for the loop variable to communicate that the loop variable is not used in the body of the for loop:
def my_len(s: str) -> int:
"""Return the number of characters in s.
>>> my_len('David')
5
"""
# ACCUMULATOR len_so_far: keep track of the number of
# characters in s seen so far in the loop.
len_so_far = 0
for _ in s:
len_so_far = len_so_far + 1
return len_so_far

Python dictionaries are also iterable. Just like we saw with comprehensions, when we iterate over a dictionary, the loop variable refers to the key of each key-value pair. But of course, we can use the key to lookup its corresponding value in the dictionary.
For example, suppose we are given a dictionary mapping restaurant menu items (as strings) to their prices (as floats). Here is how we could calculate the sum of all the prices on the menu.
def total_menu_price(menu: dict[str, float]) -> float:
"""Return the total price of the given menu items.
>>> total_menu_price({'fries': 3.5, 'hamburger': 6.5})
10.0
"""
# ACCUMULATOR total_so_far: keep track of the total cost of
# all items in the menu seen so far in the loop.
total_so_far = 0.0
for item in menu:
total_so_far = total_so_far + menu[item]
return total_so_far

The loop variable item refers to the keys in the dictionary, so to access the corresponding prices we need to use a key lookup expression, menu[item]. Here is how we can visualize this using a loop accumulation table:
| Iteration | Loop variable (item) | Loop accumulator (total_so_far) |
|---|---|---|
| 0 | N/A | 0.0 |
| 1 | 'fries' | 3.5 |
| 2 | 'hamburger' | 10.0 |
One final note: like sets, dictionaries are unordered. We chose a particular order of keys for the loop accumulation table just to understand the loop behaviour, but we should not assume that this is the guaranteed order the keys would be visited. Just as with sets, only loop over dictionaries when your computation does not depend on the iteration order.
Iterable

Something you might notice about the two functions my_len and my_sum we’ve developed so far is that they actually work on more types than currently specified by their parameter type annotations. For example, my_len works just as well on lists, sets, and other collections. If we look at the function body, we don’t use the fact that s is a string at all—just that it can be iterated over. It would be nice if we could relax our type contract to allow for any collection argument value.
We say that a Python data type is iterable when its values can be used as the “collection” of a for loop, and that a Python object is iterable when it is an instance of an iterable data type. (This is equivalent to saying the value can be used as the “collection” of a comprehension.) You might wonder why Python doesn’t just call these “collections” instead. There is a technical reason that is beyond the scope of this course, but for our purposes, we’ll treat “iterable” and “collection” as synonymous. We can import the Iterable type from typing to indicate that an argument may be of any iterable data type. Here’s how we would write a more general my_len:
from typing import Iterable
def my_len(collection: Iterable) -> int:
"""Return the number of elements in collection.
>>> my_len('David')
5
>>> my_len([1, 2, 3])
3
>>> my_len({'a': 1000})
1
"""
len_so_far = 0
for _ in collection:
len_so_far = len_so_far + 1
return len_so_far

Notice that other than renaming a variable, we did not change the function body at all! This demonstrates how powerful the accumulator pattern can be; accumulators can work with any iterable object.
You may feel that several of the examples in this section are contrived. You are not wrong; we are trying to leverage your familiarity with the built-in functions to help introduce a new concept. You may also have noticed that there are other ways to solve some of the problems we’ve presented. For example, average_menu_price can be solved using comprehensions rather than loops:
def average_menu_price_v2(menu: dict[str, float]) -> float:
"""Return the average price of an item from the menu.
>>> average_menu_price_v2({'fries': 4.0, 'hamburger': 6.0})
5.0
"""
prices = [menu[item] for item in menu]
return sum(prices) / len(prices)

Indeed, you have performed remarkably complex computations up to this point using just comprehensions to filter and transform data, and Python’s built-in functions to aggregate this data. For loops provide an alternate approach to these comprehensions, offering a trade-off of code complexity vs. flexibility. Comprehensions and built-in functions are often shorter and more direct translations of a computation than for loops, but for loops allow us to customize exactly how filtering and aggregation occurs. A good rule of thumb to follow in this course is to use comprehensions and built-in functions when possible, and use loops when you really need a custom aggregation.
Of course, on your journey to learning programming it is important that you learn and master both of these techniques, and be able to translate between them when possible! Just as there are many ways to visualize a sunset (a painting, a photograph, a drawing, pixel art), so too are there many ways to implement a function. So whenever you see some code for a function involving comprehensions or loops, remember that you can always turn it into an additional learning opportunity by trying to rewrite it with a different approach.
In the last section we introduced for loops and the accumulator pattern. The examples we used all had very similar code, with some differences in the type of collection we iterated over and how we initialized and updated our accumulator variable. In this section, we’ll study two variations of the basic loop accumulator pattern: having multiple accumulator variables for the same loop, and using if statements to perform a conditional update of loop accumulators.
Before proceeding, please take a moment to review the loop accumulator pattern:
<x>_so_far = <default_value>
for element in <collection>:
<x>_so_far = ... <x>_so_far ... element ... # Somehow combine loop variable and accumulator
return <x>_so_far

In each example from the last section we used only one accumulator. The pattern can be extended to use multiple accumulators. For example, given a dictionary mapping menu items to prices, how can we get the average price? Remember that an average requires both the sum and the number of elements. We can create two accumulators to accomplish this:
def average_menu_price(menu: dict[str, float]) -> float:
"""Return the average price of an item from the menu.
>>> average_menu_price({'fries': 3.5, 'hamburger': 6.5})
5.0
"""
# ACCUMULATOR len_so_far: keep track of the number of
# items in the menu seen so far in the loop.
len_so_far = 0
# ACCUMULATOR total_so_far: keep track of the cost of
# all items in the menu seen so far in the loop.
total_so_far = 0.0
for item in menu:
len_so_far = len_so_far + 1
total_so_far = total_so_far + menu[item]
return total_so_far / len_so_far

Here is how we could write a loop accumulation table for this example:
| Iteration | Loop variable (item) | Accumulator len_so_far | Accumulator total_so_far |
|---|---|---|---|
| 0 | N/A | 0 | 0.0 |
| 1 | 'fries' | 1 | 3.5 |
| 2 | 'hamburger' | 2 | 10.0 |
Consider the following problem: given a string, count the number of vowels in the string.
def count_vowels(s: str) -> int:
"""Return the number of vowels in s.
>>> count_vowels('aeiou')
5
>>> count_vowels('David')
2
"""We saw in 4.4 Repeated Execution: For Loops that we could count every character in a given string by using an accumulator that increased by 1 for every loop iteration. We can use the same idea for counting just vowels, but we need to increase the accumulator only when the current character is a vowel.
In Chapter 3, we learned how to control execution of whole blocks of code using if statements. By nesting an if statement inside a for loop, we can adapt our accumulator pattern to only update the accumulator when certain conditions are met.
def count_vowels(s: str) -> int:
"""Return the number of vowels in s.
>>> count_vowels('aeiou')
5
>>> count_vowels('David')
2
"""
# ACCUMULATOR vowels_so_far: keep track of the number of vowels
# seen so far in the loop.
vowels_so_far = 0
for letter in s:
if letter in 'aeiou':
vowels_so_far = vowels_so_far + 1
return vowels_so_far

If s is the empty string, the for loop will not iterate even once and the value 0 is returned. This tells us that we have initialized our accumulator correctly. What about the loop body? There are two cases to consider:

- When letter is a vowel, the reassignment vowels_so_far = vowels_so_far + 1 increases the number of vowels seen so far by 1.
- When letter is not a vowel, nothing else happens in the current iteration because this if statement has no else branch. The vowel count remains the same.

Here’s our loop accumulation table for count_vowels('David'). At each iteration, the accumulator either stays the same (when letter is not a vowel) or increases by 1 (when letter is a vowel).
| Iteration | Loop variable (letter) | Accumulator vowels_so_far |
|---|---|---|
| 0 | N/A | 0 |
| 1 | 'D' | 0 |
| 2 | 'a' | 1 |
| 3 | 'v' | 1 |
| 4 | 'i' | 2 |
| 5 | 'd' | 2 |
We can also contrast this function to an equivalent implementation using a filtering comprehension:
def count_vowels(s: str) -> int:
"""Return the number of vowels in s.
>>> count_vowels('aeiou')
5
>>> count_vowels('David')
2
"""
return len([letter for letter in s if letter in 'aeiou'])

This version hopefully makes clear that the if letter in 'aeiou' in the loop version acts as a filter on the string s, causing the loop accumulator to only be updated for the vowels. In this version, the actual accumulation (vowels_so_far = vowels_so_far + 1) is handled by the call to len.
max

Now let’s consider implementing another built-in aggregation function: max. We’ll require that the input be non-empty, as we cannot compute the maximum element of an empty collection. This allows us to set the initial value of our accumulator based on the input.
def my_max(numbers: list[int]) -> int:
"""Return the maximum value of the numbers in numbers.
Preconditions:
- numbers != []
>>> my_max([10, 20])
20
>>> my_max([-5, -4])
-4
"""
# ACCUMULATOR max_so_far: keep track of the maximum value
# of the elements in numbers seen so far in the loop.
max_so_far = numbers[0]
for number in numbers:
if number > max_so_far:
max_so_far = number
return max_so_far

Because we can assume that the precondition holds when implementing my_max, we can access numbers[0] to set the initial value of max_so_far without worrying about getting an IndexError. In the loop, the accumulator max_so_far is updated only when a larger number is encountered (if number > max_so_far). Note that here, the term accumulator diverges from its normal English meaning. At any point during the loop, max_so_far is assigned to a single list element, not some “accumulation” of all list elements seen so far. Instead, max_so_far represents the maximum of the elements seen so far, and so what is being accumulated is a fact: “the elements seen so far are all <= max_so_far”.
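To check your understanding of this pattern, here is a hypothetical companion function my_min (not defined in the text above) that follows the same structure, with only the comparison flipped:

```python
def my_min(numbers: list[int]) -> int:
    """Return the minimum value of the numbers in numbers.

    Preconditions:
    - numbers != []
    """
    # ACCUMULATOR min_so_far: keep track of the minimum value
    # of the elements in numbers seen so far in the loop.
    min_so_far = numbers[0]
    for number in numbers:
        if number < min_so_far:
            min_so_far = number
    return min_so_far
```

The initial value again comes from numbers[0], which is safe only because of the non-empty precondition.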
In 3.2 Predicate Logic, we saw how to use any to check whether there exists a string in a collection that starts with the letter 'D':
def starts_with(strings: Iterable[str], char: str) -> bool:
"""Return whether one of the given strings starts with the character char.
Preconditions:
- all({s != '' for s in strings})
- len(char) == 1
>>> starts_with(['Hello', 'Goodbye', 'David', 'Dario'], 'D')
True
>>> starts_with(['Hello', 'Goodbye', 'David', 'Dario'], 'A')
False
"""
return any({s[0] == char for s in strings})

Our next goal is to implement this function without using the any function, replacing it with for loops and if statements. If we take a look at the argument to any above, we see some pretty big hints on how to do this:

- for s in strings can be used to create a for loop.
- s[0] == char can be used as a condition for an if statement.

Let’s give it a shot using our existing accumulator pattern. Because the result of the function is a bool, our accumulator will also be a bool. Its initial value will be False, which is the correct return value when strings is empty.
def starts_with_v2(strings: Iterable[str], char: str) -> bool:
"""..."""
# ACCUMULATOR starts_with_so_far: keep track of whether
# any of the strings seen by the loop so far starts with char.
starts_with_so_far = False
for s in strings:
...
return starts_with_so_far

How do we update the accumulator? We set it to True when the current string s starts with char, which is exactly the condition from the comprehension.
def starts_with_v2(strings: Iterable[str], char: str) -> bool:
"""..."""
# ACCUMULATOR starts_with_so_far: keep track of whether
# any of the strings seen by the loop so far starts with char.
starts_with_so_far = False
for s in strings:
if s[0] == char:
starts_with_so_far = True
return starts_with_so_far

Here is a loop accumulation table for starts_with_v2(['Hello', 'Goodbye', 'David', 'Mario'], 'D'). The third iteration assigns starts_with_so_far to True, while in the other iterations nothing occurs.
| Iteration | Loop variable (s) | Accumulator starts_with_so_far |
|---|---|---|
| 0 | N/A | False |
| 1 | 'Hello' | False |
| 2 | 'Goodbye' | False |
| 3 | 'David' | True |
| 4 | 'Mario' | True |
The function starts_with_v2 is correct and fits our accumulator pattern well. But you might have noticed that it performs unnecessary work because it must loop through every element of the collection before returning a result. Why is this unnecessary? Because we are interested only in whether there exists a string that starts with the given letter! As soon as the condition s[0] == char evaluates to True, we know that the answer is Yes without checking any of the remaining strings.
So the question is, how do we take advantage of this observation to make our code more efficient? We can use a return statement inside the body of the loop. Let’s revisit how we described the execution of a return statement in Chapter 2 (new emphasis in bold):
When a return statement is executed, the following happens:
- The <expression> is evaluated, producing a value.
- That value is then returned to wherever the function was called. No more code in the function body is executed after this point.
In all our functions so far, we have written return statements only at the end of our function bodies or branches of an if statement. This should make sense based on the behaviour described above: any code after a return statement will not execute!
But we can combine return statements with if statements to conditionally stop executing any more code in the function body. This is called short-circuiting or early returning.
So our first attempt at making a more efficient starts_with is to use an early return inside the if branch:
def starts_with_v3(strings: Iterable[str], char: str) -> bool:
"""..."""
for s in strings:
if s[0] == char:
return True

This for loop is strange: it seems we no longer have an accumulator variable! This is actually fairly common for functions that return booleans. Rather than accumulating a True/False value, it is often possible to directly return the literals True or False.
The starts_with_v3 implementation does successfully return True on our first doctest example during the third loop iteration (when s = 'David'), skipping the fourth iteration. However, this implementation will fail the second doctest example (when there are no strings that start with the given character in the collection). We have not explicitly stated what to return when none of the strings in strings starts with char. Actually, we have violated our own type contract because the function will implicitly return None in this scenario.
To fix it, we need to specify what to return if the loop stops without returning early—this occurs only when there are no strings that start with the given character, and so we return False.
def starts_with_v4(strings: Iterable[str], char: str) -> bool:
"""..."""
for s in strings:
if s[0] == char:
return True
return False

When working with early returns inside loops, students often have a tendency to write symmetric if-else branches, like the following:
def starts_with_v5(strings: Iterable[str], char: str) -> bool:
"""..."""
for s in strings:
if s[0] == char:
return True
else:
return FalseUnfortunately, while we emphasized symmetry earlier when writing functions with if statements, here symmetry is not desirable! With both the if and else branches containing an early return, the loop will only ever perform one iteration. That is, starts_with_v5 makes a decision about whether to return True or False just by examining the first string in the collection, regardless of what the other strings are. So if we consider starts_with_v5(['Hello', 'Goodbye', 'David', 'Mario'], 'D'), the only string to be visited in the loop is 'Hello', and False would be returned!
The lesson here is that existential searches are fundamentally asymmetric: your function can return True early as soon as it has found an element of the collection meeting the desired criterion, but to return False it must check every element of the collection.
Now let’s consider a dual problem to the previous one: given a collection of strings and a character, return whether all strings in the collection start with that letter. If we use the comprehension version of starts_with, this change is as simple as swapping the any for all:
def all_starts_with(strings: Iterable[str], char: str) -> bool:
"""Return whether all of the given strings start with the character char.
Preconditions:
- all({s != '' for s in strings})
- len(char) == 1
>>> all_starts_with(['Hello', 'Goodbye', 'David', 'Dario'], 'D')
False
>>> all_starts_with(['Drip', 'Drop', 'Dangle'], 'D')
True
"""
return all({s[0] == char for s in strings})

We can also use the accumulator pattern from starts_with_v2 to check every string. Now, our accumulator starts with the default value of True, and changes to False when the loop encounters a string that does not start with the given letter. Such a string acts as a counterexample to the statement “every string starts with the given character”.
def all_starts_with_v2(strings: Iterable[str], char: str) -> bool:
"""..."""
# ACCUMULATOR starts_with_so_far: keep track of whether
# all of the strings seen by the loop so far start with char.
starts_with_so_far = True
for s in strings:
if s[0] != char:
starts_with_so_far = False
return starts_with_so_far

And as before, we can also write this function using an early return, since we can return False as soon as a counterexample is found:
def all_starts_with_v3(strings: Iterable[str], char: str) -> bool:
"""..."""
for s in strings:
if s[0] != char:
return False
return True

Note that this code is very similar to starts_with_v4, except the condition has been negated and the True and False swapped. Existential and universal search are very closely related, and this is borne out by the similarities in these two functions. However, this also illustrates the fact that loops are more complex than using built-in functions and comprehensions: before, we could just swap any for all, but with loops we have to change a few different areas of the code to make this change.
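One way to see the close relationship between existential and universal search is through logic: “all strings start with char” is true exactly when “some string does not start with char” is false. The hypothetical version below (not one of the versions above) uses this equivalence to implement the universal check in terms of any:

```python
def all_starts_with_demorgan(strings: list[str], char: str) -> bool:
    """Return whether all of the given strings start with char.

    A sketch illustrating De Morgan's laws:
    all(p(s) for s in strings) is equivalent to
    not any(not p(s) for s in strings).
    """
    # There is no counterexample if and only if every string starts with char.
    return not any({s[0] != char for s in strings})
```

This also explains the asymmetry of early returns: finding a single counterexample lets an all-style loop return False immediately, just as finding a single witness lets an any-style loop return True immediately.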
We have learned a lot about collections so far.
The loops we have worked with so far are element-based, meaning the loop variable refers to a specific element in the collection. Though these loops are powerful, they have one limitation: they process each element of the collection independently of where it appears in the collection. In this section, we’ll see how we can loop through elements of index-based collections while keeping track of the current index. Looping by index enables us to solve more problems than looping by element alone, because we’ll be able to take into account where a particular element is in a collection in the loop body.
As in the previous section, before proceeding please take a moment to review the basic loop accumulator pattern:
<x>_so_far = <default_value>
for element in <collection>:
<x>_so_far = ... <x>_so_far ... element ... # Somehow combine loop variable and accumulator
return <x>_so_far

When we introduced for loops, we presented a my_sum implementation that showed the exact statement that is repeated:
def my_sum(numbers: list[int]) -> int:
"""Return the sum of the given numbers.
>>> my_sum([10, 20, 30])
60
"""
sum_so_far = 0
sum_so_far = sum_so_far + numbers[0]
sum_so_far = sum_so_far + numbers[1]
sum_so_far = sum_so_far + numbers[2]
return sum_so_far

Our eventual solution to the my_sum function used a loop variable, number, in place of the numbers[_] in the body. There is another solution if we observe that the indices being used start at 0 and increase by one on each iteration of the loop. On the last iteration, the index should be len(numbers) - 1. This sequence of numbers can be expressed using the range data type: range(0, len(numbers)). Based on this, let us use a different kind of for loop to implement my_sum:
def my_sum(numbers: list[int]) -> int:
"""Return the sum of the given numbers.
>>> my_sum([10, 20, 30])
60
"""
# ACCUMULATOR sum_so_far: keep track of the running sum of the elements in numbers.
sum_so_far = 0
for number in numbers:
sum_so_far = sum_so_far + number
return sum_so_far
def my_sum_v2(numbers: list[int]) -> int:
"""Return the sum of the given numbers.
>>> my_sum_v2([10, 20, 30])
60
"""
# ACCUMULATOR sum_so_far: keep track of the running sum of the elements in numbers.
sum_so_far = 0
for i in range(0, len(numbers)):
sum_so_far = sum_so_far + numbers[i]
return sum_so_far

Both my_sum and my_sum_v2 use the accumulator pattern, and in fact initialize and update the accumulator in the exact same way. But there are some key differences in how their loops are structured:

- The loop variable, number vs. i: number refers to an element of the list numbers (starting with the first element); i refers to an integer (starting at 0).
- The collection, a list vs. a range: for number in numbers causes the loop body to execute once for each element in numbers. for i in range(0, len(numbers)) causes the loop body to execute once for each integer in range(0, len(numbers)). Because the range “stop” argument is exclusive, these two versions both cause the same number of iterations, equal to the number of elements in numbers.
- The loop body: Since number refers to a list element, we can add it directly to the accumulator. Since i refers to where we are in the list, we access the corresponding list element using list indexing to add it to the accumulator.

In the case of my_sum, both our element-based and index-based implementations are correct. However, our next example illustrates a situation where the loop must know the index of the current element in order to solve the given problem.
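To make the indirection concrete, here is a small standalone sketch (not from the functions above) showing how an index-based loop recovers each element with list indexing:

```python
numbers = [10, 20, 30]
for i in range(0, len(numbers)):
    # i is the position; numbers[i] is the element at that position.
    print('Element at index', i, 'is', numbers[i])
    # prints 'Element at index 0 is 10', then indices 1 and 2
```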
Consider the following problem: given a string, count the number of times in the string two adjacent characters are equal. For example, the string 'look' has two adjacent 'o'’s, and the string 'David' has no repeated adjacent characters. The location of the characters matters; even though the string 'canal' has two 'a' characters, they are not adjacent.
Let’s use these examples to design our function:
def count_adjacent_repeats(string: str) -> int:
"""Return the number of times in the given string that two adjacent characters are equal.
>>> count_adjacent_repeats('look')
1
>>> count_adjacent_repeats('David')
0
>>> count_adjacent_repeats('canal')
0
"""Before we try to implement this function, let’s reason about how we might approach the problem. First, as this is a “counting” problem, a natural fit would be to use an accumulator variable repeats_so_far that starts at 0 and increases by 1 every time two adjacent repeated characters are found. We don’t know where the characters in the string may be repeated, so we must start at the beginning and continue to the end. In addition, we are comparing adjacent characters, so we need two indices every loop iteration:
| Comparison |
|---|
| string[0] == string[1] |
| string[1] == string[2] |
| string[2] == string[3] |
| … |
Notice that the indices to the left of the == operator start at 0 and increase by 1. Similarly, the indices to the right of the == operator start at 1 and increase by 1. Does this mean we need to use two for loops and two ranges? No. We should also notice that the index to the right of == is always larger than the left by 1, so we have a way of calculating the right index from the left index. Here is our first attempt.
def count_adjacent_repeats(string: str) -> int:
"""Return the number of repeated adjacent characters in string.
>>> count_adjacent_repeats('look')
1
>>> count_adjacent_repeats('David')
0
>>> count_adjacent_repeats('canal')
0
"""
# ACCUMULATOR repeats_so_far: keep track of the number of adjacent
# characters that are identical
repeats_so_far = 0
for i in range(0, len(string)):
if string[i] == string[i + 1]:
repeats_so_far = repeats_so_far + 1
return repeats_so_far

Unfortunately, if we attempt to run our doctest examples above, we don’t get the expected values. Instead, we get 3 IndexErrors, one for each example. Here is the error for the first failed example:
Failed example:
count_adjacent_repeats('look')
Exception raised:
Traceback (most recent call last):
File "path\to\Python\Python38\lib\doctest.py", line 1329, in __run
exec(compile(example.source, filename, "single",
File "<doctest __main__.count_adjacent_repeats[0]>", line 1, in <module>
count_adjacent_repeats('look')
File "path/to/functions.py", line 74, in count_adjacent_repeats
if string[i] == string[i + 1]:
IndexError: string index out of range
Conveniently, the error tells us what the problem is ('string index out of range'). It even tells us the line where the error occurs: if string[i] == string[i + 1]:. It is now our job to figure out why the line is causing an IndexError. The line indexes the parameter string using i and i + 1, so one of them must be causing the error.
Remember that given a string of length n, the valid indices are from 0 to n - 1. Now let’s look at our use of range: for i in range(0, len(string)). This means that i can take on the values 0 to n - 1, which seems to be in the correct bounds. But don’t forget, we also are indexing using i + 1! This is the problem: i + 1 can take on the values 1 to n, and n is not a valid index.
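As a quick sanity check of these bounds, here is a standalone sketch (not part of count_adjacent_repeats):

```python
s = 'look'  # length 4, so the valid indices are 0 through 3
n = len(s)

valid_indices = list(range(0, n))
print(valid_indices)  # prints [0, 1, 2, 3]

# In our buggy loop, i takes on values 0 through n - 1,
# so i + 1 takes on values 1 through n.
# Evaluating s[n] (here, s[4]) would raise an IndexError,
# because n is one past the last valid index.
```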
We can solve this bug by remembering our goal: to compare adjacent pairs of characters. For a string of length n, the last pair of characters is (string[n - 2], string[n - 1]), so our loop variable i only needs to go up to n - 2, not n - 1. Let’s look at the final solution:
def count_adjacent_repeats(string: str) -> int:
"""Return the number of repeated adjacent characters in string.
>>> count_adjacent_repeats('look')
1
>>> count_adjacent_repeats('David')
0
>>> count_adjacent_repeats('canal')
0
"""
# ACCUMULATOR repeats_so_far: keep track of the number of adjacent
# characters that are identical
repeats_so_far = 0
for i in range(0, len(string) - 1):
if string[i] == string[i + 1]:
repeats_so_far = repeats_so_far + 1
return repeats_so_far

Notice that we could not have implemented this function using an element-based for loop. Having for char in string would let us access the current character (char), but not the next character adjacent to char. To summarize, when we want to write a loop body that compares the current element with another based on their positions, we must use an index-based loop to keep track of the current index in the loop.
Index-based for loops can also be used to iterate over two collections in parallel using a single for loop. Consider the common mathematical problem: sum of products. In your linear algebra course you’ll learn about the inner product operation, which formalizes this idea.
For example, suppose we have two nickels, four dimes, and three quarters in our pocket. How much money do we have in total? To solve this, we must know the value of nickels, dimes, and quarters. Then we can use sum of products:
>>> money_so_far = 0.0
>>> money_so_far = money_so_far + 2 * 0.05 # Two nickels
>>> money_so_far = money_so_far + 4 * 0.10 # Four dimes
>>> money_so_far = money_so_far + 3 * 0.25 # Three quarters
>>> money_so_far
1.25

This looks very similar to our sum_so_far exploration from earlier. The main difference is that this time we are accumulating products using the * operator. To the left of the * operator, we have a count (e.g., the number of nickels, an int). To the right of the * operator, we have a coin value in dollars (e.g., how much a nickel is worth, a float). We can store this information in two same-sized lists. Let’s design a function that uses these two lists to tell us how much money we have:
def count_money(counts: list[int], denoms: list[float]) -> float:
"""Return the total amount of money for the given coin counts and denominations.
counts stores the number of coins of each type, and denoms stores the
value of each coin type. Each element in counts corresponds to the element at
the same index in denoms.
Preconditions:
- len(counts) == len(denoms)
>>> count_money([2, 4, 3], [0.05, 0.10, 0.25])
1.25
"""Before using a loop, let’s investigate how we would implement this using a comprehension. We need to multiply each corresponding element of counts and denoms, and add the results:
We can generate each of these products by using range: We used len(counts), but could have used len(denoms) as well because of the function’s precondition.
And we can then compute the sum of these products by using the built-in Python sum function:
def count_money(counts: list[int], denoms: list[float]) -> float:
    """Return the total amount of money for the given coin counts and denominations.

    counts stores the number of coins of each type, and denoms stores the
    value of each coin type. Each element in counts corresponds to the element at
    the same index in denoms.

    Preconditions:
    - len(counts) == len(denoms)

    >>> count_money([2, 4, 3], [0.05, 0.10, 0.25])
    1.25
    """
    return sum([counts[i] * denoms[i] for i in range(0, len(counts))])

This implementation of count_money has all the necessary ingredients that would appear in an equivalent for loop. Here is our alternate implementation of count_money using a for loop, with the same structure as my_sum from 4.4 Repeated Execution: For Loops.
def count_money(counts: list[int], denoms: list[float]) -> float:
    """...
    """
    # ACCUMULATOR money_so_far: keep track of the total money so far.
    money_so_far = 0.0
    for i in range(0, len(counts)):
        money_so_far = money_so_far + counts[i] * denoms[i]
    return money_so_far

We have seen two forms of for loops. The first version, the element-based for loop, takes the form for <loop_variable> in <collection>. This is useful when we want to process each element in the collection without knowing about its position in the collection. The second version, the index-based for loop, takes the form for <loop_variable> in <range>. In index-based for loops, the range must consist of valid indices for the collection we wish to loop over. (We'll see one more example use of index-based loops later in this chapter.) We have seen two situations where this is useful:
- Comparing adjacent elements in a collection (as in count_adjacent_repeats).
- Iterating over two collections in parallel (as in count_money), using the same index for both lists.

You might have noticed from our my_sum example that index-based for loops are more powerful than element-based for loops: given the current index, we can always access the current collection element, but not vice versa. So why don't we just always use index-based for loops? Two reasons: first, not all collections can be indexed (think set and dict); and second, index-based for loops introduce a level of indirection to our code. In our my_sum_v2 example, we had to access the current element using list indexing (numbers[i]), while in my_sum, we could directly access the element by using the loop variable (number). So it's important to understand when we can use element-based for loops vs. index-based for loops, as the former makes our code easier to write and understand.
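To make the contrast concrete, here is a small side-by-side sketch of the two loop forms (the numbers list is a made-up example; both loops visit the same elements, accumulated here with list concatenation as in earlier sections):

```python
numbers = [10, 20, 30]

# Element-based: direct access to each element via the loop variable
elements_seen = []
for number in numbers:
    elements_seen = elements_seen + [number]

# Index-based: the same elements, accessed indirectly through their indices
indices_seen = []
for i in range(0, len(numbers)):
    indices_seen = indices_seen + [numbers[i]]
```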
When we introduced for loops, we said that the loop body consists of one or more statements. We saw in 4.5 For Loop Variations that we could put if statements inside loop bodies. In this section, we'll see that a for loop body can itself contain another for loop, since for loops are themselves statements. We'll study uses of these nested for loops, and also draw comparisons between them and comprehensions from the previous chapter.
Nested loops are particularly useful when dealing with nested data. As a first example, suppose we have a list of lists of integers:
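The example list itself seems to have been dropped during formatting; based on the doctest that follows, it was presumably something like:

```python
lists_of_numbers = [[1, 2, 3], [10, -5], [100]]
```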
Our goal is to compute the sum of all of the elements of this list:
def sum_all(lists_of_numbers: list[list[int]]) -> int:
    """Return the sum of all the numbers in the given lists_of_numbers.

    >>> sum_all([[1, 2, 3], [10, -5], [100]])
    111
    """

We can start with our basic loop accumulator pattern:
def sum_all(lists_of_numbers: list[list[int]]) -> int:
    """..."""
    # ACCUMULATOR sum_so_far: keep track of the running sum of the numbers.
    sum_so_far = 0
    for ... in lists_of_numbers:
        sum_so_far = ...
    return sum_so_far

The difference between this function and my_sum from 4.4 is that here our loop variable in for ... in lists_of_numbers does not refer to a single number, but rather a list of numbers:
def sum_all(lists_of_numbers: list[list[int]]) -> int:
    """..."""
    # ACCUMULATOR sum_so_far: keep track of the running sum of the numbers.
    sum_so_far = 0
    for numbers in lists_of_numbers:  # numbers is a list of numbers, not a single number!
        sum_so_far = ...
    return sum_so_far

So here is one way of completing this function, by using the built-in sum function:
def sum_all(lists_of_numbers: list[list[int]]) -> int:
    """..."""
    # ACCUMULATOR sum_so_far: keep track of the running sum of the numbers.
    sum_so_far = 0
    for numbers in lists_of_numbers:  # numbers is a list of numbers, not a single number!
        sum_so_far = sum_so_far + sum(numbers)
    return sum_so_far

This implementation is structurally similar to the my_sum implementation we had in Section 4.4. But how would we implement this function without using sum? For this we need another for loop:
def sum_all(lists_of_numbers: list[list[int]]) -> int:
    """..."""
    # ACCUMULATOR sum_so_far: keep track of the running sum of the numbers.
    sum_so_far = 0
    for numbers in lists_of_numbers:  # numbers is a list of numbers, not a single number!
        for number in numbers:  # number is a single number
            sum_so_far = sum_so_far + number
    return sum_so_far

We say that the for number in numbers loop is nested within the for numbers in lists_of_numbers loop. What happens when we call our doctest example, sum_all([[1, 2, 3], [10, -5], [100]])? Let's break this down step by step.
First, the assignment statement sum_so_far = 0 executes, creating our accumulator variable.
The outer loop is reached.
The loop variable numbers is assigned the first element in lists_of_numbers, which is [1, 2, 3].
Then, the body of the outer loop is executed. Its body is just one statement: the inner for loop, for number in numbers.
The inner loop variable number is assigned the first value in numbers, which is 1.
The inner loop body gets executed, updating the accumulator. sum_so_far is reassigned to 1 (since 0 + 1 == 1).
The inner loop iterates twice more, for number = 2 and number = 3. Notice that numbers is the same value ([1, 2, 3]) for this entire part. At each iteration, the accumulator is updated, first by adding 2 and then 3. At this point, sum_so_far = 6 (0 + 1 + 2 + 3).
After all three iterations of the inner loop occur, the inner loop stops. The Python interpreter is done executing this statement.
The next iteration of the outer loop occurs; numbers is assigned to the list [10, -5].
Again, the body of the outer loop occurs.
number = 10 and number = -5. sum_so_far is reassigned twice more, with a final value of 11 (6 + 10 + -5).

The outer loop iterates one more time, for numbers = [100].
Again, the body of the outer loop occurs.
number = 100. sum_so_far is reassigned to 111 (11 + 100).

At last, there are no more iterations of the outer loop, and so it stops.
After the outer loop is done, the return statement executes, returning the value of sum_so_far, which is 111.
Whew, that’s a lot of writing! We can summarize the above behaviour by creating a loop accumulation table. Note that the table below has the same structure as the ones we’ve seen before, but is more complex because its columns include both the outer and inner loop variables and iterations. The accumulator column shows the value of sum_so_far at the end of the iteration of the inner loop. Pay close attention to the order of the rows, as this matches the order of execution we described above.
| Outer loop iteration | Outer loop variable (numbers) | Inner loop iteration | Inner loop variable (number) | Accumulator (sum_so_far) |
|---|---|---|---|---|
| 0 | | | | 0 |
| 1 | [1, 2, 3] | 0 | | 0 |
| 1 | [1, 2, 3] | 1 | 1 | 1 |
| 1 | [1, 2, 3] | 2 | 2 | 3 |
| 1 | [1, 2, 3] | 3 | 3 | 6 |
| 2 | [10, -5] | 0 | | 6 |
| 2 | [10, -5] | 1 | 10 | 16 |
| 2 | [10, -5] | 2 | -5 | 11 |
| 3 | [100] | 0 | | 11 |
| 3 | [100] | 1 | 100 | 111 |
Our next example illustrates how to use nested loops on two different collections, obtaining all pairs of possible values from each collection. If that sounds familiar, well, it should be!
def product(set1: set, set2: set) -> set[tuple]:
    """Return the Cartesian product of set1 and set2.

    >>> result = product({10, 11}, {5, 6, 7})
    >>> result == {(10, 5), (10, 6), (10, 7), (11, 5), (11, 6), (11, 7)}
    True
    """

Before we get to writing any loops at all, let's remind ourselves how we would write a comprehension to compute the Cartesian product:
>>> set1 = {10, 11}
>>> set2 = {5, 6, 7}
>>> result = {(x, y) for x in set1 for y in set2}
>>> result == {(10, 5), (10, 6), (10, 7), (11, 5), (11, 6), (11, 7)}
True

Now we'll see how to write this using a nested for loop:
def cartesian_product(set1: set, set2: set) -> set[tuple]:
    """Return the Cartesian product of set1 and set2.

    >>> result = cartesian_product({10, 11}, {5, 6, 7})
    >>> result == {(10, 5), (10, 6), (10, 7), (11, 5), (11, 6), (11, 7)}
    True
    """
    # ACCUMULATOR product_so_far: keep track of the tuples from the pairs
    # of elements visited so far.
    product_so_far = set()
    for x in set1:
        for y in set2:
            product_so_far = set.union(product_so_far, {(x, y)})
    return product_so_far

As we saw in our first example, here the inner loop for y in set2 iterates through every element of set2 for every element x in set1. You can visualize this in the following loop accumulation table:
| Outer loop iteration | Outer loop var (x) | Inner loop iteration | Inner loop var (y) | Accumulator (product_so_far) |
|---|---|---|---|---|
| 0 | | | | set() |
| 1 | 10 | 0 | | set() |
| 1 | 10 | 1 | 5 | {(10, 5)} |
| 1 | 10 | 2 | 6 | {(10, 5), (10, 6)} |
| 1 | 10 | 3 | 7 | {(10, 5), (10, 6), (10, 7)} |
| 2 | 11 | 0 | | {(10, 5), (10, 6), (10, 7)} |
| 2 | 11 | 1 | 5 | {(10, 5), (10, 6), (10, 7), (11, 5)} |
| 2 | 11 | 2 | 6 | {(10, 5), (10, 6), (10, 7), (11, 5), (11, 6)} |
| 2 | 11 | 3 | 7 | {(10, 5), (10, 6), (10, 7), (11, 5), (11, 6), (11, 7)} |
Another way of visualizing the return value is:
{
(10, 5), (10, 6), (10, 7), # First three tuples are from the first iteration of the outer loop
(11, 5), (11, 6), (11, 7) # Next three tuples are from the second iteration of the outer loop
}

Both the sum_all and cartesian_product examples we've seen so far have used a single accumulator that is updated inside the inner loop body. However, each loop can have its own accumulator (and in fact, more than one accumulator). This is more complex, but offers more flexibility than a single accumulator does alone.
As an example, suppose we have a list of lists of integers called grades. Each element of grades corresponds to a course and contains a list of grades obtained in that course. Let’s see an example of the data:
>>> grades = [
... [70, 75, 80], # ENG196
... [70, 80, 90, 100], # CSC110
... [80, 100] # MAT137
... ]

Notice how the list of grades for course ENG196 does not have the same length as the lists for CSC110 or MAT137. Our goal is to return a new list containing the average grade of each course. We saw in Section 4.5 how to use loops to calculate the average of a collection of numbers:
from typing import Iterable


def average(numbers: Iterable[int]) -> float:
    """Return the average of a collection of integers.

    Preconditions:
    - len(numbers) > 0
    """
    # ACCUMULATOR len_so_far: keep track of the number of elements seen so far in the loop.
    len_so_far = 0
    # ACCUMULATOR total_so_far: keep track of the total of the elements seen so far in the loop.
    total_so_far = 0
    for number in numbers:
        len_so_far = len_so_far + 1
        total_so_far = total_so_far + number
    return total_so_far / len_so_far

We can calculate a list of averages for each course using a comprehension. (Exercise: write a precondition expression to guarantee there are no empty lists in grades.)
def course_averages_v1(grades: list[list[int]]) -> list[float]:
    """Return a new list for which each element is the average of the grades
    in the inner list at the corresponding position of grades.

    >>> course_averages_v1([[70, 75, 80], [70, 80, 90, 100], [80, 100]])
    [75.0, 85.0, 90.0]
    """
    return [average(course_grades) for course_grades in grades]

We can translate this into a for loop using a list accumulator variable and list concatenation for the update:
def course_averages_v2(grades: list[list[int]]) -> list[float]:
    """Return a new list for which each element is the average of the grades
    in the inner list at the corresponding position of grades.

    >>> course_averages_v2([[70, 75, 80], [70, 80, 90, 100], [80, 100]])
    [75.0, 85.0, 90.0]
    """
    # ACCUMULATOR averages_so_far: keep track of the averages of the lists
    # visited so far in grades.
    averages_so_far = []
    for course_grades in grades:
        course_average = average(course_grades)
        averages_so_far = averages_so_far + [course_average]
    return averages_so_far

Now let's see how to calculate the course_average variable for each course by using an inner loop instead of the average function. We can do this by expanding the definition of average directly in the loop body, with just a few minor tweaks:
def course_averages_v3(grades: list[list[int]]) -> list[float]:
    """Return a new list for which each element is the average of the grades
    in the inner list at the corresponding position of grades.

    >>> course_averages_v3([[70, 75, 80], [70, 80, 90, 100], [80, 100]])
    [75.0, 85.0, 90.0]
    """
    # ACCUMULATOR averages_so_far: keep track of the averages of the lists
    # visited so far in grades.
    averages_so_far = []
    for course_grades in grades:
        # ACCUMULATOR len_so_far: keep track of the number of elements seen so far in course_grades.
        len_so_far = 0
        # ACCUMULATOR total_so_far: keep track of the total of the elements seen so far in course_grades.
        total_so_far = 0
        for grade in course_grades:
            len_so_far = len_so_far + 1
            total_so_far = total_so_far + grade

        course_average = total_so_far / len_so_far
        averages_so_far = averages_so_far + [course_average]
    return averages_so_far

It may be surprising to you that we can do this! Just as in the last chapter we saw that we can take a predicate and expand it into its definition, we can do the same thing for Python functions with multiple statements in their body. The only change we needed to make was to the return statement of average. The original function had the statement return total_so_far / len_so_far. Because our loop assigns this return value to course_average, we changed the code to:

course_average = total_so_far / len_so_far
One important note about the structure of this nested loop is that the inner loop accumulators are initialized inside the body of the outer loop, rather than at the top of the function body. This is because the accumulators len_so_far and total_so_far are specific to course_grades, which changes at each iteration of the outer loop. The statements len_so_far = 0 and total_so_far = 0 act to "reset" these accumulators for each new course_grades list.
Let’s take a look at our final loop accumulation table in this section, which illustrates the execution of course_averages_v3([[70, 75, 80], [70, 80, 90, 100], [80, 100]]) and how each loop variable and accumulator changes. Please take your time studying this table carefully—it isn’t designed to be a “quick read”, but to really deepen your understanding of what’s going on!
| Outer loop iteration | Outer loop variable (course_grades) | Inner loop iteration | Inner loop variable (grade) | Inner accumulator (len_so_far) | Inner accumulator (total_so_far) | Outer accumulator (averages_so_far) |
|---|---|---|---|---|---|---|
| 0 | | | | | | [] |
| 1 | [70, 75, 80] | 0 | | 0 | 0 | [] |
| 1 | [70, 75, 80] | 1 | 70 | 1 | 70 | [] |
| 1 | [70, 75, 80] | 2 | 75 | 2 | 145 | [] |
| 1 | [70, 75, 80] | 3 | 80 | 3 | 225 | [75.0] |
| 2 | [70, 80, 90, 100] | 0 | | 0 | 0 | [75.0] |
| 2 | [70, 80, 90, 100] | 1 | 70 | 1 | 70 | [75.0] |
| 2 | [70, 80, 90, 100] | 2 | 80 | 2 | 150 | [75.0] |
| 2 | [70, 80, 90, 100] | 3 | 90 | 3 | 240 | [75.0] |
| 2 | [70, 80, 90, 100] | 4 | 100 | 4 | 340 | [75.0, 85.0] |
| 3 | [80, 100] | 0 | | 0 | 0 | [75.0, 85.0] |
| 3 | [80, 100] | 1 | 80 | 1 | 80 | [75.0, 85.0] |
| 3 | [80, 100] | 2 | 100 | 2 | 180 | [75.0, 85.0, 90.0] |
Nested for loops are a powerful tool in the Python programming language, but they are by far the most complex and error-prone code structure that we've studied so far. Just as we saw with nested expressions and nested if statements, nested loops have the potential to greatly increase the size and complexity of our code. Contrast the implementation of course_averages_v3 against course_averages_v2 (or course_averages_v1), for example.
While nested loops are sometimes inevitable or convenient, we recommend the following guidelines to simplify your use of nested loops and help you better understand your code:
- Use a single accumulator that is updated inside the inner loop body when possible (as in sum_all and cartesian_product).
- When the inner loop's work is already done by a built-in function (like sum or len), use the built-in function instead.
- When an inner loop needs its own accumulators (as in course_averages_v3), move the accumulator and inner loop into a new function, and call that function from within the original outer loop.

So far, we have largely treated objects and variables in Python as being constant over time: once an object is created or a variable is initialized, its value does not change during the program. This property has made it easier to reason about our code: once we set the value of a variable, we can easily look up its value at any later point in the program. Indeed, this is a fact that we take for granted in mathematics: if we say "let \(x = 10\)" in a calculation or proof, we expect \(x\) to keep that same value from start to finish!
However, in programs it is sometimes useful to have objects and variables change value over time. We saw one example of this last week when we studied for loops, in which both the loop variable and accumulator take on multiple values over the course of running the loop. In this section, we’ll introduce two related but distinct actions in a program: variable reassignment and object mutation.
Recall that a statement of the form ___ = ___ is called an assignment statement, which takes a variable name on the left-hand side and an expression on the right-hand side, and assigns the value of the expression to the variable.
A variable reassignment is a Python action that assigns a value to a variable that already refers to a value. The most common kind of variable reassignment is with an assignment statement:
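The reassignment example itself did not survive formatting; a minimal sketch consistent with the next paragraph (x goes from referring to 1 to referring to 5) would be:

```python
x = 1  # x refers to an object representing the number 1
x = 5  # reassignment: x now refers to an object representing 5
```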
A variable reassignment changes which object a variable refers to. In the above example, variable x changes from referring to an object representing the number 1 to an object representing 5.
The loops that we studied last week all used variable reassignment to update the accumulator variable inside the loop.
def my_sum(nums: list[int]) -> int:
    sum_so_far = 0
    for num in nums:
        sum_so_far = sum_so_far + num
    return sum_so_far

At each iteration, the statement sum_so_far = sum_so_far + num did two things:
1. Evaluated the right-hand side expression (sum_so_far + num) using the current value of sum_so_far, obtaining a new object.
2. Reassigned sum_so_far to refer to that new object.

This is the Python mechanism that causes sum_so_far to refer to the total sum at the end of the loop, which of course was the whole point of the loop! Indeed, updating loop accumulators is one of the most natural uses of variable reassignment.
This loop actually illustrates another common form of variable reassignment: reassigning the loop variable to a different value at each for loop iteration. For example, when we call my_sum([10, 20, 30]), the loop variable num gets assigned to the value 10, then the value 20, and then the value 30.
Consider the following Python code snippet:
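The snippet itself is missing from the text; reconstructing it from the surrounding discussion (y ends up as 3, and x is reassigned to 7 on line 3), it was presumably something like:

```python
x = 1      # line 1: x refers to 1
y = x + 2  # line 2: y refers to 3
x = 7      # line 3: x is reassigned to 7
```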
Here, the variable x is reassigned to 7 on line 3. But what happens to y? Does it now also get “reassigned” to 9 (which is 7 + 2), or does it stay at its original value 3?
We can express Python’s behaviour here with one simple rule: variable reassignment only changes the immediate variable being reassigned, and does not change any other variables or objects, even ones that were defined using the variable being reassigned. And so in the above example, y still refers to the value 3, even after x is reassigned to 7.
This rule might seem a bit strange at first, but is actually the simplest way that Python could execute variable reassignment: it allows programmers to reason about these assignment statements in a top-down order, without worrying that future assignment statements could affect previous ones. If we're tracing through our code carefully and read y = x + 2, we can safely predict the value of y based on the current value of x, without worrying about how x might be reassigned later in the program.
That said, there is one complication with this line of reasoning that comes up with the next form of “value change”, object mutation.
In 4.7 Nested Loops, we saw how product could help us calculate the Cartesian product by accumulating all possible pairs of elements in a set. Consider a function that similarly accumulates values, this time in a list:
def squares(nums: list[int]) -> list[int]:
    """Return a list of the squares of the given numbers."""
    squares_so_far = []
    for num in nums:
        squares_so_far = squares_so_far + [num * num]
    return squares_so_far

Both the squares and product functions work properly, but are rather inefficient. (We'll study what we mean by "inefficient" more precisely later in this course.) In squares, each loop iteration creates a new list object (a copy of the current list plus one more element at the end) and reassigns squares_so_far to it. It would be easier (and faster) if we could somehow reuse the same object and modify it by adding elements to it; the same applies to other collection data types like set and dict as well.
In Python, object mutation (often shortened to just mutation) is an operation that changes the value of an existing object. For example, Python's list data type contains several methods that mutate the given list object rather than create a new one. Here's how we could improve our squares implementation by using list.append, a method that adds a single value to the end of a list. (Check out Appendix A.2 Python Built-In Data Types Reference for a list of methods, including mutating ones, for lists, sets, dictionaries, and more.)
def squares(nums: list[int]) -> list[int]:
    """Return a list of the squares of the given numbers."""
    squares_so_far = []
    for num in nums:
        list.append(squares_so_far, num * num)
    return squares_so_far

Now, squares runs by assigning squares_so_far to a single list object before the loop, and then mutating that list object at each loop iteration. The outward behaviour is the same, but this code is more efficient because a bunch of new list objects are not created. To use the terminology from before, squares_so_far is not reassigned; instead, the object that it refers to gets mutated.
One final note: you might notice that the loop body calls list.append without an assignment statement. This is because list.append returns None, a special Python value that indicates “no value”. Just as we explored previously with the print function, list.append has a side effect that it mutates its list argument, but does not return anything.
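We can check this in the interpreter:

```python
numbers = [1, 2, 3]
result = list.append(numbers, 4)
print(result)   # None: append returns no value
print(numbers)  # [1, 2, 3, 4]: the list itself was mutated
```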
We say that a Python data type is mutable when it supports at least one kind of mutating operation, and immutable if it does not. Sets, lists, and dictionaries are all mutable data types, as are the data classes we studied in the previous chapter. All of the non-collection types we’ve studied—int, float, bool, and str—are immutable.
Instances of an immutable data type cannot change their value during the execution of a Python program. So for example, if we have an object representing the number 3 in Python, that object's value will always be 3. But remember, a variable that refers to this object might be reassigned to a different object later. This is why it is important that we differentiate between variables and objects!
list vs. tuple, and what's in a set

All the way back in 1.3 Representing Data in Python, we introduced two Python data types that could be used to represent ordered sequences, list and tuple. We've been using them fairly interchangeably for the past few chapters, but are now ready to discuss the difference between them. In Python, a list is mutable, but a tuple is immutable. For example, we can modify a list value by adding an element with list.append, but there is no equivalent tuple.append, nor any other mutating method on tuples.
So why bother with tuples at all? Because in Python, sets may only contain immutable objects, and dicts may only contain immutable keys. So for example, we cannot have a set of sets or set of lists in Python, but we can have a list of lists, which is why we studied nested lists in the last chapter.
Of course, from a theoretical standpoint a set can have elements that are other sets! So this restriction is a quirk of Python’s built-in data types that we just have to live with when using this programming language. In case you’re curious, there is another Python data type, frozenset, which is an immutable version of set. We just won’t be using it in this course.
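A quick interpreter experiment illustrates the restriction that sets may only contain immutable elements (the exact wording of the error message may vary across Python versions):

```python
coords = {(1, 2), (3, 4)}  # a set of tuples: allowed, since tuples are immutable

try:
    nested = {[1, 2], [3, 4]}  # a set of lists: raises TypeError
    could_make_set_of_lists = True
except TypeError:
    could_make_set_of_lists = False
```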
Variable reassignment and object mutation are distinct concepts. Reassignment changes which object a variable refers to, sometimes creating a brand new object (e.g., when we used a list accumulator in squares). Object mutation changes the object itself, independent of what variable(s) refer to that object.
Yet we have presented them here in the same section because they share a fundamental similarity: they both result in variables changing values over the course of a program. To illustrate this point, consider the following hypothetical function definition:
def my_function(...) -> ...:
    x = 10
    y = [1, 2, 3]
    ...  # Many lines of code
    ...  # Many lines of code
    ...  # Many lines of code
    ...  # Many lines of code
    ...  # Many lines of code
    ...  # Many lines of code
    return x * len(y) + ...

We've included, for effect, a large omitted "middle" section of the function body, showing only the initialization of two local variables at the start of the function and a final return statement at the end of the function.
If the omitted code does not contain any variable reassignment or object mutation, then we can be sure that in the return statement, x still refers to 10 and y still refers to [1, 2, 3], regardless of what other computations occurred in the omitted lines! In other words, without reassignment and mutation, these assignment statements are universal across the function body: “for all points in the body of my_function, x == 10 and y == [1, 2, 3].” Such universal statements make our code easier to reason about, as we can determine the values of these variables from just the assignment statement that creates them.
Variable reassignment and object mutation weaken this property. For example, if we reassign x or y (e.g., x = 100) in the middle of the function body, the return statement obtains a different value for x than 10. Similarly, if we mutate y (e.g., list.append(y, 100)), the return statement obtains a different value for y than [1, 2, 3]. Introducing reassignment and mutation makes our code harder to reason about, as we need to track all changes to variable values line by line.
Because of this, you should avoid using variable reassignment and object mutation when possible, and when you do use them, prefer structured code patterns like the loop accumulator pattern we saw earlier. Over the course of this chapter, we'll study other situations where reassignment and mutation are useful, and introduce a new memory model to help us keep track of changing variable values in our code.
In the last section, we introduced the concept of mutable data types, and saw how we could mutate Python lists with the list.append method. In this section, we’ll survey some of the other ways of mutating lists and other mutable Python data types. For a full reference of Python’s mutating methods on these data types, please see Appendix A.2 Python Built-In Data Types Reference.
list.append, list.insert, and list.extend

In addition to list.append, there are two other ways of adding new items to a Python list. The first is list.insert, which takes a list, an index, and an object, and inserts the object into the list at the given index.
>>> strings = ['a', 'b', 'c', 'd']
>>> list.insert(strings, 2, 'hello') # Insert 'hello' into strings at index 2
>>> strings
['a', 'b', 'hello', 'c', 'd']

The second is list.extend, which takes two lists and adds all items from the second list to the end of the first list, as if append were called once per element of the second list.
>>> strings = ['a', 'b', 'c', 'd']
>>> list.extend(strings, ['CSC110', 'CSC111'])
>>> strings
['a', 'b', 'c', 'd', 'CSC110', 'CSC111']

There is one more way to put a value into a list: by overwriting the element stored at a specific index. Given a list lst, we've seen that we can access specific elements using the indexing syntax lst[0], lst[1], lst[2], etc. We can also use this kind of expression as the left side of an assignment statement to mutate the list by modifying a specific index.
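For example (reusing the strings list from above):

```python
strings = ['a', 'b', 'c', 'd']
strings[2] = 'hello'  # overwrite the element at index 2
print(strings)        # ['a', 'b', 'hello', 'd']
```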
Note that unlike list.insert, assigning to an index removes the element previously stored at that index from the list!
Python sets are mutable. Because they are unordered, they are simpler than lists, and offer just two main mutating methods: set.add and set.remove, which (as you can probably guess) add and remove an element from a set, respectively. (list also provides a few mutating methods that remove elements, though we do not cover them in this section.) We'll illustrate set.add by showing how to re-implement our squares function from the previous section with set instead of list:
def squares(numbers: set[int]) -> set[int]:
    """Return a set containing the squares of all the given numbers.
    ...
    """
    squares_so_far = set()
    for n in numbers:
        set.add(squares_so_far, n * n)
    return squares_so_far

Note that set.add will only add the element if the set does not already contain it, as sets cannot contain duplicates. In addition, sets are unordered, whereas list.append always adds the element to the end of the sequence.
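A quick check of the no-duplicates behaviour (a made-up example, not from the original text):

```python
squares_so_far = {1, 4, 9}
set.add(squares_so_far, 4)  # 4 is already present: the set is unchanged
print(squares_so_far == {1, 4, 9})  # True
```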
The most common ways for dictionaries to be mutated are by adding a new key-value pair, or by changing the value associated with an existing key. Neither uses a function call; instead, both use the same syntax as assigning to a list index.
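The example referred to by the next paragraph appears to be missing; a reconstruction consistent with that description (the second assignment statement adds the key 'c' with value 3) would be:

```python
items = {'a': 1, 'b': 2}
items['c'] = 3  # adds a new key-value pair to items
```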
The second assignment statement adds a new key-value pair to items, with the key being 'c' and the value being 3. In this case, the left-hand side of the assignment is not a variable but instead an expression representing a component of items, in this case the key 'c' in the dictionary. When this assignment statement is evaluated, the right-hand side value 3 is stored in the dictionary items as the corresponding value for 'c'.
Assignment statements in this form can also be used to mutate the dictionary by taking an existing key-value pair and replacing the value with a different one. Here’s an example of that:
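The example here seems to have been lost; a minimal sketch of replacing a value for an existing key (the names and values are illustrative):

```python
items = {'a': 1, 'b': 2, 'c': 3}
items['b'] = 20  # replace the value associated with the existing key 'b'
```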
Python data classes are mutable by default. (Technically there is a way to create immutable data classes, but this is beyond the scope of this course.) To illustrate this, we'll return to our Person class:
from dataclasses import dataclass


@dataclass
class Person:
    """A person with some basic demographic information.

    Representation Invariants:
    - self.age >= 0
    """
    given_name: str
    family_name: str
    age: int
    address: str
>>> p = Person('David', 'Liu', 100, '40 St. George Street')
>>> p.age = 200
>>> p
Person(given_name='David', family_name='Liu', age=200, address='40 St. George Street')

One note of caution here: as you start mutating data class instances, you must always remember to respect the representation invariants associated with that data class. For example, setting p.age = -1 would violate the Person representation invariant. To protect against this, python_ta checks representation invariants whenever you assign to attributes of data classes, as long as the python_ta.contracts.check_all_contracts function has been called in your file.
In 1.4 Storing Data in Variables, we introduced the value-based memory model to help keep track of variables and their values:
| Variable | Value |
|---|---|
| distance1 | 1.118033988749895 |
| distance2 | 216.14809737770074 |
From this table we can surmise that there are two variables (distance1 and distance2), each associated with a float value. However, now that we know about reassignment and mutation, a more complex memory model is needed: the object-based memory model, which we'll simply call the Python memory model, as this is the "standard" representation of how Python stores data.
Recall that every piece of data is stored in a Python program in an object. But how are the objects themselves stored? Every computer program (whether written in Python or some other language) stores data in computer memory, which you can think of as a very long list of storage locations. Each storage location is labelled with a unique memory address. In Python, every object we use is stored in computer memory at a particular location, and it is the responsibility of the Python interpreter to keep track of which objects are stored at which memory locations.
As programmers, we cannot control which memory addresses are used to store objects, but we can access a representation of this memory address using the built-in id function:
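For example (the exact integer printed will differ from run to run, so we don’t show a specific value):

```python
x = 3
x_id = id(x)   # a unique int identifying the object x refers to
print(x_id)    # the exact value varies from run to run
```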
Formally, we define the id of a Python object as a unique int identifier that refers to this object. The details of how Python translates memory addresses into these integers are not important to us. Every object in Python has three important properties—id, value, and type—but of these three, only its id is guaranteed to be unique.
In Python, a variable is not an object and so does not actually store data; variables store an id that refers to an object that stores data. We also say that variables contain the id of an object. This is the case whether the data is something very simple like an int or more complex like a str. To make this distinction between variables and objects clear, we separate them into different parts of the Python memory model.
As an example, consider this code:
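A minimal pair of assignments consistent with the table below:

```python
x = 3
word = 'bonjour'
```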
In our value-based memory model we would have represented these variables in a table:
| Variable | Value |
|---|---|
| x | 3 |
| word | 'bonjour' |
With the full object-based Python memory model, we instead draw one table-like structure on the left showing the mapping between variables and object ids, and then the objects on the right. Each object is represented as a box, with its id in the upper-left corner, type in the upper-right corner, and value in the middle. The actual object id reported by the id function has many digits, and its true value isn’t important; we just need to know that each object has a unique identifier. So for our drawings we make up short identifiers such as id92.

So there is no 3 inside the box for variable x. Instead, there is the id of an object whose value is 3. The same holds for variable word; it references an object whose value is 'bonjour'.
Notice that we didn’t draw any arrows. Programmers often draw an arrow when they want to show that one thing references another. This is great once you are very confident with a language and how references work. But in the early stages, you are much more likely to make correct predictions if you write down references (you can just make up id values) rather than arrows.
You’ve written code much more complex than what’s above, but now that we have the full Python memory model, we can understand a few more details of fundamental Python operations. These details are foundational for writing and debugging the more complex code you will work on this year. So let’s pause for a moment and be explicit about two things.
Evaluating an expression. First, we said earlier that evaluating any Python expression produces a value. We now know that it is more precise to say that evaluating any Python expression produces an id of an object representing the value of the expression. Exactly what this object is depends on the kind of expression evaluated:
- For a literal expression like 176.4 or 'hello', Python creates an object of the appropriate type to hold the value. The expression produces the id of that object.
- For a variable name, Python checks whether a variable with that name exists. If it does not, a NameError is raised. If it does exist, the expression produces the id stored in that variable.
- For an expression involving an operator like + or %, first Python evaluates the expression’s two operands and applies the operator to the resulting values, creating a new object of the appropriate type to hold the resulting value. The expression produces the id of the new object.

Assignment statements. Second, we said earlier that an assignment statement is executed by first evaluating the right-hand side expression, and then storing it in the left-hand side variable. Here is a more precise version of what happens:

1. The right-hand side expression is evaluated, producing the id of an object.
2. That id is stored in the left-hand side variable.
So far, the only objects we’ve looked at in the Python memory model are instances of primitive data types. What about compound data types like collections and data classes? Now that we have our object-based memory model, we are in a position to truly understand how Python represents these data types. An instance of a compound data type does not store values directly; instead, it stores the ids of other objects.
Let’s see what this means for some familiar collection data types.
Lists. Here is an object-based memory model diagram showing the state of memory after executing lst = [1, 2, 3].

Notice that there are four separate objects in this diagram: one for each of the ints 1, 2, and 3, and then one for the list itself. This illustrates one of the trade-offs with the Python memory model. It is more accurate than our value-based memory model, but that accuracy comes at the cost of having more parts, which makes diagrams more time-consuming to create.
Sets. Here is an object-based memory model diagram showing how Python represents the set my_set = {1, 2, 3}.

Dictionaries. Here is an object-based memory model diagram showing the dictionary my_dict = {'a': 1, 'b': 2}. There are five objects in total!

Data classes. All Python data classes are compound data types, and instances also store the ids of other objects. Unlike the collection data types we looked at above, these ids are not bundled in a collection, but are instead each associated with a particular instance attribute. Here is how we represent our favourite Person object.

You may have noticed one difference between how we drew the object boxes of the primitive vs. compound data types above. We will use the convention of drawing a double box around objects that are immutable. Think of it as signifying that you can’t get in there and change anything.
Our last topic in this section will be to use our object-based memory model to visualize variable reassignment and object mutation in Python.
Consider this simple case of variable reassignment:
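A sketch of such a reassignment:

```python
s = [1, 2]       # s refers to a list object with value [1, 2]
s = ['a', 'b']   # reassignment: s now refers to a different list object
```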
Here is what our memory model looks like after the first and second lines execute:
| Before reassignment | After reassignment |
|---|---|
| *(memory model diagram)* | *(memory model diagram)* |
Using this diagram, we can see what happens when we execute the reassignment s = ['a', 'b']: a new list object ['a', 'b'] is created, and variable s is assigned the id of the new object. The original list object [1, 2] is not mutated. Variable reassignment does not mutate any objects; instead, it changes what a variable refers to. We can see this in the interpreter by using the id function to tell what object s refers to before and after the reassignment:
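A sketch of that check (the actual id values will differ from run to run; we keep an extra reference to the original list only so that its memory can’t be reused for the new one, which keeps the id comparison reliable):

```python
s = [1, 2]
original = s          # keep the original list object alive
id_before = id(s)
s = ['a', 'b']        # reassignment: s refers to a new object
id_after = id(s)
# id_before != id_after, and the original list is unchanged
```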
Notice that the ids are different, indicating that s refers to a new object.
Contrast this with using a mutating list method like list.append:
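For example (a sketch):

```python
x = [1, 2]
list.append(x, 3)   # equivalent to x.append(3); mutates the list object
# x is now [1, 2, 3]
```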
| Before mutation | After mutation |
|---|---|
| *(memory model diagram)* | *(memory model diagram)* |
In this case, no new list object is created, though a new int object is. Instead, the list object [1, 2] is mutated, and a third id is added at its end. Note that even changing the list’s size doesn’t change its id! Again, we can verify that x refers to the same list object by inspecting ids:
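A sketch of that verification:

```python
x = [1, 2]
id_before = id(x)
list.append(x, 3)   # mutation: same object, new contents
id_after = id(x)
# id_before == id_after: the same list object was mutated
```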
And finally, one last example that blends assignment and mutation: assigning to part of a compound data type. Consider this code:
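The code in question:

```python
s = [1, 2]
s[1] = 300   # assign to an element of the list
# s is now [1, 300]
```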
What happens in this case?
| Before mutation | After mutation |
|---|---|
| *(memory model diagram)* | *(memory model diagram)* |
The statement s[1] = 300 is also a form of reassignment, but rather than reassigning a variable, it reassigns an id that is part of an object. This means that this statement does mutate an object, and doesn’t reassign any variables. We can verify that the id of s doesn’t change after the index assignment.
Through our new object-based memory model, we’ve seen that the Python interpreter associates each variable with the id of an object. There is nothing stopping two or more variables from containing the same id, which means that two variables can refer to the same object. This causes some interesting situations when more than one variable refers to the same mutable object. In this section, we will use our memory model to better understand this specific (and common) situation.
Let v1 and v2 be Python variables. We say that v1 and v2 are aliases when they refer to the same object. The word “alias” is commonly used when a person is also known under a different name. For example, we might say “Eric Blair, alias George Orwell.” We have two names for the same thing, in this case a person.
Consider the following Python code:
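The code (ids will vary from run to run; the same snippet, with id calls, appears again later in this section):

```python
x = [1, 2, 3]
y = [1, 2, 3]   # a separate object with an equal value
z = x           # z refers to the same object as x
```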
x and z are aliases, as they both refer to the same object, and as a result they have the same id. You should think of the assignment statement z = x as saying “make z refer to the object that x refers to.”
In contrast, x and y are not aliases. They each refer to a list object with [1, 2, 3] as its value, but they are two different list objects, stored separately in your computer’s memory. This is again reflected in their different ids.
Here is the state of memory after the code executes:

Aliasing is often a source of confusion for programmers because it allows “mutation at a distance”: the modification of a variable’s value without explicitly mentioning that variable. Here’s an example:
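A sketch of the example:

```python
x = [1, 2, 3]
z = x           # z is an alias of x
z[0] = -999     # mutate the object z refers to
# x is now [-999, 2, 3] as well
```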
The statement z[0] = -999 mutates the value of z. But without ever mentioning x, it also mutates the value of x!
Imprecise language can lead us into misunderstanding the code. We said above that “the third line mutates the value of z”. To be more precise, the third line mutates the object that z refers to. Of course we can also say that it mutates the object that x refers to—they are the same object.

The key thing to notice about this example is that just by looking at the line of code, z[0] = -999, you can’t tell that x has changed. You need to know that on a previous line, z was made an alias of x. This is why you have to be careful when aliasing occurs.
Contrast the previous code with this sequence of statements instead:
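A sketch of the contrasting sequence:

```python
x = [1, 2, 3]
y = [1, 2, 3]   # a separate list object with an equal value
y[0] = -999     # mutates only the object y refers to
# x is still [1, 2, 3]
```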
Can you predict the value of x on the last line? Here, the third line mutates the object that y refers to, but because it is not the same object that x refers to, we still see [1, 2, 3] if we evaluate x. Here’s the state of memory after these lines execute:

What if we did this instead?
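A sketch using tuples (the particular new value assigned to z is our own choice):

```python
x = (1, 2, 3)
z = x              # z is an alias of x
z = (-999, 2, 3)   # reassignment: z now refers to a brand-new tuple
# x is still (1, 2, 3)
```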
Again, we have made x and z refer to the same object. So when we change z on the third line, does x also change? This time, the answer is an emphatic no, and it is because of the kind of change we make on the third line. Instead of mutating the object that z refers to, we reassign z to refer to a new object. This can have no effect on the object that x refers to (or on any other object). Even if we switched the example from using immutable tuples to using mutable lists, x would be unchanged.
Given two aliases x and z, if we reassign x to a new object, that has no effect on z. We say that reassigning x breaks the aliasing, as afterwards x and z no longer refer to the same object, and so are no longer aliases.
In Chapter 4, we saw two types of loops: element-based and index-based for loops. With index-based loops, the loop variable referred to an integer object that could be used as an index to a collection (typically a list). But in element-based for loops, the loop variable is an alias to one of the objects within the collection. Suppose we have the following element-based for loop:
>>> numbers = [5, 6, 7]
>>> for number in numbers:
...     number = number + 1
...
>>> numbers
[5, 6, 7]

Notice how the values in the list numbers did not change (i.e., the for loop did not mutate numbers). This is because the loop variable number is an alias for the integer objects found inside numbers. The assignment statement inside the for loop simply changes what the loop variable refers to, but does not change the contents of the list numbers. If we would like to increment each object contained in the list, we must use an index-based for loop:
>>> numbers
[5, 6, 7]
>>> for i in range(0, len(numbers)):
...     numbers[i] = numbers[i] + 1
...
>>> numbers
[6, 7, 8]

The assignment statement in the index-based for loop is fundamentally different from the assignment statement in the element-based for loop. Statements of the form <name> = _______ reassign the variable <name> to a new value. But assignment statements of the form <name>[<index>] = ______ mutate the list object that <name> currently refers to.
Let’s look one more time at this code:
>>> x = [1, 2, 3]
>>> y = [1, 2, 3]
>>> z = x
>>> id(x)
4401298824
>>> id(y)
4404546056
>>> id(z)
4401298824

What if we wanted to see whether x and y, for instance, were the same? Well, we’d need to define precisely what we mean by “the same”. Our familiar == operator checks whether two objects have the same value. This is called value equality.
But there is another Python operator, is, which checks whether two objects have the same id. This is called identity equality.
Identity equality is a stronger property than value equality: for all objects a and b, if a is b then a == b. In Python it is technically possible to change the behaviour of == in unexpected ways (like always returning False), but this is a poor programming practice and we won’t consider it in this course. The converse is not true, as we see in the above example: a == b does not imply a is b.
Aliasing also exists for immutable data types, but in this case there is never any “action at a distance”, precisely because immutable values can never change. In the example below, x and z are aliases of a tuple object. It is impossible to modify x’s value by mutating the object z refers to, since we can’t mutate tuples at all.
>>> x = (1, 2, 3)
>>> z = x
>>> z[0] = -999
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

The above discussion actually has a very interesting implication for how we reason about variables referring to immutable objects: if two variables have the same immutable value, the program’s behaviour does not depend on whether the two variables are aliases or not.
For example, consider the following two code snippets:
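A sketch of the two snippets; my_function here is a placeholder we’ve made up (the claim below holds for any function), and the second snippet builds an equal string at runtime to avoid accidental aliasing:

```python
def my_function(a: str, b: str) -> str:
    """A placeholder function; any function would do here."""
    return a + b

# Snippet 1: x and y are aliases of the same immutable object
x = 'hello'
y = x
result1 = my_function(x, y)

# Snippet 2: x and y have equal values but are constructed independently
x = 'hello'
y = ''.join(['he', 'llo'])
result2 = my_function(x, y)

# Because str is immutable, the two snippets always behave identically.
```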
These two code snippets will always behave the same way, regardless of what my_function actually does! Because x and y refer to immutable values, the behaviour of my_function depends only on the values of the objects, and not their ids.
This allows the Python interpreter to save a bit of computer memory by not creating new objects for some immutable values. For example, every occurrence of the boolean value True refers to the same object:
>>> id(True)
1734328640
>>> x = True
>>> id(True)
1734328640
>>> id(10 > 3)
1734328640
>>> id(not False)
1734328640

A bit more surprisingly, “small” integers are automatically aliased, while “large” integers are not:
>>> x = 43
>>> y = 43
>>> x is y
True
>>> id(x)
1734453840
>>> id(y)
1734453840
>>> a = 1000
>>> b = 1000
>>> a is b
False
>>> id(a)
16727840
>>> id(b)
16727856

The other immutable data type where the Python interpreter takes this object creation “shortcut” is with some string values:
>>> name1 = 'David'
>>> name2 = 'David'
>>> name1 is name2
True
>>> full_name1 = 'David Liu'
>>> full_name2 = 'David Liu'
>>> full_name1 is full_name2
False

The exact rules for when the Python interpreter does and does not take this shortcut are beyond the scope of this course, and actually change from one version of Python to the next. For the purpose of writing Python code and doing object comparisons, the bottom line is:
- For booleans, it is safe to use is to compare for equality. Though also keep in mind that you should never write <expr> is True or <expr> is False, since these are equivalent to the simpler <expr> and not <expr>, respectively.
- For other immutable types like int and str, always use == to compare for equality, as using is can lead to surprising results.
- In general: use == to compare value equality (almost always what you want), and use is to check for aliasing (almost never what you want).

So far in this chapter, we have talked only about variables defined within the Python console. In 2.3 Local Variables and Function Scope, we saw how to represent function scope in the value-based memory model using separate “tables of values” for each function call. In this section, we’ll see how to represent function scope in the full Python memory model so that we can capture exactly how function scope works and impacts the variables we use throughout the lifetime of our programs.
Suppose we define the following function, and then call it in the Python console:
def repeat(n: int, s: str) -> str:
    message = s * n
    return message

# In the Python console
>>> count = 3
>>> word = 'abc'
>>> result = repeat(count, word)

Consider what the state of memory is when repeat(count, word) is called, immediately before the return message statement executes. Let’s first recall how we would draw the value-based memory model for this point:
| Variable | Value |
|---|---|
| count | 3 |
| word | 'abc' |

| Variable | Value |
|---|---|
| n | 3 |
| s | 'abc' |
| message | 'abcabcabc' |
This memory model shows two tables, showing the variables defined in the Python console (count, word), and the variables local to the function repeat (n, s, and message).
Here is how we would translate this into a full Python memory model diagram:

As with the diagrams we saw in the previous sections of this chapter, our variables are on the left side of the diagram, and the objects on the right. The variables are separated into two separate boxes, one for the Python console and one for the function call for repeat. All variables, regardless of which box they’re in, store only ids that refer to objects on the right-hand side. Notice that count and n are aliases, as are word and s.
Now that we have this full diagram, we’ll introduce a more formal piece of terminology. Each “box” on the left-hand side of our diagram represents a stack frame (or just frame for short), which is a special data type used by the Python interpreter to keep track of the functions that have been called in a program, and the variables defined within each function. We call the collection of stack frames the function call stack.
Every time we call a function, the Python interpreter does the following:

1. Creates a new stack frame and adds it to the top of the function call stack.
2. Evaluates each argument of the function call, and assigns the corresponding parameter in the new frame the id that the argument evaluates to.
3. Executes the body of the function. When the function returns, its frame is removed from the top of the call stack.
What we often call “parameter passing” is a special form of variable assignment in the Python interpreter. In the example above, when we called repeat(count, word), it is as if we wrote
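In sketch form (using the console variables from above):

```python
count = 3
word = 'abc'

# Parameter passing for repeat(count, word) behaves like these
# assignments, executed inside the new stack frame:
n = count   # n and count are now aliases
s = word    # s and word are now aliases
```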
before executing the body of the function.
This aliasing is what allows us to define functions that mutate their argument values, and have that effect persist after the function ends. Here is an example:
def emphasize(words: list[str]) -> None:
    """Add emphasis to the end of a list of words."""
    new_words = ['believe', 'me!']
    list.extend(words, new_words)

# In the Python console
>>> sentence = ['winter', 'is', 'coming']
>>> emphasize(sentence)
>>> sentence
['winter', 'is', 'coming', 'believe', 'me!']

When emphasize(sentence) is called in the Python console, this is the state of memory:

In this case, words and sentence are aliases, and so mutating words within the function causes a change to occur in __main__ as well.
On the other hand, consider what happens with this version of the function:
def emphasize_v2(words: list[str]) -> None:
    """Add emphasis to the end of a list of words."""
    new_words = ['believe', 'me!']
    words = words + new_words

# In the Python console
>>> sentence = ['winter', 'is', 'coming']
>>> emphasize_v2(sentence)
>>> sentence
['winter', 'is', 'coming']

After we call emphasize_v2 in the Python console, the value of sentence is unchanged! To understand why, let’s look at two memory model diagrams. The first shows the state of memory immediately after new_words = ['believe', 'me!'] is executed:
![Diagram of emphasize_v2 after new_words = [‘believe’, ‘me!’].](images/call_stack_reassignment1.png)
The next statement to execute is words = words + new_words. The key to understanding the next diagram is to recall variable reassignment: the right-hand side (words + new_words) is evaluated, and then the resulting object id is assigned to words. List concatenation with + creates a new list object.

Notice that in this diagram, words and sentence are no longer aliases! Instead, words has been assigned to a new list object, but sentence has remained unchanged. Remember the rule of variable reassignment: an assignment statement <name> = ... only changes what object the variable <name> refers to, but never changes any other variables. This illustrates the importance of keeping variable reassignment and object mutation as distinct concepts. Even though the bodies of emphasize and emphasize_v2 look very similar, the end result is very different: emphasize mutates its argument object, while emphasize_v2 actually leaves it unchanged!
The ability to mutate objects means that we have to be careful when writing functions that accept mutable types as parameters. In general, if a function’s documentation does not specify that an object will be mutated, then it must not be mutated. How can we test that no mutation occurred? And, for functions that intend to mutate an object, how can we test that the correct change occurred? In this section, we will extend our study of writing tests to answer both of these questions.
Consider the squares function we introduced at the beginning of the chapter:
def squares(nums: list[int]) -> list[int]:
    """Return a list of the squares of the given numbers."""
    squares_so_far = []
    for num in nums:
        list.append(squares_so_far, num * num)
    return squares_so_far

There are two lists in squares: the nums parameter, which is an input to the function; and the squares_so_far variable, which is an output of the function. Because squares_so_far is created by the function squares, it is okay that it is mutated (i.e., the call to list.append inside the for loop). However, the nums list is passed as an argument to squares. Because the docstring does not indicate that nums will be mutated, it is expected that the squares function will not mutate the list object referred to by nums.
We can contrast this with how we would document and implement a similar function that does mutate its input:
def square_all(nums: list[int]) -> None:
    """Modify nums by squaring each of its elements."""
    for i in range(0, len(nums)):
        nums[i] = nums[i] * nums[i]

Let us write a test that ensures the squares function does not mutate the list referred to by nums:
def test_squares_no_mutation() -> None:
    """Confirm that squares does not mutate the list it is given.
    """
    lst = [1, 2, 3]
    squares(lst)
    # TODO: complete the test

In order to test that a list is not mutated, we first create a list lst. Second, we call the squares function on lst; note that this function call returns a list of squares, but we do not assign the result to a variable because we don’t actually care about the returned value for the purpose of this test. This might seem a bit strange, as all of our tests so far have been about checking the return value of the function being tested. In practice, we would have such unit/property-based tests for squares as well, we just aren’t showing them here. We can now add an assertion that ensures lst has not been mutated:
def test_squares_no_mutation() -> None:
    """Test that squares does not mutate the list it is given.
    """
    lst = [1, 2, 3]
    squares(lst)
    assert lst == [1, 2, 3]

The variable lst originally had value [1, 2, 3]. So our assertion checks that after the call to squares, lst still has value [1, 2, 3]. Another way to accomplish this, without re-typing the list value, is by creating a copy of lst before the call to squares. We can do this using the list.copy method:
def test_squares_no_mutation() -> None:
    """Test that squares does not mutate the list it is given.
    """
    lst = [1, 2, 3]
    lst_copy = list.copy(lst)  # Create a copy of lst (not an alias!)
    squares(lst)
    assert lst == lst_copy

Note that the order of statements is very important when testing for mutation. We need to create the list and its copy before the call to squares. And we need to test for mutation (i.e., the assertion) after the call to squares.
You might notice that the above test_squares_no_mutation test function doesn’t actually use the specific elements of the list lst. That is, if we replaced lst’s value with another list, the test would behave in the exact same way. That makes this test very suitable to be generalized into a property-based test, representing the following property:
For all lists of integers
lst, callingsquares(lst)does not mutatelst.
Here is how we could implement such a property-based test using the technique we learned in 3.10 Testing Functions II: hypothesis. We’ve included the import statements to remind you about the ones from hypothesis you need for property-based tests.
from hypothesis import given
from hypothesis.strategies import lists, integers


@given(lst=lists(integers()))
def test_squares_no_mutation_general(lst: list[int]) -> None:
    """Test that squares does not mutate the list it is given.
    """
    lst_copy = list.copy(lst)  # Create a copy of lst (not an alias!)
    squares(lst)
    assert lst == lst_copy

Now let’s consider testing the square_all function. One common error students make when writing tests for mutating functions is to check the return value of the function.
def test_square_all() -> None:
    """Test that square_all mutates the list it is given correctly.
    """
    lst = [1, 2, 3]
    result = square_all(lst)
    assert result == [1, 4, 9]

This test fails because square_all returns None, and None == [1, 4, 9] is False. Using result in our assertion is not useful for testing whether lst was mutated. Instead, we must test whether the value of lst has changed. Like test_squares_no_mutation, the following test does not store the return value of the function being tested, but the reason is quite different!
def test_square_all_mutation() -> None:
    """Test that square_all mutates the list it is given correctly.
    """
    lst = [1, 2, 3]
    square_all(lst)
    assert lst == [1, 4, 9]

We can again generalize this test into a property-based test by storing a copy of the original list and verifying the relationship between corresponding elements. We’ll leave it as an exercise for you to read through and understand the following property-based test:
We’ve spent the first five chapters of this course studying programming in Python. We’ve been mainly focused on how we represent data and on designing functions to operate on this data. Up to this point, the ideas behind the functions that we’ve written have been relatively straightforward, and the challenge has been in implementing these ideas correctly using various programming techniques. Over the next two chapters, we are going to study algorithms where the ideas themselves will be more complex. It won’t be “obvious” how or why these algorithms work, and so to convince ourselves that these algorithms are correct, we’ll study the formal mathematics behind them.
Our first large example of this is one that will take us the next two chapters to develop: the RSA cryptosystem, consisting of a pair of algorithms that are central to modern Internet security. If you haven’t heard about RSA, cryptosystems, or ever thought about security, don’t worry, we’ll be building all of these concepts from the ground up over the course of this chapter and the next. What will set this apart from the kind of work we’ve done so far is that to understand what these algorithms do and why they work, we’ll need to step away from code and into the realm of number theory, the branch of mathematics concerned with properties of integers.
We’ll start our journey here with a few key definitions, some of which you’ve seen before defined formally in this course, and others that you might have heard about before, but not seen a formal definition.
Here are our first two definitions; these are repeated from 3.9 Working with Definitions.
Let \(n, d \in \Z\). We say that \(d\) divides \(n\) when there exists a \(k \in \Z\) such that \(n = dk\). We use the notation \(d \mid n\) to represent the statement “\(d\) divides \(n\)”.
The following phrases are synonymous with “\(d\) divides \(n\)”:
Let \(p \in \Z\). We say \(p\) is prime when it is greater than 1 and the only natural numbers that divide it are 1 and itself.
The next few definitions introduce and expand on the notion of common divisors between two numbers.
Let \(x, y, d \in \Z\). We say that \(d\) is a common divisor of \(x\) and \(y\) when \(d\) divides \(x\) and \(d\) divides \(y\).
We say that \(d\) is the greatest common divisor of \(x\) and \(y\) when it is the largest number that is a common divisor of \(x\) and \(y\), or 0 when \(x\) and \(y\) are both 0. (According to this definition, what is \(\gcd(0, n)\) when \(n > 0\)?) We can define the function \(\gcd : \Z \times \Z \to \N\) as the function which takes numbers \(x\) and \(y\), and returns their greatest common divisor.
You might wonder whether this definition makes sense in all cases: is it possible for two numbers to have no divisors in common? One of the statements we will prove later in this chapter is that \(1\) divides every natural number. So at the very least, \(1\) is a common divisor between any two natural numbers. There is a special name for the case when \(1\) is the only positive common divisor of two numbers.
Let \(m, n \in \Z\). We say that \(m\) and \(n\) are coprime when \(\gcd(m, n) = 1\).
The next definitions are introduced through a fundamental theorem in number theory, which extends the relationship of divisibility to that of remainders.
(Quotient-Remainder Theorem) For all \(n \in \Z\) and \(d \in \Z^+\), there exist \(q \in \Z\) and \(r \in \N\) such that \(n = qd + r\) and \(0 \leq r < d\). Moreover, these \(q\) and \(r\) are unique for a given \(n\) and \(d\).
We say that \(q\) is the quotient when \(n\) is divided by \(d\), and that \(r\) is the remainder when \(n\) is divided by \(d\).
In Python, for given integers n and d, we can compute their quotient using //, their remainder using %, and both at the same time using the built-in function divmod:
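For example (using \(n = 17\) and \(d = 5\)):

```python
n, d = 17, 5
q = n // d    # quotient: 3
r = n % d     # remainder: 2
assert divmod(n, d) == (q, r)          # divmod computes both at once
assert n == q * d + r and 0 <= r < d   # the Quotient-Remainder Theorem
```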
The final definition in this section introduces some notation that is extremely commonplace in number theory, and by extension in many areas of computer science. Often when we are dealing with relationships between numbers, divisibility is too coarse a relationship: as a predicate, it is constrained by the binary nature of its output. Instead, we often care about the remainder when we divide a number by another.
Let \(a, b, n \in \Z\) and assume \(n \neq 0\). We say that \(a\) is equivalent to \(b\) modulo \(n\) when \(n \mid a - b\). In this case, we write \(a \equiv b \pmod n\).

One warning: the notation \(a \equiv b \pmod n\) is not exactly the same as the mod or % operator you are familiar with from programming; here, both \(a\) and \(b\) could be much larger than \(n\), or even negative.
There are two related reasons why this notation is so useful in number theory. The first is that modular equivalence can be used to divide up numbers based on their remainders when divided by \(n\):
Let \(a, b, n \in \Z\) with \(n \neq 0\). Then \(a \equiv b \pmod n\) if and only if \(a\) and \(b\) have the same remainder when divided by \(n\). In Python, we could represent this as the expression a % n == b % n.
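A quick numeric check (using \(a = 32\), \(b = 17\), \(n = 5\); these particular numbers are just for illustration):

```python
a, b, n = 32, 17, 5
assert (a - b) % n == 0   # n divides a - b (32 - 17 = 15 = 3 * 5)
assert a % n == b % n     # equivalently, same remainder when divided by n
```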
The second reason this is so useful is that almost all of the “standard” intuitions we have about equality transfer over to this new notation as well, making it pretty easy to work with right from the very start.
Let \(a, b, c, n \in \Z\) with \(n \neq 0\). Then the following hold:

- \(a \equiv a \pmod n\)
- If \(a \equiv b \pmod n\), then \(b \equiv a \pmod n\)
- If \(a \equiv b \pmod n\) and \(b \equiv c \pmod n\), then \(a \equiv c \pmod n\)
Let \(a, b, c, d, n \in \Z\) with \(n \neq 0\). If \(a \equiv c \pmod n\) and \(b \equiv d \pmod n\), then the following hold:

- \(a + b \equiv c + d \pmod n\)
- \(a - b \equiv c - d \pmod n\)
- \(ab \equiv cd \pmod n\)
Note that this second theorem shows that the familiar addition, subtraction, and multiplication operations preserve modular equivalence relationships. However, as we’ll study further in this chapter, this is not the case with division!
In Chapter 3, we studied how to express statements precisely using the language of predicate logic. But just as English enables us to make both true and false claims, the language of predicate logic allows for the expression of both true and false sentences. In this chapter, we will turn our attention to analyzing and communicating the truth or falsehood of these statements. You will develop the skills required to answer the following questions:
These questions draw a distinction between the internal and external components of mathematical reasoning. When given a new statement, you’ll first need to figure out for yourself whether it is true (internal), and then be able to express your thought process to others (external). But even though we make a separation, these two processes are certainly connected: it is only after convincing yourself that a statement is true that you should then try to convince others. And often in the process of formalizing your intuition for others, you notice an error or gap in your reasoning that causes you to revisit your intuition—or makes you question whether the statement is actually true!
A mathematical proof is how we communicate ideas about the truth or falsehood of a statement to others. There are many different philosophical ideas about what constitutes a proof, but what they all have in common is that a proof is a mode of communication, from the person creating the proof to the person digesting it. In this course, we will focus on reading and creating our own written mathematical proofs, which is the standard proof medium in computer science.
As with all forms of communication, the style and content of a proof varies depending on the audience. In this course, the audience for all of our proofs will be an average computer science student (and not your TA or instructor). As we will discuss, your audience determines how formal a proof should be (here, quite formal), and what background knowledge you can assume is understood without explanation (here, not much).
However, there is variation even among typical computer science students in their experience with these areas, so as much as possible in this course, we will introduce new mathematical domains to serve as the objects of study in our proofs.
This approach has three very nice benefits: first, by building domains from the ground up, we can specify absolutely the common definitions and properties that everyone may assume and use freely in proofs; second, these domains are the theoretical foundation of many areas of computer science, and learning about them here will serve you well in many future courses; and third, learning about new domains will help develop the skill of reading about a new mathematical context and understanding it. In other words, you won’t just learn about new domains; you’ll learn how to learn about new domains! The definitions and axioms of a new domain communicate the foundation upon which we build new proofs—in order to prove things, we need to understand the objects that we’re talking about first.
We’re going to start out our exploration of proofs by studying a few simple statements. Our first foray into domain exploration will be into number theory, which you can think of as taking a type of entity with which we are quite familiar, and formalizing definitions and pushing the boundaries of what we actually know about these numbers that we use every day.
You may find our first few examples a bit on the easy side, which is fine. We are using them not so much for their ability to generate mathematical insight, but rather to model both the thinking and the writing that would go into approaching a problem.
Each example in this section is divided into three or four parts:
With this in mind, let’s dive right in!
Prove that \(23 \mid 115\).
We will expand the definition of divisibility to rewrite this statement in terms of simpler operations: \[\exists k \in \Z,~ 115 = 23k.\]
We just need to divide 115 by 23, right?
Let \(k = 5\).
Then \(115 = 23 \cdot 5 = 23 \cdot k\). We typically signal the end of a proof by writing a black square ◼ in the bottom-right corner.
We can draw from this example a more general technique for structuring our existence proofs. A statement of the form \(\exists x \in S,~P(x)\) is True when at least one element of \(S\) satisfies \(P\) (hence our use of any in Python). The easiest way to convince someone that this is True is to actually find the concrete element that satisfies \(P\), and then show that it does. This is so natural a strategy that it should not be surprising that there is a “standard proof format” when dealing with such statements. (Of course, this is not the only proof technique used for existence proofs; you’ll study more sophisticated ways of doing such proofs in future courses.)
A typical proof of an existential.
Given statement to prove: \(\exists x \in S,~P(x)\).
Let \(x = \_\_\_\_\_\_\_\).
[Proof that \(P(\_\_\_\_\_\_\_)\) is True.]
Note that the two blanks represent the same element of \(S\), which you get to choose as a prover. Thus existence proofs usually come down to finding a correct element of the domain which satisfies the required properties.
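The connection to Python’s any mentioned above can be made explicit. A sketch (restricting the search to a finite range, since we cannot iterate over all of \(\Z\)):

```python
# The existential statement "there exists k in Z such that 115 = 23k",
# checked over a finite range of candidate values.
assert any(115 == 23 * k for k in range(-115, 116))

# A proof does more than assert existence: it names a concrete witness.
k = 5
assert 115 == 23 * k
```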
Here is another example which uses the same idea, but with two existentially-quantified variables.
Prove that there exists an integer that divides 104.
There is the key phrase “there exists” right in the problem statement, so we could write \(\exists a \in \Z,~a \mid 104\). We can once again expand the definition of divisibility to write: \[\exists a, k \in \Z,~104 = ak.\] (We use the abbreviated form for two quantifications of the same type.)
Basically, we need to pick a pair of divisors of 104. Since this is an existential proof and we get to pick both \(a\) and \(k\), any pair of divisors will work.
Let \(a = -2\) and let \(k = -52\).
Then \(104 = ak\).
The previous example is the first one that had multiple quantifiers. In our proof, we had to give explicit values for both \(a\) and \(k\) to show that the statement held. Just as how a sentence in predicate logic must have all its variables quantified, a mathematical proof must introduce all variables contained in the sentence being proven.
In Chapter 3, we saw how changing the order of an existential and universal quantifier changed the meaning of a statement. Now, we’ll study how the order of quantifiers changes how we can introduce variables in a proof.
Prove that all integers are divisible by \(1\).
The statement contains a universal quantification: \(\forall n \in \Z,~1 \mid n\). We can unpack the definition of divisibility to \[\forall n \in \Z,~\exists k \in \Z,~n = 1 \cdot k.\]
The final equation in the fully-expanded form of the statement is straightforward, and is valid when \(k\) equals \(n\). But how should I introduce these variables? Answer: in the same order they are quantified in the statement.
Let \(n \in \Z\). Let \(k = n\).
Then \(n = 1 \cdot n = 1 \cdot k\).
This proof is quite short, but introduces a few new elements. First, it introduced a variable \(n\) that could represent any integer. Unlike the previous existence proofs, when we introduced this variable \(n\) we did not specify a concrete value like \(10\), but rather said that \(n\) was an arbitrary integer by writing “Let \(n \in \Z\).” You might notice that we use the same word “let” to introduce both existentially- and universally-quantified variables. However, you should always be able to tell how the variable is quantified based on whether it is given a concrete value or an “arbitrary” value in the proof.
A typical proof of a universal.
Given statement to prove: \(\forall x \in S,~P(x)\).
Let \(x \in S\). (That is, let \(x\) be an arbitrary element of \(S\).)
[Proof that \(P(x)\) is True].
The other interesting element of this proof was that it contained an existentially-quantified variable \(k\) after the \(\forall n \in \Z\). We used an extremely important tool at our disposal when it comes to proofs with multiple quantifiers: any existentially-quantified variable can be assigned a value that depends on the variables defined before it.
In our proof, we first defined \(n\) to be an arbitrary integer. Immediately after this, we wanted to show that for this \(n\), \(\exists k \in \Z,~ n = 1 \cdot k\). And to prove this, we needed a value for \(k\)—a “let” statement. Because we define \(k\) after having defined \(n\), we can use \(n\) in the definition of \(k\) and say “Let \(k = n\).” It may be helpful to think about the analogous process in programming. We first initialize a variable \(n\), and then define a new variable \(k\) that is assigned the value of \(n\).
Even though this may seem obvious, one important thing to note is that the order of variables in the statement determines the order in which the variables must be introduced in the proof, and hence which variables can depend on which other variables. For example, consider the following erroneous “proof.”
(Wrong!) Prove that \(\exists k \in \Z,~\forall n \in \Z,~n = 1 \cdot k.\)
Let \(k = n\). Let \(n \in \Z\).
Then \(n = 1 \cdot k\).
This proof may look very similar to the previous one, but it contains one crucial difference. The very first sentence, “Let \(k = n\),” is invalid: at that point, \(n\) has not yet been defined! This is analogous to a NameError in Python. This is the result of having switched around the order of the quantifiers, which forces \(k\) to be defined independently of whatever \(n\) is chosen.
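We can mirror the invalid “proof” in Python, where the undefined name (our own illustrative name undefined_n) plays the role of the not-yet-introduced \(n\):

```python
# Analogous to writing "Let k = n" before n has been introduced:
try:
    k = undefined_n   # NameError: this name has not been defined
except NameError:
    k = None          # Python refuses to evaluate an undefined name

assert k is None
```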
Note: don’t assume that just because one proof is invalid, all proofs of this statement are invalid! We cannot conclude that this statement is False just because we found one proof that didn’t work. (A meta way of looking at this: a statement is True when there exists a correct proof of it.) That said, this statement is indeed False, and we’ll see later on in this chapter how to prove that a statement is False instead of True.
Let’s look at one new example.
Prove that for all integers \(x\), if \(x\) divides \((x + 5)\), then \(x\) also divides \(5\).
There is both a universal quantification and an implication in this statement: \[\forall x \in \Z,~ x \mid (x + 5) \Rightarrow x \mid 5.\] (As we noted back in Chapter 3, the “universal + implication” form is the most common form of statement we encounter.) When we unpack the definition of divisibility, we need to be careful about how the quantifiers are grouped: \[\forall x \in \Z,~ \big( \exists k_1 \in \Z,~ x + 5 = k_1x \big) \Rightarrow \big( \exists k_2 \in \Z,~ 5 = k_2x \big).\]
I need to prove that if \(x\) divides \(x + 5\), then it also divides 5. To prove this, I’m going to assume that \(x\) divides \(x + 5\), and I need to prove that \(x\) divides 5.
Since \(x\) is divisible by \(x\), I should be able to subtract it from \(x + 5\) and keep the result a multiple of \(x\). Can I prove that using the definition of divisibility? I basically need to “turn” the equation \(x + 5 = k_1x\) into the equation \(5 = k_2x\).
Let \(x\) be an arbitrary integer. Assume that \(x \mid (x + 5)\), i.e., that there exists \(k_1 \in \Z\) such that \(x + 5 = k_1x\). We want to prove that there exists \(k_2 \in \Z\) such that \(5 = k_2x\).
Let \(k_2 = k_1 - 1\).
Then we can calculate: \[\begin{align*} k_2x &= (k_1 - 1)x \\ &= k_1 x - x \\ &= (x + 5) - x \tag{we assumed $x + 5 = k_1 x$}\\ &= 5 \end{align*}\]
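As a sanity check (not a substitute for the proof), we can test the statement over a range of integers in Python, using a divides helper like the one assumed earlier in the chapter:

```python
def divides(d: int, n: int) -> bool:
    """Return whether d | n, i.e., there exists an integer k with n == d * k."""
    if d == 0:
        return n == 0
    return n % d == 0


# Check the implication "x | (x + 5) implies x | 5" over a finite range.
for x in range(-100, 101):
    if divides(x, x + 5):
        assert divides(x, 5)
```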
Whew, that was a bit longer than the proofs we’ve already done. There were a lot of new elements that we introduced here, so let’s break them down:
After introducing \(x\), we wanted to prove the implication \(x \mid (x + 5) \Rightarrow x \mid 5\). To prove an implication, we needed to assume that the hypothesis was True, and then prove that the conclusion is also True. In our proof, we wrote “Assume \(x \mid (x + 5)\).”
This is not a claim that \(x \mid (x + 5)\) is True; rather, it is a way to consider what would happen if \(x \mid (x + 5)\) were True. The goal for the rest of the proof was to prove that \(x \mid 5\).
This proof structure is common when proving an implication:
A typical proof of an implication (direct).
Given statement to prove: \(p \Rightarrow q\).
Assume \(p\).
[Proof that \(q\) is True.]
When we assumed that \(x \mid (x + 5)\), what this really did was introduce a new variable \(k_1 \in \Z\) from the definition of divisibility. This might seem a little odd, but take a moment to think about what this means in English. We assumed that \(x\) divides \(x + 5\), which (by definition) is the same as assuming that there exists an integer \(k_1\) such that \(x + 5 = k_1x\). Given that such a number exists, we can give it a name and refer to it in the rest of our proof. In other words, we introduced a variable into the proof through an assumption we made.
One of the most important meta-techniques in mathematical proof is that of generalization: taking a true statement (and a proof of the statement), and then replacing a concrete value in the statement with a universally quantified variable. For example, consider the statement from the previous example, \(\forall x \in \Z,~ x \mid (x + 5) \Rightarrow x \mid 5\). It doesn’t seem like the “\(5\)” serves any special purpose; it is highly likely that it could be replaced by another number like \(165\), and the statement would still hold. Concretely, consider the statement \(\forall x \in \Z,~ x \mid (x + 165) \Rightarrow x \mid 165\), which is at least as plausible as the original statement with \(5\)’s.
But rather than replace the \(5\) with another concrete number and then re-proving the statement, we will instead replace it with a universally-quantified variable, and prove the corresponding statement. This way, we will know that in fact we could replace the \(5\) with any integer and the statement would still hold.
Prove that for all \(d \in \Z\), and for all \(x \in \Z\), if \(x\) divides \((x + d)\), then \(x\) also divides \(d\).
This has basically the same translation as last time, except now we have an extra variable: \[\forall d,x \in \Z,~ \big( \exists k_1 \in \Z,~ x + d = k_1x \big) \Rightarrow \big( \exists k_2 \in \Z,~ d = k_2x \big).\]
I should be able to use the same set of calculations as last time.
Let \(d\) and \(x\) be arbitrary integers. Assume that \(x \mid (x + d)\), i.e., there exists \(k_1 \in \Z\) such that \(x + d = k_1x\). We want to prove that there exists \(k_2 \in \Z\) such that \(d = k_2x\).
Let \(k_2 = k_1 - 1\).
Then we can calculate: \[\begin{align*} k_2x &= (k_1 - 1)x \\ &= k_1 x - x \\ &= (x + d) - x \\ &= d \end{align*}\]
This proof is basically the same as the previous one: we have simply swapped out all of the \(5\)’s with \(d\)’s. We say that the proof did not depend on the value \(5\), meaning there was no place that we used some special property of \(5\), where we could have used a generic integer instead. We can also say that the original statement and proof generalize to this second version.
Why does generalization matter? By generalizing the previous statement from being about the number \(5\) to an arbitrary integer, we have essentially gone from one statement being true to an infinite number of statements being true. The more general the statement, the more useful it becomes. We care about exponent laws like \(a^b \cdot a^c = a^{b + c}\) precisely because they apply to every possible number; regardless of what our concrete calculation is, we know we can use this law in our calculations.
Now let’s see an example of applying the concept of mathematical proof to justify the correctness of an algorithm. First, recall that we say that an integer \(p\) is prime when it is greater than 1 and the only numbers that divide \(p\) are 1 and \(p\) itself. We saw earlier that we could implement a predicate in Python to determine whether p is prime:
def is_prime(p: int) -> bool:
    """Return whether p is prime."""
    possible_divisors = range(0, p + 1)
    return (
        p > 1 and
        all({d == 1 or d == p for d in possible_divisors if divides(d, p)})
    )

This implementation is a direct translation of the mathematical definition of prime numbers, with the only difference being our restriction of the range of possible divisors. In fact, we can justify that this range is correct in a separate proof! However, you might have noticed that this algorithm is “inefficient” because it checks more numbers than necessary.
Often when this version of is_prime is taught, the range of possible divisors extends only to the square root of the input p:
from math import floor, sqrt

def is_prime(p: int) -> bool:
    """Return whether p is prime."""
    possible_divisors = range(2, floor(sqrt(p)) + 1)
    return (
        p > 1 and
        all({not divides(d, p) for d in possible_divisors})
    )

This version is intuitively faster, as the range of possible divisors to check is smaller. But how do we actually know that this version of is_prime is correct? We could write some tests, but as we discussed earlier, neither unit tests nor property-based tests can guarantee absolute correctness; they only give confidence. Luckily, for algorithms like this one that are based on the mathematical properties of the input, we do have a tool that gives absolute certainty: proofs!
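As an empirical sanity check, the two versions can be compared directly. This sketch uses our own names (is_prime_naive, is_prime_fast) and an assumed divides helper, so that it is self-contained:

```python
from math import floor, sqrt


def divides(d: int, n: int) -> bool:
    """Return whether d | n."""
    return n % d == 0 if d != 0 else n == 0


def is_prime_naive(p: int) -> bool:
    """Direct translation of the definition of prime."""
    return p > 1 and all(d == 1 or d == p
                         for d in range(1, p + 1) if divides(d, p))


def is_prime_fast(p: int) -> bool:
    """Only check divisors up to the square root of p."""
    return p > 1 and all(not divides(d, p)
                         for d in range(2, floor(sqrt(p)) + 1))


# The biconditional proved in this section guarantees these agree.
assert all(is_prime_naive(p) == is_prime_fast(p) for p in range(0, 200))
```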
We can justify the correctness of this algorithm by formally proving the following statement.
Let \(p \in \Z\). Then \(p\) is prime if and only if \(p > 1\) and for every integer \(d\) in the range \(2 \leq d \leq \sqrt{p}\), \(d\) does not divide \(p\).
Or, translated into predicate logic: \[\forall p \in \Z,~ \mathit{Prime}(p) \Leftrightarrow \big(p > 1 \land (\forall d \in \N,~ 2 \leq d \leq \sqrt{p} \Rightarrow d \nmid p) \big).\]
How do we go about proving that this statement is correct? We’ve seen in the past how to prove implications, but how about biconditionals? Recall that a biconditional \(p \Leftrightarrow q\) is equivalent to \((p \Rightarrow q) \land (q \Rightarrow p)\). So if we want to argue that a biconditional is True, we do so by proving the two different implications.
A typical proof of a biconditional.
Given statement to prove: \(p \Leftrightarrow q\).
This proof is divided into two parts.
Part 1 (\(p \Rightarrow q\)): Assume \(p\).
[Proof that \(q\) is True.]
Part 2 (\(q \Rightarrow p\)): Assume \(q\).
[Proof that \(p\) is True.]
The first implication we’ll prove is that if \(p\) is prime, then \(p > 1\) and \(\forall d \in \N,~ 2 \leq d \leq \sqrt{p} \Rightarrow d \nmid p\). We get to assume that \(p\) is prime, and will need to prove two things: that \(p > 1\), and that \(\forall d \in \N,~ 2 \leq d \leq \sqrt{p} \Rightarrow d \nmid p\).
Let’s remind ourselves what the definition of prime is in predicate logic:
\[\mathit{Prime}(p):~ p > 1 \land \big(\forall d \in \N,~ d \mid p \Rightarrow d = 1 \lor d = p \big)\]
The first part comes straight from the definition of prime. For the second part, we should also be able to use the definition of prime: if \(d\) is between 2 and \(\sqrt{p}\), then it can’t equal 1 or \(p\), which are the only possible divisors of \(p\).
Let’s see how to write this up formally.
Let \(p \in \Z\) and assume that \(p\) is prime. We need to prove that \(p > 1\) and for all \(d \in \N\), if \(2 \leq d \leq \sqrt p\) then \(d\) does not divide \(p\).
Part 1: proving that \(p > 1\).
By the definition of prime, we know that \(p > 1\).
Part 2: proving that for all \(d \in \N\), if \(2 \leq d \leq \sqrt p\) then \(d\) does not divide \(p\).
Let \(d \in \N\) and assume \(2 \leq d \leq \sqrt p\). We’ll prove that \(d\) does not divide \(p\).
First, since \(2 \leq d\), we know \(d > 1\), and so \(d \neq 1\). Second, since \(p > 1\), we know that \(\sqrt p < p\), and so \(d \leq \sqrt p < p\).
This means that \(d \neq 1\) and \(d \neq p\). By the definition of prime again, we can conclude that \(d \nmid p\).
What we’ve proved so far is that if \(p\) is prime, then it has no divisors between 2 and \(\sqrt p\). How does this apply to our algorithm is_prime? When its input p is a prime number, we know that the expressions p > 1 and all(not divides(d, p) for d in possible_divisors) will both evaluate to True, and so the function will return True. In other words, we’ve proven that is_prime returns the correct value for every prime number, without a single test case! Pretty awesome.
Though we know that is_prime is correct for prime numbers, we’ve said nothing at all about how it behaves when given a non-prime number. To prove that its behaviour is correct in this case as well, we need to prove the other conditional.
We now need to prove the second implication, which is the converse of the first: if \(p > 1\) and \(\forall d \in \N,~ 2 \leq d \leq \sqrt{p} \Rightarrow d \nmid p\), then \(p\) must be prime. Expanding the definition of prime, we need to prove that \(p > 1\) (which we’ve assumed!) and that for all \(d_1 \in \N,~ d_1 \mid p \Rightarrow d_1 = 1 \lor d_1 = p\).
So the idea here is to let \(d_1 \in \N\) and assume \(d_1 \mid p\), and use the condition that \(\forall d \in \N,~ 2 \leq d \leq \sqrt{p} \Rightarrow d \nmid p\) to prove that \(d_1\) is 1 or \(p\).
Let \(p \in \N\), and assume \(p > 1\) and that \(\forall d \in \N,~ 2 \leq d \leq \sqrt{p} \Rightarrow d \nmid p\). We want to prove that \(p\) is prime, i.e., that \(p > 1\) and that \(\forall d_1 \in \N,~ d_1 \mid p \Rightarrow d_1 = 1 \lor d_1 = p\).
We know the first part (\(p > 1\)) is true because it’s one of our assumptions. For the second part, first let \(d_1 \in \N\), and assume \(d_1 \mid p\). We’ll prove that \(d_1 = 1 \lor d_1 = p\).
From our second assumption, we know that since \(d_1 \mid p\), it is not between 2 and \(\sqrt p\). More precisely, the contrapositive of our second assumption says that for all \(d \in \N\), \(d \mid p \Rightarrow d < 2 \lor d > \sqrt p\). So then either \(d_1 < 2\) or \(d_1 > \sqrt p\). We divide our proof into two cases based on these possibilities.
Case 1: assume \(d_1 < 2\).
Since \(d_1 \in \N\), it must be 0 or 1 in this case. We know \(0 \nmid p\) because \(p > 1\), and so \(d_1 = 1\).
Case 2: assume \(d_1 > \sqrt p\).
Since we assumed \(d_1 \mid p\), we expand the definition of divisibility to conclude that \(\exists k \in \Z,~ p = d_1 k\). Since \(d_1 > \sqrt p\) in this case, we know that \(k = \frac{p}{d_1} < \frac{p}{\sqrt{p}} = \sqrt{p}\).
Since \(p = d_1k\), we know that \(k \mid p\) as well, and so our second assumption applied to \(k\) tells us that \(k\) is not between 2 and \(\sqrt p\).
So \(k < \sqrt{p}\) and \(k\) is not between 2 and \(\sqrt p\), which means \(k < 2\). Moreover, since \(p = d_1 k\) with \(p > 1\) and \(d_1 > 0\), \(k\) must be positive. Therefore \(k = 1\), and so \(d_1 = \frac{p}{k} = p\).
To wrap up this example, let’s see how this implication connects to our function is_prime. What we’ve proved is that if is_prime(p) returns True, then p must be prime. This sounds very similar to what we said in the previous section, but it is different! The contrapositive of this statement is useful: if p is NOT prime, then is_prime(p) returns False.
So putting the two implications together, we have:
For all p, if p is prime, then is_prime(p) returns True.
For all p, if is_prime(p) returns True, then p is prime. Or equivalently, if p is not prime, then is_prime(p) returns False.
Since every integer p is either prime or not prime, we can conclude that this implementation of is_prime is correct according to its specification.
Notice the duality between the statement of correctness for is_prime and the biconditional we had set out to prove: for every natural number \(p\), \(p\) is prime if and only if \(p > 1\) and for every integer \(d\) in the range \(2 \leq d \leq \sqrt{p}\), \(d \nmid p\). The correctness of our algorithm is derived from the theoretical properties of prime numbers that we expressed in formal predicate logic. We admit this is a relatively simple example of this connection between algorithm and mathematical theory, but we had to start somewhere! Our future examples will draw on connections like this, but in far deeper ways.
In the last proof of the previous section, we did something interesting: having concluded that \(d_1 < 2\) or \(d_1 > \sqrt p\), we split our proof into two cases, in each of which we assumed that one part of the OR was true. This is a proof technique known as proof by cases.
Remember that for a universal proof, we typically let a variable be an arbitrary element of the domain, and then make an argument in the proof body to prove our goal statement. However, even when the goal statement is True for all elements of the domain, it isn’t always easy to construct a single argument that works for all of those elements! Sometimes, different arguments are required for different elements. In this case, we divide the domain into different parts, and then write a separate argument for each part.
A bit more formally, we pick a set of unary predicates \(P_1\), \(P_2\), …, \(P_k\) (for some positive integer \(k\)), such that for every element \(x\) in the domain, \(x\) satisfies at least one of the predicates (we say that these predicates are exhaustive). Note that the domain can be narrowed based on additional assumptions or conclusions made earlier in the proof. In our previous example, we started with a domain “\(d_1 \in \N\)”, and then narrowed this to “\(d_1 \in \N\) and \((d_1 < 2 \lor d_1 > \sqrt p)\)”, leading to the following predicates for our cases: \[P_1(d_1): d_1 < 2, \qquad P_2(d_1): d_1 > \sqrt p.\]
Then, we divide the proof body into cases, where in each case we assume that one of the predicates is True, and use that assumption to construct a proof that specifically works under that assumption. Recall that there’s an equivalence between predicates and sets. Another way of looking at a proof by cases is that we divide the domain into subsets \(S_1, S_2, \dots S_k\), and then prove the desired statement separately for each of these subsets.
A typical proof by cases.
Given statement to prove: \(\forall x \in S, P(x).\) Pick a set of exhaustive predicates \(P_1, \dots, P_k\) of \(S\).
Let \(x \in S\). We will use a proof by cases.
Case 1. Assume \(P_1(x)\) is True.
[Proof that \(P(x)\) is True, assuming \(P_1(x)\).]
Case 2. Assume \(P_2(x)\) is True.
[Proof that \(P(x)\) is True, assuming \(P_2(x)\).]
\(\vdots\)
Case \(k\). Assume \(P_k(x)\) is True.
[Proof that \(P(x)\) is True, assuming \(P_k(x)\).]
Proof by cases is a very versatile proof technique, since it allows the combining of simpler proofs together to form a whole proof. Often it is easier to prove a property about some (or even most) elements of the domain than it is to prove that same property about all the elements. But do keep in mind that if you can find a simple proof which works for all elements of the domain, that’s generally preferable to combining multiple proofs together in a proof by cases.
One natural use of proof by cases in number theory is to apply the Quotient-Remainder Theorem that we introduced in Section 6.1.
(Quotient-Remainder Theorem) For all \(n \in \Z\) and \(d \in \Z^+\), there exist \(q \in \Z\) and \(r \in \N\) such that \(n = qd + r\) and \(0 \leq r < d\). Moreover, these \(q\) and \(r\) are unique for a given \(n\) and \(d\).
We say that \(q\) is the quotient when \(n\) is divided by \(d\), and that \(r\) is the remainder when \(n\) is divided by \(d\).
The reason this theorem is powerful is that it tells us that for any positive divisor \(d \in \Z^+\), we can separate all possible integers into \(d\) different groups, corresponding to their possible remainders (between \(0\) and \(d-1\)) when divided by \(d\). Let’s see how to use this fact to perform a proof by cases.
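In Python, the built-in divmod computes exactly the \(q\) and \(r\) of the Quotient-Remainder Theorem (for a positive divisor), including for negative \(n\):

```python
# n = q * d + r, with 0 <= r < d (here d = 5).
q, r = divmod(17, 5)
assert (q, r) == (3, 2) and 17 == q * 5 + r

# For negative n, the remainder is still in range(0, d):
q, r = divmod(-17, 5)
assert (q, r) == (-4, 3) and -17 == q * 5 + r
```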
Prove that for all integers \(x\), \(2 \mid x^2 + 3x\).
Using the divisibility predicate: \(\forall x \in \Z,~ 2 \mid x^2 + 3x\). Or expanding the definition of divisibility: \[\forall x \in \Z,~ \exists k \in \Z,~ x^2 + 3x = 2k.\]
We want to “factor out a \(2\)” from the expression \(x^2 + 3x\), but this only works if \(x\) is even. If \(x\) is odd, though, then both \(x^2\) and \(3x\) will be odd, and adding two odd numbers together produces an even number.
But how do we “know” that every number has to be either even or odd? And how can we formalize the algebraic operations of “factoring out a \(2\)” or “adding two odd numbers together”? This is where the Quotient-Remainder Theorem comes in.
Let \(x \in \Z\). By the Quotient-Remainder Theorem, we know that when \(x\) is divided by \(2\), the two possible remainders are \(0\) and \(1\). We will divide up the proof into two cases based on these remainders.
Case 1: assume the remainder when \(x\) is divided by \(2\) is \(0\). That is, we assume there exists \(q \in \Z\) such that \(x = 2q + 0\). We will show that there exists \(k \in \Z\) such that \(x^2 + 3x = 2k\).
We have: \[\begin{align*} x^2 + 3x &= (2q)^2 + 3(2q) \\ &= 4q^2 + 6q \\ &= 2(2q^2 + 3q) \end{align*}\]
So let \(k = 2q^2 + 3q\). Then \(x^2 + 3x = 2k\).
Case 2: assume the remainder when \(x\) is divided by 2 is \(1\). That is, we assume there exists \(q \in \Z\) such that \(x = 2q + 1\). We will show that there exists \(k \in \Z\) such that \(x^2 + 3x = 2k\).
We have: \[\begin{align*} x^2 + 3x &= (2q+1)^2 + 3(2q+1) \\ &= 4q^2 + 4q + 1 + 6q + 3 \\ &= 2(2q^2 + 5q + 2) \end{align*}\]
So let \(k = 2q^2 + 5q + 2\). Then \(x^2 + 3x = 2k\).
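As usual, we can spot-check the proved statement in Python (a sanity check over a finite range, not a proof):

```python
# x**2 + 3*x is divisible by 2 for every integer x in the range checked.
assert all((x ** 2 + 3 * x) % 2 == 0 for x in range(-1000, 1001))
```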
Suppose we have a friend who is trying to convince us that a certain statement \(X\) is False. If they tell us that statement \(X\) is False because they tried really hard to come up with a proof of it and failed, we might believe them, or we might wonder if maybe they just missed a crucial idea leading to a correct proof. (Maybe they skipped all their CSC110 classes.) An absence of proof is not enough to convince us that the statement is False.
Instead, we must see a disproof, which is simply a proof that the negation of the statement is True. In other words, if we can prove that \(\lnot X\) is True, then \(X\) must be False. For this section, we’ll be using the simplification rules from Section 3.2 to make negations of statements easier to work with.
Disprove the following statement: every natural number divides 360.
This statement can be written as \(\forall n \in \N,~n \mid 360\). However, we want to prove that it is False, so we really need to study its negation. \[\begin{align*} &\lnot \big(\forall n \in \N,~n \mid 360 \big) \\ &\exists n \in \N,~ n \nmid 360 \end{align*}\]
The original statement is obviously not True: the number 7 doesn’t divide 360, for instance. Is that a proof? We wrote the negation of the statement in symbolic form above, and if we translate it back into English, we get “there exists a natural number which does not divide 360.” So, yes. That’s enough for a proof.
Let \(n = 7\).
Then \(n \nmid 360\), since \(\frac{360}{7} = 51.428\dots\) is not an integer.
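A counterexample search is itself an existential search, so it maps naturally onto Python; here we restrict the search to a finite range of our choosing:

```python
# Find the smallest natural number (at least 1) that does not divide 360.
counterexample = next(n for n in range(1, 361) if 360 % n != 0)
assert counterexample == 7   # 1 through 6 all divide 360; 7 does not
```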
When we want to disprove a universally-quantified statement (“every element of \(S\) satisfies predicate \(P\)”), the negation of that statement becomes an existentially-quantified one (“there exists an element of \(S\) that doesn’t satisfy predicate \(P\)”). Since proofs of existential quantification involve just finding one value, the disproof of the original statement involves finding a value which causes the predicate to be False (or alternatively, causes the negation of the predicate to be True). We call this value a counterexample for the original statement. In the previous example, we would say that 7 is a counterexample of the given statement.
A typical disproof of a universal (counterexample).
Given statement to disprove: \(\forall x \in S,~P(x)\).
We prove the negation, \(\exists x \in S,~\lnot P(x)\). Let \(x=\) _______.
[Proof that \(\lnot P\)(_______) is True.]
In this section, we’ll take a closer look at the greatest common divisor of two numbers. Recall the following definitions from Section 6.1, An Introduction to Number Theory.
Let \(x, y, d \in \Z\). We say that \(d\) is a common divisor of \(x\) and \(y\) when \(d\) divides \(x\) and \(d\) divides \(y\).
We say that \(d\) is the greatest common divisor of \(x\) and \(y\) when it is the largest number that is a common divisor of \(x\) and \(y\), or 0 when \(x\) and \(y\) are both 0. We can define the function \(\gcd : \Z \times \Z \to \N\) as the function which takes numbers \(x\) and \(y\), and returns their greatest common divisor.
To make it easier to translate this statement into symbolic form, we can restate the “maximum” part by saying that if \(e\) is any number which divides \(m\) and \(n\), then \(e \leq d\). Let \(m, n, d \in \Z\), and suppose \(d = \gcd(m, n)\). Then \(d\) satisfies the following statement:
\[\begin{align*} &\Big(m = 0 \land n = 0 \Rightarrow d = 0 \Big)~\land \\ &\Big(m \neq 0 \lor n \neq 0 \Rightarrow \\ & \qquad d \mid m \land d \mid n \land \big(\forall e \in \N,~e \mid m \land e \mid n \Rightarrow e \leq d\big) \Big) \end{align*}\]
This expression has a few subtleties. First, because we actually have separate definitions for \(\gcd(m, n)\) when both arguments are zero and when at least one of them is non-zero, these two definitions are expressed as two different implications. This is analogous to writing an if statement in Python. In this case, we’re saying that only one of the conclusions needs to be True, depending on which of the hypotheses is True.
Here is an example proof which makes use of both this definition, and the definition of prime.
Prove that for all integers \(p\) and \(q\), if \(p\) and \(q\) are distinct primes, then \(p\) and \(q\) are coprime, meaning \(\gcd(p, q) = 1\).
Here is an initial translation which focuses on the structure of the above statement, but doesn’t unpack any definitions: \[\forall p, q \in \Z,~\big(Prime(p) \land Prime(q) \land p \neq q\big) \IMP \gcd(p, q) = 1.\] We could unpack the definitions of \(Prime\) and gcd, but doing so would not add any insight at this point. While we will almost certainly end up using these definitions in the discussion and proof sections, expanding it here actually obscures the meaning of the statement. In general, use translation as a way of precisely specifying the structure of a statement; as we have seen repeatedly, the high-level structure of a statement is mimicked in the structure of its proof. And while you don’t need to expand every definition in a statement, you should always keep in mind that definitions referred to in the statement will require unpacking in the proof itself.
We know that primes don’t have many divisors, and that \(1\) is a common divisor for any pair of numbers. So to show that \(\gcd(p, q) = 1\), we just need to make sure that neither \(p\) nor \(q\) divides the other (otherwise that would be a common divisor larger than \(1\)).
Let \(p, q \in \Z\). Assume that \(p\) and \(q\) are both prime, and that \(p \neq q\). We want to prove that \(\gcd(p, q) = 1\).
By the definition of prime, we know that \(p \neq 1\) (since \(p > 1\)). Also by the definition of prime, the only positive divisors of \(q\) are \(1\) and \(q\) itself. So then since \(p \neq q\) (our assumption) and \(p \neq 1\), we know that \(p \NDIV q\).
Next, we know that \(1\) divides every number (we proved this in Section 6.2!), and so \(1\) is the only positive common divisor of \(p\) and \(q\), so \(\gcd(p, q) = 1\).
In the above proof, we did something new in the last paragraph: we referred to a statement we had proved to justify a step in the proof. This might sound kind of funny: after all, many of our proofs so far have relied on algebraic manipulations which are valid, but are really knowledge we learned prior to this course. The subtle difference is that we take those algebraic laws for granted as “obvious” because we learned them so long ago. But in fact, our proofs can consist of steps which are statements that we know are true because of an external source, even one that we don’t know how to prove ourselves.
This is a fundamental parallel between writing proofs and writing computer programs. In programming, we start with some basic building blocks of a language—data types, control flow constructs, etc.—but we often rely on libraries as well to simplify our tasks. We can use these libraries by reading their documentation and understanding how to use them, but don’t need to understand how they are implemented. In the same way, we can use an external theorem in our proof by understanding what it means, but without knowing how to prove it.
Let’s look at one example of this in action.
First, a “helper” definition:
Let \(m, n, a \in \Z\). We say that \(a\) is a linear combination of \(m\) and \(n\) when there exist \(p, q \in \Z\) such that \(a = pm + qn\).
For example, 101 is a linear combination of 5 and 3, since \(101 = 10 \cdot 5 + 17 \cdot 3\).
We can use this definition to state one fairly straightforward property of divisibility, and one surprising property of the greatest common divisor.
(Divisibility of Linear Combinations) Let \(m, n, d \in \Z\). If \(d\) divides \(m\) and \(d\) divides \(n\), then \(d\) divides every linear combination of \(m\) and \(n\).
(GCD Characterization) Let \(m, n \in \Z\), and assume at least one of them is non-zero. Then \(\gcd(m, n)\) is the smallest positive integer that is a linear combination of \(m\) and \(n\).
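Although we won’t prove the GCD Characterization Theorem here, it is constructive in spirit: an extended version of Euclid’s algorithm actually computes a linear combination equal to the gcd. Here is a sketch; `bezout` is a hypothetical helper name, and we assume \(m, n \geq 0\) with at least one of them non-zero.

```python
def bezout(m: int, n: int) -> tuple[int, int]:
    """Return (p, q) such that p * m + q * n == gcd(m, n).

    A sketch of the extended Euclidean algorithm, assuming m, n >= 0
    and at least one of m, n is non-zero.
    """
    # Invariants: x == p1 * m + q1 * n  and  y == p2 * m + q2 * n
    x, y = m, n
    p1, q1, p2, q2 = 1, 0, 0, 1
    while y != 0:
        k = x // y  # quotient when x is divided by y
        x, y = y, x - k * y
        p1, q1, p2, q2 = p2, q2, p1 - k * p2, q1 - k * q2
    return (p1, q1)
```

For example, `bezout(24, 16)` returns a pair `(p, q)` with `p * 24 + q * 16 == 8`.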
Next, we’ll see how to use these two theorems as “helpers” inside a proof of the following statement, which is yet another property of the greatest common divisor.
For all \(m, n, d \in \Z\), if \(d\) divides both \(m\) and \(n\) then \(d\) also divides \(\gcd(m, n)\).
We can translate this statement as follows: \[\forall m, n, d \in \Z,~ d \mid m \land d \mid n \Rightarrow d \mid \gcd(m, n).\]
This one is a bit tougher. All we know from the definition of gcd is that \(d \leq \gcd(m, n)\), but that doesn’t imply \(d \mid \gcd(m, n)\) by any means.
But given the context that we just discussed in the preceding paragraphs, I’d guess that we should also use the GCD Characterization Theorem to write \(\gcd(m, n)\) as \(pm + qn\). Oh, and the theorem before that one said that any number that divides \(m\) and \(n\) will divide \(pm + qn\) as well!
Let \(m, n, d \in \Z\). Assume that \(d \mid m\) and \(d \mid n\). We want to prove that \(d \mid \gcd(m, n)\). We’ll divide our proof into two cases. After reading the next two cases, answer: why did we need to divide our proof into cases? Is there another way we could have written this proof?
Case 1: assume \(m = 0\) and \(n = 0\).
In this case, by the definition of \(\gcd\) we know that \(\gcd(m, n) = 0\). So \(d \mid \gcd(m, n)\), since we assumed that \(d\) divides \(m\) and \(n\), which are 0.
Case 2: assume \(m \neq 0\) or \(n \neq 0\).
Then by the GCD Characterization Theorem, there exist integers \(p, q \in \Z\) such that \(\gcd(m, n) = pm + qn\). (This line uses a known external fact, an existential statement, to introduce two variables \(p\) and \(q\) to use in our proof.)
Then by the Divisibility of Linear Combinations Theorem, since \(d \mid m\) and \(d \mid n\) (by assumption), we know that \(d \mid pm + qn\).
Therefore \(d \mid \gcd(m, n)\).
In the previous section, we studied some mathematical properties of the greatest common divisor. Now in this section, we’ll look at how to implement algorithms for calculating the greatest common divisor, and introduce a new form of Python loops along the way.
In this chapter we have used the divides predicate (e.g., \(d \mid n\)) liberally. In Section 3.9, we saw a possible implementation of the predicate as a function called divides:
def divides(d: int, n: int) -> bool:
    """Return whether d divides n."""
    if d == 0:
        return n == 0
    else:
        return n % d == 0

With this function in hand, we can implement a gcd function as follows. (In this implementation, we use abs because m and/or n might be negative.)
def naive_gcd(m: int, n: int) -> int:
    """Return the gcd of m and n."""
    if m == 0:
        return abs(n)
    elif n == 0:
        return abs(m)
    else:
        possible_divisors = range(1, min(abs(m), abs(n)) + 1)
        return max({d for d in possible_divisors if divides(d, m) and divides(d, n)})

Here is the Quotient-Remainder Theorem we saw earlier in this chapter, slightly modified to allow for negative divisors as well.
(Quotient-Remainder Theorem) For all \(n \in \Z\) and \(d \in \Z\), if \(d \neq 0\) then there exist \(q \in \Z\) and \(r \in \N\) such that \(n = qd + r\) and \(0 \leq r < |d|\). Moreover, these \(q\) and \(r\) are unique for a given \(n\) and \(d\).
We say that \(q\) is the quotient when \(n\) is divided by \(d\), and that \(r\) is the remainder when \(n\) is divided by \(d\), and write \(r = n~\%~d\).
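For a positive divisor, Python’s // and % operators compute exactly this quotient and remainder. (A small sketch; note that for a negative divisor, Python’s % yields a remainder with the same sign as the divisor, which differs from the theorem’s convention, so we stick to \(d > 0\) here.)

```python
# Quotient and remainder when 360 is divided by 7, as in the theorem:
# n == q * d + r, with 0 <= r < |d|.
n, d = 360, 7
q, r = n // d, n % d
assert n == q * d + r
assert 0 <= r < abs(d)
```

Here \(q = 51\) and \(r = 3\), since \(360 = 51 \cdot 7 + 3\).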
We can use this theorem to improve our algorithm by breaking down the problem into a smaller one. The key idea is the following theorem.
For all \(a, b \in \Z\) where \(b \neq 0\), \(\gcd(a, b) = \gcd(b, a~\%~b)\).
\(\forall a, b \in \Z,~ b \neq 0 \Rightarrow \gcd(a, b) = \gcd(b, a~\%~b)\).
Before we try to prove this statement, let’s consider an example using the two numbers \(a = 24\) and \(b = 16\). We know that \(\gcd(24, 16) = 8\). Also, the remainder when 24 is divided by 16 is 8, and \(\gcd(16, 8) = 8\) as well.
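Before proving the statement in general, we can spot-check it computationally using math.gcd from Python’s standard library:

```python
import math

# The key property: gcd(a, b) == gcd(b, a % b) whenever b != 0.
assert math.gcd(24, 16) == math.gcd(16, 24 % 16) == 8

# Spot-check many pairs, including negative values of a.
for a in range(-30, 31):
    for b in range(1, 31):
        assert math.gcd(a, b) == math.gcd(b, a % b)
```

Of course, a finite check like this is evidence, not a proof; that’s what the proof below is for.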
Before we get to a formal proof, let’s preview the main idea. We’ll define the variable \(d = \gcd(b, a~\%~b)\), and prove that \(d = \gcd(a, b)\) as well. To do so, we’ll need to prove that \(d\) divides both \(a\) and \(b\), and that it is at least as large as every other common divisor of \(a\) and \(b\). Watch for this structure in our actual proof below!
Let \(a, b \in \Z\) and assume \(b \neq 0\). Also let \(r = a~\%~b\) (the remainder when \(a\) is divided by \(b\)). We need to prove that \(\gcd(a, b) = \gcd(b, r)\).
To do this, let \(d = \gcd(b, r)\). We’ll prove that \(d = \gcd(a, b)\) as well, by proving three things: that \(d \mid a\), that \(d \mid b\), and that every common divisor of \(a\) and \(b\) is \(\leq d\).
Part 1: proving that \(d \mid a\).
By our definition of \(r\) and the Quotient-Remainder Theorem, we know that there exists \(q \in \Z\) such that \(a = qb + r\). Since \(d = \gcd(b, r)\), we know that \(d\) divides both \(b\) and \(r\). And so by the Divisibility of Linear Combinations Theorem, we know that \(d \mid qb + r\), and so \(d \mid a\).
Part 2: proving that \(d \mid b\).
Since we defined \(d = \gcd(b, r)\), it must divide \(b\) (by the definition of \(\gcd\)).
Part 3: proving that every common divisor of \(a\) and \(b\) is \(\leq d\).
Let \(d_1 \in \Z\) and assume that \(d_1 \mid a\) and \(d_1 \mid b\). We’ll prove that \(d_1 \leq d\).
First, we’ll prove that \(d_1 \mid r\). We can rewrite the equation \(a = qb + r\) (from the Quotient-Remainder Theorem) to obtain \(r = a - qb\). Then using our assumption that \(d_1\) is a common divisor of \(a\) and \(b\) and Divisibility of Linear Combinations Theorem again, we can conclude that \(d_1 \mid r\).
So then \(d_1 \mid b\) (by our assumption), and \(d_1 \mid r\), and so it is a common divisor of \(b\) and \(r\). Therefore by the definition of \(\gcd\), we know that \(d_1 \leq \gcd(b, r) = d\).
The theorem we just proved suggests a possible way of computing the gcd of two numbers in an iterative (repeated) fashion. Let’s again use 24 and 16 as our example.
Let’s formalize this in a high-level description of an algorithm before we write the code. This algorithm for computing the gcd of two numbers is known as the Euclidean algorithm. This is named after the Greek mathematician Euclid, although he originally developed the algorithm using subtraction (\(a - b\)) rather than remainders (\(a~\%~b\)).
Euclidean Algorithm
Given: integers a and b. Returns: gcd(a, b).
1. Initialize two variables x, y to the given numbers a and b.
2. Let r be the remainder when x is divided by y.
3. Reassign x and y to y and r, respectively.
4. Repeat steps 2 and 3 until y is 0.
5. At this point, x refers to the gcd of a and b.

Here is how we can visualize the changing values of x and y for the values 24 and 16 in our previous example. (Note the similarity between this and the loop accumulation tables of Chapter 4.)
| Iteration | x | y |
|---|---|---|
| 0 | 24 | 16 |
| 1 | 16 | 8 |
| 2 | 8 | 0 |
The main question for us in implementing this algorithm in Python is how we achieve step 4: repeating the two previous steps until some condition (“y is 0”) is satisfied. We know how to use for loops to iterate over a collection of values. This allowed us to repeat a sequence of statements (i.e., the body of the for loop) on every iteration. Naturally, the for loop ends when the statements have been repeated for all elements in a collection or range.
But in the case of step 4, we would like to repeat code based on some condition: “Repeat steps 2 and 3 until the remainder is 0”. In these scenarios, we must use a different kind of loop in Python: the while loop.
A while loop looks very similar to an if statement:
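Sketched in its general form (with placeholder names, so this is not runnable code):

```
while <condition>:
    <body>    # one or more statements, indented like an if statement's body
```

The condition is evaluated first; if it evaluates to False right away, the body never runs at all.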
Unlike an if statement, after executing its body the while loop will check the condition again. If the condition still evaluates to True, then the body is repeated. Let’s try an example:
>>> numbers = []
>>> number = 1
>>> while number < 100:
... numbers.append(number)
... number = number * 2
...
>>> numbers
[1, 2, 4, 8, 16, 32, 64]

Notice how number appears in both the while loop’s body and its condition. In the loop body, number doubles at each iteration (we accumulated its values in the list numbers). Eventually, number refers to the value 128 and the while loop is done because 128 < 100 evaluates to False. Note that the number of iterations of our while loop depends on the initial value of number. Had we started with a value of, for example, 10, the loop would have had only 4 iterations (not 7, as when number started with 1). Similarly, if number was initially some value greater than or equal to 100, then the while loop would never have executed its body (just as a for loop does not execute its body if given an empty collection).
Here is our (first) implementation of the Euclidean algorithm for computing the gcd of two numbers.
def euclidean_gcd(a: int, b: int) -> int:
    """Return the gcd of a and b."""
    # Step 1: initialize x and y
    x = a
    y = b
    while y != 0:  # Step 4: repeat Steps 2 and 3 until y is 0
        # Step 2: calculate the remainder of x divided by y
        r = x % y
        # Step 3: reassign x and y
        x = y
        y = r
    # Step 5: x now refers to the gcd of a and b
    return x

How does this loop work? To understand it better, let’s see how this maps onto our original algorithm.
- Step 1, the initialization of x and y, occurs in the code before the while loop begins.
- Steps 2 and 3 form the body of the while loop.
- Step 4 is phrased as a stopping condition (“until y is 0”). With while loops, however, we must write a continuing condition, which is the negation of the stopping condition. So “until \(y = 0\)” becomes while y != 0.

Let’s see an example trace of the euclidean_gcd loop for the sample call euclidean_gcd(24, 16):
| Iteration | x | y |
|---|---|---|
| 0 | 24 | 16 |
| 1 | 16 | 8 |
| 2 | 8 | 0 |
In our implementation, we don’t have a typical accumulator pattern. Instead, both x and y are loop variables for the while loop, which illustrates one major difference between while loops and for loops. In a for loop, the loop variable is initialized and reassigned automatically by the Python interpreter to each element of the collection being looped over. In a while loop, the loop variable(s) must be initialized and reassigned explicitly in code that we write.
This difference makes while loops more flexible than for loops, as the programmer has full control over exactly how the loop variable changes. This is both a strength and a weakness! While loops can be used to express algorithms that are cumbersome or impossible to express with for loops, but at the cost of requiring the programmer to write more code to keep track of loop variables. Remember: the more code you write, the more potential there is for error. So a good rule of thumb is to use for loops where possible (when you have an explicit collection to loop over), and reserve while loops for situations that can’t be easily implemented with a for loop.
One subtlety of our loop body is the order in which the loop variables are updated. Suppose we had swapped the last two lines of the loop body:
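Concretely, the swapped version behaves like this sketch (using the sample values 24 and 16, where the correct answer is 8):

```python
x, y = 24, 16
while y != 0:
    r = x % y
    y = r  # y is updated first...
    x = y  # ...so x receives the NEW value of y, not the old one

# The buggy loop leaves x == 0 instead of gcd(24, 16) == 8.
```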
This is a really easy change to make, but also incorrect: because the statement y = r is executed first, the next statement x = y assigns x to the new value of y rather than its old one!
When performing reassignment of multiple variables, where the new variable values depend on the old ones, it is important to keep track of the reassignment order so that you don’t accidentally lose previous variable values. To avoid this problem altogether, Python has a neat feature called parallel assignment, in which multiple variables can be assigned in the same statement.
Here is how we can rewrite the loop body using parallel assignment:
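In context (shown here as a runnable sketch with the sample values 24 and 16), the loop body becomes:

```python
x, y = 24, 16
while y != 0:
    r = x % y
    x, y = y, r  # parallel assignment: both right-hand values use the OLD x and y
```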
The assignment statement x, y = y, r is evaluated as follows:
1. First, the right-hand side y, r is evaluated, producing two objects. (Or more precisely, the ids of two objects.)
2. Then, the variables on the left-hand side are reassigned: x to the first object, and y to the second.

In parallel assignment, the right-hand side is fully evaluated before any variable reassignment occurs. This means that the assignment statement x, y = y, r has the same effect as y, x = r, y: order doesn’t matter, and so we can think of each variable assignment happening in parallel, without one affecting the other.
Parallel assignment is a very useful tool when reassigning variables, so please take advantage of it to help simplify your code and avoid the “update order” problem of variable reassignment. Here is how we can rewrite euclidean_gcd using parallel assignment:
def euclidean_gcd(a: int, b: int) -> int:
    """Return the gcd of a and b."""
    x, y = a, b
    while y != 0:
        r = x % y
        x, y = y, r
    return x

Our implementation of euclidean_gcd doesn’t follow a typical pattern of code we’ve seen so far. If we didn’t know anything about the algorithm and were simply looking at the code, it would be quite mysterious why it works. To improve the readability of this code, we want some way of documenting what we know about the loop variables x and y inside the loop body.
Recall that the Euclidean Algorithm relies on one key property, that gcd(x, y) == gcd(y, x % y). At each loop iteration, x and y are updated so that x = y and y = x % y. The key property that we want to capture is that even though x and y change, their gcd doesn’t. Since x and y are initialized to a and b, another way to express this is that at every loop iteration, gcd(x, y) == gcd(a, b). We call this statement a loop invariant, which is a property about loop variables that must be true at the start and end of each loop iteration. This is similar to representation invariants, which are properties of instance attributes that must be true for every instance of a given data class.
By convention, we document loop invariants at the top of a loop body using an assert statement.
def euclidean_gcd(a: int, b: int) -> int:
    """Return the gcd of a and b."""
    x, y = a, b
    while y != 0:
        # Loop invariant (we use naive_gcd to check that the gcds are equal)
        assert naive_gcd(x, y) == naive_gcd(a, b)
        r = x % y
        x, y = y, r
    return x

Because this loop invariant must be true at the start and end of each loop iteration, it is also true after the loop stops (i.e., when y == 0). In this case, the loop invariant tells us that gcd(x, 0) == gcd(a, b), and so we know that x == gcd(a, b), which is why x is returned.
Loop invariants are a powerful way to document properties of our code, to better enable us to reason about our code. But remember that loop invariants by themselves are just statements; the only way to know for sure whether a loop invariant is correct is to do a proof, much like the one we did at the beginning of this section.
In this section, we’ll explore some properties of modular arithmetic that will be useful in the next chapter, when we study cryptographic algorithms based on modular arithmetic. First, recall the definition of modular equivalence from 6.1 An Introduction to Number Theory.
Let \(a, b, n \in \Z\), and assume \(n \neq 0\). We say that \(a\) is equivalent to \(b\) modulo \(n\) when \(n \mid a - b\). In this case, we write \(a \equiv b \pmod n\). (One warning: the notation \(a \equiv b \pmod n\) is not exactly the same as the mod or % operator you are familiar with from programming; here, both \(a\) and \(b\) could be much larger than \(n\), or even negative.)
This definition captures the idea that \(a\) and \(b\) have the same remainder when divided by \(n\). You should think of this congruence relation as being analogous to numeric equality, with a relaxation. When we write \(a = b\), we mean that the numeric values of \(a\) and \(b\) are literally equal. When we write \(a \equiv b \pmod n\), we mean that if you look at the remainders of \(a\) and \(b\) when divided by \(n\), those remainders are literally equal.
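The definition translates directly into Python. (equiv_mod is a hypothetical helper name, used only for illustration.)

```python
def equiv_mod(a: int, b: int, n: int) -> bool:
    """Return whether a is equivalent to b modulo n, i.e., whether n divides a - b.

    Preconditions:
    - n != 0
    """
    return (a - b) % n == 0
```

For example, equiv_mod(17, 3, 7) returns True (since \(7 \mid 14\)), while equiv_mod(5, 3, 7) returns False.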
We will next look at how addition, subtraction, and multiplication all behave in an analogous fashion under modular arithmetic. The following proof is a little tedious because it is calculation-heavy; the main benefits here are practicing reading and using a new definition, and getting comfortable with this particular notation.
For all \(a, b, c, d, n \in \Z\), if \(n \neq 0\), \(a \equiv c \pmod n\), and \(b \equiv d \pmod n\), then:

1. \(a + b \equiv c + d \pmod n\)
2. \(a - b \equiv c - d \pmod n\)
3. \(ab \equiv cd \pmod n\)
We will only show how to translate and prove (2), and leave (1) and (3) as exercises. \[\forall a, b, c, d, n \in \Z,~ \big(n \neq 0 \AND (n \DIV a - c) \AND (n \DIV b - d) \big) \IMP n \DIV (a - b) - (c - d).\]
Let \(a, b, c, d, n \in \Z\). Assume that \(n \neq 0\), \(n \DIV a - c\), and \(n \DIV b - d\). This means we want to prove that \(n \DIV (a-c) - (b-d).\)
By the Divisibility of Linear Combinations Theorem, since \(n \DIV (a-c)\) and \(n \DIV (b - d)\), it divides their difference:
\[\begin{align*} n &\DIV (a-c) - (b-d) \\ n &\DIV (a-b) - (c-d) \end{align*}\]
The above example stated that addition, subtraction, and multiplication all preserve modular equivalence—but what about division? The following statement is a “divide by \(k\)” property, but is actually False. (A good exercise is to disprove this statement!) \[ \forall a, b, k, n \in \Z,~ n > 0 \land ak \equiv bk \pmod n \Rightarrow a \equiv b \pmod n \]
For the real numbers, division \(\frac{x}{y}\) has a single gap: when \(y = 0\). As we’ll see in the next theorem, division in modular arithmetic can have many such gaps, but we can also predict exactly where these gaps will occur.
(Modular inverse) Let \(n \in \Z^+\) and \(a \in \Z\). If \(\gcd(a, n) = 1\), then there exists \(p \in \Z\) such that \(ap \equiv 1 \pmod n\).
We call this \(p\) a modular inverse of \(a\) modulo \(n\).
\(\forall n \in \Z^+, \forall a \in \Z,~ \gcd(a, n) = 1 \Rightarrow \big(\exists p \in \Z,~ ap \equiv 1 \pmod n \big)\)
Let \(n \in \Z^+\) and \(a \in \Z\). Assume \(\gcd(a, n) = 1\).
Since \(\gcd(a, n) = 1\), by the GCD Characterization Theorem we know that there exist integers \(p\) and \(q\) such that \(pa + qn = \gcd(a, n) = 1\).
Rearranging this equation, we get that \(pa - 1 = qn\), and so (by the definition of divisibility, taking \(k = q\)), \(n \mid pa - 1\).
Then by the definition of modular equivalence, \(pa \equiv 1 \pmod n\).
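For small moduli, we can even find a modular inverse by brute-force search. (modular_inverse is a hypothetical helper name; by the theorem, the search succeeds whenever \(\gcd(a, n) = 1\).)

```python
def modular_inverse(a: int, n: int) -> int:
    """Return the smallest p with 0 < p < n such that a * p % n == 1.

    Preconditions:
    - n > 1 and gcd(a, n) == 1 (so the theorem guarantees that p exists)
    """
    for p in range(1, n):
        if (a * p) % n == 1:
            return p
```

For example, modular_inverse(3, 7) returns 5, since \(3 \cdot 5 = 15 \equiv 1 \pmod 7\).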
From this theorem about modular inverses, we can build up a form of division for modular arithmetic. To gain some intuition, first think about division \(\frac{a}{b}\) as the solution to an equation of the form \(ax = b\). We’ll turn this into a statement about modular equivalence now.
Let \(a \in \Z\) and \(n \in \Z^+\). If \(\gcd(a, n) = 1\), then for all \(b \in \Z\), there exists \(k \in \Z\) such that \(ak \equiv b \pmod n\).
This statement is quite complex! Remember that we focus on translation to examine the structure of the statement, so that we know how to set up a proof. We aren’t going to expand every single definition for the sake of expanding definitions.
\[\forall n \in \Z^+, \forall a \in \Z,~ \gcd(a, n) = 1 \Rightarrow \big(\forall b \in \Z,~ \exists k \in \Z,~ ak \equiv b \pmod n \big).\]
So this is saying that under the given assumptions, \(b\) is “divisible” by \(a\) modulo \(n\). This comes after the theorem about modular inverses, so that should be useful. The conclusion is “there exists a \(k \in \Z\) such that…” so that I know that at some point I’ll need to define a variable \(k\) in terms of \(a\), \(b\), and/or \(n\), which satisfies the congruence.
I notice that the hypothesis here (\(\gcd(a, n) = 1\)) matches with the hypothesis from the previous theorem, so that seems to be something I can use. That gives me a \(p \in \Z\) such that \(ap \equiv 1 \pmod n\)…
Wait, I can multiply both sides by \(b\), right?!
Let \(a \in \Z\) and \(n \in \Z^+\). Assume \(\gcd(a, n) = 1\), and let \(b \in \Z\). We want to prove that there exists \(k \in \Z\) such that \(ak \equiv b \pmod n\).
First, using the previous Modular Inverses theorem, since we assumed \(\gcd(a, n) = 1\), we know that there exists \(p \in \Z\) such that \(ap \equiv 1 \pmod n\).
Second, we know from (3) of our first example above that modular equivalence preserves multiplication, and so we know \(apb \equiv b \pmod n\).
Then we let \(k = pb\), and we have that \(ak \equiv b \pmod n\).
These two theorems bring together elements from all of our study of proofs so far. We have both types of quantifiers, mixed with a larger implication. We used the GCD Characterization Theorem for a key step in our proof. This illustrates the power of separating ideas into different statements and using each one to prove the next, just like we separate code into different functions in our programs!
The last ingredient we’ll need to understand for our study of cryptography next week is the patterns that emerge when it comes to exponentiation in modular arithmetic. In normal arithmetic, powers of positive integers increase without bound, but in modular arithmetic we can focus on the remainders of powers, and discover some wonderful properties. For example, \(10^{13}\) is a very large number indeed, but \(10^{13} \equiv 3 \pmod 7\)! In fact, because there are only a finite number of remainders for any given \(n \in \Z^+\), for any \(a \in \Z\) the infinite sequence of remainders of \(a^0\), \(a^1\), \(a^2\), \(a^3\), \(\dots\) must repeat at some point.
For example, let’s see what happens for each of the possible bases modulo 7. (Because exponentiation by positive integers corresponds to repeated multiplication, which behaves “nicely” with modular arithmetic, the bases \(0\) through \(6\) cover all possible integers. For example, because \(10 \equiv 3 \pmod 7\), we also know that \(10^{13} \equiv 3^{13} \pmod 7\).)
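These sequences of remainders are easy to generate, using Python’s three-argument pow for modular exponentiation:

```python
# For each base a in {0, ..., 6}, the remainders of a^1, a^2, ..., a^7 modulo 7.
for a in range(7):
    print(a, [pow(a, i, 7) for i in range(1, 8)])

# Base 2, for example, yields [2, 4, 1, 2, 4, 1, 2]: the cycle [2, 4, 1] repeats.
```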
No matter which base we start with, we enter a cycle. For example, the cycle starting with 2 is \([2, 4, 1, 2, \dots]\). We say this cycle has length 3, since it takes three elements in the sequence for the 2 to repeat. Here are the cycle lengths for each possible \(a \in \{0, 1, \dots, 6\}\):
| \(a\) | Cycle length |
|---|---|
| 0 | 1 |
| 1 | 1 |
| 2 | 3 |
| 3 | 6 |
| 4 | 3 |
| 5 | 6 |
| 6 | 2 |
For each base other than 0, there is another way of looking at the cycle length: the cycle length for base \(a\) is the smallest positive integer \(k\) such that \(a^k \equiv 1 \pmod 7\). For example, \(2^3 \equiv 1 \pmod 7\), and the cycle repeats at \(2^4 \equiv 2^3 \cdot 2 \equiv 2 \pmod 7\).
This “cycle length” is a fundamental property of modular exponentiation, and warrants its own definition.
Let \(a \in \Z\) and \(n \in \Z^+\). We define the order of \(a\) modulo \(n\) to be the smallest positive integer \(k\) such that \(a^k \equiv 1 \pmod n\), when such a number exists.
We denote the order of \(a\) modulo \(n\) as \(\text{ord}_n(a)\).
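Following the definition, here is a sketch that computes \(\text{ord}_n(a)\) by repeated multiplication. (order is a hypothetical helper name; the function assumes the order exists, and would loop forever otherwise, e.g. for \(a = 0\).)

```python
def order(a: int, n: int) -> int:
    """Return the order of a modulo n: the smallest k >= 1 with a ** k % n == 1.

    Preconditions:
    - n > 1 and some positive power of a is equivalent to 1 modulo n
    """
    k = 1
    power = a % n
    while power != 1:
        power = (power * a) % n
        k = k + 1
    return k
```

For example, order(2, 7) returns 3 and order(3, 7) returns 6, matching the cycle lengths in the table above.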
Something you might notice from our above table is that the cycle length for the remainders modulo 7 always divides 6. Here is another table, this time for modulo 17.
| \(a\) | Cycle length |
|---|---|
| 0 | 1 |
| 1 | 1 |
| 2 | 8 |
| 3 | 16 |
| 4 | 4 |
| 5 | 16 |
| 6 | 16 |
| 7 | 16 |
| 8 | 8 |
| 9 | 8 |
| 10 | 16 |
| 11 | 16 |
| 12 | 16 |
| 13 | 4 |
| 14 | 16 |
| 15 | 8 |
| 16 | 2 |
A similar pattern emerges: the cycle length for these bases always divides 16, which is one less than 17. And again, for each base \(a\) other than 0, the cycle length corresponding to \(a\) is the least positive integer \(k\) such that \(a^k \equiv 1 \pmod{17}\).
Here is one more interesting fact about cycle length: because it is a number \(k\) such that \(a^k \equiv 1 \pmod{17}\), any multiple \(n\) of \(k\) also satisfies \(a^n \equiv 1 \pmod{17}\). For example, \(13^4 \equiv 1 \pmod{17}\), and so \(13^{40} \equiv (13^4)^{10} \equiv 1^{10} \equiv 1 \pmod{17}\).
Combining these two observations allows us to conclude that, at least for 17, every base \(a\) other than 0 satisfies \(a^{16} \equiv 1 \pmod{17}\). It is a remarkable fact that this turns out to generalize to every prime number. Proving this theorem is beyond the scope of this course, but we’ll state it formally here to let you marvel at it for a moment.
(Fermat’s Little Theorem) Let \(p, a \in \Z\) and assume \(p\) is prime and that \(p \nmid a\). Then \(a^{p - 1} \equiv 1 \pmod p\).
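While we won’t prove Fermat’s Little Theorem, it is easy to spot-check computationally, e.g. for \(p = 17\):

```python
p = 17
# Every a in {1, ..., p - 1} is not divisible by the prime p,
# so the theorem says a ** (p - 1) % p == 1 for all of them.
assert all(pow(a, p - 1, p) == 1 for a in range(1, p))
```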
Fermat’s Little Theorem is quite beautiful in its own right, but is limited in scope to prime numbers. It turns out that the key to generalizing this theorem lies with our very last definition in this chapter.
We define the function \(\varphi : \Z^+ \to \N\), called the Euler totient function (or Euler phi function), as follows:
\[\varphi(n) = \big| \big\{ a \mid a \in \{1, \dots, n - 1\},~ \text{and $\gcd(a, n) = 1$} \big\} \big|.\]
Here are some examples of the Euler totient function:
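The definition also translates directly into code, which we can use to generate examples. (phi is a hypothetical helper name; math.gcd comes from the standard library.)

```python
import math

def phi(n: int) -> int:
    """Return the Euler totient of n, computed directly from the definition above."""
    return len([a for a in range(1, n) if math.gcd(a, n) == 1])
```

For instance, phi(12) == 4 (the numbers 1, 5, 7, and 11 are coprime to 12), and phi(17) == 16 (every number in \(\{1, \dots, 16\}\) is coprime to the prime 17).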
With the Euler totient function in hand, we can now state the generalization of Fermat’s Little Theorem, which is something we’ll use in the next chapter.
(Euler’s Theorem). For all \(a \in \Z\) and \(n \in \Z^+\), if \(\gcd(a, n) = 1\) then \(a^{\varphi(n)} \equiv 1 \pmod n\).
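As with Fermat’s Little Theorem, we can spot-check Euler’s Theorem computationally; here for \(n = 12\), where \(\varphi(12) = 4\):

```python
import math

n = 12  # the numbers in {1, ..., 11} coprime to 12 are 1, 5, 7, 11, so phi(12) == 4
for a in range(1, 40):
    if math.gcd(a, n) == 1:
        assert pow(a, 4, n) == 1
```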
So far we’ve seen how the data types we introduced in Chapter 1 can be used to store a variety of different data. In our modern world, data is constantly being created, stored, sent, and received. But not all data is created equal; some data is inherently more sensitive than other data. And there are laws mandating the privacy of your data in Canada. Thanks to the explosion of data and the evolution of privacy policy, there are numerous technologies (backed by a strong theoretical underpinning) being developed to ensure data privacy.
After our work from last week, we now have the theoretical foundations necessary to learn about one of the coolest applications of number theory in computer science: encrypting messages so that only the sender and receiver can read them. (Check out the movie The Imitation Game, which is about some amazing codebreaking work done in World War II, a crucial piece in the history of computing.) This is only one method for ensuring data privacy, but it is pervasive: nearly every time you send or receive something on your phone or web browser, cryptography plays a role. In this section, you’ll learn about the basics of cryptography, which is the study of theoretical and practical techniques for keeping data secure.
Cryptography is the study of techniques used to keep communication secure in the face of adversaries who wish to eavesdrop on or interfere with the communication. Defining what secure communication between two parties means is complex, and involves several dimensions, such as confidentiality, data integrity, and authentication. In this chapter we will focus primarily on encryption, which involves turning coherent messages into seemingly-random nonsensical strings, and then back again.
As computers have become more powerful, cryptographic technologies have evolved to ensure that the “nonsense” strings are not easily converted back to the coherent message except by the intended recipient(s). But the growing power of computers is a double-edged sword; while cryptographic technologies have evolved, so have the technologies of malicious attackers and eavesdroppers who want to decipher the “nonsense” strings and gain access to sensitive data, such as passwords and social insurance numbers.
The simplest setup that we study in cryptography is two-party confidential communication. In this setup, we have two people, Alice and Bob, who wish to send messages to each other that only they can read, and a third person, Eve, who has access to all of the communications between Alice and Bob, and wants to discover what they’re saying.
Since Eve has access to the communications between Alice and Bob, they can’t just send their messages directly. So instead, Alice and Bob need to encrypt their messages using some sort of encryption algorithm, and send the encrypted versions to each other instead. The hope is that through some shared piece of information called a secret key, Alice and Bob can encrypt their messages in such a way that they will each be able to decrypt each other’s messages, but Eve won’t be able to decrypt the messages without knowing their secret key.
More formally, we define a secure symmetric-key cryptosystem as a system with the following parts:
A set \(\mathcal{P}\) of possible original messages, called plaintext messages. (E.g., a set of strings)
A set \(\mathcal{C}\) of possible encrypted messages, called ciphertext messages. (E.g., another set of strings)
A set \(\mathcal{K}\) of possible shared secret keys (known by both Alice and Bob, but no one else).
Two functions \(Encrypt : \mathcal{K} \times \mathcal{P} \to \mathcal{C}\) and \(Decrypt : \mathcal{K} \times \mathcal{C} \to \mathcal{P}\) that satisfy the following two properties:
One of the earliest examples we have of a symmetric-key cryptosystem is the Caesar cipher, named after the Roman general Julius Caesar. In this system, the plaintext and ciphertext sets are simply strings, and the secret key is some positive integer \(k\).
The idea of this cryptosystem, as well as the starting point of many others, is to associate characters with numbers, because we can do more things with numbers. In this example, we’ll first only consider messages that consist of uppercase letters and spaces, and associate each of these letters with a number as follows:
| Character | Value | Character | Value |
|---|---|---|---|
| 'A' | 0 | 'O' | 14 |
| 'B' | 1 | 'P' | 15 |
| 'C' | 2 | 'Q' | 16 |
| 'D' | 3 | 'R' | 17 |
| 'E' | 4 | 'S' | 18 |
| 'F' | 5 | 'T' | 19 |
| 'G' | 6 | 'U' | 20 |
| 'H' | 7 | 'V' | 21 |
| 'I' | 8 | 'W' | 22 |
| 'J' | 9 | 'X' | 23 |
| 'K' | 10 | 'Y' | 24 |
| 'L' | 11 | 'Z' | 25 |
| 'M' | 12 | ' ' | 26 |
| 'N' | 13 | | |
In Python, we can implement this conversion as follows:
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '
def letter_to_num(c: str) -> int:
    """Return the number that corresponds to the given letter.

    Preconditions:
    - len(c) == 1 and c in LETTERS
    """
    return str.index(LETTERS, c)


def num_to_letter(n: int) -> str:
    """Return the letter that corresponds to the given number.

    Preconditions:
    - 0 <= n < len(LETTERS)
    """
    return LETTERS[n]

In the Caesar cipher, the secret key \(k\) is an integer from the set \(\{1, 2, \dots, 26\}\). So before sending any messages, Alice and Bob meet and decide on a secret key from this set.
Now when Alice wants to send a string message \(m\) to Bob, she encrypts her message as follows:

1. Convert each character of \(m\) into a number, using its index in LETTERS. Note that the space character comes after Z.
2. Add the secret key \(k\) to each number, taking the result modulo len(LETTERS).
3. Convert each resulting number back into a character.

For example, if \(k = 3\), and the plaintext message is 'HAPPY', encryption happens as follows:
| Plaintext character | Corresponding Integer | Shifted Integer | Ciphertext character |
|---|---|---|---|
| 'H' | 7 | 10 | 'K' |
| 'A' | 0 | 3 | 'D' |
| 'P' | 15 | 18 | 'S' |
| 'P' | 15 | 18 | 'S' |
| 'Y' | 24 | 0 | 'A' |
The corresponding ciphertext is 'KDSSA'. Note that the Y, when shifted by 3, wraps around to become A.
Then when Bob receives the ciphertext 'KDSSA', he decrypts the ciphertext by applying the corresponding shift in reverse (subtracting the secret key \(k\) instead of adding it). We can implement this in Python as follows. (Note that we’ve dropped the _so_far suffix on these accumulator variables now that you’re more experienced at writing loops!)
def encrypt_caesar(k: int, plaintext: str) -> str:
    """Return the encrypted message using the Caesar cipher with key k.

    Preconditions:
    - all({x in LETTERS for x in plaintext})
    - 1 <= k <= 26
    """
    ciphertext = ''
    for letter in plaintext:
        ciphertext = ciphertext + num_to_letter((letter_to_num(letter) + k) % len(LETTERS))
    return ciphertext


def decrypt_caesar(k: int, ciphertext: str) -> str:
    """Return the decrypted message using the Caesar cipher with key k.

    Preconditions:
    - all({x in LETTERS for x in ciphertext})
    - 1 <= k <= 26
    """
    plaintext = ''
    for letter in ciphertext:
        plaintext = plaintext + num_to_letter((letter_to_num(letter) - k) % len(LETTERS))
    return plaintext

In our example above, we restricted ourselves to only upper-case letters and spaces. But the key mathematical idea of the Caesar cipher, shifting letters based on a secret key \(k\) used as an offset, generalizes to larger sets of letters.
To see how to do this, first we recall two built-in Python functions from Section 2.8 Application: Representing Text:
>>> ord('A')  # Convert a character into an integer
65
>>> chr(33)  # Convert an integer into a character
'!'

Using these two functions, we can modify our encrypt and decrypt functions in the Caesar cipher to operate on arbitrary Python strings. For simplicity, we’ll stick only to the first 128 characters, which are known as the ASCII characters. (You might recall from Section 2.8 that ASCII is one of the earliest standards for encoding characters as natural numbers on a computer.) Our secret key will now take on values from the set \(\{1, 2, \dots, 127\}\).
def encrypt_ascii(k: int, plaintext: str) -> str:
    """Return the encrypted message using the Caesar cipher with key k.

    Preconditions:
    - all({ord(c) < 128 for c in plaintext})
    - 1 <= k <= 127

    >>> encrypt_ascii(4, 'Good morning!')
    'Kssh$qsvrmrk%'
    """
    ciphertext = ''
    for letter in plaintext:
        ciphertext = ciphertext + chr((ord(letter) + k) % 128)
    return ciphertext


def decrypt_ascii(k: int, ciphertext: str) -> str:
    """Return the decrypted message using the Caesar cipher with key k.

    Preconditions:
    - all({ord(c) < 128 for c in ciphertext})
    - 1 <= k <= 127

    >>> decrypt_ascii(4, 'Kssh$qsvrmrk%')
    'Good morning!'
    """
    plaintext = ''
    for letter in ciphertext:
        plaintext += chr((ord(letter) - k) % 128)
    return plaintext

WARNING: in practice, the Caesar cipher is not secure, as it is very possible for an eavesdropper to simply try all possible secret keys to decrypt a ciphertext, and pick out the most likely message that Alice sent. So while this example is good for educational purposes, you should definitely not use this cryptosystem for any real-world applications!
The Caesar cipher we studied in the previous section is simple enough as a starting point, but should never be used in practice! It suffers from the fatal flaw that each character of the plaintext is encrypted individually, using the same secret key each time. So for example, every occurrence of the character 'D' in the plaintext is transformed into the same character in the ciphertext. Why is this a problem?
Consider the ciphertext 'OLaTO+T^+NZZW' generated by the ASCII-based Caesar cipher. Even though it may look indecipherable at first, there is information that we can learn about the original plaintext just by looking at the distribution of letters in the ciphertext. Given these observations and the hint that the plaintext is a common phrase used in CSC110, can you determine the plaintext?
In particular:

- The first and fifth plaintext characters must be the same, since both are encrypted to the same character 'O' in the ciphertext.
- A Caesar shift preserves differences between the ord of each character. Since ord('O') = 79 and ord('N') = 78, we know that the first and tenth characters of the plaintext must be consecutive ASCII characters.

In addition to what we can infer from the distribution of letters in the ciphertext, the ASCII-based Caesar cipher is vulnerable to a brute-force exhaustive key search attack. There are only 128 possible secret keys the cipher could use (corresponding to the possible remainders modulo 128). So, given a ciphertext, it is possible to try out every secret key and see which key yields a meaningful plaintext message. For most ciphertexts generated from English plaintexts, only one possible secret key causes the decrypted message to be a meaningful English message. That’s not very secure.
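The exhaustive key search described above takes only a few lines of Python. Here is a sketch of the attack (our own, with decrypt_ascii reproduced in condensed form so the snippet is standalone); it tries every key against the ciphertext from the earlier doctest and prints the candidates consisting only of letters, spaces, and basic punctuation:

```python
def decrypt_ascii(k: int, ciphertext: str) -> str:
    """Decrypt using the ASCII-based Caesar cipher with key k."""
    return ''.join(chr((ord(c) - k) % 128) for c in ciphertext)


ciphertext = 'Kssh$qsvrmrk%'

# Try all possible keys and print any candidate plaintext made up
# only of letters, spaces, and basic punctuation.
for k in range(1, 128):
    candidate = decrypt_ascii(k, ciphertext)
    if all(c.isalpha() or c in ' !.,' for c in candidate):
        print(k, repr(candidate))  # among these, k = 4 gives 'Good morning!'
```

Even on this short message, the filter leaves at most a handful of candidates for a human to inspect.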
Even if we enlarge the set of possible keys (e.g., by using a more general text encoding like UTF8), Caesar ciphers are still vulnerable to observations like the ones we made earlier. From these observations, we can identify “likely” keys that a brute force search could try first. So the main weakness of the Caesar cipher is not just the number of possible keys.
We will now introduce a new symmetric-key cryptosystem known as the one-time pad that is structurally similar to the Caesar cipher, but avoids the issues we raised earlier. Encryption in the one-time pad works by shifting each character in the plaintext message, much like the Caesar cipher. But where the one-time pad differs is that the shift is not the same for each character. The one-time pad accomplishes this by not using a single number for the secret key, but rather a string of length greater than or equal to the length of the plaintext message you wish to encrypt. This secret key is colloquially referred to as a “one-time pad” (of characters), from which this cryptosystem gets its name.
To encrypt a plaintext ASCII message \(m\) with secret key \(k\), for each index \(i\) between 0 and \(|m| - 1\), we compute the ciphertext character \((m[i] + k[i]) ~\%~ 128\), where we treat each character as its corresponding ASCII number.
Here is an example. Suppose we wanted to encrypt the plaintext 'HELLO' with the secret key 'david'. The ciphertext will have five characters, where the first is 'H' + 'd' which results in ',', the second is 'E' + 'A' which results in '&', etc. The following diagram shows the full conversion:

Similarly, for decryption we take the ciphertext c and recover the plaintext by subtracting each letter of the secret key: \((c[i] - k[i]) ~\%~ 128\).
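The two formulas translate into a minimal sketch of one-time pad encryption and decryption over the 128 ASCII characters (the function names otp_encrypt and otp_decrypt are ours, not from the text):

```python
def otp_encrypt(key: str, plaintext: str) -> str:
    """Encrypt plaintext by shifting each character by the corresponding
    character of key, modulo 128.

    Assumes len(key) >= len(plaintext) and all characters are ASCII.
    """
    return ''.join(chr((ord(m) + ord(k)) % 128)
                   for m, k in zip(plaintext, key))


def otp_decrypt(key: str, ciphertext: str) -> str:
    """Reverse otp_encrypt by subtracting each key character, modulo 128."""
    return ''.join(chr((ord(c) - ord(k)) % 128)
                   for c, k in zip(ciphertext, key))


print(otp_encrypt('david', 'HELLO'))   # ,&B53  (matching the example above)
print(otp_decrypt('david', ',&B53'))   # HELLO
```

Note that encrypting 'FUNNY' with key 'fQtgZ' also produces ',&B53', which previews the perfect-secrecy discussion below.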
The one-time pad cryptosystem is famous in cryptography for having a property known as perfect secrecy, a term coined by the mathematician and cryptographer Claude Shannon in 1949, which informally means that a ciphertext reveals no information about its corresponding plaintext other than its length. To see why, take our previous example, with ciphertext ',&B53'. This ciphertext could have been generated by any five-letter plaintext message, because for any such message there exists a secret key that could encrypt that message to obtain ',&B53'. The sender could have been sending plaintext message 'HELLO' with secret key 'david', but it is equally likely they could have been sending the message 'FUNNY' with secret key 'fQtgZ'. Because of perfect secrecy, an eavesdropper cannot gain any information about the original plaintext message, even if they know the whole ciphertext.
This perfect secrecy comes at a cost, however. The main drawback of the one-time pad cryptosystem, and why it is not actually used in practice, is that the secret key must be at least as long as the message being sent, and cannot be reused from one message to another. The notion of perfect secrecy relies on every secret key being chosen purely at random, which isn’t the case if the same one-time pad is reused across messages. This requirement is also why the term “one-time” is used for one-time pads.
The attraction of perfect secrecy has led cryptographers to develop stream ciphers, which are a type of symmetric-key cryptosystem that emulates a one-time pad while sharing a much smaller secret key. The details of stream ciphers are beyond the scope of this course, but the basic idea is the following: the shared secret key is quite small (less than 1 KB), and both parties use an algorithm to generate an arbitrary number of new random characters, based on both the secret key and any previously-generated characters. We say that this is a “stream” of characters, from which this type of cryptosystem gets its name. These characters are then used in the same way as a one-time pad to encrypt messages.
Now, stream ciphers do not have perfect secrecy, since the characters used in encryption aren’t truly random. But if the generating algorithm is clever enough, each new character appears “random”, and the encrypted messages are computationally impossible to decrypt without knowing the starting secret key. In other words, stream ciphers give up on perfect secrecy in exchange for “good enough” secrecy and a much, much smaller shared secret key. Of course, the “good enough” is highly dependent on the algorithm used to generate the characters. A poorly-designed algorithm may unintentionally inject patterns in the generated characters, or even allow an eavesdropper to gain some information about the secret key itself!
A historical limitation of symmetric-key cryptosystems was how to establish a shared, but secret, key. If the two communicating parties were able to meet in person, they could agree upon a shared secret key while physically together (assuming no one else was spying on them). But what if I want to communicate with someone securely in a different city or different country? Or, to use a more modern example, to communicate with a server across the Internet, which I cannot hope to meet in person?
One solution to this problem is the Diffie-Hellman key exchange, which is an algorithm that is executed by two people (or computers) to compute a shared secret, while communicating in public (open to eavesdroppers). We will introduce the intuitions of the Diffie-Hellman key exchange with an analogy that uses our familiar Alice and Bob communicating with colours. After, we will replace colours with numbers to understand how the process works in today’s digital world.
Suppose that Alice and Bob would like to establish a secret paint colour that only the two of them know. They use the following procedure.
First, they both agree on a random, not-secret colour of paint to start with: yellow. They decide on this shared colour publicly, so eavesdroppers also know this colour!

Second, they each choose their own secret colour, which they will never share with each other or anyone else. In our example, Alice decides on red and Bob chooses teal (a green-blue colour).

Third, they each mix their secret colours with their shared colour yellow, producing a light orange for Alice and a blue for Bob. This is also done in secret.

Fourth, they exchange these colours with each other, which is done publicly. At this point, there are three not-secret colours: yellow and the two mixtures. And there are two secret colours: Alice’s red and Bob’s teal.

Fifth, Alice mixes Bob’s blue colour with her original secret red to produce a brown. Bob mixes Alice’s light orange with his original secret teal to produce the same brown. Why are these the same brown? Because they both consist of the same mixture of three colours: yellow (shared), red (Alice’s secret), and teal (Bob’s secret)!

Finally, why is this brown a secret? Any eavesdropper has access to three colours: the original shared yellow (from the first step), and the two mixtures orange and blue (from the fourth step). If we assume that the colour mixtures are not easily separated (i.e., it is very difficult to extract the yellow from each mixture), then the eavesdropper cannot determine what Alice and Bob’s secret colours were, and therefore can’t mix them together with the yellow to produce the right shade of brown!
Unfortunately, transmitting paint across digital channels is intractable, but transmitting numbers isn’t. The Diffie-Hellman key exchange uses some neat (yet simple) operations from modular arithmetic to play out the same scenario as our paint analogy.
Diffie-Hellman Key Exchange Algorithm
Setting: Two parties, Alice and Bob
Result: Alice and Bob share a secret key \(k\).
Alice chooses a prime number \(p\) greater than two and an integer \(g\) which satisfies \(2 \leq g \leq p - 1\), and sends both to Bob.
Alice chooses a secret number \(a \in \{1, 2, \dots, p-1\}\) and sends \(A = g^a ~\%~ p\) to Bob.
Bob chooses a secret number \(b \in \{1, 2, \dots, p-1\}\) and sends \(B = g^b ~\%~ p\) to Alice.
Alice computes \(k_A = B^a ~\%~ p\). Bob computes \(k_B = A^b ~\%~ p\).
It turns out that \(k_A = k_B\), and so this value is chosen as the secret key \(k\) that Alice and Bob share.
Here is an example of the Diffie-Hellman key exchange in action.
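We can trace the exchange in Python with unrealistically small numbers (the values p = 23, g = 5, a = 6, and b = 15 below are our own choices for illustration):

```python
p, g = 23, 5             # public: prime modulus and base, agreed upon openly

a = 6                    # Alice's secret number
b = 15                   # Bob's secret number

A = pow(g, a, p)         # Alice sends A = g^a % p publicly
B = pow(g, b, p)         # Bob sends B = g^b % p publicly

k_alice = pow(B, a, p)   # Alice computes B^a % p
k_bob = pow(A, b, p)     # Bob computes A^b % p

print(A, B, k_alice, k_bob)  # 8 19 2 2 -- the shared secret key is 2
```

An eavesdropper sees 23, 5, 8, and 19, but never 6, 15, or the shared key 2.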
That last sentence in the Diffie-Hellman key exchange algorithm description is doing a lot of work. How do we “know” that \(k_A = k_B\)? With a proof, of course!
(Correctness of Diffie-Hellman key exchange)
For all \(p, g, a, b \in \Z^+\), \((g^b ~\%~ p)^a ~\%~ p = (g^a ~\%~ p)^b ~\%~ p\).
Even though the Diffie-Hellman algorithm frames the communication in terms of remainders, we can analyze the numbers using modular arithmetic modulo \(p\). In this case the calculation involves just switching around exponents in \(g^{ab}\).
Let \(p, g, a, b \in \Z^+\). Let \(A = g^a ~\%~ p\) and \(B = g^b ~\%~ p\). We’ll prove that \(B^a ~\%~ p = A^b ~\%~ p\).
First, we have that \(A \equiv g^a \pmod p\) and \(B \equiv g^b \pmod p\). So then \(A^b \equiv (g^a)^b \equiv g^{ab} \pmod p\), and \(B^a \equiv (g^b)^a \equiv g^{ba} \pmod p\). Since \(g^{ab} = g^{ba}\), we can conclude that \(A^b \equiv B^a \pmod p\).
So then \(A^b\) and \(B^a\) must have the same remainder when divided by \(p\), and so \(B^a ~\%~ p = A^b ~\%~ p\).
We’ve just proved that the Diffie-Hellman key exchange is correct, meaning the result at the end of the algorithm is that Alice and Bob have a shared key. But that’s not the only purpose of this algorithm: it must also ensure that this shared key is also secret, unknown to anyone other than Alice and Bob.
So let’s look at the Diffie-Hellman key exchange from the perspective of an eavesdropper that has access to everything Alice and Bob communicate to each other. We say that Alice and Bob’s communications are public, while their own computing devices are private. So over the course of the algorithm, the eavesdropper has access to \(p\), \(g\), \(g^a ~\%~ p\), and \(g^b ~\%~ p\). The question is: from this information, can the eavesdropper determine the secret key \(k\)?
One approach an eavesdropper could take is to try to compute \(a\) and \(b\) directly. This is an instance of the discrete logarithm problem: given \(p, g, y \in \Z^+\), find an \(x \in \Z^+\) such that \(g^x \equiv y \pmod p\). While we could implement a brute-force algorithm for solving this problem that simply tries all possible exponents \(x \in \{0, 1, \dots, p-1\}\), this is computationally inefficient in practice when \(p\) is chosen to be extremely large. We’ll explore exactly what we mean by terms like “efficient” and “inefficient” more precisely in the next chapter.
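To make the brute-force approach concrete, here is a hypothetical helper (our own, not from the text) that solves the discrete logarithm problem by trying every exponent; its only flaw is speed:

```python
def discrete_log_brute(g: int, y: int, p: int) -> int:
    """Return an x such that g^x = y (mod p), by trying every exponent.

    Takes up to p iterations, which is hopeless when p has hundreds of digits.
    """
    for x in range(p):
        if pow(g, x, p) == y:
            return x
    raise ValueError('no solution exists')


print(discrete_log_brute(5, 8, 23))  # 6, since 5^6 % 23 == 8
```

For a 2048-bit prime \(p\), this loop would need around \(10^{617}\) iterations, far beyond what any computer can perform.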
Perhaps surprisingly, there is no known efficient algorithm for solving the discrete logarithm problem! So we say that the Diffie-Hellman key exchange is computationally secure: while there are known algorithms that eavesdroppers could use for determining the shared secret key, all known algorithms are computationally infeasible for standard primes chosen. In practice, Diffie-Hellman key exchanges tend to use primes on the order of \(2^{2048} \approx 10^{617}\)!
So far, we have studied symmetric-key cryptosystems to allow two parties to communicate securely with each other when they share a secret key. We have also studied how two parties can establish a shared secret key using the Diffie-Hellman key exchange algorithm.
One of the limitations of symmetric-key encryption schemes is that a shared secret key needs to be established for every pair of people who want to communicate. If there are \(n\) people who each want to communicate securely with each other, there are \(\frac{n(n-1)}{2}\) keys needed:
In this section, we’ll introduce a new form of cryptosystem called a public-key cryptosystem, in which each person has two keys: a private key known only to them, and a public key known to everyone. We’ll see how to encrypt and decrypt messages in these cryptosystems, how they reduce the number of keys needed for people to communicate, and learn about the most widely-used public-key cryptosystem today, the RSA cryptosystem.
A public-key cryptosystem is one where each party in the communication generates a pair of keys: a private (or secret) key, known only to them, and a public key, which is known to everyone. Suppose Alice wants to send Bob a message. She uses Bob’s public key to encrypt the message, and Bob uses his private key to decrypt the message. (Recall that in a symmetric-key cryptosystem, messages are encrypted and decrypted with the same key, hence the symmetry.) Similarly, if Bob wants to send a message to Alice, he uses Alice’s public key to encrypt the message, and Alice uses her private key to decrypt it.
More formally, we define a secure public-key cryptosystem as a system with the following parts:
A set \(\mathcal{P}\) of possible original messages, called plaintext messages. (E.g., a set of strings)
A set \(\mathcal{C}\) of possible encrypted messages, called ciphertext messages. (E.g., another set of strings)
A set \(\mathcal{K}_1\) of possible public keys and a set \(\mathcal{K}_2\) of possible private keys.
A subset \(\mathcal{K} \subseteq \mathcal{K}_1 \times \mathcal{K}_2\) of possible public-private key pairs. Note that we use \(\subseteq\) and not \(=\) because not every public key can be paired with every private key.
Two functions \(Encrypt : \mathcal{K_1} \times \mathcal{P} \to \mathcal{C}\) and \(Decrypt : \mathcal{K}_2 \times \mathcal{C} \to \mathcal{P}\) that satisfy the following two properties:
The Diffie-Hellman key exchange algorithm we studied in the last section worked by relying on the hardness of the discrete logarithm problem. This allowed Alice and Bob to communicate their numbers \(g^a ~\%~ p\) and \(g^b ~\%~ p\) publicly, without anyone being able to find the “secret” \(a\) and \(b\).
The Rivest-Shamir-Adleman (RSA) cryptosystem works with numbers as well, and relies on the surprising hardness of factoring large integers. For example, can you tell me which two prime numbers can be multiplied together to produce \(30,929\)? You could write a small Python program to answer this question quite quickly, but that was only a number with 5 digits. What about the number \(1,455,980,635,647,702,351,701\), with 22 digits? In practice, RSA relies on the hardness of factoring integers with hundreds of digits!
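A throwaway trial-division sketch (our own) answers the 5-digit question almost instantly: \(30{,}929 = 157 \times 197\). The same approach becomes hopeless once the smallest prime factor has hundreds of digits, since the loop below runs until it reaches that factor:

```python
def factor_semiprime(n: int) -> tuple[int, int]:
    """Return (p, q) with p * q == n, found by trial division.

    Fast for small n, but infeasible when the smallest factor is huge.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return (d, n // d)
        d += 1
    return (n, 1)  # n is prime


print(factor_semiprime(30929))  # (157, 197)
```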
Let’s see how RSA works.
Each person in a public-key cryptosystem must first generate a public-private key pair before they can communicate with anyone else. (Think about this as choosing a valid key pair from the set \(\mathcal{K} \subseteq \mathcal{K}_1 \times \mathcal{K}_2\).) For RSA, we’ll put ourselves in Alice’s shoes and see what she must do to generate a public and private key.
First, Alice picks two distinct prime numbers \(p\) and \(q\).
Next, Alice computes the product \(n = pq\).
Then, Alice chooses an integer \(e \in \{2, 3, \dots, \varphi(n) - 1\}\) such that \(\gcd(e, \varphi(n)) = 1\).
Finally, Alice chooses an integer \(d \in \{2, 3, \dots, \varphi(n) - 1\}\) that is the modular inverse of \(e\) modulo \(\varphi(n)\). (That is, \(de \equiv 1 \pmod{\varphi(n)}\).)
That’s it! Alice’s private key is the tuple \((p, q, d)\), and her public key is the tuple \((n, e)\). Alice shares her public key with the world, but she never tells her private key to anyone.
Now suppose that Bob wants to send Alice a plaintext message \(m\). For now we’ll treat the message as a number between \(1\) and \(n - 1\), and will discuss string messages later on in this section. Bob uses Alice’s public key \((n, e)\):
Alice receives the ciphertext \(c\). She uses her private key \((p, q, d)\) to decrypt the message:
Before moving on, let’s see an example of a full use of the RSA cryptosystem in action. Alice first needs to generate a public and private key.
For reference, the private key is: \((p=23, q=31, d=403)\) and the public key is: \((n=713, e=547)\).
Bob wants to send the number \(42\) to Alice. He computes the encrypted number to be \(c = 42^e ~\%~ n = 42^{547} ~\%~ 713 = 106\) and sends it to Alice. Alice receives the number \(106\) from Bob. She computes the decrypted number to be \(m = 106^d ~\%~ 713 = 106^{403} ~\%~ 713 = 42\). Voila!
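We can replay this exchange using Python’s built-in three-argument pow, which performs modular exponentiation efficiently:

```python
p, q, d = 23, 31, 403      # Alice's private key
n, e = 713, 547            # Alice's public key (n = p * q)

m = 42                     # Bob's plaintext number
c = pow(m, e, n)           # Bob encrypts: 42^547 % 713
print(c)                   # 106

decrypted = pow(c, d, n)   # Alice decrypts: 106^403 % 713
print(decrypted)           # 42
```

Note that pow(m, e, n) reduces modulo n at every step, so it never builds the astronomically large number \(42^{547}\) in full.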
In the RSA cryptosystem, the encryption and decryption algorithms are very straightforward. The “interesting” part is in how the public-private key pair is generated to make the encryption and decryption work! In this section, we’ll come to understand why the key generation involves the steps that it does by proving that the RSA algorithm works correctly, using all the number theory work we developed last week.
Let \((p, q, d) \in \Z^+ \times \Z^+ \times \Z^+\) be a private key and \((n, e) \in \Z^+ \times \Z^+\) its corresponding public key as generated by “RSA Phase 1”. Let \(m, c, m' \in \{1, \dots, n - 1\}\) be the original plaintext message, ciphertext, and decrypted message, respectively, as described in the RSA encryption and decryption phases.
Then \(m' = m\) (i.e., the decrypted message is the same as the original message).
Let \(p, q, n, d, e, m, c, m' \in \N\) be defined as in the above definition of the RSA algorithm. We need to prove that \(m' = m\).
We will assume that \(\gcd(m, n) = 1\). (It is possible to prove this theorem without this assumption, but we will not do so here.)
From the definition of \(m'\) in the decryption step, we know \(m' \equiv c^d \pmod n\). From the definition of \(c\) in the encryption step, we know \(c \equiv m^e \pmod n\). Putting these together, we have: \[m' \equiv (m^e)^d \equiv m^{ed} \pmod n.\]
So we need to prove that \(m^{ed} \equiv m \pmod n\). From Steps 3 and 4 of the RSA key generation phase, we know that \(de \equiv 1 \pmod{\varphi(n)}\), i.e., there exists a \(k \in \Z\) such that \(de = k \cdot \varphi(n) + 1\).
We also know that since \(\gcd(m, n) = 1\), by Euler’s Theorem \(m^{\varphi(n)} \equiv 1 \pmod n\).
Putting this all together, we have \[\begin{align*} m' &\equiv m^{ed} \pmod n \\ &\equiv m^{k \varphi(n) + 1} \pmod n \\ &\equiv (m^{\varphi(n)})^k \cdot m \pmod n \\ &\equiv 1^k \cdot m \pmod n \tag{by Euler's Theorem!} \\ &\equiv m \pmod n \end{align*}\]
So \(m' \equiv m \pmod n\). Since we also know \(m\) and \(m'\) are between \(1\) and \(n-1\), we can conclude that \(m' = m\).
Now that we’ve established the correctness of the RSA cryptosystem, let’s now discuss its security. As we did for the Diffie-Hellman key exchange, we’ll put ourselves in the role of an eavesdropper who is trying to gain information about a secret message. Suppose we observe Bob sending an encrypted message \(c\) to Alice. In addition to the ciphertext, we also know Alice’s public key \((n, e)\). Remember that “public” means that everyone can see it—including possibly malicious users! What information can we hope to gain about Bob’s original plaintext message?
First, we know from the RSA encryption phase that \(c \equiv m^e \pmod n\), so if we know all three of \(c\), \(e\), and \(n\), can we determine the value of \(m\)? No! We don’t have an efficient way of computing “\(e\)-th roots” in modular arithmetic.
Another approach we could take is to attempt to discover Alice’s private key. Recall that \(de \equiv 1 \pmod{\varphi(n)}\). So \(d\) is the inverse of \(e\) modulo \(\varphi(n)\), and we learned in the last chapter that we can compute modular inverses, so this should be easy, right?
Not so fast! We can compute the modular inverse of \(e\) modulo \(\varphi(n)\) when we know both \(e\) and \(\varphi(n)\), but right now we only know \(n\), not \(\varphi(n)\).
So how do we compute \(\varphi(n)\)? Well, we know that if \(n = p \cdot q\) where \(p\) and \(q\) are distinct primes, then \(\varphi(n) = (p - 1)(q - 1)\). But here is the problem: it is not computationally feasible to factor \(n\) when it is extremely large. This is our second “computationally hard” problem in computer science, the Integer Factorization Problem. Despite the best efforts of computer scientists and mathematicians for centuries, there is no known efficient general algorithm for factoring integers, and it is this fact that keeps the RSA private key \((p, q, d)\) secure.
In the previous section we defined the RSA cryptosystem that used both a public key and private key to send encrypted messages between two parties. In this section, we will see how to implement the RSA cryptosystem in Python. First, we will see how to generate a private key when given two prime numbers. Second, we will see how to encrypt and decrypt a single number. Finally, we will see how to encrypt and decrypt text.
Here is our implementation of the first phase of RSA: generating the public-private key pair. In this implementation, we will assume that the prime numbers \(p\) and \(q\) are given. Algorithms do exist for generating these prime numbers; we just won’t go over them here.
import math
import random


def rsa_generate_key(p: int, q: int) -> \
        tuple[tuple[int, int, int], tuple[int, int]]:
    """Return an RSA key pair generated using primes p and q.

    The return value is a tuple containing two tuples:
    1. The first tuple is the private key, containing (p, q, d).
    2. The second tuple is the public key, containing (n, e).

    Preconditions:
    - p and q are prime
    - p != q
    """
    # Compute the product of p and q
    n = p * q

    # Choose e such that gcd(e, phi_n) == 1.
    phi_n = (p - 1) * (q - 1)

    # Since e is chosen randomly, we repeat the random choice
    # until e is coprime to phi_n.
    e = random.randint(2, phi_n - 1)
    while math.gcd(e, phi_n) != 1:
        e = random.randint(2, phi_n - 1)

    # Choose d such that e * d % phi_n == 1.
    # Notice that we're using our modular_inverse from our work in the last chapter!
    d = modular_inverse(e, phi_n)

    return ((p, q, d), (n, e))

The algorithm makes use of both a while loop and the random module. The random module is used to generate an e, but the while loop ensures that the e we finally choose is valid. That is, we continue to randomly generate an e until e and phi_n have a greatest common divisor of 1. Once we have e, we can finally calculate the last part of our private key: d. To do so, we make use of the modular_inverse function we defined in the last chapter (which, in turn, used the extended_gcd function).
Next, let’s look at RSA encryption, which only uses the public key. Recall that the plaintext here is a number \(m\) between \(1\) and \(n - 1\) inclusive, and the ciphertext is another number \(c = m^e ~\%~ n\). This mathematical definition translates directly into Python:
def rsa_encrypt(public_key: tuple[int, int], plaintext: int) -> int:
    """Encrypt the given plaintext using the recipient's public key.

    Preconditions:
    - public_key is a valid RSA public key (n, e)
    - 0 < plaintext < public_key[0]
    """
    n, e = public_key
    encrypted = (plaintext ** e) % n
    return encrypted

The implementation for RSA decryption is almost identical, except we use the private key (i.e., d) for exponentiation.
def rsa_decrypt(private_key: tuple[int, int, int], ciphertext: int) -> int:
    """Decrypt the given ciphertext using the recipient's private key.

    Preconditions:
    - private_key is a valid RSA private key (p, q, d)
    - 0 < ciphertext < private_key[0] * private_key[1]
    """
    p, q, d = private_key
    n = p * q
    decrypted = (ciphertext ** d) % n
    return decrypted

The above implementation of RSA is correct, but is a little unsatisfying because it encrypts numbers instead of strings, like we saw with the Caesar cipher and one-time pad cryptosystems. So next, we’ll see how to adapt the RSA encryption and decryption algorithms to strings.
Our strategy will be to take a string and break it up into individual characters and encrypt each character, just as we did with the Caesar cipher. We’ll use this approach for both encryption and decryption, using ord/chr to convert between characters and numbers, and a string accumulator to keep track of the encrypted/decrypted results.
def rsa_encrypt_text(public_key: tuple[int, int], plaintext: str) -> str:
    """Encrypt the given plaintext using the recipient's public key.

    Preconditions:
    - public_key is a valid RSA public key (n, e)
    - all({0 < ord(c) < public_key[0] for c in plaintext})
    """
    n, e = public_key

    encrypted = ''
    for letter in plaintext:
        # Note: we could have also used our rsa_encrypt function here instead
        encrypted = encrypted + chr((ord(letter) ** e) % n)

    return encrypted


def rsa_decrypt_text(private_key: tuple[int, int, int], ciphertext: str) -> str:
    """Decrypt the given ciphertext using the recipient's private key.

    Preconditions:
    - private_key is a valid RSA private key (p, q, d)
    - all({0 < ord(c) < private_key[0] * private_key[1] for c in ciphertext})
    """
    p, q, d = private_key
    n = p * q

    decrypted = ''
    for letter in ciphertext:
        # Note: we could have also used our rsa_decrypt function here instead
        decrypted = decrypted + chr((ord(letter) ** d) % n)

    return decrypted

Cryptography is central to all kinds of computing and online communication in today’s modern world. Modern security practices inform every stage of how we interact online, from the Wi-Fi networks we connect to, to how data is transmitted back and forth between our computer and a server halfway around the world, and even how data is encrypted for storage on those servers. Every time we visit a website, watch a video on our phone, or post a photo or tweet, we are relying on modern cryptography to keep our communications private.
In this section, we will tie together our study of cryptography by looking at one specific link in the chain of Internet communication. While doing so, we will explore some of the real-world design decisions and trade-offs that go into implementing a secure communication protocol used by billions of people around the world.
Whether you are browsing a website on your computer or on your phone, you can probably see a little padlock icon next to the website’s URL. Here’s what happens when you click on it:

This icon is our web browser’s way of telling us that the data being sent from the server (www.teach.cs.toronto.edu in our above picture) has been encrypted using a communication protocol called HTTPS. We won’t define the term “protocol” formally in this course, but you can think of it as an algorithm whose steps are split among two (or more) parties, rather than being performed by a single computer. For example, the Diffie-Hellman key exchange is more commonly referred to as a protocol than an algorithm. The HTTPS protocol consists of two parts: HTTP (the Hypertext Transfer Protocol), which specifies the format of the data being communicated, and TLS (Transport Layer Security), which encrypts that data.
On its own, HTTP allows your computer to communicate with servers around the world. But when combined with TLS, those communications are secure and cannot be “snooped” by an eavesdropper (at least not easily!).
An analogy here might be helpful. Suppose you’re living in pre-Internet times, writing a book (or set of course notes!), and want to send a draft to your publisher through the mail. HTTP corresponds to the format in which you deliver the book: perhaps chapter by chapter, with a table of contents in front and appendices or an index at the end. TLS corresponds to how you encrypt the contents of what you send in this format. For example, you might apply a Caesar cipher to shift every character in your book, or you might enclose each chapter in a separate locked briefcase for which only you and your publisher know the combination. Of course, TLS is much more sophisticated than either of these example “security” approaches. For the rest of this section, we’ll study how TLS uses the concepts we’ve learned across this chapter to encrypt your online communications.
For our description of the TLS protocol, we’ll use the term client to refer to your computer and server to refer to the website you are communicating with. TLS starts off with the client initiating a request to the server (e.g., when you type a URL into your web browser and press “Enter”). The following happens: (1) the server sends the client a proof of its identity, which the client verifies; (2) the client and server perform the Diffie-Hellman key exchange to establish a shared secret key; and (3) the client and server exchange data encrypted with a symmetric-key cryptosystem, using the shared key from Step 2.
That’s it! While the protocol seems straightforward, there are a few real-world details that we’ll look at. Let us investigate two questions: why does TLS use symmetric-key encryption to communicate data, and how does the client know it is communicating with the right server?
Our first example of symmetric encryption, Caesar’s cipher, shows just how old the idea is. Public-key encryption is, relatively speaking, much more modern, and does not require that the two communicating parties share a secret key. But modern doesn’t always mean better: TLS relies on symmetric-key encryption because public-key cryptosystems, like RSA, are significantly slower than their symmetric-key counterparts. While RSA relies on modular exponentiation as the key encryption and decryption steps, modern symmetric-key cryptosystems use faster operations to encrypt and decrypt data. (Typically these operations swap or combine individual bytes in computer memory.)
When computers became household commodities, performance was king. Here, performance is a broad term that typically refers to how quickly a computer can do something. For example: how long does it take to encrypt the frame of a video, send it over a wireless connection, and decrypt that frame on your phone? Consider that your phone is likely streaming at least 30 frames per second in order for you to enjoy a video of reasonable quality. It’s also increasingly likely that, today, the frame of video is high-definition, which requires even more data to be encrypted and decrypted. While security and privacy are paramount in today’s world, performance cannot be forgotten.
The first two steps of the TLS protocol are “setup” steps for the actual communication of data between the client and server. While a symmetric cryptosystem is used to encrypt the communicated data, these setup steps are unencrypted, and raise a natural question: how do we know we are communicating with the right server?
For example, when we visit www.google.com, and our computer performs the TLS protocol with a distant server, how do we know our computer is connecting to a real Google server, and not some fake server that’s simply pretending to be Google? The consequences of establishing a connection with such a “fake Google” server are severe: that server might give us manipulated or fake search results, save our login information, or store text, images, and videos we upload to Google Drive or YouTube. Even if we encrypt all of this data in Step 3 of TLS, that encryption does not protect us from a malicious fake server posing as an honest one.
In order to avoid such a dangerous situation, we need some way to verify that the server (e.g., Google) we intended to speak with is actually who they say they are. Herein lies one of the main benefits of public-key cryptosystems. Every public-key cryptosystem, including RSA, can implement two additional algorithms: one to sign a message using the sender’s private key, and one to verify that signature using the sender’s public key.
These algorithms allow a server to sign every message it sends with its private key, and then have the client verify each message’s signature using the server’s public key. We call these digital signatures, and they help us identify exactly who we are speaking with. We won’t go into the specifics of the algorithms here, but the process for the RSA cryptosystem is similar to what we’ve outlined in this chapter (i.e., it exploits modular arithmetic). Alice can add her signature, which is a function of her private key, to a message, and Bob can verify that Alice is the sender using Alice’s public key.
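The idea can be sketched in code. The following is a simplified toy version of RSA signing and verification, using the same key formats as earlier in this chapter; real implementations sign a cryptographic hash of the message, with careful padding, rather than the raw number:

```python
def rsa_sign(private_key: tuple[int, int, int], message: int) -> int:
    """Sign the given message using the signer's private key (p, q, d).

    Toy sketch: real signatures hash and pad the message first.
    """
    p, q, d = private_key
    return pow(message, d, p * q)


def rsa_verify(public_key: tuple[int, int], message: int, signature: int) -> bool:
    """Return whether signature is a valid signature for message under public_key."""
    n, e = public_key
    return pow(signature, e, n) == message


# Toy key for illustration only (far too small to be secure):
# p = 61, q = 53, n = 3233, e = 17, d = 2753.
signature = rsa_sign((61, 53, 2753), 65)
print(rsa_verify((3233, 17), 65, signature))  # True: the signature checks out
print(rsa_verify((3233, 17), 66, signature))  # False: a tampered message is rejected
```

Notice the symmetry with encryption: signing exponentiates with the private exponent d, and verification exponentiates with the public exponent e, so anyone holding the public key can check a signature but only the private-key holder can produce one.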
Digital signatures are used in each of the first two steps in the TLS protocol, which is what we’ll look at next.
In the first step of TLS, we said that the server sends the client a “proof of identity”. To make that more precise, the data the server sends in this step is called a digital certificate, which has identifying information for the server, including its domain (e.g., www.google.com), its organization name (e.g., “Google LLC”), and its public key.
But how do we know this digital certificate is the “real” one? The certificate also includes the digital signature of a certificate authority, which is an organization whose purpose is to issue digital certificates to website domains and verify the identities of the operators of each of those domains. The largest of these worldwide are IdenTrust and DigiCert; a more recent entrant, the non-profit Let’s Encrypt, launched in 2016. So when the client “verifies” the digital certificate provided by the server, what’s actually happening is that the client is verifying the digital signature provided by the certificate authority, using the certificate authority’s public key. You might ask: how does the client know the certificate authority’s public key? It turns out that web browsers come pre-installed with the public keys of many certificate authorities!
After Step 1 of TLS, the client is confident that it has connected with the right server. But we aren’t in the clear yet: because the Diffie-Hellman algorithm is performed unencrypted, there is still the danger that an attacker might wait for Step 1 to complete and then intercept the messages for Diffie-Hellman in Step 2, tricking the client into sharing a secret key with the attacker instead of the intended server.
The server’s digital certificate doesn’t help here! Instead, the server signs all messages it sends during the Diffie-Hellman algorithm, so that at every step the client can verify that the message came from the intended server. Of course, this relies on the client knowing the server’s public key, which it gets from the digital certificate in the previous step!
It is this digital signature from the server that allows the client to consistently verify that it is communicating with the server, and that the messages haven’t been tampered with. At the end of Step 2, the client and server have a shared secret key, and can now communicate safely using symmetric-key encryption.
We’ve mentioned that Diffie-Hellman and RSA are secure because it is very difficult to extract the private part of the data from what is being publicly communicated. But what if it wasn’t that difficult? Remember that both RSA and Diffie-Hellman rely on very large prime numbers. But, as we saw in Chapter 6, generating these prime numbers is costly. And it turns out that, unfortunately, many servers use the same group of prime numbers.
Recall that Diffie-Hellman relies on the discrete logarithm problem being difficult to solve. But some steps of the algorithm can be precomputed for a specific group of prime numbers. In 2015, a team of academics discovered that 82% of servers used the same 512-bit group of prime numbers. The team proposed the Logjam attack, which exploited this vulnerability and compromised communications. They also extrapolated that Logjam applied to the 1024-bit case. Today, 2048-bit keys are used to avoid the Logjam attack—for example, Google announced in 2013 that it switched from 1024- to 2048-bit keys.
The Logjam attack is not an isolated incident. Security protocols are constantly being revised, leading to important updates for web browsers, email clients, servers, etc. Earlier versions of the TLS protocol (1.0 and 1.1) are deprecated as of March 2020, which means that “secure” communication must use more recent versions of the protocol. Nor are attacks limited to cryptography. The security and privacy of our data can be attacked at multiple points, and attackers are not limited to exploiting weaknesses when we communicate data. Computer security and data privacy are becoming some of the most important problem areas of our time, as laws and policies slowly catch up to a world where a person’s private information is a common commodity sold and exchanged by corporations.
So far in this course, when we have studied programming concepts, we have focused on the correctness of our code. In Chapters 1–5, we learned about different programming constructs, understanding what they do, how to combine them into larger programs, and how to test these programs to make sure they are correct. In Chapters 6 and 7, we learned about mathematical proof, and applied this skill to proving the correctness of various algorithms, including every part of the RSA cryptosystem.
Yet when it comes to evaluating programs, correctness is not the only important measure. As we alluded to in Chapter 7, the amount of time a program takes to run, or program running time, is a critical consideration. Running time is often shortened to “runtime”, and is also known as the “efficiency” or “performance” of a program. In this chapter, we’ll study a formal approach to analysing the running time of a program. This section will introduce the topic; in future sections we’ll build up some mathematical theory about comparing rates of function growth, and then apply this theory to real program code.
Consider the following function, which prints out the first n natural numbers:
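Here is a minimal sketch of print_integers, reconstructed to match the analysis below (a for loop over range(0, n) that calls print once per iteration):

```python
def print_integers(n: int) -> None:
    """Print the first n natural numbers (0 to n - 1, inclusive)."""
    for i in range(0, n):
        print(i)


print_integers(3)  # prints 0, 1, 2, each on its own line
```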
What can we say about the running time of this function? An empirical approach would be to measure the time it takes for this function to run on a bunch of different inputs, and then take the average of these times to come up with some sort of estimate of the “average” running time.
But of course, given that this algorithm performs an action for every natural number between 0 and n - 1, we expect it to take longer as n gets larger, so taking an average of a bunch of running times loses important information about the inputs. (This is like doing a random poll of how many birthday cakes people have eaten without taking into account how old the respondents are.)
How about choosing one particular input, calling the function multiple times on that input, and averaging those running times? This seems better, but even here there are some problems. For one, the computer’s hardware can affect running time; for another, computers are all running multiple programs at the same time, so whatever else is currently running on your computer also affects running time. So even running this experiment on one computer wouldn’t necessarily be indicative of how long the function would take on a different computer, nor even how long it would take on the same computer running a different number of other programs.
While these sorts of timing experiments are actually done in practice for evaluating particular hardware or extremely low-level (close to hardware) programs, these details are often not helpful for most software developers, as they do not have control over the machine on which their software will be run. That said, these timing experiments can provide an intuitive understanding of the efficiency of our programs. We will explore how to conduct basic timing experiments at the end of this chapter.
So rather than use an empirical measurement of runtime, what we do instead is use an abstract representation of runtime: the number of “basic operations” an algorithm executes. This means that we can analyze functions without needing a computer, and our analysis theoretically applies to any computer system. However, there is a good reason “basic operation” is in quotation marks: this vague term raises a whole slew of questions. Which operations count as “basic”? Do all basic operations take the same amount of time? Does a call to print count the same as an arithmetic operation or a variable assignment?
The answers to these questions can depend on the hardware being used, as well as what programming language the algorithm is written in. Of course, these are precisely the details we wish to avoid thinking about. In this section, we will count only the calls to print as basic operations, and study print_integers and some variations to establish some intuition and terminology.
First, let’s return to print_integers.
From Chapter 4, we know that the for loop will call print once per iteration. We also know that this loop iterates \(n\) times (with i taking on the values 0, 1, 2, …, \(n - 1\)):
So then for an input \(n\), there are \(n\) calls to print. We say that the running time of print_integers on input \(n\) is \(n\) basic operations. If we plot \(n\) against this measure of running time, we obtain a line:
We say that print_integers has a linear running time, as the number of basic operations is a linear function of the input \(n\).

Let us now consider a function that prints all combinations of pairs of integers:
def print_pairs(n: int) -> None:
    """Print all combinations of pairs of the first n natural numbers."""
    for i in range(0, n):
        for j in range(0, n):
            print(i, j)

What is the running time of this function? Similar to our previous example, there is a for loop that calls print \(n\) times, but now this loop is nested inside another for loop. Let’s see some examples of this function being called:
>>> print_pairs(1)
0 0
>>> print_pairs(2)
0 0
0 1
1 0
1 1
>>> print_pairs(3)
0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2

If we look at the outer loop (loop variable i), we see that it repeats its body \(n\) times. But its body is another loop, which repeats its own body \(n\) times. So the inner loop makes \(n\) calls to print each time it is executed, and it is executed \(n\) times in total. This means print is called \(n^2\) times in total.
We say that print_pairs has a quadratic running time, as the number of basic operations is a quadratic function of the input \(n\).
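We can confirm this count empirically with a hypothetical helper that tallies the print calls instead of making them:

```python
def print_pairs_call_count(n: int) -> int:
    """Return the number of times print_pairs(n) would call print."""
    count = 0
    for i in range(0, n):
        for j in range(0, n):
            count = count + 1
    return count


print([print_pairs_call_count(n) for n in range(0, 5)])  # [0, 1, 4, 9, 16]
```

The counts are exactly the perfect squares, matching our analysis of \(n^2\) calls.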

Now let’s consider the following function, which prints out the powers of two that are less than a positive integer \(n\). These numbers are of the form \(2^i\), where \(i\) can range from 0 to \(\ceil{\log_2(n)} - 1\). For example, when \(n = 16\), \(\ceil{\log_2(n)} = 4\), and \(i\) ranges from 0 to 3. When \(n = 7\), \(\ceil{\log_2(n)} = 3\), and \(i\) ranges from 0 to 2.
import math

def print_powers_of_two(n: int) -> None:
    """Print the powers of two that are less than n.

    Preconditions:
    - n > 0
    """
    for i in range(0, math.ceil(math.log2(n))):
        print(2 ** i)

In this case, the number of calls to print is \(\ceil{\log_2(n)}\). So the running time of print_powers_of_two is approximately, but not exactly, \(\log_2(n)\). Yet in this case we still say that print_powers_of_two has a logarithmic running time.
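Running the function on the two example inputs from above confirms the count of print calls:

```python
import math


def print_powers_of_two(n: int) -> None:
    """Print the powers of two that are less than n (as defined above)."""
    for i in range(0, math.ceil(math.log2(n))):
        print(2 ** i)


print_powers_of_two(16)  # prints 1, 2, 4, 8: ceil(log2(16)) = 4 calls
print_powers_of_two(7)   # prints 1, 2, 4: ceil(log2(7)) = 3 calls
```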

Our final example in this section is a bit unusual.
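A sketch of print_ten is given below. The essential feature, which the analysis that follows relies on, is that the loop header fixes exactly 10 iterations; what gets printed each iteration is our own assumption:

```python
def print_ten(n: int) -> None:
    """Print the number n ten times.

    (Printing n itself is an assumption; the key point is that the loop
    header fixes the number of iterations at 10, independent of n.)
    """
    for i in range(0, 10):
        print(n)
```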
How many times is print called here? We can again tell from the header of the for loop: this loop iterates 10 times, and so print is called 10 times, regardless of what \(n\) is!
We say that print_ten has a constant running time, as the number of basic operations is independent of the input size.

In the past four examples, we have seen functions with linear, quadratic, logarithmic, and constant running times. While these labels are not precise, they do give us intuition about the relative growth rate of each running time.
Functions with linear running time are faster than ones with quadratic running time, and slower than ones with logarithmic running time. Functions with a constant running time are the fastest of all.
But all of our informal analyses in the previous section relied on defining a “basic operation” to be a call to print. We said, for example, that print_integers had a running time of \(n\). But what if a friend came along and said, “No wait, the variable i must be assigned a new value at every loop iteration, and that counts as a basic operation.” Okay, so then we would say that there are \(n\) print calls and \(n\) assignments to i, for a total running time of \(2n\) basic operations for an input \(n\).
But then another friend chimes in, saying “But print calls take longer than variable assignments, since they need to change pixels on your monitor, so you should count each print call as \(10\) basic operations.” Okay, so then there are \(n\) print calls worth \(10n\) basic operations, plus the assignments to i, for a total of \(11n\) basic operations for an input \(n\).
And then another friend joins in: “But you need to factor in an overhead of calling the function as a first step before the body executes, which counts as \(1.5\) basic operations (slower than assignment, faster than print).” So then we now have a running time of \(11n + 1.5\) basic operations for an input \(n\).
And then another friend starts to speak, but you cut them off and say “That’s it! This is getting way too complicated. I’m going back to timing experiments, which may be inaccurate but at least I won’t have to listen to these increasing levels of fussiness.”
The expressions \(n\), \(2n\), \(11n\), and \(11n + 1.5\) may be different mathematically, but they share a common qualitative type of growth: they are all linear. And so we know, at least intuitively, that they are all faster than quadratic running times and slower than logarithmic running times. What we will study in the next section is how to make this observation precise, and thus avoid the tedium of trying to exactly quantify our “basic operations”, and instead measure the overall rate of growth in the number of operations.
In the previous section, we began our study of program running time with a few simple examples to guide our intuition. One question that emerged from these examples was how we define what “basic operations” we actually count when analysing a program’s running time—or better yet, how we can ignore small differences in counts that result from slightly different definitions of “basic operation”. This question grows even more important as we study more complex algorithms consisting of many lines of code.
Over the next two sections, we’ll develop a powerful mathematical tool for comparing function growth rates. This will formalize the idea of “linear”, “quadratic”, “logarithmic”, and “constant” running times from the previous section, and extend these categories to all types of functions.
Here is a quick reminder about function notation. When we write \(f : A \to B\), we say that \(f\) is a function which maps elements of \(A\) to elements of \(B\). In this chapter, we will mainly be concerned with functions mapping the natural numbers to the nonnegative real numbers, i.e., functions \(f: \N \to \R^{\geq 0}\). (These are the domain and codomain which arise in algorithm analysis—an algorithm can’t take “negative” time to run, after all.) Though there are many different properties of functions that mathematicians study, we are only going to look at one such property: the long-term (i.e., asymptotic) growth of a function. We will proceed by building up a few different definitions for comparing function growth, eventually arriving at one which is robust enough to be used in practice.
Let \(f, g : \N \to \R^{\ge 0}\). We say that \(g\) is absolutely dominated by \(f\) when for all \(n \in \N\), \(g(n) \leq f(n)\).
Let \(f(n) = n^2\) and \(g(n) = n\). Prove that \(g\) is absolutely dominated by \(f\).
This is a straightforward unpacking of a definition, which you should be very comfortable with by now: \(\forall n \in \N,~ g(n) \leq f(n)\). (Note that we aren’t quantifying over \(f\) and \(g\); the “let” in the example defines concrete functions that we want to prove something about.)
Let \(n \in \N\). We want to show that \(n \leq n^2\).
Case 1: Assume \(n = 0\). In this case, \(n^2 = n = 0\), so the inequality holds.
Case 2: Assume \(n \geq 1\). In this case, we take the inequality \(n \geq 1\) and multiply both sides by \(n\) to get \(n^2 \geq n\), or equivalently \(n \leq n^2.\)
Unfortunately, absolute dominance is too strict for our purposes: if \(g(n) \leq f(n)\) for every natural number except \(5\), then we can’t say that \(g\) is absolutely dominated by \(f\). For example, the function \(g(n) = 2n\) is not absolutely dominated by \(f(n) = n^2\), even though \(g(n) \leq f(n)\) everywhere except \(n = 1\). Graphically:

Here is another definition which is a bit more flexible than absolute dominance.
Let \(f, g : \N \to \R^{\ge 0}\). We say that \(g\) is dominated by \(f\) up to a constant factor when there exists a positive real number \(c\) such that for all \(n \in \N\), \(g(n) \leq c \cdot f(n)\).
Let \(f(n) = n^2\) and \(g(n) = 2n\). Prove that \(g\) is dominated by \(f\) up to a constant factor.
Once again, the translation is a simple unpacking of the previous definition. (Remember: the order of quantifiers matters! The choice of \(c\) is not allowed to depend on \(n\).)
\[\exists c \in \R^+,~ \forall n \in \N,~ g(n) \leq c \cdot f(n).\]
The term “constant factor” is revealing. We already saw that \(n\) is absolutely dominated by \(n^2\), so if \(n\) is multiplied by \(2\), then we should be able to multiply \(n^2\) by \(2\) as well to get the calculation to work out.
Let \(c = 2\), and let \(n \in \N\). We want to prove that \(g(n) \leq c \cdot f(n)\), or in other words, \(2n \leq 2n^2\).
Case 1: Assume \(n = 0\). In this case, \(2n = 0\) and \(2n^2 = 0\), so the inequality holds.
Case 2: Assume \(n \geq 1\). Taking the assumed inequality \(n \geq 1\) and multiplying both sides by \(2n\) yields \(2n^2 \geq 2n\), or equivalently \(2n \leq 2n^2\).
Intuitively, “dominated by up to a constant factor” allows us to ignore multiplicative constants in our functions. This will be very useful in our running time analysis because it frees us from worrying about the exact constants used to represent numbers of basic operations: \(n\), \(2n\), and \(11n\) are all equivalent in the sense that each one dominates the other two up to a constant factor.
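Definitions like this one can also be sanity-checked empirically. The helper below (not part of the course code, just an illustration) checks the inequality \(g(n) \leq c \cdot f(n)\) over finitely many values of \(n\); a finite check can refute the definition but never prove it:

```python
from typing import Callable


def dominated_up_to_constant(g: Callable[[int], float], f: Callable[[int], float],
                             c: float, n_max: int = 1000) -> bool:
    """Check that g(n) <= c * f(n) for every natural number n up to n_max.

    A finite check like this is evidence, not a proof.
    """
    return all(g(n) <= c * f(n) for n in range(0, n_max + 1))


print(dominated_up_to_constant(lambda n: 2 * n, lambda n: n ** 2, c=2))  # True
print(dominated_up_to_constant(lambda n: 2 * n, lambda n: n ** 2, c=1))  # False (fails at n = 1)
```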
However, this second definition is still a little too restrictive, as the inequality must hold for every value of \(n\). Consider the functions \(f(n) = n^2\) and \(g(n) = n + 90\). No matter how much we scale up \(f\) by multiplying it by a constant, \(f(0)\) will always be less than \(g(0)\), so we cannot say that \(g\) is dominated by \(f\) up to a constant factor. And again this is silly: it is certainly possible to find a constant \(c\) such that \(g(n) \leq cf(n)\) for every value except \(n = 0\). So we want some way of omitting the value \(n = 0\) from consideration; this is precisely what our third definition gives us.
Let \(f, g : \N \to \R^{\ge 0}\). We say that \(g\) is eventually dominated by \(f\) when there exists \(n_0 \in \R^+\) such that for all \(n \in \N\), if \(n \geq n_0\) then \(g(n) \leq f(n)\).
Let \(f(n) = n^2\) and \(g(n) = n + 90\). Prove that \(g\) is eventually dominated by \(f\).
\[\exists n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \IMP g(n) \leq f(n).\]
Okay, so rather than finding a constant to scale up \(f\), we need to argue that for “large enough” values of \(n\), we have \(n + 90 \leq n^2\). But how do we know which values of \(n\) are “large enough”?
Since this is a quadratic inequality, it is actually possible to solve it directly using factoring or the quadratic formula. But that’s not really the point of this example, so instead we’ll take advantage of the fact that we get to choose the value of \(n_0\) to pick one which is large enough.
Let \(n_0 = 90\), let \(n \in \N\), and assume \(n \geq n_0\). We want to prove that \(n + 90 \leq n^2\).
We will start with the left-hand side and obtain a chain of inequalities that lead to the right-hand side. \[\begin{align*} n + 90 &\leq n + n \tag{since $n \geq 90$} \\ &= 2n \\ &\leq n \cdot n \tag{since $n \geq 2$} \\ &= n^2 \end{align*}\]
Intuitively, this definition allows us to ignore “small” values of \(n\) and focus on the long term, or asymptotic, behaviour of the function. This is particularly important for ignoring the influence of slow-growing terms in a function, which may affect the function values for “small” \(n\), but eventually are overshadowed by the faster-growing terms. In the above example, we knew that \(n^2\) grows faster than \(n\), but because an extra \(+ 90\) was added to the latter function, it took a while for the faster growth rate of \(n^2\) to “catch up” to \(n + 90\).
Our final definition combines both of the previous ones, enabling us to ignore both constant factors and small values of \(n\) when comparing functions.
Let \(f, g: \N \to \R^{\ge 0}\). We say that \(g\) is eventually dominated by \(f\) up to a constant factor when there exist \(c, n_0 \in \R^+\), such that for all \(n \in \N\), if \(n \geq n_0\) then \(g(n) \leq c \cdot f(n)\).
In this case, we also say that \(g\) is Big-O of \(f\), and write \(g \in \cO(f)\).
We use the notation “\(\in \cO(f)\)” here because we formally define \(\cO(f)\) to be the set of functions that are eventually dominated by \(f\) up to a constant factor: \[\cO(f) = \{g \mid g: \N \to \R^{\ge 0},~\text{and}~\exists c, n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \IMP g(n) \leq c \cdot f(n)\}.\]
Let \(f(n) = n^3\) and \(g(n) = n^3 + 100n + 5000\). Prove that \(g \in \cO(f)\). (We can also express this statement as “\(n^3 + 100n + 5000 \in \cO(n^3)\)”.)
\[\exists c, n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \IMP n^3 + 100n + 5000 \leq c n^3.\]
It’s worth pointing out that in this case, \(g\) is neither eventually dominated by \(f\) nor dominated by \(f\) up to a constant factor. Exercise: prove this! So we’ll really need to make use of both constants \(c\) and \(n_0\). They’re both existentially-quantified, so we have a lot of freedom in how to choose them!
Here’s an idea: let’s split up the inequality \(n^3 + 100n + 5000 \leq c n^3\) into three simpler ones: \[\begin{align*} n^3 &\leq c_1 n^3 \\ 100n &\leq c_2 n^3 \\ 5000 &\leq c_3 n^3 \end{align*}\]
If we can make these three inequalities true, adding them together will give us our desired result (setting \(c = c_1 + c_2 + c_3\)). Each of these inequalities is simple enough that we can “solve” them by inspection. Moreover, because we have freedom in how we choose \(n_0\) and \(c\), there are many different ways to satisfy these inequalities! To illustrate this, we’ll look at two different approaches here.
Approach 1: focus on choosing \(n_0\).
It turns out we can satisfy the three inequalities even if \(c_1 = c_2 = c_3 = 1\): the first inequality \(n^3 \leq n^3\) holds for all \(n\), the second holds whenever \(n \geq 10\), and the third holds whenever \(n \geq \sqrt[3]{5000}\). We can pick \(n_0\) to be the largest of these lower bounds on \(n\), namely \(\sqrt[3]{5000}\), and then all three inequalities will be satisfied!
Approach 2: focus on choosing \(c\).
Another approach is to pick \(c_1\), \(c_2\), and \(c_3\) to make the right-hand sides large enough to satisfy the inequalities: for example, \(c_1 = 1\), \(c_2 = 100\), and \(c_3 = 5000\) make each inequality hold for all \(n \geq 1\).
(Using Approach 1) Let \(c = 3\) and \(n_0 = \sqrt[3]{5000}\). Let \(n \in \N\), and assume that \(n \geq n_0\). We want to show that \(n^3 + 100n + 5000 \leq c n^3\).
First, we prove three simpler inequalities: \(n^3 \leq n^3\) (always true); \(100n \leq n^3\), which follows from \(n^2 \geq 100\) since \(n \geq n_0 > 10\); and \(5000 \leq n^3\), since \(n \geq \sqrt[3]{5000}\).
Adding these three inequalities gives us: \[n^3 + 100n + 5000 \leq n^3 + n^3 + n^3 = 3n^3 = c n^3.\]
(Using Approach 2) Let \(c = 5101\) and \(n_0 = 1\). Let \(n \in \N\), and assume that \(n \geq n_0\). We want to show that \(n^3 + 100n + 5000 \leq c n^3\).
First, we prove three simpler inequalities: \(n^3 \leq n^3\) (always true); \(100n \leq 100n^3\), since \(n \leq n^3\) when \(n \geq 1\); and \(5000 \leq 5000n^3\), since \(n^3 \geq 1\) when \(n \geq 1\).
Adding these three inequalities gives us: \[n^3 + 100n + 5000 \leq n^3 + 100n^3 + 5000n^3 = 5101 n^3 = c n^3.\]
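Both witness choices can be checked numerically with a hypothetical helper (as before, a finite check is supporting evidence for our proof, not a replacement for it):

```python
import math


def big_o_witness_holds(g, f, c: float, n0: float, n_max: int = 1000) -> bool:
    """Check that g(n) <= c * f(n) for every natural n with n0 <= n <= n_max."""
    return all(g(n) <= c * f(n) for n in range(math.ceil(n0), n_max + 1))


g = lambda n: n ** 3 + 100 * n + 5000
f = lambda n: n ** 3

print(big_o_witness_holds(g, f, c=3, n0=5000 ** (1 / 3)))  # Approach 1: True
print(big_o_witness_holds(g, f, c=5101, n0=1))             # Approach 2: True
```

Many different pairs \((c, n_0)\) work; each valid pair is an equally good witness for the existential quantifiers in the definition.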
Big-O is a useful way of describing the long-term growth behaviour of functions, but its definition is limited in that it is not required to be an exact description of growth. After all, the key inequality \(g(n) \leq c f(n)\) can be satisfied even if \(f\) grows much, much faster than \(g\). For example, we could say that \(n + 10 \in \cO(n^{100})\) according to our definition, but this is not necessarily informative.
In other words, the definition of Big-O allows us to express upper bounds on the growth of a function, but does not allow us to distinguish between an upper bound that is tight and one that vastly overestimates the rate of growth.
In this section, we will introduce the final new pieces of notation for this chapter, which allow us to express tight bounds on the growth of a function.
Let \(f, g : \N \to \R^{\ge 0}\). We say that \(g\) is Omega of \(f\) when there exist constants \(c, n_0 \in \R^+\) such that for all \(n \in \N\), if \(n \geq n_0\), then \(g(n) \geq c \cdot f(n)\). In this case, we can also write \(g \in \Omega(f)\).
You can think of Omega as the dual of Big-O: when \(g \in \Omega(f)\), then \(f\) is a lower bound on the growth rate of \(g\). For example, we can use the definition to prove that \(n^2 - n \in \Omega(n)\).
We can now express a bound that is tight for a function’s growth rate quite elegantly by combining Big-O and Omega: if \(f\) is asymptotically both a lower and upper bound for \(g\), then \(g\) must grow at the same rate as \(f\).
Let \(f, g : \N \to \R^{\ge 0}\). We say that \(g\) is Theta of \(f\) when \(g\) is both Big-O of \(f\) and Omega of \(f\). In this case, we can write \(g \in \Theta(f)\), and say that \(f\) is a tight bound on \(g\). (Most of the time, when people say “Big-O” they actually mean Theta, i.e., a Big-O upper bound is meant to be tight, because we rarely state upper bounds that overestimate the rate of growth. However, in this course we will always use \(\Theta\) when we mean tight bounds, because we will see some cases where coming up with tight bounds isn’t easy.)
Equivalently, \(g\) is Theta of \(f\) when there exist constants \(c_1, c_2, n_0 \in \R^+\) such that for all \(n \in \N\), if \(n \geq n_0\) then \(c_1 f(n) \leq g(n) \leq c_2 f(n)\).
When we are comparing function growth rates, we typically look for a “Theta bound”, as this means that the two functions have the same approximate rate of growth, not just that one is larger than the other. For example, it is possible to prove that \(10n + 5 \in \Theta(n)\), but \(10n + 5 \notin \Theta(n^2)\). Both of these are good exercises to prove, using the above definitions!
So far, we have seen Big-O expressions like \(\cO(n)\) and \(\cO(n^2)\), where the function in parentheses grows to infinity. However, not every function takes on larger and larger values as its input grows. Some functions are bounded, meaning they never take on a value larger than some fixed constant.
For example, consider the constant function \(f(n) = 1\), which always outputs the value \(1\), regardless of the value of \(n\). What would it mean to say that a function \(g\) is Big-O of this \(f\)? Let’s unpack the definition of Big-O to find out.
\[\begin{align*} & g \in \cO(f) \\ & \exists c, n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \IMP g(n) \leq c \cdot f(n) \\ & \exists c, n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \IMP g(n) \leq c \tag{since $f(n) = 1$} \end{align*}\]
In other words, there exists a constant \(c\) such that \(g(n)\) is eventually always less than or equal to \(c\). We say that such functions \(g\) are asymptotically bounded with respect to their input, and write \(g \in \cO(1)\) to represent this.
Similarly, we use \(g \in \Omega(1)\) to express that functions are greater than or equal to some constant \(c\). You might wonder why we would ever say this—don’t all functions satisfy this property? While the functions we’ll be studying later in this chapter are generally going to be \(\geq 1\), this is not true for all mathematical functions. For example, the function \(g(n) = \frac{1}{n + 1}\) is \(\cO(1)\), but not \(\Omega(1)\). More generally, any function \(g\) where \(\lim_{n \to \infty} g(n) = 0\) is not \(\Omega(1)\).
On the other hand, the function \(g(n) = n^2\) is \(\Omega(1)\) but not \(\cO(1)\). So we reserve \(\Theta(1)\) to refer to the functions that are both \(\cO(1)\) and \(\Omega(1)\).
If we had you always write chains of inequalities to prove that one function is Big-O/Omega/Theta of another, that would get quite tedious rather quickly. Instead, in this section we will prove some properties of this definition which are extremely useful for combining functions together under this definition. These properties can save you quite a lot of work in the long run. We’ll illustrate the proof of one of these properties here; most of the others can be proved in a similar manner, while a few are most easily proved using some techniques from calculus. (We discuss the connection between calculus and asymptotic notation in the following section, but this is not a required part of CSC110.)
The following theorem tells us how to compare four different types of “elementary” functions: constant functions, logarithms, powers of \(n\), and exponential functions.
(Elementary function growth hierarchy)
For all \(a, b \in \R^+\), the following statements are true:
And here is a handy figure to show the progression of functions toward longer running times:

For all \(f : \N \to \R^{\geq 0}\), \(f \in \Theta(f)\).
For all \(f, g : \N \to \R^{\geq 0}\), \(g \in \cO(f)\) if and only if \(f \in \Omega(g)\). (As a consequence of this, \(g \in \Theta(f)\) if and only if \(f \in \Theta(g)\).)
For all \(f, g, h : \N \to \R^{\geq 0}\):
Let \(f, g : \N \TO \R^{\ge 0}\). We can define the sum of \(f\) and \(g\) as the function \(f + g : \N \TO \R^{\ge 0}\) such that \[\forall n \in \N,~ (f + g)(n) = f(n) + g(n).\]
For all \(f, g, h : \N \to \R^{\geq 0}\), the following hold:
We’ll prove the first of these statements.
\[\forall f, g, h : \N \TO \R^{\ge 0},~ \big(f \in \cO(h) \AND g \in \cO(h)\big) \IMP f + g \in \cO(h).\]
This is similar in spirit to the divisibility proofs we did in Section 6.2, which used a term (divisibility) that contained a quantifier. (The definition of Big-O here has three quantifiers, but the idea is the same.) Here, we need to assume that \(f\) and \(g\) are both Big-O of \(h\), and prove that \(f + g\) is also Big-O of \(h\).
Assuming \(f \in \cO(h)\) tells us there exist positive real numbers \(c_1\) and \(n_1\) such that for all \(n \in \N\), if \(n \geq n_1\) then \(f(n) \leq c_1 \cdot h(n)\). There similarly exist \(c_2\) and \(n_2\) such that \(g(n) \leq c_2 \cdot h(n)\) whenever \(n \geq n_2\). Warning: we can’t assume that \(c_1 = c_2\) or \(n_1 = n_2\), or any other relationship between these two sets of variables.
We want to prove that there exist \(c, n_0 \in \R^+\) such that for all \(n \in \N\), if \(n \geq n_0\) then \(f(n) + g(n) \leq c \cdot h(n)\).
The forms of the inequalities we can assume—\(f(n) \leq c_1 h(n)\), \(g(n) \leq c_2 h(n)\)—and the final inequality are identical, and in particular the left-hand side suggests that we just need to add the two given inequalities together to get the third. We just need to make sure that both given inequalities hold by choosing \(n_0\) to be large enough, and let \(c\) be large enough to take into account both \(c_1\) and \(c_2\).
Let \(f, g, h : \N \TO \R^{\ge 0}\), and assume \(f \in \cO(h)\) and \(g \in \cO(h)\). By these assumptions, there exist \(c_1, c_2, n_1, n_2 \in \R^+\) such that for all \(n \in \N\), \[n \geq n_1 \IMP f(n) \leq c_1 \cdot h(n)\] \[n \geq n_2 \IMP g(n) \leq c_2 \cdot h(n)\]
We want to prove that \(f + g \in \cO(h)\), i.e., that there exist \(c, n_0 \in \R^+\) such that for all \(n \in \N\), if \(n \geq n_0\) then \(f(n) + g(n) \leq c \cdot h(n)\).
Let \(n_0 = \max \{n_1, n_2\}\) and \(c = c_1 + c_2\). Let \(n \in \N\), and assume that \(n \geq n_0\). We now want to prove that \(f(n) + g(n) \leq c \cdot h(n)\).
Since \(n_0 \geq n_1\) and \(n_0 \geq n_2\), we know that \(n\) is greater than or equal to \(n_1\) and \(n_2\) as well. Then using the Big-O assumptions, \[\begin{align*} f(n) &\leq c_1 \cdot h(n) \\ g(n) &\leq c_2 \cdot h(n) \end{align*}\]
Adding these two inequalities together yields
\[f(n) + g(n) \leq c_1 h(n) + c_2 h(n) = (c_1 + c_2) h(n) = c \cdot h(n).\]
For all \(f : \N \to \R^{\geq 0}\) and all \(a \in \R^+\), \(a \cdot f \in \Theta(f)\).
For all \(f_1, f_2, g_1, g_2 : \N \to \R^{\geq 0}\), if \(g_1 \in \cO(f_1)\) and \(g_2 \in \cO(f_2)\), then \(g_1 \cdot g_2 \in \cO(f_1 \cdot f_2)\). Moreover, the statement is still true if you replace Big-O with Omega, or if you replace Big-O with Theta.
For all \(f : \N \to \R^{\geq 0}\), if \(f(n)\) is eventually greater than or equal to \(1\), then \(\floor{f} \in \Theta(f)\) and \(\ceil{f} \in \Theta(f)\).
[Note: this subsection is not part of the required course material for CSC110. It is presented mainly for the nice connection between Big-O notation and calculus.]
Our asymptotic notation of \(\cO\), \(\Omega\), and \(\Theta\) is concerned with comparing the long-term behaviour of two functions. It turns out that the concept of “long-term behaviour” is captured in another object of mathematical study, familiar to us from calculus: the limit of the function as its input approaches infinity.
Let \(f: \N \to \R\) and \(L \in \R\). We have the following two definitions. (We’re restricting our attention here to functions with domain \(\N\) because that’s our focus in computer science.) \[ \lim_{n \to \infty} f(n) = L:~ \forall \epsilon \in \R^+,~ \exists n_0 \in \N,~ \forall n \in \N,~ n \geq n_0 \IMP |f(n) - L| < \epsilon \] \[ \lim_{n \to \infty} f(n) = \infty:~ \forall M \in \R^+,~ \exists n_0 \in \N,~ \forall n \in \N,~ n \geq n_0 \IMP f(n) > M \]
Using just these definitions and the definitions of our asymptotic symbols \(\cO\), \(\Omega\), and \(\Theta\), we can prove the following pretty remarkable results:
For all \(f, g: \N \to \R^{\geq 0}\), if \(g(n) \neq 0\) for all \(n \in \N\), then the following statements hold:
Proving these statements is actually a very good (lengthy) exercise for a CSC110 student; they involve keeping track of variables and manipulating inequalities, two key skills you’re developing in this course! And they do tend to be useful in practice (although again, not for this course) for proving asymptotic bounds like \(n^2 \in \cO(1.01^n)\). But note that the converses of these statements are not true; for example, it is possible (and another nice exercise) to find functions \(f\) and \(g\) such that \(g \in \Theta(f)\), but \(\lim_{n \to \infty} f(n)/g(n)\) is undefined.
Let us consider a very similar function to print_integers from the beginning of the chapter:
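The listing itself is missing from this excerpt; a minimal sketch consistent with the surrounding discussion (the name print_items comes from the text, while the body is our reconstruction) is:

```python
def print_items(lst: list) -> None:
    """Print each item in the given list, one per line.
    (Reconstructed sketch; the original listing is not shown in this excerpt.)
    """
    for item in lst:
        print(item)
```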
Here, print_items takes a list as input instead, and so \(n\) is equivalent to len(lst). For the remainder of this course, we will assume input size for a list is always its length, unless something else is specified. How can we use our asymptotic notation to help us analyze the running time of this algorithm? Earlier, we said that the call to print took 1 “basic operation”, but is that true? The answer is, it doesn’t matter. By using asymptotic notation, we no longer need to worry about the constants involved, and so don’t need to worry about whether a single call to print counts as one or ten “basic operations”.
Just as switching from measuring real time to counting “basic operations” allows us to ignore the computing environment in which the program runs, switching from an exact step count to asymptotic notation allows us to ignore machine- and programming language-dependent constants involved in the execution of the code. Having ignored all these external factors, our analysis will concentrate on how the size of the input influences the running time of a program, where we measure running time just using asymptotic notation, and not exact expressions.
Warning: the “size” of the input to a program can mean different things depending on the type of input, or even depending on the program itself. Whenever you perform a running time analysis, be sure to clearly state how you are measuring and representing input size.
Because constants don’t matter, we will use a very coarse measure of “basic operation” to make our analysis as simple as possible. For our purposes, a basic operation (or step) is any block of code whose running time does not depend on the size of the input. (To belabour the point a little, this depends on how we define input size. For integers, we usually will assume they have a fixed size in memory (e.g., 32 bits), which is why arithmetic operations take constant time. But of course if we allow numbers to grow infinitely, this is no longer true, and performing arithmetic operations will no longer take constant time.)
This includes all primitive language operations like most assignment statements, arithmetic calculations, and list and string indexing. The one major statement type which does not fit in this category is a function call—the running time of such statements depends on how long that particular function takes to run. We’ll revisit this in more detail later.
The running time of print_items depends only on the size of the input list, and not the contents of the list. That is, we expect that print_items takes the same amount of time on every list of length \(100\). We can make this a little more clear by introducing one piece of notation that will come in handy for the rest of the chapter.
Let func be an algorithm. For every \(n \in \N\), we define the set \(\cI_{func, n}\) to be the set of allowed inputs to func of size \(n\).
For example, \(\cI_{print\_items, 100}\) is simply the set of all lists of length 100. \(\cI_{print\_items, 0}\) is the set containing just one input: the empty list.
We can restate our observation about print_items in terms of these sets: for all \(n \in \N\), every element of \(\cI_{print\_items, n}\) has the same runtime when passed to print_items.
Let func be an algorithm whose runtime depends only on its input size. We define the running time function of func as \(RT_{func}: \N \to \R^{\geq 0}\), where \(RT_{func}(n)\) is equal to the running time of func when given an input of size \(n\).
The goal of a running time analysis for func is to find a function \(f\) (typically a simple elementary function) such that \(RT_{func} \in \Theta(f)\).
Our first technique for performing this runtime analysis follows four steps:
Because Theta expressions depend only on the fastest-growing term in a sum, and ignore constants, we don’t even need an exact, “correct” expression for the number of basic operations. This allows us to be rough with our analysis, but still get the correct Theta expression.
Consider the function print_items. We define input size to be the number of items of the input list. Perform a runtime analysis of print_items.
Let \(n\) be the length of the input list lst.
For this algorithm, each iteration of the loop can be counted as a single operation, because nothing in it (including the call to print) depends on the size of the input list. (This is actually a little subtle. If we consider the size of individual list elements, it could be the case that some take a much longer time to print than others; imagine printing a string of one-thousand characters vs. the number \(5\). But by defining input size purely as the number of items, we are implicitly ignoring the size of the individual items. The running time of a call to print does not depend on the length of the input list.)
So the running time depends on the number of loop iterations. Since this is a for loop over the lst argument, the loop runs once per element, for a total of \(n\) iterations.
Thus the total number of basic operations performed is \(n\), and so the running time is \(RT_{print\_items}(n) = n\), which is \(\Theta(n)\).
Here is a second example, which has a similar structure to our first example, but also features slightly more code, using the familiar loop accumulator pattern.
Analyse the running time of the following function.
def my_sum(numbers: list[int]) -> int:
    sum_so_far = 0
    for number in numbers:
        sum_so_far = sum_so_far + number
    return sum_so_far

Let \(n\) be the length of the input list (i.e., numbers).
This function body consists of three statements (with the middle statement, the for loop, itself containing more statements). To analyse the total running time of the function, we need to count each statement separately:
- sum_so_far = 0 counts as 1 step, as its running time does not depend on the length of numbers.
- The for loop counts as \(n\) steps in total: it runs once per element of numbers, and each iteration’s body is a single basic operation.
- return sum_so_far counts as 1 step.

The total running time is the sum of these three parts: \(1 + n + 1 = n + 2\), which is \(\Theta(n)\).
It is quite possible to have nested loops in a function body, and analyze the running time in the same fashion. The simplest method of tackling such functions is to count the number of repeated basic operations in a loop starting with the innermost loop and working your way out.
Consider the following function.
Perform a runtime analysis of print_sums.
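The body of print_sums is not shown in this excerpt; based on the analysis that follows (an inner for item2 in lst loop nested in an outer loop), it presumably looks like:

```python
def print_sums(lst: list[int]) -> None:
    """Print the sum of every pair of items in lst.
    (Reconstructed sketch; the original listing is not shown here.)
    """
    for item1 in lst:
        for item2 in lst:
            print(item1 + item2)
```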
Let \(n\) be the length of lst.
The inner loop (for item2 in lst) runs \(n\) times (once per item in lst), and each iteration is just a single basic operation.
But the entire inner loop is itself repeated, since it is inside another loop. The outer loop runs \(n\) times as well, and each of its iterations takes \(n\) operations.
So then the total number of basic operations is \[\begin{align*} RT_{print\_sums}(n) &= \text{steps for the inner loop} \times \text{number of times inner loop is repeated} \\ &= n \times n \\ &= n^2 \end{align*}\]
So the running time of this algorithm is \(\Theta(n^2)\).
Students often make the mistake, however, of assuming that the number of nested loops should always be the exponent of \(n\) in the Big-O expression (e.g., that two levels of nested loops always become \(\Theta(n^2)\)). However, things are not that simple, and in particular, not every loop takes \(n\) iterations.
Consider the following function:
Perform a runtime analysis of this function.
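The listing is missing from this excerpt; a sketch consistent with the analysis below (the function name is ours, and the body matches “Loop 1” of the combined example later in this section) is:

```python
def print_ten_sums(lst: list[int]) -> None:
    """Print item + i for each item in lst and each i from 0 to 9.
    (Hypothetical name; body reconstructed from the surrounding analysis.)
    """
    for item in lst:
        for i in range(10):
            print(item + i)
```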
Let \(n\) be the length of the input list lst. The inner loop repeats 10 times, and each iteration is again a single basic operation, for a total of 10 basic operations. The outer loop repeats \(n\) times, and each iteration takes 10 steps, for a total of \(10n\) steps. So the running time of this function is \(\Theta(n)\). (Even though it has a nested loop!)
Alternative, more concise analysis. The inner loop’s running time doesn’t depend on the number of items in the input list, so we can count it as a single basic operation.
The outer loop runs \(n\) times, and each iteration takes \(1\) step, for a total of \(n\) steps, which is \(\Theta(n)\).
When we are analyzing the running time of two blocks of code executed in sequence (one after the other), we add together their individual running times. The sum theorems are particularly helpful here, as they tell us that we can simply compute Theta expressions for the blocks individually, and then combine them just by taking the fastest-growing one. Because Theta expressions are a simplification of exact mathematical function expressions, taking this approach is often easier and faster than trying to count an exact number of steps for the entire function. (E.g., \(\Theta(n^2)\) is simpler than \(10n^2 + 0.001n + 165\).)
Analyze the running time of the following function, which is a combination of two previous functions.
def combined(lst: list[int]) -> None:
    # Loop 1
    for item in lst:
        for i in range(10):
            print(item + i)

    # Loop 2
    for item1 in lst:
        for item2 in lst:
            print(item1 + item2)

Let \(n\) be the length of lst. We have already seen that the first loop runs in time \(\Theta(n)\), while the second loop runs in time \(\Theta(n^2)\). (By “runs in time \(\Theta(n)\),” we mean that the number of basic operations of the loop is a function \(f(n) \in \Theta(n)\).)
By the Sum of Functions theorem from the previous section, we can conclude that combined runs in time \(\Theta(n^2)\). (Since \(n \in \cO(n^2)\).)
Now let’s look at one last example in this section, which is a function that prints out the sum of all distinct pairs of integers from a given list.
Analyze the running time of the following function
def all_pairs(lst: list[int]) -> None:
    for i in range(0, len(lst)):
        for j in range(0, i):
            print(lst[i] + lst[j])

Like previous examples, this function has a nested loop. However, unlike those examples, here the inner loop’s running time depends on the current value of i, i.e., which iteration of the outer loop we’re on.
This means we cannot take the previous approach of calculating the cost of the inner loop, and multiplying it by the number of iterations of the outer loop; this only works if the cost of each outer loop iteration is the same.
So instead, we need to manually add up the cost of each iteration of the outer loop, which depends on the number of iterations of the inner loop. More specifically, since \(j\) goes from \(0\) to \(i-1\), the number of iterations of the inner loop is \(i\), and each iteration of the inner loop counts as one basic operation.
Let’s see how to do this in a formal analysis.
Let \(n\) be the length of the input list.
We start by analysing the running time of the inner loop for a fixed iteration of the outer loop, i.e., a fixed value of \(i\). Since \(j\) ranges from \(0\) to \(i - 1\), the inner loop performs \(i\) iterations, each counting as 1 step, for a total of \(i\) steps.
Now, the outer loop iterates \(n\) times for \(i\) going from 0 to \(n - 1\). But here the cost of each iteration is not constant. Instead, the cost of iteration \(i\) is \(i\) steps, and so the total cost of the outer loop is:
\[\sum_{i=0}^{n-1} i = \frac{n(n - 1)}{2}\]
Here we used the summation formula for the sum of the first \(n\) natural numbers, which is reviewed in Appendix C.1.
And so the total number of steps taken by all_pairs is \(\frac{n(n - 1)}{2}\), which is \(\Theta(n^2)\). Note that we can write \(\frac{n(n - 1)}{2} = \frac{1}{2} n^2 - \frac{1}{2} n\).
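As a quick sanity check of this count, we can mirror the loop structure of all_pairs and count iterations directly (the helper name is ours, added for illustration):

```python
def count_pair_steps(n: int) -> int:
    """Count the print calls all_pairs would make on a list of length n,
    by mirroring its nested loop structure. (Hypothetical checking helper.)
    """
    steps = 0
    for i in range(0, n):
        for j in range(0, i):
            steps += 1
    return steps
```

For example, count_pair_steps(10) returns 45, matching \(\frac{10 \cdot 9}{2}\).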
In the previous section, we began our study of algorithm running time analysis by looking at functions that are implemented using for loops. We chose for loops as a starting point because they make explicit the repeated statements that occur when we execute a function body, while also being relatively straightforward to analyze because of their predictable iteration patterns.
In this section, we’ll extend what we learned about for loops to two different kinds of Python code: comprehension expressions and while loops. We’ll see how all three obey similar patterns when it comes to repeating code, but while loops offer both more flexibility and more complexity in what they can do.
Consider the following function:
def square_all(numbers: list[int]) -> list[int]:
    """Return a new list containing the squares of the given numbers."""
    return [x ** 2 for x in numbers]

How do we analyze the running time of this code? It turns out that we do so in the same way as a for loop:

- The leftmost expression, x ** 2, takes 1 step (i.e., is constant time).
- The collection being iterated over (here, numbers) determines how many times the leftmost expression is evaluated.

So let \(n\) be the length of the input list numbers. The comprehension expression takes \(n\) steps (1 step per element of numbers). So the running time of square_all is \(n\) steps, which is \(\Theta(n)\).
Importantly, the fact that a comprehension is creating a new collection (in our above example, a list) does not count as additional time when analysing the cost of a comprehension. This is true for all three of list, set, and dictionary comprehensions, and so the same analysis would hold in the above function if we had used a set or dictionary comprehension instead.
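For instance, a set-comprehension variant (a hypothetical function of ours, not from the notes) admits exactly the same \(\Theta(n)\) analysis:

```python
def square_all_set(numbers: list[int]) -> set[int]:
    """Return a set of the squares of the given numbers.
    (Hypothetical variant illustrating that the comprehension analysis
    is identical for set comprehensions.)
    """
    return {x ** 2 for x in numbers}
```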
Analysing the running time of code involving while loops follows the same principle as for loops: we calculate the sum of the different loop iterations, either using multiplication (when the iteration running time is constant) or a summation (when the iterations have different running times). There is one subtle twist, though: a while loop requires that we write statements to initialize the loop variable(s) before the loop, and update the loop variable(s) inside the loop body. We must be careful to count the cost of these statements as well, just like we did for statements involving loop accumulators in the previous section.
To keep things simple, our first example is a simple rewriting of an earlier example using a while loop instead of a for loop.
Analyse the running time of the following function.
def my_sum_v2(numbers: list[int]) -> int:
    """Return the sum of the given numbers."""
    sum_so_far = 0
    i = 0
    while i < len(numbers):
        sum_so_far = sum_so_far + numbers[i]
        i = i + 1
    return sum_so_far

Let \(n\) be the length of the input numbers.
In this function, we now have both an accumulator and the loop variable to worry about. We can still divide up the function into three parts, and compute the cost of each part separately.
The cost of the assignment statements sum_so_far = 0 and i = 0 is constant time. We’ll count this as a constant-time block of code, which is just 1 step. This might be a bit surprising, because they are two lines of code and look like two separate “actions”. The power of our asymptotic notation is that whether we count this block of code as 1 step or 2, we get the same Theta bound in the end! And so we just go with the simpler one here, but you’re welcome to count this as “two steps” in your own analyses if you find that more intuitive.
To analyse the while loop, we need to determine the cost of each iteration and the total number of iterations, just like a for loop.
- Each iteration of the loop counts as 1 step, since the two statements in the loop body are both constant time.
- The loop runs \(n\) times: i starts at 0 and increases by 1 until it reaches \(n\). Note that this is less obvious than the for loop version! Here we need to look at three different places in the code: how i is initialized, how i is updated inside the loop body, and how i is used in the loop condition.
- The return statement again takes constant time, and so counts as 1 step.
So the total running time is \(1 + n + 1 = n + 2\), which is \(\Theta(n)\).
Now, the previous example was a little contrived because we could have implemented the same function more simply using a for loop. Here is another example, which uses a while loop to compute powers of two to act as indexes into a list.
Analyse the running time of the following function.
def my_sum_powers_of_two(numbers: list[int]) -> int:
    """Return the sum of the given numbers whose indexes are powers of 2.

    That is, return numbers[1] + numbers[2] + numbers[4] + numbers[8] + ...
    """
    sum_so_far = 0
    i = 1
    while i < len(numbers):
        sum_so_far = sum_so_far + numbers[i]
        i = i * 2
    return sum_so_far

Let \(n\) be the length of the input list numbers.
This code has much of the same structure as my_sum_v2, and we can reuse most of the same analysis here. In particular, we’ll still count the initial assignment statements as 1 step, and the return statement as 1 step. To analyse the loop, we still need the number of steps per iteration and the total number of iterations. Each iteration still takes constant time (1 step), same as my_sum_v2. It is the number of loop iterations that is most challenging.
To determine the number of loop iterations, we need to take into account the initial value of i, how i is updated, and how i is used in the while loop condition. More formally, we follow these steps:
Find a pattern for how i changes at each loop iteration, and a general formula for \(i_k\), the value of i after \(k\) iterations. For relatively simple updates, we can find a pattern by writing a small loop tracing table, showing the value of the loop variable at the end of the iteration.
| Iteration | Value of i |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
So we find that after \(k\) iterations, \(i_k = 2^k\). Note that we haven’t proved that this formula is true; a formal proof would require a proof by induction, which you may have already seen in your math classes.
We know the while loop continues while i < len(numbers). Another way to phrase this is that the while loop continues until i >= len(numbers).
So to find the number of iterations, we need to find the smallest value of \(k\) such that \(i_k \geq n\) (making the loop condition False). This is where our formula for \(i_k\) comes in:
\[\begin{align*} i_k &\geq n \\ 2^k &\geq n \\ k &\geq \log_2 n \end{align*}\] So we need to find the smallest value of \(k\) such that \(k \geq \log_2 n\). This is exactly the definition of the ceiling function, and so the smallest value of \(k\) is \(\ceil{\log_2 n}\).
So the while loop iterates \(\ceil{\log_2 n}\) times, with 1 step per iteration, for a total of \(\ceil{\log_2 n}\) steps.
Putting it all together, the function my_sum_powers_of_two has a running time of \(1 + \ceil{\log_2 n} + 1 = \ceil{\log_2 n} + 2\), which is \(\Theta(\log n)\). Note that our convention is to drop the base of the log when writing a Theta expression, since all bases \(> 1\) are equivalent to each other in Theta bounds.
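We can sanity-check the \(\ceil{\log_2 n}\) iteration count by tracking just the loop variable’s updates (the helper name is ours, added for illustration):

```python
import math


def count_power_iterations(n: int) -> int:
    """Count the while-loop iterations my_sum_powers_of_two performs on a
    list of length n, by tracking only the loop variable i.
    (Hypothetical checking helper.)
    """
    count = 0
    i = 1
    while i < n:
        i = i * 2
        count += 1
    return count


# For n >= 2, this matches math.ceil(math.log2(n)); e.g., n = 1000 gives 10.
```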
It turns out that the extreme flexibility of while loops can make analysing their running time much more subtle than it might appear. Our next example considers a standard loop, with a twist in how the loop variable changes at each iteration.
def twisty(n: int) -> int:
    """Return the number of iterations it takes for this special loop to stop
    for the given n.
    """
    iterations_so_far = 0
    x = n
    while x > 1:
        if x % 2 == 0:
            x = x // 2
        else:
            x = 2 * x - 2
        iterations_so_far = iterations_so_far + 1
    return iterations_so_far

Even though the individual lines of code in this example are simple, they combine to form a pretty complex situation. The challenge with analyzing the runtime of this function is that, unlike previous examples, here the loop variable x does not always get closer to the loop stopping condition; sometimes it does (when divided by two), and sometimes it increases!
The key insight into analyzing the runtime of this function is that we don’t just need to look at what happens after a single loop iteration, but instead perform a more sophisticated analysis based on multiple iterations. As preparation, try tracing twisty on inputs \(7\), \(9\), and \(11\). More concretely, we’ll prove the following claim.
For any integer value of x greater than \(2\), after two iterations of the loop in twisty the value of x decreases by at least one.
Let \(x_0\) be the value of variable x at some iteration of the loop, and assume \(x_0 > 2\). Let \(x_1\) be the value of \(x\) after one loop iteration, and \(x_2\) the value of \(x\) after two loop iterations. We want to prove that \(x_2 \leq x_0 - 1\).
We divide up this proof into four cases, based on the remainder of \(x_0\) when dividing by four.The intuition for these cases is that this determines whether \(x_0\) is even/odd, and whether \(x_1\) is even/odd. We’ll only do two cases here to illustrate the main idea, and leave the last two cases as an exercise.
Case 1: Assume \(4 \DIV x_0\), i.e., \(\exists k \in \Z,~ x_0 = 4k\).
In this case, \(x_0\) is even, so the if branch executes in the first loop iteration, and so \(x_1 = \frac{x_0}{2} = 2k\). And so then \(x_1\) is also even, and so the if branch executes again: \(x_2 = \frac{x_1}{2} = k\).
So then \(x_2 = \frac{1}{4}x_0 \leq x_0 - 1\) (since \(x_0 \geq 4\)), as required.
Case 2: Assume \(4 \DIV x_0 - 1\), i.e., \(\exists k \in \Z,~ x_0 = 4k + 1\).
In this case, \(x_0\) is odd, so the else branch executes in the first loop iteration, and so \(x_1 = 2x_0 - 2 = 8k\). Then \(x_1\) is even, and so \(x_2 = \frac{x_1}{2} = 4k\).
So then \(x_2 = 4k = x_0 - 1\), as required.
Cases 3 and 4: left as exercises.
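Returning to the tracing suggestion above, here are the traces for inputs 7, 9, and 11 computed directly (the function is repeated here so the snippet is self-contained):

```python
def twisty(n: int) -> int:
    # Copied from the listing above; // keeps x an integer.
    iterations_so_far = 0
    x = n
    while x > 1:
        if x % 2 == 0:
            x = x // 2
        else:
            x = 2 * x - 2
        iterations_so_far = iterations_so_far + 1
    return iterations_so_far


print(twisty(7), twisty(9), twisty(11))  # prints: 6 5 7
```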
Now let’s see how to take this claim and use it to formally analyse the running time of twisty.
(Analysis of twisty)
As before, we count the variable initializations before the while loop as 1 step, and the return statement as 1 step.
For the while loop:
The loop body also takes 1 step, since all of the code consists of operations that do not depend on the size of the input \(n\).
To count the number of loop iterations, we first observe that \(x\) starts at \(n\) and the loop terminates when \(x\) reaches 1 or less. The Claim tells us that after every two iterations, the value of \(x\) decreases by at least one.
So then after 2 iterations, \(x \leq n - 1\), after 4 iterations, \(x \leq n - 2\), and in general, after \(2k\) iterations, \(x \leq n - k\). This tells us that after \(2(n - 1)\) loop iterations, \(x \leq n - (n - 1) = 1\), and so the loop must stop.
This analysis tells us that the loop iterates at most \(2(n - 1)\) times, and so takes at most \(2(n - 1)\) steps (remember that each iteration takes 1 step).
So the total running time of twisty is at most \(1 + 2(n - 1) + 1 = 2n\) steps, which is \(\cO(n)\).
Something funny happened at the end of the above analysis: we did not actually compute the exact number of steps the function twisty takes, only an upper bound on the number of steps (signalled by our use of the phrase “at most”). This means that we were only able to conclude a Big-O bound, and not a Theta bound, on the running time of this function: its running time is at most \(\cO(n)\), but we don’t know whether this bound is tight.
In fact, it isn’t! It is possible to prove something pretty remarkable about what happens to the variable x after three iterations of the twisty loop.
(Improved claim)
For any integer value of x greater than \(2\), let \(x_0\) be the initial value of x and let \(x_3\) be the value of x after three loop iterations. Then \(\frac{1}{8} x_0 \leq x_3 \leq \frac{1}{2} x_0\).
It is a good exercise to prove this claim (hint: you can use the same approach as the previous claim, but consider remainders when you divide by 8 instead of 4), and then use it to conduct a more detailed running time analysis of twisty. When you do so, you should be able to show that the running time of twisty is both \(\cO(\log n)\) and \(\Omega(\log n)\), and hence conclude that its running time is actually \(\Theta(\log n)\), not just \(\cO(n)\)!
So far in our study of running time, we have looked at algorithms that use only primitive numeric data types or loops/comprehensions over collections. In this section, we’re going to study the running time of operations on built-in collection data types (e.g., lists, sets, dictionaries), and the custom data classes that we create. Because a single instance of these compound data types can be very large (e.g., a list of one trillion elements!), the natural question we will ask is, “what operations will take longer when called on very large data structures?” We’ll also study why this is the case for Python lists by studying how they are stored in computer memory. For the other compound data types, however, their implementations are more complex and so we’ll only touch on them in this course.
Python provides a module (called timeit) that can tell us how long Python code takes to execute on our machine. Here’s an example showing how to import the module and use it:
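The example itself is not shown in this excerpt; based on the description that follows, it presumably resembles the snippet below (the exact timing value differs on every machine):

```python
from timeit import timeit

# Time one thousand evaluations of the expression '5 + 15'.
elapsed = timeit('5 + 15', number=1000)
print(elapsed)  # total seconds for all 1000 runs; machine-dependent
```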
The call to timeit will perform the operation 5 + 15 (which we passed in as a string) one thousand times. The function returned the total time elapsed, in seconds, to perform all thousand operations. The return value in the notes is specific to one machine—try the code on your own machine to see how you compare!
Next, let’s create two lists with different lengths for comparison: 1,000 and 1,000,000:
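The setup code is not shown here; consistent with the names lst_1k and lst_1m used in the timing calls below, it was presumably something like:

```python
# Hypothetical setup matching the list names used in the timing calls below.
lst_1k = list(range(1000))     # one thousand elements
lst_1m = list(range(1000000))  # one million elements
```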
We know that there are several operations available for lists. For example, we can search the list using the in operator, look up an element at a specific index, or mutate the list by inserting or deleting elements. Let’s compare the time it takes to access the first element of each list:
>>> timeit('lst_1k[0]', number=10, globals=globals())
5.80001506023109e-06
>>> timeit('lst_1m[0]', number=10, globals=globals())
5.599984433501959e-06
The length of the list does not seem to impact the time it takes to retrieve an element at this specific index. Let’s compare the time it takes to insert a new element at the front of the list:
>>> timeit('list.insert(lst_1k, 0, -1)', number=10, globals=globals())
0.00014379998901858926
>>> timeit('list.insert(lst_1m, 0, -1)', number=10, globals=globals())
0.1726928999996744
There is a clear difference in time (by several orders of magnitude) between inserting at the front of a list with one thousand elements versus one million elements.
Indeed, every list operation has its own implementation whose running time we can analyze, using the same techniques we studied earlier in this chapter. But in order to fully understand why these implementations work the way they do, we need to dive deeper into how Python lists really work.
Recall that a Python list object represents an ordered sequence of other objects, which we call its elements. When we studied the object-based memory model in Chapter 5, we drew diagrams like this to represent a list:

Our memory-model diagrams are an abstraction. In reality, all data used by a program are stored in blocks of computer memory, which are labeled by numbers called memory addresses, so that the program can keep track of where each piece of data is stored.
Here is the key idea for how the Python interpreter stores lists in memory. For every Python list object, the references to its elements are stored in a contiguous block of memory. For example, here is how we could picture the same list as in the previous diagram, now stored in blocks of computer memory:

As before, our list stores four integers. In memory, the four consecutive blocks 400–403 store references to the actual integer values. Of course, even this diagram is a simplification of what’s actually going on in computer memory, but it illustrates the main point: the references to the list elements are always stored consecutively. This type of list implementation is used by the Python interpreter and many other programming languages, and is called an array-based list implementation.
The primary reason Python uses an array-based list implementation is that it makes list indexing fast. Because the list element references are stored in consecutive memory locations, accessing the i-th element can be done with simple arithmetic: take the memory address where the list starts, and then increase it by i blocks to obtain the location of the i-th element reference. Think about it like this: suppose you’re walking down a hallway with numbered rooms on just one side and room numbers going up by one. If you see that the first room number is 11, and you’re looking for room 15, you can be confident that it is the fifth room down the hall. More precisely, this means that list indexing is a constant-time operation: its running time does not depend on the size of the list or the index i being accessed. So even with a very long list or a very large index, we expect list indexing to take the same amount of time (and be very fast!).
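This address arithmetic can be sketched in a few lines (a hypothetical illustration using the block numbers from the earlier diagram, not real CPython internals):

```python
def element_address(list_start: int, i: int) -> int:
    """Return the memory block holding the reference to element i,
    assuming the references start at block list_start, one per block.

    This is constant time: one multiplication/addition, regardless of i.
    """
    return list_start + i


# The list in the diagram starts at block 400, so:
print(element_address(400, 0))  # 400
print(element_address(400, 3))  # 403
```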
This is true both for evaluating a list indexing expression and for assigning to a list index, e.g., lst[1] = 100. In the latter case, the Python interpreter takes constant time to calculate the memory address where the lst[1] reference is stored and modify it to refer to a new object.
Array-based lists have constant-time indexing, but as we’ll see again and again in our study of data types, fast operations almost always come at the cost of slow ones. In order for Python to be able to calculate the address of an arbitrary list index, these references must always be stored in a contiguous block of memory; there can’t be any “gaps”.
Maintaining this contiguity has implications for how insertion and deletion in a Python list work. When a list element is deleted, all items after it must be moved back one memory block to fill the gap.

Similarly, when a list element is inserted somewhere in the list, all items after it must be moved forward one block.

In general, suppose we have a list lst of length \(n\) and we wish to remove the element at index \(i\) in the list, where \(0 \leq i < n\). Then \(n - i - 1\) elements must be moved, and the number of “basic operations” this requires is \(\Theta(n - i)\). Here we’re counting moving the contents of one memory block to another as a basic operation. Similarly, if we want to insert an element into a list of length \(n\) at index \(i\), \(n - i\) elements must be moved, and so the running time of this operation is \(\Theta(n - i)\).
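The shifting done by deletion can be sketched in plain Python, using a list to stand in for the contiguous block of references (an illustration of the idea, not the interpreter’s actual implementation):

```python
def delete_at(blocks: list, i: int) -> None:
    """Sketch of array-based deletion: remove the element at index i
    by shifting every later reference back one block."""
    n = len(blocks)
    for j in range(i, n - 1):     # n - i - 1 "move" basic operations
        blocks[j] = blocks[j + 1]
    blocks.pop()                  # release the now-unused last block


lst = [10, 20, 30, 40]
delete_at(lst, 1)
print(lst)  # [10, 30, 40]
```

Insertion is symmetric: the same \(n - i\) references are shifted forward instead of backward.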
At the extremes, this means that inserting/deleting at the front of a Python list (\(i = 0\)) takes \(\Theta(n)\) time, i.e., proportional to the length of list; on the other hand, inserting/deleting at the back of a Python list (\(i = n - 1\)) is a constant-time operation. We can see evidence of this in the following timeit comparisons:
>>> timeit('list.append(lst_1k, 123)', number=10, globals=globals())
1.0400000064691994e-05
>>> timeit('list.append(lst_1m, 123)', number=10, globals=globals())
1.3099999932819628e-05
>>> timeit('list.insert(lst_1k, 0, 123)', number=10, globals=globals())
4.520000015872938e-05
>>> timeit('list.insert(lst_1m, 0, 123)', number=10, globals=globals())
0.011574500000051557

| Operation | Running time (\(n\) = len(lst)) |
|---|---|
| List indexing (lst[i]) | \(\Theta(1)\) |
| List index assignment (lst[i] = …) | \(\Theta(1)\) |
| List insertion at end (list.append(lst, ...)) | \(\Theta(1)\) |
| List deletion at end (list.pop(lst)) | \(\Theta(1)\) |
| List insertion at index (list.insert(lst, i, ...)) | \(\Theta(n - i)\) |
| List deletion at index (list.pop(lst, i)) | \(\Theta(n - i)\) |
Finally, we should point out one subtle assumption we’ve just made in our analysis of list insertion: that there will always be free memory blocks at the end of the list for the list to expand into. In practice, this is almost always true, and so for the purposes of this course we’ll stick with this assumption. But in CSC263/265 (Data Structures and Analysis), you’ll learn about how programming languages handle array-based list implementations to take into account whether there is “free space” or not, and how these operations still provide the running times we’ve presented in this section.
Now that we’ve learned about the running time of basic list operations, let’s see how to apply this knowledge to analysing the running time of algorithms that use these operations. We’ll look at two different examples.
Analyse the running time of the following function.
def squares(numbers: list[int]) -> list[int]:
"""Return a list containing the squares of the given numbers."""
squares_so_far = []
for number in numbers:
list.append(squares_so_far, number ** 2)
return squares_so_far

Let \(n\) be the length of the input list (i.e., numbers). Note the similarities between this analysis and our analysis of sum_so_far in Section 8.4.
This function body consists of three statements (with the middle statement, the for loop, itself containing more statements). To analyse the total running time of the function, we need to count each statement separately:
- The assignment squares_so_far = [] counts as 1 step, as its running time does not depend on the length of numbers.
- Inside the loop body, we call list.append(squares_so_far, number ** 2). Based on our discussion in the previous section, this call to list.append takes constant time (\(\Theta(1)\) steps), and so the entire loop body counts as 1 step. The loop runs for \(n\) iterations, one per element of numbers, so it takes \(n\) steps in total.
- The return statement counts as 1 step.

The total running time is the sum of these three parts: \(1 + n + 1 = n + 2\), which is \(\Theta(n)\).
In our above analysis, we had to take into account the running time of calling list.append, but this quantity did not depend on the length of the input list. Our second example looks very similar to the first, but uses a different list method, resulting in a dramatic difference in running time:
def squares_reversed(numbers: list[int]) -> list[int]:
"""Return a list containing the squares of the given numbers, in reverse order."""
squares_so_far = []
for number in numbers:
# Now, insert number ** 2 at the START of squares_so_far
list.insert(squares_so_far, 0, number ** 2)
return squares_so_far

Let \(n\) be the length of the input list (i.e., numbers).
This function body consists of three statements (with the middle statement, the for loop, itself containing more statements). To analyse the total running time of the function, we need to count each statement separately:
- The assignment squares_so_far = [] counts as 1 step, as its running time does not depend on the length of numbers.
- The for loop takes \(n\) iterations. We analyse the cost of each iteration next.
Inside the loop body, we call list.insert(squares_so_far, 0, number ** 2). As we discussed above, inserting at the front of a Python list causes all of its current elements to be shifted over, taking time proportional to the size of the list. Therefore this call takes \(\Theta(k)\) time, where \(k\) is the current length of squares_so_far. (We can’t use \(n\) here, because \(n\) already refers to the length of numbers!)
For the purpose of our analysis, we count a function call with \(\Theta(k)\) running time as taking \(k\) steps, i.e., ignoring the “eventually” and “constant factors” part of the definition of Theta. And so we say that the loop body takes \(k\) steps.
In order to calculate the total running time of the loop, we need to add the running times of every iteration. We know that squares_so_far starts as empty, and then increases in length by 1 at each iteration. So then \(k\) (the current length of squares_so_far) takes on the values \(0, 1, 2, \dots, n - 1\), and we can calculate the total running time of the for loop using a summation:
\[\sum_{k=0}^{n-1} k = \frac{(n-1)n}{2}\]
- The return statement counts as 1 step.

The total running time is the sum of these three parts: \(1 + \frac{(n-1)n}{2} + 1 = \frac{(n-1)n}{2} + 2\), which is \(\Theta(n^2)\).
To summarize, this single line of code change (from list.append to list.insert at index 0) causes the running time to change dramatically, from \(\Theta(n)\) to \(\Theta(n^2)\). When calling functions and performing operations on data types, we must always be conscious of which functions/operations we’re using and their running times. It is easy to skim over a function call because it takes up so little visual space, but that one call might make the difference between running times of \(\Theta(n)\), \(\Theta(n^2)\), or even \(\Theta(2^n)\)! Lastly, you might be curious how we could speed up squares_reversed. It turns out that Python has a built-in method list.reverse that mutates a list by reversing it, and this method has a \(\Theta(n)\) running time. So we could accumulate the squares by using list.append, and then call list.reverse on the final result.
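Following that suggestion, a sketch of the faster approach (the function name is ours):

```python
def squares_reversed_fast(numbers: list[int]) -> list[int]:
    """Return a list of the squares of the given numbers, in reverse order.

    Runs in Theta(n) time: n constant-time appends, plus one Theta(n)
    call to list.reverse at the end.
    """
    squares_so_far = []
    for number in numbers:
        list.append(squares_so_far, number ** 2)  # Theta(1) per call
    list.reverse(squares_so_far)                  # Theta(n), called once
    return squares_so_far
```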
It turns out that Python implements sets and dictionaries in very similar ways, so we’ll discuss them together in this section. Both are implemented using a more primitive data structure called a hash table, which you’ll also learn about in CSC263/265. The benefit of using hash tables is that they allow constant-time lookup, insertion, and removal of elements (for a set) and key-value pairs (for a dictionary)! This is actually a simplification of how hash tables are implemented: while we’ll treat all these operations as constant-time in this course, this relies on some technical assumptions which hold in most, but not all, cases.
But of course, there is a catch. The trade-off of how Python uses hash tables is that the elements of a set and the keys of a dictionary cannot be mutable data types, a restriction we discussed earlier in the course. This can be inconvenient, but in general it is a small price to pay for the speed of these operations.
So if you only care about set operations like “element of”, it is more efficient to use a set than a list, as the following comparison shows. (You’ll notice that we haven’t formally discussed the running time of the list in operation in this section; we’ll study it in the next section.)
>>> lst1M = list(range(10 ** 6))
>>> set1M = set(range(10 ** 6))
>>> timeit('5000000 in lst1M', number=10, globals=globals())
0.16024739999556914
>>> timeit('5000000 in set1M', number=10, globals=globals())
4.6000059228390455e-06
It turns out that data classes (and in fact all Python data types) store their instance attributes using a dictionary that maps attribute names to their corresponding values. This means that data classes benefit from the constant-time dictionary operations that we discussed above.
Explicitly, the two operations that we can perform on a data class instance are looking up an attribute value (e.g., david.age) and mutating the instance by assigning to an attribute (e.g., david.age = 99). Both of these operations take constant time, independent of how many instance attributes the data class has or what values are stored for those attributes.
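For example, using a minimal hypothetical data class (the attribute name echoes the example above):

```python
from dataclasses import dataclass


@dataclass
class Person:
    """A minimal data class for illustration."""
    age: int


david = Person(age=40)
x = david.age    # constant-time attribute lookup
david.age = 99   # constant-time attribute assignment
```

Both operations are dictionary lookups/assignments under the hood, so their cost does not grow with the number of attributes.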
| Operation | Running time |
|---|---|
| Set/dict search (in) | \(\Theta(1)\) |
| set.add / set.remove | \(\Theta(1)\) |
| Dictionary key lookup (d[k]) | \(\Theta(1)\) |
| Dictionary key assignment (d[k] = ...) | \(\Theta(1)\) |
| Data class attribute access (obj.attr) | \(\Theta(1)\) |
| Data class attribute assignment (obj.attr = ...) | \(\Theta(1)\) |
Finally, we’ll briefly discuss a few built-in aggregation functions we’ve seen so far in this course.
sum, max, min have a linear running time (\(\Theta(n)\)), proportional to the size of the input collection. This should be fairly intuitive, as each element of the collection must be processed in order to calculate each of these values.
len is a bit surprising: it has a constant running time (\(\Theta(1)\)), independent of the size of the input collection. In other words, the Python interpreter does not need to process each element of a collection when calculating the collection’s size! Instead, each of these collection data types stores a special attribute referring to the size of that collection. And as we discussed for data classes, accessing attributes takes constant time. There is one technical difference between data class attributes and these collection “size” attributes: we can’t access the latter directly in Python code using dot notation, only through calling len on the collection. This is a result of how the Python language implements these built-in collection data types.
any and all are a bit different. Intuitively, they may need to check every element of their input collection, just like sum or max, but they can also short-circuit (stopping before checking every element), just like the logical or and and operators. This means their running time isn’t a fixed function of the input size, but rather a possible range of values, depending on whether this short-circuiting happens or not. We’ll discuss how to formally analyse the running time of such functions in the next section.
In Section 8.3, we saw how to use asymptotic notation to characterize the rate of growth of the number of “basic operations” as a way of analyzing the running time of an algorithm. This approach allows us to ignore details of the computing environment in which the algorithm is run, and machine- and language-dependent implementations of primitive operations, and instead characterize the relationship between the input size and number of basic operations performed.
However, this focus on just the input size is a little too restrictive. Even though we can define input size differently for each algorithm we analyze, we tend not to stray too far from the “natural” definitions (e.g., length of list). In practice, though, algorithms often depend on the actual value of the input, not just its size. For example, consider the following function, which searches for an even number in a list of integers. This is very similar to how the in operator is implemented for Python lists.
def has_even(numbers: list[int]) -> bool:
"""Return whether numbers contains an even element."""
for number in numbers:
if number % 2 == 0:
return True
return False

Because this function returns as soon as it finds an even number in the list, its running time is not necessarily proportional to the length of the input list.
The running time of a function can vary even when the input size is fixed. Or, using the notation we learned earlier this chapter, the inputs in \(\cI_{has\_even, 10}\) do not all have the same runtime. The question “what is the running time of has_even on an input of length \(n\)?” does not make sense: for a given input, the runtime depends not just on its length but on which of its elements are even. We illustrate this in the following plot, which shows the results of using timeit to measure the running time of has_even on randomly-chosen lists. While every timing experiment has some inherent uncertainty in the results, the spread of running times cannot be explained by that alone!

Because our asymptotic notation is used to describe the growth rate of functions, we cannot use it to describe the growth of a whole range of values with respect to increasing input sizes. A natural approach to fix this problem is to focus on the maximum of this range, which corresponds to the slowest the algorithm could run for a given input size.
Let func be a program. We define the function \(WC_{func}: \N \to \N\), called the worst-case running time function of func, as follows:
\[
WC_{func}(n) = \max \big\{ \text{running time of executing $func(x)$} \mid x \in \cI_{func, n} \big\}
\]
Here, “running time” is measured as an exact number of basic operations: we are taking the maximum of a set of numbers, not a set of asymptotic expressions.
Note that \(WC_{func}\) is a function, not a (constant) number: it returns the maximum possible running time for an input of size \(n\), for every natural number \(n\). And because it is a function, we can use asymptotic notation to describe it, saying things like “the worst-case running time of this function is \(\Theta(n^2)\).”
The goal of a worst-case runtime analysis for func is to find an elementary function \(f\) such that \(WC_{func} \in \Theta(f)\).
However, it takes a bit more work to obtain tight bounds on a worst-case running time than on the runtime functions of the previous section. It is difficult to compute the exact maximum number of basic operations performed by an algorithm for every input size: doing so requires identifying an input for each input size, counting its number of basic operations, and then proving that every input of that size takes at most this number of operations. Instead, we will generally take a two-pronged approach: proving matching upper and lower bounds on the worst-case running time of our algorithm.
Let func be a program, and \(WC_{func}\) its worst-case runtime function. We say that a function \(f: \N \to \R^{\geq 0}\) is an upper bound on the worst-case runtime when \(WC_{func} \in \cO(f)\).
To get some intuition about what an upper bound on the worst-case running time means, suppose we use absolute dominance rather than Big-O. In this case, there’s a very intuitive way to expand the phrase “\(WC_{func}\) is absolutely dominated by \(f\)”:
\[\begin{align*} &\forall n \in \N,~ WC_{func}(n) \leq f(n) \\ \Longleftrightarrow \, &\forall n \in \N,~ \max \big\{ \text{running time of executing $func(x)$} \mid x \in \cI_{func, n} \big\} \leq f(n) \\ \Longleftrightarrow \, &\forall n \in \N,~ \forall x \in \cI_{func, n},~ \text{running time of executing $func(x)$} \leq f(n) \end{align*}\]
The last line comes from the fact that if we know the maximum of a set of numbers is less than some value \(K\), then all numbers in that set must be less than \(K\). Thus an upper bound on the worst-case runtime is equivalent to an upper bound on the runtimes of all inputs.
Now when we apply the definition of Big-O instead of absolute dominance, we get the following translation of \(WC_{func} \in \cO(f)\):
\[ \exists c, n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \Rightarrow \big(\forall x \in \cI_{func, n},~ \text{running time of executing $func(x)$} \leq c \cdot f(n) \big) \]
To approach an analysis of an upper bound on the worst-case, we typically find a function \(g\) such that \(WC_{func}\) is absolutely dominated by \(g\), and then find a simple function \(f\) such that \(g \in \cO(f)\). But how do we find such a \(g\)? And what does it mean to upper bound all runtimes of a given input size? We’ll illustrate the technique in our next example.
Find an asymptotic upper bound on the worst-case running time of has_even.
The intuitive translation using absolute dominance is usually enough for an upper bound analysis. In particular, the statement begins with two universal quantifiers, \(\forall n \in \N,~ \forall x \in \cI_{func, n}\), and knowing this alone should tell you how we’ll start our proof, using the same proof techniques we learned earlier!
(Upper bound on worst-case)
First, let \(n \in \N\) and let numbers be an arbitrary list of length \(n\).
Now we’ll analyse the running time of has_even, except we can’t assume anything about the values inside numbers, because it’s an arbitrary list. But we can still find an upper bound on the running time:
The loop (for number in numbers) iterates at most \(n\) times. Each loop iteration counts as a single step (because it is constant time), so the loop takes at most \(n \cdot 1 = n\) steps in total.
The return False statement (if it is executed) counts as \(1\) basic operation.
Therefore the running time is at most \(n + 1\), and \(n + 1 \in \cO(n)\). So we can conclude that the worst-case running time of has_even is \(\cO(n)\).
Note that we did not prove that has_even(numbers) takes exactly \(n + 1\) basic operations for an arbitrary input numbers (this is false); we only proved an upper bound on the number of operations. And in fact, we don’t even care that much about the exact number: what we ultimately care about is the asymptotic growth rate, which is linear for \(n + 1\). This allowed us to conclude that the worst-case running time of has_even is \(\cO(n)\).
But because we calculated an upper bound rather than an exact number of steps, we can only conclude a Big-O bound, not a Theta bound: we don’t yet know that this upper bound is tight. (If this is surprising, note that we could have done the above proof with \(n+1\) replaced by \(5000n + 110\), and it would still have been mathematically valid.)
So how do we prove our upper bound is tight? Since we’ve just shown that \(WC_{has\_even}(n) \in \cO(n)\), we need to prove the corresponding lower bound \(WC_{has\_even}(n) \in \Omega(n)\). But what does it mean to prove a lower bound on the maximum of a set of numbers? Suppose we have a set of numbers \(S\), and say that “the maximum of \(S\) is at least \(50\).” This doesn’t tell us what the maximum of \(S\) actually is, but it does give us one piece of information: there has to be a number in \(S\) which is at least \(50\).
The key insight is that the converse is also true—if I tell you that \(S\) contains the number \(50\), then you can conclude that the maximum of \(S\) is at least \(50\). \[\max(S) \geq 50 \IFF (\exists x \in S,~ x \geq 50)\] Using this idea, we’ll give a formal definition for a lower bound on the worst-case runtime of an algorithm.
Let func be a program, and \(WC_{func}\) its worst-case runtime function. We say that a function \(f: \N \to \R^{\geq 0}\) is a lower bound on the worst-case runtime when \(WC_{func} \in \Omega(f)\).
In an analogous fashion to the upper bound, we unpack this definition first by using absolute dominance: \[\begin{align*} & \forall n \in \N,~ WC_{func}(n) \geq f(n) \\ \Longleftrightarrow \, &\forall n \in \N,~ \max \big\{ \text{running time of executing $func(x)$} \mid x \in \cI_{func, n} \big\} \geq f(n) \\ \Longleftrightarrow \, &\forall n \in \N,~ \exists x \in \cI_{func, n},~ \text{running time of executing $func(x)$} \geq f(n) \end{align*}\]
And then using Omega:
\[ \exists c, n_0 \in \R^+,~ \forall n \in \N,~ n \geq n_0 \Rightarrow \big(\exists x \in \cI_{func, n},~ \text{running time of executing $func(x)$} \geq c \cdot f(n) \big) \]
Remarkably, the crucial difference between this definition and the one for upper bounds is a change of quantifier: now the input \(x\) is existentially quantified, meaning we get to pick it. Or really, our goal is to find an input family—a set of inputs, one per input size \(n\)—whose runtime is asymptotically larger than our target lower bound.
For example, for has_even we want to prove that the worst-case running time is \(\Omega(n)\) to match the \(\cO(n)\) upper bound, and so we want to find an input family where the number of steps taken is \(\Omega(n)\). Let’s do that now.
Find an asymptotic lower bound on the worst-case running time of has_even.
Again, we’ll just remind you of the quantifiers from the intuitive “absolute dominance” version of the lower bound definition: \(\forall n \in \N,~ \exists x \in \cI_{n}\). This will inform how we start our proof.
(Lower bound on worst-case)
Let \(n \in \N\). Let numbers be the list of length \(n\) consisting of all \(1\)’s. Now we’ll analyse the (exact) running time of has_even on this input.
In this case, the if condition in the loop is always false, so the loop never stops early. Therefore it iterates exactly \(n\) times (once per item in the list), with each iteration taking one step.
Finally, the return False statement executes, which is one step. So the total number of steps for this input is \(n + 1\), which is \(\Omega(n)\).
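To see the contrast between this worst-case input family and an input that short-circuits immediately, here is a hypothetical instrumented variant of has_even (ours, not part of the course code) that also counts how many elements were checked:

```python
def has_even_counting(numbers: list[int]) -> tuple[bool, int]:
    """Variant of has_even that also reports how many elements were checked."""
    checked = 0
    for number in numbers:
        checked += 1
        if number % 2 == 0:
            return (True, checked)
    return (False, checked)


# The all-1s input family from the lower-bound proof never short-circuits,
# so all n elements are checked:
print(has_even_counting([1] * 10))   # (False, 10)
# By contrast, a leading even element stops the loop after a single check:
print(has_even_counting([2] * 10))   # (True, 1)
```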
Finally, we can combine our upper and lower bounds on \(WC_{has\_even}\) to obtain a tight asymptotic bound.
Find a tight bound on the worst-case running time of has_even.
Since we’ve proved that \(WC_{has\_even}\) is \(\cO(n)\) and \(\Omega(n)\), it is \(\Theta(n)\).
To summarize, to obtain a tight bound on the worst-case running time of a function, we need to do two things:
Use the properties of the code to obtain an asymptotic upper bound on the worst-case running time. We would say something like \(WC_{func} \in \cO(f)\).
Find a family of inputs whose running time is \(\Omega(f)\). Almost always we find an input family whose running time is \(\Theta(f)\), but strictly speaking only \(\Omega(f)\) is required. This will prove that \(WC_{func} \in \Omega(f)\).
After showing that \(WC_{func} \in \cO(f)\) and \(WC_{func} \in \Omega(f)\), we can conclude that \(WC_{func} \in \Theta(f)\).
In this section, we focused on worst-case runtime, the result of taking the maximum runtime for every input size. It is also possible to define a best-case runtime function by taking the minimum possible runtimes, and obtain tight bounds on the best case through an analysis that is completely analogous to the one we just performed. In practice, however, the best-case runtime of an algorithm is usually not as useful to know—we care far more about knowing just how slow an algorithm is than how fast it can be.
We’ve encountered a few different Python functions and methods whose running time depends on more than just the size of their inputs. We alluded to one at the start of this chapter: the list search operation using the keyword in:
>>> lst = list(range(0, 1000000))
>>> timeit.timeit('0 in lst', number=10, globals=globals())
8.299997716676444e-06
>>> timeit.timeit('-1 in lst', number=10, globals=globals())
0.17990550000104122
In the first timeit expression, 0 appears as the first element of lst, and so is found immediately when the search occurs. In the second, -1 does not appear in lst at all, and so all one million elements of lst must be checked, resulting in a running time proportional to the length of the list. The worst-case running time of the in operation for lists is \(\Theta(n)\), where \(n\) is the length of the list.
We have also seen two more functions that are implemented using an early return: any and all. Because any searches for a single True in a collection, it stops the first time it finds one. Similarly, because all requires that all elements of a collection be True, it stops the first time it finds a False value.
>>> all_trues = [True] * 1000000
>>> all_falses = [False] * 1000000
>>> timeit.timeit('any(all_trues)', number=10, globals=globals())
8.600000001024455e-06
>>> timeit.timeit('any(all_falses)', number=10, globals=globals())
0.10643419999905745
>>> timeit.timeit('all(all_trues)', number=10, globals=globals())
0.10217570000168053
>>> timeit.timeit('all(all_falses)', number=10, globals=globals())
6.300000677583739e-06
So in the above example:

- any(all_trues) returns True immediately after checking the first list element.
- any(all_falses) returns False only after checking all one million list elements.
- all(all_trues) returns True only after checking all one million list elements.
- all(all_falses) returns False immediately after checking the first list element.

So any and all have a worst-case running time of \(\Theta(n)\), where \(n\) is the size of the input collection. But in practice they can be much faster if they encounter the “right” boolean value early on!
any, all, and comprehensions

There is one subtlety that often catches students by surprise when they attempt to call any/all on a comprehension and expect a quick result. Let’s see a simple example:
>>> timeit.timeit('any([x == 0 for x in range(0, 1000000)])', number=10, globals=globals())
0.7032962000012049
That’s a lot slower than we would expect, given that the first element checked is x = 0! The result is similar if we try to use a set comprehension instead of a list comprehension:
>>> timeit.timeit('any({x == 0 for x in range(0, 1000000)})', number=10, globals=globals())
0.6538308000017423
The subtlety here is that in both cases, the full comprehension is evaluated before any is called. As we discussed in 8.5 Analyzing Comprehensions and While Loops, the running time of evaluating a comprehension is proportional to the size of the source collection of the comprehension—in our example, that’s range(0, 1000000), which contains one million numbers.
But all is not lost! In practice, Python programmers do use any/all with comprehensions, but they do so by writing the comprehension expression in the function call without any surrounding square brackets or curly braces:
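For example, reusing the search from the timing experiments above:

```python
# The comprehension goes directly inside the call -- no square brackets
# or curly braces around it.
result = any(x == 0 for x in range(0, 1000000))
print(result)  # True: x = 0 is the very first value checked
```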
This is called a generator comprehension, and is used to produce a special Python collection data type called a generator. We won’t use generators or generator comprehensions very much at all in this course, but what we want you to know about them here is that unlike set/list comprehensions, generator comprehensions do not evaluate their elements all at once, but instead only when they are needed by the function being called. This means that our above any call achieves the fast running time we initially expected:
>>> timeit.timeit('any(x == 0 for x in range(0, 1000000))', number=10, globals=globals())
4.050000279676169e-05
Now, only the x = 0 value from the generator comprehension gets evaluated; none of the other possible values (x = 1, 2, ..., 999999) are ever checked by the any call!
It is likely unsatisfying to hear that upper and lower bounds really are distinct things that must be computed separately. Our intuition here pulls us towards the bounds being “obviously” the same, but this is really a side effect of the examples we have studied so far in this course being rather straightforward. But this won’t always be the case: the study of more complex algorithms and data structures exhibits quite a few situations where obtaining an upper bound on the running time involves a completely different analysis than the lower bound.
Let’s look at one such example that deals with manipulating strings.
We say that a string is a palindrome when it reads the same forwards and backwards; examples of palindromes are “abba”, “racecar”, and “z”. (Every string of length 1 is a palindrome.) We say that a string \(s_1\) is a prefix of another string \(s_2\) when \(s_1\) is a substring of \(s_2\) that starts at index 0 of \(s_2\). For example, the string “abc” is a prefix of “abcdef”.
The algorithm below takes a non-empty string as input, and returns the length of the longest prefix of that string that is a palindrome. For example, the string “attack” has two non-empty prefixes that are palindromes, “a” and “atta”, and so our algorithm will return 4.
def palindrome_prefix(s: str) -> int:
    n = len(s)
    for prefix_length in range(n, 0, -1):  # goes from n down to 1
        # Check whether s[0:prefix_length] is a palindrome
        is_palindrome = all(s[i] == s[prefix_length - 1 - i]
                            for i in range(0, prefix_length))

        # If a palindrome prefix is found, return the current length.
        if is_palindrome:
            return prefix_length

There are a few interesting details to note about this algorithm:
- The loop header uses range(n, 0, -1): the third argument, -1, causes the loop variable to start at n and decrease by 1 at each iteration. In other words, this loop checks the possible prefixes starting with the longest prefix (length n) and working its way down to the shortest prefix (length 1).
- The call to all checks pairs of characters starting at either end of the current prefix. It uses a generator comprehension (like we discussed above) so that it can stop early as soon as it encounters a mismatch (i.e., when s[i] != s[prefix_length - 1 - i]).
- Because of how we’ve set up the for loop, this algorithm is guaranteed to find a palindrome prefix, since the first letter of s by itself is a palindrome.

The code presented here is structurally simple. Indeed, it is not too hard to show that the worst-case runtime of this function is \(\cO(n^2)\), where \(n\) is the length of the input string. What is harder, however, is showing that the worst-case runtime is \(\Omega(n^2)\). To do so, we must find an input family whose runtime is \(\Omega(n^2)\). There are two points in the code that can lead to fewer than the maximum number of loop iterations occurring, and we want to find an input family that avoids both of these.
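Before analyzing the runtime, we can sanity-check the function’s behaviour on the examples above (repeating its definition here so the snippet runs on its own):

```python
def palindrome_prefix(s: str) -> int:
    """Return the length of the longest palindrome prefix of s."""
    n = len(s)
    for prefix_length in range(n, 0, -1):  # from n down to 1
        # Check whether s[0:prefix_length] is a palindrome.
        is_palindrome = all(s[i] == s[prefix_length - 1 - i]
                            for i in range(0, prefix_length))
        if is_palindrome:
            return prefix_length


print(palindrome_prefix('attack'))   # 4 (the prefix 'atta')
print(palindrome_prefix('racecar'))  # 7 (the whole string is a palindrome)
print(palindrome_prefix('z'))        # 1
```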
The difficulty is that these two points are caused by different types of inputs! The call to all can stop as soon as the algorithm detects that a prefix is not a palindrome, while the return statement occurs when the algorithm has determined that a prefix is a palindrome! To make this tension more explicit, let’s consider two extreme input families that seem plausible at first glance, but which do not have a runtime that is \(\Omega(n^2)\).
- First, consider a string consisting of a single repeated letter, like the all-\(a\)’s string. This string is itself a palindrome, so the all call checks all pairs of characters; but unfortunately this means that is_palindrome = True, and the loop returns during its very first iteration. Since the all call takes \(n\) steps, this input family takes \(\Theta(n)\) time to run.
- Next, consider a string whose first letter is different from every other letter. Now no prefix of length greater than 1 is a palindrome, so the loop runs the maximum number of iterations (down to prefix_length == 1). However, the all call will always stop after just one step, since it starts by comparing the first letter of \(s\) with another letter, which is guaranteed to be different by our choice of input family. This again leads to a \(\Theta(n)\) running time.

The key idea is that we want to choose an input family that doesn’t contain a long palindrome prefix (so the loop runs for many iterations), but whose prefixes are close to being palindromes (so the all call checks many pairs of letters). Let \(n \in \Z^+\). We define the input \(s_n\) as follows:
For example, \(s_4 = aaba\) and \(s_{11} = aaaaaabaaa\). Note that \(s_n\) is very close to being a palindrome: if that single character \(b\) were changed to an \(a\), then \(s_n\) would be the all-\(a\)’s string, which is certainly a palindrome. But by making the centre character a \(b\), we not only ensure that the longest palindrome prefix of \(s_n\) has length roughly \(n/2\) (so the loop iterates roughly \(n/2\) times), but also that the “outer” characters of each prefix of \(s_n\) containing more than \(n/2\) characters are all the same (so the all call checks many pairs to find the mismatch between \(a\) and \(b\)). It turns out that this input family does indeed have an \(\Omega(n^2)\) runtime! We’ll leave the details as an exercise.
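For experimenting with this input family, we can write a small helper that builds \(s_n\). The exact index of the \(b\) is an assumption on our part (we place it at index \(\lceil n/2 \rceil\), which is consistent with the example \(s_4 = aaba\)); any placement near the centre of the string exhibits the same behaviour.

```python
import math


def make_input(n: int) -> str:
    """Return a length-n string of a's with a single b near the centre.

    The index of the b (here, ceil(n / 2)) is an illustrative assumption
    consistent with the example s_4 = 'aaba' from the text.
    """
    k = math.ceil(n / 2)
    return 'a' * k + 'b' + 'a' * (n - k - 1)


print(make_input(4))  # aaba
```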
We hope that the number of sections of these notes dedicated to testing demonstrates its importance in the process of software development. What is perhaps surprising is that testing is not limited to correctness. In fact, strict efficiency constraints are the norm in several domains. For example, a PlayStation controller must send wireless signals at the touch of a button or the move of a joystick—if the function for doing so were correct, but took 10 seconds, players would not be happy. Similarly, a search on Google that sifts through terabytes of data must also be fast. Check it out: each search you do on Google reports how many results were found and in what fraction of a second. In this section, we will discuss how to write tests for the efficiency of functions.
Earlier we saw how to use the timeit module to measure the time taken to execute a piece of Python code. Let’s see how we might set up a performance constraint using timeit and our implementation of is_prime:
from math import floor, sqrt
from timeit import timeit


def is_prime(p: int) -> bool:
    """Return whether p is prime."""
    possible_divisors = range(2, floor(sqrt(p)) + 1)
    return (
        p > 1 and
        all(not p % d == 0 for d in possible_divisors)
    )


def test_is_prime_performance() -> None:
    """Test the efficiency of is_prime."""
    numbers_to_test = range(2, 1000)

    for number in numbers_to_test:
        time = timeit(f'is_prime({number})', number=100, globals=globals())
        assert time < 0.001, 'Failed performance constraint of 0.001s.'

There are several issues here that we need to keep in mind. The performance constraint of 0.001 seconds is for the total runtime of 100 calls to is_prime for only one number in numbers_to_test (there will be as many assertions as there are elements in numbers_to_test). Where did the argument number=100 come from? Should it be more or less? An important thing to remember is that a computer system is not at all like a science experiment you would set up in a chemistry or biology lab. There are too many external factors (e.g., background processes being run) that can impact the results. To mitigate this issue, several samples of an experiment (i.e., measurements of time) need to be taken. The field of statistics can help inform us on whether or not 100 samples is sufficient for this test.
Next, where did 0.001 seconds come from? The number is most certainly arbitrary in this example. Computer systems are very different from one another, in terms of both hardware and software. While the assertions may hold for all numbers_to_test on one computer, they may not hold on another. The 0.001 seconds may be tuned over time in the testing suite. Or it can help identify the minimum hardware requirements for running a piece of software.
While it is easy to write the Python code that checks for performance, coming up with the actual parameters (number of function calls, inputs to the function, total acceptable runtime) is quite challenging, and often domain-dependent. For example, in user interfaces, a great deal of research has gone into how fast actions should be; a so-called “instantaneous” action in a user interface should complete in 0.1 seconds. Other domains, such as embedded systems, have a series of functions that must meet hard deadlines in order for the computer system to function properly (e.g., in a spaceship).
But what about domains where there are no guidelines or standards? Runtime constraints that are tuned over time can still be useful in discovering changes in program efficiency due to bug fixes or new features. When a code change causes an efficiency test to fail, the programmers can decide whether to revisit the efficiency constraint or explore alternative code changes. Without efficiency tests in place, the change in performance might not have been found until it impacted a real user of the software!
Abstraction is fundamental to our everyday lives, not just in computing. Loosely, abstraction is about understanding how to use something without knowing how it works. Consider your refrigerator—how does it work? Does it matter? We know that we can open a fridge door, place something (probably food) inside, and the fridge will keep it cold. So our notion of a fridge is really quite abstract; there are many thousands of refrigerator types, each one designed and built by different companies and people around the world. But this is irrelevant: when you go to a friend’s house, you can use their fridge just as you would your own, without any extra help.
There are several examples of abstraction in the real world. It doesn’t matter how a watch was built, so long as we can use it to tell time. It doesn’t matter how a cup was made or what materials it was made out of, so long as we can use it to hold liquid. Divorcing the nitty-gritty details of how something works from how to use it is abstraction. And it is abstraction that has allowed ingenuity and creativity to advance technology (i.e., how something works) without having to re-educate the entire world on how to use a cup. Of course, humans have also creatively improved how we use things, like attaching a handle to a cup meant to contain hot coffee.
We can think of abstraction as allowing for the separation of two groups of people with different goals: the creators of an entity, and the users (or clients) of that entity. Sometimes there’s overlap between these two groups, but much of the time—especially as technology and systems have grown more complex—these two groups are fairly separate. Creators are responsible for designing, building, and implementing an entity, and users are responsible for, well, using it.
The interface of an entity is the boundary between creator and user: it is the set of rules (implicit or explicit) governing how users can interact with that entity. We call an interface the public side of an entity; it is the part of the creator’s work that everyone can interact with. Creators are responsible for the design of the interface, while users are responsible for learning the interface in order to interact with the entity. For example, the interface of a cup is how you use it: where to put liquid and where to hold it when taking a drink.
Abstraction and interfaces are incredibly useful concepts in computer science because of the complexity of programming languages, algorithms, and computer hardware that come with modern technology. We’ve been using abstraction all the way through this course, playing the role of creator in some cases, and users in others:
We are users of the Python programming language itself, which provides an interface that hides the details of our computer hardware, processor instructions, memory, storage, and graphics. This isn’t unique to Python, of course: every programming language is an interface between the programmer and their computer. While we have learned some details about how the Python interpreter works (like our discussion of its array-based list implementation in the previous chapter), we’ve barely scratched the surface of this large and complex software.
We are users of built-in Python functions, data types, and modules. We don’t know how the creators of the Python programming language have implemented these built-ins, but have learned how to use them to write useful programs.
We are creators of new Python functions, data types and modules. Each time you have followed the function or data class design recipe, you have created an interface.
For a function, its interface is its header and docstring: these specify how to call the function, the arguments it expects, and what the function does. The function body is the implementation of the function, and is not part of its interface—someone who wants to use our function should not have to look at the function body to determine what it does.
For a data class, its interface consists of the name of the data class, the names and types of its attributes, and the class docstring. In other words, every part of what we write to define a new data class is part of its interface! How data classes are actually implemented has been hidden from us in the @dataclass decorator, though we’ll begin learning about how this implementation works in the next section.
Finally, the interface of a Python module is simply the collection of interfaces of the functions and data types defined in that module, plus any additional documentation in the module docstring. For every Python file you’ve written so far, you’ve created a module that could be used by other programmers.
When studying mathematical statements, we have acted as both users and creators. Every time we write a proof, we act as a creator of knowledge, providing airtight evidence that a statement is True. You can view a proof as an “implementation” of a statement. Rather than acting as a set of instructions to execute to complete a task, a proof is a sequence of claims that together establish the truth of a statement. Every time we use an “external statement” in a proof, like the Quotient-Remainder Theorem or Fermat’s Little Theorem, we are acting as users of these statements, and do not worry about how they have been proved.
As we work with more and more programming interfaces—different functions, data types, modules, and even programming languages—we see just how challenging designing interfaces can be. Every interface is a contract between creator and user: while creators have control over how they design an interface, they have the responsibility to make that interface easy and intuitive for users. Good interfaces are simple and strive to minimize the cognitive load on users; bad interfaces are cumbersome, ambiguous, and require the user to keep track of many unrelated details. Because interfaces are public, as creators we put a lot of effort into designing good interfaces, a topic we’ll discuss this year but that you’ll explore far more in future courses.
Moreover, because interfaces are contracts, they are hard to change once released—made public to users—as any change will have ramifications on every user. We have used several Python modules so far, such as timeit, pytest, and doctest. What would happen if the creators of one of these modules decided to make a change to that interface, like changing the timeit function name to time_it? This one-character change would cause all code that uses the timeit.timeit function to no longer work! As clients of the timeit module, we would not be very happy.
There are two sides to every contract. Just as creators are obligated to maintain the interface they provide, users are limited to that interface as well. When we act as the creators of a function or module, we are free to modify their implementations in any way we wish, as long as we do not change the public interface. We can fix a bug, simplify the code, or use a more efficient algorithm, all to improve our implementation without affecting our users. In software engineering, it is important to clearly define what the public interface of a piece of code actually is, so that its creators know precisely what they must preserve and what they are free to change.
Over the next two chapters, we’ll explore the concepts of abstraction, public interfaces, and private implementations in more detail. We’ll study how we can build our own Python data types from scratch (without relying on @dataclass) to gain full control over defining a data type’s public interface. We’ll create implementations of abstract data types and models of real-world domains, using the ideas we’ve introduced here to define clear public interfaces for every part of what we do.
All the way back in Chapter 4, we learned how to create our own simple data types in Python using the @dataclass decorator. While data classes are very useful, they are just one particular form of classes in Python. The @dataclass decorator takes our data class declaration—its public interface—and automatically creates an implementation of the class. This makes it very simple to set up data classes, at the cost of flexibility of implementation.
In this section, we’ll learn about how to create a Python data type from scratch, without the automatic implementation that @dataclass provides. In future sections, we’ll apply what we’ve learned to defining new Python data types to solve various computational problems.
To start with, recall the Person data class example we used when we first introduced data classes:
@dataclass
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

We were able to use this data class to create and display an instance of the class and access its attributes:
>>> david = Person('David', 'Liu', 100, '40 St. George Street')
>>> david
Person(given_name='David', family_name='Liu', age=100, address='40 St. George Street')
>>> david.given_name
'David'
>>> david.family_name
'Liu'
>>> david.age
100
>>> david.address
'40 St. George Street'

Now let’s see what happens if we remove the @dataclass decorator from our class definition. This is indeed valid Python syntax, but with perhaps an unexpected consequence.
# @dataclass  (We've commented out this line)
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

>>> david = Person('David', 'Liu', 100, '40 St. George Street')
TypeError: Person() takes no arguments

Okay, something went wrong. Even though our class declaration still contains attribute names and type annotations, we cannot call Person and pass in values for those attributes. According to the error message, Person() takes no arguments. So what happens when we try to create an instance of Person and pass in zero arguments?
We successfully created an instance of the Person class. But what happens when we try to access the instance attributes?
This should make sense: by just calling Person() with no arguments, we haven’t specified values for any of the instance attributes, so we shouldn’t expect to see a value when we access david.given_name.
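The interactive session itself is not reproduced above, so here is a self-contained sketch of both steps: creating an instance with no arguments succeeds, but accessing an attribute fails.

```python
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str


# Succeeds: with no initializer, Person() expects no arguments.
david = Person()

# Fails: the instance attribute given_name was never created.
try:
    david.given_name
except AttributeError as error:
    print(error)  # 'Person' object has no attribute 'given_name'
```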
When we execute the statement david = Person(), all we have in memory is this:
[memory model diagram]
A Person object has been created, but it has no attributes. To fix this (without using @dataclass), we need to define a new method for Person called the initializer. The initializer method of a class is called when an instance of the class is created in Python. As its name suggests, the purpose of this method is to initialize all of the instance attributes for the new object. To distinguish it from regular functions, Python always uses the name __init__ for the initializer method.
When we use the @dataclass decorator, the Python interpreter automatically creates an initializer method for the class. So let’s start by seeing what this “automatic” code for the initializer looks like.
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

    def __init__(self, given_name: str, family_name: str, age: int, address: str) -> None:
        """Initialize a new Person object."""
        self.given_name = given_name
        self.family_name = family_name
        self.age = age
        self.address = address

Since all methods are functions, it should not surprise you to learn that we define methods using the same keyword (def) as other functions. However, there are two key differences between this method definition and all top-level function definitions we’ve studied so far. The first is that this method definition is indented so that it is inside the body of the class Person definition. This is how we signal that the function being defined is a method of the Person class.
The second difference is the presence of the parameter self. Every initializer has a first parameter that refers to the instance that has just been created and is to be initialized. By convention, we always call it self. This is such a strong Python convention that most code checkers will complain if you don’t follow it. This name is the reason we refer to attributes as self.<attr> in class representation invariants. In fact, this convention is so strong that we also typically omit the type annotation for self. We could have written self: Person, but because the type of self should always be the class that the initializer belongs to, this is considered redundant in Python!
To understand how self works, let’s examine the call we used earlier to create a Person: david = Person('David', 'Liu', 100, '40 St. George Street').
Notice that we never mention the initializer __init__ by name; it is called automatically, and the values in parentheses are passed to it. Also notice that we pass four values to the initializer, even though it has five parameters. We never have to pass a value for self; Python automatically sets it to the instance that is to be initialized. So this is what is happening in memory at the beginning of the initializer:
[memory model diagram]
The initializer’s job is to create and initialize the instance attributes. To do this, we use one assignment statement per instance attribute. This uses the same dot notation syntax that we saw in Chapter 5 for assigning to instance attributes: self.given_name = given_name, for example. Note that given_name and self.given_name are two different expressions! given_name is a parameter of the initializer, while self.given_name is an instance attribute. (Some other programming languages, like Java, allow you to refer to instance attributes without using dot notation. In Python, however, dot notation is mandatory for accessing and assigning to instance attributes.) We can illustrate this distinction by showing the state of memory after all attributes have been initialized, immediately before the initializer returns:
[memory model diagram]
You may have noticed that the initializer return type is None, and that the body of the function does not actually return anything. This is a bit strange, since when we evaluate david = Person('David', 'Liu', 100, '40 St. George Street'), a Person object is definitely returned from the function call and assigned to the variable david.
[memory model diagram]
What’s going on? It turns out that calling Person doesn’t just cause __init__ to be called. It actually does three things (and this is true not just for our Person class, but for every class in Python):

1. Creates a new Person object behind the scenes.
2. Calls __init__ with the new object passed to the parameter self, along with the other arguments.
3. Returns the object that was initialized by __init__ in Step 2.

So in fact, __init__ is a helper function in the object creation process. Its task is only to initialize attributes for an object; Python handles both creating the object beforehand, and returning the new object after __init__ has been called.
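We can observe this three-step process with a tiny illustrative class of our own (Greeter is hypothetical, not from the course notes): calling the class returns the new object, while __init__ itself returns None.

```python
class Greeter:
    """A tiny illustrative class used to observe object creation."""

    def __init__(self, name: str) -> None:
        """Initialize a new Greeter object."""
        self.name = name


g = Greeter('Ada')                   # Python creates the object, calls
                                     # __init__ on it, then returns it.
print(g.name)                        # Ada

print(Greeter.__init__(g, 'Grace'))  # None: __init__ returns nothing...
print(g.name)                        # Grace: ...but it did re-initialize g
```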
It is certainly possible to accomplish everything that we would ever want to do with our Person class by writing top-level functions, and this is the approach we’ve taken with data classes up to this point. An alternate and commonly-used approach is to define methods for a data type, which become part of the interface of that data type. Remember that methods are just functions that belong to a data type—but this “belonging to” is not just an abstract relationship: it has concrete consequences for how the Python interpreter handles them. When we define a data class and top-level functions, the interface of the data class itself only consists of its attributes; we have to remember to import those functions separately in order to use them. When we define a class with methods, those methods are always bundled with the class, and so any instance of the class can use those methods, without needing to import them separately.
We have seen one example of a method definition already: the initializer, __init__. More generally, any function that operates on an instance of a class can be converted into a method by doing the following:

1. Moving (and indenting) the function definition into the body of the class.
2. Renaming the function’s first parameter to self.

For example, suppose we had the following function to increase a person’s age:
def increase_age(person: Person, years: int) -> None:
    """Add the given number of years to the given person's age.

    >>> david = Person('David', 'Liu', 100, '40 St. George Street')
    >>> increase_age(david, 10)
    >>> david.age
    110
    """
    person.age = person.age + years

We can turn increase_age into a Person method as follows:
class Person:
    """A custom data type that represents data for a person."""
    given_name: str
    family_name: str
    age: int
    address: str

    def __init__(self, given_name: str, family_name: str, age: int, address: str) -> None:
        """Initialize a new Person object."""
        self.given_name = given_name
        self.family_name = family_name
        self.age = age
        self.address = address

    def increase_age(self, years: int) -> None:
        """Add the given number of years to this person's age.

        >>> david = Person('David', 'Liu', 100, '40 St. George Street')
        >>> Person.increase_age(david, 10)
        >>> david.age
        110
        """
        self.age = self.age + years

Notice that we now use the parameter self (without a type annotation) to access instance attributes, just as we did in the initializer. In our function docstring, the phrase “the given person” changes to “this person”; we typically use the word “this” in a method docstring to refer to the object instance that self refers to. (In fact, some other programming languages use this instead of self as a variable or keyword to refer to this object in code.) And our doctest example changes the call to increase_age to Person.increase_age.
Now that we are starting to define our own custom classes and methods, we are ready to see a shorthand for calling methods in Python. Let’s take a look at the method call from our doctest above: Person.increase_age(david, 10).
This uses dot notation to access the increase_age method of the Person class, calling it with the two arguments david and 10, which get assigned to parameters self and years, respectively.
The alternate form for calling the increase_age method is to use dot notation with the Person instance directly: david.increase_age(10).
When we call david.increase_age(10), the Python interpreter does the following:
1. Evaluates the type of david, which is Person.
2. Looks up the increase_age method of the Person class.
3. Calls Person.increase_age on david and 10. In other words, the interpreter automatically passes the value to the left of the dot (in this case, the object david refers to) as the method’s first parameter self.

This works not just for our custom class Person, but for all built-in data types as well. For example, list.append(lst, 10) can be written as lst.append(10), and str.lower(s) as simply s.lower(). More generally, a method call of the form obj.method(x1, x2, ..., xn) is equivalent to type(obj).method(obj, x1, x2, ..., xn).
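The equivalence of the two call styles can be checked directly. The sketch below uses a simplified Person (only two attributes, for brevity) rather than the full class from this section:

```python
class Person:
    """A simplified version of this section's Person class."""

    def __init__(self, given_name: str, age: int) -> None:
        """Initialize a new Person object."""
        self.given_name = given_name
        self.age = age

    def increase_age(self, years: int) -> None:
        """Add the given number of years to this person's age."""
        self.age = self.age + years


david = Person('David', 100)
Person.increase_age(david, 10)  # "class dot notation"
david.increase_age(10)          # "object dot notation": same effect
print(david.age)                # 120

# The same equivalence holds for built-in types:
s = 'Hello'
print(str.lower(s) == s.lower())  # True
```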
Though we’ve been using the more explicit “class dot notation” style (Person.increase_age) so far in this course, we’ll switch over to the “object dot notation” style (david.increase_age) starting in this chapter, as this is the much more common style in Python programming. There are two primary reasons why the latter style is standard:
It matches other languages with an object-oriented style of programming, where the object being operated on is of central importance. Because we read from left to right, every time we use dot notation with the instance object on the left, we are reminded that it is an object we are working with, whether we are accessing a piece of data bundled with that object or performing an operation on that object.
We read david.age as “access david’s age” and david.increase_age(10) as “increase david’s age by 10”. In both cases, david is the most important object in the code expression.
Only the “object dot notation” style of method call supports inheritance, which is a technical feature of classes that we’ll discuss in the next chapter.
So far in this course, we’ve used the term data type to actually mean two different things. Most of the time, we use it to mean a data type in the Python programming language, like int or list or a data class we’ve defined. When we use the term “data type” in this way, it is synonymous with the term Python class, which is the name the Python language gives to all of its data types. We’ll now refer to these Python classes as concrete data types, since they have a concrete implementation in Python code. This is true for built-in data types, data classes that we define, and the more general classes we learned about in Section 9.2.
However, there’s another way we’ve used the term “data type” that goes all the way back to 1.1 The Different Types of Data: as abstract representations of data that transcend any one specific programming language. For example, the Python list class is implemented differently from the Java ArrayList or JavaScript Array, but all three share some common expectations of what list operations they support. We can describe these common, language-independent list operations by defining an abstract data type (ADT), which defines an entity that stores some kind of data and the operations that can be performed on it. Using the terminology from Section 9.1, an abstract data type is a pure interface: it is concerned only with the what—what data is stored, what we can do with this data—and not the how—how a computer actually stores this data or implements these operations.
Let’s take a moment here to review some of the collection-based abstract data types we’ve seen already in this course. One caveat with this list: while computer scientists generally agree on what the “main” abstract data types are, they often disagree on what operations each one actually supports. You’ll notice here that we’ve taken a fairly conservative approach for specifying operations, limiting ourselves to the most basic ones.
Set
List
Mapping
Iterable
There are a few more foundational abstract data types in computer science that we’ll cover in this chapter, and in future courses. We have discussed many of these throughout the semester so far, and have used many in Python. But the true power of ADTs is that they are abstract enough to transcend any individual program or even programming languages. ADTs like lists, sets, and maps form a common vocabulary that is necessary to being a professional computer scientist.
Abstract data types form a high-level interface between a computer scientist and how the computer stores program data. A concrete data type is an implementation of an abstract data type: unlike abstract data types, concrete data types are actually concerned with how the data is stored and how their operations are implemented. The creators of the Python programming language took various abstract data types and created a set of built-in concrete data types (classes), making careful decisions about how each class would store its data and implement its methods. Indeed, as Python programmers we benefit from all the work they’ve put in to create classes that not only support common ADTs, but also make their implementations extremely fast using clever programming techniques. You’ll learn about some of these techniques in CSC263/265!
So a dict, for instance, is not itself an abstract data type. But the dict data type is an obvious implementation of the Mapping ADT. However, there is not a one-to-one correspondence between abstract data types and concrete data types, in Python or any other programming language. A single abstract data type can be implemented by many different concrete data types. For example, although the Python dict is a natural implementation of the Mapping ADT, we could implement the Mapping ADT instead with a list, where each element is a tuple storing a key-value pair:
# A Mapping using a Python dict
{0: 'hello', 1: 42, 2: 'goodbye'}

# A Mapping using a Python list
[(0, 'hello'), (1, 42), (2, 'goodbye')]

Conversely, every concrete data type can be used to implement multiple ADTs. The Python list can be used to implement not just the List ADT, but each of the other above ADTs as well. For instance, think about how you would implement the Set ADT with a list, and in particular, how you would avoid duplicates. Though just because something is possible doesn’t mean it is a good idea in practice: beginning Python programmers often use a list when all they need is the Set ADT’s operations. As we discussed in Section 8.6, this leads to slower programs, and so should be avoided. A dict could also implement any of the ADTs above, and the same is true of the new data structures you will learn in this course.
You might be wondering what the point of making this distinction is—so what if lists can implement the Mapping ADT? We’d never use this in “real” Python code when we have a dict instead. And that’s true! But what this distinction reminds us is that we always have choices when implementing an interface. Rather than saying “it’s not possible to implement a Map using list”, we instead say “it is possible to implement a Map using list, but this choice is worse than using dict”.
Any idea why a dict is better than a list at implementing the Mapping ADT? If we ignore the fact that we’ve been using dict for this purpose all along, the answer is not obvious! It comes down to efficiency: though dict and list can both be used to implement the Mapping ADT, the implementation of dict makes the Mapping operations much faster than a straightforward list-based implementation. As we’ll see a few times this chapter, running-time analysis is one of the key ways to evaluate and compare different implementations of an ADT.
Over the next few sections of this chapter, we’ll learn about three new abstract data types: Stack, Queue, and Priority Queue. All three of these ADTs store a collection of items, and support operations to add an item and remove an item. However, unlike a Set or List, in which users may specify which item to remove (by value or by index, respectively), these three ADTs remove and return their items in a fixed order—client code is allowed no choice. This might seem restrictive and simplistic, but you’ll soon learn how the power of these ADTs lies in their simplicity. Once you learn about them, you’ll start seeing them everywhere, and be able to effectively communicate about these ADTs to any other computer scientist.
The Stack ADT is very simple. A stack contains zero or more items. When you add an item, it goes “on the top” of the stack (we call this “pushing” onto the stack) and when you remove an item, it is removed from the top also (we call this “popping” from the stack). The name “stack” is a deliberate metaphor for a stack of books on a table. The net effect is that the first item added to the stack is the last item removed. We call this Last-In-First-Out (LIFO) behaviour. To summarize:
Stack
In code:
class Stack:
"""A last-in-first-out (LIFO) stack of items.
Stores data in last-in, first-out order. When removing an item from the
stack, the most recently-added item is the one that is removed.
Sample usage:
>>> s = Stack()
>>> s.is_empty()
True
>>> s.push('hello')
>>> s.is_empty()
False
>>> s.push('goodbye')
>>> s.pop()
'goodbye'
"""
def __init__(self) -> None:
"""Initialize a new empty stack."""
def is_empty(self) -> bool:
"""Return whether this stack contains no items.
"""
def push(self, item: Any) -> None:
"""Add a new element to the top of this stack.
"""
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
Preconditions:
- not self.is_empty()
"""

At this point, you may be wondering how we fill in the method bodies, perhaps picturing a list instance attribute to store the items in the stack. But remember: thinking about implementation is irrelevant when you are using an ADT. At this point, you should picture a pile of objects stacked on top of each other—this is enough to understand each of the doctest examples in the above code. Abstraction allows us to separate our understanding of what the Stack ADT is from how it is implemented.
Because they have so few methods, it may seem like stacks are not that powerful. But in fact, stacks are useful for many things. For instance, they can be used to check for balanced parentheses in a mathematical expression. And consider the execution of a Python program. We have talked about frames that store the names available at a given moment in its execution. What happens when f calls g, which calls h? When h is over, we go back to g and when g is over we go back to f. To make this happen, our frames go on a stack! Hence the names call stack and stack frame from our memory model.
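As a small illustration of the balanced-parentheses idea (a sketch of our own, using a plain Python list as the stack), here is a function that checks whether the brackets in an expression are balanced:

```python
def is_balanced(expression: str) -> bool:
    """Return whether the brackets in expression are balanced.

    Uses a list as a stack: push each opening bracket, and pop
    when the matching closing bracket appears.
    """
    matching = {')': '(', ']': '[', '}': '{'}
    stack = []
    for char in expression:
        if char in '([{':
            stack.append(char)          # push the opening bracket
        elif char in matching:
            # A closing bracket must match the most recently
            # opened (and not yet closed) bracket.
            if not stack or stack.pop() != matching[char]:
                return False
    return stack == []                  # every opener was closed
```

The LIFO behaviour is exactly what the problem needs: the most recently opened bracket must be the first one closed.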
As a more “real world” example, consider the undo feature in many different applications. When we perform an action by mistake and want to undo it, we want to undo the most recent action, and so the Stack ADT is the perfect abstract data type for keeping track of the history of our actions so that we can undo them. A similar application lies in how web browsers store page visits so that we can go back to the most recently-visited page.
Next, we’ll implement the Stack ADT using a built-in Python data structure: the list. We’ve chosen to use the end of the list to represent the top of the stack.
class Stack1:
"""A last-in-first-out (LIFO) stack of items.
Stores data in a last-in, first-out order. When removing an item from the
stack, the most recently-added item is the one that is removed.
Instance Attributes:
- items: The items stored in the stack. The end of the list represents
the top of the stack.
>>> s = Stack1()
>>> s.is_empty()
True
>>> s.push('hello')
>>> s.is_empty()
False
>>> s.push('goodbye')
>>> s.pop()
'goodbye'
"""
items: list
def __init__(self) -> None:
"""Initialize a new empty stack.
"""
self.items = []
def is_empty(self) -> bool:
"""Return whether this stack contains no items.
"""
return self.items == []
def push(self, item: Any) -> None:
"""Add a new element to the top of this stack.
"""
self.items.append(item)
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
Preconditions:
- not self.is_empty()
"""
return self.items.pop()

Our current Stack1 class is correct, but differs in one subtle way from the Stack ADT it is supposed to implement. While a user can create a new Stack1 object and call its methods push and pop to interact with it, they can also do one more thing: access the items instance attribute. This means that any user of a Stack1 object can access any item in the stack at any time, or even mutate items to modify the contents of the stack in unexpected ways.
You might wonder why this is an issue—if a user wants to change the items attribute, let them! And indeed this is a common and valid approach in programming, one particularly favoured by many Python developers. However, it is not the only approach. Another school of thought is that a data type’s interface should communicate not just how to use it, but also how not to use it. For our current Stack1 implementation, the instance attribute items is part of the class’ interface, and so all users can reasonably expect to use it.
To make an instance attribute that isn’t part of a class’ interface, we prefix its name with an underscore _. We refer to attributes whose names begin with an underscore as private instance attributes, and those without the underscore (all the attributes we’ve seen so far) as public instance attributes. These names suggest how they’re interpreted when it comes to a class interface: all public instance attributes are part of the interface, and all private ones aren’t.
Here’s how we could modify our Stack1 implementation to make items a private attribute instead.
class Stack1:
"""A last-in-first-out (LIFO) stack of items.
Stores data in a last-in, first-out order. When removing an item from the
stack, the most recently-added item is the one that is removed.
>>> s = Stack1()
>>> s.is_empty()
True
>>> s.push('hello')
>>> s.is_empty()
False
>>> s.push('goodbye')
>>> s.pop()
'goodbye'
"""
# Private Instance Attributes:
# - _items: The items stored in the stack. The end of the list represents
# the top of the stack.
_items: list
def __init__(self) -> None:
"""Initialize a new empty stack.
"""
self._items = []
def is_empty(self) -> bool:
"""Return whether this stack contains no items.
"""
return self._items == []
def push(self, item: Any) -> None:
"""Add a new element to the top of this stack.
"""
self._items.append(item)
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
Preconditions:
- not self.is_empty()
"""
return self._items.pop()

Other than renaming the attribute from items to _items, the only change is in how we document this attribute. We’ve kept the same format, but have moved the description from the class docstring to comments in the class body. By doing so, there is now no mention of this attribute when we call help on our class:
>>> help(Stack1)
class Stack1(builtins.object)
| Stack1() -> None
|
| A last-in-first-out (LIFO) stack of items.
|
| Stores data in a last-in, first-out order. When removing an item from the
| stack, the most recently-added item is the one that is removed.
|
| >>> s = Stack1()
| >>> s.is_empty()
| True
| >>> s.push('hello')
| >>> s.is_empty()
| False
| >>> s.push('goodbye')
| >>> s.pop()
| 'goodbye'
|
| [The rest is omitted]
|

One of the distinctive features of Python that separates it from many other programming languages is that private instance attributes can still be accessed from outside the class.
This is a design choice made by the creators of the Python programming language to prefer flexibility over restriction when it comes to accessing attributes. But does this mean private attributes are meaningless? No! By making an instance attribute private, we are communicating that client code should not access this attribute: it is not an expected way of interacting with this class. As a result, we reduce the cognitive load on the client (one less attribute to think about when using the class), and also give flexibility to the designer of the class to change or even remove a private attribute if they want to update their implementation of the class, without affecting the class’ public interface.
We implemented Stack1 using the back of the _items list to represent the top of the stack. You might wonder why we didn’t use the front of _items instead. Indeed, the implementation wouldn’t have to change much:
class Stack2:
# Duplicated code from Stack1 omitted. Only push and pop are different.
def push(self, item: Any) -> None:
"""Add a new element to the top of this stack.
"""
self._items.insert(0, item)
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
Preconditions:
- not self.is_empty()
"""
return self._items.pop(0)

The key difference between Stack1 and Stack2 is not their code complexity but their efficiency. In Chapter 8, we learned that Python uses an array-based implementation for lists. Because of this, the list.append operation for an array-based list is \(\Theta(1)\), and therefore Stack1.push is also \(\Theta(1)\). In contrast, list.insert has complexity \(\Theta(n - i)\), where \(i\) is the index argument passed to list.insert. In Stack2.push, \(i = 0\), and so the method has complexity \(\Theta(n)\). So the push operation for stacks is more efficient when we treat the end of an array-based list as the top of the stack.
Similarly, removing the last element of an array-based list using list.pop is also \(\Theta(1)\), and so the running time of Stack1.pop is \(\Theta(1)\). However, Stack2.pop passes an index of 0 to list.pop, which causes the method to have a \(\Theta(n)\) running time.
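We can check this difference empirically. Here is a small sketch of our own (not part of the course code) that uses Python’s timeit module to compare the two push strategies:

```python
import timeit


def push_with_append(n: int) -> None:
    """Simulate Stack1.push n times: append to the end of a list."""
    items = []
    for i in range(n):
        items.append(i)


def push_with_insert(n: int) -> None:
    """Simulate Stack2.push n times: insert at the front of a list."""
    items = []
    for i in range(n):
        items.insert(0, i)


# Time five rounds of each strategy. Each insert(0) shifts every
# existing element over by one, which is why it is so much slower.
n = 5_000
append_seconds = timeit.timeit(lambda: push_with_append(n), number=5)
insert_seconds = timeit.timeit(lambda: push_with_insert(n), number=5)
print(f'append: {append_seconds:.4f}s  insert(0): {insert_seconds:.4f}s')
```

On a typical machine the insert(0) version is markedly slower for large n, which matches the \(\Theta(1)\) versus \(\Theta(n)\) analysis above.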
Here, the decision between implementations is clear-cut: Stack1 will always be more efficient than Stack2. Having such a clear winner is actually quite rare. There are almost always trade-offs associated with choosing one implementation over another. We will see one such trade-off when we introduce our next ADT: queues.
The stack implementations we studied in the previous section included a precondition on their pop method specifying that the stack must not be empty. Preconditions are used to rule out erroneous situations like attempting to remove an item from an empty stack, but they come with one drawback: every precondition we add increases the complexity of the function’s interface. A precondition becomes the responsibility of the user of the function to check, for example, with code like
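this (a sketch of our own; to keep it self-contained, we condense the list-based Stack from this section into a few lines):

```python
class Stack:
    """A condensed version of the list-based stack from this section."""

    def __init__(self) -> None:
        self._items = []

    def is_empty(self) -> bool:
        return self._items == []

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        return self._items.pop()


s = Stack()
s.push('hello')

# The client's responsibility: check the precondition before calling pop.
if not s.is_empty():
    item = s.pop()
```

The guard itself is only two lines, but the client must remember to write it before every call to pop.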
Sometimes these checks are straightforward, but depending on the preconditions we specify, they can also be onerous. In this section, we’ll introduce an alternative mechanism for signalling an erroneous state from within a function call.
Consider this version of Stack.pop, which removes the precondition but keeps the same implementation:
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
"""
return self._items.pop()

When we call pop on an empty stack, we encounter the following error:
>>> s = Stack()
>>> s.pop()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "...", line 58, in pop
return self._items.pop()
IndexError: pop from empty list

As we saw earlier in the course, when an exception is raised Python stops the normal control flow of the currently running program. From the perspective of the client code, it is good to see an exception and know that something has gone wrong, but bad that the error report refers to a list (IndexError: pop from empty list) and a private attribute (self._items) that the client code should have no knowledge of.
A better solution is to raise a custom exception that is descriptive, yet does not reveal any implementation details. We can achieve this very easily in Python: we define our own type of error by defining a new class:
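The entire definition fits in two lines (this is the same EmptyStackError class used in the rest of this section):

```python
class EmptyStackError(Exception):
    """Exception raised when calling pop on an empty stack."""
```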
There is some slightly new syntax here: the (Exception) that follows the class name. For now, it is enough to know that this will properly create a new type of exception. The technical mechanism used, inheritance, is one we’ll cover later in this chapter.
Here’s how we’ll use EmptyStackError in our pop method:
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
Raise an EmptyStackError if this stack is empty.
"""
if self.is_empty():
raise EmptyStackError
else:
return self._items.pop()

There are two important changes in this version of pop. First, the method docstring has a new sentence naming both the type of exception and the scenario that will cause it to be raised. This exception is now part of the public interface of Stack.pop, meaning users of this class will be expected to take note of it. Second, this implementation uses a new Python keyword, raise, which unsurprisingly raises an exception. A raise statement can be used anywhere in our code, and works with any exception type: built-in ones such as IndexError and AttributeError, as well as ones we’ve defined ourselves, like our custom exception class here. Let’s see what happens now when we call pop on an empty stack:
>>> s = Stack()
>>> s.pop()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "...", line 60, in pop
raise EmptyStackError
EmptyStackError

As before, an exception is raised. But now the line shown is just the simple raise statement; it doesn’t mention any implementation details of the class. And it specifies that an EmptyStackError was the problem, as was documented in the method docstring.
One limitation of the above approach is that the name of the exception class alone does not convey much meaning. To provide a custom exception message, we can define a new special method named __str__ in our exception class. (Like __init__, the name __str__ has special meaning in Python. We’ll study it and more methods like it later in the course.)
class EmptyStackError(Exception):
"""Exception raised when calling pop on an empty stack."""
def __str__(self) -> str:
"""Return a string representation of this error."""
return 'pop may not be called on an empty stack'
>>> s = Stack()
>>> s.pop()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "...", line 60, in pop
raise EmptyStackError
EmptyStackError: pop may not be called on an empty stack

Because we include EmptyStackError as part of the public interface of the Stack.pop method, we should write tests to check that this behaviour occurs as expected. But unlike the tests we’ve written so far, we cannot simply call pop on an empty stack and check the return value or the state of the stack after pop returns. Raising an error interrupts the regular control flow of a Python program—and this includes test cases!
The pytest module allows us to write tests that expect an exception to occur, using the function pytest.raises together with the with keyword. (It is also possible to write doctests that check for exceptions; see Appendix B.1 for details.) Here is an example of a test case that checks that calling Stack.pop on an empty stack raises an EmptyStackError.
# Assuming our stack implementation is contained in a file stack.py.
from stack import Stack, EmptyStackError
import pytest
def test_empty_stack_error():
"""Test that popping from an empty stack raises an exception."""
s = Stack()
with pytest.raises(EmptyStackError):
s.pop()

The with keyword acts as an assertion, expecting an EmptyStackError to be raised by the body of the with block, the function call s.pop(). The test passes when that exception is raised, and fails when it is not raised (including the case where a different exception is raised instead of the expected one).
We’ve said repeatedly that when an exception is raised, the normal execution of the program is stopped, and the exception is reported to the user. However, pytest.raises seems to circumvent this: after an EmptyStackError is raised in our test, the test simply passes and execution proceeds to the next test. How does pytest.raises achieve this?
Python provides a compound statement, the try-except statement, to execute a block of code and handle a case where one or more pre-specified exceptions are raised in that block. Here is the simplest form of a try-except statement:
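In skeleton form, with <ExceptionClass> and <statements> standing in for real code (matching the placeholders used in the description that follows):

```
try:
    <statements>
except <ExceptionClass>:
    <statements>
```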
When a try-except statement is executed:
First, the block of code indented within the try is executed.
If no exception occurs when executing this block, the except part is skipped, and the Python interpreter continues to the next statement after the try-except.
If an exception occurs when executing this block:
If the exception has type <ExceptionClass>, the block under the except is executed, and then after that the Python interpreter continues executing the next statement after the try-except.
Importantly, in this case the program does not immediately halt!
However, if the exception is a different type, this does stop the normal program execution.
In practice, client code often uses try-except statements to call functions that may raise an error as part of their public interface. This shields users from seeing errors that they should never see, and allows the rest of the program to continue.
For example, here is how we could implement a function that takes a stack and returns the second item from the top of the stack.
def second_from_top(s: Stack) -> Optional[Any]:
"""Return the item that is second from the top of s.
Return None if there is no such item in s.
"""
try:
hold1 = s.pop()
except EmptyStackError:
# In this case, s is empty. We can return None.
return None
try:
hold2 = s.pop()
except EmptyStackError:
# In this case, s had only one element.
# We restore s to its original state and return None.
s.push(hold1)
return None
# If we reach this point, both of the previous s.pop() calls succeeded.
# In this case, we restore s to its original state and return the second item.
s.push(hold2)
s.push(hold1)
return hold2

Picture a lineup at a fast food restaurant. The first person in line is the first one served, then the next person in line, and so forth. As new people join the line, they join at the back, so that everyone who joined before them is served before them. This is the exact opposite of a stack: in this lineup situation, people leave the line in the same order they joined it.
In this section, we’ll introduce a new abstract data type to represent this type of collection, take a close look at the operations it supports, see how to implement it in Python, and analyze how efficient our implementation is.
A queue is another collection of data that, like a stack, adds and removes items in a fixed order. Unlike a stack, items come out of a queue in the order in which they entered. We call this behaviour First-In-First-Out (FIFO).
Queue
In code:
class Queue:
"""A first-in-first-out (FIFO) queue of items.
Stores data in a first-in, first-out order. When removing an item from the
queue, the least recently-added item is the one that is removed.
>>> q = Queue()
>>> q.is_empty()
True
>>> q.enqueue('hello')
>>> q.is_empty()
False
>>> q.enqueue('goodbye')
>>> q.dequeue()
'hello'
>>> q.dequeue()
'goodbye'
>>> q.is_empty()
True
"""
def __init__(self) -> None:
"""Initialize a new empty queue."""
def is_empty(self) -> bool:
"""Return whether this queue contains no items.
"""
def enqueue(self, item: Any) -> None:
"""Add <item> to the back of this queue.
"""
def dequeue(self) -> Any:
"""Remove and return the item at the front of this queue.
Raise an EmptyQueueError if this queue is empty.
"""
class EmptyQueueError(Exception):
"""Exception raised when calling dequeue on an empty queue."""
def __str__(self) -> str:
"""Return a string representation of this error."""
return 'dequeue may not be called on an empty queue'

Much like a stack, we can picture implementing this with a Python list. And, once again, we need to decide which end of the list is considered the front. Unlike with the stack, we will see that there is a trade-off in choosing which end of the list is the front. Before reading the rest of the section, try to reason informally about why this might be, taking into account that a queue is FIFO.
In the following implementation, we use a Python list that is hidden from the client. We have decided that the beginning of the list (i.e., index 0) is the front of the queue. This means that new items that are enqueued will be added at the end of the list, and items that are dequeued are removed from the beginning of the list.
class Queue:
"""A first-in-first-out (FIFO) queue of items.
Stores data in a first-in, first-out order. When removing an item from the
queue, the least recently-added item is the one that is removed.
>>> q = Queue()
>>> q.is_empty()
True
>>> q.enqueue('hello')
>>> q.is_empty()
False
>>> q.enqueue('goodbye')
>>> q.dequeue()
'hello'
>>> q.dequeue()
'goodbye'
>>> q.is_empty()
True
"""
# Private Instance Attributes:
# - _items: The items stored in this queue. The front of the list represents
# the front of the queue.
_items: list
def __init__(self) -> None:
"""Initialize a new empty queue."""
self._items = []
def is_empty(self) -> bool:
"""Return whether this queue contains no items.
"""
return self._items == []
def enqueue(self, item: Any) -> None:
"""Add <item> to the back of this queue.
"""
self._items.append(item)
def dequeue(self) -> Any:
"""Remove and return the item at the front of this queue.
Raise an EmptyQueueError if this queue is empty.
"""
if self.is_empty():
raise EmptyQueueError
else:
return self._items.pop(0)

Our Queue.enqueue calls list.append, which we know takes constant (\(\Theta(1)\)) time. However, Queue.dequeue calls self._items.pop(0), which takes \(\Theta(n)\) time (where \(n\) is the number of items stored in the queue). If we instead treated the end of the list as the front of the queue, we would simply swap these running times. This presents a trade-off: using an array-based list, we can have either an efficient enqueue or an efficient dequeue operation, but not both.
Is there, perhaps, another data structure we can use instead of a list to improve efficiency? Unfortunately, both dict and set are unordered data structures, but queues need to maintain (and remember) a very specific order. One interesting programming challenge is to implement a queue using two stacks, which can be done correctly but is not always more efficient. Eventually you will learn about even more interesting data structures, and it may be a good idea to revisit the Queue ADT and see how to use your new arsenal of data structures instead of a Python list. And because of abstraction (i.e., _items is a private attribute), you can modify your Queue implementation however you like without having to change any client code that uses it!
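Here is one sketch of the two-stacks challenge (our own illustration; plain Python lists serve as the two stacks): enqueue pushes onto an “in” stack, and dequeue pops from an “out” stack, refilling it from the “in” stack whenever it runs dry.

```python
from typing import Any


class QueueFromStacks:
    """A FIFO queue built from two LIFO stacks (plain lists here).

    enqueue pushes onto _in. dequeue pops from _out; when _out is
    empty, every item is moved over from _in, which reverses their
    order and restores first-in-first-out behaviour.
    """

    def __init__(self) -> None:
        self._in = []
        self._out = []

    def is_empty(self) -> bool:
        return self._in == [] and self._out == []

    def enqueue(self, item: Any) -> None:
        self._in.append(item)

    def dequeue(self) -> Any:
        """Remove and return the item at the front of this queue.

        Preconditions:
        - not self.is_empty()
        """
        if not self._out:
            # Reverse the order by popping everything across.
            while self._in:
                self._out.append(self._in.pop())
        return self._out.pop()
```

This matches the remark above that the approach is correct but not always more efficient: most dequeue calls are \(\Theta(1)\), but an individual dequeue that has to refill the out-stack takes \(\Theta(n)\).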
Not all lineups work the same way. While the lineup at a McDonald’s restaurant serves customers in a first-in-first-out order, the emergency room at a hospital does not see patients in the order that they arrive. Instead, the medical team performs an initial assessment of the severity of each patient’s illness, and patients with more life-threatening issues are seen earlier than others, regardless of when they arrived. In other words, patients are prioritized based on their condition.
The Priority Queue ADT is similar to the Queue ADT, except that every item has some measure of its “priority”. Items are removed from a Priority Queue in order of their priority, and ties are broken in FIFO order. To summarize:
One subtlety with our definition of this ADT is in how we represent priorities. For this section, we’ll simply represent priorities as integers, with larger integers representing higher priorities. We’ll see a different way of representing priorities in the next chapter.
Here is the public interface of a PriorityQueue class.
class PriorityQueue:
"""A collection of items that are removed in priority order.
When removing an item from the queue, the highest-priority item is the one
that is removed.
>>> pq = PriorityQueue()
>>> pq.is_empty()
True
>>> pq.enqueue(1, 'hello')
>>> pq.is_empty()
False
>>> pq.enqueue(5, 'goodbye')
>>> pq.enqueue(2, 'hi')
>>> pq.dequeue()
'goodbye'
"""
def __init__(self) -> None:
"""Initialize a new and empty priority queue."""
def is_empty(self) -> bool:
"""Return whether this priority queue contains no items.
"""
def enqueue(self, priority: int, item: Any) -> None:
"""Add the given item with the given priority to this priority queue.
"""
def dequeue(self) -> Any:
"""Remove and return the item with the highest priority.
Raise an EmptyPriorityQueueError when the priority queue is empty.
"""
class EmptyPriorityQueueError(Exception):
"""Exception raised when calling dequeue on an empty priority queue."""
def __str__(self) -> str:
"""Return a string representation of this error."""
return 'You called dequeue on an empty priority queue.'

Unlike with the Stack and Queue ADTs, it is not obvious whether we can use a list here. Somehow we need to not only store items, but also keep track of which one has the highest priority and, in the case of ties, which one was inserted first.
Our implementation idea here is to use a private attribute that is a list of tuples, where each tuple is a (priority, item) pair. Our list will also be sorted with respect to priority (breaking ties by insertion order), so that the last element in the list is always the next item to be removed from the priority queue.
With this idea, three of the four PriorityQueue methods are straightforward to implement:
from typing import Any
class PriorityQueue:
"""A queue of items that can be dequeued in priority order.
When removing an item from the queue, the highest-priority item is the one
that is removed.
"""
# Private Instance Attributes:
# - _items: a list of the items in this priority queue
_items: list[tuple[int, Any]]
def __init__(self) -> None:
"""Initialize a new and empty priority queue."""
self._items = []
def is_empty(self) -> bool:
"""Return whether this priority queue contains no items.
"""
return self._items == []
def dequeue(self) -> Any:
"""Remove and return the item with the highest priority.
Raise an EmptyPriorityQueueError when the priority queue is empty.
"""
if self.is_empty():
raise EmptyPriorityQueueError
else:
_priority, item = self._items.pop()
return item

As an exercise, we’ll leave you to show that each of these operations also runs in \(\Theta(1)\) time. But what about PriorityQueue.enqueue? An initial approach might be to first insert the new priority and item into the list, and then sort the list by priority. But this is a bit inefficient: we shouldn’t need to re-sort the entire list if we start with a sorted list and are simply inserting one new item. We can make this observation precise by noting that the worst-case running time of list.sort is \(\Theta(n \log n)\). (We’ll study sorting algorithms in detail later this year.) So instead, our enqueue implementation will search for the correct index in the list at which to add the new item. For example, suppose we want to insert the item 'hi' with priority 5 into the priority queue with self._items equal to [(1, 'a'), (4, 'b'), (6, 'c'), (10, 'd')]. We need to insert (5, 'hi') at index 2 in this list:

Here is our implementation of enqueue:
class PriorityQueue:
...
def enqueue(self, priority: int, item: Any) -> None:
"""Add the given item with the given priority to this priority queue.
"""
i = 0
while i < len(self._items) and self._items[i][0] < priority:
# Loop invariant: all items in self._items[0:i]
# have a lower priority than <priority>.
i = i + 1
self._items.insert(i, (priority, item))

In the second part of the loop condition, you might wonder about the <: could we use self._items[i][0] <= priority instead? Does it make a difference? It turns out that switching < for <= in the second part of the condition does make a difference when it comes to breaking ties. We’ll leave it as an exercise for you to work this out: try tracing an enqueue operation for the item 'hi' with priority 5 on a priority queue with self._items equal to [(1, 'a'), (5, 'b'), (5, 'c'), (10, 'd')].
And finally, what is the running time of this implementation of PriorityQueue.enqueue, for a priority queue with \(n\) elements? The loop here is a bit tricky to analyze because the number of iterations is not a fixed number in terms of \(n\). Here is one analysis:
The while loop takes at most \(n\) iterations, since i starts at 0 and increases by 1 at each iteration, and the loop must stop when i reaches \(n\) (if it hasn’t stopped earlier).
Since each loop iteration takes 1 step, in total the while loop takes at most \(n\) steps.
We know from our study of array-based lists that list.insert takes at most \(n\) steps, where \(n\) is the length of the list being inserted into.
Adding up these two quantities, the total running time of this algorithm is at most \(n + n = 2n\) steps, which is \(O(n)\).
Of course, we shouldn’t be satisfied with just an upper bound on the running time! It turns out that we can do better by incorporating the value of variable i in our calculation. Let \(I\) be the value of variable i after the loop finishes. Then:
The while loop takes exactly \(I\) iterations, and each iteration takes 1 step, so the loop takes \(I\) steps in total.
list.insert on a list of length \(n\) takes \(n - I\) steps to insert an item at index \(I\).
Adding these together, the total running time is \(I + (n - I) = n\) steps, which is \(\Theta(n)\).

In other words, we’ve shown that every call to this implementation of PriorityQueue.enqueue will take \(\Theta(n)\) time, regardless of the priority being inserted.
Our implementation of PriorityQueue has a constant-time dequeue but a linear-time enqueue. You might naturally wonder if we can do better: what if we used an unsorted list of tuples instead? This would give us \(\Theta(1)\) enqueue operations, simply by appending a new (priority, item) tuple to the end of self._items. However, we would simply have shifted the work over to the dequeue operation: we must now search for the highest-priority item in an unsorted list, which takes \(\Theta(n)\) time. Yet another trade-off!
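To see where the work goes, here is a sketch of our own of the search this alternative dequeue would have to perform, written as a standalone function over a hypothetical unsorted list of (priority, item) tuples:

```python
from typing import Any


def dequeue_unsorted(items: list[tuple[int, Any]]) -> Any:
    """Remove and return the highest-priority item from an unsorted
    list of (priority, item) tuples, breaking ties in favour of the
    earliest-inserted item (FIFO order).

    Preconditions:
    - items != []
    """
    # A linear scan for the index of the best item: Theta(n) time.
    best = 0
    for i in range(1, len(items)):
        # Strict > keeps the earliest item on ties, preserving FIFO order.
        if items[i][0] > items[best][0]:
            best = i
    _priority, item = items.pop(best)
    return item
```

Every dequeue must scan the whole list, so the \(\Theta(n)\) cost that our sorted implementation pays on enqueue shows up here on dequeue instead.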
In CSC263/CSC265, you’ll learn about the heap, a data structure which is commonly used to implement the Priority Queue ADT in practice. We can use this data structure to implement both PriorityQueue.enqueue and PriorityQueue.dequeue with a worst-case running time of \(\Theta(\log n)\). This is actually the approach taken by Python’s built-in heapq module. Pretty neat!
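For a small taste of that approach, here is a sketch of our own (a simplification, not the CSC263 treatment) that builds this section’s PriorityQueue interface on top of heapq. Since heapq maintains a min-heap, we store negated priorities, and an increasing counter breaks ties in FIFO order:

```python
import heapq
from itertools import count
from typing import Any


class HeapPriorityQueue:
    """A sketch of the Priority Queue ADT on top of Python's heapq.

    heapq maintains a min-heap, so we store -priority to make the
    largest priority come out first. The counter breaks ties in
    first-in-first-out order, matching this section's ADT.
    """

    def __init__(self) -> None:
        self._heap = []
        self._counter = count()

    def is_empty(self) -> bool:
        """Return whether this priority queue contains no items."""
        return self._heap == []

    def enqueue(self, priority: int, item: Any) -> None:
        """Add the given item with the given priority."""
        heapq.heappush(self._heap, (-priority, next(self._counter), item))

    def dequeue(self) -> Any:
        """Remove and return the item with the highest priority.

        Preconditions:
        - not self.is_empty()
        """
        _neg_priority, _order, item = heapq.heappop(self._heap)
        return item
```

Both enqueue and dequeue now run in \(O(\log n)\) time, improving on the linear-time operation that each of our list-based implementations was stuck with.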
In this chapter, we have learned that an abstract data type can have multiple implementations, and seen this first-hand with a variety of ADTs. For example, in 9.4 Stacks we saw that the Stack ADT can be implemented using a Python list in two different ways, storing the top of the stack at the end of the list (Stack1) or the front of the list (Stack2). Though these two classes had different implementations, they shared the same public interface of the Stack ADT.
One limitation of the code we wrote for these two classes is that the only way to tell that Stack1 and Stack2 had the same interface was from their method names and docstrings. In this section, we’ll see how to create a special kind of Python class that is used to define a public interface that can be implemented by other classes, using a Python language feature known as inheritance.
Let us begin by defining a Stack class that consists only of the public interface of the Stack ADT.
class Stack:
"""A last-in-first-out (LIFO) stack of items.
This is an abstract class. Only subclasses should be instantiated.
"""
def is_empty(self) -> bool:
"""Return whether this stack contains no items.
"""
raise NotImplementedError
def push(self, item: Any) -> None:
"""Add a new element to the top of this stack.
"""
raise NotImplementedError
def pop(self) -> Any:
"""Remove and return the element at the top of this stack.
Raise an EmptyStackError if this stack is empty.
"""
raise NotImplementedError
class EmptyStackError(Exception):
"""Exception raised when calling pop on an empty stack."""In Python, we mark a method as unimplemented by having its body raise a special exception, NotImplementedError. We say that a method is abstract when it is not implemented and raises this error; we say that a class is abstract when at least one of its methods is abstract (i.e., not implemented). A concrete class is a class that is not abstract; so far in this course, we’ve been dealing with concrete classes, and called them concrete data types. The terminology here is a bit confusing because of the multiple uses of certain terms. A concrete Python class is the same as a concrete data type. However, an abstract Python class is not the same thing as an abstract data type; the former has a technical meaning specific to the Python programming language, while the latter is the name given to an abstract description of a data type that is programming language-independent.
Now, you might wonder what the purpose of an abstract class is. Indeed, a programmer who creates a Stack object will quickly find it is useless, because calling the Stack ADT operations causes errors:
>>> s = Stack()
>>> s.push(30)
Traceback...
NotImplementedError
>>> s.pop()
Traceback...
NotImplementedError

If we can’t use the Stack object for any of the Stack ADT operations, what was the point in creating it? The answer is very much based on abstraction, hence the name abstract class. The Stack class we have defined is a direct translation of the Stack ADT: an interface that describes the methods that a concrete class that wants to implement the Stack ADT must define. Python gives us a way to describe the relationship between an abstract class and a concrete class that implements its methods directly in the code.
Earlier in this chapter, we defined two new types: Stack1 and Stack2. However, despite the two types sharing the same method names, the code did not indicate that the types were related in any way. Now that we have the abstract class Stack, we can indicate this relationship in the code through inheritance:
class Stack1(Stack):
def __init__(self) -> None:
"""Initialize a new empty stack.
"""
self._items = []
def is_empty(self) -> bool:
"""..."""
return self._items == []
def push(self, item: Any) -> None:
"""..."""
self._items.append(item)
def pop(self) -> Any:
"""..."""
return self._items.pop()
class Stack2(Stack):
def __init__(self) -> None:
"""Initialize a new empty stack.
"""
self._items = []
def is_empty(self) -> bool:
"""..."""
return self._items == []
def push(self, item: Any) -> None:
"""..."""
self._items.insert(0, item)
def pop(self) -> Any:
"""..."""
return self._items.pop(0)

In the class headers class Stack1(Stack) and class Stack2(Stack), the syntax (Stack) indicates that Stack1 and Stack2 inherit from Stack. There are specific words we use to talk about these relationships:

- Stack: base class, superclass, and parent class are synonyms.
- Stack1, Stack2: subclass, child class, and derived class are synonyms.

For example, we can say that “Stack is the parent class of Stack1” or “Stack2 is a subclass of Stack”.

When one class in Python inherits from another, there are two important consequences. First, the Python interpreter treats every instance of the subclass as an instance of the superclass as well.
>>> s1 = Stack1()
>>> isinstance(s1, Stack1)
True
>>> isinstance(s1, Stack)
True
>>> isinstance(s1, Stack2)
FalseSecond, when the superclass is abstract, the subclass must implement all abstract methods from the superclass, without changing the public interface of those methods. Just like preconditions and representation invariants, inheritance serves as another form of contract:
So for example, if we say that Stack1 is a subclass of Stack, then any user of Stack1 can expect to be able to call push, pop, and is_empty on Stack1 instances. And of course the same applies to Stack2 as well.
It is this expectation that allows us to use inheritance in Python to express a shared public interface between multiple classes. In our example, because Stack1 and Stack2 are both subclasses of Stack, we expect them to implement all the stack methods. They might also implement additional methods that are unique to each subclass (not shared), but this is not required.
Suppose we are writing code that operates on a stack, like in the following function:
def push_and_pop(s: ..., item: Any) -> None:
"""Push and pop the given item onto the stack s."""
s.push(item)
s.pop()

What type annotation would be appropriate for s? If we use a concrete stack implementation like Stack1, this would rule out other stack implementations for this function. Instead, we use the abstract class Stack as the type annotation, to indicate that our function push_and_pop can be called with any instance of any Stack subclass.
def push_and_pop(s: Stack, item: Any) -> None:
"""Push and pop the given item onto the stack s."""
s.push(item)
s.pop()

Remember that Stack defines a public interface that is shared between all of its subclasses: the body of push_and_pop only needs to call methods from that interface (pop and push), and doesn’t worry about how those methods are implemented. This allows us to pass to the push_and_pop function a Stack1 or Stack2 object, which both inherit from Stack.
>>> s1 = Stack1()
>>> push_and_pop(s1, 10) # This works!
>>> s2 = Stack2()
>>> push_and_pop(s2, 10) # This also works!

You might notice that there are actually three versions of push in our code: Stack.push, Stack1.push, and Stack2.push. So which method does the Python interpreter choose when the push_and_pop function is called? This is how it works for s.push(item) (s.pop() is handled similarly):
- When the Python interpreter executes s.push(item), it first computes type(s). The result will depend on the argument we passed in—in our above example, type(s1) is Stack1, and type(s2) is Stack2.
- The interpreter then looks up that class’s push method and calls it, passing in s for the self argument. (There are instances with inheritance where a subclass might not implement a particular method from the superclass. We’ll look at some examples of this in the next section.)

We say that the Python interpreter dynamically looks up (or resolves) the s.push/s.pop method, because the actual method called by s.push/s.pop changes depending on the argument passed to push_and_pop.
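We can observe this dynamic lookup directly: type(s) tells us which class the interpreter starts from, and that class’s method resolution order (exposed through the __mro__ attribute) lists the classes searched, in order, for the push attribute. A small self-contained sketch:

```python
class Stack:
    """An abstract stack class (abbreviated for this sketch)."""
    def push(self, item: object) -> None:
        raise NotImplementedError


class Stack1(Stack):
    """A concrete stack storing the top of the stack at the end of a list."""
    def __init__(self) -> None:
        self._items = []

    def push(self, item: object) -> None:
        self._items.append(item)


s = Stack1()
# The interpreter computes type(s) first...
print(type(s).__name__)  # Stack1
# ...then searches these classes, in order, for a push method:
print([cls.__name__ for cls in type(s).__mro__])  # ['Stack1', 'Stack', 'object']
# Stack1.push is found first, so this call runs Stack1's implementation.
s.push(10)
```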
We say that the push_and_pop function is polymorphic, meaning it can take as inputs values of different concrete data types and select a specific method based on the type of input. This support for polymorphism is also why the “object dot notation” style of method call is preferred to the “class dot notation” style we’ve been using up to this point. Consider the following two alternate implementations of push_and_pop:
def push_and_pop_alt1(s: Stack, item: Any) -> None:
"""Push and pop the given item onto the stack s."""
Stack.push(s, item)
Stack.pop(s)
def push_and_pop_alt2(s: Stack, item: Any) -> None:
"""Push and pop the given item onto the stack s."""
Stack1.push(s, item)
Stack1.pop(s)

The first version (alt1) explicitly calls the Stack.push and Stack.pop methods, both of which are unimplemented and would raise a NotImplementedError. The second version (alt2) calls concrete methods Stack1.push and Stack1.pop, which assumes a specific stack implementation (Stack1), and so push_and_pop would only be guaranteed to work on Stack1 instances, but not any other Stack subclass. This makes push_and_pop no longer polymorphic: the correct type annotation for s would be Stack1, not Stack.
Because both Stack1 and Stack2 are different implementations of the same interface, we can use polymorphism to help us measure the performance of each. Below, we time the push_and_pop function, first with a Stack1 object and second with a Stack2 object.
if __name__ == '__main__':
# Import the main timing function.
from timeit import timeit
# The stack sizes we want to try.
STACK_SIZES = [1000, 10000, 100000, 1000000, 10000000]
for stack_size in STACK_SIZES:
stack1 = Stack1()
stack2 = Stack2()
# Bypass the Stack interface to create a stack of size <stack_size>.
# This speeds up the experiment, but we know this violates
# encapsulation!
stack1._items = list(range(0, stack_size))
stack2._items = list(range(0, stack_size))
# Call push_and_pop(stack1) 1000 times, and store the time taken.
t1 = timeit('push_and_pop(stack1, 10)', number=1000, globals=globals())
t2 = timeit('push_and_pop(stack2, 10)', number=1000, globals=globals())
print(f'Stack size {stack_size:>8}; Stack1 time {t1}; Stack2 time {t2}')

If we have several implementations of an ADT, each inheriting from the same base class, then we can quickly run experiments on all of them but only need to remember a single interface. This creates a rule of thumb: when indicating the type of an object (e.g., through a type contract), choose the most generic type possible. Following this rule of thumb means that the client code is not constrained to one particular implementation (such as Stack1) and can readily change the underlying object so long as the new object type shares the same public interface.
Many software applications follow the same principle. For example, you may have used software with “plugins” (like PyCharm!): each plugin implements a shared public interface, allowing the software to use it without knowing any of the details. Consider Adobe, which develops the powerful Photoshop application for image editing. David comes along and discovers a feature he really wants is missing. Rather than asking Adobe to implement the new feature, he can implement it himself as a plugin. Thus, Adobe has allowed independent developers to extend the functionality of its software after it has been released, without any effort from its own employees. Behold, the power of abstraction!
The object superclass

In our very first chapter, we described every piece of data as an object, and have continued to use this term throughout this course. It turns out that “object” is not merely a theoretical concept, but is made explicit in the Python language. Python has a special class called object, which is an ancestor class (by “ancestor” we mean either a parent class, or a parent of a parent class, etc.) of every other class: built-in classes like int, our custom data classes, and the classes we’ve defined in this chapter. And this includes abstract classes like Stack!
By default, whenever we define a new class (including data classes), if we do not specify a superclass in parentheses, object is the implicit superclass, which is why we can write class Stack: instead of class Stack(object):.

object special methods

This object class defines several special methods as part of its shared public interface. The Python convention is to name methods that have a special purpose with double underscores; these are sometimes called “dunder” (double underscore) methods. The special methods include:
- __init__(self, ...), the initializer
- __str__(self), which returns a str representation of the object

Unlike our Stack abstract class earlier this chapter, the object class is actually not abstract, and implements each of these methods. We can use this to illustrate a different form of inheritance, where the superclass is a concrete class. In this case, inheritance is used not just to define a shared public interface, but also to provide default implementations for each method in the interface.
For example, suppose we create a dummy class with a completely empty body:
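Such a class can be defined in a single statement (reconstructed here to match the Donut name used in the examples that follow):

```python
class Donut:
    """A dummy class that defines no methods or attributes of its own."""
```

Even though the body is empty, Donut still inherits every method that object defines.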
This class inherits the object.__init__ method, which allows us to create new Donut instances.
Similarly, this class inherits the object.__str__ method, which returns a string that states the class name and memory location of the object:
We can use the built-in dir function to see all of the special methods that Donut has inherited from object (though this list also includes a few special attributes set directly by the Python interpreter, which are beyond the scope of this course):
>>> dir(Donut)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__']

There is another reason these methods are special beyond simply being inherited from the object superclass: they are often called by other functions or parts of Python syntax. For example, we have already seen how the __init__ method is called when a new object is initialized.
The __str__ method is called when we attempt to convert an object to a string by calling str on it:
>>> d = Donut()
>>> d.__str__()
'<__main__.Donut object at 0x7fc299d7b588>'
>>> str(d)
'<__main__.Donut object at 0x7fc299d7b588>'

Similarly, the built-in print function actually first converts its arguments into strings using their __str__ methods, and then prints out the resulting text.
Now, even though the object superclass contains default implementations of __init__ and __str__, in practice we often want to define our own custom implementations of these methods.
Every time we’ve defined our own __init__ in a class, we have overridden the object.__init__ method. Formally, we say that a class C overrides a method m when the method m is defined in the superclass of C, and is also given a concrete implementation in the body of C. This definition applies whether the superclass of C has m as an abstract or concrete method. For example, we could say that Stack1 overrides the push and pop method from its abstract superclass Stack.
Similarly, when we defined a custom exception class in Section 9.5,
class EmptyStackError(Exception):
"""Exception raised when calling pop on an empty stack."""
def __str__(self) -> str:
"""Return a string representation of this error."""
return 'pop may not be called on an empty stack'

this class overrode the __str__ method to use its own string representation, which is displayed when this exception is raised.
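We can check this overridden behaviour directly. A quick sketch, repeating the class definition so the snippet is self-contained:

```python
class EmptyStackError(Exception):
    """Exception raised when calling pop on an empty stack."""
    def __str__(self) -> str:
        """Return a string representation of this error."""
        return 'pop may not be called on an empty stack'


# str calls our overridden __str__, and the interpreter displays this
# same message at the end of a traceback when the exception is raised.
print(str(EmptyStackError()))  # pop may not be called on an empty stack
```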
We do not write software in a vacuum; we study computer science to learn how to use vast computational power to solve real-world problems. As professionals in industry and academia, the programs we create serve a purpose, whether to satisfy a need from a client, to improve individual lives and society, or to advance human knowledge and technology. In the previous chapters of these course notes, we have learned about the fundamentals of programming, mathematical proof, and algorithm analysis. We have focused on developing the knowledge and skills required to create and analyse programs.
In this final chapter of the course, we will take what we’ve learned and apply it to design and implement a large program to solve a real-world problem. As a first step to this, we’ll learn about how to approach a new domain to understand how we can apply computer science techniques to both represent and solve problems in that domain.
A problem domain is a collection of knowledge about a specific field, phenomenon, or discipline, and an understanding of the goals, problems, deficiencies, and/or desired improvements within that area. (We use the term domain-specific knowledge to refer to knowledge about a particular domain, and society often uses the term domain experts to refer to people who have a great deal of knowledge in a particular domain.) Each problem domain encompasses many different kinds of knowledge, including terminology and definitions, concepts and skills, and context and history. Through your lectures, tutorials, and assignments, you’ve touched on a wide array of problem domains, such as tracking marriage records in the City of Toronto, modelling the spread of infectious diseases, generating course timetables as U of T students, and cryptography.
Let’s unpack how we explored the domain of cryptography in Chapter 7. We first introduced the key scenario of two people communicating securely so that their messages could not be deciphered by an eavesdropper. As we dove into cryptography, we learned about:
Our previous study of programming enabled us to write programs, but we had to learn all about the domain of cryptography before being able to implement cryptographic algorithms ourselves. Our knowledge of Python programming alone might have been sufficient to explain what operations are performed on what data in, for example, rsa_generate_key, rsa_encrypt, and rsa_decrypt. But it was the domain-specific knowledge we learned that explained how we came up with these algorithms and why they are correct.
Now, we’ll introduce a new problem domain that we will spend the rest of this chapter studying.
Consider a person or household self-quarantining during the pandemic. One of the main logistical challenges they have to face is how to arrange for food during their quarantine. To help address this need, you have founded Hercules Ltd., a non-profit organization that allows people under quarantine to order groceries and meals from grocery stores and restaurants, and arranges for couriers to make deliveries right to their front doors. You are incredibly excited and can’t wait to launch a Hercules app. Your friend is a bit more cautious, and wonders how many couriers will be needed to make grocery and meal deliveries in a timely manner, which of course will depend on how many people use the app. You and your friend decide to put the computational skills you’ve learned in this course to work to help answer this question.
This problem domain is likely a familiar one; the idea of having food delivered to your doorstep has existed for a long time. The preceding paragraph uses some familiar terminology, such as couriers and deliveries. You may even be familiar with existing apps that already do this, such as UberEats, Skip the Dishes, or Instacart. When thinking about designing and implementing this app, you are probably considering:
We can view food delivery in Toronto as a system, which is a group of entities (or agents) that interact with each other over time. Systems modelling is frequently used to conceptualize how an organization operates. The first part of creating a computational model for such a system is to design and implement the various entities in the system—in the case of Hercules Ltd., these are entities like couriers and the customers placing orders.
The entities in a system are not static; they change over time. New people sign up and place food orders; couriers pick up meals from restaurants and deliver them to clients. For a live app, these events are driven by real humans interacting with the app in real-time. In this chapter, however, we’re going to look at another way of driving change in our food delivery system over time. The second part of our computational model is a simulation that uses randomness to generate events that cause the system to change over time. For example, our food delivery simulation will specify how often customers place an order, taking into account that some times of day are busier than others.
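The idea of using randomness to generate events can be sketched in a few lines. Note that the names and per-hour probabilities below are ours, chosen purely for illustration, and not the simulation these notes go on to build:

```python
import random

# Hypothetical probability that a given customer places an order in
# each hour of the day (0-23). These numbers are invented for this sketch.
ORDER_PROB_BY_HOUR = {hour: 0.05 for hour in range(24)}
ORDER_PROB_BY_HOUR[12] = 0.6   # lunch rush
ORDER_PROB_BY_HOUR[18] = 0.8   # dinner rush


def order_placed(hour: int) -> bool:
    """Return whether a customer places an order during this hour.

    random.random() returns a float in [0.0, 1.0), so comparing it to the
    hour's probability makes the event occur with exactly that probability.
    """
    return random.random() < ORDER_PROB_BY_HOUR[hour]
```

Running order_placed many times simulates many customers (or many days), with busier hours naturally producing more orders.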
Computational simulations are a powerful tool because they harness the speed and reliability of your computer to perform complex calculations and produce results that can be analysed and visualized. But simulations are reliant on the accuracy of their underlying mathematical models, and are ultimately approximations of the real world. A well-designed simulation allows the programmer to start with a simple model and extend and tweak it in response to new domain-specific knowledge.
Over the course of this chapter, we’ll study how to design and implement both of these parts of a computational model for our food delivery platform, Hercules. This case study will also give us an opportunity to explore the design of a relatively complex software system. We’ll use what we’ve learned about classes to model the entities in a food delivery network, and study a specific kind of simulation known as the discrete-event simulation. We hope you’re excited. Hercules is counting on you!
In the previous section, we said that a system is a collection of entities that interact with each other over time. In this section, we will explore what data should be a part of our problem domain—a food delivery system—and how that data might change over time. We’ll introduce an object-oriented approach to modelling this data in Python, using both data classes and general classes to represent different entities.
One thing to keep in mind as we proceed through this section (and the rest of the chapter) is that just like in the “real world”, the scope of our problem domain is not fixed and can change over time. We are interested in the minimum set of data needed for our system to be meaningful, keeping the scope small at first with the potential to expand over time. Throughout this section, we’ll point out places where we make simplifying assumptions that reduce the complexity of our system, which can serve as potential avenues for your own independent explorations after working through this chapter.
A good first step in modelling our problem domain is to identify the relevant entities in the domain. Here is our initial description of Hercules from the previous section:
Consider a person or household doing a self-quarantine during the pandemic. One of the main logistical challenges they have to face is how to arrange for food during their quarantine. To help address this need, you have founded Hercules Ltd., a non-profit organization that allows people under quarantine to order groceries and meals from grocery stores and restaurants, and arranges for couriers to make deliveries right to their front doors.
We use two strategies for picking out relevant entities from an English description like this one:
In an object-oriented design, a standard approach is to create a class to represent each of these entities. Should we make a data class or a general class for each one? There are no easy answers to this question, but a good strategy to use is to start with a data class, since data classes are easier to create, and turn it into a general class if we need a more complex design (e.g., to add methods, including the initializer, or mark attributes as private).
@dataclass
class Restaurant:
"""A place that serves food."""
@dataclass
class Customer:
"""A person who orders food."""
@dataclass
class Courier:
"""A person who delivers food orders from restaurants to customers."""
@dataclass
class Order:
"""A food order from a customer."""Once we have identified the classes representing the entities in the system, we now dive into the details of the system to identify appropriate attributes for each of these data classes. We’ll discuss our process for two of these data classes in this section, and leave the other two to lecture this week.
Restaurant data class

Let us consider how we might design a restaurant data class. What would a restaurant need to have stored as data? It is useful to envision how a user might interact with the app. A user might want to browse a list of restaurants available, and so we need a way to identify each restaurant: its name. After selecting a restaurant, a user needs to see what food is available to order, so we need to store a food menu for each restaurant. Finally, couriers need to know where restaurants are in order to pick up food orders, and so we need to store a location for each restaurant.
Each of these three pieces of information—restaurant name, food menu, and location—is an appropriate attribute for the restaurant. Now we have to decide what data types to use to represent this data. You have had much practice doing this, stretching all the way back to the beginning of this course! Yet as we’ll see, there are design decisions to be made even when choosing individual attributes.
The restaurant name is fairly straightforward: we’ll use a str to represent it.
The restaurant menu has a few different options. For this section, we’ll use a dict that maps the names of dishes (strs) to their price (floats).
There are many ways to represent a restaurant’s location. For example, we could store its address, as a str. Or we could improve the precision of our data and store the latitude and longitude (a tuple of floats), which would be useful for displaying restaurants on maps.
For now, we’ll store both address and latitude/longitude information for each restaurant. It may be that both representations are useful, and should be stored by our application.
@dataclass
class Restaurant:
"""A place that serves food.
Instance Attributes:
- name: the name of the restaurant
- address: the address of the restaurant
- menu: the menu of the restaurant with the name of the dish mapping to
the price
- location: the location of the restaurant as (latitude, longitude)
Representation Invariants:
- all(self.menu[item] >= 0 for item in self.menu)
- -90 <= self.location[0] <= 90
- -180 <= self.location[1] <= 180
"""
name: str
address: str
menu: dict[str, float]
location: tuple[float, float]

There is one other subtlety with this design before we move on. The menu is a compound data type, and we chose to represent it using one of Python’s built-in data structures. But another approach would have been to create a completely separate Menu data class. That is certainly a viable option, but we were wary of falling into the trap of creating too many classes in our simulation. Each new class we create introduces a little more complexity into our program, and for a relatively simple class for a menu, we did not think this additional complexity was worth it.
On the flip side, we could have used a dictionary to represent a restaurant instead of a Restaurant data class. This would have reduced one area of complexity (the number of classes to keep track of), but introduced another (the “valid” keys of a dictionary used to represent a restaurant). There is always a trade-off in design, and when evaluating trade-offs one should never forget the cognitive load on the programmer.
Order data class

Now let’s discuss a data class that’s a bit more abstract: a single order. An order must track the customer who placed the order, the restaurant where the food is being ordered from, and the food items that are being ordered. We can also imagine that an order should have an associated courier who has been assigned to deliver the order. We’ll also keep track of when the order was created, and when the order is completed.
There’s one subtlety with two of these attributes: the associated courier and the time when the order is completed might only be assigned values after the order has been created. So we use a default value None to assign to these two instance attributes when an Order is first created. We could implement this by converting the data class to a general class and writing our own __init__ method, but instead we’ll take advantage of a new feature with data classes: the ability to specify default values for an instance attribute after the type annotation.
@dataclass
class Order:
"""A food order from a customer.
Instance Attributes:
- customer: the customer who placed this order
- restaurant: the restaurant the order is placed at
- food_items: a mapping from names of food to the quantity being ordered
- start_time: the time the order was placed
- courier: the courier assigned to this order (initially None)
- end_time: the time the order was completed by the courier (initially None)
Representation Invariants:
- self.food_items != {}
- all(self.food_items[food] > 0 for food in self.food_items)
"""
customer: Customer
restaurant: Restaurant
food_items: dict[str, int]
start_time: datetime.datetime
courier: Optional[Courier] = None
end_time: Optional[datetime.datetime] = None

The line courier: Optional[Courier] = None is how we define an instance attribute courier with a default value of None. The type annotation Optional[Courier] means that this attribute can either be None or a Courier instance. Similarly, the end_time attribute must be either None (its initial value) or a datetime.datetime value.
Here is how we could use this class (note that Customer is currently an empty data class, and so is instantiated simply as Customer()):
>>> david = Customer()
>>> mcdonalds = Restaurant(name='McDonalds', address='160 Spadina Ave',
... menu={'fries': 4.5}, location=(43.649, -79.397))
>>> order = Order(customer=david, restaurant=mcdonalds,
... food_items={'fries': 10},
... start_time=datetime.datetime(2020, 11, 5, 11, 30))
>>> order.courier is None # Illustrating default values
True
>>> order.end_time is None
True

Just as we saw earlier in the course that built-in collection types like lists can be nested within each other, classes can also be “nested” within each other through their instance attributes. Our above Order data class has attributes which are instances of other classes we have defined (Customer, Restaurant, and Courier).
The relationship between Order and these other classes is called class composition, and is fundamental to object-oriented design. When we create classes for a computational model, these classes don’t exist in isolation. They can interact with each other in several ways, one of which is composition. We use class composition to represent a “has a” relationship between two classes (we say that “an Order has a Customer”). This is in contrast to inheritance, which defines an “is a” relationship between two classes, e.g. “Stack1 is a Stack”.
In the previous section, we defined four different data classes—Restaurant, Customer, Courier, Order—to represent different entities in our food delivery system. We must now determine how to keep track of all of these entities, and how they can interact with each other. For example, as a user I would want to be able to look up a list of restaurants in my area to order food from. In code, how does a single Customer object “know” about all the different Restaurants in the system? Should each Customer have an attribute containing a list of Restaurants? The question of how objects “know” about other objects is similar to the notion of variable scope. A variable’s scope determines where it can be accessed in a program; the scope of an object dictates the object’s lifetime and who the object belongs to. But now consider our current problem domain, with its hundreds of restaurants and potentially thousands of customers. What should the scope of all those objects be?
There are many ways to approach this problem. A common object-oriented design approach is to create a new manager class whose role is to keep track of all of the entities in the system and to mediate the interactions between them (like a customer placing a new order). This class is more complex than the others we saw in the last section, and so we will not use a data class, and instead use a general class with a custom initializer and keep most of the instance attributes private.
Here is the manager class we’ll create for our food delivery system. The FoodDeliverySystem class will store (and have access to) every customer, courier, and restaurant represented in our system.
class FoodDeliverySystem:
"""A system that maintains all entities (restaurants, customers, couriers, and orders).
Public Attributes:
- name: the name of this food delivery system
Representation Invariants:
- self.name != ''
- all(r == self._restaurants[r].name for r in self._restaurants)
- all(c == self._customers[c].name for c in self._customers)
- all(c == self._couriers[c].name for c in self._couriers)
"""
name: str
# Private Instance Attributes:
# - _restaurants: a mapping from restaurant name to Restaurant object.
# This represents all the restaurants in the system.
# - _customers: a mapping from customer name to Customer object.
# This represents all the customers in the system.
# - _couriers: a mapping from courier name to Courier object.
# This represents all the couriers in the system.
# - _orders: a list of all orders (both open and completed orders).
_restaurants: dict[str, Restaurant]
_customers: dict[str, Customer]
_couriers: dict[str, Courier]
_orders: list[Order]
def __init__(self, name: str) -> None:
"""Initialize a new food delivery system with the given company name.
The system starts with no entities.
"""
self.name = name
self._restaurants = {}
self._customers = {}
self._couriers = {}
self._orders = []

What we have done so far is model the static properties of our food delivery system: the attributes necessary to capture a snapshot of the system’s state at a specific moment in time. Next, we’re going to look at how to model the dynamic properties of the system: how the entities interact with each other and cause the system state to change over time.
Though a FoodDeliverySystem instance starts off empty, we can define simple methods to add entities to the system. You can picture this happening when a new restaurant, customer, or courier signs up for our app. By making our collection attributes private and requiring client code to call these methods, we can also check that entity names are unique.
class FoodDeliverySystem:
...
def add_restaurant(self, restaurant: Restaurant) -> bool:
"""Add the given restaurant to this system.
Do NOT add the restaurant if one with the same name already exists.
Return whether the restaurant was successfully added to this system.
"""
if restaurant.name in self._restaurants:
return False
else:
self._restaurants[restaurant.name] = restaurant
return True
def add_customer(self, customer: Customer) -> bool:
"""Add the given customer to this system.
Do NOT add the customer if one with the same name already exists.
Return whether the customer was successfully added to this system.
"""
# Similar implementation to add_restaurant
def add_courier(self, courier: Courier) -> bool:
"""Add the given courier to this system.
Do NOT add the courier if one with the same name already exists.
Return whether the courier was successfully added to this system.
"""
# Similar implementation to add_restaurant

The main driving force in our simulation is customer orders. When a customer places an order, a chain of events is triggered: the order is recorded in the system, an available courier is assigned to deliver it, the courier delivers the order, and finally the order is marked as complete.
To represent these events in our program, we need to create functions that mutate the state of the system. Where should we create these functions? We could write them as top-level functions, or as methods of one of our existing entity classes (turning that class from a data class into a general class). We have previously said that one of the roles of the FoodDeliverySystem is to mediate interactions between the various entities in the system, and so this makes it a natural class to add these mutating methods.
class FoodDeliverySystem:
...
def place_order(self, order: Order) -> bool:
"""Record the new given order.
Assign a courier to this new order (if a courier is available).
Return whether the order was successfully placed; an order cannot be placed when no courier is available.
Preconditions:
- order not in self._orders
"""
def complete_order(self, order: Order, timestamp: datetime.datetime) -> None:
"""Mark the given order as complete at the given timestamp.
Make the courier who was assigned this order available to take a new order.
Preconditions:
- order in self._orders
"""

We could then place an order from a customer using FoodDeliverySystem.place_order, which would be responsible for both recording the order and assigning a courier to that order. FoodDeliverySystem.complete_order does the opposite, marking the order as complete and un-assigning the courier so that they are free to take a new order. With both FoodDeliverySystem.place_order and FoodDeliverySystem.complete_order, we can begin to see how a simulation might take place, with many customers placing orders at different restaurants and available couriers delivering them.
Note that this discussion should make sense even though we haven’t implemented either of these methods. Questions like “How do we choose which courier to assign to a new order?” and “How do we mark an order as complete?” are about implementation rather than the public interface of these methods. We’ll discuss one potential implementation of these methods in lecture, but we welcome you to attempt your own implementations as an exercise.
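As a starting point for that exercise, here is one possible implementation, sketched with simplified stand-ins for the entity classes. The attribute name `_available_couriers` and the pop-based assignment strategy are assumptions for this sketch, not part of the original design:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Courier:
    """Simplified stand-in for the Courier data class."""
    name: str


@dataclass
class Order:
    """Simplified stand-in for the Order data class."""
    courier: Optional[Courier] = None


class FoodDeliverySystem:
    """A minimal sketch showing only the order-related behaviour."""

    def __init__(self) -> None:
        self._available_couriers = []  # couriers with no current order
        self._orders = []

    def add_courier(self, courier: Courier) -> None:
        self._available_couriers.append(courier)

    def place_order(self, order: Order) -> bool:
        """Record the order and assign a courier, if one is available."""
        if not self._available_couriers:
            return False
        order.courier = self._available_couriers.pop()
        self._orders.append(order)
        return True

    def complete_order(self, order: Order) -> None:
        """Mark the order complete; its courier becomes available again."""
        self._available_couriers.append(order.courier)
```

The key design point is that the system, not the customer or courier, decides which courier takes which order.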
In the previous two sections, we discussed the key classes we can use to represent a food delivery system: data classes Restaurant, Customer, Courier, and Order to represent individual entities, and a FoodDeliverySystem class to manage all of them. But even though the FoodDeliverySystem class has methods that allow us to mutate the state of the system, you might wonder: who is responsible for actually calling these methods?
If we were building a “real-world” app, we would need to write code that explicitly connects user actions (e.g., pressing a button on a mobile app) to these methods, and almost certainly rely on an existing software framework to do much of the “connecting” for us.
The approach we’re taking in this chapter is a bit different. Instead of writing the code necessary to respond to real-world actions, we are going to create a simulation that uses a combination of preset and random data to simulate these kinds of real-world actions. The driving force of our simulation will be events that cause our system to mutate. For example, a “new order” event for when a customer places an order, and a “complete order” event for when a courier has delivered an order to a customer.
Event interface

There are many other events we might add to the simulation, but they clearly have something in common: they are events that cause the state of the simulation to change. In 9.8 Defining a Shared Public Interface with Inheritance, we learned how to define an abstract class to represent a shared public interface, and used inheritance to relate this abstract class to concrete subclasses that must adhere to this interface. In our case, we’ll define an abstract Event class with subclasses NewOrderEvent and CompleteOrderEvent to represent different kinds of events.
Here is an initial definition of this Event interface. The class has one abstract method, handle_event, which is how we connect each event to a change in the food delivery system.
class Event:
"""An abstract class representing an event in a food delivery simulation.
"""
def handle_event(self, system: FoodDeliverySystem) -> None:
"""Mutate the given food delivery system to process this event."""
raise NotImplementedError

Each Event subclass is responsible for implementing handle_event based on the type of change the subclass represents. For example, the NewOrderEvent.handle_event method should, well, add a new order to the system. In order to implement handle_event, each subclass will probably need its own set of instance attributes to represent the details of the event (e.g., what order to add in a NewOrderEvent).
But before we discuss these subclass-specific attributes, we’ll take a brief detour to introduce another feature of inheritance: shared instance attributes. Specifically, our simulation will need to know exactly when every event should happen, so every event object needs to keep track of its own time.
We have seen that an abstract superclass can declare methods that all its subclasses have in common, establishing a shared public interface. A superclass can also declare public instance attributes that its subclasses must have in common. For our Event class, we can establish that all event subclasses will have a timestamp indicating when the event took place. This timestamp attribute becomes part of the shared public interface of each subclass.
import datetime
class Event:
"""An abstract class representing an event in a food delivery simulation.
Instance Attributes:
- timestamp: the start time of the event
"""
timestamp: datetime.datetime

Even though abstract classes should not be instantiated directly, we provide an initializer to initialize the common attributes (namely, timestamp):
import datetime
class Event:
"""An abstract class representing an event in a food delivery simulation.
Instance Attributes:
- timestamp: the start time of the event
"""
timestamp: datetime.datetime
def __init__(self, timestamp: datetime.datetime) -> None:
"""Initialize this event with the given timestamp."""
self.timestamp = timestamp

Now let’s create a new class that inherits from Event:
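A first version of this subclass can be as simple as an empty class body, since everything is inherited from Event. The Event definition is repeated below so the sketch is self-contained:

```python
import datetime


class Event:
    """An abstract class representing an event in a food delivery simulation.

    Instance Attributes:
    - timestamp: the start time of the event
    """
    timestamp: datetime.datetime

    def __init__(self, timestamp: datetime.datetime) -> None:
        """Initialize this event with the given timestamp."""
        self.timestamp = timestamp


class NewOrderEvent(Event):
    """An event representing when a customer places an order at a restaurant.

    This first version defines no attributes or methods of its own, so it
    inherits everything, including __init__, from Event.
    """
```

Because NewOrderEvent does not override `__init__`, creating one requires the same timestamp argument that Event’s initializer expects, as the console session below demonstrates.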
Remember that subclasses will inherit all the methods from their superclass. So when we attempt to initialize a NewOrderEvent, the Python interpreter will call Event.__init__ (because NewOrderEvent did not override the parent’s __init__ method). This means we must provide a datetime.datetime object as the first argument when creating a new NewOrderEvent object:
>>> e = NewOrderEvent()
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'timestamp'
>>> e = NewOrderEvent(datetime.datetime(2020, 9, 8))
>>> e.timestamp
datetime.datetime(2020, 9, 8, 0, 0)

It is possible that subclasses need their own attributes in addition to the ones shared through the base class. In these scenarios, we should document the new attributes in the subclass itself. We often make these attributes private, to avoid changing the public interface declared by the abstract superclass. We do not need to repeat the documentation for the timestamp attribute; our expectation is that users should read the documentation of both the NewOrderEvent and Event classes to get the full picture of how NewOrderEvent is used.
class NewOrderEvent(Event):
"""An event representing when a customer places an order at a restaurant."""
# Private Instance Attributes:
# _order: the new order to be added to the FoodDeliverySystem
_order: Order

To initialize this new attribute, we must define a separate initializer for NewOrderEvent. Here is our first attempt:
class NewOrderEvent(Event):
"""An event representing when a customer places an order at a restaurant."""
# Private Instance Attributes:
# _order: the new order to be added to the FoodDeliverySystem
_order: Order
def __init__(self, order: Order) -> None:
"""Initialize a NewOrderEvent for the given order."""
self._order = order

This code looks correct, but it has a subtle bug. By defining our own initializer for NewOrderEvent, we have overridden the Event.__init__ method, so Python no longer calls Event.__init__ when creating a new NewOrderEvent object. This is a problem because subclasses inherit methods, not attributes. As a result, the public instance attribute timestamp is missing from our NewOrderEvent object:
>>> order = ... # Assume we've defined an Order object here
>>> event = NewOrderEvent(order)
>>> event.timestamp
Traceback (most recent call last):
File "<input>", line 1, in <module>
AttributeError: 'NewOrderEvent' object has no attribute 'timestamp'

So how do we make NewOrderEvent have both an _order and a timestamp attribute? We need to modify its initializer, since it is the responsibility of the initializer to give values to all instance attributes.
First, what should the value of the event’s timestamp be? A natural choice is that it should be the time that the order was placed—its start_time attribute. Here is our second attempt at the NewOrderEvent.__init__ method:
class NewOrderEvent(Event):
def __init__(self, order: Order) -> None:
self.timestamp = order.start_time
self._order = order

However, initializing the timestamp attribute directly in the subclass is bad design: the assignment duplicates code from Event.__init__, which is a code smell. Every time we modify the Event class to include new shared attributes, we’d also need to modify NewOrderEvent.__init__ (and the initializers of every other subclass) to initialize those attributes.
So instead, we modify NewOrderEvent.__init__ so that it directly calls Event.__init__. Remember that when we call a method using the <Class>.<method> name, we need to pass in the self argument explicitly. Here is our third and final version of this initializer:
class NewOrderEvent(Event):
"""An event where a customer places an order for a restaurant."""
_order: Order
def __init__(self, order: Order) -> None:
Event.__init__(self, order.start_time)
self._order = order

Now, whenever we call NewOrderEvent.__init__, Python also calls Event.__init__. This causes all shared instance attributes from Event to be “inherited” by the NewOrderEvent subclass.
To summarize, we must follow two rules when inheriting from a class that defines its own initializer:
1. The subclass initializer is responsible for initializing every instance attribute, both the new ones it introduces and the ones inherited from the superclass.
2. To initialize the inherited attributes, the subclass initializer should explicitly call the superclass initializer (e.g., Event.__init__(self, ...)) rather than assigning to those attributes directly.
NewOrderEvent.handle_event

Next, we’ll show how to complete the implementation of NewOrderEvent by implementing its handle_event method. Our first attempt is quite simple, taking advantage of the methods we defined in 10.3 A “Manager” Class.
class NewOrderEvent(Event):
"""An event where a customer places an order for a restaurant."""
_order: Order
def __init__(self, order: Order) -> None:
Event.__init__(self, order.start_time)
self._order = order
def handle_event(self, system: FoodDeliverySystem) -> None:
"""Mutate system by placing an order."""
system.place_order(self._order)

Now, there’s a subtle problem with this method that we’ll return to at the end of this section. A good exercise is to pause here and try to think about what the problem might be.
Event subclass

Below, we’ve shown the implementation of our CompleteOrderEvent, which is quite similar to NewOrderEvent. The major difference is that its initializer takes an explicit datetime.datetime argument to represent when the given order is completed. By convention, the timestamp parameter is the first parameter, so that the subsequent parameters are seen as additional parameters needed by CompleteOrderEvent rather than Event. This example shows that initializers of subclasses can have different signatures than the initializer of their parent class.
class CompleteOrderEvent(Event):
"""When an order is delivered to a customer by a courier."""
# Private Instance Attributes:
# _order: the order to be completed by this event
_order: Order
def __init__(self, timestamp: datetime.datetime, order: Order) -> None:
Event.__init__(self, timestamp)
self._order = order
def handle_event(self, system: FoodDeliverySystem) -> None:
"""Mutate the system by recording that the order has been delivered to the customer."""
system.complete_order(self._order, self.timestamp)

We started off this section by asking, “when are the FoodDeliverySystem methods called?” We said that our simulation would have Event instances responsible for calling these methods. But this really just changes the direction of our original question—it now becomes, “when are the Event instances created?”
One possible approach is to randomly create a whole set of events at the start of our simulation, and then process each of those events (in order of their timestamp). This approach works when the events are fairly simple and can be predictably generated all at once. However, one key feature of events in general is that processing one event can cause other events to occur. For example, when we process a NewOrderEvent, we expect that at some point in the future a corresponding CompleteOrderEvent will occur: once an order is placed, its delivery will eventually be completed. This doesn’t necessarily always happen in real life, but we’ll assume it does for the purposes of this case study.
To model this behaviour, we change the return type of handle_event from None to list[Event], where the return value is a list of the events caused by the current event.
class Event:
...
def handle_event(self, system: FoodDeliverySystem) -> list[Event]:
"""Mutate the given food delivery system to process this event.
Return a list of new events created by processing this event.
"""
raise NotImplementedError

Here’s how we might change NewOrderEvent to return a CompleteOrderEvent at some point in the future.
class NewOrderEvent(Event):
...
def handle_event(self, system: FoodDeliverySystem) -> list[Event]:
"""Mutate system by placing an order."""
system.place_order(self._order)
# Create a new CompleteOrderEvent. Right now the completion time is
# hard-coded as 10 minutes from the order creation.
# How could we make this more realistic by taking into account the
# positions of the courier, customer, and restaurant?
completion_time = self.timestamp + datetime.timedelta(minutes=10)
return [CompleteOrderEvent(completion_time, self._order)]

So for every NewOrderEvent that is handled by our simulation, a subsequent CompleteOrderEvent will be handled at some point in the future.
Now here’s where the problem we mentioned earlier comes in! Remember our docstring for FoodDeliverySystem.place_order: we cannot place an order if there are no available couriers! So what should this event do if system.place_order returns False? At the very least, in this case no CompleteOrderEvent should be returned.
One approach we might take is a polling technique, where we return a duplicate of the event to try again a little bit later. Here is our second version of this method:
class NewOrderEvent(Event):
...
def handle_event(self, system: FoodDeliverySystem) -> list[Event]:
"""Mutate system by placing an order."""
success = system.place_order(self._order)
if success:
completion_time = self.timestamp + datetime.timedelta(minutes=10)
return [CompleteOrderEvent(completion_time, self._order)]
else:
self._order.start_time = self.timestamp + datetime.timedelta(minutes=5)
return [NewOrderEvent(self._order)]

Our CompleteOrderEvent does not cause any new events to happen:
class CompleteOrderEvent(Event):
...
def handle_event(self, system: FoodDeliverySystem) -> list[Event]:
"""Mutate the system by recording that the order has been delivered to the customer."""
system.complete_order(self._order, self.timestamp)
return []

Lastly, we’ll sketch one new type of event that is more conceptual, but that illustrates the power of this Event interface. This event type will represent a random generation of new orders over a given time period, which we’ll use to drive our simulation.
class GenerateOrdersEvent(Event):
"""An event that causes a random generation of new orders.
Private Representation Invariants:
- self._duration > 0
"""
# Private Instance Attributes:
# - _duration: the number of hours to generate orders for
_duration: int
def __init__(self, timestamp: datetime.datetime, duration: int) -> None:
"""Initialize this event with timestamp and the duration in hours.
Preconditions:
- duration > 0
"""
def handle_event(self, system: FoodDeliverySystem) -> list[Event]:
"""Generate new orders for this event's timestamp and duration."""
events = []
while ...:
new_order_event = ... # Create a randomly-generated NewOrderEvent
events.append(new_order_event)
return events

We’ll discuss how we might implement this class in lecture, but it’s a good exercise to try to implement it yourself. There are many ways to randomly generate new events, so don’t be afraid to experiment!
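To make the shape of such an implementation concrete, here is one possible sketch using simplified stand-ins for Event and NewOrderEvent. The rate of one to five orders per hour is an arbitrary assumption for illustration:

```python
import datetime
import random


class Event:
    """Simplified stand-in for the abstract Event class."""

    def __init__(self, timestamp: datetime.datetime) -> None:
        self.timestamp = timestamp


class NewOrderEvent(Event):
    """Simplified stand-in; the real version also stores an Order."""


class GenerateOrdersEvent(Event):
    """An event that causes a random generation of new orders."""

    def __init__(self, timestamp: datetime.datetime, duration: int) -> None:
        """Initialize this event with timestamp and the duration in hours."""
        Event.__init__(self, timestamp)
        self._duration = duration

    def handle_event(self, system) -> list:
        """Generate new order events across this event's duration."""
        events = []
        for hour in range(self._duration):
            # Assumed rate: between 1 and 5 new orders per hour.
            for _ in range(random.randint(1, 5)):
                t = self.timestamp + datetime.timedelta(hours=hour)
                events.append(NewOrderEvent(t))
        return events
```

A more realistic version might vary the order rate by time of day, or pick random customers and restaurants from the system for each order.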
In this section, we focused only on defining individual Event classes to represent different events in our simulation. In the next section, we’ll put together everything we’ve covered up to this point to finally get a full simulation up and running, so keep reading!
Let’s put together all of the classes we’ve designed over the course of this chapter to create a full simulation of our food delivery system. In this section, we’ll first learn how the main simulation loop works. Then, we’ll turn our attention to the possible ways a simulation can be configured, and how to incorporate these configuration options into the public interface of a class.
Before we get to creating a full simulation class, we’ll discuss how our simulation works. The type of simulation we’re learning about is called a discrete-event simulation, because it is driven by individual events occurring at specified periods of time.
A discrete-event simulation runs as follows:
1. Initialize a collection of events, each with a timestamp (a priority queue, where earlier timestamps have higher priority).
2. Repeatedly remove the event with the earliest timestamp and process it. Processing an event may mutate the state of the system, and may generate new events, which are added to the collection.
3. Stop when the collection of events is empty.
The algorithm is remarkably simple, though it does rely on a slightly modified version of our priority queue implementation from Section 9.7. (In that section, we used ints to represent priority, while here we’re using datetime.datetime values.) Assuming we have such an implementation called EventQueueList, here is how we could write a simple function that runs this simulation loop:
def run_simulation(initial_events: list[Event], system: FoodDeliverySystem) -> None:
events = EventQueueList() # Initialize an empty priority queue of events
for event in initial_events:
events.enqueue(event)
# Repeatedly remove and process the next event
while not events.is_empty():
event = events.dequeue()
new_events = event.handle_event(system)
for new_event in new_events:
events.enqueue(new_event)

The main reason for this implementation’s simplicity is abstraction. Remember that Event is an abstract class; the complex behaviour of how different events are handled is deferred to its concrete subclasses via our calls to event.handle_event. Our run_simulation function is polymorphic: it works regardless of what Event instances it’s given in its initial_events parameter, or what new events are generated and stored in new_events. The only thing our function needs to be able to do is call the handle_event method on each event object, which we can assume is present because it is defined in the Event public interface.
Next, we will take the run_simulation function from above and “wrap” it inside a new class. This isn’t necessary for running the simulation, but it is standard practice in object-oriented design, and it makes it easier both to configure the simulation parameters and to report results after the simulation is complete.
We’re going to begin with a sketch of a class to represent our simulation:
class FoodDeliverySimulation:
"""A simulation of the food delivery system.
"""
# Private Instance Attributes:
# - _system: The FoodDeliverySystem instance that this simulation uses.
# - _events: A collection of the events to process during the simulation.
_system: FoodDeliverySystem
_events: EventQueueList
def __init__(self, start_time: datetime.datetime, num_days: int,
num_couriers: int, num_customers: int,
num_restaurants: int) -> None:
"""Initialize a new simulation with the given simulation parameters.
start_time: the starting time of the simulation
num_days: the number of days that the simulation runs
num_couriers: the number of couriers in the system
num_customers: the number of customers in the system
num_restaurants: the number of restaurants in the system
"""
self._events = EventQueueList()
self._system = FoodDeliverySystem('Simulated Food Delivery')  # the system requires a name
self._populate_initial_events(start_time, num_days)
self._generate_system(num_couriers, num_customers, num_restaurants)
def _populate_initial_events(self, start_time: datetime.datetime, num_days: int) -> None:
"""Populate this simulation's Event priority queue with GenerateOrdersEvents.
One new GenerateOrdersEvent is generated per day, starting with start_time and
repeating num_days times.
"""
def _generate_system(self, num_couriers: int, num_customers: int, num_restaurants: int) -> None:
"""Populate this simulation's FoodDeliverySystem with the specified number of entities.
"""
def run(self) -> None:
"""Run this simulation.
"""
while not self._events.is_empty():
event = self._events.dequeue()
new_events = event.handle_event(self._system)
for new_event in new_events:
self._events.enqueue(new_event)

There are a few key items to note in this (incomplete) implementation:
The run_simulation method has been renamed to simply run, since it’s a method in the FoodDeliverySimulation class.
The local variable events and parameter system from the function are now instance attributes for the FoodDeliverySimulation class, and have been moved out of the run method entirely. It’s the job of FoodDeliverySimulation.__init__ to initialize these objects.
The initializer takes in several parameters representing configuration values for the simulation. It then uses these values in two helper methods to initialize the _system and _events objects. These methods are marked private (named with a leading underscore) because they’re only meant to be called by the initializer, not by code outside of the class.
Here is how we could use the FoodDeliverySimulation class:
>>> simulation = FoodDeliverySimulation(datetime.datetime(2020, 11, 30), 7, 4, 100, 50)
>>> simulation.run()
Next, we’ll briefly discuss one way to implement each of the two key helper methods for the initializer, _populate_initial_events and _generate_system.
The key idea for our first helper method is that, given a start time and a number of days, our initial events will be a series of GenerateOrdersEvents that will generate NewOrderEvents when they are processed. Here is the basic skeleton, which we will leave as an exercise for you to complete:
def _populate_initial_events(self, start_time: datetime.datetime, num_days: int) -> None:
"""Populate this simulation's Event priority queue with GenerateOrdersEvents.
One new GenerateOrdersEvent is generated per day, starting with start_time and
repeating num_days times.
"""
for day in range(0, num_days):
# 1. Create a GenerateOrderEvent for the given day after the start time.
# 2. Enqueue the new event.

The way our simulation is currently set up, our FoodDeliverySystem instance will contain all restaurants, customers, and couriers before the events start being processed. That is, we assume that only orders are dynamic in our system; the restaurants, customers, and couriers do not change over time.
The easiest way to populate these three entity types is to randomly generate new instances of each of these classes. We’ve shown an example with Customers below.
def _generate_system(self, num_couriers: int, num_customers: int, num_restaurants: int) -> None:
"""Populate this simulation's FoodDeliverySystem with the specified number of entities.
"""
for i in range(0, num_customers):
location = _generate_location()
customer = Customer(f'Customer {i}', location)
self._system.add_customer(customer)
# Couriers and Restaurants are similar
...
# Outside the class: helper for generating random locations in Toronto
import random

TORONTO_COORDS = (43.747743, 43.691170, -79.633951, -79.176646)
def _generate_location() -> tuple[float, float]:
"""Return a randomly-generated location (latitude, longitude) within the Toronto bounds.
"""
return (random.uniform(TORONTO_COORDS[0], TORONTO_COORDS[1]),
random.uniform(TORONTO_COORDS[2], TORONTO_COORDS[3]))

After completing the implementation of these two helper methods, you are ready to run the simulation! Try doing the following in the Python console:
>>> simulation = FoodDeliverySimulation(datetime.datetime(2020, 11, 30), 7, 4, 100, 50)
simulation.run()

Of course, we aren’t printing anything out, and the FoodDeliverySimulation.run method doesn’t actually return anything. You are free to insert some print calls to see whether events are actually being processed, but that’s not the only way to see the results of the simulation.
Once the simulation is complete, self._system will have accumulated all of the completed orders, stored as a list[Order]. We can access these values and perform any kind of computation on them we want, just like we did all the way back in Chapter 4!
For example, we might ask: How many orders were placed in total? What was the average time between an order being placed and being completed? Which restaurants received the most orders?
Adapted from https://docs.python.org/3/library/functions.html. Note that not all built-in functions are shown.
| Built-in Function | Description |
|---|---|
abs(x) |
Return the absolute value of a number. The argument may be an integer or a floating point number. |
all(iterable) |
Return True if all elements of the iterable are true (or if the iterable is empty). |
any(iterable) |
Return True if any element of the iterable is true. If the iterable is empty, return False. |
chr(i) |
Return the string representing a character whose Unicode code point is the integer i. For example, chr(97) returns the string 'a', and chr(8364) returns '€'. The valid range for the argument is from 0 through 1,114,111. |
divmod(a, b) |
Take two (non-complex) numbers as arguments and return a pair of numbers consisting of their quotient and remainder when using integer division. For integers, the result is the same as (a // b, a % b). |
filter(function, iterable) |
Construct an iterator from those elements of iterable for which function returns True. iterable may be either a sequence, a container which supports iteration, or an iterator. |
id(object) |
Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. |
input([prompt]) |
If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that string. |
isinstance(object, classinfo) |
Return True if the object argument is an instance of the classinfo argument, or of a subclass thereof. If object is not an object of the given type, the function always returns False. |
len(s) |
Return the length (the number of items) of an object. |
max(iterable, *[, key, default]) or max(arg1, arg2, *args[, key]) |
Return the largest item in an iterable or the largest of two or more arguments. If one positional argument is provided, it should be an iterable. The largest item in the iterable is returned. If two or more positional arguments are provided, the largest of the positional arguments is returned. There are two optional keyword-only arguments. The key argument specifies a one-argument ordering function like that used for list.sort(). The default argument specifies an object to return if the provided iterable is empty. |
min(iterable, *[, key, default]) or min(arg1, arg2, *args[, key]) |
Return the smallest item in an iterable or the smallest of two or more arguments. If one positional argument is provided, it should be an iterable. The smallest item in the iterable is returned. If two or more positional arguments are provided, the smallest of the positional arguments is returned. There are two optional keyword-only arguments. The key argument specifies a one-argument ordering function like that used for list.sort(). The default argument specifies an object to return if the provided iterable is empty. |
open(file) |
Open file and return a corresponding file object. |
ord(c) |
Given a string representing one Unicode character, return an integer representing the Unicode code point of that character. For example, ord('a') returns the integer 97 and ord('€') (Euro sign) returns 8364. This is the inverse of chr(). |
pow(base, exp[, mod]) |
Return base to the power exp; if mod is present, return base to the power exp, modulo mod (computed more efficiently than pow(base, exp) % mod). The two-argument form pow(base, exp) is equivalent to using the power operator: base ** exp. |
print(*objects, sep=' ', end='\n') |
Print objects to standard output, separated by sep and followed by end. Both sep and end, if given, must be strings. |
reversed(seq) |
Return a reverse iterator. |
round(number[, ndigits]) |
Return number rounded to ndigits precision after the decimal point. If ndigits is omitted or is None, return the nearest integer to the input. For the built-in types supporting round(), values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice. Any integer value is valid for ndigits (positive, zero, or negative). The return value is an integer if ndigits is omitted or None. Otherwise the return value has the same type as number. Note: The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This is a result of the fact that most decimal fractions can't be represented exactly as a float. |
sorted(iterable, *, key=None, reverse=False) |
Return a new sorted list from the items in iterable. Has two optional arguments which must be specified as keyword arguments: key specifies a one-argument function used to extract a comparison key from each element, and reverse is a boolean that, if set to True, sorts the elements in descending order. |
sum(iterable[, start]) |
Sums start and the items of an iterable from left to right and returns the total. |
type(object) |
Return the type of an object. The return value is a type object, and is generally the same object as the object's class. |
Adapted from https://docs.python.org/3/library/stdtypes.html.
bool

Boolean values are the two constant objects False and True. They are used to represent truth values.
int, float

There are two distinct numeric types: integers and floating point numbers. Numbers are created by numeric literals or as the result of built-in functions and operators. Unadorned integer literals yield integers. Numeric literals containing a decimal point or an exponent sign yield floating point numbers.
Python fully supports mixed arithmetic: when a binary arithmetic operator has operands of different numeric types, the operand with the “narrower” type is widened to that of the other, where integer is narrower than floating point. Comparisons between numbers of mixed type use the same rule.
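For example, widening applies whenever an int and a float appear in the same operation:

```python
# An int operand is widened to float when combined with a float.
total = 3 + 0.5
assert total == 3.5
assert isinstance(total, float)

# Comparisons between mixed numeric types follow the same rule.
assert 2 == 2.0
assert 1 < 1.5
```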
All numeric types support the following operations (for priorities of the operations, see Operator precedence):
| Operation | Description |
|---|---|
x + y |
Returns the sum of x and y. |
x - y |
Returns the difference of x and y. |
x * y |
Returns the product of x and y. |
x / y |
Returns the quotient of x and y (always a float). |
x // y |
Returns the floored quotient of x and y (the quotient rounded down to the nearest integer). |
x % y |
Returns the remainder of x / y. |
x ** y |
Returns x to the power of y. |
-x |
Returns x negated. |
int(x) |
Returns x converted to an integer (truncating toward zero for floats). |
float(x) |
Returns x converted to a floating point number. |
math.floor(x) |
Returns the greatest integer less than or equal to x. |
math.ceil(x) |
Returns the least integer greater than or equal to x. |
See also the built-in functions abs, divmod, pow, and round.
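Here is a quick sketch of these operations in action; note in particular that // floors toward negative infinity, while int() truncates toward zero:

```python
import math

print(7 / 2)       # true division -> 3.5
print(7 // 2)      # floored quotient -> 3
print(-7 // 2)     # floors toward negative infinity -> -4
print(7 % 2)       # remainder -> 1
print(2 ** 10)     # exponentiation -> 1024
print(int(3.9))    # truncates toward zero -> 3
print(math.floor(3.9), math.ceil(3.1))  # -> 3 4
```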
## str, list, tuple

The operations in the following table are supported by most sequence types, both mutable and immutable.
| Operation | Description |
|---|---|
| x in s | Returns True if an item of s is equal to x, else False. |
| x not in s | Returns False if an item of s is equal to x, else True. |
| s + t | Returns the concatenation of s and t. |
| s * n or n * s | Returns the equivalent to adding s to itself n times. |
| s[i] | Returns the ith item of s (indices start at 0). |
| s[i:j] | Returns the slice of s from index i to index j (not including index j). |
| s[i:j:k] | Returns the slice of s from index i to index j (not including index j), with step k. |
| s.index(x[, i[, j]]) | Returns the index of the first occurrence of x in s (at or after index i and before index j, if these are given). |
| s.count(x) | Returns the total number of occurrences of x in s. |
See also the built-in functions len, max, and min.
Sequences of the same type also support comparisons. In particular, tuples and lists are compared lexicographically by comparing corresponding elements. This means that to compare equal, every element must compare equal and the two sequences must be of the same type and have the same length.
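A brief illustration of the common sequence operations and lexicographic comparison (the example values are arbitrary):

```python
s = [10, 20, 30, 40]

print(20 in s)       # membership -> True
print(s + [50])      # concatenation -> [10, 20, 30, 40, 50]
print([0] * 3)       # repetition -> [0, 0, 0]
print(s[1])          # indexing -> 20
print(s[1:3])        # slicing -> [20, 30]
print(s[::2])        # every other item -> [10, 30]
print(s.index(30))   # -> 2

# Lexicographic comparison: corresponding elements are compared in order.
print([1, 2, 3] < [1, 2, 4])   # True, since 3 < 4
print((1, 2) < (1, 2, 0))      # True, since (1, 2) is a proper prefix
```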
## list

The list data type supports all of the immutable sequence operations from the previous section, as well as the following operations.
| Operation | Description |
|---|---|
| s[i] = x | Sets the item at index i of s to x (mutates s). |
| list.append(self, x) | Appends x to the end of self (mutates self). |
| list.extend(self, t) or self += t | Extends self with the contents of t (mutates self). |
| list.insert(self, i, x) | Inserts x into self at the index given by i (mutates self). |
| list.pop(self[, i]) | Returns the item at index i and removes it from self (mutates self). If i is omitted, removes and returns the last item. |
| list.remove(self, x) | Removes the first occurrence of x from self (mutates self). |
| list.reverse(self) | Reverses the items of self in place (mutates self). |
| list.sort(self) | Sorts the items of self in place (mutates self). The sort is stable: items that compare equal keep their relative order. |
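A short example of these mutating operations on a list (the values are arbitrary):

```python
lst = [3, 1, 2]
lst.append(4)        # lst is now [3, 1, 2, 4]
lst.extend([5, 6])   # lst is now [3, 1, 2, 4, 5, 6]
lst.insert(0, 0)     # lst is now [0, 3, 1, 2, 4, 5, 6]
last = lst.pop()     # removes and returns the last item, 6
lst.remove(3)        # removes the first occurrence of 3
lst.sort()           # sorts lst in place
print(last)          # -> 6
print(lst)           # -> [0, 1, 2, 4, 5]
```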
## str

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences.
Triple quoted strings may span multiple lines—all associated whitespace will be included in the string literal.
Strings may also be created from other objects using the str constructor.
Since there is no separate “character” type, indexing a string produces strings of length 1. That is, for a non-empty string s, s[0] == s[0:1]. Strings implement all of the common sequence operations, along with the additional methods described below.
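A quick illustration of strings as immutable sequences (the example string is arbitrary):

```python
s = 'David'

# Indexing produces a string of length 1; there is no separate character type.
print(s[0])            # -> D
print(s[0] == s[0:1])  # -> True
print(s[1:4])          # slicing works as for any sequence -> avi
print('avi' in s)      # for strings, `in` tests for substrings -> True
print(s + '!')         # concatenation -> David!
print(len(s))          # -> 5
```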
## set

A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
Like other collections, sets support x in set, len(set), and for x in set. Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.
The set type is mutable—the contents can be changed using methods like add() and remove(). Since it is mutable, it has no hash value and cannot be used as either a dictionary key or as an element of another set.
Non-empty sets can be created by placing a comma-separated list of elements within braces, for example: {'jack', 'sjoerd'}, in addition to the set constructor.
| Operation | Description |
|---|---|
| len(self) | Return the size (number of elements) of self. |
| x in self | Return whether x is in self. |
| x not in self | Return whether x is not in self. |
| set.isdisjoint(self, other) | Return whether the set self has no elements in common with other. Sets are disjoint if and only if their intersection is the empty set. |
| set.issubset(self, other) | Return whether every element in the set self is in other. Can also use self <= other. |
| self < other | Return whether the set self is a proper subset of other, that is, self <= other and self != other. |
| set.issuperset(self, other) | Return whether every element in other is in the set self. Can also use self >= other. |
| self > other | Return whether the set self is a proper superset of other, that is, self >= other and self != other. |
| set.union(self, *others) | Return a new set with elements from the set and all others. |
| set.intersection(self, *others) | Return a new set with elements common to the set and all others. |
| set.difference(self, *others) | Return a new set with elements in the set that are not in the others. |
| set.symmetric_difference(self, other) | Return a new set with elements in either the set or other but not both. |
| set.update(self, *others) | Update the set, adding elements from all others. |
| set.intersection_update(self, *others) | Update the set, keeping only elements found in it and all others. |
| set.difference_update(self, *others) | Update the set, removing elements found in others. |
| set.symmetric_difference_update(self, other) | Update the set, keeping only elements found in either set, but not in both. |
| set.add(self, elem) | Add element elem to the set. |
| set.remove(self, elem) | Remove element elem from the set. Raises KeyError if elem is not contained in the set. |
| set.discard(self, elem) | Remove element elem from the set if it is present. |
| set.pop(self) | Remove and return an arbitrary element from the set. Raises KeyError if the set is empty. |
set supports set to set comparisons. Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other). A set is less than another set if and only if the first set is a proper subset of the second set (is a subset, but is not equal). A set is greater than another set if and only if the first set is a proper superset of the second set (is a superset, but is not equal).
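A short sketch of these set operations and comparisons (the example sets are arbitrary; we print comparisons rather than the sets themselves, since set printing order is not guaranteed):

```python
a = {1, 2, 3}
b = {3, 4}

print(a.union(b) == {1, 2, 3, 4})              # -> True
print(a.intersection(b) == {3})                # -> True
print(a.difference(b) == {1, 2})               # -> True
print(a.symmetric_difference(b) == {1, 2, 4})  # -> True
print({1, 2} < a)                              # proper subset -> True
a.add(4)
a.discard(99)   # no error, even though 99 is not in the set
print(a == {1, 2, 3, 4})                       # -> True
```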
## dict

A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is currently only one standard mapping type, the dictionary.
Dictionaries can be created by placing a comma-separated list of key: value pairs within braces, for example: {'jack': 4098, 'sjoerd': 4127} or {4098: 'jack', 4127: 'sjoerd'}, or by the dict constructor.
These are the operations that dictionaries support (and therefore, custom mapping types should support too):
| Operation | Description |
|---|---|
| list(d) | Return a list of all the keys used in the dictionary d. |
| len(d) | Return the number of items in the dictionary d. |
| d[key] | Return the item of d with key key. Raises a KeyError if key is not in the map. |
| d[key] = value | Set d[key] to value. |
| key in d | Return True if d has a key key, else False. |
| key not in d | Equivalent to not key in d. |
| dict.get(self, key[, default]) | Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError. |
| dict.items(self) | Return a new view of the dictionary’s items ((key, value) pairs). |
| dict.pop(self, key[, default]) | If key is in the dictionary, remove it and return its value, else return default. If default is not given and key is not in the dictionary, a KeyError is raised. |
| dict.popitem(self) | Remove and return a (key, value) pair from the dictionary. Pairs are returned in last-in-first-out (LIFO) order. popitem() is useful to destructively iterate over a dictionary, as often used in set algorithms. If the dictionary is empty, calling popitem() raises a KeyError. |
| dict.setdefault(self, key[, default]) | If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None. |
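A brief illustration of the get, pop, and setdefault methods (the example dictionary is arbitrary):

```python
d = {'a': 1, 'b': 2}

print(d.get('a'))            # -> 1
print(d.get('z'))            # missing key -> None (no KeyError)
print(d.get('z', 0))         # missing key with a default -> 0
print(d.pop('b'))            # removes 'b' and returns 2
print(d.setdefault('c', 3))  # inserts 'c': 3 and returns 3
print(d)                     # -> {'a': 1, 'c': 3}
```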
Dictionaries compare equal if and only if they have the same (key, value) pairs (regardless of ordering). Order comparisons (<, <=, >=, >) raise TypeError.
Dictionaries preserve insertion order. Note that updating a key does not affect the order. Keys added after deletion are inserted at the end.
>>> d = {"one": 1, "two": 2, "three": 3, "four": 4}
>>> d
{'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> list(d)
['one', 'two', 'three', 'four']
>>> list(d.values())
[1, 2, 3, 4]
>>> d["one"] = 42
>>> d
{'one': 42, 'two': 2, 'three': 3, 'four': 4}
>>> del d["two"]
>>> d["two"] = None
>>> d
{'one': 42, 'three': 3, 'four': 4, 'two': None}

## range

The range type represents an immutable sequence of numbers and is commonly used for looping a specific number of times in for loops.
Constructor: range(stop) or range(start, stop[, step]).
The arguments to the range constructor must be integers. If the step argument is omitted, it defaults to 1. If the start argument is omitted, it defaults to 0. If step is zero, ValueError is raised.
For a positive step, the contents of a range r are determined by the formula r[i] = start + step*i where i >= 0 and r[i] < stop.
For a negative step, the contents of the range are still determined by the formula r[i] = start + step*i, but the constraints are i >= 0 and r[i] > stop.
Range examples:
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(range(0, 30, 5))
[0, 5, 10, 15, 20, 25]
>>> list(range(0, 10, 3))
[0, 3, 6, 9]
>>> list(range(0, -10, -1))
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
>>> list(range(0))
[]
>>> list(range(1, 0))
[]

## None

This object is returned by functions that don’t explicitly return a value. It supports no special operations. There is exactly one null object, named None (a built-in name).
## Special methods

Adapted from https://docs.python.org/3/reference/datamodel.html#special-method-names. Note that not all special methods are shown.
A class can implement certain operations that are invoked by special syntax (such as arithmetic operations or subscripting and slicing) by defining methods with special names. This is Python’s approach to operator overloading, allowing classes to define their own behavior with respect to language operators. For instance, if a class defines a method named __getitem__(), and x is an instance of this class, then x[i] is roughly equivalent to type(x).__getitem__(x, i).
| Method | Description |
|---|---|
| object.__init__(self[, ...]) | Called after the instance has been created, but before it is returned to the caller. The arguments are those passed to the class constructor expression. If a base class has an __init__() method, the derived class’s __init__() method, if any, must explicitly call it to ensure proper initialization of the base class part of the instance. |
| object.__str__(self) | Called by str(object) and the built-in functions format() and print() to compute the “informal” or nicely printable string representation of an object. The return value must be a string object. |
| object.__lt__(self, other), object.__le__(self, other), object.__eq__(self, other), object.__ne__(self, other), object.__gt__(self, other), object.__ge__(self, other) | These are the so-called “rich comparison” methods. The correspondence between operator symbols and method names is as follows: x < y calls x.__lt__(y), x <= y calls x.__le__(y), x == y calls x.__eq__(y), x != y calls x.__ne__(y), x > y calls x.__gt__(y), and x >= y calls x.__ge__(y). |
The following methods can be defined to implement container objects. Containers usually are sequences (such as lists or tuples) or mappings (like dictionaries), but can represent other containers as well.
| Method | Description |
|---|---|
| object.__len__(self) | Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. |
| object.__getitem__(self, key) | Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if key is a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. |
| object.__setitem__(self, key, value) | Called to implement assignment to self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support changes to the values for keys, or if new keys can be added, or for sequences if elements can be replaced. The same exceptions should be raised for improper key values as for the __getitem__() method. |
| object.__contains__(self, item) | Called to implement membership test operators (in and not in). Should return True if item is in self, False otherwise. For mapping objects, this should consider the keys of the mapping rather than the values or the key-item pairs. |
| object.__iter__(self) | This method is called when an iterator is required for a container. This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container. |
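As a sketch, here is a small hypothetical Inventory class (not from the table above) that implements several of these special methods:

```python
class Inventory:
    """A simple container mapping item names to quantities.

    A hypothetical example used to illustrate special methods.
    """

    def __init__(self, items: dict) -> None:
        self._items = dict(items)

    def __str__(self) -> str:
        return f'Inventory with {len(self._items)} item(s)'

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, key: str) -> int:
        return self._items[key]

    def __setitem__(self, key: str, value: int) -> None:
        self._items[key] = value

    def __contains__(self, item: str) -> bool:
        return item in self._items

    def __iter__(self):
        return iter(self._items)


inv = Inventory({'apple': 3})
inv['pear'] = 5        # calls __setitem__
print(inv['apple'])    # calls __getitem__ -> 3
print('pear' in inv)   # calls __contains__ -> True
print(len(inv))        # calls __len__ -> 2
print(str(inv))        # calls __str__ -> Inventory with 2 item(s)
print(list(inv))       # calls __iter__ -> ['apple', 'pear']
```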
## Exceptions

Adapted from https://docs.python.org/3/library/exceptions.html. Note: not all built-in Python exceptions are shown.
| Exception | Description |
|---|---|
| AssertionError | Raised when an assert statement fails. |
| AttributeError | Raised when an attribute reference or assignment fails. (When an object does not support attribute references or attribute assignments at all, TypeError is raised.) |
| FileNotFoundError | Raised when a file or directory is requested but doesn’t exist. |
| ImportError | Raised when the import statement has troubles trying to load a module. Also raised when the “from list” in from ... import has a name that cannot be found. |
| ModuleNotFoundError | A subclass of ImportError which is raised by import when a module could not be located. |
| IndexError | Raised when a sequence subscript is out of range. (Slice indices are silently truncated to fall in the allowed range; if an index is not an integer, TypeError is raised.) |
| KeyError | Raised when a mapping (dictionary) key is not found in the set of existing keys. |
| NameError | Raised when a local or global name is not found. |
| NotImplementedError | In user-defined base classes, abstract methods should raise this exception when they require derived classes to override the method, or while the class is being developed to indicate that the real implementation still needs to be added. |
| RecursionError | Raised when the interpreter detects that the maximum recursion depth (see sys.getrecursionlimit()) is exceeded. |
| SyntaxError | Raised when the parser encounters a syntax error. This may occur in an import statement, in a call to the built-in functions exec() or eval(), or when reading the initial script or standard input (also interactively). |
| IndentationError | Base class for syntax errors related to incorrect indentation. |
| TabError | Raised when indentation contains an inconsistent use of tabs and spaces. |
| TypeError | Raised when an operation or function is applied to an object of inappropriate type. The associated value is a string giving details about the type mismatch. This exception may be raised by user code to indicate that an attempted operation on an object is not supported, and is not meant to be. If an object is meant to support a given operation but has not yet provided an implementation, NotImplementedError is the proper exception to raise. Passing arguments of the wrong type (e.g., passing a list when an int is expected) should result in a TypeError. |
| ValueError | Raised when an operation or function receives an argument that has the right type but an inappropriate value, and the situation is not described by a more precise exception such as IndexError. |
| ZeroDivisionError | Raised when the second argument of a division or modulo operation is zero. The associated value is a string indicating the type of the operands and the operation. |
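As an illustration, here is a small hypothetical function that uses a try/except statement to handle one of these exceptions:

```python
def safe_divide(x: float, y: float) -> float:
    """Return x / y, or float('inf') if y is zero.

    A hypothetical example of catching ZeroDivisionError.
    """
    try:
        return x / y
    except ZeroDivisionError:
        return float('inf')


print(safe_divide(10, 4))  # -> 2.5
print(safe_divide(1, 0))   # the ZeroDivisionError is caught -> inf
```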
## doctest

Adapted from https://docs.python.org/3.8/library/doctest.html.
The doctest module searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown.
Here’s a simple standalone example:
def is_even(value: int) -> bool:
    """Return whether value is divisible by 2.

    >>> is_even(2)
    True
    >>> is_even(17)
    False
    """
    return value % 2 == 0

The simplest way to start using doctest is to end each module with:
doctest then examines docstrings in the module.
Running the module as a script causes the examples in the docstrings to get executed and verified.
This won’t display anything unless an example fails, in which case the failing example(s) and the cause(s) of the failure(s) are printed, and the final line of output is ***Test Failed*** N failures., where N is the number of examples that failed.
You can force verbose mode by passing verbose=True to testmod(). In this case, a detailed report of all examples tried is printed to standard output, along with assorted summaries at the end.
This section examines in detail how doctest works: which docstrings it looks at, how it finds interactive examples, and how it handles exceptions. This is the information that you need to know to write doctest examples; for information about actually running doctest on these examples, see the following sections.
The module docstring, and all function, class and method docstrings are searched. Objects imported into the module are not searched.
In most cases a copy-and-paste of an interactive console session works fine, but doctest isn’t trying to do an exact emulation of any specific Python shell.
>>> # comments are ignored
>>> x = 12
>>> x
12
>>> if x == 13:
...     print("yes")
... else:
...     print("no")
...     print("NO")
...     print("NO!!!")
...
no
NO
NO!!!
>>>

Any expected output must immediately follow the final '>>> ' or '... ' line containing the code, and the expected output (if any) extends to the next '>>> ' or all-whitespace line.
Notes:
Expected output cannot contain an all-whitespace line, since such a line is taken to signal the end of expected output. If expected output does contain a blank line, put <BLANKLINE> in your doctest example each place a blank line is expected.
This is an incorrect example because the prompt characters (i.e., >>>) are missing:

is_even(2)
True

This is an incorrect example because there is no space between the >>> and the function call:

>>>is_even(2)
True

This is an incorrect example because the result of the function call (True) is not included:

>>> is_even(2)

This is an incorrect example because the result of the function call (True) is indented:

>>> is_even(2)
 True
The expected output for an exception must start with a traceback header, which may be either of the following two lines, indented the same as the first line of the example:
The traceback header is followed by an optional traceback stack, whose contents are ignored by doctest. The traceback stack is typically omitted, or copied verbatim from an interactive session.
The traceback stack is followed by the most interesting part: the line(s) containing the exception type and detail. This is usually the last line of a traceback, but can extend across multiple lines if the exception has a multi-line detail:
"""
>>> 1 + 'hi'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
"""

Best practice is to omit the traceback stack, unless it adds significant documentation value to the example. So the last example is probably better as:
"""
>>> 1 + 'hi'
Traceback (most recent call last):
TypeError: unsupported operand type(s) for +: 'int' and 'str'
"""

doctest is serious about requiring exact matches in expected output. If even a single character doesn’t match, the test fails. This will probably surprise you a few times, as you learn exactly what Python does and doesn’t guarantee about output. For example, when printing a set, Python doesn’t guarantee that the element is printed in any particular order, so a test like

>>> foo()
{'Hermione', 'Harry'}

is vulnerable! One workaround is to do

>>> foo() == {'Hermione', 'Harry'}
True

instead. Another is to do

>>> d = sorted(foo())
>>> d
['Harry', 'Hermione']
As mentioned in the introduction, doctest has grown to have three primary uses:

1. Checking examples in docstrings.
2. Regression testing.
3. Executable documentation / literate testing.
These uses have different requirements, and it is important to distinguish them. In particular, filling your docstrings with obscure test cases makes for bad documentation.
When writing a docstring, choose docstring examples with care. There’s an art to this that needs to be learned—it may not be natural at first. Examples should add genuine value to the documentation. A good example can often be worth many words. If done with care, the examples will be invaluable for your users, and will pay back the time it takes to collect them many times over as the years go by and things change. We’re still amazed at how often one of our doctest examples stops working after a “harmless” change.
## pytest

pytest is a Python library used to run tests for your code. In this section, we’ll describe how to write tests that are automatically discovered and run by pytest, how to actually run pytest on your code, and some tips and tricks for making the most of pytest.
### What is a pytest test?

A test in pytest is a Python function whose name starts with test_. Inside a test function, we use assert statements to verify expected values or behaviours of a function.
For example:
# This is the function to test
def has_more_trues(booleans: list) -> bool:
    """Return whether booleans contains more True values than False values.

    >>> has_more_trues([True, False, True])
    True
    >>> has_more_trues([True, False, False])
    False
    """
    # Function body omitted


# This is the test
def test_mixture_one_more_true() -> None:
    """Test has_more_trues on a list with a mixture of True and False,
    with one more True than False.
    """
    assert has_more_trues([True, False, True])

A single test can have multiple assert statements, although it is generally recommended to separate each assert statement into a separate test. A single Python file can have multiple tests; when pytest is run on a file, it (by default) runs all the tests in that file.
### Running pytest

The simplest way of running pytest is to add the following if __name__ == '__main__' block to the bottom of a test file:
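As a sketch, that block looks like the following:

```python
# A sketch: run pytest when this test file is run directly.
if __name__ == '__main__':
    import pytest
    pytest.main()
```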
When you run this file, pytest.main will run all test functions in the file. Note: by default, pytest.main actually searches through all Python files in the current directory whose name starts with test_ or ends with _test, which can be a bit surprising. So our practice will be to explicitly pass in the name of the current test file to pytest.main, wrapped in a list:
# If we're in a file test_my_file.py
if __name__ == '__main__':
    import pytest
    pytest.main(['test_my_file.py'])

It is possible to write a pytest test that checks whether a function raises a specific error. To do so, use pytest.raises, which takes an error type as an argument, inside a with statement. Here is an example:
import pytest


def add_one(n: int) -> int:
    return n + 1


def test_add_one_type_error() -> None:
    """Test add_one when given a non-numeric argument."""
    with pytest.raises(TypeError):
        add_one('hello')

### pytest.main

pytest.main takes a list of strings as an argument because users can add options (as strings) to modify pytest’s default behaviour when running tests. The format for this is pytest.main([<option1>, <option2>, ...]).
Here are some useful options:
- '<filename>': as we saw above, adding a filename restricts pytest to only running the tests in that Python file.
- '<filename>::<test_name>': restrict pytest to run a specific test in the given file (e.g., 'test_my_file.py::test_1').
- '-x': stop running tests after the first failure (by default, pytest runs all tests, regardless of the number of failures).
- '--pdb': start the Python debugger when a test fails.

For the full documentation for the pytest library, check out https://docs.pytest.org/en/latest/.
## python_ta

PythonTA is a Python program that analyzes Python code to help students find and fix common coding and style errors. Unlike testing libraries like doctest or pytest, PythonTA does not actually run your code. Instead, it analyzes the program text directly, looking for common patterns of code that often lead to errors. PyCharm does something very similar, which is why you’ll see red or yellow highlighted text in your Python files as you’re working, before running the file.
To run PythonTA on a Python file, put the following code at the bottom of the file you want to check:
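The exact code isn’t shown above; a typical sketch, assuming PythonTA’s python_ta.check_all entry point, is:

```python
# A sketch, assuming PythonTA is installed. check_all analyzes this file
# and generates an HTML report of the errors it finds.
if __name__ == '__main__':
    import python_ta
    python_ta.check_all()
```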
When you run this file, you’ll see a report open up in your web browser that shows any errors that PythonTA detected. These errors are divided into two broad categories:
We recommend running PythonTA regularly as you’re working on an assignment, as it can be a useful way to check your work and improve the quality of your code. If you’re ever stuck, try taking a break and running PythonTA and fixing any errors it finds for you! This is a way to develop good programming habits and style, which will come in handy in this course (and all future courses).
When you run PythonTA, it generates a new report file called pyta_report.html in the same folder as the file you’re checking. After you’re done running PythonTA, you can safely delete this report file.
## typing

Reference: https://docs.python.org/3.9/library/typing.html.
| Type | Description |
|---|---|
| Any | A value that could be of any type. (Used as a placeholder when a variable’s type could be anything, or is unknown.) |
| Callable[[T1, T2, ...], Tr] | A function whose parameters have types T1, T2, ..., and whose return type is Tr. Example: a function with header def f(x: int, y: str) -> bool has type Callable[[int, str], bool]. |
| dict[T1, T2] | A dictionary whose keys have type T1 and whose values have type T2. Example: {'a': 1, 'b': 2} has type dict[str, int]. |
| list[T] | A list whose elements all have type T. Example: [1, 2, 3] has type list[int]. |
| Optional[T] | Synonym of Union[T, None]. |
| set[T] | A set whose elements all have type T. Example: {1, 2, 3} has type set[int]. |
| tuple[T1, T2, ...] | A tuple whose first element has type T1, whose second element has type T2, etc. Example: (1, 'hi', True) has type tuple[int, str, bool]. |
| Union[T1, T2, ...] | A value whose type is one of T1, T2, .... Example: both 1 and 'hi' have type Union[int, str]. |
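As a short illustration, here are two hypothetical functions annotated with several of these types:

```python
from typing import Optional, Union


def first_match(items: list[str], prefix: str) -> Optional[str]:
    """Return the first string in items that starts with prefix, or None if there is none."""
    for item in items:
        if item.startswith(prefix):
            return item
    return None


def type_name(value: Union[int, str]) -> str:
    """Return the name of the type of value."""
    return type(value).__name__


print(first_match(['cat', 'dog', 'dove'], 'do'))  # -> dog
print(first_match(['cat'], 'z'))                  # -> None
print(type_name(42))                              # -> int
```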
## pdb (Python Debugger)

Adapted from https://docs.python.org/3/library/pdb.html.
The module pdb defines an interactive source code debugger for Python programs.
The typical usage to break into the debugger from a running program is to insert

import pdb; pdb.set_trace()

at the location you want to break into the debugger. You can then step through the code following this statement, and continue running without the debugger using the continue command.
The commands recognized by the debugger are listed below. Most commands can be abbreviated to one or two letters as indicated; e.g. h(elp) means that either h or help can be used to enter the help command (but not he or hel, nor H or Help or HELP). Arguments to commands must be separated by whitespace (spaces or tabs).
Entering a blank line repeats the last command entered. Exception: if the last command was a list command, the next 11 lines are listed.
Commands that the debugger doesn’t recognize are assumed to be Python statements and are executed in the context of the program being debugged. This is a powerful way to inspect the program being debugged; it is even possible to change a variable or call a function. When an exception occurs in such a statement, the exception name is printed but the debugger’s state is not changed.
| Command | Description |
|---|---|
| a(rgs) | Print the argument list of the current function. |
| c(ont(inue)) | Continue execution, only stop when a breakpoint is encountered. |
| h(elp) | Without argument, print the list of available commands. With a command as an argument, print help about that command. |
| l(ist) | List source code for the current file. Without arguments, list 11 lines around the current line or continue the previous listing. The current line in the current frame is indicated by ->. |
| ll | List all source code for the current function or frame. (Short for “long list”.) |
| n(ext) | Continue execution until the next line in the current function is reached or it returns. (The difference between next and step is that step stops inside a called function, while next executes called functions at (nearly) full speed, only stopping at the next line in the current function.) |
| r(eturn) | Continue execution until the current function returns. |
| s(tep) | Execute the current line, stop at the first possible occasion (either in a function that is called or on the next line in the current function.) |
When performing calculations, we’ll often end up writing sums of terms, where each term follows a pattern. For example: \[\frac{1 + 1^2}{3 + 1} + \frac{2 + 2^2}{3 + 2} + \frac{3 + 3^2}{3 + 3} + \cdots + \frac{100 + 100^2}{3 + 100}\]
We will often use summation notation to express such sums concisely. We could rewrite the previous example simply as: \[\sum_{i=1}^{100} \frac{i + i^2}{3 + i}.\]
In this example, \(i\) is called the index of summation, and \(1\) and \(100\) are the lower and upper bounds of the summation, respectively. A bit more generally, for any pair of integers \(j\) and \(k\), and any function \(f : \Z \to \R\), we can use summation notation in the following way: \[\sum_{i=j}^k f(i) = f(j) + f(j+1) + f(j+2) + \dots + f(k).\]
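To connect the notation to computation, here is an illustrative Python check that the summation notation and the expanded sum describe the same value:

```python
# The sum over i = 1 to 100 of (i + i^2) / (3 + i), written with sum() and a
# generator over the index of summation i.
total = sum((i + i ** 2) / (3 + i) for i in range(1, 101))

# The same value as an explicit left-to-right accumulation of the terms.
acc = 0.0
for i in range(1, 101):
    acc += (i + i ** 2) / (3 + i)

assert acc == total
print(total)
```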
We can similarly use product notation to abbreviate multiplication: \[\prod_{i=j}^k f(i) = f(j) \times f(j+1) \times f(j+2) \times \dots \times f(k).\] (Fun fact: the Greek letter \(\Sigma\) (sigma) corresponds to the first letter of “sum,” and the Greek letter \(\Pi\) (pi) corresponds to the first letter of “product.”)
It is sometimes useful (e.g., in certain formulas) to allow a summation or product’s lower bound to be greater than its upper bound. In this case, we say the summation or product is empty, and define their values as follows: \[\sum_{i=j}^k f(i) = 0 \quad\text{and}\quad \prod_{i=j}^k f(i) = 1, \qquad \text{when } j > k.\] These particular values are chosen so that adding an empty summation and multiplying by an empty product do not change the value of an expression.
Finally, we’ll end off this section with a few common summation formulas, and a few laws governing how expressions using summation and product notation can be simplified.
For all \(n \in \N\), the following formulas hold: \[\sum_{i=0}^{n} i = \frac{n(n+1)}{2} \qquad\text{and}\qquad \sum_{i=0}^{n} r^i = \frac{r^{n+1} - 1}{r - 1} \text{ (for } r \neq 1\text{)}.\]
For all \(m, n \in \Z\), the following formulas hold:
\(\sum_{i=m}^{n} (a_i + b_i) = \left( \sum_{i=m}^{n} a_i \right) + \left(\sum_{i=m}^{n} b_i \right)\) (separating sums)
\(\prod_{i=m}^{n} (a_i \cdot b_i) = \left( \prod_{i=m}^{n} a_i \right) \cdot \left (\prod_{i=m}^{n} b_i \right)\) (separating products)
\(\sum_{i=m}^{n} c \cdot a_i = c \cdot \left( \sum_{i=m}^{n} a_i \right)\) (factoring out constants, sums)
\(\prod_{i=m}^{n} c \cdot a_i = c^{n - m + 1} \cdot \left( \prod_{i=m}^{n} a_i \right)\) (factoring out constants, products)
\(\sum_{i=m}^{n} a_i = \sum_{i'=0}^{n-m} a_{i'+m}\) (change of index \(i' = i - m\))
\(\prod_{i=m}^{n} a_i = \prod_{i'=0}^{n-m} a_{i'+m}\) (change of index \(i' = i - m\))
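These laws can be spot-checked numerically; here is an illustrative Python sketch using arbitrarily chosen sequences:

```python
# Spot-check two of the laws above with the sequences a_i = i^2, b_i = 3i + 1.
m, n = 2, 7
a = [i ** 2 for i in range(n + 1)]
b = [3 * i + 1 for i in range(n + 1)]

# Separating sums: sum of (a_i + b_i) equals (sum of a_i) + (sum of b_i).
lhs = sum(a[i] + b[i] for i in range(m, n + 1))
rhs = sum(a[i] for i in range(m, n + 1)) + sum(b[i] for i in range(m, n + 1))
assert lhs == rhs

# Change of index i' = i - m: both sums range over the same terms.
assert sum(a[i] for i in range(m, n + 1)) == \
    sum(a[ip + m] for ip in range(0, n - m + 1))
```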
In this course we will deal heavily with the manipulation of inequalities. While many of these operations are very similar to manipulating equalities, there are enough differences to warrant a comprehensive list.
(Arithmetic manipulations) For all real numbers \(a\), \(b\), and \(c\), the following are true:

- If \(a \leq b\), then \(a + c \leq b + c\).
- If \(a \leq b\) and \(c \geq 0\), then \(c \cdot a \leq c \cdot b\).
- If \(a \leq b\) and \(c \leq 0\), then \(c \cdot a \geq c \cdot b\).
Moreover, if we replace any of the “if” inequalities with a strict inequality (i.e., change \(\leq\) to \(<\)), then the corresponding “then” inequality is also strict.For example, the following is true: “If \(a < b\), then \(a + c < b + c\).”
The previous theorem tells us that basic operations like adding a number or multiplying by a positive number preserve inequalities. However, other operations like multiplying by a negative number or taking reciprocals reverse the direction of the inequality, which is something we didn’t have to worry about when dealing with equalities. But it turns out that, at least for non-negative numbers, most of our familiar functions preserve inequalities.
Let \(f : \R^{\geq 0} \to \R^{\geq 0}\). We say that \(f\) is strictly increasing when for all \(x, y \in \R^{\geq 0}\), if \(x < y\) then \(f(x) < f(y)\).
Most common functions are strictly increasing: for example, the power functions \(x^c\) (for \(c > 0\)), the logarithm functions \(\log_b x\) (for \(b > 1\)), and the exponential functions \(b^x\) (for \(b > 1\)) are all strictly increasing.
Moreover, adding two strictly increasing functions, or multiplying a strictly increasing function by a positive constant or another always-positive strictly increasing function, results in another strictly increasing function. So for example, we know that \(f(x) = 300x^2 + x \log_3 x + 2^{x+100}\) is also strictly increasing.
It should be clear from this definition that the following property holds, which enables us to manipulate inequalities using a host of common functions.
For all non-negative real numbers \(a\) and \(b\), and all strictly increasing functions \(f: \R^{\geq 0} \to \R^{\geq 0}\), if \(a \leq b\), then \(f(a) \leq f(b)\).
Moreover, if \(a < b\), then \(f(a) < f(b)\).
It is this theorem that allows us to perform several common operations on inequalities as a “step” in a computation. For example, if we know \(0 < a \leq b\), then we can conclude that \(a^2 \leq b^2\), or \(\log_2(a) \leq \log_2(b)\), because both of the functions \(x^2\) and \(\log_2(x)\) are strictly increasing functions.
