In Chapter 3, we studied how to express statements precisely using the language of predicate logic. But just as English enables us to make both true and false claims, the language of predicate logic allows for the expression of both true and false sentences. In this chapter, we will turn our attention to analyzing and communicating the truth or falsehood of these statements. You will develop the skills required to answer the following questions:
These questions draw a distinction between the internal and external components of mathematical reasoning. When given a new statement, you’ll first need to figure out for yourself whether it is true (internal), and then be able to express your thought process to others (external). But even though we make a separation, these two processes are certainly connected: it is only after convincing yourself that a statement is true that you should then try to convince others. And often in the process of formalizing your intuition for others, you notice an error or gap in your reasoning that causes you to revisit your intuition—or make you question whether the statement is actually true!
A mathematical proof is how we communicate ideas about the truth or falsehood of a statement to others. There are many different philosophical ideas about what constitutes a proof, but what they all have in common is that a proof is a mode of communication, from the person creating the proof to the person digesting it. In this course, we will focus on reading and creating our own written mathematical proofs, which is the standard proof medium in computer science.
As with all forms of communication, the style and content of a proof varies depending on the audience. In this course, the audience for all of our proofs will be an average computer science student (and not your TA or instructor). As we will discuss, your audience determines how formal a proof should be (here, quite formal), and what background knowledge you can assume is understood without explanation (here, not much).
However, there is even variation in the typical computer science student with experience in this area, so as much as possible in this course, we will introduce new mathematical domains to serve as the objects of study in our proofs.
This approach has three very nice benefits: first, by building domains from the ground up, we can specify absolutely the common definitions and properties that everyone may assume and use freely in proofs; second, these domains are the theoretical foundation of many areas of computer science, and learning about them here will serve you well in many future courses; and third, learning about new domains will help develop the skill of reading about a new mathematical context and understanding it.In other words, you won’t just learn about new domains; you’ll learn how to learn about new domains! The definitions and axioms of a new domain communicate the foundation upon which we build new proofs—in order to prove things, we need to understand the objects that we’re talking about first.
We’re going to start out our exploration of proofs by studying a few simple statements. Our first foray into domain exploration will be into number theory, which you can think of as taking a type of entity with which we are quite familiar, and formalizing definitions and pushing the boundaries of what we actually know about these numbers that we use every day.
You may find our first few examples a bit on the easy side, which is fine. We are using them not so much for their ability to generate mathematical insight, but rather to model both the thinking and the writing that would go into approaching a problem.
Each example in this section is divided into three or four parts:
With this in mind, let’s dive right in!
Prove that \(23 \mid 115\).
We will expand the definition of divisibility to rewrite this statement in terms of simpler operations: \[\exists k \in \Z,~ 115 = 23k.\]
We just need to divide 115 by 23, right?
Let \(k = 5\).
Then \(115 = 23 \cdot 5 = 23 \cdot k\). We typically signal the end of a proof by writing a black square ◼ in the bottom-right corner.
We can draw from this example a more general technique for structuring our existence proofs. A statement of the form \(\exists x \in S,~P(x)\) is True when at least one element of \(S\) satisfies \(P\) (hence our use of any in Python). The easiest way to convince someone that this is True is to actually find the concrete element that satisfies \(P\), and then show that it does. Of course, this is not the only proof technique used for existence proofs. You’ll study more sophisticated ways of doing such proofs in future courses. This is so natural a strategy that it should not be surprising that there is a “standard proof format” when dealing with such statements.
A typical proof of an existential.
Given statement to prove: \(\exists x \in S,~P(x)\).
Let \(x = \_\_\_\_\_\_\_\).
[Proof that \(P(\_\_\_\_\_\_\_)\) is True.]
Note that the two blanks represent the same element of \(S\), which you get to choose as a prover. Thus existence proofs usually come down to finding a correct element of the domain which satisfy the required properties.
Here is another example which uses the same idea, but with two existentially-quantified variables.
Prove that there exists an integer that divides 104.
There is the key phrase “there exists” right in the problem statement, so we could write \(\exists a \in \Z,~a \mid 104\). We can once again expand the definition of divisibility to write:We use the abbreviated form for two quantifications of the same type. \[\exists a, k \in \Z,~104 = ak.\]
Basically, we need to pick a pair of divisors of 104. Since this is an existential proof and we get to pick both \(a\) and \(k\), any pair of divisors will work.
Let \(a = -2\) and let \(k = -52\).
Then \(104 = ak\).
The previous example is the first one that had multiple quantifiers. In our proof, we had to give explicit values for both \(a\) and \(k\) to show that the statement held. Just as how a sentence in predicate logic must have all its variables quantified, a mathematical proof must introduce all variables contained in the sentence being proven.
In the Chapter 3, we saw how changing the order of an existential and universal quantifier changed the meaning of a statement. Now, we’ll study how the order of quantifiers changes how we can introduce variables in a proof.
Prove that all integers are divisible by \(1\).
The statement contains a universal quantification: \(\forall n \in \Z,~1 \mid n\). We can unpack the definition of divisibility to \[\forall n \in \Z,~\exists k \in \Z,~n = 1 \cdot k.\]
The final equation in the fully-expanded form of the statement is straightforward, and is valid when \(k\) equals \(n\). But how should I introduce these variables? Answer: in the same order they are quantified in the statement.
Let \(n \in \Z\). Let \(k = n\).
Then \(n = 1 \cdot n = 1 \cdot k\).
This proof is quite short, but introduces a few new elements. First, it introduced a variable \(n\) that could represent any real number. Unlike the previous existence proofs, when we introduced this variable \(n\) we did not specify a concrete value like \(10\), but rather said that \(n\) was an arbitrary real number by writing ``Let \(n \in \Z\).You might notice that we use the same word “let” to introduce both existentially- and universally-quantified variables. However, you should always be able to tell how the variable is quantified based on whether it is given a concrete value or an “arbitrary” value in the proof.
A typical proof of a universal.
Given statement to prove: \(\forall x \in S,~P(x)\).
Let \(x \in S\). (That is, let \(x\) be an arbitrary element of \(S\).)
[Proof that \(P(x)\) is True].
The other interesting element of this proof was that it contained an existentially-quantified variable \(k\) after the \(\forall n \in \Z\). We used an extremely important tool at our disposal when it comes to proofs with multiple quantifiers: any existentially-quantified variable can be assigned a value that depends on the variables defined before it.
In our proof, we first defined \(n\) to be an arbitrary integer. Immediately after this, we wanted to show that for this \(n\), \(\exists k \in \Z,~ n = 1 \cdot k\). And to prove this, we needed a value for \(k\)—a “let” statement. Because we define \(k\) after having defined \(n\), we can use \(n\) in the definition of \(k\) and say “Let \(k = n\).” It may be helpful to think about the analogous process in programming. We first initialize a variable \(n\), and then define a new variable \(k\) that is assigned the value of \(n\).
Even though this may seem obvious, one important thing to note is that the order of variables in the statement determines the order in which the variables must be introduced in the proof, and hence which variables can depend on which other variables. For example, consider the following erroneous “proof.”
(Wrong!) Prove that \(\exists k \in \Z,~\forall n \in \Z,~n = 1 \cdot k.\)
Let \(k = n\). Let \(n \in \Z\).
Then \(n = 1 \cdot k\).
This proof may look very similar to the previous one, but it contains one crucial difference. The very first sentence, “Let \(k = n\),” is invalid: at that point, \(n\) has not yet been defined! This is analagous to a NameError in Python. This is the result of having switched around the order of the quantifiers, which forces \(k\) to be defined independently of whatever \(n\) is chosen.
Note: don’t assume that just because one proof is invalid, that all proofs of this statement are invalid! We cannot conclude that this statement is False just because we found one proof that didn’t work.A meta way of looking at this: a statement is True when there exists a correct proof of it. That said, this statement is indeed False, and we’ll see later on in this chapter how to prove that a statement is False instead of True.
Let’s look at one new example.
Prove that for all integers \(x\), if \(x\) divides \((x + 5)\), then \(x\) also divides \(5\).
There is both a universal quantification and implication in this statement:As we noted back in Chapter 3, the “universal + implication” form is the most common form of statement we encounter. \[\forall x \in \Z,~ x \mid (x + 5) \Rightarrow x \mid 5.\] When we unpack the definition of divisibility, we need to be careful about how the quantifiers are grouped: \[\forall x \in \Z,~ \big( \exists k_1 \in \Z,~ x + 5 = k_1x \big) \Rightarrow \big( \exists k_2 \in \Z,~ 5 = k_2x \big).\]
I need to prove that if \(x\) divides \(x + 5\), then it also divides 5. To prove this, I’m going to assume that \(x\) divides \(x + 5\), and I need to prove that \(x\) divides 5.
Since \(x\) is divisible by \(x\), I should be able to subtract it from \(x + 5\) and keep the result a multiple of \(x\). Can I prove that using the definition of divisibility? I basically need to “turn” the equation \(x + 5 = k_1x\) into the equation \(5 = k_2x\).
Let \(x\) be an arbitrary integer. Assume that \(x \mid (x + 5)\), i.e., that there exists \(k_1 \in \Z\) such that \(x + 5 = k_1x\). We want to prove that there exists \(k_2 \in \Z\) such that \(5 = k_2x\).
Let \(k_2 = k_1 - 1\).
Then we can calculate: \[\begin{align*} k_2x &= (k_1 - 1)x \\ &= k_1 x - x \\ &= (x + 5) - x \tag{we assumed $x + 5 = k_1 x$}\\ &= 5 \end{align*}\]
Whew, that was a bit longer than the proofs we’ve already done. There were a lot of new elements that we introduced here, so let’s break them down:
After introducing \(x\), we wanted to prove the implication \(x \mid (x + 5) \Rightarrow x \mid 5\). To prove an implication, we needed to assume that the hypothesis was True, and then prove that the conclusion is also True. In our proof, we wrote “Assume \(x \mid (x + 5)\).”
This is not a claim that \(x \mid (x + 5)\) is True; rather, it is a way to consider what would happen if \(x \mid (x + 5)\) were True. The goal for the rest of the proof was to prove that \(x \mid 5\).
This proof structure is common when proving an implication:
A typical proof of an implication (direct).
Given statement to prove: \(p \Rightarrow q\).
Assume \(p\).
[Proof that \(q\) is True.]
When we assumed that \(x \mid (x + 5)\), what this really did was introduce a new variable \(k_1 \in \Z\) from the definition of divisibility. This might seem a little odd, but take a moment to think about what this means in English. We assumed that \(x\) divides \(x + 5\), which (by definition) is the same as assuming that there exists an integer \(k_1\) such that \(x + 5 = k_1x\). Given that such a number exists, we can give it a name and refer to it in the rest of our proof.In other words, we introduced a variable into the proof through an assumption we made.
One of the most important meta-techniques in mathematical proof is that of generalization: taking a true statement (and a proof of the statement), and then replacing a concrete value in the statement with a universally quantified variable. For example, consider the statement from the previous example, \(\forall x \in \Z,~ x \mid (x + 5) \Rightarrow x \mid 5\). It doesn’t seem like the “\(5\)” serves any special purpose; it is highly likely that it could be replaced by another number like \(165\), and the statement would still hold.Concretely, consider the statement \(\forall x \in \Z,~ x \mid (x + 165) \Rightarrow x \mid 165\), which is at least as plausible as the original statement with \(5\)’s.
But rather than replace the \(5\) with another concrete number and then re-proving the statement, we will instead replace it with a universally-quantified variable, and prove the corresponding statement. This way, we will know that in fact we could replace the \(5\) with any integer and the statement would still hold.
Prove that for all \(d \in \Z\), and for all \(x \in \Z\), if \(x\) divides \((x + d)\), then \(x\) also divides \(d\).
This has basically the same translation as last time, except now we have an extra variable: \[\forall d,x \in \Z,~ \big( \exists k_1 \in \Z,~ x + d = k_1x \big) \Rightarrow \big( \exists k_2 \in \Z,~ d = k_2x \big).\]
I should be able to use the same set of calculations as last time.
Let \(d\) and \(x\) be arbitrary integers. Assume that \(x \mid (x + d)\), i.e., there exists \(k_1 \in \Z\) such that \(x + d = k_1x\). We want to prove that there exists \(k_2 \in \Z\) such that \(d = k_2x\).
Let \(k_2 = k_1 - 1\).
Then we can calculate: \[\begin{align*} k_2x &= (k_1 - 1)x \\ &= k_1 x - x \\ &= (x + d) - x \\ &= d \end{align*}\]
This proof is basically the same as the previous one: we have simply swapped out all of the \(5\)’s with \(d\)’s. We say that the proof did not depend on the value \(5\), meaning there was no place that we used some special property of \(5\), where we could have used a generic integer instead. We can also say that the original statement and proof generalize to this second version.
Why does generalization matter? By generalizing the previous statement from being about the number \(5\) to an arbitrary integer, we have essentially gone from one statement being true to an infinite number of statements being true. The more general the statement, the more useful it becomes. We care about exponent laws like \(a^b \cdot a^c = a^{b + c}\) precisely because they apply to every possible number; regardless of what our concrete calculation is, we know we can use this law in our calculations.