3.7 Function Specifications

One of the most central questions in software development is, “How do we know that the software we write is correct?” Certainly, writing test cases will ensure that our functions produce the expected output for specific situations. But as our programs increase in complexity, how confident can we be that our test cases are sufficient?

Function specifications and correctness

Before we address this question, we will formalize what it means for a program to be correct in the first place. Because functions are the primary way we organize programs, we’ll focus on what it means for an individual function to be correct.

A specification for a function consists of two parts:

A description of what values the function takes as valid inputs. We can represent this description as a set of predicates, where a valid input must satisfy all these predicates. We call these predicates the preconditions of the function.
A description of what the function returns/does, in terms of its inputs. For now, all of our Python functions only return values, and do nothing else. Later on in the course, we’ll study other kinds of function behaviour that could be included in a specification. We can represent this description as a set of predicates as well, that must all be satisfied by the return value of the function. We call these predicates the postconditions of the function.

With these two parts, a function’s specification defines what we expect the function to do. The job of an implementation of the function is to provide the Python code in the function body that meets this specification. We say that a function implementation is correct when the following holds: For all inputs that satisfy the specification’s preconditions, the function implementation’s return value satisfies the specification’s postconditions.

A function specification acts as a contract or agreement between the person who implements the function and the person who calls the function. For the person implementing the function, their responsibility is to make sure their code correctly returns or does what the specification says. When writing this code, they do not need to worry about exactly how the function is called and assume that the function’s input is always valid. So in fact, we have already seen several preconditions in this course. Every time we had a function description that said “assume X about the input(s)”, that was a precondition. For the person calling the function, their responsibility is to make sure they call the function with valid inputs. When they make this call, they do not need to worry about exactly how the function is implemented and assume that the function works correctly.

The concept of a function specification is a very powerful one, as it spreads the responsibility of function correctness across two parties that do their parts separately—as long as they both know what the function specification is. As a result, these specifications must be very precise. Outside of software, lawyers are hired to draft and review contracts to make sure that they are defensible in the eyes of the law. Similarly, programmers must behave as lawyers when designing software to write ironclad contracts that leave no ambiguity in what is expected of the user or how the software will behave. In this section, we introduce some new tools and terminology that can help our functions be more explicit in their requirements and behaviour.

Simple specifications

Even though we haven’t formally introduced the notion of a function specification until this section, you’ve been writing specifications all along simply by following the Function Design Recipe. Let’s take a look at an early example:

def is_even(n: int) -> bool:
    """Return whether n is even.

    >>> is_even(1)
    False
    >>> is_even(2)
    True
    """
    # Body omitted.

Here, the type contract and description actually form a complete specification of this function’s behaviour:

The type annotation of the parameter n tells us that the valid inputs to is_even are int values. The type annotation int is itself a precondition of the function.
Similarly, the type annotation for the return value tells us that the function will always return a bool. In addition, the description “Return whether n is even.” specifies the relationship between the function’s return value and its input. The doctest examples aid understanding, but are not strictly required to specify what this function does. The function description and return type annotation specify the postconditions of the function.

From this alone, we know what it means for this function to be implemented correctly, even if we can’t see the implementation.

is_even is implemented correctly when for all ints n, is_even(n) returns a bool that is True when n is even, and False when n is not even.

For example, suppose David has implemented this function. Mario loads this function implementation into the Python console and calls it:

>>> is_even(4)
False

In this case, 4 is an int, so Mario held up his end of the contract when he called the function. But the False return value is inconsistent with the function description, and so we know there must be an error in the implementation—David is at fault, not Mario.

Suppose David fixes his implementation, and asks Mario to try another call. Mario types in:

>>> is_even(4)
True

Okay pretty good, and now Mario tries:

>>> is_even([1, 2, 3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in is_even
TypeError: unsupported operand type(s) for %: 'list' and 'int'

In this case, the function did not produce a return value but rather an error (i.e., TypeError). Is David at fault again? No! Mario violated the function’s precondition by passing in a list rather than an int, and so he should have no expectation that is_even will meet its postcondition. Therefore, Mario (the caller of the function) caused the error.

Preconditions in general

All parameter type annotations are preconditions for a function. But often these type annotations are not precise enough to specify the exact set of valid inputs. Consider this function:

def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    >>> max_length({'Hello', 'Mario', 'David Liu'})
    9
    """
    return max({len(s) for s in strings})

What happens when the set is empty? Let’s try it out in the console:

>>> empty_set = set()
>>> max_length(empty_set)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 7, in max_length
ValueError: max() arg is an empty sequence

We’ve obtained an error, rather than an int; this makes logical sense, because it is impossible to find the maximum value in a set that contains no values at all. But from a formal function specification sense, who is to blame: the function’s caller or the function’s implementer?

As it stands, the implementer is at fault because the only description of “valid inputs” given is the type annotation set; the empty set is still a set. So we need to update the specification to rule out this possibility, but how? You may recall that we’ve been adding extra “assumptions” on inputs for programming exercises in this course for the past few weeks already. What we’re learning here is how to formalize these assumptions into function docstrings. We encountered this issue in 3.3 Filtering Collections, when we wanted to restrict a statement to apply to a subset of our domain. Here we’re doing the same thing: making the set of valid function inputs more specific, because we only want to guarantee our implementation works correctly on those inputs. We add a precondition to the function docstring as follows:

def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
      - len(strings) > 0
    """
    return max({len(s) for s in strings})

Whenever possible, we’ll express these general preconditions as valid Python expressions involving the function’s parameters. Sometimes we’ll encounter a precondition that is extremely complex, in which case you can write them in English. In English, we would say that the full specification of max_length’s valid inputs is “strings is a set, and len(strings) > 0”. As functions get more complex, we can add additional preconditions by listing them under the header Preconditions: in the docstring. A function input is valid when it satisfies the type annotations and all general precondition expressions.

Note that adding the precondition to the docstring does not change the behaviour of the function. If an empty set is passed into the function by the user, the function will still produce the ValueError we saw above. However, now that the precondition has been documented in the function specification, if we call max_length(empty_set), we know that the error is entirely our fault because we violated a precondition.

Checking preconditions automatically with python_ta

While our previous example illustrates how to document preconditions as part of a function specification, it has one drawback: it relies on whoever is calling the function to read the documentation! Of course, reading documentation is an important skill for any computer scientist, but despite our best intentions we sometimes miss things. It would be nice if we could turn our preconditions into executable Python code so that the Python interpreter checks them every time we call the function.

One way to do this is to use an assert statement, just like we do in unit tests. Because we’ve written the precondition as a Python expression, we can convert this to an assertion by copy-and-pasting it at the top of the function body.

def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
      - len(strings) > 0
    """
    assert len(strings) > 0, 'Precondition violated: max_length called on an empty set.'
    return max({len(s) for s in strings})

Now, the precondition is checked every time the function is called, with a meaningful error message when the precondition is violated:

>>> empty_set = set()
>>> max_length(empty_set)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 7, in max_length
AssertionError: Precondition violated: max_length called on an empty set.

However, this approach is annoying and error-prone. First, we have to duplicate the precondition in two places. And second, we have increased the size of the function body with extra code. The python_ta library we use in this course has a way to automatically check preconditions for all functions in a given file. Here is an example:

def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Preconditions:
      - len(strings) > 0
    """
    return max({len(s) for s in strings})


if __name__ == '__main__':
    import python_ta.contracts
    python_ta.contracts.DEBUG_CONTRACTS = False  # Disable contract debug messages
    python_ta.contracts.check_all_contracts()

    max_length(set())

Notice that we’ve kept the function docstring the same, but removed the assertion. The function we call, python_ta.contracts.check_all_contracts, modifies our max_length function. That is, python_ta takes the function’s type contract and the preconditions it finds in the function docstring, and causes the function to check these preconditions every time the function is called! Let’s see what happens when we run this file:

Traceback (most recent call last):
...
AssertionError: max_length precondition "len(strings) > 0" violated for arguments {'strings': set()}.

Pretty cool! We’ll be using check_all_contracts for the rest of this course to help us make sure we’re sticking to the specifications we’ve written in our function header and docstrings when we call our functions. Moreover, check_all_contracts checks the return type of each function, so it’ll also work as a check when we’re implementing our functions to make sure the return value is of the correct type.

Preconditions as assumptions and restrictions

Preconditions allow the implementer of a function to specify assumptions about the function’s inputs, and so simplify the work of the implementer. On the other hand, preconditions place restrictions on the user of the function; the onus is on them to respect these preconditions every time the function is called. This often increases the complexity of the code that calls the function. For example, in our max_length function, the calling code might need an if statement to first check whether a set is empty before passing it to max_length.

When confronted with an “invalid input”, there is another strategy other than simply ruling out the invalid input with a precondition: explicitly defining some alternate function behaviour for this input. Here is another way we could define max_length:

def max_length(strings: set) -> int:
    """Return the maximum length of a string in the set of strings.

    Return 0 if strings is empty.
    """
    if strings == set():
        return 0
    else:
        return max({len(s) for s in strings})

Here, we picked a reasonable default value for max_length when given an empty set, This is very similar to how we define empty sums and products by a mathematical convention. and then handled that as an explicit case in our implementation by using an if statement. Our function implementation is more complex than before, but now another person can call our function on an empty set without producing an error:

>>> empty_set = set()
>>> max_length(empty_set)
0

You’re probably wondering: is this version of max_length better or worse than our original one with the precondition? This version resulted in a longer description and function body, but it also removed a possible error we might encounter when calling the function. On the other hand, is 0 really a “reasonable” value for the behaviour of this function? Because this is ultimately a design decision, there is no clear “right answer”—there are always trade-offs to be made. Rather than sticking with a particular rule (i.e., “always/never use preconditions”), it’s better to use broader principles to evaluate different choices. How much complexity is added by handling an additional input in a function implementation? Are there “reasonable” behaviours defined for a larger set of inputs than what you originally intended? The trade-offs are rarely clear cut.

That’s not all!

It turns out that with either of the “precondition” or “reasonable default” strategies, our specification of max_length is still incomplete. Before moving onto the next section, take a moment to study these implementations and try to guess what the gap might be!