This lesson is being piloted (Beta version)

# Objects in Python

## Overview

Teaching: 20 min
Exercises: 15 min
Questions
• What are objects in Python?

• What is a class or type?

• Can objects belong to more than one class?

• How can objects be created from a class?

Objectives
• Be able to distinguish between class and object

• Be able to construct objects via a class’s constructor

• Be able to distinguish between equality and identity of objects

You may recall that we calculated the mean of a Numpy array by using numpy.mean, as

import numpy
numbers = numpy.arange(10)
print(numpy.mean(numbers))

4.5


However, we can also calculate the mean of a Numpy array as:

print(numbers.mean())

4.5


Let’s see if we can do this with a normal list:

more_numbers = [1, 2, 3, 4]
print(more_numbers.mean())


In this case Python will complain with an error. How does Python know it can do this for numbers but not more_numbers?
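One way to see exactly what Python objects to is to catch the error and print it. This is a minimal sketch; the variable name error_message is introduced here purely for illustration:

```python
more_numbers = [1, 2, 3, 4]

try:
    more_numbers.mean()
except AttributeError as error:
    # Lists have no attribute called "mean", so the lookup fails
    error_message = str(error)
    print("AttributeError:", error_message)
```

The message names both the class of the object (list) and the attribute that could not be found, which hints that the answer depends on the object's class.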

## What type is it?

Let’s investigate this further by using type to identify what the data type of numbers is:

type(numbers)

<class 'numpy.ndarray'>


What about the type of the variable more_numbers?

type(more_numbers)

<class 'list'>


We can see here that numbers is an object of the type numpy.ndarray. In Python, anything which can be stored in a variable or passed to a function is called an object. Objects are classified by their type, or their class.

## Class or Type?

Note that in the literature, you’ll find a subtle distinction between class and type. However, since in Python 3 we can’t have one without the other, we will use both terms interchangeably.

## Let’s find some types

Can you find anything that you can store in a variable which does not have a class? What is the type of the number 1, or the string "hello"?

Does the class change if they are passed directly to type, or if they are stored in a variable?

## Solution

type(1)

<class 'int'>

type("hello")

<class 'str'>


## Does everything have a class?

Try to find words that Python recognises that do not have classes. What about numpy.mean or numpy? What about if or for? Can you think of others?

## Solution

type(numpy.mean)

<class 'function'>

type(numpy)

<class 'module'>


The objects numpy.mean and numpy are things that we typically wouldn’t store in variables or pass around. However, they could in principle be stored in variables, and therefore are objects with a class.

type(if)

 File "<stdin>", line 1
type(if)
^
SyntaxError: invalid syntax

type(for)

 File "<stdin>", line 1
type(for)
^
SyntaxError: invalid syntax


The words if and for are part of the Python language itself, so they can’t be stored in variables. Only things which can be stored in variables can have a class.
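Rather than triggering a SyntaxError, it is also possible to ask Python directly whether a word is reserved by the language. A short sketch using the standard library's keyword module (the names candidates and keyword_status are illustrative):

```python
import keyword

# "print" and "numpy" are ordinary names; "if" and "for" are reserved keywords
candidates = ["if", "for", "print", "numpy"]
keyword_status = {word: keyword.iskeyword(word) for word in candidates}

for word, is_reserved in keyword_status.items():
    print(word, "is a keyword:", is_reserved)
```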

## Changing things

In Python, there are two ways in which objects can behave. The most intuitive case is when objects are created with a value, and they keep that value forever. Many objects we’re familiar with, such as integers or strings, are objects which hold a value.

Let’s store a string in a variable:

message = "Hello"


The variable message now refers to an object, which has the value "Hello". We can point another variable at the same object with:

second_message = message


But we can never change the value of the string object itself. The string “Hello” will always be the string “Hello”. We can set the variable second_message to a new object with:

second_message = message + ", world"


But the original object is still there, unchanged. We can still get to it by typing

print(message)


This may not seem surprising, but not all objects in Python behave this way. Consider the following list of strings:

messages = ["Hello", "world!"]


Let’s point a new variable, duplicate_messages, at the list named messages:

duplicate_messages = messages


Think of this as pointing duplicate_messages at the same underlying object that messages refers to. Now let’s change a part of duplicate_messages:

duplicate_messages[1] = "there!"


What is the value of messages now?

print(messages)

['Hello', 'there!']


Note how we changed messages through the variable duplicate_messages. We can do this because both messages and duplicate_messages refer to the same underlying object, and that underlying object can be changed.

We say that objects which can’t be changed, like numbers and strings, are immutable. Numbers are an intuitive example of immutable objects: the number 1000 will always be the number 1000. Objects that can be changed, like lists, are mutable: they can be “mutated” after they’ve been created.
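If an independent list is needed rather than a second name for the same object, one option is to copy the list before changing it. A minimal sketch, using the list method copy (the variable names shared and independent are illustrative):

```python
messages = ["Hello", "world!"]

shared = messages              # a second name for the same list object
independent = messages.copy()  # a new list object with the same elements

shared[1] = "there!"

print(messages)     # ['Hello', 'there!']
print(independent)  # ['Hello', 'world!']
```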

## Immutable lists

Python has a class similar to a list called a tuple. Is a tuple mutable or immutable?

Check if you can change a tuple by setting:

messages = ("Hello", "world!")


and trying to modify the second element with:

messages[1] = "there!"


## Solution

messages = ("Hello", "world!")
messages[1] = "there!"


You should see an error containing the text:

TypeError: 'tuple' object does not support item assignment


This is telling you that you can’t modify the tuple object, because tuples are immutable.

## What kind of objects?

List some objects that you think are mutable and immutable. Verify this by trying to find ways to change the objects.

Note: Be careful that you’re not “cheating” by using = to point to a new object.

## Instances and Methods

We say that an object of a particular class is an instance of that class. To use a real world example, we could have the type or class Chair, which describes all the chairs in the world. The chair that you are sitting on right now is a specific instance of the Chair class.

We can check if an object is an instance of a particular class with the isinstance function.

isinstance(numbers, numpy.ndarray)

True


Every object is created with a single class, which can’t be changed. The class of an object can also provide behaviour that the object might have, by providing functions to objects in its class. These functions can be called by using a dot after the variable name, for example:

numbers.mean()


The functions which are associated with an object are provided by the class of the object. When a class provides a function to an object we call that function a method of the class.

We say that the numpy.ndarray class provides the mean method. Since numbers belongs to the class numpy.ndarray, we can use the mean method on the object referred to by numbers, by calling numbers.mean(). This allows objects of class numpy.ndarray to provide functionality specific to that class.

It’s worth noting that both mutable and immutable objects can have methods. Methods of immutable objects, however, can’t change the underlying object. If needed, they will return a brand new object, and set the expected value in the new object. To keep this change, you will need to store it in a variable, for example:

hello = "hello, world"
capital_hello = hello.capitalize()
print(capital_hello)


Methods of mutable objects can, and often do, change the object itself.

grades = [84, 78, 91]
grades.append(66)
print(grades)

[84, 78, 91, 66]


In this case, we don’t need the extra = to assign the result to a variable.
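The difference can be sketched side by side. In this illustrative example, the result of a string method is discarded and the string is unaffected, whereas the list method sort changes the list in place and returns None:

```python
hello = "hello, world"
hello.capitalize()          # a new string is created, then discarded
print(hello)                # hello, world

grades = [84, 78, 91]
result = grades.sort()      # sorts the existing list in place
print(result)               # None
print(grades)               # [78, 84, 91]
```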

## Finding out what things are

Use type() to find the type of students, defined as

students = ['Petra', 'Aalia', 'Faizan', 'Shona']


and check this with isinstance.

## Solution

type(students)

<class 'list'>

isinstance(students, list)

True


## Other common classes

What other classes have you encountered previously when using Python? What methods did they provide?

## Making an object

A class can be called as a function, in which case it constructs new instances of itself. While this is not the only way to make objects, it is one that all classes offer. For example, a new list can be created as:

students = list()
print(students)
print(type(students))

[]
<class 'list'>
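The same pattern applies to other built-in classes, which often accept an argument to build the new object from. A short sketch with a few familiar classes (the variable names are illustrative):

```python
number = int("42")          # an int constructed from a string
text = str(3.5)             # a str constructed from a float
letters = list("abc")       # a list constructed from the characters of a string
unique = set([1, 1, 2])     # a set constructed from a list, dropping duplicates

for thing in (number, text, letters, unique):
    print(repr(thing), type(thing))
```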


## Making a Numpy array

While all classes can be constructed by calling their name, some classes don’t recommend this route. For example, numpy.ndarray is used internally by Numpy to initialise its arrays, but Numpy recommends using one of the higher-level functions like numpy.zeros, numpy.ones, numpy.empty, or numpy.asarray to construct an array (of zeroes, of ones, without initialising the data, and initialising from an existing data structure like a list, respectively).

## Make a dict

Given the following list of students and their grades, how would you construct a dict with students as keys, and grades as values?

students = ['Petra', 'Aalia', 'Faizan', 'Shona']
grades = [84, 78, 91, 66]


You can check the type of the object you’ve created with isinstance

Hint: zip() can be used to turn two lists into tuples of corresponding pairs of elements.

## Solution

student_grades = dict(zip(students, grades))
isinstance(student_grades, dict)

True


## Equality and identity

Python has two ways of testing whether two objects are the “same”. The first is equality, or whether the associated values or contents of the object are the same.

The second is identity, or whether the objects are in fact the same instance, with names referring to the same underlying object.

Equality is tested with ==, which you have probably used before. We can test for identity with the is keyword:

old_students = students
new_students = ['Petra', 'Aalia', 'Faizan', 'Shona']

if old_students == students:
    print("old_students is equal to the students list")
if new_students == students:
    print("new_students is equal to the students list")
if old_students is students:
    print("old_students is identical to the students list")
if new_students is students:
    print("new_students is identical to the students list")


Constructing a new list that has the same elements as an existing list gives a list that is equal, but not identical, to the existing one. The same holds for other mutable classes: constructing a new object with the same contents as an existing one gives a result that is equal, but not identical, to it. (For immutable objects such as small integers or strings, Python is free to reuse an existing object, so identity should not be relied upon there.)
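As a sketch of the difference, the built-in id function returns a number identifying an object for as long as it exists, so two names refer to the same object exactly when their ids match. The variable copied_students is introduced here for illustration:

```python
students = ['Petra', 'Aalia', 'Faizan', 'Shona']
old_students = students            # another name for the same list
copied_students = list(students)   # a new list with equal contents

print(old_students is students)             # True
print(id(old_students) == id(students))     # True
print(copied_students == students)          # True
print(copied_students is students)          # False
```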

## Inheritance

Object-oriented programming allows relationships to be defined between classes or types. One class may be considered to be a specialisation or subclass of another. For a real world example, a car could be considered a specialisation or subclass of the class of all vehicles.

This is very frequently seen in the way Python handles exceptions. For example, if we check what type a ValueError is, we see that it is of class 'ValueError':

an_error = ValueError("A value must be provided")
print(type(an_error))
if isinstance(an_error, ValueError):
    print("an_error is a ValueError")

<class 'ValueError'>
an_error is a ValueError


However, we can also check if it is an Exception:

if isinstance(an_error, Exception):
    print("an_error is an Exception")

an_error is an Exception


This is because ValueError is a subclass of Exception: value errors are a specific type of exception that can occur, and so should have all the same logic that is common to all exceptions.
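The relationship between the classes themselves can be checked with the built-in issubclass function. As a small sketch (the name parent_names is illustrative):

```python
print(issubclass(ValueError, Exception))   # True
print(issubclass(Exception, ValueError))   # False

# The chain of parent classes, most specific first
parent_names = [parent.__name__ for parent in ValueError.__mro__]
print(parent_names)
```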

One place this can be used is to structure exception handling; for example:

numerator = 5
denominator = 0

try:
    print(numerator, "divided by", denominator, "is", numerator / denominator)
except ZeroDivisionError:
    print("You can't divide by zero!")
except Exception:
    print("Something else went wrong.")


ZeroDivisionError is another subclass of Exception. On encountering an exception, Python checks each except in turn to see whether the exception matches the class being tested for. The more specific ZeroDivisionError catches the specific case of dividing by zero, but the block is skipped for all other issues, which are then handled by the more general Exception.
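To see both branches in action, the same structure can be wrapped in a function and called with different inputs. This is a sketch only; describe_division is a hypothetical helper, not part of the lesson code:

```python
def describe_division(numerator, denominator):
    '''Return a message describing numerator / denominator.'''
    try:
        return f"{numerator} divided by {denominator} is {numerator / denominator}"
    except ZeroDivisionError:
        return "You can't divide by zero!"
    except Exception:
        return "Something else went wrong."

print(describe_division(6, 3))      # 6 divided by 3 is 2.0
print(describe_division(5, 0))      # You can't divide by zero!
print(describe_division(5, "two"))  # Something else went wrong.
```

If the except Exception block were listed first, it would also match ZeroDivisionError, and the more specific block would never run.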

## Key Points

• Anything that we can store in a variable in Python is an object

• Every object in Python has a class (or type)

• list and numpy.ndarray are commonly-used classes; lists and arrays are corresponding objects

• Calling the class as a function constructs new objects of that class

• Classes can inherit from other classes; objects of the subclass are automatically also of the parent class

# Writing classes

## Overview

Teaching: 20 min
Exercises: 25 min
Questions
• How are classes written in Python?

• What do methods look like?

• How can a class customise how its instances are constructed?

Objectives
• Write classes from scratch

• Write methods for classes

• Write custom __init__ methods

In the previous section, we’ve seen how objects can have different behaviour, provided by methods, which in turn are provided by the class of an object.

But what if we want to make our own classes and objects?

If we wanted to plot a variety of quadratic functions, with a consistent set of styles, we could define a class that does this:

from matplotlib.pyplot import subplots
from numpy import linspace

class QuadraticPlotter:
    color = 'red'
    linewidth = 1

    def plot(self, a, b, c):
        '''Plot the line a * x ** 2 + b * x + c and output to the screen.
        x runs between -10 and 10, with 1000 intermediary points.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''

        fig, ax = subplots()
        x = linspace(-10, 10, 1000)
        ax.plot(x, a * x ** 2 + b * x + c,
                color=self.color, linewidth=self.linewidth)


Similarly to how def is used to define a function, the class keyword is used to define a new class. Both functions and variables can be created inside the class block, and these will be accessible on any objects of the class that are created.

When functions are defined within a class, they will become methods of instances of the class. In order for the function to be aware of the object that it needs to refer to, methods are always given the instance as their first argument. By convention, the first argument of methods is always called self, so that the object can be referred to consistently whenever it is needed.

Note that variables within methods are local to that method. For example, fig and ax will be deleted once the method finishes running. To access variables attached to the object, their names must be prefixed by self..

## Other names than self

While it is possible to use any variable name for the first argument of a method, and Python will not complain, other programmers will. Since one aim when programming is to be as clear as possible to others who may read the program later, we strongly recommend following the convention of calling the first argument to methods self.

## Naming classes

Another convention in Python is that class names start with a capital letter, and instead of underscores, initial letters of subsequent words are also capitalised. This makes it easier to distinguish classes from objects and other variables at a glance.

So far this code hasn’t visibly done anything; while we have defined a class, we have yet to use it. Let’s do that now.

plotter = QuadraticPlotter()
plotter.plot(1, 2, 3)
plotter.plot(1, 0, -1)


Notice that we only supply the arguments a, b, and c to plotter.plot(); Python automatically passes the object itself as the self parameter.
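To make this concrete, a method call on an instance can also be written as a call on the class with the instance passed explicitly. The sketch below uses a small hypothetical Greeter class rather than the plotter, so that it runs without Matplotlib:

```python
class Greeter:
    greeting = "Hello"

    def greet(self, name):
        # self is the instance that the method was called on
        return self.greeting + ", " + name

greeter = Greeter()

via_instance = greeter.greet("Petra")          # Python supplies self
via_class = Greeter.greet(greeter, "Petra")    # self passed by hand

print(via_instance)   # Hello, Petra
print(via_class)      # Hello, Petra
```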

So far, this hasn’t done anything that we couldn’t have done with a function to perform the setup and then do the plot—perhaps something like:

def quadratic_plot(a, b, c, color='red', linewidth=1):
    '''Plot the line a * x ** 2 + b * x + c and output to the screen.
    x runs between -10 and 10, with 1000 intermediary points.
    The line is plotted in the colour specified by color, and with width
    linewidth.'''

    fig, ax = subplots()
    x = linspace(-10, 10, 1000)
    ax.plot(x, a * x ** 2 + b * x + c, color=color, linewidth=linewidth)


However, what if we wanted to plot some of the curves in a thick blue line? With this function, we could set the color and linewidth on every call, but that would create a lot of repetition, and hence opportunities for the code to become inconsistent.

We could also use a dict to hold the common options:

thick_blue = {'color': 'blue', 'linewidth': 5}
quadratic_plot(1, 2, 3, **thick_blue)



## **

The ** syntax here tells Python to take the thick_blue dict, and use its keys and values as keywords and keyword arguments. We’ll look at this operator later in the lesson where we talk about decorators.
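As a small illustration of this unpacking, the sketch below uses a hypothetical function describe_line that returns text instead of plotting, so the effect of the keyword arguments is easy to inspect:

```python
def describe_line(a, b, c, color='red', linewidth=1):
    return f"{a} * x ** 2 + {b} * x + {c} in {color}, width {linewidth}"

thick_blue = {'color': 'blue', 'linewidth': 5}

unpacked = describe_line(1, 2, 3, **thick_blue)
explicit = describe_line(1, 2, 3, color='blue', linewidth=5)

print(unpacked)              # 1 * x ** 2 + 2 * x + 3 in blue, width 5
print(unpacked == explicit)  # True
```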

Using objects on the other hand gives a neat alternative way of achieving this result:

blue_plotter = QuadraticPlotter()
blue_plotter.color = 'blue'
blue_plotter.linewidth = 5

plotter.plot(3, -5, 5)
blue_plotter.plot(-3, 1, 0)
plotter.plot(2, 10, 2)
blue_plotter.plot(-2, 13, 4)


The two objects plotter and blue_plotter can store the different states needed to set up the two styles of plot, whilst keeping the plotting functionality common, so it doesn’t need to be written separately for red and blue versions. We no longer have to specify the colour every time we want to plot with a non-default colour; instead, we can use the QuadraticPlotter instance that has the colour we want set.

If we need to, we can check the values of the variables we defined:

print("Line width of red plotter is", plotter.linewidth)
print("Line width of blue plotter is", blue_plotter.linewidth)

Line width of red plotter is 1
Line width of blue plotter is 5


## Mutation revisited

Note that the classes we create ourselves in this way will produce mutable objects. This means that we can change attribute values such as plotter.linewidth after the object has been created, and Python will not raise an error.
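Setting an attribute on one object does not alter the default stored on the class, so other objects keep the original value. A minimal sketch, using a hypothetical Style class in place of QuadraticPlotter:

```python
class Style:
    color = 'red'
    linewidth = 1

default_style = Style()
blue_style = Style()
blue_style.color = 'blue'      # stored on this one object only

print(default_style.color)     # red
print(blue_style.color)        # blue
print(Style.color)             # red
```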

## Zoom in

Currently QuadraticPlotter is hardcoded to plot between -10 and 10. Try changing it so that the range can be set in the same way as the color and linewidth, while keeping the current defaults.

Use the new class to plot the curve with a = 3, b = 2, c = 1 both between -10 and 10, and between -5 and 50. Do this without changing the arguments to the plot method.

## Solution

class QuadraticPlotter:
    color = 'red'
    linewidth = 1
    x_min = -10
    x_max = 10

    def plot(self, a, b, c):
        '''Plot the line a * x ** 2 + b * x + c and output to the screen.
        x runs between x_min and x_max, with 1000 intermediary points.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''

        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, a * x ** 2 + b * x + c,
                color=self.color, linewidth=self.linewidth)

# The narrow plotter keeps the default range of -10 to 10
narrow_plot = QuadraticPlotter()

wide_plot = QuadraticPlotter()
wide_plot.x_min = -5
wide_plot.x_max = 50

narrow_plot.plot(3, 2, 1)
wide_plot.plot(3, 2, 1)


## Plots of fits

The following function performs an Orthogonal Distance Regression fit of some data, and plots the resulting fit line along with the data.

from scipy.odr import ODR, Model, RealData
from matplotlib.pyplot import show

def linear(params, x):
    return params[0] * x + params[1]

def odr_fit(f, x, y, xerr=None, yerr=None, p0=None, num_params=None):
    if not p0 and not num_params:
        raise ValueError("p0 or num_params must be specified")
    if p0 and (num_params is not None):
        assert len(p0) == num_params

    data_to_fit = RealData(x, y, xerr, yerr)
    model_to_fit_with = Model(f)
    if not p0:
        p0 = tuple(1 for _ in range(num_params))

    odr_analysis = ODR(data_to_fit, model_to_fit_with, p0)
    odr_analysis.set_job(fit_type=0)
    return odr_analysis.run()

def plot_results(f, fitobj, x, y,
                 xmin=None, xmax=None, xerr=None, yerr=None, filename=None):
    fig, ax = subplots()
    if xmin is None:
        xmin = min(x)
    if xmax is None:
        xmax = max(x)

    x_range = linspace(xmin, xmax, 1000)
    ax.plot(x_range, f(fitobj.beta, x_range), label='Fit')
    ax.errorbar(x, y, xerr=xerr, yerr=yerr, fmt='.', label='Data')
    ax.set_xlabel(r'$x$')
    ax.set_ylabel(r'$y$')
    fig.suptitle(f'Data: $A={fitobj.beta[0]:.02}'
                 f'\\pm{fitobj.cov_beta[0][0]**0.5:.02}, '
                 f'B={fitobj.beta[1]:.02}\\pm{fitobj.cov_beta[1][1]**0.5:.02}$')
    ax.legend(loc=0, frameon=False)

    if filename is not None:
        fig.savefig(filename)

x_data = [0, 1, 2, 3, 4, 5]
y_data = [1, 3, 2, 4, 5, 5]
x_err = [0.2, 0.1, 0.3, 0.2, 0.5, 0.3]
y_err = [0.4, 0.4, 0.1, 0.2, 0.1, 0.4]

result = odr_fit(linear, x_data, y_data, x_err, y_err, num_params=2)
plot_results(linear, result, x_data, y_data, xerr=x_err, yerr=y_err)
show()


This code has a lot of repeated terms, and would have even more if we wanted to set custom formatting each time.

Try rewriting this as a class, turning most function arguments into variables attached to the object, and functions into methods. Some of these you won’t be able to set in the class definition, but will need to be set before the functions will work.

## Solution

class FitterPlotter:
    x_data = None
    y_data = None
    x_err = None
    y_err = None

    fit_result = None
    fit_form = None
    num_fit_params = None

    xmin = None
    xmax = None

    def odr_fit(self, p0=None):
        if None in (self.x_data, self.y_data, self.fit_form):
            raise ValueError("x_data, y_data, and fit_form must be specified")
        if not p0 and not self.num_fit_params:
            raise ValueError("p0 or num_fit_params must be specified")
        if p0 and (self.num_fit_params is not None):
            assert len(p0) == self.num_fit_params

        data_to_fit = RealData(self.x_data, self.y_data, self.x_err, self.y_err)
        model_to_fit_with = Model(self.fit_form)
        if not p0:
            p0 = tuple(1 for _ in range(self.num_fit_params))

        odr_analysis = ODR(data_to_fit, model_to_fit_with, p0)
        odr_analysis.set_job(fit_type=0)
        self.fit_result = odr_analysis.run()
        return self.fit_result

    def plot_results(self, filename=None):
        if None in (self.x_data, self.y_data):
            raise ValueError("x_data and y_data must be specified")
        fig, ax = subplots()
        xmin, xmax = self.xmin, self.xmax
        if xmin is None:
            xmin = min(self.x_data)
        if xmax is None:
            xmax = max(self.x_data)

        if self.fit_result is not None:
            x_range = linspace(xmin, xmax, 1000)
            ax.plot(x_range, self.fit_form(self.fit_result.beta, x_range),
                    label='Fit')
            fig.suptitle(f'Data: $A={self.fit_result.beta[0]:.02}'
                         f'\\pm{self.fit_result.cov_beta[0][0]**0.5:.02}, '
                         f'B={self.fit_result.beta[1]:.02}'
                         f'\\pm{self.fit_result.cov_beta[1][1]**0.5:.02}$')

        ax.errorbar(self.x_data, self.y_data, xerr=self.x_err, yerr=self.y_err,
                    fmt='.', label='Data')
        ax.set_xlabel(r'$x$')
        ax.set_ylabel(r'$y$')
        ax.legend(loc=0, frameon=False)

        if filename is not None:
            fig.savefig(filename)

fitterplotter = FitterPlotter()
fitterplotter.x_data = [0, 1, 2, 3, 4, 5]
fitterplotter.y_data = [1, 3, 2, 4, 5, 5]
fitterplotter.x_err = [0.2, 0.1, 0.3, 0.2, 0.5, 0.3]
fitterplotter.y_err = [0.4, 0.4, 0.1, 0.2, 0.1, 0.4]
fitterplotter.fit_form = linear
fitterplotter.num_fit_params = 2

fitterplotter.odr_fit()
fitterplotter.plot_results()
show()


## Initialising instances

So far we can create an object with the defaults that we set in the class definition, and then customise it afterwards. But wouldn’t it be nice to be able to create an object with the attributes that we want straight out of the box?

To do this, we can define an initialiser for the class. When Python creates an instance of a class, it looks for a method called __init__. If it finds one, then it calls it, giving it all the arguments passed to the class.

For example, for the QuadraticPlotter, the variables color and linewidth could be passed as arguments to __init__ and set on initialisation, rather than being defined as part of the class definition:

from matplotlib.colors import is_color_like

class QuadraticPlotter:
    def __init__(self, color='red', linewidth=1):
        '''Set the initial attributes of this plotter.'''
        assert is_color_like(color)

        self.color = color
        self.linewidth = linewidth

    def plot(self, a, b, c):
        '''Plot the line a * x ** 2 + b * x + c and output to the screen.
        x runs between -10 and 10, with 1000 intermediary points.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''

        fig, ax = subplots()
        x = linspace(-10, 10, 1000)
        ax.plot(x, a * x ** 2 + b * x + c,
                color=self.color, linewidth=self.linewidth)

# Assumed construction call; any valid Matplotlib colour name works here
pink_plotter = QuadraticPlotter(color='pink')
pink_plotter.plot(0, 1, 0)


This also lets us do some validation that the values we are given are usable, rather than deferring these errors to a long way down the line.
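An assert is a quick way to check an argument, but assertions are skipped when Python is run with the -O option. One alternative is to raise an exception explicitly. The sketch below uses a hypothetical LineStyle class and checks only the line width, so that it runs without Matplotlib:

```python
class LineStyle:
    def __init__(self, color='red', linewidth=1):
        '''Set the initial attributes, rejecting unusable line widths.'''
        if linewidth <= 0:
            raise ValueError("linewidth must be positive")
        self.color = color
        self.linewidth = linewidth

thick = LineStyle(color='blue', linewidth=5)
print(thick.color, thick.linewidth)    # blue 5

try:
    LineStyle(linewidth=-1)
except ValueError as error:
    print("Could not create style:", error)
```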

## Pronouncing __init__

The method name __init__ is most often pronounced “dunder init”, where the “dunder” is short for “double underscore”, since the name starts and ends with two underscores.

We’ll encounter more methods with “dunder” in the name in a later episode.

## Zoom in again

Try rewriting the “Zoom in” example above to set the bounds of the plot, as well as the color and linewidth, using arguments to the constructor.

## Solution

class QuadraticPlotter:
    def __init__(self, color='red', linewidth=1, x_min=-10, x_max=10):
        '''Set the initial attributes of this plotter.'''
        assert is_color_like(color)
        self.color = color
        self.linewidth = linewidth
        self.x_min = x_min
        self.x_max = x_max

    def plot(self, a, b, c):
        '''Plot the line a * x ** 2 + b * x + c and output to the screen.
        x runs between x_min and x_max, with 1000 intermediary points.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''

        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, a * x ** 2 + b * x + c,
                color=self.color, linewidth=self.linewidth)

narrow_plot = QuadraticPlotter()
wide_plot = QuadraticPlotter(x_min=-5, x_max=50)

narrow_plot.plot(3, 2, 1)
wide_plot.plot(3, 2, 1)


## Initialising fitting

Adjust your solution to the Plots of fits challenge above so that it has an initialiser which checks that the needed parameters are given before initialising the object.

## Solution

class FitterPlotter:
    fit_result = None

    def __init__(self, x_data, y_data, x_err=None, y_err=None,
                 fit_form=None, num_fit_params=None, xmin=None, xmax=None):
        self.x_data = x_data
        self.y_data = y_data
        self.x_err = x_err
        self.y_err = y_err
        self.fit_form = fit_form
        self.num_fit_params = num_fit_params
        self.xmin = xmin
        self.xmax = xmax

    def odr_fit(self, p0=None):
        if self.fit_form is None:
            raise ValueError("fit_form must be specified")
        if not p0 and not self.num_fit_params:
            raise ValueError("p0 or num_fit_params must be specified")
        if p0 and (self.num_fit_params is not None):
            assert len(p0) == self.num_fit_params

        data_to_fit = RealData(self.x_data, self.y_data, self.x_err, self.y_err)
        model_to_fit_with = Model(self.fit_form)
        if not p0:
            p0 = tuple(1 for _ in range(self.num_fit_params))

        odr_analysis = ODR(data_to_fit, model_to_fit_with, p0)
        odr_analysis.set_job(fit_type=0)
        self.fit_result = odr_analysis.run()
        return self.fit_result

    def plot_results(self, filename=None):
        fig, ax = subplots()
        xmin, xmax = self.xmin, self.xmax
        if xmin is None:
            xmin = min(self.x_data)
        if xmax is None:
            xmax = max(self.x_data)

        if self.fit_result is not None:
            x_range = linspace(xmin, xmax, 1000)
            ax.plot(x_range, self.fit_form(self.fit_result.beta, x_range),
                    label='Fit')
            fig.suptitle(f'Data: $A={self.fit_result.beta[0]:.02}'
                         f'\\pm{self.fit_result.cov_beta[0][0]**0.5:.02}, '
                         f'B={self.fit_result.beta[1]:.02}'
                         f'\\pm{self.fit_result.cov_beta[1][1]**0.5:.02}$')

        ax.errorbar(self.x_data, self.y_data, xerr=self.x_err, yerr=self.y_err,
                    fmt='.', label='Data')
        ax.set_xlabel(r'$x$')
        ax.set_ylabel(r'$y$')
        ax.legend(loc=0, frameon=False)

        if filename is not None:
            fig.savefig(filename)

fitterplotter = FitterPlotter(
    x_data=[0, 1, 2, 3, 4, 5],
    y_data=[1, 3, 2, 4, 5, 5],
    x_err=[0.2, 0.1, 0.3, 0.2, 0.5, 0.3],
    y_err=[0.4, 0.4, 0.1, 0.2, 0.1, 0.4],
    fit_form=linear,
    num_fit_params=2
)

fitterplotter.odr_fit()
fitterplotter.plot_results()
show()


## Key Points

• Classes in Python are blocks started with the class keyword

• Method definitions look like functions, but must take a self argument

• The __init__ method is called when instances are constructed

# Inheritance

## Overview

Teaching: 20 min
Exercises: 20 min
Questions
• How can class relationships where one represents a specific subset of another be represented?

• How can functionality on one class be overridden or extended by its children?

Objectives
• Be able to use inheritance to construct parent-child relationships between classes

• Be able to override methods on child classes, and refer back to the parent class’s implementations

We have talked about using classes as a way to reduce repetition in the software we write. However, what happens if we want to write two classes that do similar but distinct things? For example, if we wanted to write a CubicPlotter as well as our QuadraticPlotter, would we need to repeat all of the code common to both of them? What if we wanted a QuarticPlotter and a QuinticPlotter as well? This repetitive code would quickly start to build up…

Thankfully, Python (like most other languages that have classes) gives us a mechanism to avoid this in the form of inheritance. A class that inherits from a second class automatically gains all of the second’s attributes and methods. The class that is being inherited from is called the parent class, superclass, or base class, while the new class inheriting from it is called the child class, subclass, or derived class.

We saw earlier that ValueError is a subclass of Exception, and that this can be used to handle both specific and more general exceptions in a hierarchy. We can also use this to define our own exceptions. Say, for example, we have a function to convert temperatures from degrees Celsius to degrees Fahrenheit, $$\theta_{\mathrm{F}}(\theta_{\mathrm{C}})=\frac{9}{5}\theta_{\mathrm{C}} + 32$$. We know that temperatures below absolute zero are not valid, so if we encounter those in our code we would like to raise the alarm as soon as possible. We could do this with an assert, but another way of expressing this is to define our own exception to flag it. A temperature below $$-273.15^\circ\mathrm{C}$$ is an example of a bad value, so our exception should inherit from ValueError.

class InvalidTemperatureError(ValueError):
    pass

def celsius_to_fahrenheit(temperature_c):
    if temperature_c < -273.15:
        raise InvalidTemperatureError
    return temperature_c * 9 / 5 + 32


The pass keyword here tells Python that while we have started an indented block, we don’t actually have anything to put in it. (If we were to omit it, Python would complain at us that it expected a block and didn’t get one.) So we have constructed a new class called InvalidTemperatureError, which is an exact copy of ValueError, except that it knows that ValueError is its parent. Let’s test this.

for temperature_c in 0, 100, -300:
    print(temperature_c, "degrees Celsius is",
          celsius_to_fahrenheit(temperature_c), "degrees Fahrenheit")

0 degrees Celsius is 32.0 degrees Fahrenheit
100 degrees Celsius is 212.0 degrees Fahrenheit
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "<stdin>", line 3, in celsius_to_fahrenheit
__main__.InvalidTemperatureError


If we wanted to, we could catch this exception with except InvalidTemperatureError or with except ValueError (or even except Exception).
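A sketch of how this might look in practice is given below. It repeats the definitions above so that it runs on its own, and additionally passes a message when raising the exception, which is an illustrative extension rather than part of the lesson code:

```python
class InvalidTemperatureError(ValueError):
    pass

def celsius_to_fahrenheit(temperature_c):
    if temperature_c < -273.15:
        raise InvalidTemperatureError(f"{temperature_c} is below absolute zero")
    return temperature_c * 9 / 5 + 32

results = []
for temperature_c in 0, 100, -300:
    try:
        results.append(celsius_to_fahrenheit(temperature_c))
    except ValueError as error:
        # InvalidTemperatureError is a ValueError, so it is caught here
        print("Skipping invalid temperature:", error)
        results.append(None)

print(results)    # [32.0, 212.0, None]
```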

What about if we want to add functionality? Let’s consider an example of a Polygon class, which can calculate its perimeter.

class Polygon:
    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

    def perimeter(self):
        '''Returns the perimeter of the polygon.'''
        return sum(self.side_lengths)

some_shape = Polygon([1, 2, 3, 4, 5])
print(some_shape.perimeter())

15


Now, we know more about triangles than we do about generic polygons, so we can create a specialised subclass of Polygon called Triangle. For example, for a triangle of sides $$a$$, $$b$$, and $$c$$, Heron’s formula states that the area of the triangle is given by $$\sqrt{p(p-a)(p-b)(p-c)}$$, where $$p=\frac{1}{2}(a+b+c)$$ is half the perimeter.

class Triangle(Polygon):
    def __init__(self, side_lengths):
        # Triangles have three sides
        assert len(side_lengths) == 3
        self.side_lengths = side_lengths

    def area(self):
        '''Returns the area of the triangle.'''
        a, b, c = self.side_lengths
        p = self.perimeter() / 2
        return (p * (p - a) * (p - b) * (p - c)) ** 0.5

a_triangle = Triangle([3, 4, 5])
print("Perimeter:", a_triangle.perimeter())
print("Area:", a_triangle.area())

Perimeter: 12
Area: 6.0


We’ve done a few new things here. Firstly, we’ve overridden the __init__ method of the Polygon parent class, since we now need to check that the sides that the shape is being given form a triangle, and not some other shape. This means that only the __init__ method from the Triangle class is called, and not the one in the Polygon class. Next, we’ve defined a new method area, which is only available on the Triangle class. We’ve also called the perimeter method, which is defined on the Polygon parent class—we don’t have to recreate this, since we can use it as-is.

One niggling issue is that we are still repeating ourselves a little here. The line self.side_lengths = side_lengths appears in the __init__ method of both classes. If we can, we’d like to remove this by using the equivalent method from the Polygon class. In principle we could use Polygon.__init__, but this still has some repetition, since we have to specify the name of the Polygon class more than once, even though the class knows what its parent class is.

What we can do instead is make use of the super() function. This gives us access to the superclass (and any superclasses further up the chain), without having to refer to any one of them by name. When we call a method of the super() object, Python automatically works its way up the tree until the first class which has a method of the correct name, and calls that. The Triangle class would then become:

class Triangle(Polygon):
    def __init__(self, side_lengths):
        # Triangles have three sides
        assert len(side_lengths) == 3
        super().__init__(side_lengths)

    def area(self):
        '''Returns the area of the triangle.'''
        a, b, c = self.side_lengths
        p = (a + b + c) / 2
        return (p * (p - a) * (p - b) * (p - c)) ** 0.5


(You can see that super() has also taken care of the self argument for us, which using Polygon directly wouldn’t do.)
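To make the remark about self concrete, here is a small sketch comparing the two spellings. The class names TriangleExplicit and TriangleSuper are made up for this illustration only.

```python
class Polygon:
    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

class TriangleExplicit(Polygon):
    def __init__(self, side_lengths):
        assert len(side_lengths) == 3
        # Naming the parent class: self must be passed by hand.
        Polygon.__init__(self, side_lengths)

class TriangleSuper(Polygon):
    def __init__(self, side_lengths):
        assert len(side_lengths) == 3
        # super() already knows which instance it is working on.
        super().__init__(side_lengths)

explicit = TriangleExplicit([3, 4, 5])
via_super = TriangleSuper([3, 4, 5])
print(explicit.side_lengths == via_super.side_lengths)
```

Both versions store the same side lengths; the super() form simply avoids repeating the parent's name and the self argument.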

While in this case we have only saved a single line of repetition, making use of super() becomes essential as methods become increasingly complex and build up functionality in layers.

## Not implemented

If we anticipate a lot of subclasses may provide a particular method, but we can’t or don’t want to provide it on the superclass, we can add a stub method that raises NotImplementedError instead, so that it becomes clear if an implementation has been forgotten. For example, the area method of Polygon could be:

    def area(self):
        raise NotImplementedError
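A short sketch of how such a stub behaves in practice follows. The subclasses Square and Pentagon are hypothetical and exist only to show one class that overrides area and one that has forgotten to.

```python
class Polygon:
    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

    def area(self):
        # Stub: subclasses are expected to override this.
        raise NotImplementedError

class Square(Polygon):
    # Hypothetical subclass that does provide area.
    def area(self):
        return self.side_lengths[0] ** 2

class Pentagon(Polygon):
    # Hypothetical subclass where area has been forgotten.
    pass

print(Square([2, 2, 2, 2]).area())

forgotten = False
try:
    Pentagon([1, 1, 1, 1, 1]).area()
except NotImplementedError:
    forgotten = True
    print("Pentagon does not implement area yet.")
```

The error only appears when the missing method is actually called, not when the subclass is defined or instantiated.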


## Inheriting from object

Sometimes in older Python code you will see classes inherit from object explicitly. This is a holdover from Python 2, where this was needed to create a “new-style” class instead of an “old-style” class. Old-style classes were removed in Python 3, with all classes being new-style ones which inherit from object automatically, so you don’t need to (and shouldn’t) do this any more.
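As a quick check of this in Python 3, the two hypothetical classes below (Implicit and Explicit are illustrative names) end up with the same single base class:

```python
class Implicit:
    pass

class Explicit(object):
    pass

# Both classes report object as their only direct base class.
print(Implicit.__bases__, Explicit.__bases__)
```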

## super() placement

A four-sided shape where one of the side lengths is zero is a triangle. We can adjust the __init__ method of the Polygon to reflect this by removing any zero-length sides before storing the list of side lengths. The method then becomes:

    def __init__(self, side_lengths):
        filtered_side_lengths = []
        for side_length in side_lengths:
            assert side_length >= 0
            if side_length > 0:
                filtered_side_lengths.append(side_length)
        self.side_lengths = filtered_side_lengths


How does this affect our implementation of Triangle.__init__? Adjust this so that Triangle([3, 4, 0, 5]) works, and Triangle([3, 4, 0]) does not.

## Solution

We now need to call super().__init__ before checking the lengths, and check the resulting instance variable rather than the side_lengths argument.

class Polygon:
    def __init__(self, side_lengths):
        filtered_side_lengths = []
        for side_length in side_lengths:
            assert side_length >= 0
            if side_length > 0:
                filtered_side_lengths.append(side_length)
        self.side_lengths = filtered_side_lengths

    def perimeter(self):
        '''Returns the perimeter of the polygon.'''
        return sum(self.side_lengths)

class Triangle(Polygon):
    def __init__(self, side_lengths):
        # Triangles have three sides
        super().__init__(side_lengths)
        assert len(self.side_lengths) == 3

    def area(self):
        '''Returns the area of the triangle.'''
        a, b, c = self.side_lengths
        p = (a + b + c) / 2
        return (p * (p - a) * (p - b) * (p - c)) ** 0.5

a_triangle = Triangle([3, 4, 0, 5])
print("Perimeter:", a_triangle.perimeter())
print("Area:", a_triangle.area())
b_triangle = Triangle([3, 4, 0])

Perimeter: 12
Area: 6.0
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-17-751f0372a229> in <module>()
27 print("Perimeter:", a_triangle.perimeter())
28 print("Area:", a_triangle.area())
---> 29 b_triangle = Triangle([3, 4, 0])

<ipython-input-17-751f0372a229> in __init__(self, side_lengths)
16         # Triangles have three sides
17         super().__init__(side_lengths)
---> 18         assert len(self.side_lengths) == 3
19
20     def area(self):

AssertionError:


Where to place your call to super() is an important thing to consider when writing subclasses!

## Rectangles

Write another subclass of Polygon to represent rectangles, and add a method to calculate their area.

## Solution

class Rectangle(Polygon):
    def __init__(self, side_lengths):
        super().__init__(side_lengths)
        num_sides = len(self.side_lengths)
        assert num_sides == 2 or num_sides == 4
        if num_sides == 2:
            # Unpack the filtered list, not the raw argument, so that
            # zero-length sides have already been removed
            width, height = self.side_lengths
            self.side_lengths = [width, height, width, height]
        else:
            assert self.side_lengths[0] == self.side_lengths[2]
            assert self.side_lengths[1] == self.side_lengths[3]

    def area(self):
        return self.side_lengths[0] * self.side_lengths[1]


## Polynomial plotters

In the previous episode, we wrote a QuadraticPlotter class for plotting quadratic functions. We know, however, that quadratics are not the only type of polynomial in the world.

Write a PolynomialPlotter class similar to QuadraticPlotter, and rewrite QuadraticPlotter to be a subclass of it.

## Solution

from numpy import linspace
from matplotlib.pyplot import subplots
from matplotlib.colors import is_color_like

class PolynomialPlotter:
    def __init__(self, color='red', linewidth=1, x_min=-10, x_max=10):
        assert is_color_like(color)
        self.color = color
        self.linewidth = linewidth
        self.x_min = x_min
        self.x_max = x_max

    def polynomial(self, x, coefficients):
        '''For a given x and list of n+1 coefficients [a, b, c, d, ...],
        returns the polynomial f(x) = ax^n + bx^(n-1) + cx^(n-2) + ...'''
        result = 0
        for coefficient in coefficients:
            result = result * x + coefficient
        return result

    def plot(self, coefficients):
        '''Given the list of coefficients [a, b, c, d, ...],
        plot the polynomial f(x) = ax^n + bx^(n-1) + cx^(n-2) + ... .
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, self.polynomial(x, coefficients),
                color=self.color, linewidth=self.linewidth)

class QuadraticPlotter(PolynomialPlotter):
    def plot(self, a, b, c):
        super().plot([a, b, c])


## More general function plotters

Taking this a step further, write a more general FunctionPlotter class, and adjust PolynomialPlotter to be a subclass of it.

## Solution

class FunctionPlotter:
    def __init__(self, color='red', linewidth=1, x_min=-10, x_max=10):
        assert is_color_like(color)
        self.color = color
        self.linewidth = linewidth
        self.x_min = x_min
        self.x_max = x_max

    def plot(self, function):
        '''Plot a function of a single argument.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, function(x), color=self.color, linewidth=self.linewidth)

class PolynomialPlotter(FunctionPlotter):
    def plot(self, coefficients):
        '''Given the list of coefficients [a, b, c, d, ...],
        plot the polynomial f(x) = ax^n + bx^(n-1) + cx^(n-2) + ... .
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        def polynomial(x):
            '''For a given x and list of n+1 coefficients [a, b, c, d, ...],
            returns the polynomial f(x) = ax^n + bx^(n-1) + cx^(n-2) + ...'''
            result = 0
            for coefficient in coefficients:
                result = result * x + coefficient
            return result
        super().plot(polynomial)

class QuadraticPlotter(PolynomialPlotter):
    def plot(self, a, b, c):
        '''Plot the line a * x ** 2 + b * x + c and output to the screen.
        x runs between x_min and x_max, with 1000 intermediary points.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        super().plot([a, b, c])


Defining a function within another function as we do in PolynomialPlotter is a useful way of parametrising functions without having to pass arguments every time.
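The same idea can be sketched without any plotting. The helper make_polynomial below is a hypothetical name, not part of the lesson's classes; it returns an inner function that remembers the coefficients it was created with.

```python
def make_polynomial(coefficients):
    '''Returns a function of x alone that evaluates the polynomial
    with the given coefficients [a, b, c, ...], highest power first.'''
    def polynomial(x):
        result = 0
        for coefficient in coefficients:
            result = result * x + coefficient
        return result
    return polynomial

quadratic = make_polynomial([1, 0, -4])   # represents x^2 - 4
print(quadratic(2), quadratic(3))
```

The caller only ever supplies x; the coefficients travel along with the inner function.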

## Key Points

• Putting a class in parentheses after the new class’s name in a class definition indicates that the new class is a subclass of the bracketed class (the parent class).

• The subclass inherits all of that parent class’s attributes and methods.

• Defining a method with the same name as one of the parent class’s overrides it.

• Use super() to access parent classes and their methods.

# Decorators, class methods, and properties

## Overview

Teaching: 20 min
Exercises: 25 min
Questions
• What is a decorator?

• How do I tag methods as being applicable to a class rather than an instance?

• How can I add logic to process changes to instance variables?

Objectives
• Understand the purpose of decorators and how they are implemented

• Be able to use @classmethod and @property

Sometimes when we are writing software we would like to be able to attach additional functionality to a variety of functions (or classes) without writing the functionality directly into the function. Python gives us some extra syntax to make this easier.

For example, say that we want to track what functions are being called in our program. We can write a function that will take a function that we want to track as an argument, and return a new function that outputs before and after calling the function we are interested in.

def track_this(function):
    def new_function():
        print("Entering", function)
        function()
        print("Leaving", function)

    return new_function


To test this out:

def say_hello():
    print("Hello, world.")

say_hello = track_this(say_hello)

def say_goodbye():
    print("See you later.")

say_goodbye = track_this(say_goodbye)

def conversation():
    say_hello()
    say_goodbye()

conversation = track_this(conversation)

conversation()

Entering <function conversation at 0x1115c07b8>
Entering <function say_hello at 0x11143be18>
Hello, world.
Leaving <function say_hello at 0x11143be18>
Entering <function say_goodbye at 0x1115c0e18>
See you later.
Leaving <function say_goodbye at 0x1115c0e18>
Leaving <function conversation at 0x1115c07b8>


So we can now see in more detail what’s going on as we move through this program. However, having to set each function to the result of calling track_this on the function is laborious; it would be nicer if there were an easier way to do this, and thankfully Python gives us one.

@track_this
def say_hello():
    print("Hello, world.")

@track_this
def say_goodbye():
    print("See you later.")

@track_this
def conversation():
    say_hello()
    say_goodbye()

conversation()


When we write @ followed by the name of the altering function (track_this in this case) on the line before a function definition, Python calls the altering function on the newly defined function and binds the function’s name to the result.

This syntax is called a “decorator”; the functions say_hello, say_goodbye, and conversation have been decorated with the track_this decorator.

Our @track_this decorator is currently not very general. We can see a problem when we try and decorate a function that takes arguments:

## Decorators and arguments

@track_this
def say_something(thing_to_say):
    print(thing_to_say)

say_something("Hello there")

TypeError                                 Traceback (most recent call last)
<ipython-input-29-000c7283eed1> in <module>()
3     print(thing_to_say)
4
----> 5 say_something("Hello there")

TypeError: new_function() takes 0 positional arguments but 1 was given


To make this more flexible, we can rewrite the track_this decorator as:

def track_this(function):
    def new_function(*args, **kwargs):
        print("Entering", function)
        function(*args, **kwargs)
        print("Leaving", function)

    return new_function


The * and ** here carry two meanings. In the definition def new_function(*args, **kwargs), they mean “take any positional arguments and put them into a tuple called args, and take any keyword arguments and put them into a dict called kwargs”. In the function call function(*args, **kwargs), they mean “pass each element of args as a separate positional argument, and pass each key-value pair of the dict kwargs as a keyword argument”.
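Both meanings can be seen in isolation in the sketch below. The functions show_arguments and describe are hypothetical helpers written for this illustration; note that args arrives as a tuple.

```python
def show_arguments(*args, **kwargs):
    # In a definition: args collects positional arguments into a tuple,
    # and kwargs collects keyword arguments into a dict.
    return args, kwargs

def describe(name, greeting="Hello"):
    return f"{greeting}, {name}."

collected_args, collected_kwargs = show_arguments(1, 2, colour="red")
print(collected_args, collected_kwargs)

# In a call: equivalent to describe("world", greeting="Goodbye")
positional = ["world"]
keywords = {"greeting": "Goodbye"}
print(describe(*positional, **keywords))
```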

You can also write and use decorators that themselves accept arguments by using a nested function definition, but we won’t go into detail about this today.
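For the curious, one possible shape of such a decorator is sketched here. The name track_with_label and its label parameter are invented for this example. The sketch also passes the decorated function's return value back to the caller and uses functools.wraps from the standard library to keep the original function's name.

```python
from functools import wraps

def track_with_label(label):
    '''Hypothetical decorator factory: returns a decorator that
    prints label before and after the decorated function runs.'''
    def decorator(function):
        @wraps(function)  # keep the original function's name and docstring
        def new_function(*args, **kwargs):
            print(label, "entering", function.__name__)
            result = function(*args, **kwargs)
            print(label, "leaving", function.__name__)
            return result  # pass the return value back to the caller
        return new_function
    return decorator

@track_with_label("[debug]")
def add(a, b):
    return a + b

print(add(1, 2))
```

The outermost function receives the decorator's own arguments, the middle one receives the function being decorated, and the innermost one replaces it.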

## Double checking

Try writing a decorator that checks the result of a computation is consistent by running it twice and checking the outputs are equal. This should return the result if it is consistent, and raise an exception otherwise. Test it by decorating the area and perimeter methods of the Polygon and Triangle classes from the previous episode.

## Solution

class InconsistentResultsError(AssertionError):
    pass

def check_consistency(function):
    def consistent_function(*args, **kwargs):
        results = [function(*args, **kwargs) for _ in range(2)]
        if results[0] != results[1]:
            raise InconsistentResultsError
        return results[0]

    return consistent_function

class Polygon:
    def __init__(self, side_lengths):
        filtered_side_lengths = []
        for side_length in side_lengths:
            assert side_length >= 0
            if side_length > 0:
                filtered_side_lengths.append(side_length)
        self.side_lengths = filtered_side_lengths

    @check_consistency
    def perimeter(self):
        '''Returns the perimeter of the polygon.'''
        return sum(self.side_lengths)

class Triangle(Polygon):
    def __init__(self, side_lengths):
        # Triangles have three sides
        super().__init__(side_lengths)
        assert len(self.side_lengths) == 3

    @check_consistency
    def area(self):
        '''Returns the area of the triangle.'''
        a, b, c = self.side_lengths
        p = (a + b + c) / 2
        return (p * (p - a) * (p - b) * (p - c)) ** 0.5

a_triangle = Triangle([3, 4, 5])
print("Perimeter:", a_triangle.perimeter())
print("Area:", a_triangle.area())


## Class methods

Sometimes we want to write functions associated with classes that are relevant to the class as a whole, rather than to one specific instance. We can do this by adding the @classmethod decorator to a method.

Class methods are most frequently used as specialised constructors, to create instances of the class without having to supply every argument to __init__.

For example, revisiting the Triangle class from earlier, we may want to be able to define an equilateral triangle by giving a single side length.

class Triangle(Polygon):
    def __init__(self, side_lengths):
        # Triangles have three sides
        super().__init__(side_lengths)
        assert len(self.side_lengths) == 3

    @classmethod
    def equilateral(cls, side_length):
        return cls([side_length] * 3)

    def area(self):
        '''Returns the area of the triangle.'''
        a, b, c = self.side_lengths
        p = (a + b + c) / 2
        return (p * (p - a) * (p - b) * (p - c)) ** 0.5


Notice that in addition to adding the @classmethod decorator, the first argument which is usually self has been replaced with cls. Since class methods aren’t specific to a particular instance, there is no need to have the self argument referring to it. Conversely, it is useful to be able to refer to the specific class without having to do this by name, since in general we would like class methods to work and return the correct type of class for subclasses as well.
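The point about subclasses can be sketched as follows. The regular class method and the LabelledPolygon subclass are hypothetical, added only to show that cls is whichever class the method was called on.

```python
class Polygon:
    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

    @classmethod
    def regular(cls, num_sides, side_length):
        # cls is the class the method was called on, not always Polygon.
        return cls([side_length] * num_sides)

class LabelledPolygon(Polygon):
    # Hypothetical subclass; it inherits regular() unchanged.
    pass

plain = Polygon.regular(5, 2)
labelled = LabelledPolygon.regular(5, 2)
print(type(plain).__name__, type(labelled).__name__)
```

Had regular been written as return Polygon(...), the second call would have produced a plain Polygon rather than a LabelledPolygon.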

Let’s test this now.

e_triangle = Triangle.equilateral(1.5)
print("Perimeter:", e_triangle.perimeter())
print("Area:", e_triangle.area())

Perimeter: 4.5
Area: 0.9742785792574935


Now we only need to supply a single number, the length of the side, and the equilateral class method constructs the list of three equal side lengths from this, returning a Triangle with three equal sides.

## Squares

Add a class method to the Rectangle class which you wrote in the previous episode to create a square, given the length of its side.

## Solution

class Rectangle(Polygon):
    def __init__(self, side_lengths):
        super().__init__(side_lengths)
        num_sides = len(self.side_lengths)
        assert num_sides == 2 or num_sides == 4
        if num_sides == 2:
            # Unpack the filtered list, not the raw argument, so that
            # zero-length sides have already been removed
            width, height = self.side_lengths
            self.side_lengths = [width, height, width, height]
        else:
            assert self.side_lengths[0] == self.side_lengths[2]
            assert self.side_lengths[1] == self.side_lengths[3]

    def area(self):
        return self.side_lengths[0] * self.side_lengths[1]

    @classmethod
    def square(cls, side_length):
        return cls([side_length] * 4)


## Properties

In general, when working with classes, there is an assumption that instance variables can be modified, unless something is done to prevent this. In some languages, variables can be defined as read-only, or private so that they cannot be seen from outside of the class. Python has neither of these—any instance variable can be modified by any piece of code using the class. There is, however, a convention that variables and methods whose names begin with _ are private to the implementation—while they can be accessed from outside the class, they are not guaranteed to remain stable between versions, and the class doesn’t guarantee to behave well if they are changed.

To look at a specific example, what happens if we take the Triangle class and change side_lengths?

a_triangle = Triangle([3, 4, 5])
a_triangle.side_lengths = [3, 4, 5, 6]
print(a_triangle.area())

     11     def area(self):
12         '''Returns the area of the triangle.'''
---> 13         a, b, c = self.side_lengths
14         p = (a + b + c) / 2
15         return (p * (p - a) * (p - b) * (p - c)) ** 0.5

ValueError: too many values to unpack (expected 3)


Our implementation of area assumes that side_lengths was validated by __init__ and so has three elements, all positive. By adding a fourth, the implementation becomes broken as a list of four elements can’t be unpacked to three variables. Similarly,

a_polygon = Polygon([1, 2, 3, 4, 5])
a_polygon.side_lengths = 'spam and eggs'
a_polygon.perimeter()

TypeError                                 Traceback (most recent call last)
<ipython-input-50-9b8ffd74de36> in <module>()
1 a_polygon = Polygon([1, 2, 3, 4, 5])
2 a_polygon.side_lengths = 'spam and eggs'
----> 3 a_polygon.perimeter()

<ipython-input-49-5ae60040e7be> in perimeter(self)
10     def perimeter(self):
11         '''Returns the perimeter of the polygon.'''
---> 12         return sum(self.side_lengths)
13
14

TypeError: unsupported operand type(s) for +: 'int' and 'str'


It doesn’t make sense to take the sum of a string (or more precisely, to add the individual characters together), so this also fails.

One way to signal that this shouldn’t happen is to mark side_lengths as private by renaming it to _side_lengths. However, this removes some potentially useful functionality: it would definitely be useful for a user of the class to be able to read the side lengths, just not write them directly. Python provides us with an @property decorator that lets us do this.

class Polygon:
    def __init__(self, side_lengths):
        filtered_side_lengths = []
        for side_length in side_lengths:
            assert side_length >= 0
            if side_length > 0:
                filtered_side_lengths.append(side_length)
        self._side_lengths = filtered_side_lengths

    def perimeter(self):
        '''Returns the perimeter of the polygon.'''
        return sum(self._side_lengths)

    @property
    def side_lengths(self):
        return self._side_lengths


We have done two things here: self.side_lengths has been renamed to self._side_lengths, indicating that it is intended to be considered as private to the class. We have also added a new method side_lengths, and decorated that with the @property decorator. This allows the result of calling this function to be accessed as though it were an instance variable:

a_polygon = Polygon([1, 2, 3, 4, 5])
print(a_polygon.side_lengths)

[1, 2, 3, 4, 5]


However, we can’t assign to it without referring to the private _side_lengths:

a_polygon.side_lengths = 'spam and eggs'

AttributeError                            Traceback (most recent call last)
<ipython-input-57-9b8ffd74de36> in <module>()
1 a_polygon = Polygon([1, 2, 3, 4, 5])
----> 2 a_polygon.side_lengths = 'spam and eggs'
3 a_polygon.perimeter()

AttributeError: can't set attribute


So we have now successfully “protected” our class from changes that will break it, by signalling to users of it what is internal to the implementation, and what is designed for them to use. However, we have still removed a little functionality in the process—previously, a user could change side_lengths without breaking things, provided that they were careful (since the consistency checks of __init__ were being bypassed), whereas now this is not supported behaviour (even if it is possible).

What we would like is to offer the ability to set the value for the property, but add some kind of validation function that does that rather than allowing it to be assigned directly. This kind of function is called a setter, and the @property decorator in fact allows us to create one. (The first function, which gets the value, is referred to as a getter.)

class Polygon:
    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

    def perimeter(self):
        '''Returns the perimeter of the polygon.'''
        return sum(self._side_lengths)

    @property
    def side_lengths(self):
        return self._side_lengths

    @side_lengths.setter
    def side_lengths(self, side_lengths):
        filtered_side_lengths = []
        for side_length in side_lengths:
            assert side_length >= 0
            if side_length > 0:
                filtered_side_lengths.append(side_length)
        self._side_lengths = filtered_side_lengths


We’ve moved the validation logic into the second side_lengths method, the one decorated with @side_lengths.setter, and the __init__ method now assigns to self.side_lengths so that it uses this setter to do its initial setup. Testing this:

a_polygon = Polygon([1, 2, 3, 4, 5])
print("Original perimeter:", a_polygon.perimeter())
a_polygon.side_lengths = ([1, 2, 3, 4, 5, 6])
print("Modified perimeter:", a_polygon.perimeter())

Original perimeter: 15
Modified perimeter: 21
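It may also help to see the setter's validation at work. The sketch below repeats the getter and setter so that it runs on its own (perimeter is omitted for brevity), then tries one assignment that gets filtered and one that gets rejected.

```python
class Polygon:
    def __init__(self, side_lengths):
        self.side_lengths = side_lengths  # goes through the setter

    @property
    def side_lengths(self):
        return self._side_lengths

    @side_lengths.setter
    def side_lengths(self, side_lengths):
        filtered_side_lengths = []
        for side_length in side_lengths:
            assert side_length >= 0
            if side_length > 0:
                filtered_side_lengths.append(side_length)
        self._side_lengths = filtered_side_lengths

a_polygon = Polygon([1, 2, 3])
a_polygon.side_lengths = [3, 0, 4, 5]    # the zero-length side is dropped
print(a_polygon.side_lengths)

try:
    a_polygon.side_lengths = [3, -4, 5]  # a negative side fails validation
except AssertionError:
    print("Rejected; still", a_polygon.side_lengths)
```

Because the private variable is only assigned once every side has passed the check, a rejected assignment leaves the previous value in place.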


## More robust plotters

Adjust the FunctionPlotter, PolynomialPlotter, or QuadraticPlotter example from earlier to make color a property, with a getter and a setter, with the setter checking that the color is a valid matplotlib color.

## Solution

from numpy import linspace
from matplotlib.pyplot import subplots
from matplotlib.colors import is_color_like

class FunctionPlotter:
    def __init__(self, color='red', linewidth=1, x_min=-10, x_max=10):
        self.color = color
        self.linewidth = linewidth
        self.x_min = x_min
        self.x_max = x_max

    @property
    def color(self):
        return self._color

    @color.setter
    def color(self, color):
        assert is_color_like(color)
        self._color = color

    def plot(self, function):
        '''Plot a function of a single argument.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, function(x), color=self._color, linewidth=self.linewidth)


## Key Points

• A decorator adds functionality to a class or function. To use the decoratorname decorator, add @decoratorname one line before the class or function definition.

• Use the @classmethod decorator to indicate methods to be called from the class rather than from an instance.

• Use the @property decorator to control access to instance variables.

# Special methods

## Overview

Teaching: 25 min
Exercises: 15 min
Questions
• How can classes allow their instances to work with standard Python operators?

• How can classes allow their instances to behave like iterables or collections?

• How can classes allow their instances to be called like functions?

Objectives
• Be able to implement methods like __add__, __eq__, and __gt__.

• Be able to implement methods like __len__, __iter__, and __reversed__.

• Be able to implement the __call__ method.

In the previous episodes, we built a Triangle class that could represent a triangle by storing the lengths of its sides. Now, mathematically speaking, two triangles with identical sides are the same triangle. Let’s see if Python agrees with this.

a_triangle = Triangle([3, 4, 5])
the_same_triangle = Triangle([3, 4, 5])
if a_triangle == the_same_triangle:
    print("Python thinks that these triangles are the same.")
else:
    print("Python thinks that these are different triangles.")

Python thinks that these are different triangles.


So, despite these triangles having been constructed with exactly the same side lengths, Python distinguishes between them. By default, Python will only consider two objects to be equal if they are identical, that is, if they are the very same object:

a_triangle = Triangle([3, 4, 5])
duplicate_triangle = a_triangle
if a_triangle == duplicate_triangle:
    print("Python thinks that these triangles are the same.")
else:
    print("Python thinks that these are different triangles.")

Python thinks that these triangles are the same.
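The default behaviour can be reproduced with any class that does not define __eq__. The Point class here is a hypothetical stand-in; the sketch also shows the is operator, which tests identity directly.

```python
class Point:
    # Hypothetical class with no __eq__ of its own.
    def __init__(self, x, y):
        self.x = x
        self.y = y

first = Point(1, 2)
second = Point(1, 2)
alias = first

print(first == second)   # same contents, but two separate objects
print(first == alias)    # two names for one object
print(first is alias)    # "is" tests identity directly
```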


This isn’t great for our triangle example—we would much prefer if we could compare equality of triangles without having to compare the side_lengths property by hand. Fortunately, Python gives us a way of doing this. If we implement the __eq__ method of Triangle, then Python learns how to compare triangles.

class Triangle(Polygon):
    def __init__(self, side_lengths):
        # Triangles have three sides
        super().__init__(side_lengths)
        assert len(self.side_lengths) == 3

    @classmethod
    def equilateral(cls, side_length):
        return cls([side_length] * 3)

    def area(self):
        '''Returns the area of the triangle.'''
        a, b, c = self.side_lengths
        p = (a + b + c) / 2
        return (p * (p - a) * (p - b) * (p - c)) ** 0.5

    def __eq__(self, other):
        '''Returns True if the triangle self and the triangle other
        are the same triangle'''
        if not isinstance(other, Triangle):
            return False
        else:
            # Check all cyclic rotations of the sides
            # (mirror images are not matched)
            if self.side_lengths == other.side_lengths:
                return True
            elif (self.side_lengths[1:] + [self.side_lengths[0]] ==
                  other.side_lengths):
                return True
            elif ([self.side_lengths[2]] + self.side_lengths[:-1] ==
                  other.side_lengths):
                return True
            return False

a_triangle = Triangle([3, 4, 5])
the_same_triangle = Triangle([3, 4, 5])
if a_triangle == the_same_triangle:
    print("Python thinks that these triangles are the same.")
else:
    print("Python thinks that these are different triangles.")

Python thinks that these triangles are the same.


Great! We can compare equality. __eq__ is the second example we’ve seen of a so-called “special”, “magic”, or “dunder” (short for “double underscore”) method. These are methods that Python ascribes a special meaning to; it guards the names of these with a double underscore __ on each side, so it is unlikely to collide with a name you might want to use for a method of your own. These methods allow us to enable instances of our classes to behave more like Python objects you’re used to dealing with, using the typical set of operators, rather than needing to use method calls for everything.

Let’s look at some more examples of these. Firstly, wouldn’t it be nice if we got something more descriptive when Python referred to our Triangles?

a_triangle

<__main__.Triangle at 0x1235e7b00>


We can do this by implementing the __repr__ method (short for “representation”). This is designed to be something that looks like Python code—ideally, something that if you pasted it back in, you’d get the same (or at least a similar) object. For the Triangle, this could look like:

    def __repr__(self):
        return f'Triangle({self.side_lengths})'


Testing this now gives:

a_triangle = Triangle([3, 4, 5])
a_triangle

Triangle([3, 4, 5])


## Other comparisons

What about if we want to know how two objects compare to each other? We can do this by implementing __lt__, __gt__, __le__, __ge__, and __ne__, representing <, >, <=, >=, and != respectively. For example, an implementation of __lt__ might look like:

    def sides_with_max_first(self):
        '''Returns the side lengths rotated so the longest comes first.'''
        max_index = self.side_lengths.index(max(self.side_lengths))
        if max_index == 0:
            return self.side_lengths
        elif max_index == 1:
            return self.side_lengths[1:] + [self.side_lengths[0]]
        else:
            return [self.side_lengths[2]] + self.side_lengths[:2]

    def __lt__(self, other):
        if self.area() != other.area():
            return self.area() < other.area()
        elif self.perimeter() != other.perimeter():
            return self.perimeter() < other.perimeter()
        elif self == other:
            return False
        else:
            return self.sides_with_max_first() < other.sides_with_max_first()


Testing this:

a_triangle = Triangle([3, 4, 5])
b_triangle = Triangle([5, 12, 13])
a_triangle < b_triangle

True


Python then does two very nice things for us: firstly, since we have defined __lt__, it can now sort lists of Triangles for us. And while we could leave only the < operator defined, this could be confusing for those using the class; fortunately, given implementations of __eq__ and __lt__, Python can automatically generate the other relational operators using the functools.total_ordering decorator.

from functools import total_ordering

@total_ordering
class Triangle(Polygon):
    ...
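Here is a self-contained sketch of what the decorator provides. The Length class is a hypothetical example that defines only __eq__ and __lt__ and leaves the remaining comparisons to functools.total_ordering.

```python
from functools import total_ordering

@total_ordering
class Length:
    # Hypothetical class that compares by a single number.
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Length):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, Length):
            return NotImplemented
        return self.value < other.value

    def __repr__(self):
        return f'Length({self.value})'

lengths = [Length(3), Length(1), Length(2)]
print(sorted(lengths))          # sorting relies on __lt__
print(Length(1) <= Length(2))   # <= is generated by total_ordering
print(Length(3) >= Length(2))   # >= is generated by total_ordering
```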


## Sorting random triangles

Add a class method that generates a triangle with three random edge lengths (for example, using random.random()). Use this to construct and sort a list of 10 random triangles.

## Solution

Add an import at the top of the file:

from random import random


    @classmethod
    def random(cls):
        '''Returns a triangle with three random length sides in the
        range [0, 1).
        If the sum of the two short sides isn't longer than the
        long side (and so the triangle doesn't close), then try
        again. There is an infinitesimal probability that this
        method will never return, as randomness keeps delivering
        invalid triangles.'''

        random_triangle = cls([random(), random(), random()])
        while isinstance(random_triangle.area(), complex):
            random_triangle = cls([random(), random(), random()])
        return random_triangle


Testing this:

random_triangles = [Triangle.random() for _ in range(10)]
[triangle.area() for triangle in sorted(random_triangles)]


## Arithmetic

In the same way that __lt__ and friends correspond to relational operators, arithmetic operations like +, -, *, etc. can be defined with methods like __add__, __sub__, and __mul__.

Define a new class ErrorBar to represent a number with an associated error in Gaussian statistics. Add __init__, __repr__, __add__, __sub__, __mul__, and __truediv__ methods, making the (very unreasonable) assumption that all errors are uncorrelated.

## Solution

class ErrorBar:
    def __init__(self, centre, error):
        self.centre = centre
        self.error = error

    def __repr__(self):
        return f'{self.centre} ± {self.error}'

    def __add__(self, other):
        centre = self.centre + other.centre
        error = (self.error ** 2 + other.error ** 2) ** 0.5
        return ErrorBar(centre, error)

    def __sub__(self, other):
        centre = self.centre - other.centre
        error = (self.error ** 2 + other.error ** 2) ** 0.5
        return ErrorBar(centre, error)

    def __mul__(self, other):
        centre = self.centre * other.centre
        error = centre * ((self.error / self.centre) ** 2 +
                          (other.error / other.centre) ** 2) ** 0.5
        return ErrorBar(centre, error)

    def __truediv__(self, other):
        centre = self.centre / other.centre
        error = centre * ((self.error / self.centre) ** 2 +
                          (other.error / other.centre) ** 2) ** 0.5
        return ErrorBar(centre, error)


## Callable objects

By implementing the __call__ method, we can allow instances of a class to be called like functions. For example, returning to the FunctionPlotter example:

from numpy import linspace, sin
from matplotlib.colors import is_color_like
from matplotlib.pyplot import subplots

class FunctionPlotter:
    def __init__(self, color='red', linewidth=1, x_min=-10, x_max=10):
        self.color = color
        self.linewidth = linewidth
        self.x_min = x_min
        self.x_max = x_max

    @property
    def color(self):
        return self._color

    @color.setter
    def color(self, color):
        assert is_color_like(color)
        self._color = color

    def plot(self, function):
        '''Plot a function of a single argument.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, function(x), color=self._color, linewidth=self.linewidth)

    def __call__(self, *args, **kwargs):
        return self.plot(*args, **kwargs)

plotter = FunctionPlotter()
plotter(sin)


## Subclassing with __call__

Do we need to redefine __call__ on each subclass of FunctionPlotter to get the correct version of the plot() function? Why/why not?

## Solution

No. self refers to the current instance, so the call to self.plot() inside __call__ will pick up the version of plot() defined by whichever class the instance belongs to.
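
A minimal sketch of this behaviour, using hypothetical Announcer and LoudAnnouncer classes (not part of the lesson) so that it runs without Matplotlib. Only the parent defines __call__, yet calling a subclass instance uses the subclass's version of the method:

```python
class Announcer:
    '''Hypothetical stand-in for FunctionPlotter.'''

    def announce(self, message):
        return f'Note: {message}'

    def __call__(self, *args, **kwargs):
        # self is whichever instance was called, so this looks up
        # announce() on that instance's class first.
        return self.announce(*args, **kwargs)


class LoudAnnouncer(Announcer):
    '''Overrides announce() but does not redefine __call__.'''

    def announce(self, message):
        return f'NOTE: {message.upper()}'


print(Announcer()('hello'))
print(LoudAnnouncer()('hello'))
```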

## Collections and iterables

Python also gives us the power to make our objects behave like iterable or collection types (for example tuples, lists, dicts, and generators). To make instances of a class work with the len() function, we implement __len__(). For example, adding this to the Polygon class:

    def __len__(self):
        return len(self.side_lengths)


will define the length of the object as the number of edges that the Polygon has. (Note that we shouldn’t make this the perimeter—Python expects len() to return a non-negative integer.) Testing this,

a_polygon = Polygon([1, 2, 3, 4, 5])
print(len(a_polygon))

5


We can also let our code loop over elements of our objects by implementing the __iter__() method, which should return an iterator; this is a particular type of object in Python that makes things like for loops work. We can get one of these from any iterable via the iter() function.

    def __iter__(self):
        return iter(self.side_lengths)


We can now iterate through the sides of our Polygons without having to get the side_lengths property each time.

a_polygon = Polygon([1, 2, 3, 4, 5])
for side_length in a_polygon:
    print(side_length)

1
2
3
4
5


## In reverse

The reversed() function returns an iterator over the elements of an iterable or collection going backwards. This is implemented for classes via the __reversed__ method. Implement this for the Polygon class, and test your implementation.

## Solution

Method:

    def __reversed__(self):
        return reversed(self.side_lengths)


Test:

a_polygon = Polygon([1, 2, 3, 4, 5])
for side_length in reversed(a_polygon):
    print(side_length)

5
4
3
2
1


## Getting specific elements

You can also allow your code to access elements via square brackets, just like with lists. The __getitem__() method does this, taking the index (or key) being sought as its argument.

For a non-dict-like collection, __getitem__() can work for both integer indices and for slices.

Implement __getitem__() for the Polygon class. Since in our current implementation of Polygon, it doesn’t make sense to take a subset of the sides, requesting a slice should raise IndexError; only requesting a single element with an integer index should work.

## Solution

Implementation:

    def __getitem__(self, key):
        if type(key) is int:
            return self.side_lengths[key]
        else:
            raise IndexError


Test:

a_polygon = Polygon([1, 2, 3, 4, 5])
print(a_polygon[2])

3


## for loops with __getitem__()

Once a class has __getitem__() defined, Python will automatically work out how to loop over it, even in the absence of __iter__() (although adding this does make it more efficient). Even better, when __len__() is also implemented, Python automatically knows how to apply reversed() to the class as well.

Test this by removing the implementations of __iter__() and __reversed__() from Polygon and testing the loops forwards and backwards again.
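
As a rough sketch of what this test might look like, the hypothetical SideList class below stands in for a Polygon that has had __iter__() and __reversed__() removed. It assumes, as in the snippets above, that the sides are stored in a list called side_lengths. Forward iteration works because Python calls __getitem__() with 0, 1, 2, and so on until IndexError is raised, which the underlying list does for us once the index runs past the end.

```python
class SideList:
    '''Hypothetical stand-in for a Polygon with no __iter__ or __reversed__.'''

    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

    def __len__(self):
        return len(self.side_lengths)

    def __getitem__(self, key):
        if type(key) is int:
            # The list raises IndexError for an out-of-range index,
            # which is what tells a for loop to stop.
            return self.side_lengths[key]
        else:
            raise IndexError


sides = SideList([1, 2, 3, 4, 5])
forwards = [side for side in sides]
backwards = [side for side in reversed(sides)]
print(forwards)
print(backwards)
```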

## More dunder methods

Python offers many more dunder methods than could possibly be covered in this episode. A full listing, categorised by the functions that they serve, can be found in the Python documentation.

## Key Points

• Implement methods like __eq__, __add__, and __gt__ to allow operations such as arithmetic and comparisons.

• Implement __repr__ to get more meaningful printouts when you output an object.

• Implement methods like __len__, __iter__, and __reversed__ to make instances of a class behave like a collection or iterable.

• Implement the __call__ method to make instances of a class callable like functions.

# Duck typing and interfaces

## Overview

Teaching: 10 min
Exercises: 15 min
Questions
• How does Python decide what you can and can’t do with an object?

• When is inheritance not appropriate?

• What alternatives are there to inheritance?

Objectives
• Understand how duck typing works, and how interfaces assist with understanding this.

• Understand the circumstances where inheritance can be a hindrance rather than a help.

• Be aware of concepts such as composition which can help where inheritance fails.

There is a principle that if something “looks like a duck, and swims like a duck, and quacks like a duck, then it is probably a duck”.

Python’s type system adopts a similar philosophy: if an object looks, swims, and quacks like a duck, and those are the only duck-like aspects that we need at a particular time, then as far as Python is concerned, it is a duck.

For example, the Newton–Raphson method solves equations of the form $$f(x)=0$$ iteratively from a starting point $$x_0$$ as

$$x_{n+1}=x_n - \frac{f(x_n)}{f'(x_n)}\;.$$

We could implement this in Python as:

def newton(function, derivative, initial_estimate, num_iters=10):
    '''Solves the equation function(x) == 0 using the Newton-Raphson
    method with num_iters iterations, starting from initial_estimate.
    derivative is the derivative of function with respect to x.'''

    current_estimate = initial_estimate
    for _ in range(num_iters):
        current_estimate = (
            current_estimate
            - function(current_estimate) / derivative(current_estimate)
        )
    return current_estimate


This clearly works with functions that operate on and return real numbers.

from math import sin, cos

print(newton(sin, cos, 1))
print(newton(sin, cos, 2))
print(newton(sin, cos, 1.5))

0.0
3.141592653589793
-12.566370614359172


If you only planned for this to work with real numbers, you might think of adding a check at the start of the function that the initial_estimate given is a real number, or that each successive current_estimate is real. However, if we think about this in a duck typed way, we don’t really need to care about this—provided that the values can be subtracted and divided, and function and derivative can operate on them, then the algorithm will work.
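
As a small illustration of this point, the sketch below repeats the newton function from above and passes it a fractions.Fraction from the standard library as the starting point. The test function x * x - 2 and the helper names are our own choices, purely for illustration. Because Fraction supports subtraction and division, every estimate stays an exact rational number without any change to newton.

```python
from fractions import Fraction


def newton(function, derivative, initial_estimate, num_iters=10):
    '''Same algorithm as above, repeated so this example is self-contained.'''
    current_estimate = initial_estimate
    for _ in range(num_iters):
        current_estimate = (
            current_estimate
            - function(current_estimate) / derivative(current_estimate)
        )
    return current_estimate


def square_minus_two(x):
    return x * x - 2


def two_x(x):
    return 2 * x


# Three iterations starting from the exact rational number 1
estimate = newton(square_minus_two, two_x, Fraction(1), num_iters=3)
print(estimate)         # 577/408
print(float(estimate))  # close to the square root of 2
```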

This means that we can apply this function to cases we may not have considered. For example, when $$f(z)$$ is a polynomial, then plotting the solution $$z_n$$ (which is now a complex number) obtained as a function of the initial estimate $$z_0$$ gives us Newton’s fractal.

%matplotlib inline
from numpy import angle, linspace, newaxis, pi
from matplotlib.pyplot import colorbar, subplots

def complex_linspace(lower, upper, num_real, num_imag):
    real_space = linspace(lower.real, upper.real, num_real)
    imag_space = linspace(lower.imag, upper.imag, num_imag) * 1J
    return real_space + imag_space[:, newaxis]

def test_polynomial(x):
    return x ** 3 - 1

def test_derivative(x):
    return 3 * x ** 2

z_min = -1 - 1J
z_max = 1 + 1J
initial_z = complex_linspace(z_min, z_max, 1000, 1000)

results = newton(test_polynomial, test_derivative, initial_z, 20)

fig, ax = subplots()
image = ax.imshow(angle(results), vmin=-3, vmax=3,
                  extent=(z_min.real, z_max.real, z_min.imag, z_max.imag))
cbar = colorbar(image, ax=ax, ticks=(-2*pi/3, 0, 2*pi/3))
cbar.set_label(r'$\arg(z_n)$')
cbar.ax.set_yticklabels((r'$-\frac{2\pi}{3}$', '0', r'$\frac{2\pi}{3}$'))
ax.set_xlabel(r'$\operatorname{Re}(z_0)$')
ax.set_ylabel(r'$\operatorname{Im}(z_0)$')


Because our Newton–Raphson function was duck typed, it automatically worked for this problem, despite this problem requiring Numpy arrays of complex numbers rather than the real numbers we thought we were writing for.

## Protocols

It is frequently useful to codify exactly what requirements are placed on an object (or duck) so that we can design classes to match. In Python, when these requirements are documented, the specification is called a protocol; you may also hear this described as an (informal) interface.

An example of a well-known protocol in Python is the iterator protocol, which should be obeyed by objects returned by the __iter__() method. For a class to support the iterator protocol, it must have two methods:

• __iter__(), which returns the object itself. (This is so that an iterator can be given to a for loop directly, which is sometimes desirable rather than relying on it being returned by the __iter__() method of a collection-type object.)
• __next__(), which returns the next item in the sequence. If there are no more items, then this should raise the StopIteration exception, and successive calls should keep raising this exception.
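
To see the two methods at work, the sketch below drives a built-in list iterator by hand with iter() and next(), which call __iter__() and __next__() respectively. This is roughly what a for loop does behind the scenes.

```python
numbers = [10, 20]
iterator = iter(numbers)          # calls numbers.__iter__()

# An iterator's own __iter__() returns the iterator itself
print(iter(iterator) is iterator)

first = next(iterator)            # calls iterator.__next__()
second = next(iterator)
print(first, second)

# Once exhausted, every further call raises StopIteration
exhausted_count = 0
for _ in range(2):
    try:
        next(iterator)
    except StopIteration:
        exhausted_count += 1
print(exhausted_count)
```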

For instance, an iterator that returns the Fibonacci numbers up to some upper bound may look something like:

class FibonacciIterator:
    def __init__(self, max_value):
        self.max_value = max_value
        self.last_two_numbers = (1, 0)

    def __iter__(self):
        return self

    def __next__(self):
        next_number = sum(self.last_two_numbers)
        self.last_two_numbers = (self.last_two_numbers[1], next_number)
        if self.max_value < next_number:
            raise StopIteration
        else:
            return next_number


This is an example of where we want to use the iterator directly with the for loop, since we want to initialise it with a max_value. Testing this:

for number in FibonacciIterator(100):
    print(number)

1
1
2
3
5
8
13
21
34
55
89


## Triangular numbers

Write a class that implements the iterator protocol and that returns the first $$n$$ triangular numbers. These are defined such that the $$n$$th triangular number is the sum of the first $$n$$ positive integers, so the first five are 1, 3, 6, 10, and 15.

What would happen if you removed the upper bound (and so never raised StopIteration) and used the iterator in a for loop? When might this behaviour be useful?

## Solution

class TriangularIterator:
    def __init__(self, n):
        self.n = n
        self.total = 0
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= self.n:
            raise StopIteration
        self.index += 1
        self.total += self.index
        return self.total


A loop over an iterator that can’t raise StopIteration will run forever. This could be useful if you’re using zip() to iterate over another, bounded, iterable at the same time; then each element will get a corresponding triangular number, no matter how many elements there are.
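
A short sketch of the zip() case, using a hypothetical unbounded variant of the iterator above and an arbitrary list of letters as the bounded iterable:

```python
class UnboundedTriangularIterator:
    '''Hypothetical variant of TriangularIterator with no upper bound.'''

    def __init__(self):
        self.total = 0
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Never raises StopIteration, so the sequence never ends
        self.index += 1
        self.total += self.index
        return self.total


letters = ['a', 'b', 'c', 'd']
# zip() stops as soon as the shortest input, letters, runs out
pairs = list(zip(letters, UnboundedTriangularIterator()))
print(pairs)
```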

## Spot the problem

Look back at the solutions for the QuadraticPlotter, PolynomialPlotter, and FunctionPlotter. What problems do you see with the plot method of these classes?

## Solution

The arguments to FunctionPlotter.plot(), PolynomialPlotter.plot(), and QuadraticPlotter.plot() are all different: one expects a callable, one expects a list of coefficients as one argument, and one expects three coefficients as separate arguments. In general, specialisations of a class should keep the same interface for its methods, and the parent class should be interchangeable with its specialisations.

## Over to you

Thinking about your own research software, what kind of places might an interface be useful to better codify how different parts of the software interact?

## Abstract base classes

Python also allows us to go a step further than a protocol, and formalise the requirements we place on our interfaces in code. An abstract base class is a class that must be inherited from—you can’t create instances of it directly. Python provides these for many of its protocols in the collections.abc module. For example, the Fibonacci iterator above could inherit from abc.Iterator. This would allow other code to check in advance that it supports the protocol, and also would guard against us forgetting to implement some part of the protocol. For example, if we forgot the __next__() method:

from collections.abc import Iterator
class FibonacciIterator(Iterator):
    def __init__(self, max_value):
        self.max_value = max_value
        self.last_two_numbers = (1, 0)

    def __iter__(self):
        return self

for number in FibonacciIterator(100):
    print(number)


In this case Python gives us an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-3-a96ac2788df3> in <module>
5         self.last_two_numbers = (1, 0)
6
----> 7 for number in FibonacciIterator(100):
8     print(number)

TypeError: Can't instantiate abstract class FibonacciIterator with abstract methods __next__


This can be useful when working with more complex interfaces. (On the other hand, removing the __iter__() method works fine, because abc.Iterator helpfully defines __iter__() for us, so we can inherit it.)
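
A brief sketch of both benefits, using a hypothetical Countdown class: it defines only __next__() and inherits __iter__() from collections.abc.Iterator, and other code can check for the protocol with isinstance().

```python
from collections.abc import Iterator


class Countdown(Iterator):
    '''Hypothetical iterator counting down from start to 1.'''

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(isinstance(countdown, Iterator))  # protocol check in advance
print(iter(countdown) is countdown)     # inherited __iter__() returns self
print(list(countdown))
```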

## Implementing multiple interfaces

You may find yourself wanting to implement multiple interfaces in a single class. This is possible by making use of multiple inheritance, where a class inherits from more than one base class. This is not supported in all programming languages, and in many programming languages it is considered to be problematic. It is more common in Python, but we don’t have space to go into detail about it in this lesson.
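
As a very small taste, and without going into how Python resolves method lookups across several parents, a hypothetical Bag class can inherit from both collections.abc.Sized and collections.abc.Iterable, which require __len__() and __iter__() respectively.

```python
from collections.abc import Iterable, Sized


class Bag(Sized, Iterable):
    '''Hypothetical container implementing two interfaces at once.'''

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):       # required by Sized
        return len(self.items)

    def __iter__(self):      # required by Iterable
        return iter(self.items)


bag = Bag(['x', 'y', 'z'])
print(len(bag))
print(isinstance(bag, Sized), isinstance(bag, Iterable))
```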

## Hashable Polygons

The hashable protocol allows classes to be used as dictionary keys and as members of sets. Look up the hashable protocol and adjust the Polygon class so that it follows this.

Test this by using a Triangle instance as a dict key:

triangle_descriptions = {
    Triangle([3, 4, 5]): "The basic Pythagorean triangle"
}


## Solution

The hashable protocol requires implementing one method, __hash__(), which should return a hash of the aspects of the instance that make it unique. Lists can’t be hashed, so we also need to turn the list of side_lengths into a tuple.

    def __hash__(self):
        return hash(tuple(self.side_lengths))
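
One detail worth keeping in mind: objects that compare equal are expected to have the same hash, so __hash__() is normally defined alongside __eq__(). The sketch below uses a cut-down, hypothetical Polygon that stores only side_lengths, which may differ from the full class built earlier in the lesson.

```python
class Polygon:
    '''Cut-down, hypothetical Polygon holding only its side lengths.'''

    def __init__(self, side_lengths):
        self.side_lengths = side_lengths

    def __eq__(self, other):
        return (isinstance(other, Polygon)
                and tuple(self.side_lengths) == tuple(other.side_lengths))

    def __hash__(self):
        # Lists are unhashable, so hash an equivalent tuple instead
        return hash(tuple(self.side_lengths))


descriptions = {Polygon([3, 4, 5]): 'The basic Pythagorean triangle'}

# An equal but distinct instance finds the same dictionary entry
print(descriptions[Polygon([3, 4, 5])])
```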


## Composition

Composition is a technique where rather than adding more and more functionality to a single class (either explicitly, or via inheritance), functionality is added by adding instances of other classes that group together the related functionality.
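
A minimal sketch of the idea, with hypothetical Engine and Car classes that are not part of the lesson: rather than inheriting from Engine, a Car holds an Engine instance and delegates to it.

```python
class Engine:
    '''Groups together the behaviour related to the engine.'''

    def __init__(self, power_kw):
        self.power_kw = power_kw
        self.running = False

    def start(self):
        self.running = True


class Car:
    '''A Car has an Engine; it is not a kind of Engine.'''

    def __init__(self, engine):
        self.engine = engine  # composition: store the other object

    def drive(self):
        # Delegate the engine-specific work to the composed object
        self.engine.start()
        return f'Driving with {self.engine.power_kw} kW'


car = Car(Engine(power_kw=90))
print(car.drive())
```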

An example of a library that makes heavy use of composition is the Matplotlib object-oriented API. While Matplotlib makes its pyplot API available for basic plotting, it is built on top of a very intricate hierarchy of classes and objects. Those who want more control over their plots are encouraged to use this interface instead of the simplified pyplot version.

To get a feel for how Matplotlib uses composition to separate its concerns while having a large amount of functionality, we can write a small test function to recursively walk through the member variables of an object that are themselves instances of a non-builtin class.

from matplotlib.pyplot import subplots

def traverse_objects(base_object, level=0, max_level=5):
    """Recursively walk through the member variables of base_object,
    and print out information about each that is an instance of a
    non-built-in class. max_level controls the depth that the
    recursion may continue to, to avoid infinite loops."""

    if hasattr(base_object, '__dict__') and level < max_level:
        for child_name, child_object in vars(base_object).items():
            if child_object.__class__.__module__ != 'builtins':
                print(" " * level, child_name, ':', type(child_object))
                traverse_objects(child_object, level=level+1)

# Create a simple plot
fig, ax = subplots()
ax.scatter([1, 2, 3], [1, 4, 9])
ax.scatter([1, 1.5, 2, 2.5, 3], [1, 1, 2, 3, 5])

# Inspect the object hierarchy of this figure object
traverse_objects(fig)


This gives a lot of output—72 lines, so in principle 72 different classes are combining here. In practice this number is not accurate; there is some duplication in this list, since for example both canvas and patch have a figure member variable so that they can refer back to the Figure that they work with. Conversely, this simple traversal ignores some additional composition; for example, fig._axstack._elements is a list of tuples, but within some of those tuples are more objects of type matplotlib.gridspec.SubplotSpec and matplotlib.axes._subplots.AxesSubplot.

This is why when you have errors in your code, tracebacks from some libraries can be quite long. Having lots of small methods in classes that are each dedicated to one very specific aspect makes it easier to reason about what each one is doing in isolation, but can make it harder to get a view of the big picture.

## Composing plotters

How could the FunctionPlotter, PolynomialPlotter, and QuadraticPlotter be refactored to make use of composition instead of inheritance?

## Solution

One way of doing this is to define a “plottable function” interface. An object respecting this interface would:

• be callable
• accept one argument
• return $$f(x)$$

Then, with the FunctionPlotter as defined previously, there is no need to subclass to create QuadraticPlotters and PolynomialPlotters; instead, we can define a Quadratic class as:

class Quadratic:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __call__(self, x):
        return self.a * x ** 2 + self.b * x + self.c


This can then be passed to a FunctionPlotter:

plotter = FunctionPlotter()
plotter.plot(Quadratic(1, -1, 1))  # example coefficients


Alternatively, we can encapsulate the function to be plotted as part of the class.

from numpy import linspace
from matplotlib.colors import is_color_like
from matplotlib.pyplot import subplots

class FunctionPlotter:
    def __init__(self, function, color='red', linewidth=1, x_min=-10, x_max=10):
        assert is_color_like(color)
        self.color = color
        self.linewidth = linewidth
        self.x_min = x_min
        self.x_max = x_max
        self.function = function

    def plot(self):
        '''Plot a function of a single argument.
        The line is plotted in the colour specified by color, and with width
        linewidth.'''
        fig, ax = subplots()
        x = linspace(self.x_min, self.x_max, 1000)
        ax.plot(x, self.function(x), color=self.color, linewidth=self.linewidth)
        fig.show()


This could then be used as:

from numpy import sin
sin_plotter = FunctionPlotter(sin)