Content from Lesson 5: Unit Testing Code
Last updated on 2025-10-28
Estimated time: 15 minutes
Overview
Questions
- Why should I test my code?
- What is the role of automated testing?
- What are the different types of automated tests?
- What is the structure of a unit test?
- What is test “mocking”?
Objectives
- Explain the reasons why testing is important
- Describe the three main types of tests and what each is used for
- Describe the practice of test “mocking” and when to use it
- Obtain example code repository and run existing unit tests
- Describe the format of a unit test written for the Pytest testing framework
Testing is a critical part of writing reliable, maintainable code — especially in collaborative or research environments where reproducibility and correctness are key. In this session, we will explore why testing matters, and introduce different levels of testing — from small, focused unit tests, to broader integration and system tests that check how components work together. We will also look at testing approaches such as regression testing (to ensure changes do not break existing behavior) and property-based testing (to test a wide range of inputs automatically). Finally, we will cover mocking, a technique used to isolate code during tests by simulating the behavior of external dependencies.
Introduction to testing
Code testing is the process of verifying that your code behaves as expected and continues to do so as it evolves. It helps catch bugs early, ensures changes do not unintentionally break existing functionality, and supports the development of more robust and maintainable software. Whether you’re working on a small script or a large application, incorporating testing into your workflow builds confidence in your code and makes collaboration and future updates much easier.
Why test your code?
Being able to demonstrate that a process generates the right results is important in any field of research, whether it is software generating those results or not. So when writing software we need to ask ourselves some key questions:
- Does the code we develop work as expected?
- To what extent are we confident of the accuracy of results that software produces?
- Can we and others verify these assertions for themselves?
If we are unable to demonstrate that our software fulfills these criteria, why would anyone use it?
As a codebase grows, debugging becomes more challenging, and new code may introduce bugs or unexpected behavior in parts of the system it does not directly interact with. Tests can help catch issues before they become runtime bugs, and a failing test can pinpoint the source of the problem. Additionally, tests serve as invocation examples for other developers and users, making it easier for them to reuse the code effectively.
Having well-defined tests for our software helps ensure it works correctly, reliably, and consistently over time. By identifying bugs early and confirming that new changes do not break existing functionality, testing improves code quality, reduces the risk of errors in production, and makes future development and long-term maintenance faster and safer.
Levels of Code Testing
Testing can be performed at different code levels, each serving a distinct purpose to ensure software behaves correctly at various stages of execution. Together, these testing levels provide a structured approach to improving software quality and reliability.
Unit testing is the most granular level, where individual components, such as functions or classes, are tested in isolation to confirm they behave correctly under a variety of inputs. This makes it easier to identify and fix bugs early in the development process.
Integration testing builds on unit testing by checking how multiple components or modules work together. This level of testing helps catch issues that arise when components interact — such as unexpected data formats, interface mismatches, or dependency problems.
At the highest level, system testing evaluates the software as a complete, integrated system. This type of testing focuses on validating the entire application’s functionality from end to end, typically from the user’s perspective, including inputs, outputs, and how the system behaves under various conditions.
Approaches to Code Testing
Different approaches to code testing help ensure that software behaves as expected under a range of conditions. When the expected output of a function or program is known, tests can directly check that the results match fixed values or fall within a defined confidence interval.
However, for cases where exact outputs are not predictable — such as simulations with random elements — property-based testing is useful. This method tests a wide range of inputs to ensure that certain properties or patterns hold true across them.
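As an illustration, here is a minimal hand-rolled sketch of the idea using factorial; dedicated libraries such as Hypothesis generate and shrink inputs for you automatically, but the underlying principle is the same.

```python
import math
import random

# Property: for any n >= 1, n! == n * (n - 1)!.
# Rather than checking a fixed expected output, we check that this
# property holds across many randomly generated inputs.
random.seed(0)  # fixed seed so the run is reproducible
for _ in range(100):
    n = random.randint(1, 50)
    assert math.factorial(n) == n * math.factorial(n - 1)
```
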
Another important approach is regression testing, which helps detect when previously working functionality breaks due to recent changes in the code. By rerunning earlier tests, developers can catch and address these regressions early, maintaining software stability over time.
Mocking
When running tests, you often want to focus on testing a specific piece of functionality, but dependencies on external objects or functions can complicate this, as you cannot always be sure they work as expected. Mocking addresses this by allowing you to replace those dependencies with “mocked” objects or functions that behave according to your instructions. So, mocking is a testing approach used to isolate the unit of code being tested by replacing its dependencies with simplified, controllable versions — known as mocks.
Mocks mimic the behavior of real components (such as databases, APIs, or external services) without requiring their full functionality or availability. This allows developers to test specific code paths, simulate error conditions, or verify how a unit interacts with other parts of the system. Mocking is especially useful in unit and integration testing to ensure tests remain focused, fast and reliable.
For example, if a function modifies data and writes it to a file, you can mock the file-writing object, so instead of creating an actual file, the mocked object stores the “written” data. This enables you to verify that the data written is as expected, without actually creating a file, making tests more controlled and efficient.
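As a small sketch of this using Python's built-in unittest.mock module (save_numbers here is a made-up function for illustration, not part of the lesson's example code):

```python
from unittest import mock

def save_numbers(numbers, outfile):
    """Double each number and write it, one per line, to an open file object."""
    for n in numbers:
        outfile.write(f"{n * 2}\n")

# Replace the real file object with a Mock that records calls
# instead of writing anything to disk.
fake_file = mock.Mock()
save_numbers([1, 2], fake_file)

# Verify the data that would have been written, without creating a file.
fake_file.write.assert_any_call("2\n")
fake_file.write.assert_any_call("4\n")
```
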
Related Practices
Code style and linting are practices closely related to code testing, as they help ensure that code is readable and maintainable by following established conventions, such as PEP8 in Python. Linting tools automatically check that code adheres to these style guidelines, reducing errors and improving consistency.
Continuous Integration (CI) further enhances testing practices by automating key processes, such as running tests and linting tools, every time code changes are committed. This helps catch issues early, maintain code quality, and streamline the development workflow. Together, these practices improve code reliability and make collaboration smoother.
Practical Work
In the rest of this session, we will walk you through writing tests for your code.
Content from Example Code
Last updated on 2025-10-28
Estimated time: 10 minutes
Overview
Questions
- How do I run a set of unit tests?
- How do unit test frameworks operate?
Objectives
- Obtain example code used for this lesson
- Run a repository’s existing unit tests using a unit testing framework
- Describe the typical format of a unit test
Creating a Copy of the Example Code Repository
For this lesson we’ll be using some example code available on GitHub, which we’ll clone onto our machines using the Bash shell. So firstly open a Bash shell (via Git Bash in Windows or Terminal on a Mac). Then, on the command line, navigate to where you’d like the example code to reside, and use Git to clone it. For example, to clone the repository in our home directory, and change our directory to the repository contents:
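The commands would look something like the following; the repository URL here is a placeholder, so substitute the one provided for the lesson:

```shell
cd ~
git clone https://github.com/<some-org>/factorial-example
cd factorial-example
```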
Examining the Code
Next, let’s take a look at the code, which is in the
factorial-example/mymath directory, called
factorial.py, so open this file in an editor.
The example code is a basic Python implementation of factorial. Essentially, it multiplies all the whole numbers from a given number down to 1, e.g. given 3, that's 3 x 2 x 1 = 6, so the factorial of 3 is 6.
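The file's contents are not reproduced on this page, but based on the traceback shown later in this lesson, the recursive implementation looks roughly like this:

```python
def factorial(n):
    """
    Calculate the factorial of a given number.

    :param int n: The number to calculate the factorial of
    :return: The resultant factorial
    """
    if n == 0 or n == 1:
        return 1
    # Recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)
```
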
We can also run this code from within Python to show it working. In the shell, ensure you are in the root directory of the repository, then type python to start the Python interpreter:
PYTHON
Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Then at the prompt, import the factorial function from the mymath library and run it:
PYTHON
>>> from mymath.factorial import factorial
>>> factorial(3)
6
This gives us 6, which gives us some evidence that the function is working. Of course, in practice, our functions may well be more complicated than this, and they may call other separate functions. Now we could just come up with a list of known input numbers and expected outputs and run each of these manually to test the code, but this would take some time. Computers are really good at one thing - automation - so let's use that and automate our tests, to make it easy for ourselves.
Running the Tests
As it turns out, this code repository already has a test. Navigate to
the repository’s tests directory, and open a file called
test_factorial.py:
PYTHON
import unittest
from mymath.factorial import factorial

class TestFactorialFunctions(unittest.TestCase):
    def test_3(self):
        self.assertEqual(factorial(3), 6)
Now, we're using a Python unit test framework called unittest. There are other such frameworks for Python, including nose and the very popular pytest, but the advantage of using unittest is that it's already built into Python, so it's easier for us to use.
Before we look into this example unit test, let's take a brief look at object oriented programming, since unittest makes use of it.
What is Object Oriented Programming?
For those that aren’t familiar with object oriented programming, it’s a way of structuring your programs around the data of your problem. It’s based around the concept of objects, which are structures that contain both data and functions that operate on that data. In object oriented programming, objects are used to model real-world entities, such as people, bank accounts, libraries, books, even molecules, and so on. With each object having its own:
- data - known as attributes
- functions - known as methods
These are encapsulated within a defined structure known as a class. An introduction to object oriented programming is beyond the scope of this session, but if you’d like to know more there’s a great introductory tutorial on the RealPython site. This site is a great practical resource for learning about how to do many things in Python!
For the purposes of this activity, we use object oriented classes to
encapsulate our unit tests since that’s how they’re defined in the
unittest framework. You can consider them as a kind of
syntactic sugar to group our tests together, with a single unit test
being represented as a single function - or method - within a class.
In this example, we have a class called
TestFactorialFunctions with a single unit test, which we’ve
called test_3. Within that test method, we are essentially
doing what we did when we ran it manually earlier: we’re running
factorial with the argument 3, and checking it equals 6. We use an
inbuilt function, or method, in this class called
assertEqual, that checks the two are the same, and if not,
the test will fail.
So how do we run this test? In the shell, we can run this test by ensuring we're in the repository's root directory, and running python -m unittest tests/test_factorial.py:
OUTPUT
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
So what happens? We see a single ., we see a message
that says it ran very quickly, and OK. The single dot means
the single test we have was successfully run, so our test passes!
But how does unittest know what to run exactly? Unit test frameworks
like unittest follow a common pattern of finding tests and
running them. When we give a single file argument to
unittest, it searches the Python file for
unittest.TestCase classes, and within those classes, looks
for methods starting with test_, and runs them. So we could
add more tests in this class in the same way, and it would run each in
turn. We could even add multiple unittest.TestCase classes
here if we wanted, each testing different aspects of our code for
example, and unittest would search all of these classes and
run each test_ function in turn.
- Unittest is a built-in Python unit testing framework
- Other popular unit testing frameworks for Python include pytest and nose
- Object oriented programming is a way of encapsulating data and the functions that operate on that data together
- Run a set of Unittest unit tests using python -m unittest followed by the script filename containing the tests that begins with test_
Content from Creating a New Test
Last updated on 2025-10-28
Estimated time: 10 minutes
Overview
Questions
- How do I write a unit test?
- How do I write a unit test that tests for an error?
Objectives
- Implement and run unit tests to verify the correct behaviour of program functions
- Describe how and when testing fits into code development
- Write a unit test that tests for an expected error
Add a New Test
As we’ve mentioned, adding a new unit test is a matter of adding a
new test method. Let’s add one to test the number 5. Edit
the tests/test_factorial.py file again:
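Assuming we mirror the existing test_3 method, the updated test class might look like this (factorial(5) is 5 x 4 x 3 x 2 x 1 = 120). In the real file the function comes from the line "from mymath.factorial import factorial"; a stand-in definition is included here only so the sketch is self-contained:

```python
import unittest

# Stand-in for "from mymath.factorial import factorial",
# included only so this sketch runs on its own.
def factorial(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

class TestFactorialFunctions(unittest.TestCase):
    def test_3(self):
        self.assertEqual(factorial(3), 6)

    def test_5(self):
        self.assertEqual(factorial(5), 120)
```
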
And then we can run the tests exactly as before in the shell:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
test_5 (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
We can see the tests pass. So the really useful thing here is that we can rapidly add tests and rerun all of them. Particularly with more complex code that is harder to reason about, we can develop a set of tests into a suite of tests to verify the code's correctness. Then, whenever we make changes to our code, we can rerun our tests to make sure we haven't broken anything. An additional benefit is that successfully running our unit tests can also give others confidence that our code works as expected.
Change our Implementation, and Re-test
Let's illustrate another key advantage of having unit tests. Let's assume during development we find an error in our code. For example, if we run factorial(10000) from within the Python interpreter, our program crashes with an exception:
OUTPUT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/factorial-example/mymath/factorial.py", line 11, in factorial
    return  n * factorial(n-1)
  File "/home/steve/factorial-example/mymath/factorial.py", line 11, in factorial
    return  n * factorial(n-1)
  File "/home/steve/factorial-example/mymath/factorial.py", line 11, in factorial
    return  n * factorial(n-1)
  [Previous line repeated 995 more times]
  File "/home/steve/factorial-example/mymath/factorial.py", line 8, in factorial
    if n == 0 or n == 1:
RecursionError: maximum recursion depth exceeded in comparison
It turns out that our factorial function is recursive, which means it calls itself. In order to compute the factorial of 10000, it does that a lot. Python has a default limit for recursion of 1000, hence the exception, which is a bit of a limitation in our implementation.
However, we can correct our implementation by changing it to use a
different method of calculating factorials that isn’t recursive. Edit
the mymath/factorial.py file and replace the function with
this one:
PYTHON
def factorial(n):
    """
    Calculate the factorial of a given number.
    :param int n: The number to calculate the factorial of
    :return: The resultant factorial
    """
    factorial = 1
    for i in range(1, n + 1):
        factorial = factorial * i
    return factorial
Make sure you replace the code in the factorial.py file, and not the test_factorial.py file.
This is an iterative approach to calculating a factorial that isn't recursive, and won't suffer from the previous issue. It simply goes through the intended range of numbers, multiplying a running total by each in turn, without the function calling itself. Notice that we're not changing how the function is called, or its intended behaviour, so we don't need to change the Python docstring here, since it still applies.
We now have our updated implementation, but we need to make sure it works as intended. Fortunately, we have our set of tests, so let’s run them again:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
test_5 (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
And they work, which gives us some confidence - very rapidly - that our new implementation is behaving exactly the same as before. So again, each time we change our code, whether we're making small or large changes, we retest and check they all pass.
What makes a Good Test?
Of course, we only have 2 tests so far, and it would be good to have more. But what kind of tests are good to write? The more tests we have that sufficiently test our code, the more confidence we have that our code is correct. We could keep writing tests for, e.g., 10, 15, 20, and so on, but these become increasingly less useful, since they're in much the same “space”. We can't test all positive numbers, and it's fair to say that at a certain point, these types of low integers are sufficiently tested. So what test cases should we choose?
We should select test cases that test two things:
- The paths through our code, so we can check they work as we expect. For example, if we had a number of paths through the code dictated by if statements, we would write tests to ensure those are followed.
- The boundaries of the input data we expect to use, known as edge cases. For example, if we go back to our code, we can see that there are some interesting edge cases to test for:
- Zero? 
- Very large numbers (as we’ve already seen)? 
- Negative numbers? 
All good candidates for further tests, since they test the code in different ways, and test different paths through the code.
Testing for Failure
We’ve seen what happens if a test succeeds, but what happens if a
test fails? Let’s deliberately change our test to be wrong and find out,
by editing the tests/test_factorial.py file, changing the
expected result of factorial(3) to be 10, and
saving the file.
We’ll rerun our tests slightly differently than last time:
In this case, we add -v for more verbose output, giving
us detailed results test-by-test.
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... FAIL
======================================================================
FAIL: test_3 (tests.test_factorial.TestFactorialFunctions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/steve/factorial-example/tests/test_factorial.py", line 8, in test_3
    self.assertEqual(factorial(3), 10)
AssertionError: 6 != 10
----------------------------------------------------------------------
Ran 1 test in 0.000s
FAILED (failures=1)
In this instance we get a FAIL instead of an OK for our test, and we see an AssertionError that 6 is not equal to 10, which is clearly true.
Let’s now change our faulty test back by editing the file again,
changing the 10 back to 6, and re-run our
tests:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
This illustrates an important point with our tests: it's important to make sure your tests are correct too. So make sure you work with known ‘good’ test data which has been verified to be correct!
- Add a new unit test by adding a new test_ method to a test class
- Rerun your whole test suite after every change to check nothing has broken
- A failing test reports FAIL along with an AssertionError describing the mismatch
- Make sure your tests themselves are correct, using known good test data
Content from Handling Errors
Last updated on 2025-10-27
Estimated time: 10 minutes
Overview
Questions
- How do I test that my code raises an expected error?
Objectives
- Add a precondition check to validate function input
- Write a unit test that checks an expected exception is raised
How do we Handle Testing for Errors?
But what do we do if our code is expected to throw an error? How would we test for that?
Let's try our code with a negative number, which we've already identified as a good test case, from within the Python interpreter:
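With the iterative implementation from the previous section, the loop in factorial simply never runs for a negative n; a self-contained sketch (using a stand-in for the mymath import):

```python
# Stand-in for the iterative factorial from mymath/factorial.py.
def factorial(n):
    result = 1
    # For n = -1, range(1, 0) is empty, so the loop body never executes
    # and the initial value 1 is returned unchanged.
    for i in range(1, n + 1):
        result *= i
    return result

print(factorial(-1))  # prints 1
```
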
We can see that we get the result of 1, which is incorrect, since the factorial function is undefined for negative numbers.
Perhaps what we want in this case is to test for negative numbers as an invalid input, and display an exception if that is the case. How would we implement that, and how would we test for the presence of an exception?
In our implementation let’s add a check at the start of our function, which is known as a precondition. The precondition will check the validity of our input data before we do any processing on it, and this approach to checking function input data is considered good practice.
Edit the mymath/factorial.py file again, and add at the
start, below the docstring:
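The exact lines aren't reproduced on this page, but given the ValueError shown in the output below, the added precondition would look like this:

```python
def factorial(n):
    """
    Calculate the factorial of a given number.

    :param int n: The number to calculate the factorial of
    :return: The resultant factorial
    """
    # Precondition: validate the input before doing any processing.
    if n < 0:
        raise ValueError('Only use non-negative integers.')

    factorial = 1
    for i in range(1, n + 1):
        factorial = factorial * i
    return factorial
```
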
If we run it now, we should see our error:
OUTPUT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/factorial-example/mymath/factorial.py", line 9, in factorial
    raise ValueError('Only use non-negative integers.')
ValueError: Only use non-negative integers.
Sure enough, we get our exception as desired. But how do we test for this in a unit test, since this is an exception, not a value? Fortunately, unit test frameworks have ways to check for this.
Let’s add a new test to tests/test_factorial.py:
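A sketch of the new test method (again with a stand-in for the mymath import so it runs on its own):

```python
import unittest

# Stand-in for "from mymath.factorial import factorial".
def factorial(n):
    if n < 0:
        raise ValueError('Only use non-negative integers.')
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

class TestFactorialFunctions(unittest.TestCase):
    def test_negative(self):
        # The test passes only if the code inside the with block
        # raises a ValueError.
        with self.assertRaises(ValueError):
            factorial(-1)
```
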
So here, we use unittest's built-in assertRaises() (instead of assertEqual()) to test for a ValueError exception occurring when we run factorial(-1). We also use Python's with statement here to test for this within the call to factorial(). So if we re-run our tests again, we should see them all succeed:
You should see:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
test_5 (tests.test_factorial.TestFactorialFunctions) ... ok
test_negative (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 3 tests in 0.000s
OK
Brief Summary
So we now have the beginnings of a test suite! And every time we change our code, we can rerun our tests. So the overall process of development becomes:
- Add new functionality (or modify existing functionality) in our code
- Potentially add new tests to test any new functionality
- Re-run all our tests
- Use a precondition to check the validity of a function's input before processing it
- Use assertRaises() within a with block to test that code raises an expected exception
- Each time you change your code, rerun your whole test suite