Content from Lesson 5: Unit Testing Code
Last updated on 2025-10-28
Estimated time: 15 minutes
Overview
Questions
- Why should I test my code?
- What is the role of automated testing?
- What are the different types of automated tests?
- What is the structure of a unit test?
- What is test “mocking”?
Objectives
- Explain the reasons why testing is important
- Describe the three main types of tests and what each is used for
- Describe the practice of test “mocking” and when to use it
- Obtain example code repository and run existing unit tests
- Describe the format of a unit test written for the Pytest testing framework
Testing is a critical part of writing reliable, maintainable code — especially in collaborative or research environments where reproducibility and correctness are key. In this session, we will explore why testing matters, and introduce different levels of testing — from small, focused unit tests, to broader integration and system tests that check how components work together. We will also look at testing approaches such as regression testing (to ensure changes do not break existing behavior) and property-based testing (to test a wide range of inputs automatically). Finally, we will cover mocking, a technique used to isolate code during tests by simulating the behavior of external dependencies.
Introduction to testing
Code testing is the process of verifying that your code behaves as expected and continues to do so as it evolves. It helps catch bugs early, ensures changes do not unintentionally break existing functionality, and supports the development of more robust and maintainable software. Whether you’re working on a small script or a large application, incorporating testing into your workflow builds confidence in your code and makes collaboration and future updates much easier.
Why test your code?
Being able to demonstrate that a process generates the right results is important in any field of research, whether it is software generating those results or not. So when writing software we need to ask ourselves some key questions:
- Does the code we develop work as expected?
- To what extent are we confident of the accuracy of results that software produces?
- Can we and others verify these assertions for themselves?
If we are unable to demonstrate that our software fulfills these criteria, why would anyone use it?
As a codebase grows, debugging becomes more challenging, and new code may introduce bugs or unexpected behavior in parts of the system it does not directly interact with. Tests can help catch issues before they become runtime bugs, and a failing test can pinpoint the source of the problem. Additionally, tests serve as invocation examples for other developers and users, making it easier for them to reuse the code effectively.
Having well-defined tests for our software helps ensure it works correctly, reliably, and consistently over time. By identifying bugs early and confirming that new changes do not break existing functionality, testing improves code quality, reduces the risk of errors in production, and makes future development and long-term maintenance faster and safer.
Levels of Code Testing
Testing can be performed at different code levels, each serving a distinct purpose to ensure software behaves correctly at various stages of execution. Together, these testing levels provide a structured approach to improving software quality and reliability.
Unit testing is the most granular level, where individual components, such as functions or classes, are tested in isolation to confirm they behave correctly under a variety of inputs. This makes it easier to identify and fix bugs early in the development process.
Integration testing builds on unit testing by checking how multiple components or modules work together. This level of testing helps catch issues that arise when components interact — such as unexpected data formats, interface mismatches, or dependency problems.
At the highest level, system testing evaluates the software as a complete, integrated system. This type of testing focuses on validating the entire application’s functionality from end to end, typically from the user’s perspective, including inputs, outputs, and how the system behaves under various conditions.
Approaches to Code Testing
Different approaches to code testing help ensure that software behaves as expected under a range of conditions. When the expected output of a function or program is known, tests can directly check that the results match fixed values or fall within a defined confidence interval.
However, for cases where exact outputs are not predictable — such as simulations with random elements — property-based testing is useful. This method tests a wide range of inputs to ensure that certain properties or patterns hold true across them.
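As an illustration, here is a minimal hand-rolled sketch of the idea using factorial; dedicated libraries such as Hypothesis generate and shrink inputs for you automatically, but the underlying principle is the same.

```python
import math
import random

# Property: for any n >= 1, n! == n * (n - 1)!.
# Rather than checking a fixed expected output, we check that this
# property holds across many randomly generated inputs.
random.seed(0)  # fixed seed so the run is reproducible
for _ in range(100):
    n = random.randint(1, 50)
    assert math.factorial(n) == n * math.factorial(n - 1)
```
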
Another important approach is regression testing, which helps detect when previously working functionality breaks due to recent changes in the code. By rerunning earlier tests, developers can catch and address these regressions early, maintaining software stability over time.
Mocking
When running tests, you often want to focus on testing a specific piece of functionality, but dependencies on external objects or functions can complicate this, as you cannot always be sure they work as expected. Mocking addresses this by allowing you to replace those dependencies with “mocked” objects or functions that behave according to your instructions. So, mocking is a testing approach used to isolate the unit of code being tested by replacing its dependencies with simplified, controllable versions — known as mocks.
Mocks mimic the behavior of real components (such as databases, APIs, or external services) without requiring their full functionality or availability. This allows developers to test specific code paths, simulate error conditions, or verify how a unit interacts with other parts of the system. Mocking is especially useful in unit and integration testing to ensure tests remain focused, fast and reliable.
For example, if a function modifies data and writes it to a file, you can mock the file-writing object, so instead of creating an actual file, the mocked object stores the “written” data. This enables you to verify that the data written is as expected, without actually creating a file, making tests more controlled and efficient.
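As a small sketch of this using Python's built-in unittest.mock module (save_numbers here is a made-up function for illustration, not part of the lesson's example code):

```python
from unittest import mock

def save_numbers(numbers, outfile):
    """Double each number and write it, one per line, to an open file object."""
    for n in numbers:
        outfile.write(f"{n * 2}\n")

# Replace the real file object with a Mock that records calls
# instead of writing anything to disk.
fake_file = mock.Mock()
save_numbers([1, 2], fake_file)

# Verify the data that would have been written, without creating a file.
fake_file.write.assert_any_call("2\n")
fake_file.write.assert_any_call("4\n")
```
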
Related Practices
Code style and linting are practices closely related to code testing, as they help ensure that code is readable and maintainable by following established conventions, such as PEP8 in Python. Linting tools automatically check that code adheres to these style guidelines, reducing errors and improving consistency.
Continuous Integration (CI) further enhances testing practices by automating key processes, such as running tests and linting tools, every time code changes are committed. This helps catch issues early, maintain code quality, and streamline the development workflow. Together, these practices improve code reliability and make collaboration smoother.
Practical Work
In the rest of this session, we will walk you through writing tests for your code.
Content from Example Code
Last updated on 2025-10-28
Estimated time: 10 minutes
Overview
Questions
- How do I run a set of unit tests?
- How do unit test frameworks operate?
Objectives
- Obtain example code used for this lesson
- Run a repository’s existing unit tests using a unit testing framework
- Describe the typical format of a unit test
Creating a Copy of the Example Code Repository
For this lesson we’ll be using some example code available on GitHub, which we’ll clone onto our machines using the Bash shell. So firstly open a Bash shell (via Git Bash in Windows or Terminal on a Mac). Then, on the command line, navigate to where you’d like the example code to reside, and use Git to clone it. For example, to clone the repository in our home directory, and change our directory to the repository contents:
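The commands would look something like the following; the repository URL here is a placeholder, so substitute the one provided for the lesson:

```shell
cd ~
git clone https://github.com/<some-org>/factorial-example
cd factorial-example
```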
Examining the Code
Next, let’s take a look at the code, which is in the
factorial-example/mymath directory, called
factorial.py, so open this file in an editor.
The example code is a basic Python implementation of factorial. Essentially, it multiplies all the whole numbers from a given number down to 1, e.g. given 3, that's 3 x 2 x 1 = 6, so the factorial of 3 is 6.
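The file's contents are not reproduced on this page, but based on the traceback shown later in this lesson, the recursive implementation looks roughly like this:

```python
def factorial(n):
    """
    Calculate the factorial of a given number.

    :param int n: The number to calculate the factorial of
    :return: The resultant factorial
    """
    if n == 0 or n == 1:
        return 1
    # Recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)
```
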
We can also run this code from within Python to show it working. In the shell, ensure you are in the root directory of the repository, then type python to start the Python interpreter:
PYTHON
Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Then at the prompt, import the factorial function from the mymath library and run it:
PYTHON
>>> from mymath.factorial import factorial
>>> factorial(3)
6
This gives us 6, which gives us some evidence that the function is working. Of course, in practice, our functions may well be more complicated than this, and they may call other separate functions. Now we could just come up with a list of known input numbers and expected outputs and run each of these manually to test the code, but this would take some time. Computers are really good at one thing - automation - so let's use that and automate our tests, to make it easy for ourselves.
Running the Tests
As it turns out, this code repository already has a test. Navigate to
the repository’s tests directory, and open a file called
test_factorial.py:
PYTHON
import unittest
from mymath.factorial import factorial

class TestFactorialFunctions(unittest.TestCase):
    def test_3(self):
        self.assertEqual(factorial(3), 6)
Now, we're using a Python unit test framework called unittest. There are other such frameworks for Python, including nose and the very popular pytest, but the advantage of using unittest is that it's already built into Python, so it's easier for us to use.
Before we look into this example unit test, let's take a brief look at object oriented programming, since unittest makes use of it.
What is Object Oriented Programming?
For those that aren’t familiar with object oriented programming, it’s a way of structuring your programs around the data of your problem. It’s based around the concept of objects, which are structures that contain both data and functions that operate on that data. In object oriented programming, objects are used to model real-world entities, such as people, bank accounts, libraries, books, even molecules, and so on. With each object having its own:
- data - known as attributes
- functions - known as methods
These are encapsulated within a defined structure known as a class. An introduction to object oriented programming is beyond the scope of this session, but if you’d like to know more there’s a great introductory tutorial on the RealPython site. This site is a great practical resource for learning about how to do many things in Python!
For the purposes of this activity, we use object oriented classes to
encapsulate our unit tests since that’s how they’re defined in the
unittest framework. You can consider them as a kind of
syntactic sugar to group our tests together, with a single unit test
being represented as a single function - or method - within a class.
In this example, we have a class called
TestFactorialFunctions with a single unit test, which we’ve
called test_3. Within that test method, we are essentially
doing what we did when we ran it manually earlier: we’re running
factorial with the argument 3, and checking it equals 6. We use an
inbuilt function, or method, in this class called
assertEqual, that checks the two are the same, and if not,
the test will fail.
So how do we run this test? In the shell, we can run this test by ensuring we're in the repository's root directory, and running python -m unittest tests/test_factorial.py:
OUTPUT
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
So what happens? We see a single ., we see a message
that says it ran very quickly, and OK. The single dot means
the single test we have was successfully run, so our test passes!
But how does unittest know what to run exactly? Unit test frameworks
like unittest follow a common pattern of finding tests and
running them. When we give a single file argument to
unittest, it searches the Python file for
unittest.TestCase classes, and within those classes, looks
for methods starting with test_, and runs them. So we could
add more tests in this class in the same way, and it would run each in
turn. We could even add multiple unittest.TestCase classes
here if we wanted, each testing different aspects of our code for
example, and unittest would search all of these classes and
run each test_ function in turn.
- Unittest is a built-in Python unit testing framework
- Other popular unit testing frameworks for Python include pytest and nose
- Object oriented programming is a way of encapsulating data and the functions that operate on that data together
- Run a set of Unittest unit tests using python -m unittest followed by the script filename containing the tests that begins with test_
Content from Creating a New Test
Last updated on 2025-10-28
Estimated time: 10 minutes
Overview
Questions
- How do I write a unit test?
- How do I write a unit test that tests for an error?
Objectives
- Implement and run unit tests to verify the correct behaviour of program functions
- Describe how and when testing fits into code development
- Write a unit test that tests for an expected error
Add a New Test
As we’ve mentioned, adding a new unit test is a matter of adding a
new test method. Let’s add one to test the number 5. Edit
the tests/test_factorial.py file again:
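Assuming we mirror the existing test_3 method, the updated test class might look like this (factorial(5) is 5 x 4 x 3 x 2 x 1 = 120). In the real file the function comes from the line "from mymath.factorial import factorial"; a stand-in definition is included here only so the sketch is self-contained:

```python
import unittest

# Stand-in for "from mymath.factorial import factorial",
# included only so this sketch runs on its own.
def factorial(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

class TestFactorialFunctions(unittest.TestCase):
    def test_3(self):
        self.assertEqual(factorial(3), 6)

    def test_5(self):
        self.assertEqual(factorial(5), 120)
```
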
And then we can run the tests exactly as before in the shell:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
test_5 (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
We can see the tests pass. So the really useful thing here is that we can rapidly add tests and rerun all of them. Particularly with more complex code that is harder to reason about, we can develop a set of tests into a suite of tests to verify the code's correctness. Then, whenever we make changes to our code, we can rerun our tests to make sure we haven't broken anything. An additional benefit is that successfully running our unit tests can also give others confidence that our code works as expected.
Change our Implementation, and Re-test
Let's illustrate another key advantage of having unit tests. Let's assume during development we find an error in our code. For example, if we run factorial(10000) from within the Python interpreter, our program crashes with an exception:
OUTPUT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/factorial-example/mymath/factorial.py", line 11, in factorial
    return  n * factorial(n-1)
  File "/home/steve/factorial-example/mymath/factorial.py", line 11, in factorial
    return  n * factorial(n-1)
  File "/home/steve/factorial-example/mymath/factorial.py", line 11, in factorial
    return  n * factorial(n-1)
  [Previous line repeated 995 more times]
  File "/home/steve/factorial-example/mymath/factorial.py", line 8, in factorial
    if n == 0 or n == 1:
RecursionError: maximum recursion depth exceeded in comparison
It turns out that our factorial function is recursive, which means it calls itself. In order to compute the factorial of 10000, it does that a lot. Python has a default limit for recursion of 1000, hence the exception, which is a bit of a limitation in our implementation.
However, we can correct our implementation by changing it to use a
different method of calculating factorials that isn’t recursive. Edit
the mymath/factorial.py file and replace the function with
this one:
PYTHON
def factorial(n):
    """
    Calculate the factorial of a given number.
    :param int n: The number to calculate the factorial of
    :return: The resultant factorial
    """
    factorial = 1
    for i in range(1, n + 1):
        factorial = factorial * i
    return factorial
Make sure you replace the code in the factorial.py file, and not the test_factorial.py file.
This is an iterative approach to calculating a factorial that isn't recursive, and won't suffer from the previous issue. It simply goes through the intended range of numbers, multiplying a running total by each in turn, without the function calling itself. Notice that we're not changing how the function is called, or its intended behaviour, so we don't need to change the Python docstring here, since it still applies.
We now have our updated implementation, but we need to make sure it works as intended. Fortunately, we have our set of tests, so let’s run them again:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
test_5 (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
And they work, which gives us some confidence - very rapidly - that our new implementation is behaving exactly the same as before. So again, each time we change our code, whether we're making small or large changes, we retest and check they all pass.
What makes a Good Test?
Of course, we only have 2 tests so far, and it would be good to have more. But what kind of tests are good to write? The more tests we have that sufficiently test our code, the more confidence we have that our code is correct. We could keep writing tests for, e.g., 10, 15, 20, and so on, but these become increasingly less useful, since they're in much the same “space”. We can't test all positive numbers, and it's fair to say that at a certain point, these types of low integers are sufficiently tested. So what test cases should we choose?
We should select test cases that test two things:
- The paths through our code, so we can check they work as we expect. For example, if we had a number of paths through the code dictated by if statements, we would write tests to ensure those are followed.
- The boundaries of the input data we expect to use, known as edge cases. For example, if we go back to our code, we can see that there are some interesting edge cases to test for:
- Zero? 
- Very large numbers (as we’ve already seen)? 
- Negative numbers? 
All good candidates for further tests, since they test the code in different ways, and test different paths through the code.
Testing for Failure
We’ve seen what happens if a test succeeds, but what happens if a
test fails? Let’s deliberately change our test to be wrong and find out,
by editing the tests/test_factorial.py file, changing the
expected result of factorial(3) to be 10, and
saving the file.
We’ll rerun our tests slightly differently than last time:
In this case, we add -v for more verbose output, giving
us detailed results test-by-test.
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... FAIL
======================================================================
FAIL: test_3 (tests.test_factorial.TestFactorialFunctions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/steve/factorial-example/tests/test_factorial.py", line 8, in test_3
    self.assertEqual(factorial(3), 10)
AssertionError: 6 != 10
----------------------------------------------------------------------
Ran 1 test in 0.000s
FAILED (failures=1)
In this instance we get a FAIL instead of an OK for our test, and we see an AssertionError that 6 is not equal to 10, which is clearly true.
Let’s now change our faulty test back by editing the file again,
changing the 10 back to 6, and re-run our
tests:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
This illustrates an important point with our tests: it's important to make sure your tests are correct too. So make sure you work with known ‘good’ test data which has been verified to be correct!
- Add a new unit test by adding a new test_ method to a test class
- Rerun your whole test suite after every change to check nothing has broken
- A failing test reports FAIL along with an AssertionError describing the mismatch
- Make sure your tests themselves are correct, using known good test data
Content from Handling Errors
Last updated on 2025-10-27
Estimated time: 10 minutes
Overview
Questions
- How do I test that my code raises an expected error?
Objectives
- Add a precondition check to validate function input
- Write a unit test that checks an expected exception is raised
How do we Handle Testing for Errors?
But what do we do if our code is expected to throw an error? How would we test for that?
Let's try our code with a negative number, which we've already identified as a good test case, from within the Python interpreter:
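With the iterative implementation from the previous section, the loop in factorial simply never runs for a negative n; a self-contained sketch (using a stand-in for the mymath import):

```python
# Stand-in for the iterative factorial from mymath/factorial.py.
def factorial(n):
    result = 1
    # For n = -1, range(1, 0) is empty, so the loop body never executes
    # and the initial value 1 is returned unchanged.
    for i in range(1, n + 1):
        result *= i
    return result

print(factorial(-1))  # prints 1
```
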
We can see that we get the result of 1, which is incorrect, since the factorial function is undefined for negative numbers.
Perhaps what we want in this case is to test for negative numbers as an invalid input, and display an exception if that is the case. How would we implement that, and how would we test for the presence of an exception?
In our implementation let’s add a check at the start of our function, which is known as a precondition. The precondition will check the validity of our input data before we do any processing on it, and this approach to checking function input data is considered good practice.
Edit the mymath/factorial.py file again, and add at the
start, below the docstring:
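The exact lines aren't reproduced on this page, but given the ValueError shown in the output below, the added precondition would look like this:

```python
def factorial(n):
    """
    Calculate the factorial of a given number.

    :param int n: The number to calculate the factorial of
    :return: The resultant factorial
    """
    # Precondition: validate the input before doing any processing.
    if n < 0:
        raise ValueError('Only use non-negative integers.')

    factorial = 1
    for i in range(1, n + 1):
        factorial = factorial * i
    return factorial
```
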
If we run it now, we should see our error:
OUTPUT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/factorial-example/mymath/factorial.py", line 9, in factorial
    raise ValueError('Only use non-negative integers.')
ValueError: Only use non-negative integers.
Sure enough, we get our exception as desired. But how do we test for this in a unit test, since this is an exception, not a value? Fortunately, unit test frameworks have ways to check for this.
Let’s add a new test to tests/test_factorial.py:
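A sketch of the new test method (again with a stand-in for the mymath import so it runs on its own):

```python
import unittest

# Stand-in for "from mymath.factorial import factorial".
def factorial(n):
    if n < 0:
        raise ValueError('Only use non-negative integers.')
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

class TestFactorialFunctions(unittest.TestCase):
    def test_negative(self):
        # The test passes only if the code inside the with block
        # raises a ValueError.
        with self.assertRaises(ValueError):
            factorial(-1)
```
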
So here, we use unittest's built-in assertRaises() (instead of assertEqual()) to test for a ValueError exception occurring when we run factorial(-1). We also use Python's with statement here to test for this within the call to factorial(). So if we re-run our tests again, we should see them all succeed:
You should see:
OUTPUT
test_3 (tests.test_factorial.TestFactorialFunctions) ... ok
test_5 (tests.test_factorial.TestFactorialFunctions) ... ok
test_negative (tests.test_factorial.TestFactorialFunctions) ... ok
----------------------------------------------------------------------
Ran 3 tests in 0.000s
OK
Brief Summary
So we now have the beginnings of a test suite! And every time we change our code, we can rerun our tests. So the overall process of development becomes:
- Add new functionality (or modify existing functionality) in our code
- Potentially add new tests to test any new functionality
- Re-run all our tests
- Use a precondition to check the validity of a function's input before processing it
- Use assertRaises() within a with block to test that code raises an expected exception
- Each time you change your code, rerun your whole test suite