Write Reliable Code
Last updated on 2025-04-17 | Edit this page
Estimated time: 55 minutes
Overview
Questions
- What is reliable code?
- What is testing? Why do we care?
Objectives
- Understanding the importance of code testing
- Understanding basic unit testing concepts
- Learning Pytest syntax and structure
- Practicing test-driven development
Reliable code and the importance of testing
Reliable code consistently performs as expected, manages unexpected situations gracefully, and functions correctly over time. While error handling deals with managing failures, reliability provides assurance that your code consistently delivers expected results. In simple terms, reliable code is code that can be trusted. However, writing reliable code is not a one-time task; it often requires continuous revision and improvement.
The first step towards achieving reliable code is testing. Code testing ensures reliability by systematically checking the code under controlled conditions to identify any errors or weaknesses. This approach allows you to catch errors and bugs early in the development process, which contributes to writing more structured and maintainable code. Complex and unstructured code is always challenging to test and maintain. Think about unstructured code as a precariously set-up game of dominoes. Each domino represents a piece of code, and they are arranged haphazardly. If you inadvertently knock over one domino while attempting to adjust another, the entire arrangement is affected, causing a cascade of errors. In contrast, structured and well-organised code is like a carefully planned domino setup where each piece is meticulously placed. Adjusting one piece does not lead to a domino effect, ensuring the system remains stable and reliable.
Testing helps you to organise pieces of code together, making sure that each of them is stable and can be replaced or modified without destroying anything else.
What is testing?
Testing in software development is a systematic process aimed at evaluating and verifying that a piece of software functions as intended. Testing aims to identify bugs, ensure the software’s reliability and performance, and validate that the software meets all specified requirements. There are several types of testing, each pertinent to different scenarios. For instance, integration testing assesses the interactions between various pieces of code, while system testing evaluates the software as a whole. Reference testing compares the results produced by your software against those generated by another existing software.
Unit testing is often the first level of testing. A “unit” refers to the smallest component that can be tested in your code, a function, a class, or module. Unit testing aims to ensure that each component behaves as expected.
Test Driven Development
There are two general approaches to writing code and its tests. The traditional one must first write the code and then write testing functions to validate its functionality. Alternatively, one can write tests and then write code that passes these tests. This is called Test Driven Development, or TDD. Although it seems counterintuitive, this approach helps to develop code alongside testing, leading to a robust code base. Also, when a project has precise requirements, this approach helps to ensure that the code satisfies them, as each requirement can be used as a test case.
In this tutorial, we will develop code using the TDD approach.
How does it work?

TDD is based on an iterative cycle known as “Red-Green-Refactor” (see figure)
- Write a test to verify the functionality of the code.
- Red: Run the test and see if it fails. Remember, the code does not exist yet.
- Green: Write minimal code that passes the test
- Refactor: Improve your code
The above steps can be repeated for adding a new feature or functionality.
Install Pytest
Before going any further, we need to ensure that we have the necessary package. In a Python shell, check if Pytest is installed. You can do this simply by importing the Pytest library:
If the command returns a
ModuleNotFoundError
message, then you can
run:
Challenge: a rolling dice game
Let us write code to simulate a rolling dice game. The dice game’s goal is to precisely reach the target score, for example, 21 points, by rolling dice without going over. Each roll adds to the player total score, and if they exceed 21, they lose. The player who reaches 21 in the fewest number of rolls wins!
Before proceeding to the code, we need to define and understand the requirements. Requirements serve as clear guidelines that define what the code should do and how it should behave. By implementing these requirements into the code, we can ensure it meets all specified needs and constraints. Requirements also provide clear criteria for what to test to ensure code functionality and verify that the code behaves as expected.
Let us define the requirements.
-
Core Requirements
-
Dice Rolling Function:
- Requirement 1: The function must generate random numbers between 1 and the number of dice sides. The number of sides must be six by default, but a user can create custom dice.
- Requirement 2: - The dice rolling system must be fair and unbiased.
-
Game logic
- Requirement 3: The users can change the target score.
- Requirement 4: Track the total score and count the number of rolls.
- Requirement 5 : Implement the target score. The code should stop when the target score is reached, or the total score exceeds the target score.
-
Dice Rolling Function:
The requirements above define and regulate the behaviour of the code. However, as the authors of the code, we should define some implementation requirements, which helps us write good code from the start.
-
Implementation Requirements
- Every function should check for valid input.
- The code should handle errors gracefully.
- The code must use meaningful names.
- The code must include docstrings and comments.
We can use the above requirements as guidelines to write our code.
The following activities help us write the code step by step, using a
Test-Driven Development (TDD) approach. We will use the Pytest package.
The general idea is to use each requirement as a guideline, write
relevant tests to check each one, and then develop code. You should
develop two Python files: dice_game.py
, which contains all
the core functions, and test_dice_game.py
, which contains
all the unit tests.
Make sure that the two files are in the same location
Requirement 1: The dice rolling function
Step1: Write a test and see it fails.
Let’s start by considering the following requirement:
Dice Rolling Function: Must generate random numbers between 1 and the number of die sides. The number of sides must be six by default, but a user can create custom dice.
So the first step is to write a function roll_die
that
takes the number of sides as input. By default, the number of sides must
be six.
Open the file dice_game.py
in the editor and type the
following function:
PYTHON
def roll_die(sides=6):
"""Roll a die with the specified number of sides.
Args:
sides (int, optional): Number of sides. Defaults to 6.
"""
pass
The pass
statement defines a null operation; it is a
placeholder for future code.
The next step in TDD is to design a unit test for this function. Note
that as it is, roll_die()
is designed to fail all the
tests.
Open the file test_dice_game.py
and import the following
modules:
The first import statement tells Python to use the Pytest framework, while the second explicitly says what function (or unit) to use for testing.
When the number of sides is six, the function roll_die
should return a value between 1 (lowest score) and 6 (highest score).
This requirement helps to write the first test. Keep working on the file
test_dice_game.py
and add the following test.
Save the file, and in a command-line terminal, run:
:>pytest
The output should be similar to the one below.
test_dice_game.py F
================================== FAILURES ====================================
__________________________________ test_roll_die________________________________
def test_roll_die():
result = roll_die(sides=6)
> assert 1 <= result <= 6
E TypeError: '<=' not supported between instances of 'int' and 'NoneType'
test_dice_game.py:8: TypeError
============================short test summary info=============================
FAILED test_dice_game.py::test_roll_die - TypeError: '<=' not supported between instances of 'int' and 'NoneType'
==============================1 failed in 0.43s=================================
As expected, the test failed.
Step2 : write minimal code to pass the test.
Once the test_roll_dice()
is in place, it is time to
develop the actual function. Remember that this function must generate a
random integer number (score
) between 1 and the number of
sides. Therefore, the code must use a random number generator, like the
Numpy Random module. So modify the file dice_game.py
to
import numpy
(using np
as an alias) and change
the function roll_dice
to return the score.
PYTHON
def roll_die(sides=6):
"""Roll a die with the specified number of sides.
Args:
sides (int, optional): Number of sides. Defaults to 6.
Returns:
int: Rolling score
"""
score = np.random.randint(1, sides+1)
return score
And run pytest
again.
test_dice_game.py . [100%]
=============================1 passed in 0.19s==================================
Point to the use of sides+1
in the code. This ensures
the random numbers are generated within [1, sides] (inclusive).
Challenge
The function roll_die()
should work for custom dice, or
rather, when the user decides to use a die with a different number of
sides. Add a test to the test_dice_game.py
that checks this
requirement as well.
Run pytest
to check your code. The test should pass
smoothly.
The above challenge is designed as a pure exercise to check student
understanding. It can be used to discuss the similarity with the
previous code and to introduce the
pytest.mark.parametrize
.
Testing function with multiple arguments
Often, you need to verify the same behaviour with different inputs.
You could use multiple assertions in the same function or repeat the
same function multiple times but change the testing condition. This
approach leads to several problems, such as code duplication. The best
approach is to use pytest.mark.parametrize
,
which enables the parametrisation of arguments for a test.
The pytest.mark.parametrize
takes the argument’s name
defined in the function to test (“sides” in the example) as input and
passes different values as a list. When running the test, Pytest runs
the test function using each value, checking that the given input leads
to the expected results.
To show how it works, let’s say we want to test the
roll_dice
function by passing different values for the
variable sides
. We can change test_roll_die
function to
Using Pytest to test exception
So far, we used Pytest to check that the code produced the expected results. However, an essential part of testing is checking that the code manages exceptions correctly. For example, we need to ensure that the code can handle errors smoothly or that the correct exception type is raised when the code is not used correctly.
In Pytest, we can use the pytest.raise
to check the type
of exceptions and the error message ( read
more about pytest.raise
). For example, let’s say we
create a function to divide two numbers like the one below
PYTHON
def divide_number(numerator,denominator):
if denominator == 0 :
raise ZeroDivisionError("Cannot divide by zero")
return numerator/denominator
We can use pytest.raises
to check the expected behaviour
when the denominator is zero. If the correct exception is raised, the
test passes; if no exception is raised or a different type is raised,
the test fails.
Step3 : Incorporate input validation
One of the requirements is to handle errors gracefully. Currently,
the function roll_die
does not check its input. Clearly,
the number of sides should not be less than 0, and since it is a game,
it is unreasonable to think about a die of only two sides. So, let us
add another test to test_dice_game.py
to check the error
handling when the number of sides is incorrect.
The test will fail because we have not yet changed the
roll_die
function. Can you modify the function to pass the
test?
PYTHON
def roll_die(sides=6):
"""Roll a die with the specified number of sides.
Args:
slide (int, optional): Number of sides. Defaults to 6.
Returns:
int: Rolling score
Raise:
ValueError: When the number of sides is less than 2.
"""
if sides <=2:
raise ValueError('Number of sides must be greater than 2')
score = np.random.randint(1, sides+1)
return score
Highlight the fact that the docstrings should change as the code changes.
Requirement 2: The dice rolling system must be fair and unbiased.
Checking if a die-rolling system is fair involves understanding theoretical expectations and practical verification methods. For a die with N sides to be fair, we expect that
- Each number has the same probability of appearing ( \(\rm{p}_{i} = \frac{1}{N}\))
- Average roll value to converge to the theoretical mean (\(\frac{1}{N}\sum_i^N X_i\), where \(N\) is the number of sides, and \(X_i\) is the individual score).
- The distribution becomes uniform over a large number of rolls.
Considering this expectation, we can create tests to verify that our die is unbiased.
A possible approach is to use the average roll value and compare it
with the theoretical mean. This approach requires writing code in
dice_game.py
that calculates the mean score for a given
number of rolls and then compares it to the expected values. The test
function in test_dice_game.py
will check that the two
values are equal.
The caveat in this approach is that comparing two non-integer values is not straightforward, as there are often numerical errors and approximations. For example, have a look at the output below
This is a common problem in testing and usually requires asserting that two floating-point numbers are equal within some tolerance, something like
where the tolerance is \(1\times10^{-6}\). Writing this type of
tests is tedious and usually requires code repetition. However, Pytest
has a built-in method that can help solve this problem: pytest.approx
.
By default, pytest.approx
uses a \(1\times10^{-6}\) tolerance. There might be
situations where this value is not adequate. For example, let us say
that the expected value is 0.62, but the function might return 0.6199
instead. Then
PYTHON
result = 0.6199
assert result == pytest.approx(0.62), "Results don't match"
[...]
AssertionError: Results don't match
A better check therefore might be
where we changed the tolerance to \(1\times10^{-3}\). An alternative solution
might be to use the round()
function to approximate the
solution to a given number of decimals.
Write test_average_roll
Let’s start by writing the calculate_average_function
in
dice_game.py
PYTHON
def calculate_average_rolls(sides, number_of_rolls):
"""Calculate average roll over multiple trials.
Args:
sides (int): Number of die sides.
number_of_rolls (int): number of rolls
"""
pass
Then, add the corresponding test function in
test_dice_game.py
.
Remember to import the new function calculate_average_rolls
PYTHON
def test_average_roll():
"""Test the die is unbiased.
Assert that the average score over a number of rolls is equal to the expected average
"""
average = calculate_average_rolls(sides=6, num_of_rolls=1000)
EXPECTED_AVERAGE = 3.5 #expected average for a 6-sided die.
assert round(average,1) == pytest.approx(EXPECTED_AVERAGE)
Now, modify the code
calculate_average_rolls(sides, number_of_rolls)
. An example
can be:
PYTHON
def calculate_average_rolls(sides, number_of_rolls):
"""Calculate average roll over multiple trials.
Args:
sides (int): Number of die sides.
number_of_rolls (int): number of rolls
Raises:
ValueError: When the number of rolls is less than or equal to 1.
Returns:
float: Average total score
"""
if number_of_rolls <=1:
raise ValueError("""Number of rolls should be greater than 1.
For better results, it should be at least 1e3""")
total_score =[roll_die(sides) for n in range(number_of_rolls)]
average = np.mean(total_score)
return average
Take some time to read and explain the code to the students. Things
to focus on: - The EXPECTED_AVERAGE is calculated for a six-sided die.
The code can be extended to test other values, for example, by combining
it with pytest.mark.parametrize
:
PYTHON
@Pytest.mark.parametrize("sides, num_rolls", [
(6, 10000),(10, 10000)
])
def test_average_roll(sides, num_rolls):
"""Test the die is unbiased.
Assert that the average score over a number of rolls is equal to the expected average for different dice.
"""
average = calculate_average_rolls(sides, num_rolls)
EXPECTED_AVERAGE = sum(range(1,sides+1)) / sides
assert round(average,1) == pytest.approx(EXPECTED_AVERAGE)
- The use of
round
is to fix approximation errors, especially when the number of rolls is small. - Discuss that the test should not take too long to run. So using a very large number of rolls can provide better results, but it might also take too long to run. For real-life case, you want to balance execution time and accuracy.
To test your code, you can run pytest
as before.
Simulate the game
We developed and tested code to simulate a die’s rolling and ensured the system was unbiased. For this challenge, you need to write code that simulates the game. As for the previous case, you can focus on the following requirements and develop code and tests to complete the challenge. The requirements are
-
Game logic
- Requirement 3: The users can change the target score.
- Requirement 4: Track the total score and count the number of rolls.
- Requirement 5 : Implement the target score. The code should stop when the target score is reached or the total score exceeds the target score.
You can call your game function play_game
.
In test_dice_game.py
PYTHON
def test_game_completition():
""" Test that the game returns a reasonable number of rolls
"""
rolls_needed, total = play_game(target_score=6)
assert rolls_needed > 0
In dice_game.py
PYTHON
def play_game(target_score = 21):
"""Roll the die until total score reaches or overcome the total score
Args:
target_score (int, optional): The target score. Defaults to 21.
Returns:
tuple (int, int): Number of rolls and total score
"""
total = 0
number_of_rolls = 0
while total<target_score:
roll = roll_die()
total = total + roll
number_of_rolls = number_of_rolls +1
return number_of_rolls, total
Key Points
- Testing increases trust in code and its results.
- Test Driven Development helps to write tests alongside code
- Pytest offers a powerful testing framework from Python code,
offering different built-in functions and methods to test various
aspects of the code:
-
@pytest.mark.parametrize
: To test different input values -
pytest.raises
: To test error handling
-