Write Reliable Code

Last updated on 2025-07-29 | Edit this page

Estimated time: 55 minutes

Overview

Questions

What is reliable code?
What is testing? Why do we care?

Objectives

Understanding the importance of code testing
Understanding basic unit testing concepts
Learning Pytest syntax and structure
Practicing test-driven development

Reliable code and the importance of testing

Reliable code consistently performs as expected, manages unexpected situations gracefully, and functions correctly over time. While error handling deals with managing failures, reliability provides assurance that your code consistently delivers expected results. In simple terms, reliable code is code that can be trusted. However, writing reliable code is not a one-time task; it often requires continuous revision and improvement.

The first step towards achieving reliable code is testing. Code testing ensures reliability by systematically checking the code under controlled conditions to identify any errors or weaknesses. This approach allows you to catch errors and bugs early in the development process, which contributes to writing more structured and maintainable code. Complex and unstructured code is always challenging to test and maintain. Think about unstructured code as a precariously set-up game of dominoes. Each domino represents a piece of code, and they are arranged haphazardly. If you inadvertently knock over one domino while attempting to adjust another, the entire arrangement is affected, causing a cascade of errors. In contrast, structured and well-organised code is like a carefully planned domino setup where each piece is meticulously placed. Adjusting one piece does not lead to a domino effect, ensuring the system remains stable and reliable.

Testing helps you to organise pieces of code together, making sure that each of them is stable and can be replaced or modified without destroying anything else.

What is testing?

Testing in software development is a systematic process aimed at evaluating and verifying that a piece of software functions as intended. Testing aims to identify bugs, ensure the software’s reliability and performance, and validate that the software meets all specified requirements. There are several types of testing, each pertinent to different scenarios. For instance, integration testing assesses the interactions between various pieces of code, while system testing evaluates the software as a whole. Reference testing compares the results produced by your software against those generated by another existing software.

Unit testing is often the first level of testing. A “unit” refers to the smallest component that can be tested in your code, a function, a class, or module. Unit testing aims to ensure that each component behaves as expected.

Test Driven Development

There are two general approaches to writing code and its tests. The traditional one must first write the code and then write testing functions to validate its functionality. Alternatively, one can write tests and then write code that passes these tests. This is called Test Driven Development, or TDD. Although it seems counterintuitive, this approach helps to develop code alongside testing, leading to a robust code base. Also, when a project has precise requirements, this approach helps to ensure that the code satisfies them, as each requirement can be used as a test case.

In this tutorial, we will develop code using the TDD approach.

How does it work?

Illustration showing three phases of Test-Driven Development cycle: Red (failing test), Green (passing test), and Refactor (improving code)

TDD is based on an iterative cycle known as “Red-Green-Refactor” (see figure)

Write a test to verify the functionality of the code.
Red: Run the test and see if it fails. Remember, the code does not exist yet.
Green: Write minimal code that passes the test
Refactor: Improve your code

The above steps can be repeated for adding a new feature or functionality.

Install Pytest

Before going any further, we need to ensure that we have the necessary package. In a Python shell, check if Pytest is installed. You can do this simply by importing the Pytest library:

PYTHON

import pytest

If the command returns a ModuleNotFoundError message, then you can run:

PYTHON

pip install pytest

Challenge: a rolling dice game

Let us write code to simulate a rolling dice game. The dice game’s goal is to precisely reach the target score, for example, 21 points, by rolling dice without going over. Each roll adds to the player total score, and if they exceed 21, they lose. The player who reaches 21 in the fewest number of rolls wins!

Before proceeding to the code, we need to define and understand the requirements. Requirements serve as clear guidelines that define what the code should do and how it should behave. By implementing these requirements into the code, we can ensure it meets all specified needs and constraints. Requirements also provide clear criteria for what to test to ensure code functionality and verify that the code behaves as expected.

Let us define the requirements.

Core Requirements
- Dice Rolling Function:
  - Requirement 1: The function must generate random numbers between 1 and the number of dice sides. The number of sides must be six by default, but a user can create custom dice.
  - Requirement 2: - The dice rolling system must be fair and unbiased.
- Game logic
  - Requirement 3: The users can change the target score.
  - Requirement 4: Track the total score and count the number of rolls.
  - Requirement 5 : Implement the target score. The code should stop when the target score is reached, or the total score exceeds the target score.

The requirements above define and regulate the behaviour of the code. However, as the authors of the code, we should define some implementation requirements, which helps us write good code from the start.

Implementation Requirements
- Every function should check for valid input.
- The code should handle errors gracefully.
- The code must use meaningful names.
- The code must include docstrings and comments.

We can use the above requirements as guidelines to write our code. The following activities help us write the code step by step, using a Test-Driven Development (TDD) approach. We will use the Pytest package. The general idea is to use each requirement as a guideline, write relevant tests to check each one, and then develop code. You should develop two Python files: dice_game.py, which contains all the core functions, and test_dice_game.py, which contains all the unit tests.

Make sure that the two files are in the same location

Requirement 1: The dice rolling function

Step1: Write a test and see it fails.

Let’s start by considering the following requirement:

Dice Rolling Function: Must generate random numbers between 1 and the number of die sides. The number of sides must be six by default, but a user can create custom dice.

So the first step is to write a function roll_die that takes the number of sides as input. By default, the number of sides must be six.

Open the file dice_game.py in the editor and type the following function:

PYTHON

def roll_die(sides=6):
"""Roll a die with the specified number of sides.

    Args:
        sides (int, optional): Number of sides. Defaults to 6.
"""
pass

The pass statement defines a null operation; it is a placeholder for future code.

The next step in TDD is to design a unit test for this function. Note that as it is, roll_die() is designed to fail all the tests.

Open the file test_dice_game.py and import the following modules:

PYTHON

import pytest
from dice_game import roll_die

The first import statement tells Python to use the Pytest framework, while the second explicitly says what function (or unit) to use for testing.

When the number of sides is six, the function roll_die should return a value between 1 (lowest score) and 6 (highest score). This requirement helps to write the first test. Keep working on the file test_dice_game.py and add the following test.

PYTHON

def test_roll_die():
  score=roll_die(sides=6)
  assert 1<=score<=6

Save the file, and in a command-line terminal, run:

:>pytest

Show me the solution

The output should be similar to the one below.

test_dice_game.py F
================================== FAILURES ====================================
__________________________________ test_roll_die________________________________

    def test_roll_die():
        result = roll_die(sides=6)
>       assert 1 <= result <= 6
E       TypeError: '<=' not supported between instances of 'int' and 'NoneType'

test_dice_game.py:8: TypeError
============================short test summary info=============================
FAILED test_dice_game.py::test_roll_die - TypeError: '<=' not supported between instances of 'int' and 'NoneType'
==============================1 failed in 0.43s=================================

As expected, the test failed.

Step2 : write minimal code to pass the test.

Once the test_roll_dice() is in place, it is time to develop the actual function. Remember that this function must generate a random integer number (score) between 1 and the number of sides. Therefore, the code must use a random number generator, like the Numpy Random module. So modify the file dice_game.py to import numpy (using np as an alias) and change the function roll_dice to return the score.

PYTHON

def roll_die(sides=6):
  """Roll a die with the specified number of sides.

    Args:
        sides (int, optional): Number of sides. Defaults to 6.
    Returns:
          int: Rolling score
  """
  score = np.random.randint(1, sides+1)
  return score

And run pytest again.

Show me the solution

test_dice_game.py .                           [100%]

=============================1 passed in 0.19s==================================

Instructor Note

Point to the use of sides+1 in the code. This ensures the random numbers are generated within [1, sides] (inclusive).

Challenge

The function roll_die()should work for custom dice, or rather, when the user decides to use a die with a different number of sides. Add a test to the test_dice_game.py that checks this requirement as well.

Run pytest to check your code. The test should pass smoothly.

Show me the solution

Using a test die with 20 sides:

PYTHON

def test_custom_die():
    result = roll_die(sides=20)
    assert 1<=result<= 20

Instructor Note

The above challenge is designed as a pure exercise to check student understanding. It can be used to discuss the similarity with the previous code and to introduce the pytest.mark.parametrize.

Testing function with multiple arguments

Often, you need to verify the same behaviour with different inputs. You could use multiple assertions in the same function or repeat the same function multiple times but change the testing condition. This approach leads to several problems, such as code duplication. The best approach is to use pytest.mark.parametrize, which enables the parametrisation of arguments for a test.

The pytest.mark.parametrize takes the argument’s name defined in the function to test (“sides” in the example) as input and passes different values as a list. When running the test, Pytest runs the test function using each value, checking that the given input leads to the expected results.

To show how it works, let’s say we want to test the roll_dice function by passing different values for the variable sides. We can change test_roll_die function to

PYTHON

@pytest.mark.parametrize("sides", [6,8,10])
def test_roll_die(sides):
    result = roll_die(sides=sides)
    assert 1<=result<= sides

Using Pytest to test exception

So far, we used Pytest to check that the code produced the expected results. However, an essential part of testing is checking that the code manages exceptions correctly. For example, we need to ensure that the code can handle errors smoothly or that the correct exception type is raised when the code is not used correctly.

In Pytest, we can use the pytest.raise to check the type of exceptions and the error message ( read more about pytest.raise). For example, let’s say we create a function to divide two numbers like the one below

PYTHON

def divide_number(numerator,denominator):

  if denominator == 0 :
    raise ZeroDivisionError("Cannot divide by zero")
  
  return numerator/denominator

We can use pytest.raises to check the expected behaviour when the denominator is zero. If the correct exception is raised, the test passes; if no exception is raised or a different type is raised, the test fails.

PYTHON

def test_divide():
  with Pytest.raises(ZeroDivisionError):
    divide_number(2,0)

Step3 : Incorporate input validation

One of the requirements is to handle errors gracefully. Currently, the function roll_die does not check its input. Clearly, the number of sides should not be less than 0, and since it is a game, it is unreasonable to think about a die of only two sides. So, let us add another test to test_dice_game.py to check the error handling when the number of sides is incorrect.

PYTHON

def test_incorrect_die():
  with Pytest.raises(ValueError):
    roll_die(1)

The test will fail because we have not yet changed the roll_die function. Can you modify the function to pass the test?

Show me the solution

PYTHON

def roll_die(sides=6):
    """Roll a die with the specified number of sides.

    Args:
        slide (int, optional): Number of sides. Defaults to 6.

    Returns:
        int: Rolling score
    
    Raise:
      ValueError: When the number of sides is less than 2.
    """
    
    if sides <=2:
        raise ValueError('Number of sides must be greater than 2')
    score = np.random.randint(1, sides+1)
    return score

Instructor Note

Highlight the fact that the docstrings should change as the code changes.

Requirement 2: The dice rolling system must be fair and unbiased.

Checking if a die-rolling system is fair involves understanding theoretical expectations and practical verification methods. For a die with N sides to be fair, we expect that

Each number has the same probability of appearing ( \(\rm{p}_{i} = \frac{1}{N}\))
Average roll value to converge to the theoretical mean (\(\frac{1}{N}\sum_i^N X_i\), where \(N\) is the number of sides, and \(X_i\) is the individual score).
The distribution becomes uniform over a large number of rolls.

Considering this expectation, we can create tests to verify that our die is unbiased.

A possible approach is to use the average roll value and compare it with the theoretical mean. This approach requires writing code in dice_game.py that calculates the mean score for a given number of rolls and then compares it to the expected values. The test function in test_dice_game.py will check that the two values are equal.

The caveat in this approach is that comparing two non-integer values is not straightforward, as there are often numerical errors and approximations. For example, have a look at the output below

PYTHON

> 0.2+0.4 == 0.6
False

This is a common problem in testing and usually requires asserting that two floating-point numbers are equal within some tolerance, something like

PYTHON

abs((0.2 + 0.4) - 0.6) < 1e-6

where the tolerance is \(1\times10^{-6}\). Writing this type of tests is tedious and usually requires code repetition. However, Pytest has a built-in method that can help solve this problem: pytest.approx. By default, pytest.approx uses a \(1\times10^{-6}\) tolerance. There might be situations where this value is not adequate. For example, let us say that the expected value is 0.62, but the function might return 0.6199 instead. Then

PYTHON

result = 0.6199
assert result == pytest.approx(0.62), "Results don't match"
[...]
AssertionError: Results don't match

A better check therefore might be

PYTHON

result = 0.6199
assert result == pytest.approx(0.62,1e-3), "Results don't match"

where we changed the tolerance to \(1\times10^{-3}\). An alternative solution might be to use the round() function to approximate the solution to a given number of decimals.

PYTHON

result = 0.6199
assert round(result,3) == pytest.approx(0.62), "Results don't match"

Write test_average_roll

Let’s start by writing the calculate_average_function in dice_game.py

PYTHON

def calculate_average_rolls(sides, number_of_rolls):
    """Calculate average roll over multiple trials.

    Args:
        sides (int): Number of die sides.
        number_of_rolls (int): number of rolls
    
    """
    pass

Then, add the corresponding test function in test_dice_game.py.

Remember to import the new function calculate_average_rolls

PYTHON

def test_average_roll():
    """Test the die is unbiased. 
    Assert that the average score over a number of rolls is equal to the expected average
    """
    average = calculate_average_rolls(sides=6, num_of_rolls=1000)
    EXPECTED_AVERAGE = 3.5 #expected average for a 6-sided die.
    assert  round(average,1) == pytest.approx(EXPECTED_AVERAGE)

Now, modify the code calculate_average_rolls(sides, number_of_rolls). An example can be:

PYTHON

def calculate_average_rolls(sides, number_of_rolls):
    """Calculate average roll over multiple trials.

    Args:
        sides (int): Number of die sides.
        number_of_rolls (int): number of rolls
    
    Raises:
        ValueError: When the number of rolls is less than or equal to 1. 

    Returns:
        float: Average total score

    """

    if number_of_rolls <=1:
        raise ValueError("""Number of rolls should be greater than 1.
                        For better results, it should be at least 1e3""")
    total_score =[roll_die(sides) for n in range(number_of_rolls)]
    average = np.mean(total_score)
    return average

Instructor Note

Take some time to read and explain the code to the students. Things to focus on: - The EXPECTED_AVERAGE is calculated for a six-sided die. The code can be extended to test other values, for example, by combining it with pytest.mark.parametrize:

PYTHON

@Pytest.mark.parametrize("sides, num_rolls", [
    (6, 10000),(10, 10000)
])
def test_average_roll(sides, num_rolls):
    """Test the die is unbiased. 
    Assert that the average score over a number of rolls is equal to the expected average for different dice.
    """
    average = calculate_average_rolls(sides, num_rolls)
    EXPECTED_AVERAGE = sum(range(1,sides+1)) / sides
    assert  round(average,1) == pytest.approx(EXPECTED_AVERAGE)

The use of round is to fix approximation errors, especially when the number of rolls is small.
Discuss that the test should not take too long to run. So using a very large number of rolls can provide better results, but it might also take too long to run. For real-life case, you want to balance execution time and accuracy.

To test your code, you can run pytest as before.

Simulate the game

We developed and tested code to simulate a die’s rolling and ensured the system was unbiased. For this challenge, you need to write code that simulates the game. As for the previous case, you can focus on the following requirements and develop code and tests to complete the challenge. The requirements are

Game logic
- Requirement 3: The users can change the target score.
- Requirement 4: Track the total score and count the number of rolls.
- Requirement 5 : Implement the target score. The code should stop when the target score is reached or the total score exceeds the target score.

You can call your game function play_game.

Show me the solution

In test_dice_game.py

PYTHON

def test_game_completition():
    """ Test that the game returns a reasonable number of rolls
    """
    rolls_needed, total = play_game(target_score=6)
    assert rolls_needed > 0

In dice_game.py

PYTHON


def play_game(target_score = 21):
    """Roll the die until total score reaches or overcome the total score

    Args:
        target_score (int, optional): The target score. Defaults to 21.
    Returns:
        tuple (int, int): Number of rolls and total score
    """
    total = 0
    number_of_rolls = 0
    while total<target_score:
        roll = roll_die()
        total = total + roll
        number_of_rolls = number_of_rolls +1
        
    return number_of_rolls, total

Key Points

Testing increases trust in code and its results.
Test Driven Development helps to write tests alongside code
Pytest offers a powerful testing framework from Python code, offering different built-in functions and methods to test various aspects of the code:
- @pytest.mark.parametrize: To test different input values
- pytest.raises: To test error handling