Content from Julia Fundamentals


Last updated on 2026-01-27

Overview

Questions

  • What basic data types can I work with in Julia?
  • How can I create a new variable in Julia?
  • How do I use a function?
  • Can I change the value associated with a variable after I create it?

Objectives

  • Assign values to variables.

Variables


Any Julia REPL or script can be used as a calculator:

JULIA

3 + 5 * 4

OUTPUT

23

This is great, but not very interesting. To do anything useful with data, we need to assign its value to a variable. In Julia, we assign a value to a variable using the equals sign =. For example, we can track the weight of a patient who weighs 60 kilograms by assigning the value 60 to a variable weight_kg:

JULIA

weight_kg = 60

Now, whenever we use weight_kg, Julia will substitute the value we assigned to it. In simple terms, a variable is a name for a value.

In Julia, variable names:

  • can include letters, digits, and underscores
  • cannot start with a digit
  • are case sensitive

This means that:

  • weight0 is valid, but 0weight is not
  • weight and Weight refer to different variables
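As a small illustration of these rules (the names and values below are made up for this sketch), two names that differ only in case hold independent values, and digits or underscores are fine anywhere except at the start:

```julia
# Names that differ only in case are two separate variables.
weight = 60
Weight = 75.5

println(weight)   # the lowercase variable
println(Weight)   # the capitalized variable

# Digits and underscores are allowed, as long as the name does not start with a digit.
weight0 = 61
weight_day_2 = 62
println(weight0 + weight_day_2)
```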

Types of Data


Julia supports various data types. Common ones include:

  • Integer numbers
  • Floating point numbers
  • Strings

For example, weight_kg = 60 creates an integer variable. If we want a more precise value, we can use a floating point value:

JULIA

weight_kg = 60.3

To store text, we create a string by using double quotes:

JULIA

patient_id = "001"

Using Variables in Julia


Once we’ve stored values in variables, we can use them in calculations:

JULIA

weight_lb = 2.2 * weight_kg

OUTPUT

132.66

Or modify strings:

JULIA

patient_id = "inflam_" * patient_id

OUTPUT

"inflam_001"

Built-in Julia Functions


Functions are called with parentheses. You can include variables or values inside them. Julia provides many built-in functions. To display a value, we use println or print:

JULIA

println(weight_lb)
println(patient_id)

OUTPUT

132.66
inflam_001

To display multiple values in Julia, we can pass them to println separated by commas.

JULIA

println(patient_id, " weight in kilograms: ", weight_kg)

This prints the value of patient_id, followed by the string " weight in kilograms: ", and then the value of weight_kg, all in one line.

In Julia, every value has a specific data type (e.g., integer, floating-point number, string). To check the type of a value or variable, use the typeof function:

JULIA

typeof(60.3)
typeof(patient_id)

OUTPUT

Float64
String

In this example:

  • 60.3 is interpreted as a floating-point number (specifically, a Float64).
  • patient_id contains a sequence of characters, so its type is String.

Understanding data types is important because they determine how values behave in operations, and some functions may only work with certain types.
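A minimal sketch of why this matters, using made-up variable names: the same operator does different things for numbers and strings, and a string that looks like a number has to be converted (here with parse) before it can be used in arithmetic.

```julia
# The same operator can behave differently depending on the types involved.
number_product = 2 * 3            # multiplication of two integers
string_product = "ab" * "cd"      # concatenation of two strings

println(number_product, " has type ", typeof(number_product))
println(string_product, " has type ", typeof(string_product))

# A string that looks like a number is still a String;
# parse converts it when a numeric value is needed.
id_text = "001"
id_number = parse(Int, id_text)
println(id_number + 1)
```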

You can also use typeof to explore the structure of more complex objects like arrays or dictionaries:

JULIA

typeof([1, 2, 3])      # Array of integers
typeof(["a", "b", "c"]) # Array of strings

OUTPUT

Vector{Int64}
Vector{String}

We can even do math directly in println:

JULIA

println("weight in pounds: ", 2.2 * weight_kg)

OUTPUT

weight in pounds: 132.66

The above doesn’t change weight_kg:

JULIA

println(weight_kg)

To change the value of the weight_kg variable, we have to assign a new value to weight_kg:

JULIA

weight_kg = 65.0
println("weight in kilograms is now: ", weight_kg)

OUTPUT

weight in kilograms is now: 65.0

Challenge

Check Your Understanding

What values do the variables mass and age have after each line?

JULIA

mass = 50.0
age = 56
println(mass * 2.0)
mass = mass * 2.0
age_new = age - 20

OUTPUT

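After line 1: mass is 50.0, age is not yet defined
After line 2: mass is 50.0, age is 56
After line 3: mass is still 50.0 (println displays 100.0 but stores nothing), age is 56
After line 4: mass is 100.0, age is 56
After line 5: mass is 100.0, age is still 56 (age_new is 36)

Challenge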

Sorting Out References

Julia allows multiple assignments in one line. What will this print?

JULIA

first_word, second_word = "Hello", "World!"
println(first_word, " ", second_word)

OUTPUT

Hello World!

(Note: println does not insert spaces between its arguments. We add one by passing a string that contains a single space, " ".)
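Because the whole right-hand side is evaluated before any name is assigned, the same syntax can also swap two values. A short sketch with made-up names:

```julia
# Multiple assignment unpacks the right-hand side from left to right.
low, high = 1, 10
println(low, " ", high)

# The right-hand side is evaluated first, so this swaps the two values
# without needing a temporary variable.
low, high = high, low
println(low, " ", high)
```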

Challenge

Seeing Data Types

What are the types of the following?

JULIA

planet = "Earth"
apples = 5
distance = 10.5

JULIA

typeof(planet)
typeof(apples)
typeof(distance)

OUTPUT

String
Int64
Float64

Key Points
  • Basic data types in Julia include integers, strings, and floating-point numbers.
  • Use variable = value to assign a value to a variable.
  • Use println(value) to display output.
  • Julia provides many built-in functions, such as typeof.

Content from Analyzing Patient Data


Last updated on 2026-01-27

Overview

Questions

  • How can I process tabular data files in Julia?

Objectives

  • Explain what a package is and what libraries are used for.
  • Import a package and use the functions it contains.
  • Read tabular data from a file into a program.
  • Select individual values and subsections from data.
  • Perform operations on arrays of data.

Loading data into Julia


To begin processing the clinical trial inflammation data, we need to load it into Julia. Different file formats require different packages, for example XLSX.jl for Excel files or JSON3.jl for JSON. Here we work with a CSV file, so we use the package CSV.jl.

Before we can use a package in Julia, we need to install it. This can be done either by entering the package mode in the Julia REPL or by using Pkg.add("PackageName"), for example inside a script.

To enter the package manager mode, press ] in the Julia REPL:

JULIA

]

Then you can add a package:

JULIA

pkg> add CSV

Alternatively, to add a package inside a script, use:

JULIA

using Pkg
Pkg.add("CSV")

After installing the package, you still need to load it before using its functionality:

JULIA

using CSV

Besides CSV.jl, we also need DataFrames.jl. You can install it the same way — give it a try!

After installing both packages, we load them and read the data file like this:

JULIA

using CSV, DataFrames

df = CSV.read("inflammation-01.csv", DataFrame)

OUTPUT

59×40 DataFrame
 Row │ 0      0_1    1      3      1_1    2      4      7      8      3_1    3 ⋯
     │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  I ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │     0      1      2      1      2      1      3      2      2      6    ⋯
   2 │     0      1      1      3      3      2      6      2      5      9
   3 │     0      0      2      0      4      2      2      1      6      7
   4 │     0      1      1      3      3      1      3      5      2      4
   5 │     0      0      1      2      2      4      2      1      6      4    ⋯
   6 │     0      0      2      2      4      2      2      5      5      8
   7 │     0      0      1      2      3      1      2      3      5      3
   8 │     0      0      0      3      1      5      6      5      5      8
  ⋮  │   ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋱
  53 │     0      0      2      1      1      4      4      7      2      9    ⋯
  54 │     0      1      2      1      1      4      5      4      4      5
  55 │     0      0      1      3      2      3      6      4      5      7
  56 │     0      1      1      2      2      5      1      7      4      2
  57 │     0      1      1      1      4      1      6      4      6      3    ⋯
  58 │     0      0      0      1      4      5      6      3      8      7
  59 │     0      0      1      0      3      2      5      4      8      2
                                                  30 columns and 44 rows omitted

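Notice that the column names in this output (0, 0_1, 1, 3, ...) look like data values rather than labels: by default, CSV.read treats the first line of the file as a header row. If your file has no header line, pass header=false to CSV.read so that the first line is read as data and the columns receive automatically generated names. The outputs in this episode were produced with the default call shown above.

If we want to check that the data loaded correctly, we can just print it: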

JULIA

print(df)

Or view the first few rows using:

JULIA

first(df, 5)

We can check the type of object we’ve created:

JULIA

typeof(df)

OUTPUT

DataFrame

To see how many rows and columns the data contains, we can use:

JULIA

size(df)

OUTPUT

(59, 40)

We can also get just the number of rows or columns:

JULIA

nrow(df)
ncol(df)

OUTPUT

59
40

Accessing Elements


In Julia, you can access data in a DataFrame by column, by name, or by specifying row and column indices.

Accessing a Single Column

You can access a single column by its position (column number) or its name (:column1 below is a placeholder for one of your own column names):

JULIA

df[!, 1]        # First column (by index)
df[!, :column1] # Column named `:column1`

The ! means you are accessing the column vector stored inside the DataFrame directly, without making a copy.

Important: because df[!, 1] returns the stored column itself, modifying this vector also changes the original DataFrame. Use df[:, 1] instead if you want a copy of the data.

Accessing a Specific Value

You can access individual values by specifying row and column numbers:

JULIA

df[30, 20]  # value at row 30, column 20

Or mix names and indices:

JULIA

df[30, :column_name]  # value at row 30, column named `:column_name`

Checking the Type of a Column

To inspect the type of data stored in a column:

JULIA

eltype(df[!, 1])      
eltype(df[!, :column1])  

This is useful when you want to confirm whether a column contains Float64, Int, String, etc.

Slicing data


An index like [30, 20] selects a single element of an array, but we can select whole sections as well. For example, we can select the first ten columns of values for the first four patients (rows) like this:

JULIA

print(df[1:4, 1:10])

OUTPUT

4×10 DataFrame
 Row │ 0      0_1    1      3      1_1    2      4      7      8      3_1
     │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
─────┼──────────────────────────────────────────────────────────────────────
   1 │     0      1      2      1      2      1      3      2      2      6
   2 │     0      1      1      3      3      2      6      2      5      9
   3 │     0      0      2      0      4      2      2      1      6      7
   4 │     0      1      1      3      3      1      3      5      2      4

The slice 1:4 means, “Start at index 1 and go up to and including index 4”. Julia uses 1-based indexing, so indices start at 1.

We don’t have to start slices at 1:

JULIA

println(df[6:10, 1:10])

OUTPUT

5×10 DataFrame
 Row │ 0      0_1    1      3      1_1    2      4      7      8      3_1
     │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
─────┼──────────────────────────────────────────────────────────────────────
   1 │     0      0      2      2      4      2      2      5      5      8
   2 │     0      0      1      2      3      1      2      3      5      3
   3 │     0      0      0      3      1      5      6      5      5      8
   4 │     0      1      1      2      1      3      5      3      5      8
   5 │     0      1      0      0      4      3      3      5      5      4

We can also use end inside a range (for example 37:end) to select everything from a certain position up to the last element. If we use : on its own, it includes everything:

JULIA

smaller_df = df[1:3, 37:end]

This selects rows 1 through 3 and columns 37 through the last column of the DataFrame.

OUTPUT

3×4 DataFrame
 Row │ 2_1    3_5    0_2    0_3
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      1      0      1
   2 │     2      2      1      1
   3 │     2      3      2      1
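The same range syntax works on ordinary arrays. The sketch below uses a small made-up matrix M rather than the DataFrame, so it runs without any packages; end can also take part in arithmetic such as end-1.

```julia
# A small 3×5 matrix standing in for a table of measurements.
M = [ 1  2  3  4  5;
      6  7  8  9 10;
     11 12 13 14 15]

last_two_columns = M[:, end-1:end]   # every row, last two columns
first_row        = M[1, :]           # a single row, returned as a vector
lower_right      = M[2:end, 4:end]   # rows 2 to the last, columns 4 to the last

println(size(last_two_columns))
println(first_row)
println(lower_right)
```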

Analyzing Data


Julia provides powerful tools for analyzing data stored in a DataFrame. To calculate the average inflammation for all patients on all days, we can use the mean function from the Statistics standard library.

Tip: Make sure the required packages are installed first. Statistics ships with Julia as a standard library, so only DataFrames needs to be added.

Then load the packages:

JULIA

using DataFrames
using Statistics

JULIA

mean(Matrix(df))

OUTPUT

6.160593220338983

Here’s what’s happening:

  • Matrix(df) converts the DataFrame to a regular array of numbers.
  • mean(...) calculates the average of all the values.

Descriptive Statistics in Julia


Let’s use three more functions to get some basic descriptive statistics from our dataset: maximum, minimum, and std (standard deviation).

We can use multiple assignment to store all the results in one line.

JULIA


maxval, minval, stdval = maximum(Matrix(df)), minimum(Matrix(df)), std(Matrix(df))

println("maximum inflammation: ", maxval)
println("minimum inflammation: ", minval)
println("standard deviation: ", stdval)

OUTPUT

maximum inflammation: 20
minimum inflammation: 0
standard deviation: 4.625075651890539

Callout

Exploring Functions in Julia

How can you find out what functions are available in a Julia module and how to use them?

Julia provides several ways to explore functions and get help:

  • To list functions and names in a module, use the names() function. For example:

    JULIA

    using Statistics
    names(Statistics)
  • To get detailed documentation on a function, use the help mode by typing a question mark ? before the function name in the REPL or Jupyter notebook:

    JULIA

    ?mean

    This will show you the official help text for mean.

  • To see all methods and signatures of a function, use:

    JULIA

    methods(mean)

When analyzing data, we often want to find values like the maximum inflammation per patient or the average inflammation per day.

One way is to first select the data for a single patient (row), then apply a function to that data:

JULIA

# Select data for patient 1
patient_1 = df[1, :]  

println("maximum inflammation for patient 1: ", maximum(patient_1))

OUTPUT

maximum inflammation for patient 1: 18

We don’t need to store the row separately — we can combine selecting the data and applying the function in one step:

JULIA

println("maximum inflammation for patient 3: ", maximum(df[3, :]))

OUTPUT

maximum inflammation for patient 3: 17

It is much easier to work with an array than with a DataFrame for many numerical operations. To convert a DataFrame to an array, use:

JULIA

data = Matrix(df)

What if we want the maximum inflammation for each patient across all days (i.e., row-wise maximum), or the average inflammation for each day across all patients (i.e., column-wise average)?

In Julia, many functions accept a dims keyword argument that specifies the dimension (axis) to operate on:

  • dims=1 means operate across rows, producing one result per column.
  • dims=2 means operate across columns, producing one result per row.
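To make the two cases concrete, here is a minimal sketch on a made-up 2×3 matrix (the name readings is only for illustration). Note that the result keeps two dimensions, with the collapsed dimension reduced to size 1.

```julia
using Statistics

# Two "patients" (rows) measured on three "days" (columns).
readings = [1 2 3;
            4 5 6]

per_day     = mean(readings, dims=1)   # collapses the rows: 1×3 result
per_patient = mean(readings, dims=2)   # collapses the columns: 2×1 result

println(size(per_day), " ", per_day)
println(size(per_patient), " ", per_patient)

# vec drops the singleton dimension when a plain vector is more convenient.
println(vec(per_patient))
```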

For example, to find the average inflammation per day (i.e., average across all patients for each day — column-wise):

JULIA

day_avg = mean(data, dims=1)
println(day_avg)

OUTPUT

[0.0 0.4576271186440678 1.11864406779661 1.728813559322034 ...]

To check the shape:

JULIA

println(size(day_avg))

To get the average inflammation per patient:

JULIA

patient_avg = mean(data, dims=2)
println(patient_avg)

OUTPUT

[5.45; 5.425; 6.1; 5.9; 5.55; ...]

Challenge

Slicing Strings

A section of an array is called a slice. We can take slices of character strings as well:

JULIA

element = "oxygen"
println("first three characters: ", element[1:3])
println("last three characters: ", element[4:6])

OUTPUT

first three characters: oxy
last three characters: gen

What is the value of element[1:4]? What about element[5:end]? Or element[:]? What is element[end]? What is element[end-1]?

OUTPUT

oxyg
en
oxygen
n
e

Challenge

Thin Slices

The expression element[4:3] (a range where the start is greater than the end) produces an empty string in Julia, a string that contains no characters.

If data is an array what does data[4:3, 5:4] produce? What about data[4:4, :]?

JULIA

data[4:3, 5:4]   # Empty range in both dimensions
data[4:4, :]     # Just row 4 (as a 1×n matrix)

OUTPUT

0×0 Matrix{Int64}
1×40 Matrix{Int64}:
 0  1  1  3  3  1  3  5  2  4  4  7  6  …  7  7  9  6  3  2  2  4  2  0  1  1

Challenge

Stacking Arrays

Arrays can be concatenated and stacked on top of one another in Julia using square bracket syntax.

JULIA

A = [1 2 3; 4 5 6; 7 8 9]

B = [A A]   # horizontal stacking

C = [A; A]  # vertical stacking

OUTPUT

3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

3×6 Matrix{Int64}:
 1  2  3  1  2  3
 4  5  6  4  5  6
 7  8  9  7  8  9

6×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9
 1  2  3
 4  5  6
 7  8  9

Write additional code that slices the first and last columns of A, and stacks them side by side into a 3×2 array, using only square bracket syntax.

Use column indexing with square brackets and combine the slices horizontally:

JULIA

D = [A[:, 1] A[:, end]]

OUTPUT

3×2 Matrix{Int64}:
 1  3
 4  6
 7  9

Challenge

Change in Inflammation

The patient data is longitudinal — each row represents a series of measurements for one patient over time. That means calculating the change in inflammation over consecutive days is meaningful.

Here’s how you can explore those changes:

JULIA

patient4_week1 = data[4, 1:7]  # the data for patient 4 over days 1 to 7

OUTPUT

7-element Vector{Int64}:
 0
 1
 1
 3
 3
 1
 3

To compute changes day by day, we use diff:

JULIA

println(diff(patient4_week1))

OUTPUT

[1, 0, 2, 0, -2, 2]

Questions

  1. If you use diff(data; dims=?), which dims value computes daily changes for each patient?
  2. If your data array has shape (60, 40), what will the shape be after calling diff(data; dims=?), and why?
  3. How can you compute the largest absolute change for each patient across all days?
  • Set dims=2 in diff(data; dims=2) to compute differences along each row (across days for each patient).
  • If your data is shaped (60, 40), the result of diff will be (60, 39) — one fewer column, because differences are between pairs of adjacent days.
  • To get the largest magnitude change for each patient, combine diff, abs. (abs applied element-wise), and maximum, again using dims=2:

JULIA

maximum(abs.(diff(data; dims=2)), dims=2)

This returns a one-column array with one row per patient (60×1 for a 60×40 input), where each entry is the maximum absolute change in inflammation for that patient.
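The shapes involved are easier to see on a tiny example. This sketch uses a made-up 2×4 matrix named readings in place of the real data:

```julia
# Two "patients" (rows) measured on four "days" (columns).
readings = [0 1 3 2;
            5 5 1 4]

daily_change = diff(readings; dims=2)               # 2×3: one fewer column
largest_jump = maximum(abs.(daily_change), dims=2)  # 2×1: one value per row

println(size(daily_change), " ", daily_change)
println(size(largest_jump), " ", largest_jump)
```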

Key Points
  • Use Pkg.add("PackageName") to install and using PackageName to load packages in Julia.
  • Load CSV data into a DataFrame with CSV.read("file.csv", DataFrame).
  • Use df[row, column] to access specific values; use df[!, column] to access entire columns.
  • Use size(df), nrow(df), and ncol(df) to inspect DataFrame dimensions.
  • Convert a DataFrame to a matrix using Matrix(df) for numerical operations.
  • Use mean, maximum, minimum, and std to compute statistics on data arrays.
  • Use mean(data, dims=1) for column-wise and dims=2 for row-wise operations.
  • Use diff(data; dims=2) to calculate daily changes per patient.

Content from Visualizing Tabular Data


Last updated on 2026-01-27

Plotting


The mathematician Richard Hamming once said, “The purpose of computing is insight, not numbers,” and the best way to develop insight is often to visualize data. Visualization deserves an entire lecture of its own, but we can explore a few features of Julia’s Plots.jl library here. While Julia has several plotting libraries, Plots.jl is a powerful and flexible option that works well for most use cases. First, we will load Plots and use its functions to create and display a heat map of our data:

JULIA

using Pkg
Pkg.add("Plots")
using Plots

heatmap(data)

This will display a heat map where colors represent the magnitude of values in the matrix data. You can customize it further by adding labels, colorbars, and other visual elements. Each row in the heat map corresponds to a patient in the clinical trial dataset, and each column corresponds to a day in the dataset. Darker (blue) pixels in this heat map represent lower values, while lighter (yellow) pixels represent higher values. As we can see, the general number of inflammation flare-ups for the patients rises and falls over a 40-day period.

Heat map representing the data variable. Each cell is colored by value along a color gradient from blue to yellow.

Let’s calculate the average inflammation per day across all patients and plot it:

JULIA

plot(mean(data, dims=1)', 
     xlabel="Day", 
     ylabel="Average inflammation", 
     title="Average inflammation over time", 
     legend=false)

A line graph showing the average inflammation across all patients over a 40-day period.

This line of code creates a plot showing the average inflammation over time. The options xlabel="Day" and ylabel="Average inflammation" label the axes, title="Average inflammation over time" adds a title to the graph, and legend=false hides the legend since only one line is being shown.

This plot shows how the average inflammation changes day by day, across all patients. It typically starts low, increases steadily, and then decreases — supporting the idea that the treatment takes effect after about three weeks.

JULIA

plot(maximum(data, dims=1)', 
     xlabel="Day", 
     ylabel="Maximum inflammation", 
     title="Maximum inflammation over time", 
     legend=false)

A line graph showing the maximum inflammation across all patients over a 40-day period.

JULIA

plot(minimum(data, dims=1)', 
     xlabel="Day", 
     ylabel="Minimum inflammation", 
     title="Minimum inflammation over time", 
     legend=false)

A line graph showing the minimum inflammation across all patients over a 40-day period.

The maximum value rises and falls in a linear pattern, while the minimum values appear to follow a step-like function. Neither of these trends seems biologically plausible, so there may be an error in the data or in how it was collected. These insights would have been difficult to uncover without visualizing the results.

Grouping Plots

With Plots.jl, it is easy to group multiple plots into a single figure. We first create the individual subplots and save them in variables. Then, we combine them into one figure using the layout argument.

JULIA


p1 = plot(mean(data, dims=1)', title="Average", ylabel="average", xlabel="Day", legend=false)
p2 = plot(maximum(data, dims=1)', title="Maximum", ylabel="max", xlabel="Day", legend=false)
p3 = plot(minimum(data, dims=1)', title="Minimum", ylabel="min", xlabel="Day", legend=false)

plot(p1, p2, p3, layout=(1, 3))
savefig("inflammation.svg")

The layout=(1, 3) argument arranges the three plots in one row. The call to savefig stores the figure as an SVG file. You can change the filename extension to save in other formats like .pdf or .png.

By using grouped plots, we can compare different trends side by side — for example, while the average inflammation rises and falls gradually, the minimum appears to jump in discrete steps. This visual comparison helps us spot unusual patterns that would be hard to detect just by looking at numbers.

Challenge

Plot Scaling

Why do all of our plots stop just short of the upper end of our graph?

By default, Plots.jl scales the axes to fit the data range exactly, so lines often stop right before the plot edge.

Challenge

Plot Scaling (continued)

If we want to change this, we can manually set the y-axis limits using the ylims keyword argument. Try updating your plotting code so that all subplots share the same y-axis.

JULIA

p1 = plot(mean(data, dims=1)', title="Average", ylabel="average", xlabel="Day", ylims=(0, 20), legend=false)
p2 = plot(maximum(data, dims=1)', title="Maximum", ylabel="max", xlabel="Day", ylims=(0, 20), legend=false)
p3 = plot(minimum(data, dims=1)', title="Minimum", ylabel="min", xlabel="Day", ylims=(0, 20), legend=false)

plot(p1, p2, p3, layout=(1, 3))

Key Points
  • Use the Plots.jl package to create simple and flexible visualizations.

Content from Storing Multiple Values in Vectors


Last updated on 2026-01-27

In the previous episode, we analyzed a single file of clinical trial inflammation data. However, after finding some peculiar and potentially suspicious trends in the trial data we ask Dr. Maverick if they have performed any other clinical trials. Surprisingly, they say that they have and provide us with 11 more CSV files for a further 11 clinical trials they have undertaken since the initial trial.

Our goal now is to process all the inflammation data we have, which means that we still have eleven more files to go!

The natural first step is to collect the names of all the files that we have to process. In Julia, a vector is a way to store multiple values together. In this episode, we will learn how to store multiple values in a vector as well as how to work with vectors.

Julia vectors


Unlike packages such as DataFrames.jl, vectors are built into the language, so we do not have to load a library to use them. We create a vector by putting values inside square brackets and separating the values with commas:

JULIA

odds = [1, 3, 5, 7]
println("odds are: ", odds)

OUTPUT

odds are: [1, 3, 5, 7]

We can access elements of a vector using indices — numbered positions of elements in the vector. These positions are numbered starting at 1 in Julia, so the first element has an index of 1.

JULIA

println("first element: ", odds[1])
println("last element: ", odds[4])
println("\"end\" keyword element: ", odds[end])

OUTPUT

first element: 1
last element: 7
"end" keyword element: 7

Callout

Ch-Ch-Ch-Ch-Changes

Data which can be modified in place is called mutable, while data which cannot be modified is called immutable. Strings and numbers are immutable. This does not mean that variables with string or number values are constants, but when we want to change the value of a string or number variable, we can only replace the old value with a completely new value.

Vectors and other collections, on the other hand, are mutable: we can modify them after they have been created. We can change individual elements, append new elements, or reorder the whole vector. For some operations, like sorting, we can choose whether to use a function that modifies the data in-place or a function that returns a modified copy and leaves the original unchanged.

Be careful when modifying data in-place. If two variables refer to the same vector, and you modify the vector value, it will change for both variables!

JULIA

mild_salsa = ["peppers", "onions", "cilantro", "tomatoes"]
hot_salsa = mild_salsa 
hot_salsa[1] = "hot peppers"
println("Ingredients in mild salsa: ", mild_salsa)
println("Ingredients in hot salsa: ", hot_salsa)

OUTPUT

Ingredients in mild salsa: ["hot peppers", "onions", "cilantro", "tomatoes"]
Ingredients in hot salsa: ["hot peppers", "onions", "cilantro", "tomatoes"]

If you want variables with mutable values to be independent, you must make a copy of the value when you assign it.

JULIA

mild_salsa = ["peppers", "onions", "cilantro", "tomatoes"]
hot_salsa = copy(mild_salsa)  # <-- makes a *copy* of the vector
hot_salsa[1] = "hot peppers"
println("Ingredients in mild salsa: ", mild_salsa)
println("Ingredients in hot salsa: ", hot_salsa)

OUTPUT

Ingredients in mild salsa: ["peppers", "onions", "cilantro", "tomatoes"]
Ingredients in hot salsa: ["hot peppers", "onions", "cilantro", "tomatoes"]

Because of pitfalls like this, code which modifies data in place can be more difficult to understand. However, it is often far more efficient to modify a large data structure in place than to create a modified copy for every small change. You should consider both of these aspects when writing your code.
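One way to check whether two names share a vector is the === operator, which tests whether both sides are the same object. The sketch below (with made-up names) also shows a caveat worth knowing: copy is shallow, so vectors nested inside a vector are still shared unless deepcopy is used.

```julia
original = [1, 2, 3]
alias    = original          # a second name for the same vector
clone    = copy(original)    # a new vector with the same elements

println(alias === original)  # true: both names refer to one object
println(clone === original)  # false: a separate object
println(clone == original)   # true: the contents are equal

# copy is shallow: inner vectors are still shared; deepcopy duplicates them too.
nested  = [[1, 2], [3, 4]]
shallow = copy(nested)
deep    = deepcopy(nested)
push!(nested[1], 99)
println(shallow[1])   # [1, 2, 99]
println(deep[1])      # [1, 2]
```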

Callout

Heterogeneous Vectors

Vectors in Julia can also contain elements of different types.

JULIA

sample_ages = Any[10, 12.5, "Unknown"]

This gives us flexibility, but comes at a small performance cost compared to vectors where all elements have the same type. When possible, you should use vectors with a consistent element type for efficiency.

In Julia, functions that modify their arguments in place follow a naming convention: their name ends with an exclamation mark !.

For example:

  • reverse!(odds) reverses the vector in place
  • reverse(odds) returns a new, reversed copy while leaving odds unchanged

This convention makes it easy to tell at a glance whether a function will mutate its input or not. There are many ways to change the contents of vectors besides assigning new values to individual elements.

JULIA

push!(odds, 11)  
println("odds after adding a value: ", odds)

OUTPUT

odds after adding a value: [1, 3, 5, 7, 11]

JULIA

removed_element = popfirst!(odds)   
println("odds after removing the first element: ", odds)
println("removed_element: ", removed_element)

OUTPUT

odds after removing the first element: [3, 5, 7, 11]
removed_element: 1

JULIA

reverse!(odds)  
println("odds after reversing in place: ", odds)

OUTPUT

odds after reversing in place: [11, 7, 5, 3]

Julia also provides non-mutating versions of many functions. These return a modified copy of the data and leave the original unchanged:

JULIA

odds_copy = reverse(odds)   
println("odds after non-mutating reverse: ", odds)
println("copy after reverse: ", odds_copy)

OUTPUT

odds after non-mutating reverse: [11, 7, 5, 3]
copy after reverse: [3, 5, 7, 11]
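Sorting follows the same pattern. A short sketch with a made-up vector scores, contrasting sort (returns a copy) with sort! (modifies its argument):

```julia
scores = [7, 3, 11, 5]

sorted_copy = sort(scores)   # returns a new, sorted vector
println(scores)              # the original order is untouched
println(sorted_copy)

sort!(scores)                # sorts the existing vector in place
println(scores)
```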

As we saw earlier, several variables can refer to the same vector, so modifying it in place through one of them changes it for all of them, which can be unintended:

JULIA

odds = [3, 5, 7]
primes = odds
push!(primes, 2)
println("primes: ", primes)
println("odds: ", odds)

OUTPUT

primes: [3, 5, 7, 2]
odds: [3, 5, 7, 2]

This happens because primes and odds point to the same vector in memory. If we want to make an independent copy of a vector, we can use the copy function:

JULIA

odds = [3, 5, 7]
primes = copy(odds)
push!(primes, 2)
println("primes: ", primes)
println("odds: ", odds)

OUTPUT

primes: [3, 5, 7, 2]
odds: [3, 5, 7]

Subsets of vectors and strings can be accessed using ranges:

JULIA

binomial_name = "Drosophila melanogaster"
group = binomial_name[1:10]
println("group: ", group)

species = binomial_name[12:23]
println("species: ", species)

chromosomes = ["X", "Y", "2", "3", "4"]
autosomes = chromosomes[3:5]
println("autosomes: ", autosomes)

last_chromosome = chromosomes[end]
println("last: ", last_chromosome)

OUTPUT

group: Drosophila
species: melanogaster
autosomes: ["2", "3", "4"]
last: 4

Key Points
  • [value1, value2, value3, ...] creates a vector.
  • Vectors can contain any Julia object, including other vectors (i.e., a vector of vectors).
  • Vectors are indexed and sliced with square brackets (e.g., vec[1] and vec[2:9]), in the same way as strings and arrays.
  • Vectors are mutable (i.e., their values can be changed in place).

Content from Automating Repetition with Loops


Last updated on 2026-01-27

Overview

Questions

  • How can I do the same operations on many different values?

Objectives

  • Explain what a for loop does.
  • Correctly write for loops to repeat simple calculations.
  • Explain what a while loop does.
  • Trace changes to a loop variable as the loop runs.
  • Trace changes to other variables as they are updated by a for loop.

In the episode “Visualizing Tabular Data”, we wrote Julia code that plots values of interest from our first inflammation dataset (inflammation-01.csv), which revealed some suspicious features in it.

Line graphs showing average, maximum and minimum inflammation across all patients over a 40-day period.

We now have a dozen datasets, and potentially more on the way if Dr. Maverick keeps up their surprisingly fast clinical trial rate. We would like to create plots for all of our datasets without having to copy-paste code for each file.

To do that, we need to teach the computer how to repeat actions automatically — this is where loops come in.

An example of a task that can be solved with a loop is accessing the numbers stored in a vector. We could do this by printing each number on its own.

JULIA

odds = [1, 3, 5, 7]

In Julia, an array (1D arrays are called vectors, 2D arrays are called matrices) is an ordered collection of elements, and each element has a unique number associated with it — its index.

For example, the first number in odds is accessed via odds[1]. One way to print each number is to write four separate println statements:

JULIA

println(odds[1])
println(odds[2])
println(odds[3])
println(odds[4])

OUTPUT

1
3
5
7

However, this approach has three major drawbacks:

  1. Not scalable – if the array has hundreds of elements, writing one line per element is unmanageable.
  2. Difficult to maintain – if we want to decorate each element with an asterisk or any other character, we’d have to change every line.
  3. Fragile – if the array is longer or shorter than expected, we either miss elements or get an error.

Example with a shorter array:

JULIA

odds = [1, 3, 5]
println(odds[1])
println(odds[2])
println(odds[3])
println(odds[4])

OUTPUT

1
3
5

ERROR

ERROR: BoundsError: attempt to access 3-element Vector{Int64} at index [4]

Here’s a better approach, a for loop:

JULIA

odds = [1, 3, 5, 7]

for num in odds
    println(num)
end

OUTPUT

1
3
5
7

This is shorter — definitely shorter than writing a println for every number in a long list — and more robust as well:

JULIA

odds = [1, 3, 5, 7, 9, 11]

for num in odds
    println(num)
end

OUTPUT

1
3
5
7
9
11

In a for loop, the loop variable (like num in the example) is just a name we give to each element of the collection as we go through it.

  • num is the loop variable.

  • During the first iteration, num = 1; during the second, num = 3; and during the third, num = 5, etc.

  • You can choose any valid variable name instead of num.

In Julia, the general form of a for loop is:

JULIA

for variable in collection
    # do things using variable, such as println
end
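The collection does not have to be a vector. As a small sketch, the loop below iterates over a range, written start:stop, which yields each integer from start to stop in turn. The name squares is only illustrative.

```julia
# Collect the squares of 1, 2, 3 and 4 by looping over a range.
squares = Int[]

for i in 1:4
    push!(squares, i^2)   # push! appends to the existing vector
end

println(squares)
```

Ranges are handy whenever we want to repeat something a fixed number of times without first building a vector by hand.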

Here’s another loop that repeatedly updates a variable:

JULIA

count = 0
people = ["Curie", "Darwin", "Turing"]

for person in people
    count += 1
end

println("There are ", count, " names in the vector.")

OUTPUT

There are 3 names in the vector.

It’s worth tracing the execution of this little program step by step. Since there are three names in people, the statement inside the loop will be executed three times.

  • First iteration: count is 0 (set on line 1) and person is "Curie". count = count + 1 updates count to 1.

  • Second iteration: person is "Darwin" and count is 1, so count becomes 2.

  • Third iteration: person is "Turing" and count becomes 3.

Since there are no more elements in people, the loop finishes. Finally, the println statement shows the result.
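One way to check such a trace is to print the variables on every pass. The sketch below wraps the loop in a function (the name count_names is just for illustration). This also sidesteps a scoping subtlety: when the same code is saved in a script file rather than typed at the REPL, a top-level loop cannot update a global variable such as count unless the assignment is marked with the global keyword.

```julia
function count_names(people)
    count = 0                      # local to the function
    for person in people
        count += 1
        println("iteration ", count, ": person is ", person)
    end
    return count
end

n = count_names(["Curie", "Darwin", "Turing"])
println("There are ", n, " names in the vector.")
```

Inside a function, count is an ordinary local variable, so the loop can update it in both the REPL and a script.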

Note that in Julia, the loop variable does not overwrite a variable with the same name outside the loop. The loop variable is local to the loop, so it only exists inside the loop body.

For example:

JULIA

person = "Rosalind"

for person in ["Curie", "Darwin", "Turing"]
    println(person)
end

println("after the loop, name is ", person)

OUTPUT

Curie
Darwin
Turing
after the loop, name is Rosalind

Note also that finding the length of an object is such a common operation that Julia has a built-in function for it called length:

JULIA

println(length([0, 1, 2, 3]))

OUTPUT

4

For a vector, length reads the stored size directly instead of visiting every element, so it is faster than any counting loop we could write ourselves, and it is much easier to read. It also works on many other kinds of collections in Julia, such as strings, ranges, and tuples, so we should use it whenever we can.
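As a quick illustration, here is length applied to a few different kinds of collections. The values are arbitrary examples.

```julia
println(length([0, 1, 2, 3]))   # vector with 4 elements
println(length("oxygen"))       # string with 6 characters
println(length(1:10))           # range with 10 values
println(length((2.5, "a")))     # tuple with 2 entries
```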

While Loops


Sometimes, we want to repeat an action until a certain condition is met, rather than looping over a collection. For this, Julia provides a “while loop”.

The general form is:

JULIA

while condition
    # do something
end

Example:

JULIA

count = 0

while count < 5
    println("count is ", count)
    count += 1
end

OUTPUT

count is 0
count is 1
count is 2
count is 3
count is 4

The loop checks the condition count < 5 before each iteration. As long as the condition is true, the loop body runs. Once count reaches 5, the condition is false and the loop stops.

You can use while loops when the number of iterations is not known in advance. But be careful: if the condition never becomes false, the loop will run forever (an infinite loop).

Warning! Example of an infinite loop:

JULIA

x = 0
while x < 3
    println(x)
end

This will print 0 endlessly because x never changes.
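A minimal sketch of the fix is to update x inside the loop body so that the condition eventually becomes false. The function name print_below and the vector printed are illustrative additions. The function wrapper lets the loop update x no matter where the code is run.

```julia
function print_below(limit)
    x = 0
    printed = Int[]
    while x < limit
        println(x)
        push!(printed, x)
        x += 1          # without this update the condition would stay true forever
    end
    return printed
end

values_seen = print_below(3)
```

If a loop does run away at the REPL, pressing Ctrl+C interrupts it.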

Challenge

Understanding the loops

Given the following loop:

JULIA

word = "oxygen"
for letter in word
    println(letter)
end

How many times is the body of the loop executed?

  • 3 times
  • 4 times
  • 5 times
  • 6 times

The body of the loop is executed 6 times, once for each character in "oxygen".

Challenge

Computing Powers With Loops

Exponentiation is built into Julia:

JULIA

println(5 ^ 3)

OUTPUT

125

Write a loop that calculates the same result as 5 ^ 3 using multiplication (and without exponentiation).

JULIA

result = 1
for i in 1:3
    result = result * 5
end
println(result)

OUTPUT

125

Challenge

Summing a vector

Write a loop that calculates the sum of elements in a vector by adding each element and printing the final value, so [124, 402, 36] prints 562.

JULIA

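numbers = [124, 402, 36]
total = 0   # named total so that Julia's built-in sum function is not shadowed
for num in numbers
    total = total + num
end
println(total)

OUTPUT

562

The update inside the loop can be written more compactly with the += operator:

JULIA

numbers = [124, 402, 36]
total = 0
for num in numbers
    total += num
end
println(total)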

OUTPUT

562

Key Points
  • Use for variable in collection to process the elements of a collection (like a vector) one at a time.
  • The body of a for loop must be placed inside for ... end.
  • The body of a while loop must be placed inside while ... end.
  • Use length(thing) to determine the length of a collection (vector, array, string, etc.).

Content from Analyzing Multiple Files


Last updated on 2026-01-27

Overview

Questions

  • How can I apply the same operations to many different files?

Objectives

  • Use a built-in function to collect filenames that match a wildcard pattern.
  • Write a for loop to process several files in sequence.

As a final step in processing our inflammation data, we need a way to gather all the files in our data directory whose names begin with inflammation- and end with .csv. In Julia, we can use the Glob.jl package together with file system functions to accomplish this.

JULIA

using Pkg
Pkg.add("Glob")
using Glob

The Glob.jl package provides a function called glob, which finds files and directories whose names match a given pattern. We provide those patterns as strings:

  • The character * matches zero or more characters.
  • The character ? matches exactly one character.

We can use this to get the names of all the CSV files in the current directory:

JULIA

println(glob("inflammation*.csv", "."))

OUTPUT

[".\\inflammation-01.csv", ".\\inflammation-02.csv", ".\\inflammation-03.csv", ".\\inflammation-04.csv", ".\\inflammation-05.csv", ".\\inflammation-06.csv", ".\\inflammation-07.csv", ".\\inflammation-08.csv", ".\\inflammation-09.csv", ".\\inflammation-10.csv", ".\\inflammation-11.csv", ".\\inflammation-12.csv"]

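As this example shows, glob returns a vector of file and directory paths. The order of that vector is not something we should rely on, so we sort it explicitly whenever order matters. Because the result is a vector, we can loop over it to do something with each filename in turn.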
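If Glob.jl is not available, a similar result can be sketched with Julia's built-in readdir and filter. This is an alternative technique, not what glob does internally. The example creates a temporary directory with a few empty, hypothetically named files so that it can run anywhere.

```julia
# Create a throwaway directory with a few empty files (hypothetical names).
dir = mktempdir()
for name in ["inflammation-02.csv", "inflammation-01.csv", "small-01.csv", "notes.txt"]
    touch(joinpath(dir, name))
end

# Keep only names that start with "inflammation-" and end with ".csv".
matches = filter(readdir(dir)) do name
    startswith(name, "inflammation-") && endswith(name, ".csv")
end

println(sort(matches))
```

Unlike the glob call above, readdir returns bare file names without the directory prefix, so joinpath(dir, name) is needed before opening a file.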

In our case, the “something” we want to do is generate a set of plots for each file in our inflammation dataset.

If we want to begin by analyzing only the first three files in alphabetical order, we can sort the output from glob and then take a slice of the first three filenames:

JULIA

using Glob
using DelimitedFiles
using Plots
using Statistics

# Get sorted list of matching files
filenames = sort(glob("inflammation*.csv", "."))
filenames = filenames[1:3]   # take the first three files

for filename in filenames
    println(filename)

    # Load data
    data = readdlm(filename, ',')

    # Create subplots
    plt1 = plot(mean(data, dims=1)', ylabel="average", legend=false)
    plt2 = plot(maximum(data, dims=1)', ylabel="max", legend=false)
    plt3 = plot(minimum(data, dims=1)', ylabel="min", legend=false)

    # Arrange side by side
    plot(plt1, plt2, plt3, layout=(1,3), size=(900,300))
    display(current())
end

OUTPUT

inflammation-01.csv
Output from the first iteration of the loop: three line plots showing daily average, maximum, and minimum inflammation over 40 days for all patients in the first dataset.

OUTPUT

inflammation-02.csv
Output from the second iteration of the loop: three line plots showing daily average, maximum, and minimum inflammation over 40 days for all patients in the second dataset.

OUTPUT

inflammation-03.csv
Output from the third iteration of the loop: three line plots showing daily average, maximum, and minimum inflammation over 40 days for all patients in the third dataset.

The plots from the second clinical trial file look almost identical to those from the first: the average curves show the same uneven rises and drops, the maximum values follow the same linear increase and decrease, and the minimum values show similar staircase structures.

The third dataset, however, looks different. Its average and maximum plots are much noisier, and appear more realistic than those of the first two datasets. But the minimum plot reveals that the lowest recorded inflammation is zero on every one of the 40 days of the trial.

If we generate a heatmap of the third dataset, we can see why:

  • Zero values are scattered across patients and days, pointing to possible measurement or recording issues.
  • The very last patient has no recorded inflammation at all, which might indicate that this participant doesn’t actually suffer from arthritis.
Heat map of the third inflammation dataset. Note that there are sporadic zero values throughout the entire dataset, and the last patient only has zero values over the 40 day study.

Challenge

Comparing Maximum Inflammation Across Trials

Use a loop to analyze all inflammation datasets:

  1. Collect all CSV files whose names start with inflammation-.
  2. For each file:
    • Load the data.
    • Compute the maximum inflammation per day.
  3. Store the daily maxima from each file.
  4. Plot all daily maxima curves on the same figure to compare the trials.
  5. Identify which dataset shows the highest peak inflammation overall.

Optional extensions:

  • Highlight the dataset with the highest peak using a different color or line style.
  • Print the filename corresponding to the highest peak.
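The core of this challenge can be sketched without any files. The block below is not an official solution: the file names and the small matrices are hypothetical stand-ins for data that would normally come from readdlm, and plotting is left out.

```julia
# Hypothetical stand-ins for data loaded from three files (rows: patients, columns: days).
trials = [
    "trial-a.csv" => [0 1 2; 0 2 1],
    "trial-b.csv" => [0 3 5; 1 2 4],
    "trial-c.csv" => [1 1 1; 0 4 2],
]

# Daily maxima for each trial: the largest value in every column.
daily_maxima = [name => vec(maximum(data, dims=1)) for (name, data) in trials]

# Highest single value per trial, then the trial where that peak is largest.
peaks = [maximum(maxima) for (name, maxima) in daily_maxima]
peak_value, peak_index = findmax(peaks)
peak_name = first(daily_maxima[peak_index])

println("highest peak: ", peak_value, " in ", peak_name)
```

With real data, trials would be filled inside a loop over the filenames, and each vector in daily_maxima could be drawn as one curve on a shared figure.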
Challenge

Plotting Differences

Plot the difference between the average inflammations recorded in the first and second datasets (inflammation-01.csv and inflammation-02.csv), i.e., the difference between the leftmost plots of the first two figures.

JULIA

using Glob
using DelimitedFiles
using Statistics
using Plots

# Load the first two datasets
filenames = sort(glob("inflammation*.csv", "."))
data1 = readdlm(filenames[1], ',')
data2 = readdlm(filenames[2], ',')

# Compute averages across patients (rows) for each day (columns);
# vec converts the resulting 1×N matrix to a vector
avg1 = vec(mean(data1, dims=1))
avg2 = vec(mean(data2, dims=1))

# Plot the difference between the first two datasets
plot(avg1 - avg2, ylabel="Difference in average", xlabel="Day",
     title="Difference between first and second dataset")

Challenge

Generate Composite Statistics

Use each of the files once to generate a dataset containing values averaged over all patients. Complete the code inside the loop given below:

JULIA

# get list of files (you may use Glob.jl or readdir + sort)
filenames = sort(glob("inflammation*.csv", "."))
composite_data = zeros(60, 40)

for filename in filenames
    # sum each new file's data into composite_data as it's read
    
end

# divide composite_data by number of files
composite_data = composite_data / length(filenames)

Then generate average, max, and min plots for all patients.

JULIA

using Glob
using DelimitedFiles
using Statistics
using Plots

# Step 1: Get all CSV files and sort them
filenames = sort(glob("inflammation*.csv", "."))
composite_data = zeros(60, 40)

# Step 2: Sum data from all files
for filename in filenames
    data = readdlm(filename, ',')
    composite_data .+= data
end

# Step 3: Average over the number of files
composite_data ./= length(filenames)

# Step 4: Plot average, max, and min for all patients
avg_plot = plot(mean(composite_data, dims=1)', ylabel="average", legend=false)
max_plot = plot(maximum(composite_data, dims=1)', ylabel="max", legend=false)
min_plot = plot(minimum(composite_data, dims=1)', ylabel="min", legend=false)

# Step 5: Arrange side by side
plot(avg_plot, max_plot, min_plot, layout=(1,3))

After exploring the heatmaps and statistical plots, and completing the exercises to plot differences between datasets and generate composite patient statistics, we can now summarize insights from the twelve clinical trial datasets.

The datasets seem to fall into two main categories:

  • “Ideal” datasets that match Dr. Maverick’s claims very closely, but show unusual maxima and minima (for example, inflammation-01.csv and inflammation-02.csv).
  • “Noisy” datasets that partially agree with Dr. Maverick’s claims, but contain concerning issues such as missing values scattered throughout, and even participants whose data suggest they may not belong in the trial.

Interestingly, all three of the “noisy” datasets (inflammation-03.csv, inflammation-08.csv, and inflammation-11.csv) are identical down to the very last value. Using this information, we confront Dr. Maverick about the suspicious and duplicated data.
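A check like this does not need a plot. As a minimal sketch, two matrices can be compared with ==, which is true only if they have the same shape and every element matches. The small matrices below are hypothetical stand-ins for loaded datasets.

```julia
# Hypothetical small matrices standing in for three loaded datasets.
data_a = [0.0 1.0; 2.0 0.0]
data_b = [0.0 1.0; 2.0 0.0]
data_c = [0.0 1.0; 2.0 3.0]

println(data_a == data_b)   # every element matches
println(data_a == data_c)   # one element differs
```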

Dr. Maverick admits that the clinical trial data were fabricated. The initial trial suffered from unreliable measurements and poorly selected participants. To demonstrate the efficacy of the drug, fake datasets were created, and the original flawed dataset was reused multiple times to make the trials appear more convincing.

Congratulations! We have analyzed the inflammation datasets and uncovered that they were synthetically generated.

But rather than discard these synthetic datasets, we can continue to use them as valuable tools for learning programming and data analysis.

Key Points
  • Use glob(pattern, folder) (from Glob.jl) to get a vector of files whose names match a given pattern.
  • In the pattern, * matches zero or more characters, and ? matches exactly one character.

Content from If/Else - Conditional Statements in Julia


Last updated on 2026-01-27

Overview

Questions

  • How can my programs make decisions and behave differently depending on data values?

Objectives

  • Write conditional statements including if, elseif, and else branches.
  • Correctly evaluate expressions containing && (and) and || (or).

In our last lesson, we noticed some suspicious patterns in our inflammation data by creating plots. How can we use Julia to automatically detect the kinds of features we saw, and take different actions depending on the results?

In this lesson, we’ll learn how to write code that only runs when certain conditions are met.

Conditionals


We can ask Julia to take different actions depending on a condition with an if statement:

JULIA

num = 37
if num > 100
    println("greater")
else
    println("not greater")
end
println("done")

OUTPUT

not greater
done

The second line of this code uses the keyword if to tell Julia that we want to make a choice. If the test that follows the if keyword is true, the body of the if (the lines between if and else, indented here for readability) is executed, and "greater" is printed.

If the test is false, the body of the else branch is executed instead, and "not greater" is printed. Only one branch is ever taken before execution continues to print "done".

Conditional statements don’t have to include an else. If there isn’t one, Julia simply does nothing if the test is false:

JULIA

num = 53
println("before conditional...")
if num > 100
    println(num, " is greater than 100")
end
println("...after conditional")

OUTPUT

before conditional...
...after conditional

We can also chain several tests together using elseif. The following Julia code uses elseif to print the sign of a number:

JULIA

num = -3

if num > 0
    println(num, " is positive")
elseif num == 0
    println(num, " is zero")
else
    println(num, " is negative")
end

OUTPUT

-3 is negative

Note that to test for equality we use a double equals sign == rather than a single equals sign =, which is used to assign values.

Comparing in Julia


To compare values we can use the following operators:

  • > : greater than
  • < : less than
  • == : equal to
  • != : not equal to
  • >= : greater than or equal to
  • <= : less than or equal to

We can also combine comparisons using logical operators:

  • && : logical AND (true if both conditions are true)
  • || : logical OR (true if at least one condition is true)
  • ! : logical NOT (inverts the truth value)

The syntax to combine operators looks like this:

JULIA

if (1 > 0) && (-1 >= 0)
    println("both parts are true")
else
    println("at least one part is false")
end

OUTPUT

at least one part is false

JULIA

if (1 < 0) || (1 >= 0)
    println("at least one test is true")
end

OUTPUT

at least one test is true

Callout

true and false

true and false are special values in Julia called Booleans, which represent truth values. A statement such as 1 < 0 returns false, while -1 < 0 returns true.
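Because a comparison produces an ordinary value, we can store it in a variable and combine it with the logical operators. A small sketch (the variable name is_negative is illustrative):

```julia
is_negative = -1 < 0          # the comparison yields a Bool
println(is_negative)
println(typeof(is_negative))
println(!is_negative)         # ! flips the truth value

# && only evaluates its right-hand side when the left-hand side is true.
println(is_negative && (1 < 0))
```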

Checking Our Data


Now that we’ve learned how conditionals work, we can use them to check for the suspicious features we observed in our inflammation data. We’ll load the CSV file using Julia’s standard library module DelimitedFiles.

JULIA

using DelimitedFiles

data = readdlm("inflammation-01.csv", ',')

From the first plots, we noticed that the maximum daily inflammation increases by one unit each day. We can check for this suspicious pattern by comparing the maximum values at the start (day 0) and in the middle (day 20) of the study. Because Julia indexes from 1, day 0 is stored in column 1 and day 20 in column 21.

We also noticed a different issue in the third dataset: the daily minima were all zero (as if a healthy participant had been included in the study). We can check for this using an elseif branch. If neither the maxima check nor the minima check is true, we can use else to give the all-clear.

JULIA

max_inflam_0 = maximum(data[:, 1])
max_inflam_20 = maximum(data[:, 21])

if max_inflam_0 == 0 && max_inflam_20 == 20
    println("Suspicious looking maxima!")
elseif sum(minimum(data, dims=1)) == 0
    println("Minima add up to zero!")
else
    println("Seems OK!")
end

We can test it with another dataset:

JULIA

data = readdlm("inflammation-03.csv", ',')

max_inflam_0 = maximum(data[:, 1])
max_inflam_20 = maximum(data[:, 21])

if max_inflam_0 == 0 && max_inflam_20 == 20
    println("Suspicious looking maxima!")
elseif sum(minimum(data, dims=1)) == 0
    println("Minima add up to zero!")
else
    println("Seems OK!")
end

OUTPUT

Minima add up to zero!

Using this approach, Julia evaluates the conditions in order:

  • If the first condition is true, it executes the corresponding block.
  • If not, it checks the elseif condition.
  • If neither condition is true, the else block provides a default action.

This allows us to automatically flag suspicious datasets without manually inspecting every plot, saving time and catching patterns systematically.

Challenge

How Many Paths?

Consider this code:

JULIA

if 4 > 5
    println("A")
elseif 4 == 4
    println("B")
elseif 4 < 5
    println("C")
end

Which of the following would be printed if you were to run this code? Why did you pick this answer?

  1. A
  2. B
  3. C
  4. B and C

B gets printed because 4 > 5 is false, and 4 == 4 is the first true condition. Even though 4 < 5 is also true, it is not executed because in an if / elseif chain, only the first true branch runs.

Even if multiple elseif conditions could theoretically be true, Julia will execute just the first one that is true, starting from the top of the conditional section.

This contrasts with multiple independent if statements, where every condition that is true will execute its block, not just the first.
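The difference can be seen in a small sketch. The function names chained and independent are made up for this illustration. Each function records which branches ran.

```julia
function chained(n)
    hits = String[]
    if n > 0
        push!(hits, "positive")
    elseif n > 10
        push!(hits, "large")    # never reached: any n > 10 already matched n > 0
    end
    return hits
end

function independent(n)
    hits = String[]
    if n > 0
        push!(hits, "positive")
    end
    if n > 10
        push!(hits, "large")
    end
    return hits
end

println(chained(50))
println(independent(50))
```

For the input 50 both conditions are true, but only the independent version runs both blocks.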

Challenge

Close Enough

Write conditions that print true if the variable a is within 10% of the variable b, and false otherwise. Compare your implementation with a partner: do you get the same result for all possible pairs of numbers?

Julia has a built-in function abs() that returns the absolute value of a number:

JULIA

println(abs(-12))

OUTPUT

12

JULIA

a = 5
b = 5.1

if abs(a - b) <= 0.1 * abs(b)
    println(true)
else
    println(false)
end

A shorter version prints the result of the comparison itself:

JULIA

a = 5
b = 5.1

println(abs(a - b) <= 0.1 * abs(b))

This works because the Boolean values true and false can be printed directly.
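Julia also ships a built-in function for approximate comparison, isapprox, which accepts a relative tolerance through the rtol keyword. Note that it is not identical to the solution above: isapprox measures the difference relative to the larger of the two magnitudes, whereas the solution measures it relative to b only, so borderline pairs may be judged differently.

```julia
a = 5
b = 5.1

println(isapprox(a, b; rtol=0.1))       # difference 0.1 is well inside 10%
println(isapprox(a, 2 * b; rtol=0.1))   # 5 versus 10.2 is far outside 10%
```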

Challenge

In-Place Operators

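Julia also provides updating operators such as += and *= (often called in-place operators). They combine an operation with an assignment, so x += 1 is shorthand for x = x + 1. For example: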

JULIA

x = 1   # original value
x += 1  # add one to x
x *= 3  # multiply x by 3
println(x)

OUTPUT

6

Write some code that sums the positive and negative numbers in a vector separately, using in-place operators. Do you think this is more or less readable than writing it without in-place operators?

JULIA

positive_sum = 0
negative_sum = 0
test_vector = [3, 4, 6, 1, -1, -5, 0, 7, -8]

for num in test_vector
    if num > 0
        positive_sum += num
    elseif num == 0
        # do nothing
    else
        negative_sum += num
    end
end

println("Sum of positives: ", positive_sum)
println("Sum of negatives: ", negative_sum)

Here, the elseif num == 0 branch is optional since neither sum changes for zero values, but it illustrates the use of elseif.

Challenge

Sorting Filenames Into Buckets

In our data folder, large datasets are stored in files whose names start with "inflammation-" and small datasets are in files whose names start with "small-". Other files can be ignored for now.

Your task is to sort these filenames into three separate vectors: large_files, small_files, and other_files.

Hint:

use startswith:

JULIA

println(startswith("string", "str"))   
println(startswith("string", "abc"))     

OUTPUT

true
false

Starting Point

JULIA

filenames = ["inflammation-01.csv",
             "myscript.jl",
             "inflammation-02.csv",
             "small-01.csv",
             "small-02.csv"]

large_files = String[]
small_files = String[]
other_files = String[]

Your Task

  1. Loop over the filenames.
  2. Determine which category each filename belongs to.
  3. Append the filename to the corresponding vector.

JULIA

for filename in filenames
    if startswith(filename, "inflammation-")
        push!(large_files, filename)
    elseif startswith(filename, "small-")
        push!(small_files, filename)
    else
        push!(other_files, filename)
    end
end

println("large_files: ", large_files)
println("small_files: ", small_files)
println("other_files: ", other_files)

OUTPUT

large_files: ["inflammation-01.csv", "inflammation-02.csv"]
small_files: ["small-01.csv", "small-02.csv"]
other_files: ["myscript.jl"]

Challenge

Counting Vowels

  1. Write a loop that counts the number of vowels in a string.
  2. Test it on a few words and full sentences.
  3. Compare your solution with a neighbor’s — did you handle the letter y the same way?

JULIA

vowels = "aeiouAEIOU"
sentence = "Hello World!"
count = 0

for char in sentence
    if char in vowels
        count += 1
    end
end

println("The number of vowels in this string is ", count)

OUTPUT

The number of vowels in this string is 3

Key Points
  • Use if condition to start a conditional statement, elseif condition to provide additional tests, and else to provide a default.
  • The bodies of the branches of conditional statements must be enclosed within if/elseif/else and end.
  • Use == to test for equality.
  • X && Y is only true if both X and Y are true.
  • X || Y is true if either X or Y, or both, are true.

Content from Creating Functions


Last updated on 2026-01-27

Overview

Questions

  • How can I define new functions?
  • What’s the difference between defining and calling a function?
  • What happens when I call a function?

Objectives

  • Define a function that takes parameters.
  • Return a value from a function.
  • Test and debug a function.
  • Set default values for function parameters.
  • Explain why we should divide programs into small, single-purpose functions.

In the last episode, we saw that Julia can make decisions about what it sees in our data. What if we want to convert some of our data, like taking a temperature in Fahrenheit and converting it to Celsius? We could write something like this for converting a single number:

JULIA

fahrenheit_val = 99
celsius_val = (fahrenheit_val - 32) * (5/9)

And for a second number we could just copy the line and rename the variables:

JULIA

fahrenheit_val2 = 43
celsius_val2 = (fahrenheit_val2 - 32) * (5/9)

But we would be in trouble as soon as we had to do this more than a couple of times. Cutting and pasting makes our code very long and repetitive very quickly.

We’d like a way to package our code so that it is easier to reuse — a shorthand way of re-executing longer pieces of code. We can do this with functions.

Let’s start by defining a function fahr_to_celsius that converts temperatures from Fahrenheit to Celsius:

JULIA

function fahr_to_celsius(temp)
    converted = (temp - 32) * (5/9)
    return converted
end

The function definition starts with the keyword function, followed by the function name (fahr_to_celsius) and a parenthesized list of parameter names (temp). The body of the function — the statements that are executed when it runs — is indented (by convention) and ends with an end keyword.

Inside the function we use a return statement to send a result back. When we call the function, the values we pass in are substituted for the parameter names, so we can use them inside the function.

Let’s try running our function:

JULIA

fahr_to_celsius(32)

OUTPUT

0.0

This calls our function with input 32 and returns the converted value. It works just like calling functions from Julia’s standard library or external packages.
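For one-line functions like this, Julia also offers a compact "assignment form" of function definition. The sketch below defines the same conversion under a different, illustrative name (fahr_to_celsius_short) so that it does not replace the function above.

```julia
# Compact assignment form: name(parameters) = expression
fahr_to_celsius_short(temp) = (temp - 32) * (5/9)

println(fahr_to_celsius_short(32))    # freezing point of water
println(fahr_to_celsius_short(212))   # boiling point of water
```

The value of the expression on the right-hand side is returned automatically, so no return statement is needed.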

Composing Functions


Now that we’ve seen how to turn Fahrenheit into Celsius, we can also write a function to turn Celsius into Kelvin:

JULIA

function celsius_to_kelvin(temp_c)
    return temp_c + 273.15
end

println("freezing point of water in Kelvin: ", celsius_to_kelvin(0.0))

OUTPUT

freezing point of water in Kelvin: 273.15

If we want to turn Fahrenheit into Kelvin, we could write out the formula directly, but we don’t need to. Instead, we can compose the two functions we already created:

JULIA

function fahr_to_kelvin(temp_f)
    temp_c = fahr_to_celsius(temp_f)
    temp_k = celsius_to_kelvin(temp_c)
    return temp_k
end

println("boiling point of water in Kelvin: ", fahr_to_kelvin(212.0))

OUTPUT

boiling point of water in Kelvin: 373.15

In Julia, there’s an even shorter way. We can use the function composition operator ∘ (typed with \circ<TAB> in the REPL or editor):

JULIA

fahr_to_kelvin2 = celsius_to_kelvin ∘ fahr_to_celsius

This creates a new function fahr_to_kelvin2 that first applies fahr_to_celsius and then feeds the result into celsius_to_kelvin.

JULIA

println("Boiling point of water in Kelvin (via ∘): ", fahr_to_kelvin2(212.0))

OUTPUT

Boiling point of water in Kelvin (via ∘): 373.15

So instead of writing out the intermediate steps every time, we can build bigger functions out of smaller ones just by linking them with ∘.

This shows how larger programs are built: we define simple operations, and then combine them into more powerful ones. Real-life functions are usually longer than these small examples but they should stay short enough that someone else can still read and understand them.
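To make the idea concrete, the sketch below chains the two conversions in three ways that should give the same result: nested calls, the composition operator ∘, and the pipe operator |>. The functions are redefined in compact form so that the block runs on its own. The variable names nested, composed, and piped are illustrative.

```julia
fahr_to_celsius(temp) = (temp - 32) * (5/9)
celsius_to_kelvin(temp_c) = temp_c + 273.15

# Three equivalent ways to chain the two conversions.
nested   = celsius_to_kelvin(fahr_to_celsius(32.0))
composed = (celsius_to_kelvin ∘ fahr_to_celsius)(32.0)
piped    = 32.0 |> fahr_to_celsius |> celsius_to_kelvin

println(nested, " ", composed, " ", piped)
```

Which form reads best is largely a matter of taste: ∘ lists the functions from last applied to first, while |> lists them in the order in which they run.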

Variable Scope


In our temperature conversion functions, we created variables inside those functions, such as temp, temp_c, temp_f, and temp_k. These are called local variables, because they only exist while the function is running. Once the function finishes, those variables disappear.

If we try to access them outside the function, Julia will throw an error:

JULIA

function fahr_to_kelvin(temp_f)
    temp_c = fahr_to_celsius(temp_f)
    temp_k = celsius_to_kelvin(temp_c)
    return temp_k
end

fahr_to_kelvin(212.0)

println(temp_k)  # trying to access local variable

ERROR

ERROR: UndefVarError: `temp_k` not defined

If we want to keep the result for later use, we need to assign the return value of the function to a variable outside the function:

JULIA

temp_kelvin = fahr_to_kelvin(212.0)
println("temperature in Kelvin was: ", temp_kelvin)

OUTPUT

temperature in Kelvin was: 373.15

Here, temp_kelvin is defined in the global scope (outside any function).

Inside a function, Julia can read global variables, but it’s usually better style to pass them as arguments. Still, here’s an example:

JULIA

temp_fahr = 212.0
temp_kelvin = fahr_to_kelvin(temp_fahr)

function print_temperatures()
    println("temperature in Fahrenheit was: ", temp_fahr)
    println("temperature in Kelvin was: ", temp_kelvin)
end

print_temperatures()

OUTPUT

temperature in Fahrenheit was: 212.0
temperature in Kelvin was: 373.15
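For comparison, here is a sketch of the argument-passing style recommended above. The function name format_temperatures is hypothetical. It builds the message from its parameters instead of reading global variables.

```julia
function format_temperatures(temp_fahr, temp_kelvin)
    # Everything the function needs arrives through its parameters.
    return string("Fahrenheit: ", temp_fahr, ", Kelvin: ", temp_kelvin)
end

message = format_temperatures(212.0, 373.15)
println(message)
```

Written this way, the function can be reused with any pair of values and does not depend on variables that happen to exist in the global scope.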

Tidying up


Now that we know how to wrap bits of code up in functions, we can make our inflammation analysis easier to read and easier to reuse. First, let’s make a visualize function that generates our plots:

JULIA

using CSV, DataFrames, Plots, Statistics

function visualize(filename)
    data = Matrix(CSV.read(filename, DataFrame; header=false))

    plt = plot(layout=(1,3), size=(900,300))

    plot!(plt[1], mean(data, dims=1)', label="", ylabel="average")
    plot!(plt[2], maximum(data, dims=1)', label="", ylabel="max")
    plot!(plt[3], minimum(data, dims=1)', label="", ylabel="min")

    display(plt)
end

and another function called detect_problems that checks for the systematic patterns we noticed:

JULIA

using DelimitedFiles

function detect_problems(filename)
    data = readdlm(filename, ',')

    max_inflam_0 = maximum(data[:, 1])
    max_inflam_20 = maximum(data[:, 21])

    if max_inflam_0 == 0 && max_inflam_20 == 20
        println("Suspicious looking maxima!")
    elseif sum(minimum(data, dims=1)) == 0
        println("Minima add up to zero!")
    else
        println("Seems OK!")
    end
end

Notice that rather than jumbling this code together in one giant for loop, we can now read and reuse both ideas separately. We can reproduce the previous analysis with a much simpler loop:

JULIA

using Glob

filenames = sort(glob("inflammation*.csv", "."))

for filename in filenames[1:3]
    println(filename)
    visualize(filename)
    detect_problems(filename)
end

By giving our functions readable names, we can more easily read and understand what is happening in the loop. Even better, if at some later date we want to use either of those pieces of code again, we can do so in a single line.

Testing and Documenting


When we put code into functions and want to reuse it, it is important to check whether those functions work correctly. That’s why we write tests.

First, let’s define a simple function that we can test:

JULIA

using Statistics

function offset_mean(data, target_mean_value)
    return (data .- mean(data)) .+ target_mean_value
end

Of course, we could test this on real data. But real datasets are often large, and we usually don’t know the correct result in advance. That’s why we use simple examples like this small matrix where we can easily verify the output:

JULIA

z = zeros(2, 2)
println(offset_mean(z, 3))

OUTPUT

[3.0 3.0; 3.0 3.0]

That looks right. Now we can use the function on our real data:

JULIA

using DelimitedFiles, Statistics

data = readdlm("inflammation-01.csv", ',')
println(offset_mean(data, 0))

OUTPUT

[-6.14875  -6.14875  -5.14875  …  -3.14875  -6.14875  -6.14875
 -6.14875  -5.14875  -4.14875  …  -5.14875  -6.14875  -5.14875
 -6.14875  -5.14875  -5.14875  …  -4.14875  -5.14875  -5.14875
   ⋮                               ⋮
 -6.14875  -6.14875  -6.14875  …  -6.14875  -4.14875  -6.14875
 -6.14875  -6.14875  -5.14875  …  -5.14875  -5.14875  -6.14875]

It’s hard to tell from the default output whether the result is correct, but we can check some basic statistics to reassure ourselves:

JULIA

println("original min, mean, and max are: ",
    minimum(data), ", ", mean(data), ", ", maximum(data))

offset_data = offset_mean(data, 0)

println("min, mean, and max of offset data are: ",
    minimum(offset_data), ", ", mean(offset_data), ", ", maximum(offset_data))

OUTPUT

original min, mean, and max are: 0.0, 6.14875, 20.0
min, mean, and max of offset data are: -6.14875, 2.842170943040401e-16, 13.85125

That seems almost right: the original mean was about 6.1, so shifting it to 0 makes the lower bound about –6.1. The mean of the offset data isn’t exactly zero, but it’s extremely close.

We can also check that the standard deviation hasn’t changed:

JULIA

println("std dev before and after: ",
    std(data), ", ", std(offset_data))

OUTPUT

std dev before and after: 4.613833197118566, 4.613833197118566

The values match, but to be more precise we can check the difference:

JULIA

println("difference in standard deviations before and after: ",
    std(data) - std(offset_data))

OUTPUT

difference in standard deviations before and after: 0.0

Everything looks good. Before we continue with the analysis, let’s document our function so we remember what it does.

Documenting Functions in Julia


The usual way to add documentation in Julia is with a docstring, written in triple quotes """ immediately before the function definition:

JULIA

"""
    offset_mean(data, target_mean_value)

Return a new array containing the original data,
with its mean shifted to match the desired value.

  Examples
  ========

  offset_mean([1, 2, 3], 0)
  3-element Vector{Float64}:
   -1.0
    0.0
    1.0
"""
function offset_mean(data, target_mean_value)
    return (data .- mean(data)) .+ target_mean_value
end

Now we can use Julia’s built-in help system:

JULIA

?offset_mean

OUTPUT

  offset_mean(data, target_mean_value)

  Return a new array containing the original data,
  with its mean shifted to match the desired value.

  Examples
  ========

  offset_mean([1, 2, 3], 0)
  3-element Vector{Float64}:
   -1.0
    0.0
    1.0

Defining Defaults


In Julia, we can pass arguments to functions in two ways: as positional arguments, like data in typeof(data), or as keyword arguments, like delim in CSV.read("something.csv", DataFrame; delim=',').

For example, we can read a CSV file with:

JULIA

using CSV, DataFrames

data = CSV.read("inflammation-01.csv", DataFrame; delim=',')

Notice that the filename is passed as the first positional argument, but we specify delim using a keyword argument.

To make our own functions easier to use, we can define default values for parameters. For example, let’s redefine our offset_mean function:

JULIA

"""
    offset_mean(data::AbstractArray, target_mean_value::Real=0.0)

Return a new array containing the original data,
with its mean shifted to match the desired value.

  Examples
  ========

  offset_mean([1, 2, 3])
  3-element Vector{Float64}:
   -1.0
    0.0
    1.0
"""
function offset_mean(data::AbstractArray, target_mean_value::Real=0.0)
    # ::Real accepts integers as well as floats, so offset_mean(data, 3) also works
    return (data .- mean(data)) .+ target_mean_value
end

The key difference is that target_mean_value now has a default value of 0.0. If we call the function with two arguments, it works as before:

JULIA

test_data = zeros(2, 2)
println(offset_mean(test_data, 3))

OUTPUT

[3.0 3.0; 3.0 3.0]

But we can also call it with just one argument. In that case, target_mean_value takes its default value of 0.0:

JULIA

more_data = 5 .+ zeros(2, 2)
println("data before mean offset:")
println(more_data)
println("offset data:")
println(offset_mean(more_data))

OUTPUT

data before mean offset:
[5.0 5.0; 5.0 5.0]
offset data:
[0.0 0.0; 0.0 0.0]

This is useful: we can provide a default value for parameters that usually stay the same but still allow flexibility when needed.

Julia matches positional arguments from left to right, and any argument not explicitly provided takes its default value. Parameters listed after a semicolon ; in the function definition are keyword arguments: they are passed by name, in any order, and can have defaults too:

JULIA

function show_values(; a=1, b=2, c=3)
    println("a: $a b: $b c: $c")
end

println("no parameters:")
show_values()
println("one parameter:")
show_values(a=55)
println("two parameters:")
show_values(a=55, b=66)

OUTPUT

no parameters:
a: 1 b: 2 c: 3
one parameter:
a: 55 b: 2 c: 3
two parameters:
a: 55 b: 66 c: 3

We can also set only c:

JULIA

println("only setting the value of c")
show_values(c=77)

OUTPUT

only setting the value of c
a: 1 b: 2 c: 77

In summary, Julia’s keyword arguments let us provide sensible defaults for optional parameters, making functions easier to use while still flexible.
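Positional defaults and keyword arguments can be combined in one function. The sketch below uses a hypothetical helper, format_measurement, that is not part of the lesson's code. It has one required positional argument, one optional positional argument, and one keyword argument.

```julia
# Hypothetical helper, not part of the lesson's code:
#   value  - required positional argument
#   label  - optional positional argument (default "value")
#   digits - keyword argument (default 2), listed after the semicolon
function format_measurement(value, label="value"; digits=2)
    rounded = round(value; digits=digits)
    return "$label = $rounded"
end

println(format_measurement(3.14159))                  # uses both defaults
println(format_measurement(3.14159, "pi"))            # overrides the label
println(format_measurement(3.14159, "pi"; digits=4))  # overrides both
```

Positional arguments must be given in order, so label cannot be set by name. Keyword arguments such as digits can only be set by name.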

Slurping and Splatting


Sometimes we don’t know in advance how many arguments a function should take. In Julia, we can use the slurping operator ... to collect multiple arguments into a single variable, and the splatting operator ... to pass the elements of a collection as separate arguments.

For example:

JULIA

function add_all(nums...)
    return sum(nums)
end

println(add_all(1, 2, 3))
println(add_all(10, 20, 30, 40))

OUTPUT

6
100

In add_all(nums...), all inputs are slurped into the tuple nums.

Splatting: expand a collection into separate arguments

JULIA

nums_to_add = [5, 15, 25]
println(add_all(nums_to_add...))

OUTPUT

45

When we call add_all(nums_to_add...), the array is splatted so that each element is passed as its own argument.

This makes functions more flexible when working with variable numbers of arguments.
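Slurping can follow ordinary parameters, and splatting works with tuples as well as arrays. Here is a minimal sketch using a made-up function, label_sum:

```julia
# Hypothetical function: one required argument followed by slurped extras
function label_sum(label, nums...)
    # nums is a tuple holding every argument after label
    return "$label: $(sum(nums))"
end

println(label_sum("total", 1, 2, 3))

# Splatting a tuple passes each of its elements as a separate argument
extra = (4, 5, 6)
println(label_sum("total", extra...))
```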

Multiple Dispatch


One of Julia’s most powerful features is multiple dispatch. This means that the function that gets called depends on the types of all its arguments.

You can define the same function name with different method signatures, and Julia will automatically choose the most specific one.

For example:

JULIA

# Define a function for two integers
function add_together(a::Int, b::Int)
    return a + b
end

OUTPUT

add_together (generic function with 1 method)

Now we can use add_together for integers. But if we try using it with floats, we get an error:

JULIA

add_together(1.0, 2.0)

OUTPUT

MethodError: no method matching add_together(::Float64, ::Float64)
The function `add_together` exists, but no method is defined for this combination of argument types.

Thanks to multiple dispatch, we can simply define a new method for the same function:

JULIA

function add_together(a::Float64, b::Float64)
    return a + b  
end

OUTPUT

add_together (generic function with 2 methods)

Now we can call it again, and Julia will automatically use the matching method:

JULIA

add_together(1.0, 2.0)

OUTPUT

3.0

This feature allows you to write clean, readable code while handling many different types naturally.
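Methods can also be declared for abstract types such as Number, which act as fallbacks when no more specific method matches. The sketch below uses a made-up function, describe_arg, written with Julia's compact one-line definition syntax. It illustrates how the most specific method is chosen.

```julia
# Hypothetical function with one method per argument type
describe_arg(x::Int) = "an integer"
describe_arg(x::Float64) = "a floating-point number"
describe_arg(x::Number) = "some other kind of number"   # fallback for any Number
describe_arg(x) = "not a number"                        # fallback for everything else

println(describe_arg(1))
println(describe_arg(1.0))
println(describe_arg(1//2))    # a Rational: only the Number method matches
println(describe_arg("one"))

# methods lists every method defined for a function
println(length(methods(describe_arg)))
```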

Readable Functions


Consider these two functions in Julia:

JULIA

# Short, less descriptive version
function s(p)
    a = 0.0
    for v in p
        a += v
    end
    m = a / length(p)
    d = 0.0
    for v in p
        d += (v - m)^2
    end
    return sqrt(d / (length(p) - 1))
end

# More descriptive and readable version
function std_dev(sample)
    sample_sum = 0.0
    for value in sample
        sample_sum += value
    end

    sample_mean = sample_sum / length(sample)

    sum_squared_devs = 0.0
    for value in sample
        sum_squared_devs += (value - sample_mean)^2
    end

    return sqrt(sum_squared_devs / (length(sample) - 1))
end

The functions s and std_dev compute the same thing — the sample standard deviation — but std_dev is much easier for a human to read and understand.

As this example shows, documentation and coding style are key for readability. Meaningful variable names and breaking code into logical sections with blank lines help make your code easier to follow.

Readable code is useful not only when sharing with others but also for your future self: if you revisit code months later, good readability will save you a lot of headache!
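A clearly named function is also easier to check. As a minimal sketch (the sample values are made up), a compact variant of std_dev can be compared with std from the Statistics standard library, which also computes the sample standard deviation by default:

```julia
using Statistics

# Compact variant of std_dev from above, written with broadcasting
function std_dev(sample)
    sample_mean = sum(sample) / length(sample)
    sum_squared_devs = sum((sample .- sample_mean) .^ 2)
    return sqrt(sum_squared_devs / (length(sample) - 1))
end

measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up values

# ≈ compares with a small tolerance for rounding error
println(std_dev(measurements) ≈ std(measurements))
```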

Challenge

Combining Strings

In Julia, “adding” two strings with * produces their concatenation: "a" * "b" is "ab".

Write a function called fence that takes two parameters, original and wrapper, and returns a new string that has the wrapper character at the beginning and end of the original.

A call to your function should look like this:

JULIA

println(fence("name", "*"))

OUTPUT

*name*

JULIA

function fence(original::AbstractString, wrapper::AbstractString)
    return wrapper * original * wrapper
end

Challenge

Return versus print

Note that return and println are not interchangeable in Julia. println prints data to the screen so we, the users, can see it. return, on the other hand, makes data visible to the program for further use.

Consider the following function:

JULIA

function add(a, b)
    println(a + b)
end

What happens if we execute the following commands?

JULIA

A = add(7, 3)
println(A)

Julia will first execute the function add with a = 7 and b = 3, so it prints 10.

However, add has no return statement, so it returns the value of its last expression, println(a + b), which is nothing. Thus, A is assigned nothing, and the final println(A) prints the word nothing. The complete output is:

OUTPUT

10
nothing
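
For comparison, here is a sketch of a variant (the name add_and_return is made up) that returns the sum, so the caller can keep working with it:

```julia
# Variant of add that returns its result instead of printing it
function add_and_return(a, b)
    return a + b
end

total = add_and_return(7, 3)   # total holds the sum and can be reused
println(total * 2)
```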
Challenge

Selecting Characters From Strings

In Julia, you can refer to a character in a string using [index] (for example, [1] for the first character, [2] for the second, and so on). Additionally, the keyword end refers to the last character.

Write a function called outer that returns a string made up of just the first and last characters of its input.

A call to your function should look like this:

JULIA

println(outer("helium"))

OUTPUT

hm

JULIA

function outer(input_string::AbstractString)
    return input_string[1] * input_string[end]
end

Challenge

Greeting Function with Default Parameter

Write a function called greet that:

  1. Takes one required parameter name.
  2. Takes one optional parameter greeting that defaults to "Hello".
  3. Returns a string that combines the greeting and the name in the format:
"<greeting>, <name>!"

JULIA

println(greet("Alice"))

OUTPUT

Hello, Alice!

JULIA

println(greet("Bob", "Hi"))

OUTPUT

Hi, Bob!

JULIA

function greet(name::AbstractString, greeting::AbstractString="Hello")
    # "!" is appended separately: inside the string, $name! would be read as a variable called name!
    return "$greeting, $name" * "!"
end

Key Points
  • Define a function using function function_name(parameter) ... end.
  • Call a function using function_name(value).
  • Numbers are stored as integers or floating-point numbers.
  • Variables defined within a function are local and can only be seen and used inside that function.
  • Variables created outside of any function are global.
  • Within a function, global variables can be accessed.
  • Use docstrings (triple-quoted strings """ ... """) to document a function.
  • Specify default values for parameters when defining a function using parameter=value in the parameter list.
  • Parameters can be passed by position, by name (keyword arguments), or omitted to use their default value.

Content from Handling errors


Last updated on 2026-01-27

Overview

Questions

  • How does Julia report errors?
  • How can I handle errors?

Objectives

  • To be able to read a traceback, and determine where the error took place and what type it is.
  • To be able to describe the types of situations in which syntax errors, undefined-variable errors, bounds errors, and missing-file errors occur.

Every programmer encounters errors, both those who are just beginning, and those who have been programming for years. Encountering errors and exceptions can be very frustrating at times, and can make coding feel like a hopeless endeavour. However, understanding what the different types of errors are and when you are likely to encounter them can help a lot. Once you know why you get certain types of errors, they become much easier to fix.

JULIA

function icecream()
    ice_creams = ["chocolate", "vanilla", "strawberry"]
    println(ice_creams[5])
end

icecream()

OUTPUT

ERROR: BoundsError: attempt to access 3-element Vector{String} at index [5]
Stacktrace:
 [1] getindex
   @ .\essentials.jl:13 [inlined]
 [2] icecream()
   @ Main .\REPL[1]:3
 [3] top-level scope
   @ REPL[2]:1

Let’s look at the error message step by step:

ERROR: BoundsError: attempt to access 3-element Vector{String} at index [5]

BoundsError means that you tried to look up something outside the allowed range. The vector has 3 elements, so the valid indices are 1, 2, and 3; index 5 does not exist.

[1] getindex
   @ .\essentials.jl:13 [inlined]
[2] icecream()
   @ Main .\REPL[1]:3
[3] top-level scope
   @ REPL[2]:1

This is Julia telling you where the problem happened:

  • At line 3 of REPL input 1, inside your function (println(ice_creams[5])).
  • That function was called from line 1 of REPL input 2.

Don’t panic if your error message is very long!

Sometimes a traceback can go on for 20 lines or more. This doesn’t mean the error is worse — it just means many functions were called before Julia hit the problem.

In most cases, the most useful part is the first frame that points to code you wrote. Julia lists the most recent call first: frame [1] is where the error was thrown, often inside a library, and the frames below it show the chain of calls that led there, ending at the top-level scope.

If you encounter an error and don’t know what it means, it is still important to read the traceback closely. That way, if you fix the error but encounter a new one, you can tell that the error changed.

Additionally, sometimes knowing where the error occurred is enough to fix it, even if you don’t entirely understand the message.
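The questions at the top of this episode ask how errors can be handled. One option is a try/catch block, which catches an exception so the program can react to it instead of stopping. The sketch below uses a hypothetical helper, flavour_at, that is not part of the lesson's code.

```julia
# Hypothetical helper: return the flavour at index, or a fallback if the index is invalid
function flavour_at(index)
    ice_creams = ["chocolate", "vanilla", "strawberry"]
    try
        return ice_creams[index]
    catch err
        if err isa BoundsError
            println("No flavour at index ", index)
            return "unknown"
        else
            rethrow()   # pass on any error we did not expect
        end
    end
end

println(flavour_at(2))
println(flavour_at(5))
```

Checking the error type with isa keeps the handler narrow: only the BoundsError we anticipated is replaced by a fallback, and anything else is still reported.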

Syntax Errors


When you forget a closing parenthesis, an end keyword, or type something in the wrong order, you will encounter a syntax error. This means that Julia couldn’t figure out how to read your program.

For example:

JULIA

function some_function(
    msg = "hello, world!"
    println(msg)
end    

ERROR

ERROR: ParseError:
# Error @ REPL[1]:3:5
    msg = "hello, world!"
    println(msg)
#   └─────────┘ ── Expected `)`
Stacktrace:
 [1] top-level scope
   @ none:1

Here, Julia tells us that there is a syntax error and shows us where parsing failed. In this case the problem is that the opening parenthesis in the function header was never closed.

Variable Name Errors


Another very common type of error occurs when you try to use a variable that does not exist. For example:

JULIA

println(a)

ERROR

ERROR: UndefVarError: `a` not defined
Stacktrace:
 [1] top-level scope
   @ REPL[8]:1

Julia tells us that the variable a is not defined. These errors are usually very informative, of the form:

UndefVarError: <variable_name> not defined

Why does this error occur? It depends on what your code was supposed to do, but there are a few very common reasons:

  1. Forgetting to use quotes around a string

JULIA

println(hello)

ERROR

ERROR: UndefVarError: `hello` not defined
Stacktrace:
 [1] top-level scope
   @ REPL[9]:1

Here, Julia thinks hello is a variable, not text. To fix it, you need to write "hello" instead.

  2. Using a variable before defining it

JULIA

for number in 1:10
    count = count + number
    println("The count is: ", count)
end

ERROR

ERROR: UndefVarError: `count` not defined
Stacktrace:
 [1] top-level scope
   @ .\REPL[10]:2

The variable count must be initialized (e.g. count = 0) before it can be updated in the loop.

  3. Typos and case-sensitivity

JULIA

Count = 0
for number in 1:10
    count = count + number
    println("The count is: ", count)
end

ERROR

ERROR: UndefVarError: `count` not defined
Stacktrace:
 [1] top-level scope
   @ .\REPL[10]:2

In Julia, variable names are case-sensitive: Count and count are two different variables. Here we defined Count, but tried to use count, so Julia reports it as undefined.
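For the second cause above (using a variable before defining it), one possible fix is sketched here. It assumes we wrap the loop in a function (the name running_total is made up), so that the accumulator is a local variable initialised before the loop:

```julia
# Hypothetical function: sum the numbers with an explicitly initialised accumulator
function running_total(numbers)
    total = 0                      # initialise before the loop
    for number in numbers
        total = total + number     # total already exists, so this update works
    end
    return total
end

println("The total is: ", running_total(1:10))
```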

File Errors


Another common type of error occurs when working with files. If you try to open a file that does not exist, Julia will raise a SystemError telling you so.

JULIA

open("myfile.txt", "r")

ERROR

ERROR: SystemError: opening file "myfile.txt": No such file or directory
Stacktrace:
 [1] systemerror(p::String, errno::Int32; extrainfo::Nothing)
   @ Base .\error.jl:176
 [2] #systemerror#82
   @ .\error.jl:175 [inlined]
 [3] systemerror
   @ .\error.jl:175 [inlined]
 [4] open(fname::String; lock::Bool, read::Bool, write::Nothing, create::Nothing, truncate::Nothing, append::Nothing)
   @ Base .\iostream.jl:293
 [5] open
   @ .\iostream.jl:275 [inlined]
 [6] open(fname::String, mode::String; lock::Bool)
   @ Base .\iostream.jl:356
 [7] open(fname::String, mode::String)
   @ Base .\iostream.jl:355
 [8] top-level scope
   @ REPL[13]:1

This usually happens because:

  • The file does not exist at the given path
  • The path is misspelled
  • You are in the wrong working directory

If your project looks like this:

myproject/
  writing/
    myfile.txt

and you try:

JULIA

open("myfile.txt", "r")

Julia will throw an error, because the correct path is:

JULIA

open("writing/myfile.txt", "r")

These are the most common errors with files, though many others exist. If you get an error that you’ve never seen before, searching the Internet for that error type often reveals common reasons why you might get that error.
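Two ways of guarding against a missing file are sketched below: checking with isfile before opening, and catching the SystemError. The example recreates the small project layout from above in a temporary directory. The helper read_or_default and its fallback text are made up for illustration.

```julia
# Build the small project layout from above inside a temporary directory
project = mktempdir()
mkpath(joinpath(project, "writing"))
write(joinpath(project, "writing", "myfile.txt"), "some text")

wrong_path = joinpath(project, "myfile.txt")
right_path = joinpath(project, "writing", "myfile.txt")

# isfile lets us check whether a file exists before opening it
println(isfile(wrong_path))
println(isfile(right_path))

# Hypothetical helper: catch the SystemError thrown for a missing file
function read_or_default(path, default="")
    try
        return read(path, String)
    catch err
        err isa SystemError || rethrow()   # only handle the error we expect
        return default
    end
end

println(read_or_default(right_path))
println(read_or_default(wrong_path, "<missing>"))
```

If a relative path fails unexpectedly, pwd() shows the current working directory that the path is resolved against.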

Challenge

Index Error

What is the error here? Fix it.

JULIA

numbers = [10, 20, 30]
println(numbers[4])

ERROR

ERROR: BoundsError: attempt to access 3-element Vector{Int64} at index [4]

JULIA

println(numbers[3])

Challenge

Syntax Error

The following function is supposed to calculate the average of a list of numbers. But it has a syntax error. What is the error, and how can you fix it?

JULIA

function average numbers
    total = sum(numbers
    return total / length(numbers)
end

Problems:

  • Missing parentheses around the parameter in the function definition
  • Missing closing parenthesis in the call to sum

Fixed version:

JULIA

function average(numbers)
    total = sum(numbers)
    return total / length(numbers)
end

Key Points
  • Julia error messages may look intimidating at first, but they contain useful information: what type of error occurred, where it happened, and sometimes hints about why.

  • An error having to do with the grammar or structure of the program is a syntax error, which Julia reports as a ParseError.

  • An UndefVarError will occur when trying to use a variable that does not exist.

  • Containers like arrays and strings will generate a BoundsError if you try to access an element at an index that does not exist.

  • Trying to open a file that does not exist will give you a SystemError.

Content from Writing Tests


Last updated on 2026-01-27

Overview

Questions

  • How can I make my programs more reliable?

Objectives

  • Explain what a test is.
  • Explain what an assertion is.
  • Use tests and assertions to check my code.

Why Testing Matters


When we write code, we want to be confident that it produces the right results.
Additionally, it should keep working when we make changes. And even if we make mistakes,
they should be caught early.

Julia provides two simple ways to test your code:

  1. @test (from the Test standard library).
  2. @assert (built-in macro).

Using @test


The @test macro is part of Julia’s Test standard library.
It checks whether an expression evaluates to true.
If not, the test is reported as failed; inside a @testset, the remaining tests still run.

JULIA

using Test  # load the testing tools

@test 2 + 2 == 4 

OUTPUT

Test Passed

JULIA

@test sqrt(9) == 3  

OUTPUT

Test Passed

JULIA

@test 10 ÷ 3 == 3

OUTPUT

Test Passed

JULIA

@test 10 / 3 == 3

OUTPUT

Test Failed at REPL[5]:1
  Expression: 10 / 3 == 3
   Evaluated: 3.3333333333333335 == 3

ERROR: There was an error during testing

A failing @test on its own reports the failure and throws an error. When we want to check many things at once, as in larger projects, we can group tests in a @testset: every test in the set is run even if an earlier one fails, and a summary is printed at the end.

JULIA

@testset "Math tests" begin
    @test 2 + 2 == 4
    @test 3^2 == 9
    @test sqrt(16) == 4
end

OUTPUT

Test Summary: | Pass  Total  Time
Math tests    |    3      3  0.1s
Test.DefaultTestSet("Math tests", Any[], 3, false, false, true, 1.75629781203e9, 1.756297812095e9, false)
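The same structure can be used for our own functions. As a minimal sketch, the manual checks made earlier on offset_mean could be written as a test set. The test data are made up, and the function is repeated so the example is self-contained.

```julia
using Test, Statistics

# Same definition as in the earlier episode
function offset_mean(data, target_mean_value)
    return (data .- mean(data)) .+ target_mean_value
end

@testset "offset_mean" begin
    sample = [1.0 2.0; 3.0 6.0]            # made-up test data
    shifted = offset_mean(sample, 10)

    @test size(shifted) == size(sample)    # the shape is unchanged
    @test mean(shifted) ≈ 10               # ≈ (isapprox) tolerates rounding error
    @test std(shifted) ≈ std(sample)       # shifting does not change the spread
    @test offset_mean(zeros(2, 2), 3) == fill(3.0, 2, 2)
end
```

Using ≈ rather than == for floating-point results avoids failures caused by rounding alone, such as the mean of 2.8e-16 seen earlier.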

The @assert Macro


The @assert macro is built into Julia. It also checks if an expression is true, but if the check fails, it immediately throws an error and stops the program.

JULIA

@assert 2 + 2 == 4     

JULIA

@assert 10 / 3 == 3     

OUTPUT

ERROR: AssertionError: 10 / 3 == 3
Stacktrace:
 [1] top-level scope
   @ REPL[8]:1

You can also provide a custom error message:

JULIA

x = -1
@assert x >= 0 "x must be non-negative!"

OUTPUT

ERROR: AssertionError: x must be non-negative!
Stacktrace:
 [1] top-level scope
   @ REPL[10]:1
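
Assertions are most often placed inside functions to state what the code assumes. The sketch below uses a made-up function, mean_weight, and catches the resulting AssertionError only to show what it contains:

```julia
# Hypothetical function that states its assumption with @assert
function mean_weight(weights)
    @assert !isempty(weights) "weights must not be empty!"
    return sum(weights) / length(weights)
end

println(mean_weight([60.0, 62.5]))

# The failed assertion throws an AssertionError, which we catch here to inspect it
try
    mean_weight(Float64[])
catch err
    println(err isa AssertionError)
    println(err.msg)
end
```

The Julia documentation notes that assertions may be disabled at some optimization levels, so they suit sanity checks better than validation that must always run.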

When to Use What?


  • Use @test when writing test files or checking lots of conditions at once.
  • Use @assert inside your program to enforce assumptions (like “input must be positive”).

Challenge

Test

  1. Write a function is_even(n) that returns true if n is even.

  2. Add a testset that checks:

    • is_even(2) is true.
    • is_even(3) is false.
  3. Add an assertion in your function that throws an error if n is not an integer.

JULIA

using Test

function is_even(n)
    @assert isa(n, Integer) "Input must be an integer"
    return n % 2 == 0
end

@testset "Even number tests" begin
    @test is_even(2) == true
    @test is_even(3) == false
end

Content from Debugging


Last updated on 2026-01-27

Overview

Questions

  • How can I debug my program?

Objectives

  • Debug code containing an error systematically.
  • Identify ways of making code less error-prone and more easily tested.

Once testing has uncovered problems, the next step is to fix them. Many novices do this by making more-or-less random changes to their code until it seems to produce the right answer, but that’s very inefficient (and the result is usually only correct for the one case they’re testing). The more experienced a programmer is, the more systematically they debug, and most follow some variation on the rules explained below.

Callout

It’s always important to check that our code is “plugged in”, i.e., that we’re actually exercising the problem that we think we are. Every programmer has spent hours chasing a bug, only to realize that they were actually calling their code on the wrong data set or with the wrong configuration parameters, or are using the wrong version of the software entirely. Mistakes like these are particularly likely to happen when we’re tired, frustrated, and up against a deadline, which is one of the reasons late-night (or overnight) coding sessions are almost never worthwhile.

  1. Know What It’s Supposed to Do: The first step in debugging something is to know what it’s supposed to do. “My program doesn’t work” isn’t good enough: in order to diagnose and fix problems, we need to be able to tell correct output from incorrect. But writing test cases for scientific software is hard, because if we knew what the output of the scientific code was supposed to be, we wouldn’t be running the software. In practice, scientists tend to do the following:

    • Test with simplified data
    • Test a simplified case
    • Check conservation laws
    • Visualize

  2. Make It Fail Every Time: We can only debug something when it fails, so the second step is always to find a test case that makes it fail every time. The “every time” part is important because few things are more frustrating than debugging an intermittent problem: if we have to call a function a dozen times to get a single failure, the odds are good that we’ll scroll past the failure when it actually occurs.

  3. Make It Fail Fast: If it takes 20 minutes for the bug to surface, we can only do three experiments an hour. This means that we’ll get less data in more time and that we’re more likely to be distracted by other things as we wait for our program to fail, which means the time we are spending on the problem is less focused. It’s therefore critical to make it fail fast.

  4. Change One Thing at a Time: Every time we make a change, however small, we should re-run our tests immediately, because the more things we change at once, the harder it is to know what’s responsible for what (the number of possible interactions grows rapidly with the number of changes). And we should re-run all of our tests: more than half of fixes made to code introduce (or re-introduce) bugs, so re-running all of our tests tells us whether we have regressed.

  5. Keep Track of What You’ve Done: Debugging works best when we keep track of what we’ve done and how well it worked. If we find ourselves asking, “Did left followed by right with an odd number of lines cause the crash? Or was it right followed by left? Or was I using an even number of lines?” then it’s time to step away from the computer, take a deep breath, and start working more systematically. Records are particularly useful when the time comes to ask for help. People are more likely to listen to us when we can explain clearly what we did, and we’re better able to give them the information they need to be useful.
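
One common source of intermittent failures is randomness. As an illustration of the "make it fail every time" rule, a fixed seed makes a random computation repeatable. This is a sketch only: the function noisy_measurement and the seed 1234 are made up.

```julia
using Random

# Hypothetical analysis step whose result depends on random numbers
function noisy_measurement(rng)
    return 10 + randn(rng)
end

# Two generators created with the same seed produce the same "random" values,
# so a failure that depends on them can be reproduced on every run
first_run = noisy_measurement(MersenneTwister(1234))
second_run = noisy_measurement(MersenneTwister(1234))

println(first_run == second_run)
```

Passing the generator as an argument, rather than relying on the global one, keeps the function easy to test with a known seed.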

Callout

Version Control

Version control is often used to reset software to a known state during debugging, and to explore recent changes to code that might be responsible for bugs.

Git is a widely used version control system.

If we can’t find a bug we should ask for help.
We could ask a colleague or describe our problem in an online forum. Asking for help also helps alleviate confirmation bias. If we have just spent an hour writing a complicated program, we want it to work, so we’re likely to keep telling ourselves why it should, rather than searching for the reason it doesn’t. People who aren’t emotionally invested in the code can be more objective, which is why they’re often able to spot the simple mistakes we have overlooked.

Key Points
  • Know what code is supposed to do before trying to debug it.
  • Make it fail every time.
  • Make it fail fast.
  • Change one thing at a time, and for a reason.
  • Keep track of what you’ve done.
  • Ask for help.

Content from Course Conclusion


Last updated on 2026-01-27

Congratulations! You have completed all parts of this Julia course and built a solid foundation for working with Julia.

By completing these lessons, you not only gained practical skills in Julia but also learned key principles of scientific programming: structuring code, reusability, readability, and testing.

What’s Next?


  • Apply Julia to your own projects and research questions.
  • Connect with the Julia community: https://julialang.org/community/.
  • Continue your learning with more advanced courses on topics like machine learning in Julia, scientific computing, optimization, or high-performance numerical simulations.

Final Thought


Learning to program takes practice, experimentation, and curiosity. With the foundations you have gained here, you are well prepared to use Julia for your own data analysis and scientific projects.

Good luck on your Julia journey!