Volumes in Kubernetes
Last updated on 2024-12-03
Overview
Questions
- How do I make data available in a Pod?
Objectives
- Explore creating and using volumes
- Understand the “stateless” nature of pods
Pods are stateless entities and should be treated as such. By default, pods and their containers are ephemeral: anything a pod writes to its own filesystem is removed when the pod stops or dies. For temporary files and directories this is fine, but it is not conducive to research, where results need to be saved somewhere. Workflows can still be run this way, but the pods and their containers would need to be set up to download their input from, and upload their output to, a remote server.
Volumes are the solution to this problem. The main type of volume used for pods is the Persistent Volume Claim or “PVC”. A PVC is a request to the Kubernetes cluster to “claim” space on a Persistent Volume (PV), which represents a piece of the cluster’s storage. PVCs are namespaced, which means that a PVC created in one namespace is not visible or accessible from another namespace on the Kubernetes cluster.
Once a PVC is mounted in a pod, data can be stored in or retrieved from it through the mount path on the container’s filesystem. This allows data to persist after the pod is terminated.
Creating a PV
Some Kubernetes clusters automatically create a PV when a PVC is created (dynamic provisioning). If yours does, this should be mentioned in the documentation of the cluster itself.
In this lesson, we first create a Persistent Volume ourselves to give us a space to claim for files.
pv_create.yaml
YAML
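apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  # Must match the storageClassName requested by the PVC, or the claim will not bind to this PV
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 5Gi
  hostPath:
    path: /mnt/pv0001/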
In our pv_create.yaml, we defined the basic details of our Persistent Volume. We are using a generic name, pv0001, for our Persistent Volume, but any valid name can be used. We are using the ReadWriteOnce access mode, which lets the volume be mounted read-write by a single node; multiple pods can still access the volume if they are running on that same node. This is sufficient here because we are only running a single pod in this part of the lesson. There are other access modes available that allow working across multiple nodes or setting the volume to read only. More details on this are available in the Kubernetes documentation. We also configured the Persistent Volume to have a capacity of 5 GiB and placed it on the host filesystem at /mnt/pv0001.
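To submit the manifest to the cluster, a command along these lines can be used. This is a minimal sketch that assumes the manifest is saved as pv_create.yaml in the current directory and that kubectl is already configured for your cluster.

```shell
# Create the Persistent Volume described in pv_create.yaml.
# PVs are cluster-scoped, so this may require administrator rights.
kubectl apply -f pv_create.yaml
```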
OUTPUT
persistentvolume/pv0001 created
This will create a dedicated space for our Pods to store files. The Pods won’t have immediate access to the PV, though. To store data, they need to make a claim against the PV using a Persistent Volume Claim.
Creating a PVC
For our Pods to store data, they need to claim space in a Persistent Volume.
pvc_storage.yaml
YAML
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-test-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
This configuration asks the cluster for a matching and available Persistent Volume to claim against. A PV matches when it offers the requested storage class and access mode and at least the requested capacity.
If a matching PV is available, the claim is bound to it and becomes available for a Pod, or multiple Pods, to use.
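The claim is submitted in the same way as the PV. A minimal sketch, assuming the manifest is saved as pvc_storage.yaml in the current directory:

```shell
# Create the Persistent Volume Claim described in pvc_storage.yaml.
# PVCs are namespaced; this uses the namespace of the current kubectl context.
kubectl apply -f pvc_storage.yaml
```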
OUTPUT
persistentvolumeclaim/my-test-pv-claim created
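This created the PVC. On a cluster with dynamic provisioning, a PV may also be created automatically if no matching PV exists yet.
We can check both by using kubectl get pv and kubectl get pvc. A STATUS of Bound means the claim is attached to a PV, while Pending means no matching PV has been bound to it yet.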
OUTPUT
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                          STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
pvc-7c06f9ef-909a-4cd7-b450-d136219a8964   8Gi        RWO            Delete           Bound    openproject/data-my-openproject-postgresql-0   standard       <unset>                          83d
OUTPUT
NAME               STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
my-test-pv-claim   Pending                                      manual         <unset>                 82s
Transferring data to and from a PVC
In order to transfer data to and from a PVC, you need to have the volume mounted in a pod for the transfer. Data from a local computer can then be copied to a PVC through the Pod using the mount path of the PVC.
First we will need to make a pod to copy our data through. We will use a similar structure to our first pod.
YAML
apiVersion: v1
kind: Pod
metadata:
  name: hello-world-pod
spec:
  containers:
    - name: hello-world-container
      image: busybox
      command: ["/bin/sh", "-c"]
      args: ["echo 'Hello World! I am in a pod!' && sleep infinity"]
      volumeMounts:
        - mountPath: /mnt/my_pvc
          name: my-pvc-for-pod
  volumes:
    - name: my-pvc-for-pod
      persistentVolumeClaim:
        claimName: my-test-pv-claim
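The Pod then needs to be created and running before it can be used. A minimal sketch follows; the lesson does not name a file for this manifest, so pod_pvc.yaml is a hypothetical file name, and the 60 second timeout is an arbitrary choice.

```shell
# pod_pvc.yaml is a hypothetical file name for the Pod manifest above.
kubectl apply -f pod_pvc.yaml

# Block until the Pod reports Ready, or give up after 60 seconds.
kubectl wait --for=condition=Ready pod/hello-world-pod --timeout=60s
```

If the Pod stays in Pending, checking the status of the PVC with kubectl get pvc is a good first step, since a Pod cannot start until its claim is bound.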
We can confirm that our PVC is available at our mountPath by running ls in the pod.
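One way to run ls inside the Pod is with kubectl exec. This sketch assumes that the directory being listed is /mnt, the parent of our mountPath, which is consistent with the output shown below.

```shell
# List /mnt inside the running Pod; the mount point my_pvc should appear.
kubectl exec hello-world-pod -- ls -l /mnt
```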
OUTPUT
total 4
drwxr-xr-x 2 root root 4096 Oct 7 16:27 my_pvc
Since our Pod is running the sleep command indefinitely, the pod will remain active and we can use it to transfer some data.
We will create a basic file to place in the PVC. In my_file.md we will insert the following text.
Hello, I am a file in a PVC!
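Any text editor can be used to create the file. As a sketch, the same result can be produced from the command line on our local computer and then printed back to check it:

```shell
# Write the text to my_file.md in the current directory, then print it back.
echo 'Hello, I am a file in a PVC!' > my_file.md
cat my_file.md
```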
OUTPUT
Hello, I am a file in a PVC!
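With the file created locally, it can be copied into the volume through the running Pod. A minimal sketch using kubectl cp, assuming my_file.md is in the current directory and hello-world-pod is in the namespace of the current context:

```shell
# Copy the local file into the PVC through the Pod's mount path.
# kubectl cp relies on the tar binary being present in the container image.
kubectl cp my_file.md hello-world-pod:/mnt/my_pvc/my_file.md
```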
If we check the content of the PVC mounted in the Pod, we can see our file and look at its contents.
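Both checks can be done with kubectl exec. This is a sketch that assumes the file was copied to /mnt/my_pvc/my_file.md:

```shell
# List the contents of the mounted PVC inside the Pod.
kubectl exec hello-world-pod -- ls -l /mnt/my_pvc

# Print the file stored in the PVC.
kubectl exec hello-world-pod -- cat /mnt/my_pvc/my_file.md
```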
OUTPUT
total 4
-rw-rw-r-- 1 1000 1000 29 Oct 7 16:49 my_file.md
OUTPUT
Hello, I am a file in a PVC!
Verifying the data is persistent
We can verify that the data is persistent by deleting the pod and creating a new pod.
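Deleting the Pod can be sketched as follows, assuming it is still named hello-world-pod:

```shell
# Delete the Pod; the PVC and the data stored in it are not deleted with it.
kubectl delete pod hello-world-pod
```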
OUTPUT
pod "hello-world-pod" deleted
At this point any data in the Pod’s filesystem itself would be gone as the Pod’s filesystem is ephemeral.
We will create a new Pod that will output the contents of the file.
check_pvc.yaml
YAML
apiVersion: v1
kind: Pod
metadata:
  name: check-pvc-pod
spec:
  containers:
    - name: file-check-container
      image: busybox
      command: ["/bin/sh", "-c"]
      args: ["cat /mnt/my_pvc/my_file.md && sleep infinity"]
      volumeMounts:
        - mountPath: /mnt/my_pvc
          name: my-pvc-for-pod
  volumes:
    - name: my-pvc-for-pod
      persistentVolumeClaim:
        claimName: my-test-pv-claim
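As before, the Pod is created by applying its manifest. A minimal sketch, assuming the manifest is saved as check_pvc.yaml in the current directory:

```shell
# Create the Pod that prints the file stored in the PVC.
kubectl apply -f check_pvc.yaml
```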
OUTPUT
pod/check-pvc-pod created
We can then check the logs of the pod to see the contents of the file.
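Because the container writes the file to standard output with cat, the text ends up in the Pod’s logs. A sketch of reading them:

```shell
# Print everything the container in check-pvc-pod has written to stdout/stderr.
kubectl logs check-pvc-pod
```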
OUTPUT
Hello, I am a file in a PVC!
We can also create a file in the volume using a Pod.
pod_create_file.yaml
YAML
apiVersion: v1
kind: Pod
metadata:
  name: file-create-pod
spec:
  containers:
    - name: file-create-container
      image: busybox
      command: ["/bin/sh", "-c"]
      args: ["for i in 1 2 3 4 5; do cat /mnt/my_pvc/my_file.md >> /mnt/my_pvc/output.log; done; ls /mnt/my_pvc; cat /mnt/my_pvc/output.log && sleep infinity"]
      volumeMounts:
        - mountPath: /mnt/my_pvc
          name: my-pvc-for-pod
  volumes:
    - name: my-pvc-for-pod
      persistentVolumeClaim:
        claimName: my-test-pv-claim
What we should expect to see is the contents of our PVC mounted in the pod and the contents of output.log.
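A sketch of the full sequence, assuming the manifest is saved as pod_create_file.yaml in the current directory (the 60 second timeout is an arbitrary choice):

```shell
# Create the Pod that appends my_file.md to output.log five times.
kubectl apply -f pod_create_file.yaml

# Wait for the Pod to be Ready before reading its logs.
kubectl wait --for=condition=Ready pod/file-create-pod --timeout=60s

# The ls and cat results are written to stdout, so they appear in the logs.
kubectl logs file-create-pod
```

Note that the loop appends with >>, so re-creating this Pod against the same PVC adds five more lines to output.log rather than replacing the file.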
OUTPUT
my_file.md
output.log
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
We can then copy the output.log file back to our computer.
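Copying out of the volume works like copying in, with the source and destination swapped. A minimal sketch, assuming file-create-pod is still running and that the file should land in the current local directory:

```shell
# Copy output.log from the PVC (through the running Pod) to the local machine.
kubectl cp file-create-pod:/mnt/my_pvc/output.log output.log

# Print the local copy to confirm the transfer.
cat output.log
```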
OUTPUT
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!
Hello, I am a file in a PVC!