In this tutorial I am going to cover yet another important Kubernetes topic: the StatefulSet.
Prerequisites
To follow the practical demonstration on your own, I assume you already have a healthy 3-node Kubernetes cluster provisioned.
My environment
| Node | IP | HostName | OS | Kubernetes version | Docker version |
|------|----|----------|----|--------------------|----------------|
| Master | 172.32.32.100 | kmaster-ft.example.com | Ubuntu 18.04 | v1.19.3 | 19.03.6 |
| Worker 1 | 172.32.32.101 | kworker-ft1.example.com | Ubuntu 18.04 | v1.19.3 | 19.03.6 |
| Worker 2 | 172.32.32.102 | kworker-ft2.example.com | Ubuntu 18.04 | v1.19.3 | 19.03.6 |
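A quick way to confirm the cluster is healthy before starting is to list the nodes; the node names, versions, and IPs in your output will of course reflect your own environment:

kubectl get nodes -o wide

All three nodes should report a Ready status.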
Before we start the practical demonstration, let's first understand what a Kubernetes StatefulSet is.
What is Kubernetes StatefulSet?
As per the official Kubernetes documentation, a StatefulSet is the workload API object used to manage stateful applications.
Not quite clear, right?
Let's first understand what stateful and stateless mean, in a simpler and more detailed manner.
What is a stateful application?
A stateful application is one that saves data to persistent disk storage, to be used by the application's clients, by other dependent applications, or by the server itself. For example, a database is a stateful application, as is any key-value store to which data is saved and retrieved by other applications. Popular examples of stateful applications are MongoDB, Cassandra, and MySQL.
What is a stateless application?
A stateless app is an application that does not save on the server side any client data generated in one session for use in the next session with that client. Instead, the client is responsible for storing and handling all application state-related information on its own side. This improves the performance of the application.
For example, web servers (such as Apache, Nginx, or Tomcat) that rely on RESTful API designs: they do not care which network they are serving requests over, and they do not need permanent storage either.
Now that we have a good understanding of stateful and stateless applications, we can go back to our original topic: the Kubernetes StatefulSet.
When to use a Kubernetes StatefulSet?
If you have a stateless app that needs to be deployed on a Kubernetes cluster, go with a Deployment. As far as a Deployment is concerned, Pods are interchangeable: the client never cares which Pod serves its response, for example with an Nginx web server.
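For contrast, a minimal stateless Deployment might look like the sketch below. The nginx-stateless name and image tag are placeholders for illustration only, not part of this demo:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-stateless
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-stateless
  template:
    metadata:
      labels:
        app: nginx-stateless
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80

All three replicas here are identical and interchangeable; any of them can answer any request.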
But if you want to deploy stateful applications such as databases, you should go with Kubernetes StatefulSets, because unlike a Deployment, a StatefulSet provides certain guarantees about the identity of the Pods it is managing (that is, predictable names) and about the startup order.
Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These Pods are created from the same spec, but are not interchangeable: each has a persistent identifier (for example, its hostname) that it maintains across any rescheduling.
If you want to use storage volumes to provide persistence for your workload, you can use a StatefulSet as part of the solution. Although individual Pods in a StatefulSet are susceptible to failure, the persistent Pod identifiers make it easier to match existing volumes to the new Pods that replace any that have failed.
What do you achieve with StatefulSet deployments?
- Stable, unique network identifiers: Each Pod that is part of a StatefulSet is given a hostname based on the application name plus an ordinal index that increments by one. For example, redis-cluster-0, redis-cluster-1, redis-cluster-2, and redis-cluster-3 for a StatefulSet named "redis-cluster" that has 4 instances running.
- Stable, persistent storage: Every Pod in the StatefulSet is assigned its own persistent volume, based on the storage class we define, or on the default storage class if none is defined. Deleting or scaling down Pods will not automatically delete the volumes associated with them, so the data persists. To clean up resources that are no longer needed, you could scale the StatefulSet down to 0 first and then delete the unused claims and volumes.
- Ordered, graceful deployment and scaling: The Pods of a StatefulSet are always created and brought online in a specific order, from ordinal 0 through N-1, and they are shut down in reverse order to ensure a reliable and repeatable deployment and runtime. The StatefulSet will not scale further until all the desired Pods are running; if one Pod dies, it recreates that Pod before attempting to add additional instances to meet the scaling criteria (see the scaling sketch just after this list).
- Ordered, automated rolling updates: If you've chosen the RollingUpdate strategy, then when you apply an updated manifest, StatefulSet Pods are removed and replaced in reverse ordinal order. StatefulSets can handle upgrades in a rolling manner: each Pod is shut down and rebuilt in turn, continuing until all instances of the old version have been shut down and cleaned up. As we know, persistent volumes are reused, so data is automatically carried over to the upgraded instances.
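To make these guarantees concrete, here is a small sketch of how scaling behaves once the redis-cluster StatefulSet from Step #2 below is running. The commands are standard kubectl, but treat the sequence as illustrative rather than part of the demo:

# Scale up: redis-cluster-4 is created first, then redis-cluster-5, in order
kubectl scale statefulset redis-cluster --replicas=6

# Scale back down: redis-cluster-5 is terminated first, then redis-cluster-4
kubectl scale statefulset redis-cluster --replicas=4

# The PVCs of the removed Pods are intentionally left behind
kubectl get pvc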
Practical Demonstration of Kubernetes StatefulSet
To understand the StatefulSet in its entirety, we will take the example of deploying a Redis cluster on our Kubernetes cluster.
Step #1: Create Persistent Volume and Storage Class
A local persistent volume represents a local disk directly attached to a single Kubernetes Node. A StorageClass provides a way for administrators to describe the "classes" of storage they offer.
Here is our Persistent Volume (PV) and StorageClass manifest file. We are creating 4 persistent volumes, one for each Redis instance, on the node kworker-ft1.
root@kmaster-ft:~/statefulset/statefulset-ft-demo# cat local-pv.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: statefulset-ft-demo-pv-1
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/local-storage1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - kworker-ft1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: statefulset-ft-demo-pv-2
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/local-storage2
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - kworker-ft1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: statefulset-ft-demo-pv-3
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/local-storage3
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - kworker-ft1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: statefulset-ft-demo-pv-4
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/local-storage4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - kworker-ft1
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
NOTE: Local volumes do not currently support dynamic provisioning; however, a StorageClass should still be created to delay volume binding until Pod scheduling. This is specified by the WaitForFirstConsumer volume binding mode.
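One assumption worth making explicit: the local paths referenced in the manifest must already exist on the worker node before the Pods try to mount them. Something along these lines, run on kworker-ft1, takes care of that:

mkdir -p /mnt/local-storage1 /mnt/local-storage2 /mnt/local-storage3 /mnt/local-storage4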
Create the PV and Storage Class:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl apply -f local-pv.yml
persistentvolume/statefulset-ft-demo-pv-1 created
persistentvolume/statefulset-ft-demo-pv-2 created
persistentvolume/statefulset-ft-demo-pv-3 created
persistentvolume/statefulset-ft-demo-pv-4 created
storageclass.storage.k8s.io/local-storage created
Verify the PVs created:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
statefulset-ft-demo-pv-1 10Gi RWO Delete Bound default/data-redis-cluster-0 local-storage 3m49s
statefulset-ft-demo-pv-2 10Gi RWO Delete Bound default/data-redis-cluster-2 local-storage 3m49s
statefulset-ft-demo-pv-3 10Gi RWO Delete Bound default/data-redis-cluster-3 local-storage 3m49s
statefulset-ft-demo-pv-4 10Gi RWO Delete Bound default/data-redis-cluster-1 local-storage 3m49s
Verify the storage class created:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get storageclass local-storage
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 4m2s
Step #2: Create StatefulSet
Here is the StatefulSet manifest file. (We will go through each and every part of this file to make you understand what it's expected to do.)
root@kmaster-ft:~/statefulset/statefulset-ft-demo# cat statefulset.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
data:
  update-node.sh: |
    #!/bin/sh
    REDIS_NODES="/data/nodes.conf"
    sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${REDIS_NODES}
    exec "$@"
  redis.conf: |+
    cluster-enabled yes
    cluster-require-full-coverage no
    cluster-node-timeout 15000
    cluster-config-file /data/nodes.conf
    cluster-migration-barrier 1
    appendonly yes
    protected-mode no
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 4
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:5.0.1-alpine
        ports:
        - containerPort: 6379
          name: client
        - containerPort: 16379
          name: gossip
        command: ["/conf/update-node.sh", "redis-server", "/conf/redis.conf"]
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: conf
          mountPath: /conf
          readOnly: false
        - name: data
          mountPath: /data
          readOnly: false
      volumes:
      - name: conf
        configMap:
          name: redis-cluster
          defaultMode: 0755
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
spec:
  clusterIP: None
  ports:
  - port: 6379
    targetPort: 6379
    name: client
  - port: 16379
    targetPort: 16379
    name: gossip
  selector:
    app: redis-cluster
The Kubernetes StatefulSet YAML structure is almost identical to that of a Deployment. The only difference is that we need a serviceName, so we need to define a Service that is going to expose our Pods.
- In the first part, we have a ConfigMap defined, which will be consumed by the Redis instances.
- In the second part, we have the StatefulSet defined with the serviceName, replicas, and label selectors for the Pods.
- We are using the redis:5.0.1-alpine image for our Pod containers, and have defined the ports to be exposed, the entry command for the containers to run, and a few environment variables.
- Then we have our volumeMounts, where we make use of our ConfigMap.
- Then we have volumeClaimTemplates, where we make use of the PVs and storage class we created as part of the first step.
- Finally, we have a headless Service, which is responsible for the network identity of the Pods.
NOTE: StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.
It is very important to understand this concept.
What are Headless Services?
Sometimes you don't need load-balancing and a single Service IP. In this case, you can create what are termed "headless" Services, by explicitly specifying "None" for the cluster IP (.spec.clusterIP).
For headless Services, a cluster IP is not allocated, kube-proxy does not handle these Services, and there is no load balancing or proxying done by the platform for them. How DNS is automatically configured depends on whether the Service has selectors defined:
- With selectors: For headless Services that define selectors, the endpoints controller creates Endpoints records in the API, and modifies the DNS configuration to return records (addresses) that point directly to the Pods backing the Service.
- Without selectors: For headless Services that do not define selectors, the endpoints controller does not create Endpoints records. However, the DNS system looks for and configures either CNAME records for ExternalName-type Services, or A records for any Endpoints that share a name with the Service, for all other types.
In our manifest file we have the selectors defined. (app: redis-cluster)
Let us deploy the above manifest file.
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl apply -f statefulset.yaml
configmap/redis-cluster unchanged
statefulset.apps/redis-cluster created
service/redis-cluster created
Verify the StatefulSet:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get statefulset -o wide
NAME READY AGE CONTAINERS IMAGES
redis-cluster 4/4 3m41s redis redis:5.0.1-alpine
Verify the pods running:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-cluster-0 1/1 Running 0 26s 192.168.77.181 kworker-ft1 <none> <none>
redis-cluster-1 1/1 Running 0 23s 192.168.77.182 kworker-ft1 <none> <none>
redis-cluster-2 1/1 Running 0 19s 192.168.77.183 kworker-ft1 <none> <none>
redis-cluster-3 1/1 Running 0 16s 192.168.77.184 kworker-ft1 <none> <none>
You should notice that all the Pods have been created on the kworker-ft1 node, since we are using the PVs provisioned on that node for data storage.
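You can also list the PersistentVolumeClaims generated by the volumeClaimTemplates; there is one claim per Pod, named data-redis-cluster-0 through data-redis-cluster-3, matching the CLAIM column we saw in the kubectl get pv output earlier:

kubectl get pvc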
Describe the service created:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl describe svc redis-cluster
Name: redis-cluster
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=redis-cluster
Type: ClusterIP
IP: None
Port: client 6379/TCP
TargetPort: 6379/TCP
Endpoints: 192.168.77.181:6379,192.168.77.182:6379,192.168.77.183:6379 + 1 more...
Port: gossip 16379/TCP
TargetPort: 16379/TCP
Endpoints: 192.168.77.181:16379,192.168.77.182:16379,192.168.77.183:16379 + 1 more...
Session Affinity: None
Events: <none>
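Notice that the Service IP is None and the Endpoints list the individual Pod IPs; that is the headless behaviour described above. You can also confirm it with a DNS lookup from inside one of the Redis Pods (the alpine-based image ships busybox's nslookup, which is an assumption about the image contents):

kubectl exec -it redis-cluster-0 -- nslookup redis-cluster

The lookup should return one record per Pod rather than a single cluster IP, and per-Pod names such as redis-cluster-0.redis-cluster.default.svc.cluster.local resolve directly to the corresponding Pod.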
Step #3: Start and verify the Redis cluster deployment
To do this, we run the following commands and type yes to accept the configuration.
root@kmaster-ft:~/statefulset/statefulset-ft-demo# IPs=$(kubectl get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 {end}')
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl exec -it redis-cluster-0 -- /bin/sh -c "redis-cli -h 127.0.0.1 -p 6379 --cluster create ${IPs}"
>>> Performing hash slots allocation on 4 nodes...
Master[0] -> Slots 0 - 4095
Master[1] -> Slots 4096 - 8191
Master[2] -> Slots 8192 - 12287
Master[3] -> Slots 12288 - 16383
M: c836f4cd6b9ef4aa4d1f56a706e5687aea3d89a9 192.168.77.181:6379
slots:[0-4095] (4096 slots) master
M: 6877f0d17e24e04dda1ba5e25037568313d42a81 192.168.77.182:6379
slots:[4096-8191] (4096 slots) master
M: 62dc5d4c89dc3ca9aa0c3258efe6e0442dce1d30 192.168.77.183:6379
slots:[8192-12287] (4096 slots) master
M: d4f1e6345f6327d12e5e8688d1ed660052a155c6 192.168.77.184:6379
slots:[12288-16383] (4096 slots) master
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 192.168.77.181:6379)
M: c836f4cd6b9ef4aa4d1f56a706e5687aea3d89a9 192.168.77.181:6379
slots:[0-4095] (4096 slots) master
M: d4f1e6345f6327d12e5e8688d1ed660052a155c6 192.168.77.184:6379
slots:[12288-16383] (4096 slots) master
M: 62dc5d4c89dc3ca9aa0c3258efe6e0442dce1d30 192.168.77.183:6379
slots:[8192-12287] (4096 slots) master
M: 6877f0d17e24e04dda1ba5e25037568313d42a81 192.168.77.182:6379
slots:[4096-8191] (4096 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
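As an optional sanity check, redis-cli can report the cluster state from any of the Pods; a healthy cluster shows cluster_state:ok and lists all four masters:

kubectl exec -it redis-cluster-0 -- redis-cli cluster info
kubectl exec -it redis-cluster-0 -- redis-cli cluster nodes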
Step #4: Test the Redis Cluster by deploying the Hit Counter App
We'll deploy a simple hit counter app into our cluster and put a NodePort Service in front of it to act as a load balancer. The purpose of this app is to increment a counter, store the value in the Redis cluster, and then return the counter value as an HTTP response in the UI.
And since we have persistent storage configured, the data will not be lost even if we delete the Pods.
Here is the application manifest file:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# cat example-app.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: hit-counter-lb
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
    targetPort: 5000
    nodePort: 32200
  selector:
    app: myapp
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hit-counter-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: rakeshrhcss/hit-counter-app-redis:1.0
        ports:
        - containerPort: 5000
Create the deployment and associated service.
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl apply -f example-app.yaml
service/hit-counter-lb created
deployment.apps/hit-counter-app created
Verify the pods running:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hit-counter-app-548b8c58f-s8c6f 1/1 Running 0 12s 192.168.17.30 kworker-ft2 <none> <none>
redis-cluster-0 1/1 Running 0 6m3s 192.168.77.181 kworker-ft1 <none> <none>
redis-cluster-1 1/1 Running 0 6m 192.168.77.182 kworker-ft1 <none> <none>
redis-cluster-2 1/1 Running 0 5m56s 192.168.77.183 kworker-ft1 <none> <none>
redis-cluster-3 1/1 Running 0 5m53s 192.168.77.184 kworker-ft1 <none> <none>
Verify the deployment:
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get deploy hit-counter-app -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
hit-counter-app 1/1 1 1 52s myapp rakeshrhcss/hit-counter-app-redis:1.0 app=myapp
Verify the exposed service (NodePort):
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl describe svc hit-counter-lb
Name: hit-counter-lb
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=myapp
Type: NodePort
IP: 10.111.80.255
Port: <unset> 80/TCP
TargetPort: 5000/TCP
NodePort: <unset> 32200/TCP
Endpoints: 192.168.17.30:5000
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
Step #5: Verify the application deployment and StatefulSet running
Access the deployment from a web browser outside the cluster using NodePort 32200 on any of the node IPs.
Each time you hit the URL, the counter is incremented by one. Try this once from the CLI as well, as shown below.
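For example, using the IP of one of the worker nodes from the environment table (the exact response text depends on the hit counter app):

curl http://172.32.32.101:32200/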
Now I will hit the URL a few times to increase the counter to 10, and then delete a couple of pods.
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl delete pods redis-cluster-1 redis-cluster-2
pod "redis-cluster-1" deleted
pod "redis-cluster-2" deleted
Let's verify the pod status now.
root@kmaster-ft:~/statefulset/statefulset-ft-demo# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hit-counter-app-548b8c58f-s8c6f 1/1 Running 0 7m47s 192.168.17.30 kworker-ft2 <none> <none>
redis-cluster-0 1/1 Running 0 13m 192.168.77.181 kworker-ft1 <none> <none>
redis-cluster-1 1/1 Running 0 14s 192.168.77.185 kworker-ft1 <none> <none>
redis-cluster-2 1/1 Running 0 12s 192.168.77.186 kworker-ft1 <none> <none>
redis-cluster-3 1/1 Running 0 13m 192.168.77.184 kworker-ft1 <none> <none>
New Pods with the same names (and therefore the same DNS identities) have been created, although with new IP addresses.
Let's verify that our data is also persistent. If it is, the hit counter should resume from 11.
Yes, it's working as expected: the data is persistent and the recreated Pods are serving the requests.
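If you want to double-check from the Redis side as well, you can query the counter key directly. The key name hits is only a guess at what the demo app uses; substitute the actual key name if it differs:

kubectl exec -it redis-cluster-0 -- redis-cli -c get hits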
This was all about StatefulSets. It's a bit of a lengthy article, but it should help you a lot in understanding the core concepts of Kubernetes StatefulSets.
Hope you like the article. Please let me know your feedback in the response section.
Thanks. Happy learning!
Related Articles
Kubernetes Tutorial for Beginners [10 Practical Articles]