In this tutorial I am going to demonstrate how to roll out and roll back updates with zero downtime on a Kubernetes cluster. This is one of the most important concepts when it comes to Kubernetes application deployments.
Prerequisites
To follow the practical demonstration on your own, I assume you already have a healthy 3-node Kubernetes cluster provisioned.
My environment
Node | IP | HostName | OS | Kubernetes version | Docker version |
Master | 172.32.32.100 | kmaster-ft.example.com | Ubuntu 18.04 | v1.19.3 | 19.03.6 |
Worker 1 | 172.32.32.101 | kworker-ft1.example.com | Ubuntu 18.04 | v1.19.3 | 19.03.6 |
Worker 2 | 172.32.32.102 | kworker-ft2.example.com | Ubuntu 18.04 | v1.19.3 | 19.03.6 |
Before we start the practical, let us first understand briefly what a Kubernetes Deployment is.
What is a Kubernetes Deployment?
As per the official Kubernetes documentation, a Deployment provides declarative updates for Pods and ReplicaSets.
You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
The ultimate goal of a Kubernetes Deployment is to provide declarative updates to both Pods and ReplicaSets. First, you declare a desired state in a manifest (a YAML file); then the Deployment controller reconciles the current state to match the desired state defined in the manifest.
Why not use a Replication Controller in place of a Deployment?
It is suggested to use a Deployment instead of a Replication Controller (rc) to perform a rolling update. They are similar in many ways, such as ensuring that a homogeneous set of pods is always up and available, and both let you roll out new images. However, a Deployment provides additional functionality, such as rollback support.
Rolling Updates
Users expect applications to be available all the time and developers are expected to deploy new versions of them several times a day.
In Kubernetes this is done with rolling updates. Rolling updates allow a Deployment's update to take place with zero downtime by incrementally replacing existing Pod instances with new ones. The new Pods are scheduled on Nodes with available resources.
Let us take a very simple example to understand this:
Our application image is available in the Docker Hub registry as rakeshrhcss/hello-fosstechnix-v1.
Let us create our Deployment manifest file.
root@kmaster-ft:~/deployments# cat deployment-rollout.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ft-rollout-demo
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ft-rollout-demo
  template:
    metadata:
      labels:
        app: ft-rollout-demo
    spec:
      containers:
      - image: rakeshrhcss/hello-fosstechnix-v1
        imagePullPolicy: Always
        name: ft-rollout-demo
        ports:
        - containerPort: 80
We are going to create 2 replicas for our application deployment, with the application listening on port 80 inside the container.
Now we will create a Service to expose our application in order to make it available for the users outside our cluster environment.
Here is our Service manifest file.
root@kmaster-ft:~/deployments# cat deployment-rollout-svc.yml
apiVersion: v1
kind: Service
metadata:
  name: ft-rollout-demo-service
spec:
  type: NodePort
  ports:
  - port: 80        # Service port
    targetPort: 80  # Pod port
    nodePort: 30088 # Node port, from the range 30000-32767
  selector:
    app: ft-rollout-demo
Create the deployment using its manifest file.
root@kmaster-ft:~/deployments# kubectl apply -f deployment-rollout.yml
deployment.apps/ft-rollout-demo created
Verify the deployment creation.
root@kmaster-ft:~/deployments# kubectl get deployment -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
ft-rollout-demo 2/2 2 2 3m27s ft-rollout-demo rakeshrhcss/hello-fosstechnix-v1 app=ft-rollout-demo
Verify the pods running.
root@kmaster-ft:~/deployments# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ft-rollout-demo-7c74c98d66-src5c 1/1 Running 0 2m51s 192.168.77.143 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 1/1 Running 0 2m51s 192.168.17.12 kworker-ft2 <none> <none>
Describe the deployment to get more insight into its configuration and status.
root@kmaster-ft:~/deployments# kubectl describe deployments.apps ft-rollout-demo
Name: ft-rollout-demo
Namespace: default
CreationTimestamp: Sat, 31 Oct 2020 16:13:31 +0000
Labels: <none>
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=ft-rollout-demo
Replicas: 2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=ft-rollout-demo
Containers:
ft-rollout-demo:
Image: rakeshrhcss/hello-fosstechnix-v1
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: ft-rollout-demo-7c74c98d66 (2/2 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 3m24s (x2 over 3m45s) deployment-controller Scaled up replica set ft-rollout-demo-7c74c98d66 to 2
Verify that the application is running on both pods that are part of our deployment.
root@kmaster-ft:~/deployments# curl 192.168.77.143:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 1 of our application. Enjoy!<h2>
root@kmaster-ft:~/deployments# curl 192.168.17.12:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 1 of our application. Enjoy!<h2>
Create the service now to make it accessible from outside.
root@kmaster-ft:~/deployments# kubectl apply -f deployment-rollout-svc.yml
service/ft-rollout-demo-service created
Verify the service status.
root@kmaster-ft:~/deployments# kubectl get service ft-rollout-demo-service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
ft-rollout-demo-service NodePort 10.100.132.191 <none> 80:30088/TCP 25s app=ft-rollout-demo
Describe the service and verify that our two pods are attached as service endpoints.
root@kmaster-ft:~/deployments# kubectl describe service ft-rollout-demo-service
Name: ft-rollout-demo-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=ft-rollout-demo
Type: NodePort
IP: 10.100.132.191
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 30088/TCP
Endpoints: 192.168.17.12:80,192.168.77.143:80
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
Now try to access our service from outside the Kubernetes cluster network on NodePort 30088.
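For example, from any machine that can reach the worker nodes, you can hit the NodePort directly. The IP below is the worker node IP from the environment table above; substitute your own node IP if it differs.

curl http://172.32.32.101:30088

This should return the version 1 welcome page shown earlier.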
So our application is working fine.
Roll Out a new update to our application
Here is our updated manifest file.
root@kmaster-ft:~/deployments# cat deployment-rollout.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ft-rollout-demo
  namespace: default
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 1
  selector:
    matchLabels:
      app: ft-rollout-demo
  template:
    metadata:
      labels:
        app: ft-rollout-demo
    spec:
      containers:
      - image: rakeshrhcss/hello-fosstechnix-v2
        imagePullPolicy: Always
        name: ft-rollout-demo
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /
            port: 80   # probe the same port the application serves on
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
Let's first understand this updated file.
If you simply update your original deployment-rollout.yml file with the new image version (which carries your application update) and apply it, you may notice a little downtime in your application, because the old pods start getting terminated while the new ones are still being created.
Why does this happen, and what is the solution?
The reason behind this behavior is that Kubernetes doesn't know when your new pod is ready to start accepting requests. As soon as the new pod is created, the old pod is terminated, without waiting to check whether all the necessary services and processes in the new pod have started so that it can actually receive requests. That is why you might see downtime.
The solution is a Deployment config option called a readiness probe. A readiness probe makes sure that the newly created pods are ready to take on requests before the old pods are terminated.
To enable this, your application needs to expose a path that returns a 200 response to an HTTP GET request. (Note: you can use other HTTP request methods as well, but for this post I'm sticking with GET.)
Probes have a number of fields that you can use to more precisely control the behavior of liveness and readiness checks:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. In the case of a readiness probe, the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
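Putting these fields together, a readiness probe that sets all of them explicitly might look like the sketch below (the values are illustrative defaults for our demo app on port 80, not requirements):

readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 3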
Another thing we should add is the RollingUpdate strategy, which can be configured as follows.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 25%
    maxSurge: 1
The above specifies the strategy used to replace old Pods with new ones. The type can be "Recreate" or "RollingUpdate"; "RollingUpdate" is the default value. It should be configured together with readiness/liveness probe settings, because without a probe Kubernetes doesn't know when your pod is actually ready, and you might see downtime because of that.
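For comparison, if you wanted all old Pods to be killed before any new Pods are created (accepting a short outage), you would set the Recreate strategy instead; a minimal sketch:

strategy:
  type: Recreate

The two fields below apply only to the RollingUpdate strategy.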
maxUnavailable is an optional field that specifies the maximum number of Pods that can be unavailable during the update process. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The absolute number is calculated from the percentage by rounding down. The value cannot be 0 if maxSurge is 0. The default value is 25%.
maxSurge is an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The value cannot be 0 if maxUnavailable is 0. The absolute number is calculated from the percentage by rounding up. The default value is 25%.
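To make this concrete for our manifest: with replicas: 2, maxUnavailable: 25% works out to 0 Pods (2 × 0.25 = 0.5, rounded down), and maxSurge: 1 allows one extra Pod. So during the update at least 2 Pods stay available and at most 3 Pods exist at any moment, which matches the behavior we will observe in the watch output below.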
Now that we understand the updated deployment file, let's apply it to roll out the new version of our application.
root@kmaster-ft:~/deployments# kubectl apply -f deployment-rollout.yml
deployment.apps/ft-rollout-demo configured
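You can also follow the rollout as it progresses with the rollout status subcommand, which blocks until the Deployment has finished rolling out (output not shown here):

kubectl rollout status deployment ft-rollout-demo

In this demo we will instead watch the Pods directly.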
Please observe the command output below closely, as this is the most important part to understand: you can clearly see the existing pods being terminated while the new pods come up.
root@kmaster-ft:~/deployments# kubectl get pods -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ft-rollout-demo-7c74c98d66-src5c 1/1 Running 0 42m 192.168.77.143 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 1/1 Running 0 42m 192.168.17.12 kworker-ft2 <none> <none>
ft-rollout-demo-984ff575d-t6nhj 0/1 ContainerCreating 0 4s <none> kworker-ft2 <none> <none>
ft-rollout-demo-984ff575d-t6nhj 0/1 Running 0 8s 192.168.17.13 kworker-ft2 <none> <none>
ft-rollout-demo-984ff575d-t6nhj 1/1 Running 0 14s 192.168.17.13 kworker-ft2 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 1/1 Terminating 0 42m 192.168.17.12 kworker-ft2 <none> <none>
ft-rollout-demo-984ff575d-b6vf4 0/1 Pending 0 0s <none> <none> <none> <none>
ft-rollout-demo-984ff575d-b6vf4 0/1 Pending 0 0s <none> kworker-ft1 <none> <none>
ft-rollout-demo-984ff575d-b6vf4 0/1 ContainerCreating 0 0s <none> kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 1/1 Terminating 0 42m 192.168.17.12 kworker-ft2 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 0/1 Terminating 0 42m 192.168.17.12 kworker-ft2 <none> <none>
ft-rollout-demo-984ff575d-b6vf4 0/1 ContainerCreating 0 2s <none> kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 0/1 Terminating 0 42m 192.168.17.12 kworker-ft2 <none> <none>
ft-rollout-demo-7c74c98d66-x46vn 0/1 Terminating 0 42m 192.168.17.12 kworker-ft2 <none> <none>
ft-rollout-demo-984ff575d-b6vf4 0/1 Running 0 6s 192.168.77.145 kworker-ft1 <none> <none>
ft-rollout-demo-984ff575d-b6vf4 1/1 Running 0 14s 192.168.77.145 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-src5c 1/1 Terminating 0 43m 192.168.77.143 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-src5c 1/1 Terminating 0 43m 192.168.77.143 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-src5c 0/1 Terminating 0 43m 192.168.77.143 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-src5c 0/1 Terminating 0 43m 192.168.77.143 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-src5c 0/1 Terminating 0 43m 192.168.77.143 kworker-ft1 <none> <none>
We can see now that two new pods have been created:
root@kmaster-ft:~/deployments# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ft-rollout-demo-984ff575d-b6vf4 1/1 Running 0 54s 192.168.77.145 kworker-ft1 <none> <none>
ft-rollout-demo-984ff575d-t6nhj 1/1 Running 0 68s 192.168.17.13 kworker-ft2 <none> <none>
Now verify whether our application has been updated to the newer version:
root@kmaster-ft:~/deployments# curl 192.168.77.145:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 2 of our application. Enjoy!<h2>
root@kmaster-ft:~/deployments# curl 192.168.17.13:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 2 of our application. Enjoy!<h2>
Yes, it has. Our rolling update worked without downtime.
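You can also confirm the new image on the Deployment itself; the IMAGES column should now show rakeshrhcss/hello-fosstechnix-v2 (output not shown here):

kubectl get deployment ft-rollout-demo -o wide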
Rollout history
By default, all of the Deployment's rollout history is kept in the system so that you can roll back any time you want (you can change that by modifying the revision history limit).
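If you want to change how many old revisions are retained, set .spec.revisionHistoryLimit in the Deployment spec. A minimal sketch that keeps the last 10 revisions (which is also the default):

spec:
  revisionHistoryLimit: 10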
root@kmaster-ft:~/deployments# kubectl rollout history deployment ft-rollout-demo
deployment.apps/ft-rollout-demo
REVISION CHANGE-CAUSE
1 <none>
2 <none>
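The CHANGE-CAUSE column shows <none> because we never recorded a reason for the change. If you want it populated, you can set the kubernetes.io/change-cause annotation on the Deployment around the time you apply an update, for example:

kubectl annotate deployment ft-rollout-demo kubernetes.io/change-cause="upgrade image to hello-fosstechnix-v2"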
Important Note: A Deployment’s revision is created when a Deployment’s rollout is triggered. This means that the new revision is created if and only if the Deployment’s Pod template is changed, for example if you update the labels or container images of the template. Other updates, such as scaling the Deployment, do not create a Deployment revision, so that you can facilitate simultaneous manual- or auto-scaling. This means that when you roll back to an earlier revision, only the Deployment’s Pod template part is rolled back.
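For instance, scaling the Deployment would change the replica count but would not add a new entry to the rollout history (shown for illustration only; we keep 2 replicas in this demo):

kubectl scale deployment ft-rollout-demo --replicas=3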
Rolling back updates
What if we did something wrong with the new update and our customers are getting impacted? We have to roll back our changes! But how?
Let's do it and roll back to the last stable version of our application:
root@kmaster-ft:~/deployments# kubectl rollout undo deployment ft-rollout-demo
deployment.apps/ft-rollout-demo rolled back
Verify the revision history now. Notice that revision 1 no longer appears: rolling back re-creates the old Pod template as a new revision (3), and the revision it duplicates is dropped from the history:
root@kmaster-ft:~/deployments# kubectl rollout history deployment ft-rollout-demo
deployment.apps/ft-rollout-demo
REVISION CHANGE-CAUSE
2 <none>
3 <none>
List out the newly created pods:
root@kmaster-ft:~/deployments# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ft-rollout-demo-7c74c98d66-ssps8 1/1 Running 0 62s 192.168.77.150 kworker-ft1 <none> <none>
ft-rollout-demo-7c74c98d66-wmvn2 1/1 Running 0 67s 192.168.17.17 kworker-ft2 <none> <none>
It's time to verify that the rollback has taken us back to the previous version:
root@kmaster-ft:~/deployments# curl 192.168.77.150:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 1 of our application. Enjoy!<h2>
root@kmaster-ft:~/deployments# curl 192.168.17.17:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 1 of our application. Enjoy!<h2>
It’s working perfectly fine and we are back to our previous stable version. (Our customers will be happy now).
Finally, I will show you how to roll back to a specific revision in the rollout history.
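Before doing so, you can inspect what a particular revision contains by passing the --revision flag to the history command (output not shown here):

kubectl rollout history deployment ft-rollout-demo --revision=2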
To roll back to the second revision of the Deployment, run the following command:
root@kmaster-ft:~/deployments# kubectl rollout undo deployment ft-rollout-demo --to-revision=2
deployment.apps/ft-rollout-demo rolled back
In our example, this repeated the same process (terminating the old pods and creating new ones) and took us back to version 2 of the application.
root@kmaster-ft:~/deployments# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ft-rollout-demo-984ff575d-bgq28 1/1 Running 0 29s 192.168.77.151 kworker-ft1 <none> <none>
ft-rollout-demo-984ff575d-jjjpz 1/1 Running 0 18s 192.168.17.18 kworker-ft2 <none> <none>
Verify the changes:
root@kmaster-ft:~/deployments# curl 192.168.17.18:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 2 of our application. Enjoy!<h2>
This is all about Kubernetes Deployment rollouts and rollbacks.
I hope you liked the tutorial. Please let me know your feedback in the response section.
Thanks! Happy Learning!
Related Articles:
Kubernetes Tutorial for Beginners [10 Practical Articles]