Rolling Out and Rolling Back Updates with Zero Downtime on a Kubernetes Cluster

In this tutorial I am going to demonstrate how to roll out and roll back updates with zero downtime on a Kubernetes cluster. This is one of the most important concepts in Kubernetes application deployments.

Prerequisites

To follow the practical demonstration on your own, I assume that you already have a healthy 3-node Kubernetes cluster provisioned.

My environment

Node       IP              Hostname                  OS             Kubernetes version   Docker version
Master     172.32.32.100   kmaster-ft.example.com    Ubuntu 18.04   v1.19.3              19.03.6
Worker 1   172.32.32.101   kworker-ft1.example.com   Ubuntu 18.04   v1.19.3              19.03.6
Worker 2   172.32.32.102   kworker-ft2.example.com   Ubuntu 18.04   v1.19.3              19.03.6

Before we start the practical part, let us briefly understand what a Kubernetes Deployment is.

What is a Kubernetes Deployment?

As per the official Kubernetes documentation, a Deployment provides declarative updates for Pods and ReplicaSets.

You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.

The ultimate goal of a Kubernetes Deployment is to provide declarative updates to both Pods and ReplicaSets. First, you declare the desired state in a YAML manifest file; then the Deployment controller reconciles the current state until it matches the desired state defined in the manifest.

Why not use a Replication Controller instead of a Deployment?

It is recommended to use a Deployment instead of a Replication Controller (rc) to perform a rolling update. The two are similar in many ways: both ensure that a homogeneous set of Pods is always up and available, and both let you roll out new images. However, a Deployment provides more functionality, such as rollback support.

Rolling Updates

Users expect applications to be available all the time and developers are expected to deploy new versions of them several times a day.

In Kubernetes this is done with rolling updates. A rolling update allows a Deployment to be updated with zero downtime by incrementally replacing Pod instances with new ones. The new Pods are scheduled on Nodes with available resources.

Let us take a very simple example to understand this:

Our application image is available on the Docker Hub registry as: rakeshrhcss/hello-fosstechnix-v1

Let us create our Deployment manifest file.

[email protected]:~/deployments# cat deployment-rollout.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ft-rollout-demo
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ft-rollout-demo
  template:
    metadata:
      labels:
        app: ft-rollout-demo
    spec:
      containers:
      - image: rakeshrhcss/hello-fosstechnix-v1
        imagePullPolicy: Always
        name: ft-rollout-demo
        ports:
        - containerPort: 80

We are going to create 2 replicas for our application Deployment, each listening on port 80 inside the container.

Now we will create a Service to expose our application in order to make it available for the users outside our cluster environment.

Here is our Service manifest file.

[email protected]:~/deployments# cat deployment-rollout-svc.yml
apiVersion: v1
kind: Service
metadata:
  name: ft-rollout-demo-service
spec:
  type: NodePort
  ports:
  - port: 80         #service port
    targetPort: 80   #Pod Port
    nodePort: 30088  #Node Port from the range - 30000-32767

  selector:
    app: ft-rollout-demo

Create the Deployment from its manifest file. We will create the associated Service after verifying the pods.

[email protected]:~/deployments# kubectl apply -f deployment-rollout.yml
deployment.apps/ft-rollout-demo created

Verify the deployment creation.

[email protected]:~/deployments# kubectl get deployment -o wide
NAME              READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS        IMAGES                             SELECTOR
ft-rollout-demo   2/2     2            2           3m27s   ft-rollout-demo   rakeshrhcss/hello-fosstechnix-v1   app=ft-rollout-demo

Verify the pods running.

[email protected]:~/deployments# kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP               NODE          NOMINATED NODE   READINESS GATES
ft-rollout-demo-7c74c98d66-src5c   1/1     Running   0          2m51s   192.168.77.143   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   1/1     Running   0          2m51s   192.168.17.12    kworker-ft2   <none>           <none>

Describe the Deployment to get more details about it.

[email protected]:~/deployments# kubectl describe deployments.apps ft-rollout-demo
Name:                   ft-rollout-demo
Namespace:              default
CreationTimestamp:      Sat, 31 Oct 2020 16:13:31 +0000
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=ft-rollout-demo
Replicas:               2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=ft-rollout-demo
  Containers:
   ft-rollout-demo:
    Image:        rakeshrhcss/hello-fosstechnix-v1
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   ft-rollout-demo-7c74c98d66 (2/2 replicas created)
Events:
  Type    Reason             Age                    From                   Message
  ----    ------             ----                   ----                   -------
  Normal  ScalingReplicaSet  3m24s (x2 over 3m45s)  deployment-controller  Scaled up replica set ft-rollout-demo-7c74c98d66 to 2

Verify that the application is running on both pods that are part of our Deployment.

[email protected]:~/deployments# curl 192.168.77.143:80
<h1>Welcome to fosstechnix<h1>

<h2>You are running version 1 of our application. Enjoy!<h2>


[email protected]:~/deployments# curl 192.168.17.12:80
<h1>Welcome to fosstechnix<h1>

<h2>You are running version 1 of our application. Enjoy!<h2>

Create the service now to make it accessible from outside.

[email protected]:~/deployments# kubectl apply -f deployment-rollout-svc.yml
service/ft-rollout-demo-service created

Verify the service status.

[email protected]:~/deployments# kubectl get service ft-rollout-demo-service -o wide
NAME                      TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE   SELECTOR
ft-rollout-demo-service   NodePort   10.100.132.191   <none>        80:30088/TCP   25s   app=ft-rollout-demo

Describe the service and verify our two pods attached as service endpoints.

[email protected]:~/deployments# kubectl describe service ft-rollout-demo-service
Name:                     ft-rollout-demo-service
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=ft-rollout-demo
Type:                     NodePort
IP:                       10.100.132.191
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  30088/TCP
Endpoints:                192.168.17.12:80,192.168.77.143:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

Now try to access our Service from outside the Kubernetes cluster network on nodePort 30088.
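One way to check the Service from a machine outside the cluster is to poll it with curl. The sketch below is only an illustration: poll_service is a hypothetical helper name, and it assumes curl is installed and that a node IP from the environment table (for example 172.32.32.101) is reachable from where you run it.

```shell
#!/bin/sh
# poll_service URL COUNT [INTERVAL]   (hypothetical helper, not a kubectl feature)
# Sends COUNT GET requests, one every INTERVAL seconds (default 1), and prints
# a timestamp plus the HTTP status of each request.
# A status of "000" means curl could not get an HTTP response at all.
poll_service() {
  _url=$1
  _count=$2
  _interval=${3:-1}
  _i=0
  while [ "$_i" -lt "$_count" ]; do
    _code=$(curl -s -o /dev/null --max-time 2 -w '%{http_code}' "$_url")
    echo "$(date +%T) $_code"
    _i=$(( _i + 1 ))
    sleep "$_interval"
  done
}
```

For example, poll_service http://172.32.32.101:30088/ 120 sends one request per second for about two minutes. Keeping it running in a second terminal during the rollout in the next section gives a simple view of whether any request fails while pods are being replaced.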

So our application is working fine.

Roll Out a new update to our application

Here is our updated manifest file.

[email protected]:~/deployments# cat deployment-rollout.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ft-rollout-demo
  namespace: default
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 1
  selector:
    matchLabels:
      app: ft-rollout-demo
  template:
    metadata:
      labels:
        app: ft-rollout-demo
    spec:
      containers:
      - image: rakeshrhcss/hello-fosstechnix-v2
        imagePullPolicy: Always
        name: ft-rollout-demo
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1

Let's first understand this updated file.

If you simply change the image in your original deployment-rollout.yml to the new version that carries your application update and apply it, you may notice a brief downtime in your application, because the old pods start terminating while the new ones are still being created.

Why does this happen, and what is the solution?

Without a readiness probe, Kubernetes has no way of knowing when your new pod is actually ready to accept requests. It treats the pod as ready as soon as its container has started, so the old pod can be terminated before the services and processes inside the new pod are able to serve traffic. Requests routed to the new pod during that window fail, and that is the downtime you might see.

The solution is a configuration option in the Pod template called a readiness probe. A readiness probe makes sure that newly created pods are ready to take requests before they receive traffic and before the old pods are terminated.

To enable this, your application needs to expose a path that returns a success status (such as 200) to an HTTP GET request. (Note: probes can also use other mechanisms, such as a TCP socket check or a command run inside the container, but for this post I am sticking with HTTP GET.)

Probes have a number of fields that you can use to more precisely control the behavior of liveness and readiness checks:

  • initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
  • periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
  • timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
  • successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
  • failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. In case of readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
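As a rough illustration of the check an httpGet probe performs, the sketch below mimics it from a shell. The helper names probe_success and check_endpoint are hypothetical; the kubelet runs the real probe, and this is only meant for manually testing an endpoint with curl. Kubernetes counts any HTTP status from 200 up to (but not including) 400 as a success.

```shell
#!/bin/sh
# Sketch only: mimics the success rule of an httpGet probe for manual testing.
# probe_success and check_endpoint are hypothetical helper names.

# probe_success CODE -> exit status 0 when the HTTP status counts as success
# (any status >= 200 and < 400).
probe_success() {
  [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}

# check_endpoint URL -> prints "ready" or "not ready" for a single GET request.
# --max-time 1 mirrors the default timeoutSeconds of 1 second.
check_endpoint() {
  _code=$(curl -s -o /dev/null --max-time 1 -w '%{http_code}' "$1")
  if probe_success "$_code"; then
    echo "ready"
  else
    echo "not ready"
  fi
}
```

From a cluster node you could run check_endpoint http://192.168.77.143:80/ against one of the pod IPs shown earlier. The URL path and port should match whatever the readinessProbe in the manifest points at.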

Another thing we should add is the RollingUpdate strategy, which can be configured as follows.

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 25%
    maxSurge: 1

The above specifies the strategy used to replace old Pods with new ones. The type can be “Recreate” or “RollingUpdate”; “RollingUpdate” is the default. It should be configured together with a readiness probe: without one, Kubernetes cannot tell when a new pod is ready, so you might still see downtime.

maxUnavailable is an optional field that specifies the maximum number of Pods that can be unavailable during the update process. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%).

When a percentage is given, the absolute number is calculated by rounding down. The value cannot be 0 if maxSurge is 0. The default value is 25%.

maxSurge is an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The value cannot be 0 if maxUnavailable is 0. When a percentage is given, the absolute number is calculated by rounding up. The default value is 25%.
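To make the rounding rules concrete, here is a small sketch that works out the pod-count bounds for the values used in this tutorial (2 replicas, maxUnavailable 25%, maxSurge 1). The resolve function is a hypothetical helper written for this illustration; it is not how the controller is implemented, only the same arithmetic.

```shell
#!/bin/sh
# Sketch: derive the pod-count bounds a rolling update has to stay within.
# resolve is a hypothetical helper, not part of kubectl.

# resolve VALUE REPLICAS MODE -> absolute pod count
#   VALUE is an integer ("1") or a percentage ("25%").
#   MODE is "down" (maxUnavailable rounds down) or "up" (maxSurge rounds up).
resolve() {
  _value=$1
  _n=$2
  _mode=$3
  case "$_value" in
    *%)
      _pct=${_value%\%}
      if [ "$_mode" = "up" ]; then
        echo $(( (_n * _pct + 99) / 100 ))
      else
        echo $(( _n * _pct / 100 ))
      fi
      ;;
    *)
      echo "$_value"
      ;;
  esac
}

replicas=2
max_unavailable=$(resolve "25%" "$replicas" down)   # 25% of 2 is 0.5, rounds down to 0
max_surge=$(resolve "1" "$replicas" up)             # absolute value, used as-is

echo "minimum available pods: $(( replicas - max_unavailable ))"
echo "maximum total pods:     $(( replicas + max_surge ))"
```

With these values the script reports a minimum of 2 available pods and a maximum of 3 total pods. In other words, no old pod may be taken down until an extra new pod has become ready, which is consistent with the watch output shown in this tutorial.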

Now that we understand the updated Deployment file, let's apply it to roll out the new version of our application.

[email protected]:~/deployments# kubectl apply -f deployment-rollout.yml
deployment.apps/ft-rollout-demo configured

Observe the command output below closely, as this is the most important part to understand: an existing pod moves to the Terminating state only after a new pod has become ready.

[email protected]:~/deployments# kubectl get pods -o wide -w
NAME                               READY   STATUS              RESTARTS   AGE   IP               NODE          NOMINATED NODE   READINESS GATES
ft-rollout-demo-7c74c98d66-src5c   1/1     Running             0          42m   192.168.77.143   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   1/1     Running             0          42m   192.168.17.12    kworker-ft2   <none>           <none>
ft-rollout-demo-984ff575d-t6nhj    0/1     ContainerCreating   0          4s    <none>           kworker-ft2   <none>           <none>
ft-rollout-demo-984ff575d-t6nhj    0/1     Running             0          8s    192.168.17.13    kworker-ft2   <none>           <none>
ft-rollout-demo-984ff575d-t6nhj    1/1     Running             0          14s   192.168.17.13    kworker-ft2   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   1/1     Terminating         0          42m   192.168.17.12    kworker-ft2   <none>           <none>
ft-rollout-demo-984ff575d-b6vf4    0/1     Pending             0          0s    <none>           <none>        <none>           <none>
ft-rollout-demo-984ff575d-b6vf4    0/1     Pending             0          0s    <none>           kworker-ft1   <none>           <none>
ft-rollout-demo-984ff575d-b6vf4    0/1     ContainerCreating   0          0s    <none>           kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   1/1     Terminating         0          42m   192.168.17.12    kworker-ft2   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   0/1     Terminating         0          42m   192.168.17.12    kworker-ft2   <none>           <none>
ft-rollout-demo-984ff575d-b6vf4    0/1     ContainerCreating   0          2s    <none>           kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   0/1     Terminating         0          42m   192.168.17.12    kworker-ft2   <none>           <none>
ft-rollout-demo-7c74c98d66-x46vn   0/1     Terminating         0          42m   192.168.17.12    kworker-ft2   <none>           <none>
ft-rollout-demo-984ff575d-b6vf4    0/1     Running             0          6s    192.168.77.145   kworker-ft1   <none>           <none>
ft-rollout-demo-984ff575d-b6vf4    1/1     Running             0          14s   192.168.77.145   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-src5c   1/1     Terminating         0          43m   192.168.77.143   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-src5c   1/1     Terminating         0          43m   192.168.77.143   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-src5c   0/1     Terminating         0          43m   192.168.77.143   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-src5c   0/1     Terminating         0          43m   192.168.77.143   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-src5c   0/1     Terminating         0          43m   192.168.77.143   kworker-ft1   <none>           <none>
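Instead of watching the pod list by eye, you can also let kubectl wait for the rollout to finish. The sketch below wraps kubectl rollout status in a small function; wait_for_rollout is a hypothetical helper name and the 120s default timeout is an arbitrary assumption, not a recommended value.

```shell
#!/bin/sh
# wait_for_rollout NAME [TIMEOUT]   (hypothetical helper)
# Blocks until the Deployment's rollout completes, or returns a non-zero
# exit status once TIMEOUT (default 120s, an arbitrary choice) has passed.
wait_for_rollout() {
  kubectl rollout status "deployment/$1" --timeout="${2:-120s}"
}
```

Calling wait_for_rollout ft-rollout-demo right after kubectl apply returns once all replicas run the new Pod template. Because the exit status reflects success or failure, the function can be used as a gate in a deployment script.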

We can see now that two new pods have been created:

[email protected]:~/deployments# kubectl get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP               NODE          NOMINATED NODE   READINESS GATES
ft-rollout-demo-984ff575d-b6vf4   1/1     Running   0          54s   192.168.77.145   kworker-ft1   <none>           <none>
ft-rollout-demo-984ff575d-t6nhj   1/1     Running   0          68s   192.168.17.13    kworker-ft2   <none>           <none>

Now verify that our application has been updated to the newer version:

[email protected]:~/deployments# curl 192.168.77.145:80
<h1>Welcome to fosstechnix<h1>
<h2>You are running version 2 of our application. Enjoy!<h2>

[email protected]:~/deployments# curl 192.168.17.13:80
<h1>Welcome to fosstechnix<h1>

<h2>You are running version 2 of our application. Enjoy!<h2>

Yes, it has. Our rolling update worked as expected.

Rollout history

By default, a Deployment keeps its 10 most recent old ReplicaSets so that you can roll back when you need to (you can change that by setting .spec.revisionHistoryLimit).

[email protected]:~/deployments# kubectl rollout history deployment ft-rollout-demo
deployment.apps/ft-rollout-demo
REVISION  CHANGE-CAUSE
1         <none>
2         <none>

Important Note: A Deployment’s revision is created when a Deployment’s rollout is triggered. This means that the new revision is created if and only if the Deployment’s Pod template is changed, for example if you update the labels or container images of the template. Other updates, such as scaling the Deployment, do not create a Deployment revision, so that you can facilitate simultaneous manual- or auto-scaling. This means that when you roll back to an earlier revision, only the Deployment’s Pod template part is rolled back.
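The CHANGE-CAUSE column above shows <none> because nothing recorded a reason for each revision. The sketch below shows one way to fill it in and to inspect the history more closely. The function names are hypothetical helpers; the kubectl subcommands and the kubernetes.io/change-cause annotation are standard, but treat the exact messages and label values as examples.

```shell
#!/bin/sh
# Hypothetical helpers around standard kubectl commands.

# set_change_cause NAME MESSAGE
# Annotates the Deployment so that its current revision shows MESSAGE in the
# CHANGE-CAUSE column of "kubectl rollout history".
set_change_cause() {
  kubectl annotate deployment "$1" kubernetes.io/change-cause="$2" --overwrite
}

# show_revision NAME REVISION
# Prints the Pod template that was recorded for one specific revision.
show_revision() {
  kubectl rollout history "deployment/$1" --revision="$2"
}

# list_replicasets LABEL
# Lists the ReplicaSets behind the revisions; old ones are kept scaled to 0.
list_replicasets() {
  kubectl get replicasets -l "$1"
}
```

For example, set_change_cause ft-rollout-demo "update image to v2" labels the current revision, show_revision ft-rollout-demo 2 shows which image revision 2 used, and list_replicasets app=ft-rollout-demo lists the ReplicaSets that make a rollback possible.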

Rolling back updates

What if something is wrong with the new update and our customers are affected? We have to roll back our changes. But how?

Let's roll back to the last stable version of our application:

[email protected]:~/deployments# kubectl rollout undo deployment ft-rollout-demo
deployment.apps/ft-rollout-demo rolled back

Verify the revision history now:

[email protected]:~/deployments# kubectl rollout history deployment ft-rollout-demo
deployment.apps/ft-rollout-demo
REVISION  CHANGE-CAUSE
2         <none>
3         <none>

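Revision 1 is no longer listed because the rollback reused its Pod template and recorded it as revision 3. List the newly created pods: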

[email protected]:~/deployments# kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP               NODE          NOMINATED NODE   READINESS GATES
ft-rollout-demo-7c74c98d66-ssps8   1/1     Running   0          62s   192.168.77.150   kworker-ft1   <none>           <none>
ft-rollout-demo-7c74c98d66-wmvn2   1/1     Running   0          67s   192.168.17.17    kworker-ft2   <none>           <none>

It's time to verify that the rollback has taken us to the previous version:

[email protected]:~/deployments# curl 192.168.77.150:80
<h1>Welcome to fosstechnix<h1>

<h2>You are running version 1 of our application. Enjoy!<h2>

[email protected]:~/deployments# curl 192.168.17.17:80
<h1>Welcome to fosstechnix<h1>

<h2>You are running version 1 of our application. Enjoy!<h2>

It’s working perfectly fine and we are back to our previous stable version. (Our customers will be happy now).

Finally, I will show you how to go to a specific revision in the rollout history.

To roll back to the second revision of the Deployment, run the following command:

[email protected]:~/deployments# kubectl rollout undo deployment ft-rollout-demo --to-revision=2
deployment.apps/ft-rollout-demo rolled back

In our example, this repeated the same process (terminating the older pods and creating new ones) and took us back to version 2 of the application.

[email protected]:~/deployments# kubectl get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP               NODE          NOMINATED NODE   READINESS GATES
ft-rollout-demo-984ff575d-bgq28   1/1     Running   0          29s   192.168.77.151   kworker-ft1   <none>           <none>
ft-rollout-demo-984ff575d-jjjpz   1/1     Running   0          18s   192.168.17.18    kworker-ft2   <none>           <none>

Verify the changes:

[email protected]:~/deployments# curl 192.168.17.18:80
<h1>Welcome to fosstechnix<h1>

<h2>You are running version 2 of our application. Enjoy!<h2>
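Besides curling a pod, you can ask the Deployment itself which image its Pod template currently points at. This is a minimal sketch; current_image is a hypothetical helper name, and it assumes the image of interest is the first container in the template, as it is in this tutorial's manifest.

```shell
#!/bin/sh
# current_image NAME   (hypothetical helper)
# Prints the image of the first container in the Deployment's Pod template.
current_image() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

Running current_image ft-rollout-demo after each rollout or rollback is a quick way to confirm which version the Deployment is set to run.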

This is all about Kubernetes Deployments Rollouts and Rollbacks.

Hope you like the tutorial. Please let me know your feedback in the response section.

Thanks! Happy Learning!

Reference:

Kubernetes official guide
