Kubernetes Pod Troubleshooting Commands with Examples

This article covers Kubernetes pod troubleshooting commands with examples.

When working with Kubernetes, troubleshooting pod issues is a crucial skill. Pods may encounter various problems, such as scheduling failures, crashes, or network connectivity issues. This guide provides a detailed step-by-step approach to diagnosing and resolving common pod issues using various Kubernetes commands.

Prerequisites

A running Kubernetes cluster
kubectl installed and configured
Basic knowledge of Kubernetes concepts

#1.Create a Sample Deployment, Service, and Kustomization

deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  selector:
    app: example
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

kustomization.yaml:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml

Apply the Configuration:

kubectl apply -k .

Explanation:

This setup creates a deployment with two replicas of an Nginx container, a service to expose it, and a Kustomization file to manage the resources.

#2.Check Pod Status

Command:

kubectl get pods

Explanation:

This command provides an overview of pod statuses.

Common Issues & Solutions:

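Pending: The pod has not been scheduled onto a node yet, usually because no node has enough free CPU or memory, or because of unsatisfied node selectors, taints, or unbound volumes.
- Check events: kubectl describe pod <pod-name>
- Check node status: kubectl get nodes
CrashLoopBackOff: A container in the pod keeps crashing, and Kubernetes restarts it with an increasing back-off delay.
- Check logs: kubectl logs <pod-name>
- Check the liveness and startup probes in the deployment YAML.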
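
To pick out the pods that need attention from a long listing, the output of kubectl get pods can be filtered. The following is a minimal sketch, assuming the default column order (NAME, READY, STATUS, RESTARTS, AGE); unhealthy_pods is a hypothetical helper name, not part of kubectl.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints NAME and STATUS for every pod that is not Running or Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (needs access to a cluster):
#   kubectl get pods --no-headers | unhealthy_pods
```

A pod can report Running while its containers are not ready, so the READY column is still worth a look for pods this filter skips.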

#3.Describe Pod for Detailed Info

Command:

kubectl describe pod <pod-name>

Explanation:

Provides detailed information about the pod, including events and error messages.

Common Issues & Solutions:

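ImagePullBackOff: The node cannot pull the container image.
- Check image name and registry: kubectl get pod <pod-name> -o yaml
- Manually pull the image to rule out a bad tag or missing credentials: docker pull <image>
- For private registries, check that imagePullSecrets is set in the pod spec.
OOMKilled: A container exceeded its memory limit and was killed.
- Increase the memory limit in the pod spec.
- Optimize the application’s memory usage.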
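
To confirm an OOMKilled termination without scanning the full describe output, the reason can be read directly from the pod status. This is a sketch only; last_termination_reason is a hypothetical helper name, and it assumes the pod lives in the current namespace.

```shell
# Hypothetical helper: prints the reason each container in the pod was last
# terminated (for example OOMKilled). Prints nothing if no container has
# been terminated yet.
last_termination_reason() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Usage (needs access to a cluster):
#   last_termination_reason <pod-name>
```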

#4.Check Pod Logs

Command:

kubectl logs <pod-name>

Explanation:

Fetches logs from the container running inside the pod.

Common Issues & Solutions:

Application errors:
- Check logs for stack traces and fix the code.
Container restart loop:
- Use kubectl logs <pod-name> --previous to check previous logs.
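
For a crash loop, the last lines before the crash are usually the interesting ones. A small sketch, assuming a single-container pod in the current namespace; crash_logs is a hypothetical helper name and the default of 50 lines is arbitrary.

```shell
# Hypothetical helper: shows the tail of the previous (crashed) container's
# logs with timestamps.
# $1 = pod name, $2 = number of lines to show (defaults to 50)
crash_logs() {
  kubectl logs "$1" --previous --timestamps --tail="${2:-50}"
}

# Usage (needs access to a cluster):
#   crash_logs <pod-name> 100
```

For pods with more than one container, add -c <container-name> to select the container.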

#5.Execute a Command in a Running Pod

Command:

kubectl exec -it <pod-name> -- /bin/sh

Explanation:

Opens a shell inside the running container to manually inspect and debug issues.

Common Issues & Solutions:

File or dependency missing:
- Inspect the file system: ls -lah
- Install missing packages if required.
Application not starting:
- Check environment variables: env

#6.Pod Eviction Due to Resource Pressure

Commands:

kubectl get events --sort-by=.metadata.creationTimestamp
kubectl top nodes
kubectl top pods

Explanation:

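kubectl get events lists cluster events in chronological order, including eviction messages.
kubectl top nodes and kubectl top pods show current CPU and memory usage. Both require the Metrics Server to be installed in the cluster.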

Common Issues & Solutions:

Issue: Node runs out of resources.
Solution: Increase node capacity or distribute workloads.
Issue: Resource requests too high.
Solution: Optimize resource requests/limits in deployments.
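
On a busy cluster the event list is long, so it can help to narrow it to evictions and to sort pods by memory usage. The helpers below are a sketch with hypothetical names; extra arguments such as -n <namespace> or -A are passed through to kubectl.

```shell
# Hypothetical helper: lists only events whose reason is Evicted, oldest first.
evicted_events() {
  kubectl get events --field-selector reason=Evicted \
    --sort-by=.metadata.creationTimestamp "$@"
}

# Hypothetical helper: lists pods ordered by memory usage
# (requires the Metrics Server).
top_memory_pods() {
  kubectl top pods --sort-by=memory "$@"
}

# Usage (needs access to a cluster):
#   evicted_events -A
#   top_memory_pods -n <namespace>
```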

#7.Restart a Failing Pod

Command:

kubectl delete pod <pod-name>

Explanation:

Deletes the pod. If the pod is managed by a controller such as a Deployment, Kubernetes creates a replacement; a standalone pod is not recreated.

Common Issues & Solutions:

Pod stuck in error state:
- Delete and let the controller recreate it.
Network issues:
- Restart pod to trigger a fresh connection.
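
When every pod of a Deployment needs a restart, deleting pods one by one is tedious. A gentler alternative is a rolling restart, sketched below; restart_deployment is a hypothetical helper name and the 120s default timeout is an arbitrary choice.

```shell
# Hypothetical helper: triggers a rolling restart of a Deployment and waits
# until the rollout finishes or the timeout expires.
# $1 = deployment name, $2 = timeout (defaults to 120s)
restart_deployment() {
  kubectl rollout restart "deployment/$1" &&
    kubectl rollout status "deployment/$1" --timeout="${2:-120s}"
}

# Usage (needs access to a cluster):
#   restart_deployment example-deployment
```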

#8.DNS Resolution Failure

Commands:

kubectl logs -n kube-system -l k8s-app=kube-dns

Explanation:

Checks logs of CoreDNS to identify DNS issues.

Common Issues & Solutions:

Issue: CoreDNS is down.
Solution: Restart CoreDNS (kubectl rollout restart deployment coredns -n kube-system).
Issue: DNS Policy misconfigured.
Solution: Ensure dnsPolicy is ClusterFirst (the default), or ClusterFirstWithHostNet for pods that use hostNetwork: true.
Issue: Network issues.
Solution: Check if the pod has network connectivity.
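
Before digging into CoreDNS logs, it is worth checking whether name resolution works from inside the cluster at all. The sketch below starts a throwaway pod for that; the pod name dns-check and the helper name are hypothetical, and it assumes the busybox:1.28 image can be pulled by your nodes.

```shell
# Hypothetical helper: runs nslookup in a temporary pod that is deleted
# after the command exits.
# $1 = name to resolve (defaults to kubernetes.default)
dns_check() {
  kubectl run dns-check --rm -i --restart=Never \
    --image=busybox:1.28 -- nslookup "${1:-kubernetes.default}"
}

# Usage (needs access to a cluster):
#   dns_check example-service
```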

#9.RBAC Authorization Error (Forbidden)

Commands:

kubectl auth can-i <verb> <resource> --as=<user>

Explanation:

Checks if a user or service account has the necessary permissions.

Common Issues & Solutions:

Issue: Incorrect role assignment.
Solution: Assign the correct role using:

kubectl create rolebinding <binding-name> --clusterrole=<role> --user=<user> --namespace=<namespace>

Issue: ServiceAccount lacks required permissions.
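Solution: kubectl describe sa <service-account> confirms the account exists but does not show its permissions. List them with kubectl auth can-i --list --as=system:serviceaccount:<namespace>:<service-account>, and check its bindings with kubectl get rolebindings,clusterrolebindings -A -o wide.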
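
A ServiceAccount's permissions can be tested by impersonating it, using the system:serviceaccount:<namespace>:<name> user name. A minimal sketch follows; sa_can_i is a hypothetical helper name, and your own user needs permission to impersonate.

```shell
# Hypothetical helper: asks the API server whether a ServiceAccount may
# perform an action in its own namespace. kubectl prints "yes" or "no".
# $1 = namespace, $2 = service account, $3 = verb, $4 = resource
sa_can_i() {
  kubectl auth can-i "$3" "$4" \
    --as="system:serviceaccount:$1:$2" --namespace="$1"
}

# Usage (needs access to a cluster):
#   sa_can_i default my-sa list pods
```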

#10.Ingress Controller Misconfiguration

Commands:

kubectl get ingress
kubectl describe ingress <ingress-name>
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx

Explanation:

kubectl get ingress lists Ingress resources with their hosts and addresses.
kubectl describe ingress checks for misconfigurations.
kubectl logs helps debug Ingress controller issues.

Common Issues & Solutions:

Issue: Backend service misconfiguration.
Solution: Ensure the service is reachable and exposed.
Issue: TLS configuration errors.
Solution: Validate TLS certificates and secrets.

#11.ETCD Cluster Issues

Commands:

kubectl get pods -n kube-system -l component=etcd
kubectl logs -n kube-system etcd-<node-name>

Explanation:

kubectl get pods checks etcd availability.
kubectl logs helps debug etcd errors.

Common Issues & Solutions:

Issue: etcd out of disk space.
Solution: Compact and defragment the etcd database, remove old snapshots, and increase storage.
Issue: Cluster inconsistency.
Solution: Check leader election and restore from backup.

#12.CNI Plugin Issues

Commands:

kubectl get pods -n kube-flannel -l tier=node
kubectl logs -n kube-flannel -l tier=node

Explanation:

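kubectl get pods checks whether the CNI plugin pods are running. The commands above assume Flannel; other plugins such as Calico or Cilium use different namespaces and labels.
kubectl logs provides error details.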

Common Issues & Solutions:

Issue: CNI plugin missing or crashed.
Solution: Restart CNI DaemonSet.
Issue: Incorrect network policies.
Solution: Validate CNI settings and rules.

#13.Invalid Labels or Selectors

Commands:

kubectl get pods --selector=<label>
kubectl describe deployment <deployment-name>

Explanation:

kubectl get pods --selector=<label> lists matching pods.
kubectl describe deployment checks selector configurations.

Common Issues & Solutions:

Issue: Mismatched labels.
Solution: Ensure pod labels match deployment selectors.
Issue: Incorrect service selector.
Solution: Update service YAML to match pod labels.
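
A quick way to spot a selector mismatch is to list the pods that a Service's selector actually matches. The sketch below reads the selector from the Service and reuses it as a label query; pods_for_service is a hypothetical helper name, and it assumes the Service and pods are in the current namespace.

```shell
# Hypothetical helper: lists the pods matched by a Service's selector,
# together with their labels.
# $1 = service name
pods_for_service() {
  # Render the selector map as "key=value,key=value," using a Go template.
  selector=$(kubectl get svc "$1" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}')
  # Strip the trailing comma and use the result as a label query.
  kubectl get pods -l "${selector%,}" --show-labels
}

# Usage (needs access to a cluster):
#   pods_for_service example-service
```

If the command lists no pods, the selector and the pod labels do not match. A Service without a selector produces an empty query, which lists every pod in the namespace.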

#14.PersistentVolumeClaim (PVC) Stuck in Pending

Commands:

kubectl get pvc
kubectl describe pvc <pvc-name>
kubectl get sc

Explanation:

kubectl get pvc shows claim status.
kubectl describe pvc provides detailed binding errors.
kubectl get sc checks available storage classes.

Common Issues & Solutions:

Issue: No matching Persistent Volume.
Solution: Ensure a PV exists with the correct storage class and capacity.
Issue: StorageClass misconfiguration.
Solution: Verify and update the StorageClass in PVC spec.

#15.API Server Unreachable

Commands:

kubectl cluster-info
kubectl get pods -A
kubectl config view

Explanation:

kubectl cluster-info checks API server availability.
kubectl get pods -A ensures the control plane is running.
kubectl config view verifies kubeconfig settings.

Common Issues & Solutions:

Issue: API server is down.
Solution: Restart API server components (kube-apiserver, etcd).
Issue: Wrong kubeconfig configuration.
Solution: Verify the API server address in ~/.kube/config.

#16.Network Policy Restrictions

Commands:

kubectl get networkpolicy -A
kubectl describe networkpolicy <policy-name>

Explanation:

kubectl get networkpolicy lists all network policies.
kubectl describe networkpolicy details the applied rules.

Common Issues & Solutions:

Issue: Incorrect policy blocking pod communication.
Solution: Update the podSelector and egress/ingress rules in the policy.
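Issue: A default-deny policy is in place and no policy allows the required traffic.
Solution: Create a policy that allows the required traffic flow. Without any network policy selecting a pod, all traffic to and from it is allowed.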

#17.Kube-proxy Issues

Commands:

kubectl get pods -n kube-system -l k8s-app=kube-proxy
kubectl logs -n kube-system -l k8s-app=kube-proxy

Explanation:

kubectl get pods checks kube-proxy availability.
kubectl logs inspects kube-proxy logs for errors.

Common Issues & Solutions:

Issue: Kube-proxy pod is crashing.
Solution: Restart the kube-proxy pods:

kubectl rollout restart daemonset kube-proxy -n kube-system

Issue: iptables rules not applied.
Solution: Manually reset iptables and restart kube-proxy.

#18.Service Not Accessible (Pending or No External IP)

Commands:

kubectl get svc
kubectl describe svc <service-name>

Explanation:

kubectl get svc checks if the service has an external IP.
kubectl describe svc helps debug why it’s stuck in “Pending.”

Common Issues & Solutions:

Issue: Wrong service type.
Solution: Ensure service type is LoadBalancer if external access is needed.
Issue: Cloud provider-specific issues.
Solution: For Azure, check Application Gateway/NGINX ingress settings.

#19.Get Detailed Pod YAML Configuration

Commands:

kubectl get pod <pod-name> -o yaml

Explanation:

Displays the complete YAML configuration of the pod.

#20.Check Node Status

Commands:

kubectl get nodes -o wide
kubectl describe node <node-name>

Explanation:

Lists node details, including status, roles, and conditions.
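
When only the Ready condition matters, a JSONPath query gives a compact view of every node. A sketch, with node_readiness as a hypothetical helper name:

```shell
# Hypothetical helper: prints one line per node with the node name and the
# status of its Ready condition (True, False, or Unknown), tab-separated.
node_readiness() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
}

# Usage (needs access to a cluster):
#   node_readiness
```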

#21.Check Service Endpoint

Command:

kubectl get endpoints

Explanation:

Lists endpoints associated with services. An empty ENDPOINTS column means the service selector matches no ready pods.

#22.Debug Network Connectivity

Command:

kubectl exec -it <pod-name> -- curl <service-name>:<port>

Explanation:

Tests connectivity between pods and services.
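
A plain curl can hang for a long time when traffic is dropped, so a timeout makes the check more practical. The sketch below prints only the HTTP status code; check_service is a hypothetical helper name, the 5 second timeout is arbitrary, and it assumes curl is available in the container image.

```shell
# Hypothetical helper: requests a Service over HTTP from inside a pod and
# prints the HTTP status code, giving up after 5 seconds.
# $1 = pod to run curl from, $2 = service name, $3 = port
check_service() {
  kubectl exec "$1" -- curl -sS -m 5 -o /dev/null \
    -w '%{http_code}\n' "http://$2:$3"
}

# Usage (needs access to a cluster):
#   check_service <pod-name> example-service 80
```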

#23.View ConfigMaps

Command:

kubectl get configmap

#24.View Secrets

Command:

kubectl get secret
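
The listing shows secret names and types but not their values, which are stored base64-encoded. To read a single value while debugging, something like the following sketch can be used; secret_value is a hypothetical helper name, and keys that contain dots would need escaping in the JSONPath expression.

```shell
# Hypothetical helper: prints the decoded value of one key in a Secret.
# $1 = secret name, $2 = key inside the secret's data map
secret_value() {
  kubectl get secret "$1" -o jsonpath="{.data.$2}" | base64 --decode
}

# Usage (needs access to a cluster):
#   secret_value <secret-name> <key>
```

Decoded values end up in your terminal scrollback, so use this with care on shared machines.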

#25.Get Running Deployments

Command:

kubectl get deployments

Conclusion:

Kubernetes pod troubleshooting requires a combination of commands and analysis. By systematically checking pod status, logs, events, and resource usage, you can efficiently diagnose and fix issues. Mastering these commands will help you maintain a stable and reliable Kubernetes environment.

Harish Reddy
