In this article we are going to cover Kubernetes Pod Troubleshooting Commands with Examples.
When working with Kubernetes, troubleshooting pod issues is a crucial skill. Pods may encounter various problems, such as scheduling failures, crashes, or network connectivity issues. This guide provides a detailed step-by-step approach to diagnosing and resolving common pod issues using various Kubernetes commands.
Prerequisites
- A running Kubernetes cluster
- kubectl installed and configured
- Basic knowledge of Kubernetes concepts
#1.Create a Sample Deployment, Service, and Kustomization
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  selector:
    app: example
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
- service.yaml
Apply the Configuration:
kubectl apply -k .

Explanation:
This setup creates a deployment with two replicas of an Nginx container, a service to expose it, and a Kustomization file to manage the resources.
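After applying the configuration, it can help to confirm that the rollout actually finished before moving on. The following is a minimal sketch, not part of the original setup: the function name verify_example and the 120-second timeout are assumptions chosen for illustration, while the resource names come from the manifests above.

```shell
# Hypothetical helper: wait for the sample Deployment to finish rolling out,
# then list its pods and the Service. Resource names match the manifests above.
verify_example() {
  # Blocks until all replicas are available or the (assumed) timeout expires.
  kubectl rollout status deployment/example-deployment --timeout=120s || return 1
  # Show the pods selected by the app=example label, with node and pod IP.
  kubectl get pods -l app=example -o wide
  # Confirm that the Service exists.
  kubectl get service example-service
}
```

Run verify_example from the same namespace the resources were applied to. It returns a non-zero status if the rollout does not complete in time.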
#2.Check Pod Status
Command:
kubectl get pods

Explanation:
This command provides an overview of pod statuses.
Common Issues & Solutions:
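- Pending: The pod has been accepted by the cluster but is not running yet, usually because the scheduler cannot find a node with enough resources or an image is still being pulled.
  - Check events:
    kubectl describe pod <pod-name>
  - Check node status:
    kubectl get nodes
- CrashLoopBackOff: A container in the pod keeps crashing, and Kubernetes restarts it with an increasing back-off delay.
  - Check logs:
    kubectl logs <pod-name>
  - Check the health probes (liveness and startup) in the deployment YAML.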
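When a namespace holds many pods, filtering the status column can surface the problem pods quickly. This is a rough sketch under stated assumptions: the function name list_unhealthy_pods is hypothetical, and it only looks at the STATUS column, so a pod that is Running but not Ready is not reported.

```shell
# Hypothetical helper: print "<pod> <status>" for every pod in the current
# namespace whose STATUS column is neither Running nor Completed.
list_unhealthy_pods() {
  # With --no-headers the columns are NAME READY STATUS RESTARTS AGE,
  # so the status is the third whitespace-separated field.
  kubectl get pods --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Each pod it prints is a candidate for kubectl describe pod and kubectl logs.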
#3.Describe Pod for Detailed Info
Command:
kubectl describe pod <pod-name>

Explanation:
Provides detailed information about the pod, including events and error messages.
Common Issues & Solutions:
- ImagePullBackOff: The kubelet cannot pull the container image.
  - Check image name and registry:
    kubectl get pod <pod-name> -o yaml
  - Manually pull the image to confirm that it exists and is accessible:
    docker pull <image>
- OOMKilled: A container exceeded its memory limit.
  - Increase the memory limit in the pod spec.
  - Optimize the application’s memory usage.
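The describe output is long, so it can be convenient to extract only the reason a container last terminated. The sketch below uses kubectl's JSONPath output for that; the function name last_termination_reasons is an assumption made for this example.

```shell
# Hypothetical helper: print why each container in a pod last terminated
# (for example OOMKilled or Error), one "<container>=<reason>" pair per line.
last_termination_reasons() {
  pod="$1"
  # lastState.terminated is only set after a container has been restarted,
  # so the reason is empty for containers that never terminated.
  kubectl get pod "$pod" -o \
    jsonpath='{range .status.containerStatuses[*]}{.name}={.lastState.terminated.reason}{"\n"}{end}'
}
```

A line such as nginx=OOMKilled points at the memory limit, while an empty reason means the container has not been restarted yet.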
#4.Check Pod Logs
Command:
kubectl logs <pod-name>

Explanation:
Fetches logs from the container running inside the pod.
Common Issues & Solutions:
- Application errors:
  - Check logs for stack traces and fix the code.
- Container restart loop:
  - Use
    kubectl logs <pod-name> --previous
    to check the logs of the previous container instance.
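For a container stuck in a restart loop, the useful lines are usually the last errors written before the crash. As a minimal sketch, the helper below filters the previous container's log for error-like keywords; the function name previous_errors, the 200-line tail, and the keyword list are all assumptions to adapt to your application's log format.

```shell
# Hypothetical helper: show error-like lines from the previous (crashed)
# container instance of a pod. The keyword list is an assumption.
previous_errors() {
  pod="$1"
  # --previous reads the log of the last terminated container instance;
  # --tail=200 limits the output to the final 200 lines.
  kubectl logs "$pod" --previous --tail=200 2>&1 |
    grep -i -E 'error|exception|fatal|panic'
}
```

The function exits with a non-zero status when no line matches, which makes it usable in scripts as a quick "did it log an error" check.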
#5.Execute a Command in a Running Pod
Command:
kubectl exec -it <pod-name> -- /bin/sh

Explanation:
Opens a shell inside the running container to manually inspect and debug issues.
Common Issues & Solutions:
- File or dependency missing:
  - Inspect the file system:
    ls -lah
  - Install missing packages if required.
- Application not starting:
  - Check environment variables:
    env
#6.Pod Eviction Due to Resource Pressure
Commands:
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl top nodes
kubectl top pods


Explanation:
- kubectl get events shows eviction messages.
- kubectl top nodes and kubectl top pods show resource usage (both require the Metrics Server to be installed).
Common Issues & Solutions:
- Issue: Node runs out of resources.
  Solution: Increase node capacity or distribute workloads.
- Issue: Resource requests too high.
  Solution: Optimize resource requests/limits in deployments.
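To spot nodes that are close to memory pressure, the output of kubectl top nodes can be filtered by its MEMORY% column. This is a sketch only: the function name nodes_over_memory and the default threshold of 80 percent are assumptions, and it requires the Metrics Server.

```shell
# Hypothetical helper: print "<node> <memory%>" for nodes whose memory usage
# is above a threshold (default 80, an assumed value).
nodes_over_memory() {
  threshold="${1:-80}"
  # With --no-headers the columns are
  # NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%, so memory usage is field 5.
  # Adding 0 makes awk read "91%" as the number 91.
  kubectl top nodes --no-headers |
    awk -v limit="$threshold" '$5 + 0 > limit + 0 { print $1, $5 }'
}
```

Nodes listed by nodes_over_memory 80 are the first places to look when pods are being evicted.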
#7.Restart a Failing Pod
Command:
kubectl delete pod <pod-name>

Explanation:
Deletes the pod. If the pod is managed by a controller such as a Deployment, Kubernetes creates a replacement; a standalone pod is not recreated.
Common Issues & Solutions:
- Pod stuck in error state:
  - Delete and let the controller recreate it.
- Network issues:
  - Restart pod to trigger a fresh connection.
#8.DNS Resolution Failure
Commands:
kubectl logs -n kube-system -l k8s-app=kube-dns
Explanation:
- Checks logs of CoreDNS to identify DNS issues.
Common Issues & Solutions:
- Issue: CoreDNS is down.
  Solution: Restart CoreDNS (kubectl rollout restart deployment coredns -n kube-system).
- Issue: DNS policy misconfigured.
  Solution: Ensure dnsPolicy: ClusterFirst is set.
- Issue: Network issues.
  Solution: Check if the pod has network connectivity.
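Besides reading the CoreDNS logs, you can test name resolution from inside an affected pod. The sketch below assumes the container image ships nslookup, which many minimal images do not; the function name check_pod_dns is hypothetical, and kubernetes.default is used as the default lookup target because that Service exists in every cluster.

```shell
# Hypothetical helper: resolve a name from inside a pod to test cluster DNS.
# Assumes the container image includes nslookup.
check_pod_dns() {
  pod="$1"
  name="${2:-kubernetes.default}"
  if kubectl exec "$pod" -- nslookup "$name" >/dev/null 2>&1; then
    echo "DNS OK: $name resolves from $pod"
  else
    # A failure can also mean that nslookup is missing from the image.
    echo "DNS FAILED: could not resolve $name from $pod"
    return 1
  fi
}
```

If the lookup fails for kubernetes.default but the pod can reach IP addresses directly, the problem is more likely in CoreDNS or the pod's dnsPolicy than in the network itself.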
#9.RBAC Authorization Error (Forbidden)
Commands:
kubectl auth can-i <verb> <resource> --as=<user>

Explanation:
- Checks if a user or service account has the necessary permissions.
Common Issues & Solutions:
- Issue: Incorrect role assignment.
  Solution: Assign the correct role using:
  kubectl create rolebinding <binding-name> --clusterrole=<role> --user=<user> --namespace=<namespace>
- Issue: ServiceAccount lacks required permissions.
  Solution: Confirm that the ServiceAccount exists with kubectl describe sa <service-account>, then test its permissions with kubectl auth can-i <verb> <resource> --as=system:serviceaccount:<namespace>:<service-account>.
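Checking one verb at a time is tedious when a workload needs several permissions. As a minimal sketch, the loop below asks kubectl auth can-i about a fixed set of verbs while impersonating a ServiceAccount; the function name sa_permissions and the verb list are assumptions, and your own user needs impersonation rights for --as to work.

```shell
# Hypothetical helper: report which verbs a ServiceAccount may perform on a
# resource in a namespace. The verb list is an assumption; extend as needed.
sa_permissions() {
  ns="$1"; sa="$2"; resource="$3"
  for verb in get list watch create update delete; do
    # can-i prints "yes" or "no" and exits non-zero on "no",
    # so the exit status is deliberately ignored here.
    answer=$(kubectl auth can-i "$verb" "$resource" \
      --as="system:serviceaccount:$ns:$sa" -n "$ns" 2>/dev/null) || true
    echo "$verb $resource: ${answer:-no}"
  done
}
```

For example, sa_permissions dev builder pods prints one line per verb for the builder ServiceAccount in the dev namespace (both names are placeholders).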
#10.Ingress Controller Misconfiguration
Commands:
kubectl get ingress
kubectl describe ingress <ingress-name>
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx



Explanation:
- kubectl get ingress confirms that the Ingress exists.
- kubectl describe ingress checks for misconfigurations.
- kubectl logs helps debug Ingress controller issues.
Common Issues & Solutions:
- Issue: Backend service misconfiguration.
  Solution: Ensure the service is reachable and exposed.
- Issue: TLS configuration errors.
  Solution: Validate TLS certificates and secrets.
#11.ETCD Cluster Issues
Commands:
kubectl get pods -n kube-system -l component=etcd
kubectl logs -n kube-system etcd-<node-name>


Explanation:
- kubectl get pods checks etcd availability.
- kubectl logs helps debug etcd errors.
Common Issues & Solutions:
- Issue: etcd out of disk space.
  Solution: Clear old snapshots and increase storage.
- Issue: Cluster inconsistency.
  Solution: Check leader election and restore from backup.
#12.CNI Plugin Issues
Commands:
kubectl get pods -n kube-flannel -l tier=node
kubectl logs -n kube-flannel -l tier=node

Explanation:
- kubectl get pods checks if the CNI plugin pods are running (the commands above assume Flannel; adjust the namespace and label for other plugins).
- kubectl logs provides error details.
Common Issues & Solutions:
- Issue: CNI plugin missing or crashed.
  Solution: Restart the CNI DaemonSet.
- Issue: Incorrect network policies.
  Solution: Validate CNI settings and rules.
#13.Invalid Labels or Selectors
Commands:
kubectl get pods --selector=<label>
kubectl describe deployment <deployment-name>

Explanation:
- kubectl get pods --selector=<label> lists matching pods.
- kubectl describe deployment checks selector configurations.
Common Issues & Solutions:
- Issue: Mismatched labels.
  Solution: Ensure pod labels match deployment selectors.
- Issue: Incorrect service selector.
  Solution: Update service YAML to match pod labels.
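A quick way to tell whether a Service selector matches any pods is to look at the Service's endpoints: a selector that matches nothing leaves the endpoint list empty. The following sketch relies on that; the function name service_has_endpoints is an assumption, and it reads the Endpoints object, which newer clusters also expose as EndpointSlices.

```shell
# Hypothetical helper: report whether a Service currently selects any ready pods.
service_has_endpoints() {
  svc="$1"
  # Collect the pod IPs behind the Service; empty output means no ready pod
  # matches the Service selector.
  ips=$(kubectl get endpoints "$svc" \
    -o jsonpath='{.subsets[*].addresses[*].ip}' 2>/dev/null) || true
  if [ -n "$ips" ]; then
    echo "$svc -> $ips"
  else
    echo "$svc has no ready endpoints; compare its selector with the pod labels"
    return 1
  fi
}
```

With the sample manifests from step 1, service_has_endpoints example-service should list the pod IPs once the pods labeled app=example are ready.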
#14.PersistentVolumeClaim (PVC) Stuck in Pending
Commands:
kubectl get pvc
kubectl describe pvc <pvc-name>
kubectl get sc



Explanation:
- kubectl get pvc shows claim status.
- kubectl describe pvc provides detailed binding errors.
- kubectl get sc checks available storage classes.
Common Issues & Solutions:
- Issue: No matching Persistent Volume.
  Solution: Ensure a PV exists with the correct storage class and capacity.
- Issue: StorageClass misconfiguration.
  Solution: Verify and update the StorageClass in the PVC spec.
#15.API Server Unreachable
Commands:
kubectl cluster-info
kubectl get pods -A
kubectl config view



Explanation:
- kubectl cluster-info checks API server availability.
- kubectl get pods -A ensures the control plane is running.
- kubectl config view verifies kubeconfig settings.
Common Issues & Solutions:
- Issue: API server is down.
  Solution: Restart API server components (kube-apiserver, etcd).
- Issue: Wrong kubeconfig configuration.
  Solution: Verify the API server address in ~/.kube/config.
#16.Network Policy Restrictions
Commands:
kubectl get networkpolicy -A
kubectl describe networkpolicy <policy-name>
Explanation:
- kubectl get networkpolicy lists all network policies.
- kubectl describe networkpolicy details the applied rules.
Common Issues & Solutions:
- Issue: Incorrect policy blocking pod communication.
  Solution: Update the podSelector and egress/ingress rules in the policy.
- Issue: No network policy allows the required traffic (for example, under a default-deny policy).
  Solution: Ensure a policy is created for the required traffic flow.
#17.Kube-proxy Issues
Commands:
kubectl get pods -n kube-system -l k8s-app=kube-proxy
kubectl logs -n kube-system -l k8s-app=kube-proxy


Explanation:
- kubectl get pods checks kube-proxy availability.
- kubectl logs inspects kube-proxy logs for errors.
Common Issues & Solutions:
- Issue: Kube-proxy pod is crashing.
  Solution: Restart the kube-proxy pods:
  kubectl rollout restart daemonset kube-proxy -n kube-system
- Issue: iptables rules not applied.
  Solution: Manually reset iptables and restart kube-proxy.
#18.Service Not Accessible (Pending or No External IP)
Commands:
kubectl get svc
kubectl describe svc <service-name>


Explanation:
- kubectl get svc checks if the service has an external IP.
- kubectl describe svc helps debug why the external IP is stuck in “Pending.”
Common Issues & Solutions:
- Issue: Wrong service type.
  Solution: Ensure the service type is LoadBalancer if external access is needed.
- Issue: Cloud provider-specific issues.
  Solution: For Azure, check Application Gateway/NGINX ingress settings.
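To find every Service that is still waiting for an external address, the EXTERNAL-IP column can be filtered across all namespaces. This is a small sketch; the function name pending_load_balancers is an assumption.

```shell
# Hypothetical helper: list LoadBalancer Services that are still waiting
# for an external address, printed as "<namespace>/<name>".
pending_load_balancers() {
  # With -A and --no-headers the columns are
  # NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE.
  kubectl get svc -A --no-headers |
    awk '$3 == "LoadBalancer" && $5 == "<pending>" { print $1 "/" $2 }'
}
```

Any Service it prints is worth inspecting with kubectl describe svc, whose events usually show why the cloud provider has not assigned an address.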
#19.Get Detailed Pod YAML Configuration
Commands:
kubectl get pod <pod-name> -o yaml


Explanation:
Displays the complete YAML configuration of the pod.
#20.Check Node Status
Commands:
kubectl get nodes -o wide
kubectl describe node <node-name>



Explanation:
Lists node details, including status, roles, and conditions.
#21.Check Service Endpoint
Command:
kubectl get endpoints

Explanation:
Lists endpoints associated with services.
#22.Debug Network Connectivity
Command:
kubectl exec -it <pod-name> -- curl <service-name>:<port>

Explanation:
Tests connectivity between pods and services.
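A plain curl can hang for a long time when traffic is silently dropped, so a timeout and a compact result are helpful. The sketch below prints only the HTTP status code; it assumes curl is installed in the container image, and the function name http_status_from_pod and the five-second timeout are illustrative choices.

```shell
# Hypothetical helper: print the HTTP status code a pod receives from a Service.
# Assumes curl is installed in the container image.
http_status_from_pod() {
  pod="$1"; svc="$2"; port="${3:-80}"
  # -s silences progress output, -o /dev/null discards the body,
  # -m 5 caps the whole request at five seconds, and -w prints the status code.
  kubectl exec "$pod" -- curl -s -o /dev/null -m 5 \
    -w '%{http_code}' "http://$svc:$port"
}
```

curl reports 000 when it could not get an HTTP response at all, which points to DNS, a network policy, or a Service without endpoints instead of an application error.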
#23.View ConfigMaps
Command:
kubectl get configmap

#24.View Secrets
Command:
kubectl get secret

#25.Get Running Deployments
Command:
kubectl get deployments

Conclusion:
Kubernetes pod troubleshooting requires a combination of commands and analysis. By systematically checking pod status, logs, events, and resource usage, you can efficiently diagnose and fix issues. Mastering these commands will help you maintain a stable and reliable Kubernetes environment.