The Error
When working with Kubernetes, you may encounter this error:
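ingress 502 bad gateway
A 502 means the ingress controller accepted the request but could not get a valid response from the backend Service or pod behind it. This error is common in production clusters and can be frustrating to debug. Let’s break down what causes it and how to fix it.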
Common Causes
1. Configuration Issues
A frequent cause is a misconfiguration in your Kubernetes manifests or cluster settings, such as an Ingress that points at the wrong Service name or port. Check your YAML files carefully for:
- Typos in resource names or API versions
- Missing required fields
- Incorrect indentation
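For a 502 specifically, one mismatch worth ruling out is an Ingress backend whose Service name or port does not line up with what the Service actually exposes. The two helper functions below are a minimal sketch for comparing them side by side. The function names are made up for this article, and the jsonpath expressions assume the networking.k8s.io/v1 Ingress schema with numeric backend ports.

```shell
# Print each backend Service name and port referenced by an Ingress.
# Assumes the networking.k8s.io/v1 schema with numeric backend ports.
ingress_backends() {
  # $1 = Ingress name, $2 = namespace
  kubectl get ingress "$1" -n "$2" \
    -o jsonpath='{range .spec.rules[*].http.paths[*]}{.backend.service.name}:{.backend.service.port.number}{"\n"}{end}'
}

# Print the "port -> targetPort" pairs exposed by a Service.
service_ports() {
  # $1 = Service name, $2 = namespace
  kubectl get service "$1" -n "$2" \
    -o jsonpath='{range .spec.ports[*]}{.port} -> {.targetPort}{"\n"}{end}'
}
```

Call them as `ingress_backends <ingress-name> <namespace>` and `service_ports <service-name> <namespace>`. Every port in the first output should appear on the left side of the second, and each targetPort should be a port your container actually listens on.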
2. Resource Constraints
If the cluster is short on CPU or memory, backend pods may be throttled, OOM-killed, or left unscheduled, which leaves the ingress with nothing healthy to route to:
# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
# Check pod resource requests
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}: cpu={.spec.containers[0].resources.requests.cpu}, mem={.spec.containers[0].resources.requests.memory}{"\n"}{end}'
3. Network or Connectivity Problems
Network issues between components can trigger this error:
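# Check pod-to-Service connectivity (ClusterIPs often do not answer ping, so test the HTTP port; assumes wget exists in the image)
kubectl exec -it <pod-name> -- wget -qO- -T 5 http://<service-name>:<port>/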
# Check DNS resolution
kubectl exec -it <pod-name> -- nslookup kubernetes.default
Step-by-Step Fix
Step 1: Gather Information
# Get detailed pod status
kubectl describe pod <pod-name> -n <namespace>
# Check events
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
# Check logs from the previous container instance (drop --previous if the container has not restarted)
kubectl logs <pod-name> -n <namespace> --previous
Step 2: Identify the Root Cause
Review the output from Step 1. Look for:
- Events: Warning events often point directly to the cause
- Conditions: Pod and node conditions reveal cluster state issues
- Logs: Application logs may show the underlying failure
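Two further checks tend to narrow down a 502 quickly: whether the Service has any pod IPs behind it, and what the ingress controller logged for the failing request. The sketch below wraps both in small helper functions. The function names are illustrative, and the default namespace `ingress-nginx` and Deployment `ingress-nginx-controller` are assumptions based on a typical ingress-nginx install; set `INGRESS_NS` and `INGRESS_DEPLOY` if your controller lives elsewhere. The grep pattern assumes an nginx-style access log.

```shell
# List the pod IPs currently backing a Service. An empty result usually
# means the Service selector matches no Ready pods.
service_endpoints() {
  # $1 = Service name, $2 = namespace
  kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Show recent ingress controller log lines that record a 502 response.
# Namespace and Deployment name are assumptions (default ingress-nginx install).
controller_502s() {
  kubectl logs -n "${INGRESS_NS:-ingress-nginx}" \
    "deployment/${INGRESS_DEPLOY:-ingress-nginx-controller}" --tail=200 \
    | grep ' 502 '
}
```

Run `service_endpoints <service-name> <namespace>` first. If it prints addresses, `controller_502s` may show which upstream the controller tried and why the connection failed.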
Step 3: Apply the Fix
Based on your findings, apply the appropriate fix:
# Example: Adjust resource limits
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: app
      image: myapp:latest
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"
Step 4: Verify
# Watch pod status
kubectl get pods -w -n <namespace>
# Confirm events are clean
kubectl get events -n <namespace> --field-selector reason!=Pulling
Prevention
To avoid this error in the future:
- Use resource requests and limits on all pods
- Implement health checks (readiness and liveness probes)
- Monitor your cluster with Prometheus and Grafana
- Keep Kubernetes updated to benefit from bug fixes
- Use GitOps (ArgoCD or Flux) for consistent deployments
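Readiness probes matter most for this error, because a pod that is not Ready is left out of the Service endpoints and the ingress does not send it traffic. As a rough sketch, the snippet below writes a Deployment manifest that combines probes with the resource settings shown earlier. The file name, the Deployment name `example`, port 8080, and the `/healthz` path are all assumptions for illustration; substitute whatever your application exposes.

```shell
# Write a hypothetical Deployment with probes and resource settings to a file.
cat > probes-example.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example                      # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: app
          image: myapp:latest
          ports:
            - containerPort: 8080    # assumed application port
          readinessProbe:            # keeps the pod out of endpoints until it responds
            httpGet:
              path: /healthz         # assumed health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:             # restarts the container if it stops responding
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
EOF
```

Before applying it, you can validate the file client-side with `kubectl apply --dry-run=client -f probes-example.yaml`.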
Related Resources
Need hands-on Kubernetes training? Check out Luca Berton’s courses for practical DevOps and cloud engineering skills.
