Why KEDA?
Out of the box, the built-in Kubernetes HPA (Horizontal Pod Autoscaler) scales on CPU and memory. Real workloads often need to scale on other signals:
- Messages in a queue (RabbitMQ, Kafka, SQS)
- Pending HTTP requests
- Database connection count
- Custom business metrics
- Time of day (cron-based scaling)
KEDA (Kubernetes Event-Driven Autoscaling) adds 60+ scalers and the ability to scale to zero, something native HPA cannot do.
Installation
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --set webhooks.enabled=true

Verify:
kubectl get pods -n keda
# keda-operator-xxx Running
# keda-metrics-apiserver-xxx Running
# keda-admission-webhooks-xxx Running

Core Concepts
KEDA introduces two CRDs for scaling:
- ScaledObject: scales Deployments/StatefulSets
- ScaledJob: scales Jobs (batch processing)
A third CRD, TriggerAuthentication, holds scaler credentials (see Example 5).
┌──────────────┐     ┌──────────┐     ┌──────────────────┐
│ Event Source │────▶│   KEDA   │────▶│ HPA (under hood) │
│ (Queue/API)  │     │ Operator │     │ scales workload  │
└──────────────┘     └──────────┘     └──────────────────┘

Example 1: Scale on RabbitMQ Queue Depth
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
spec:
  scaleTargetRef:
    name: order-processor   # Deployment name
  pollingInterval: 10
  cooldownPeriod: 30
  minReplicaCount: 0        # Scale to zero when queue is empty
  maxReplicaCount: 50
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://user:pass@rabbitmq.default:5672/
        queueName: orders
        queueLength: "5"    # 1 pod per 5 messages

When the orders queue has 0 messages → 0 pods. 25 messages → 5 pods. 250 messages → 50 pods.
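The manifest above embeds the AMQP credentials directly in the host field. One way to keep them out of the ScaledObject is a TriggerAuthentication backed by a Kubernetes Secret, the same pattern Example 5 uses for AWS. The following is a minimal sketch under assumptions: the Secret name rabbitmq-conn, its key host, and the TriggerAuthentication name rabbitmq-auth are hypothetical, and the mode/value pair is the form recent KEDA releases document in place of queueLength, so check the scaler reference for your KEDA version before relying on it.

```yaml
# Hypothetical Secret holding the full AMQP URI (name and key are assumptions).
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-conn
stringData:
  host: amqp://user:pass@rabbitmq.default:5672/
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth          # hypothetical name
spec:
  secretTargetRef:
    - parameter: host          # feeds the Secret value to the scaler's host parameter
      name: rabbitmq-conn
      key: host
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 0
  maxReplicaCount: 50
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: orders
        mode: QueueLength      # scale on the number of messages in the queue
        value: "5"             # target of 5 messages per pod
      authenticationRef:
        name: rabbitmq-auth
```

All three objects are namespaced, so they need to live in the same namespace as the order-processor Deployment.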
Example 2: Scale on Kafka Consumer Lag
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: event-consumer
spec:
  scaleTargetRef:
    name: event-consumer
  minReplicaCount: 1
  maxReplicaCount: 100
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: my-consumer-group
        topic: events
        lagThreshold: "10"
        offsetResetPolicy: earliest

Example 3: Scale on Prometheus Metrics
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-gateway
spec:
  scaleTargetRef:
    name: api-gateway
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        metricName: http_requests_per_second
        query: |
          sum(rate(http_requests_total{app="api-gateway"}[1m]))
        threshold: "100"   # 1 pod per 100 req/s

Example 4: Cron-Based Scaling
Pre-scale for known traffic patterns:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-frontend
spec:
  scaleTargetRef:
    name: web-frontend
  minReplicaCount: 2
  maxReplicaCount: 30
  triggers:
    - type: cron
      metadata:
        timezone: Europe/Amsterdam
        start: 0 8 * * 1-5    # 8 AM weekdays
        end: 0 20 * * 1-5     # 8 PM weekdays
        desiredReplicas: "10"
    - type: cron
      metadata:
        timezone: Europe/Amsterdam
        start: 0 20 * * 1-5   # 8 PM weekdays
        end: 0 8 * * 1-5      # 8 AM weekdays
        desiredReplicas: "3"

Example 5: AWS SQS
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-processor
spec:
  scaleTargetRef:
    name: sqs-processor
  minReplicaCount: 0
  maxReplicaCount: 25
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/012345678901/my-queue
        queueLength: "5"
        awsRegion: us-east-1
      authenticationRef:
        name: aws-credentials
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-credentials
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID
      name: aws-secret
      key: access-key
    - parameter: awsSecretAccessKey
      name: aws-secret
      key: secret-key

ScaledJobs for Batch Processing
Instead of long-running pods, spin up Jobs per message:
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: processor
            image: image-processor:latest
            env:
              - name: QUEUE_URL
                value: amqp://rabbitmq:5672
        restartPolicy: Never
    backoffLimit: 3
  pollingInterval: 5
  maxReplicaCount: 20
  successfulJobsHistoryLimit: 10
  failedJobsHistoryLimit: 5
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://user:pass@rabbitmq:5672/
        queueName: images
        queueLength: "1"   # 1 job per message

Multiple Triggers (Composite Scaling)
When a ScaledObject has several triggers, a desired replica count is computed for each one and the maximum wins:
triggers:
  # Scale on queue depth OR high CPU
  - type: rabbitmq
    metadata:
      queueName: orders
      queueLength: "10"
  - type: cpu
    metricType: Utilization
    metadata:
      value: "70"

Scaling Behavior Tuning
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app
spec:
  scaleTargetRef:
    name: my-app
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0     # react to load immediately
          policies:
            - type: Percent
              value: 100                    # at most double the pods every 15s
              periodSeconds: 15
        scaleDown:
          stabilizationWindowSeconds: 300   # wait 5 minutes before shrinking
          policies:
            - type: Pods
              value: 1                      # remove at most 1 pod per minute
              periodSeconds: 60
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # required; same server as Example 3
        query: sum(rate(http_requests_total[1m]))
        threshold: "50"

Monitoring KEDA
# Check ScaledObject status
kubectl get scaledobjects
kubectl describe scaledobject my-app
# Check generated HPA
kubectl get hpa
# KEDA metrics
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq .
# Operator logs
kubectl logs -n keda deploy/keda-operator --tail=50

Prometheus Metrics from KEDA
# KEDA exposes metrics on :8080/metrics
- keda_scaler_active: Is the scaler triggering scale-up?
- keda_scaler_metrics_value: Current metric value
- keda_scaled_object_errors: Error count per ScaledObject

KEDA vs Native HPA
| Feature | HPA | KEDA |
|---|---|---|
| CPU/Memory scaling | ✅ | ✅ |
| External metrics | Manual setup | 60+ built-in scalers |
| Scale to zero | ❌ | ✅ |
| Cron-based scaling | ❌ | ✅ |
| ScaledJobs (batch) | ❌ | ✅ |
| Custom metrics | Requires adapter | Built-in |
| Complexity | Low | Medium |
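For comparison, the CPU/Memory row of the table corresponds to a plain HorizontalPodAutoscaler like the sketch below. It targets the api-gateway Deployment from Example 3 with the same replica bounds; the 70% utilization target is an illustrative assumption, and the Deployment's containers need CPU requests set for utilization to be computed.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 2             # a default cluster does not accept 0 here
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed target, tune per workload
```

Use one or the other for a given Deployment: KEDA creates and manages its own HPA for each ScaledObject, so a hand-written HPA on the same target would conflict with it.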