KEDA vs HPA: Kubernetes Autoscaling Compared (2026 Guide)

KEDA vs Horizontal Pod Autoscaler: when to use event-driven vs metric-based scaling. Configuration examples, performance benchmarks, and production patterns.

Luca Berton
· 2 min read

Quick Decision

| Scenario | Use HPA | Use KEDA |
|---|---|---|
| CPU/memory scaling | ✅ | Overkill |
| Scale from/to zero | ❌ | ✅ |
| Queue-based scaling | ❌ | ✅ |
| Cron-based scaling | ❌ | ✅ |
| Custom Prometheus metrics | Complex | ✅ Easy |
| No extra dependencies | ✅ | Requires KEDA operator |

Horizontal Pod Autoscaler (HPA)

HPA is Kubernetes' built-in autoscaling mechanism. It watches resource metrics (CPU, memory) or custom metrics and adjusts the replica count.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15

How HPA Works

  1. Metrics Server collects CPU/memory from kubelets (every 15s default)
  2. HPA controller checks metrics (every 15s default)
  3. Calculates desired replicas: ceil(currentReplicas × (currentMetric / targetMetric))
  4. Scales deployment up or down within min/max bounds
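
The replica calculation in steps 3 and 4 can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: the function name and parameters are hypothetical, and it leaves out the tolerance band and the stabilization windows that the real controller also applies.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Apply the ratio formula from step 3, then clamp to min/max (step 4).

    Hypothetical helper for illustration only; it ignores tolerance,
    stabilization windows, and scaling policies.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% CPU against a 70% target, bounds 2..50
print(desired_replicas(4, 90, 70, 2, 50))  # -> 6
```

With the example HPA above (target 70%, bounds 2 to 50), four pods at 90% average CPU would be scaled to six, since 4 × 90 / 70 ≈ 5.14 rounds up.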

HPA Limitations

  • Cannot scale to zero: the minimum is 1 replica by default (lower values require the alpha HPAScaleToZero feature gate)
  • Limited metric sources: CPU, memory, or custom/external metrics via adapters
  • Custom metrics are complex: they require a metrics adapter such as Prometheus Adapter or the Datadog Cluster Agent
  • No event-driven scaling: purely metric-based, evaluated on a polling interval

KEDA (Kubernetes Event-Driven Autoscaling)

KEDA extends Kubernetes with 60+ scalers for event-driven autoscaling. It creates and manages HPA objects behind the scenes.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
spec:
  scaleTargetRef:
    name: order-processor
  pollingInterval: 10
  cooldownPeriod: 300
  minReplicaCount: 0    # Scale to zero!
  maxReplicaCount: 100
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        host: amqp://rabbitmq.default.svc:5672
        mode: QueueLength   # replaces the deprecated queueLength field
        value: "5"          # 1 pod per 5 messages
    - type: cron
      metadata:
        timezone: Europe/Amsterdam
        start: "0 8 * * 1-5"
        end: "0 20 * * 1-5"
        desiredReplicas: "3"
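
To see how the two triggers above interact, here is a rough Python sketch. It assumes the trigger that asks for the most replicas wins, which is how the underlying HPA treats multiple metrics. The function names and defaults are hypothetical and mirror the values in the example (5 messages per pod, 3 replicas during the cron window, bounds 0 to 100). KEDA's separate activation step for the 0 to 1 transition is not modeled.

```python
import math


def queue_replicas(messages, per_pod):
    """Replicas needed so each pod handles at most `per_pod` messages."""
    return math.ceil(messages / per_pod)


def scaled_object_replicas(messages, cron_active, per_pod=5, cron_replicas=3,
                           min_count=0, max_count=100):
    """Combine a queue trigger and a cron trigger (illustrative only).

    Assumption: the largest proposal across triggers is used, then the
    result is clamped to minReplicaCount/maxReplicaCount.
    """
    proposals = [queue_replicas(messages, per_pod)]
    if cron_active:
        proposals.append(cron_replicas)
    return max(min_count, min(max_count, max(proposals)))


print(scaled_object_replicas(0, cron_active=False))   # -> 0 (scaled to zero)
print(scaled_object_replicas(0, cron_active=True))    # -> 3 (cron floor)
print(scaled_object_replicas(42, cron_active=True))   # -> 9 (queue dominates)
```

One consequence worth noting: while the cron window is active, the workload will not drop to zero even when the queue is empty.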

KEDA Architecture

+-------------+     +---------------+     +----------------+
|   Scaler    |---->| KEDA Operator |---->|      HPA       |
| (RabbitMQ)  |     |               |     | (auto-created) |
+-------------+     +---------------+     +----------------+
                            |
                            v
                    +---------------+
                    |  Deployment   |
                    | 0→N replicas  |
                    +---------------+

| Scaler | Use Case |
|---|---|
| kafka | Scale on consumer lag |
| rabbitmq | Scale on queue depth |
| aws-sqs-queue | Scale on SQS message count |
| prometheus | Any Prometheus metric |
| cron | Time-based pre-scaling |
| postgresql | Scale on query result |
| redis-streams | Scale on stream length |
| azure-servicebus | Scale on queue or subscription message count |

Scale to Zero

KEDA's killer feature is scale to zero: when no events are pending, it scales the deployment down to zero replicas.

spec:
  minReplicaCount: 0
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123/orders
        queueLength: "1"
        awsRegion: eu-west-1

Cold start latency: first pod takes 5-30s to become ready (depends on image size and init).

Performance Comparison

Scaling Speed

| Metric | HPA | KEDA |
|---|---|---|
| Polling interval | 15s (default) | 10-30s (configurable) |
| Scale-up reaction | 15-45s | 10-40s |
| Scale-down cooldown | 5min (default) | Configurable |
| Zero→1 cold start | N/A | 5-30s |

Resource Overhead

| Component | CPU | Memory |
|---|---|---|
| HPA (built-in) | ~0 | ~0 |
| Metrics Server | 100m | 200Mi |
| KEDA Operator | 100m | 128Mi |
| KEDA Metrics Server | 100m | 128Mi |

Production Patterns

Pattern 1: HPA for Web + KEDA for Workers

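# Web tier: CPU-based HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api
spec:
  scaleTargetRef:           # required; assumes the Deployment is also named web-api
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 3
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
---
# Worker tier: KEDA for queue processing
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker
spec:
  scaleTargetRef:
    name: worker            # required; assumes the Deployment is also named worker
  minReplicaCount: 0
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.default.svc:9092   # placeholder address; the kafka scaler requires this field
        topic: events
        consumerGroup: workers
        lagThreshold: "10"  # target lag per replica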

Pattern 2: KEDA with Prometheus for Custom Metrics

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-gateway
spec:
  scaleTargetRef:
    name: api-gateway       # required; assumes the Deployment is also named api-gateway
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: |
          sum(rate(http_requests_total{service="api"}[2m]))
        threshold: "100"    # Target of ~100 req/s per pod on average
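
As a rough illustration of what the threshold means here: the query returns the total request rate across all pods, and the threshold is treated as a per-pod average target, so the desired replica count is approximately the total divided by the threshold. The Python sketch below assumes that default averaging behavior. The function name and defaults are hypothetical and mirror the example above (threshold 100, bounds 2 to 20).

```python
import math


def replicas_for_rate(total_rps, threshold_per_pod=100.0,
                      min_count=2, max_count=20):
    """Estimate replicas for a summed request rate (illustrative only).

    Assumption: the threshold is a per-pod average target, so the
    proposal is ceil(total / threshold), clamped to the configured bounds.
    """
    if threshold_per_pod <= 0:
        raise ValueError("threshold_per_pod must be positive")
    proposal = math.ceil(total_rps / threshold_per_pod)
    return max(min_count, min(max_count, proposal))


print(replicas_for_rate(850))   # -> 9
print(replicas_for_rate(50))    # -> 2 (held at minReplicaCount)
print(replicas_for_rate(5000))  # -> 20 (capped at maxReplicaCount)
```

The practical takeaway is that the query should return a cluster-wide total, not a per-pod value, when the threshold is meant as "load per pod".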

Installation

KEDA via Helm

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace

Verify

kubectl get pods -n keda
# NAME                                      READY
# keda-operator-5f8b4b6d4-xxxxx            1/1
# keda-operator-metrics-apiserver-xxx       1/1

When to Choose

Stick with HPA when:

  • CPU/memory metrics are sufficient
  • You don't need scale-to-zero
  • You want zero additional dependencies
  • Simple web applications with predictable load

Choose KEDA when:

  • You need scale-to-zero for cost savings
  • Event-driven workloads (queues, streams, schedules)
  • You want simpler custom metric configuration
  • Batch processing or async job workers
  • Multi-trigger scaling logic
