Kubernetes Cost Optimization with Kubecost and OpenCost

Track and reduce Kubernetes spending with per-namespace cost allocation, right-sizing recommendations, idle resource detection, and showback reports.

Luca Berton · 1 min read

The Problem: Invisible Spending

Most organizations don’t know what each team or service costs in Kubernetes. The cloud bill shows “EKS: $50,000/month”, but which team? Which workload? What’s idle?

Kubecost vs OpenCost

| Feature | OpenCost | Kubecost |
|---|---|---|
| License | Apache 2.0 (CNCF) | Free tier + Enterprise |
| Cost allocation | ✅ | ✅ |
| Recommendations | ❌ | ✅ |
| Alerts | ❌ | ✅ |
| Multi-cluster | ❌ | ✅ (Enterprise) |
| Savings plans | ❌ | ✅ |
| UI | Basic | Full dashboard |
| API | ✅ | ✅ |

Installation (Kubecost)

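# Register the Kubecost chart repository first ("kubecost" is only the local alias)
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="your-token" \
  --set prometheus.server.enabled=true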

Cost Allocation Model

Total Cluster Cost = Node Cost + Storage Cost + Network Cost

Node Cost/hr = Node CPU Cost/hr + Node Memory Cost/hr

Per-Pod Cost = (CPU Request / Node CPU) × Node CPU Cost/hr
             + (Memory Request / Node Memory) × Node Memory Cost/hr
             + PV Cost × (PV Size / Total PV)

Per-Namespace Cost = Σ Pod Costs in Namespace
Per-Team Cost = Σ Namespace Costs (label: team=X)
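
The allocation model above can be sketched in a few lines of Python. This is a minimal illustration, not Kubecost's or OpenCost's implementation. It assumes the node's hourly price is already split into a CPU part and a memory part (so the two shares do not count the node twice) and that each pod's storage cost is given directly. The `Node` and `Pod` classes, the `rollup` helper, and all prices are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Node:
    cpu_cores: float          # allocatable CPU cores
    memory_gib: float         # allocatable memory in GiB
    cpu_cost_per_hour: float  # part of the node price attributed to CPU (assumed)
    mem_cost_per_hour: float  # part of the node price attributed to memory (assumed)


@dataclass
class Pod:
    namespace: str
    team: str                 # value of the team=X label
    cpu_request: float        # cores
    memory_request_gib: float
    pv_cost_per_hour: float = 0.0


def pod_cost_per_hour(pod: Pod, node: Node) -> float:
    """Request-based share of the node's CPU and memory cost, plus storage."""
    cpu = pod.cpu_request / node.cpu_cores * node.cpu_cost_per_hour
    mem = pod.memory_request_gib / node.memory_gib * node.mem_cost_per_hour
    return cpu + mem + pod.pv_cost_per_hour


def rollup(pods, node, key):
    """Sum pod costs grouped by key(pod), for example namespace or team."""
    totals = defaultdict(float)
    for pod in pods:
        totals[key(pod)] += pod_cost_per_hour(pod, node)
    return dict(totals)


# Made-up prices and requests, for illustration only.
node = Node(cpu_cores=4, memory_gib=16, cpu_cost_per_hour=0.12, mem_cost_per_hour=0.08)
pods = [
    Pod("cart", "shop", cpu_request=1.0, memory_request_gib=4.0),
    Pod("cart", "shop", cpu_request=0.5, memory_request_gib=2.0),
    Pod("payment", "shop", cpu_request=2.0, memory_request_gib=8.0, pv_cost_per_hour=0.01),
]
by_namespace = rollup(pods, node, key=lambda p: p.namespace)
by_team = rollup(pods, node, key=lambda p: p.team)
print(by_namespace, by_team)
```

A real cluster has many nodes with different prices, so a fuller version would look up each pod's node before computing its share.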

Right-Sizing Recommendations

# Kubecost API: get request right-sizing recommendations for the last 7 days
# (URL quoted so the shell does not interpret the "?")
curl "http://kubecost:9090/model/savings/requestSizing?window=7d"

# Example response (simplified):
{
  "containerName": "api-server",
  "currentCPURequest": "2000m",
  "recommendedCPURequest": "350m",
  "currentMemoryRequest": "4Gi",
  "recommendedMemoryRequest": "1.2Gi",
  "monthlySavings": "$180"
}
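
To rank recommendations it helps to turn the request strings into numbers. The sketch below parses the Kubernetes quantity formats that appear in the example response (`m` for millicores, binary suffixes such as `Gi`) and computes how much of each request the recommendation removes. The helper names are hypothetical, the parser covers only these suffixes, and the record is the example response shown above rather than a guaranteed API schema.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity ("350m" or "2") to cores."""
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000
    return float(quantity)


_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> float:
    """Convert a memory quantity with a binary suffix ("4Gi") to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    return float(quantity)  # no suffix: treat as a plain byte count


def reduction(current: float, recommended: float) -> float:
    """Fraction of the current request that the recommendation removes."""
    return (current - recommended) / current


# The example record from the response above.
rec = {
    "containerName": "api-server",
    "currentCPURequest": "2000m",
    "recommendedCPURequest": "350m",
    "currentMemoryRequest": "4Gi",
    "recommendedMemoryRequest": "1.2Gi",
    "monthlySavings": "$180",
}

cpu_cut = reduction(parse_cpu(rec["currentCPURequest"]),
                    parse_cpu(rec["recommendedCPURequest"]))
mem_cut = reduction(parse_memory(rec["currentMemoryRequest"]),
                    parse_memory(rec["recommendedMemoryRequest"]))
print(f"{rec['containerName']}: CPU -{cpu_cut:.1%}, memory -{mem_cut:.1%}")
```

For this record the CPU request shrinks by a little over 80% and the memory request by about 70%, which is the kind of gap that makes a container a good first right-sizing candidate.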

Common Waste Patterns

| Pattern | Typical Waste | Fix |
|---|---|---|
| Over-provisioned requests | 40-60% | VPA or right-size manually |
| Idle namespaces | 10-20% | Auto-delete dev envs nightly |
| Orphan PVCs | 5-10% | PVC cleanup CronJob |
| No autoscaling | 20-30% | HPA/KEDA for variable workloads |
| Wrong instance type | 15-25% | Node auto-provisioner (Karpenter) |
| No spot/preemptible | 30-60% | Spot for stateless workloads |
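
As an illustration of the orphan PVC row, the check a cleanup CronJob would run can be sketched as a pure function. It assumes the inputs are the `items` lists from `kubectl get pvc -A -o json` and `kubectl get pods -A -o json`. The function name and the sample objects are hypothetical, and the sketch only reports candidates without deleting anything.

```python
def orphan_pvcs(pvcs, pods):
    """Return sorted (namespace, name) pairs of PVCs that no pod mounts."""
    in_use = set()
    for pod in pods:
        namespace = pod["metadata"]["namespace"]
        for volume in pod["spec"].get("volumes", []):
            claim = volume.get("persistentVolumeClaim")
            if claim:  # other volume types (emptyDir, configMap, ...) are skipped
                in_use.add((namespace, claim["claimName"]))
    return sorted(
        (pvc["metadata"]["namespace"], pvc["metadata"]["name"])
        for pvc in pvcs
        if (pvc["metadata"]["namespace"], pvc["metadata"]["name"]) not in in_use
    )


# Made-up objects, trimmed to the fields the function reads.
pvcs = [
    {"metadata": {"namespace": "dev", "name": "data-old"}},
    {"metadata": {"namespace": "dev", "name": "data-api"}},
]
pods = [
    {
        "metadata": {"namespace": "dev", "name": "api-0"},
        "spec": {
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": "data-api"}},
                {"name": "tmp", "emptyDir": {}},
            ]
        },
    }
]
candidates = orphan_pvcs(pvcs, pods)
print(candidates)
```

A PVC with no pod right now may still belong to a scaled-down StatefulSet or a scheduled job, so treat the result as a review list rather than a delete list.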

Showback Dashboard (Grafana)

# Prometheus recording rules for cost
# Assumes the OpenCost/Kubecost exporter metrics: node_cpu_hourly_cost is
# priced per core-hour and node_ram_hourly_cost per GiB-hour.
groups:
  - name: cost-allocation
    rules:
      - record: namespace:cost:hourly
        expr: |
          sum by (namespace) (
            container_cpu_allocation * on(node) group_left()
            node_cpu_hourly_cost
          ) +
          sum by (namespace) (
            container_memory_allocation_bytes / 1024 / 1024 / 1024
            * on(node) group_left()
            node_ram_hourly_cost
          )
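
Outside Grafana, the recorded series can also feed a plain showback report. The sketch below assumes a reachable Prometheus server and reads `namespace:cost:hourly` through the standard instant-query endpoint (`/api/v1/query`). The function names, the 730-hour month, and the sample payload are illustrative assumptions. Only the parsing step runs here, so no server is needed to try it.

```python
import json
import urllib.parse
import urllib.request

HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months


def monthly_cost_by_namespace(payload: dict) -> dict:
    """Turn an instant-vector query result into {namespace: monthly cost}."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    costs = {}
    for series in payload["data"]["result"]:
        namespace = series["metric"].get("namespace", "unknown")
        _timestamp, value = series["value"]  # the sample value arrives as a string
        costs[namespace] = float(value) * HOURS_PER_MONTH
    return costs


def fetch(prometheus_url: str, query: str = "namespace:cost:hourly") -> dict:
    """Run an instant query against the Prometheus HTTP API."""
    url = f"{prometheus_url}/api/v1/query?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


# Made-up payload in the shape Prometheus returns for an instant vector.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"namespace": "cart"}, "value": [1700000000, "0.10"]},
            {"metric": {"namespace": "payment"}, "value": [1700000000, "0.25"]},
        ],
    },
}
report = monthly_cost_by_namespace(sample)
for namespace, cost in sorted(report.items()):
    print(f"{namespace}: ${cost:,.2f}/month")
```

Against a live server, `monthly_cost_by_namespace(fetch("http://prometheus:9090"))` would produce the same report, where the URL is a placeholder for your own Prometheus address.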

Quick Wins (First 30 Days)

  1. Delete idle workloads: dev/staging environments running 24/7
  2. Right-size the top 10 over-provisioned pods: usually saves 30%+
  3. Enable HPA: for services with variable traffic
  4. Spot instances: for stateless, fault-tolerant workloads
  5. Reserved capacity: for the stable baseline (commit 1-3 years)

Cost per Request (Business Metrics)

The ultimate goal is cost per business transaction:

Cost per API request = Namespace cost / Total requests
Cost per order = (Payment ns + Cart ns + Shipping ns) / Orders processed
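
The two ratios above reduce to simple arithmetic once namespace costs and business counters sit side by side. A minimal sketch follows, with made-up monthly costs and order volume chosen only to illustrate the calculation. The `unit_cost` helper is hypothetical.

```python
def unit_cost(total_cost: float, units: int) -> float:
    """Cost per business unit (request, order, ...) over the same time window."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


# Made-up monthly namespace costs in dollars and a made-up order count.
namespace_costs = {"payment": 1200.0, "cart": 800.0, "shipping": 1000.0}
orders_processed = 1_000_000

order_path_cost = sum(namespace_costs[ns] for ns in ("payment", "cart", "shipping"))
cost_per_order = unit_cost(order_path_cost, orders_processed)
print(f"${cost_per_order:.3f} per order processed")
```

The cost and the counter must cover the same window (here one month), otherwise the ratio is meaningless.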

When you can show “$0.003 per order processed” instead of a “$50,000 cloud bill”, finance understands.
