
Dynatrace at KubeCon EU 2026: AI

Caught up with Andreas (Andi) Grabner from Dynatrace at KubeCon EU 2026. AI is the top K8s workload but we cannot keep throwing compute at it.

April 10, 2026 · 3 min read

AI is officially the number one workload on Kubernetes — but are we actually running it efficiently?

The Right-Sizing Imperative

I had a great catch-up with Andreas (Andi) Grabner on the expo floor at KubeCon to talk about the reality of scaling AI. The big takeaway? We cannot just keep throwing raw compute at these workloads. If we do not figure out how to optimize them, we are going to hit a wall with massive energy demands.

Andi made a fantastic point: the next phase of cloud-native AI is all about right-sizing. It is about optimizing workloads and shifting from massive, resource-heavy models to smaller, highly efficient ones.

This aligns perfectly with what I have been seeing in enterprise AI deployments. Organizations that started with the biggest model they could find are now realizing:

Smaller models fine-tuned for specific tasks often outperform general-purpose giants
Token economics matter — every unnecessary parameter costs GPU cycles and electricity
Model profiles let you pick the right GPU memory footprint for your actual workload, not the theoretical maximum
Autoscaling inference prevents over-provisioning during off-peak hours
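
To make the memory-footprint point concrete, here is a minimal sketch of the back-of-the-envelope arithmetic. It is not taken from the conversation with Andi; the function names, the precision table and the 20% headroom figure are illustrative assumptions. It only estimates memory for model weights and leaves the KV cache and activations to the headroom margin.

```python
# Bytes per parameter for common numeric formats (weights only).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


def fits_on_gpu(num_params: float, precision: str, gpu_gib: float,
                headroom: float = 0.2) -> bool:
    """True if the weights fit while leaving `headroom` (a fraction of GPU
    memory, assumed here to be 20%) free for KV cache and activations."""
    return weight_memory_gib(num_params, precision) <= gpu_gib * (1 - headroom)


if __name__ == "__main__":
    # Hypothetical comparison: a 7B and a 70B model on an 80 GiB GPU.
    for params in (7e9, 70e9):
        for precision in ("fp16", "int4"):
            gib = weight_memory_gib(params, precision)
            ok = fits_on_gpu(params, precision, gpu_gib=80)
            print(f"{params:.0e} params @ {precision}: {gib:.1f} GiB, fits={ok}")
```

Under these assumptions, a 7-billion-parameter model in fp16 needs roughly 13 GiB for weights, while a 70-billion-parameter model in fp16 does not fit on a single 80 GiB card at all. Real deployments should be measured rather than estimated.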

You Cannot Fix What You Cannot Measure

Here is the challenge: you cannot fix what you cannot measure. That is where observability becomes the absolute linchpin. You need deep, granular visibility to pinpoint exactly where your AI infrastructure is bleeding efficiency.

For AI workloads on Kubernetes, the observability gaps are real:

GPU utilization — are your expensive GPUs actually busy, or are they idle waiting for data?
Inference latency distribution — not just p50, but p95 and p99 under real traffic patterns
Token throughput per watt — the sustainability metric that matters
Queue depth and batch efficiency — are you batching requests optimally?
Memory pressure — KV cache utilization, model weight distribution across multi-node deployments
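
Two of the metrics above are easy to compute once the raw samples exist. The sketch below is illustrative only: the function names and sample numbers are assumptions, it does not use any Dynatrace API, and it assumes you already collect per-request latencies and an average power reading from your own telemetry.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tokens_per_second_per_watt(tokens, seconds, avg_power_watts):
    """Token throughput per watt, which is the same as tokens per joule."""
    return (tokens / seconds) / avg_power_watts


if __name__ == "__main__":
    # Hypothetical request latencies in milliseconds.
    latencies_ms = [120, 80, 95, 400, 110]
    for pct in (50, 95, 99):
        print(f"p{pct}: {percentile(latencies_ms, pct)} ms")

    # Hypothetical window: 30,000 tokens in 60 s at an average of 250 W.
    print(tokens_per_second_per_watt(30_000, 60, 250), "tokens/s per watt")
```

With only five samples, p95 and p99 both land on the single slow request, which is exactly why tail percentiles need real traffic volumes to be meaningful.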

Naturally, Andi recommends Dynatrace to solve this — and for good reason, given their deep roots in making complex environments understandable. Their approach of automatic discovery and AI-powered root cause analysis becomes even more valuable when you are dealing with the complexity of distributed inference pipelines.

From Running AI to Running It Sustainably

Always an insightful conversation with Andi. It is refreshing to see the industry focus shifting from just “running AI” to running it sustainably.

The sustainability angle is not just idealism; it is economics. GPU compute is expensive. Energy costs are rising. The organizations that figure out how to get more inference per dollar and per kilowatt-hour will have a structural advantage over those who just keep scaling horizontally.
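
As a rough illustration of that economics, the following sketch turns a GPU hourly price, an average power draw and a sustained throughput into cost and energy per million tokens. All names and numbers are hypothetical assumptions chosen for easy arithmetic, not figures from the conversation or from any vendor.

```python
def cost_per_million_tokens(gpu_hourly_usd, tokens_per_second):
    """Dollars spent on GPU time to generate one million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def kwh_per_million_tokens(avg_power_watts, tokens_per_second):
    """Energy in kilowatt-hours to generate one million tokens."""
    seconds = 1_000_000 / tokens_per_second
    joules = avg_power_watts * seconds
    return joules / 3_600_000  # 1 kWh = 3.6 million joules


if __name__ == "__main__":
    # Hypothetical GPU: 3.60 USD per hour, 360 W average draw.
    for tps in (500, 1000, 2000):
        usd = cost_per_million_tokens(3.60, tps)
        kwh = kwh_per_million_tokens(360, tps)
        print(f"{tps} tok/s: {usd:.2f} USD and {kwh:.3f} kWh per million tokens")
```

Doubling throughput on the same hardware halves both numbers, which is the structural advantage in practice: efficiency gains compound across every token served.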

This connects directly to the inference gold rush I have been writing about. The winners will not be the ones with the most GPUs — they will be the ones who use their GPUs most efficiently.

Learn More

Check out how Dynatrace is tackling observability for the AI era: dynatrace.com

About the Author

I am Luca Berton, AI and Cloud Advisor. I help enterprises right-size their AI infrastructure for performance and sustainability. Book a consultation.
