Luca Berton speaking to a packed room at KubeCon Europe 2026 Amsterdam

Packed Room at KubeCon Europe 2026:

780 attendees saved the session and packed the room for my KubeCon EU 2026 talk on multi-tenant GPUs on bare metal OpenShift AI. Recap, questions, and.

Luca Berton

Standing room only

I looked up from the podium and the room was full. Not “a few empty seats in the back” full: 780 attendees had saved my session in the KubeCon agenda, engineers packed in, people standing along the walls, and more trying to get in through the doors.

This was my talk at KubeCon + CloudNativeCon Europe 2026 in Amsterdam, “Multi-Tenant GPUs on Bare Metal OpenShift AI: A GitOps Blueprint from the Trenches.”

After months of preparation, after the nerves of the morning, after checking the slides one last time, seeing that room filled with people who cared about the same problems I have been solving every day was the most incredible feeling of my career.

What I presented

The talk was not theoretical. It was a battle-tested blueprint from real production infrastructure:

The environment: bare metal, no safety net

We run a Dell AI Factory bare-metal cluster: PowerEdge R660 control plane, R670 CPU nodes, and XE7740 GPU workers loaded with NVIDIA H200 GPUs. PowerScale scale-out NAS with RDMA. Latest OpenShift plus OpenShift AI, NVIDIA GPU and Network Operators, and Run:AI for scheduling.

All of this in an air-gapped environment with a local Quay mirror. No public registries. No cloud abstractions. Every layer is ours to manage.

The real problems

“It runs” does not mean “it is safe to share.” From day one we had to solve:

  • Noisy neighbors hoarding GPU memory, causing latency spikes for everyone else
  • Queue explosions where jobs starve and scheduling becomes “random wins”
  • MIG misfit: some workloads thrived with Multi-Instance GPU partitioning, others crawled
  • Driver drift: proprietary kernel modules and firmware mismatches across nodes
  • Network chaos: SR-IOV misconfigured, RDMA failures killing distributed training
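
To make the driver drift item concrete, here is a minimal Python sketch of the desired-versus-observed comparison that a GitOps workflow implies. It is an illustration under assumptions, not the tooling from the talk: the function name find_driver_drift, the node names, and the version strings are all hypothetical placeholders.

```python
def find_driver_drift(desired_version, observed_versions):
    """Return {node: version} for nodes whose driver differs from the declared one.

    desired_version   -- the driver version declared in Git (hypothetical value)
    observed_versions -- mapping of node name to the version that node reports
    """
    return {
        node: version
        for node, version in sorted(observed_versions.items())
        if version != desired_version
    }


# Hypothetical inventory: names and versions are placeholders, not real cluster data.
observed = {
    "gpu-worker-0": "550.90.07",
    "gpu-worker-1": "550.90.07",
    "gpu-worker-2": "550.54.15",
}

drifted = find_driver_drift("550.90.07", observed)
for node, version in drifted.items():
    print(f"{node}: running {version}, expected 550.90.07")
```

In a real cluster the observed versions would come from the nodes themselves; how they are collected is left out here on purpose.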

The GitOps-driven solution

We needed guardrails, not manual intervention. The blueprint I shared covers:

  • GPU partitioning strategies: when to use MIG, when to use time-slicing, when to dedicate full GPUs
  • NVIDIA GPU Operator configuration: managing driver lifecycles across bare-metal nodes
  • Network Operator and SR-IOV: NicClusterPolicy for RDMA-capable network isolation
  • Run:AI scheduling: fair-share policies, quotas, and preemption that actually work
  • GitOps everything: ArgoCD managing the entire GPU infrastructure stack declaratively
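
The fair-share idea in the list above can be illustrated with weighted max-min fairness, a standard textbook allocation scheme. The Python sketch below assumes that scheme for illustration only; it is not Run:AI's actual algorithm or configuration, and the function name, team names, weights, and GPU counts are made up.

```python
def weighted_fair_share(capacity, demands, weights):
    """Weighted max-min fair split of `capacity` GPUs among teams.

    demands -- mapping of team to the number of GPUs it asks for
    weights -- mapping of team to its relative priority (must be > 0)
    Fractional results are allowed; a real scheduler would round to devices.
    """
    allocation = {team: 0.0 for team in demands}
    active = {team for team, demand in demands.items() if demand > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_weight = sum(weights[team] for team in active)
        # Teams whose whole demand fits inside their weighted slice.
        satisfied = {
            team for team in active
            if demands[team] <= remaining * weights[team] / total_weight + 1e-9
        }
        if not satisfied:
            # Everyone wants more than their slice: split what is left by weight.
            for team in active:
                allocation[team] = remaining * weights[team] / total_weight
            break
        # Grant satisfied teams their full demand and redistribute the surplus.
        for team in satisfied:
            allocation[team] = float(demands[team])
            remaining -= demands[team]
        active -= satisfied

    return allocation


# Hypothetical tenants sharing 8 GPUs with equal weights.
shares = weighted_fair_share(
    capacity=8,
    demands={"research": 2, "training": 6, "inference": 6},
    weights={"research": 1, "training": 1, "inference": 1},
)
print(shares)
```

A team that asks for less than its slice keeps only what it needs, and the surplus flows to the teams still waiting. Raising a team's weight enlarges its slice without letting it take more than it requested.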

The energy in the room

What struck me most was the quality of the questions. These were not surface-level “what is Kubernetes” questions. People were asking about:

  • Specific MIG partition layouts for different model sizes
  • How we handle firmware updates without disrupting running training jobs
  • SR-IOV virtual function allocation strategies for multi-tenant RDMA
  • Whether our GitOps approach works with air-gapped OCI registries

These are the questions of people who are living these problems right now. GPU infrastructure on Kubernetes is not a future trend; it is today’s reality, and the room proved it.

Why this topic resonated

A year ago, GPU on Kubernetes was a niche topic. Today, many enterprises are trying to figure out how to share expensive GPU resources across teams without chaos. The demand for AI infrastructure on Kubernetes has exploded, and the tooling is finally catching up.

The fact that hundreds of engineers showed up for a talk about bare-metal GPU multi-tenancy tells you where the industry is heading. Cloud GPUs are expensive and scarce. Enterprises are building their own GPU clusters. And they need the patterns, operators, and GitOps blueprints to manage them.

Get the slides

I have published the full slide deck for anyone who could not attend or wants to revisit the content:

Download the slide deck (PDF)

The deck covers the full architecture, configuration examples, lessons learned, and the GitOps blueprint we use in production.

Thank you

To everyone who showed up, who asked questions, who came up afterward to share their own GPU war stories: thank you. This is what makes KubeCon special. Not the expo hall, not the swag. The people who are building real infrastructure and sharing what they learn.

A special thanks to the KubeCon program committee for selecting this talk, and to the incredible Cloud Native community leaders who make events like this possible.

What is next

If you are building GPU infrastructure on Kubernetes and want to go deeper:

See you at the next one.


Read more about the KubeCon 2026 side events, the companies shaping Cloud Native, and my experience co-hosting Cloud Native Rejekts.
