Build and Scale AI Workloads on Kubernetes

AI is easy to demo and hard to operate.

That is the real gap many teams are facing today. Building an impressive prototype is no longer the main challenge. The harder problem is creating the infrastructure foundation required to run AI workloads reliably, efficiently, and at scale.

On March 28, 2026, I will be joining Packt’s live workshop, Build & Scale AI Workloads on Kubernetes, where I will take part in an exclusive AMA session focused on the practical side of deploying and operating AI workloads at scale.

This workshop stands out because it focuses on the problems engineering teams actually face in production environments, not just in slide decks or proofs of concept.

Why this topic matters

Many organizations have already crossed the first hurdle: they can experiment with AI.

The harder part is building the platform capabilities needed to support AI workloads consistently, efficiently, and securely. That means thinking beyond the model itself and paying attention to the infrastructure layer:

How do you schedule GPU-intensive workloads efficiently?
How do you scale up and down without wasting resources?
How do you monitor performance, reliability, and cost?
How do you design deployment patterns that can survive real traffic and operational complexity?

These are not abstract questions. They are the difference between an AI initiative that looks good in a demo and one that creates durable business value.

Why Kubernetes is part of the answer

Kubernetes has become the default control plane for modern infrastructure, and for good reason. It provides the primitives to orchestrate complex workloads, standardize deployments, and create repeatable operational patterns.

For AI, that matters even more.

As workloads become more compute-intensive and more distributed, teams need a platform that can support:

Containerized training and inference workloads
GPU allocation and scheduling
Autoscaling strategies for variable demand
Observability across services and pipelines
Resilience and repeatability across environments

But none of this happens automatically just because Kubernetes is in the stack.

Running AI on Kubernetes well requires sound architecture, platform discipline, and a clear understanding of the trade-offs involved.
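
To make one of those trade-offs concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target, rounded up, with no change when that ratio is within a tolerance of 1.0 (0.1 by default). The function name, the replica bounds, and the choice of metric (for example, in-flight requests per inference pod) are illustrative assumptions. The real controller also handles missing metrics, pod readiness, and stabilization windows, which this sketch ignores.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,   # HPA's documented default tolerance
    min_replicas: int = 1,    # hypothetical bounds for this example
    max_replicas: int = 10,
) -> int:
    """Approximate the HPA rule: ceil(current_replicas * current / target)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Within tolerance of the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 inference pods averaging 90 in-flight requests each, target of 60.
    print(desired_replicas(4, 90, 60))
```

With 4 pods averaging 90 against a target of 60, the sketch asks for 6 replicas. The tolerance shows the kind of trade-off involved: a wider band avoids churn on expensive GPU nodes, at the cost of reacting more slowly to changes in demand.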

What this workshop will cover

This session is designed for practitioners working across Kubernetes, MLOps, SRE, DevOps, and platform engineering.

The workshop will explore practical areas such as:

GPU scheduling — efficient allocation and multi-tenant GPU sharing
Autoscaling — scaling inference workloads up and down with demand
Observability — monitoring AI pipelines, GPU utilization, and model performance
Deployment patterns for AI agents and ML workloads
Operating AI systems in production

Not more hype. More operational truth.

The speakers

I am honored to be part of an incredible lineup:

Shadab Hussain — Google Developer Expert (AI/ML), Instructor
Sandeep Raghuwanshi — Former Kubernetes SME at Microsoft, Instructor
Nicolas Vermande — Senior Developer Advocate at ScaleOps, Panelist
Derek Ashmore — Agentic AI Enablement Principal, Enterprise AI and AIOps Strategist, Panelist
Luca Berton — AI Infrastructure Architect, KubeCon Speaker, Exclusive AMA

Why I am joining

A big part of my work has been helping organizations think through scalable AI infrastructure, cloud-native platforms, and the bridge between technical implementation and business outcomes.

What I find most valuable in sessions like this is the chance to discuss what happens after the initial excitement fades — when teams are left with the reality of building platforms that are cost-effective, scalable, and maintainable.

That is also the spirit of the AMA session. I am looking forward to an open conversation around the decisions, bottlenecks, and lessons that emerge when AI moves into production.

Who should attend

This workshop will be especially relevant if you are:

Building or operating AI workloads on Kubernetes
Scaling platform capabilities for ML teams
Working in MLOps, SRE, DevOps, or platform engineering
Trying to move from experimentation to production-grade AI systems
Responsible for balancing performance, reliability, and cost

Register now

Date: March 28, 2026

30% off with code: LUCA30

Limited slots available — I will be there on March 28 and look forward to the discussion.

Want to go deeper on AI infrastructure for your organization? Book a call to discuss your specific challenges with GPU scheduling, Kubernetes platform design, and production AI operations.
