AI is easy to demo and hard to operate.
That is the real gap many teams are facing today. Building an impressive prototype is no longer the main challenge. The harder problem is creating the infrastructure foundation required to run AI workloads reliably, efficiently, and at scale.
On March 28, I will be joining Packt's live workshop, Build & Scale AI Workloads on Kubernetes, including an exclusive AMA session focused on the practical side of deploying and operating AI workloads at scale.
This workshop stands out because it focuses on the problems engineering teams actually face in production environments, not just in slide decks or proof-of-concepts.
Why this topic matters
Many organizations have already crossed the first hurdle: they can experiment with AI.
The harder part is building the platform capabilities needed to support AI workloads consistently, efficiently, and securely. That means thinking beyond the model itself and paying attention to the infrastructure layer:
- How do you schedule GPU-intensive workloads efficiently?
- How do you scale up and down without wasting resources?
- How do you monitor performance, reliability, and cost?
- How do you design deployment patterns that can survive real traffic and operational complexity?
These are not abstract questions. They are the difference between an AI initiative that looks good in a demo and one that creates durable business value.
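The scaling question, for instance, comes down to arithmetic a platform team can reason about. As a minimal sketch, the function below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current * currentMetric / targetMetric), with a tolerance band around the target. The function name, the replica bounds, and the sample numbers are illustrative assumptions, not values from the workshop.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,      # assumed bounds, for illustration only
    max_replicas: int = 10,
    tolerance: float = 0.1,     # the HPA's documented default tolerance
) -> int:
    """Estimate the replica count an HPA-style controller would aim for."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Within the tolerance band the replica count is left alone,
    # which avoids flapping on small metric fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 inference replicas averaging 90% utilization against a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
```

The clamp is where cost control lives: an upper bound keeps a traffic spike from requesting more GPU nodes than the budget allows, and a lower bound keeps enough capacity warm to serve the next request.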
Why Kubernetes is part of the answer
Kubernetes has become the default control plane for modern infrastructure for a reason. It provides the primitives to orchestrate complex workloads, standardize deployments, and create repeatable operational patterns.
For AI, that matters even more.
As workloads become more compute-intensive and more distributed, teams need a platform that can support:
- Containerized training and inference workloads
- GPU allocation and scheduling
- Autoscaling strategies for variable demand
- Observability across services and pipelines
- Resilience and repeatability across environments
But none of this happens automatically just because Kubernetes is in the stack.
Running AI on Kubernetes well requires sound architecture, platform discipline, and a clear understanding of the trade-offs involved.
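To make GPU allocation concrete: Kubernetes schedules GPUs as extended resources advertised by a device plugin, and with NVIDIA's device plugin installed the resource name is `nvidia.com/gpu`. The sketch below builds a Pod manifest that requests GPUs through that resource. The helper name, Pod name, container name, and image are hypothetical placeholders, and it assumes a cluster where the NVIDIA device plugin is already running.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that asks the scheduler for whole GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "worker",  # hypothetical container name
                    "image": image,
                    # GPUs are requested under limits; the scheduler only
                    # places the Pod on a node with that many GPUs free.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image, for illustration only.
    manifest = gpu_pod_manifest(
        "inference-demo", "registry.example.com/inference:latest"
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -`. Note that this requests whole GPUs only; sharing a GPU across tenants requires additional configuration beyond this sketch.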
What this workshop will cover
This session is designed for practitioners working across Kubernetes, MLOps, SRE, DevOps, and platform engineering.
The workshop will explore practical areas such as:
- GPU scheduling: efficient allocation and multi-tenant GPU sharing
- Autoscaling: scaling inference workloads up and down with demand
- Observability: monitoring AI pipelines, GPU utilization, and model performance
- Deployment patterns for AI agents and ML workloads
- Operating AI systems in production: the operational truth beyond the hype
Not more hype. More operational truth.
The speakers
I am honored to share the stage with an incredible lineup:
- Shadab Hussain: Google Developer Expert (AI/ML), Instructor
- Sandeep Raghuwanshi: Former Kubernetes SME at Microsoft, Instructor
- Nicolas Vermande: Senior Developer Advocate at ScaleOps, Panelist
- Derek Ashmore: Agentic AI Enablement Principal, Enterprise AI and AIOps Strategist, Panelist
- Luca Berton: AI Infrastructure Architect, KubeCon Speaker, Exclusive AMA
Why I am joining
A big part of my work has been helping organizations think through scalable AI infrastructure, cloud-native platforms, and the bridge between technical implementation and business outcomes.
What I find most valuable in sessions like this is the chance to discuss what happens after the initial excitement fades, when teams are left with the reality of building platforms that are cost-effective, scalable, and maintainable.
That is also the spirit of the AMA session. I am looking forward to an open conversation around the decisions, bottlenecks, and lessons that emerge when AI moves into production.
Who should attend
This workshop will be especially relevant if you are:
- Building or operating AI workloads on Kubernetes
- Scaling platform capabilities for ML teams
- Working in MLOps, SRE, DevOps, or platform engineering
- Trying to move from experimentation to production-grade AI systems
- Responsible for balancing performance, reliability, and cost
Register now
Date: March 28, 2026
30% off with code: LUCA30
Limited slots available. I will be there on March 28 and look forward to the discussion.
Want to go deeper on AI infrastructure for your organization? Book a call to discuss your specific challenges with GPU scheduling, Kubernetes platform design, and production AI operations.