
Upgrading to OpenShift AI 3.x at Red Hat Summit 2026

Version history, support windows, supported configurations, and a 5-step migration framework for upgrading from RHOAI 2.x to 3.x, as presented at Red Hat Summit 2026.

Luca Berton

At Red Hat Summit 2026 in Atlanta, I attended a practical session on upgrading Red Hat OpenShift AI (RHOAI). It covered version history, support timelines, supported configurations, and a structured migration framework from 2.x to 3.x.

If you are running RHOAI in production, this is critical planning information.

Introduction: What You Need to Know

OpenShift AI Introduction: version history, support timelines, and supported configurations

The session covered three essential areas:

  1. A quick history of RHOAI versions
  2. Support timelines for OpenShift AI
  3. Supported configurations for each version

The 5-Step Migration Framework

5-step migration agenda: evaluate, understand, prepare, execute

Red Hat presented a structured approach to upgrading:

  1. Introduction and quick “must-know”: version compatibility and support windows
  2. Evaluate: Assess the scope of change for your environment
  3. Understand: Map the key architectural shifts and migration strategies
  4. Prepare: Build your prerequisite checklist and migration plan
  5. Execute: Apply best practices and avoid common pitfalls

This is not a “click upgrade and hope” process: RHOAI 3.x introduces significant architectural changes that require planning.

Support Windows: RHOAI 2.x

RHOAI 2.x support windows: 3 channels (fast, stable, EUS)

RHOAI 2.x uses three release channels:

Fast Channel (1 month cadence):

  • RHOAI 2.19, 2.20, 2.21
  • Short-lived releases for early adopters

Stable Channel (7 months support):

  • RHOAI 2.16, 2.19, 2.22
  • Production-ready with extended support window

EUS (Extended Update Support) (18 months):

  • RHOAI 2.16, 2.25
  • For enterprises that need long-term stability and minimal upgrade frequency

Support Windows: RHOAI 3.x

RHOAI 3.x support windows: 2 channels plus Early Adopter drops

RHOAI 3.x simplifies to two channels plus an Early Adopter track:

Early Adopter (EA) Drops:

  • RHOAI 3.5 EA1, EA2
  • Short support window for testing and feedback

Stable Channel (7 months):

  • RHOAI 3.3, 3.4, 3.5
  • Production-grade with predictable lifecycle

EUS (18 months):

  • RHOAI 3.5, 3.8
  • Long-term support for conservative upgrade cycles

The key change from 2.x: the “fast” channel is replaced by Early Adopter drops, and the naming convention shifts from 2.x minor versions to 3.x minor versions.

Supported Configurations

Supported configurations: RHOAI version to OpenShift version compatibility matrix

Version compatibility is critical, because not all RHOAI versions work with all OpenShift versions:

RHOAI 2.16:

  • OpenShift 4.14, 4.16, 4.18, 4.19
  • OpenShift 4.16, 4.17, 4.18, 4.19, 4.20, 4.21

RHOAI 3.3:

  • OpenShift 4.19.9+, 4.20, 4.21

Note the significant jump: RHOAI 3.3 requires OpenShift 4.19.9 or later. If you are running an older OpenShift version, you must upgrade OpenShift before migrating to RHOAI 3.x.
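The minimum-version rule is easy to get wrong when comparing version strings as text (for example, "4.9" sorts after "4.19" alphabetically). The following is a minimal sketch of a numeric check against the 4.19.9 floor quoted in the session. The function names are hypothetical, and a bare "4.19" is assumed to mean 4.19.0.

```python
import re

# Minimum OpenShift version quoted in the session for RHOAI 3.3.
MIN_OPENSHIFT = (4, 19, 9)


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'major.minor[.patch]' into a 3-tuple; a missing patch counts as 0."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def openshift_ready_for_rhoai3(openshift_version: str) -> bool:
    """Return True when the cluster version meets the 4.19.9 minimum."""
    return parse_version(openshift_version) >= MIN_OPENSHIFT
```

Comparing integer tuples avoids the string-ordering trap; the version string itself would come from wherever you already record your cluster version.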

What Is New in OpenShift AI 3

New features in OpenShift AI 3: MLflow, MaaS, AutoRAG, Kagenti, DRA, and more

The session highlighted the key new capabilities coming to RHOAI 3.x (not the entire Red Hat AI roadmap, but the highlights):

  • MLflow: Experiment tracking and model registry
  • Models-as-a-Service (MaaS): Simplified model deployment
  • AI PlayGround: Interactive model experimentation
  • AutoRAG: Automated Retrieval-Augmented Generation pipelines
  • AutoML: Automated machine learning model selection and tuning
  • Synthetic Data Generation: Create training data at scale
  • Eval Hub: Centralized model evaluation and benchmarking
  • NeMo Guardrails: NVIDIA’s safety and alignment framework, integrated
  • Kagenti: Red Hat’s Kubernetes-native AI agent orchestration
  • DRA (Dynamic Resource Allocation): Fine-grained GPU and accelerator scheduling

This is a massive feature jump from 2.x: the platform is evolving from a notebook/pipeline tool into a full AI application lifecycle platform.

Platform Architecture: What Is New in OpenShift AI 3

What is new in OpenShift AI 3: LLM-d, MaaS, Gateway API, kube-rbac-proxy, Kubeflow Trainer v2, Unified Observability

The session described RHOAI 3 as “a platform evolution, not just a version bump.” The six core architectural changes:

  • LLM-d: Distributed LLM inference with intelligent routing (KV-cache aware)
  • Model-as-a-Service: Multi-tenant LLM governance, RBAC, and rate limiting
  • Gateway API: Cloud-native traffic management replacing OpenShift Routes
  • kube-rbac-proxy: External identity provider support for authentication
  • Kubeflow Trainer v2: Unified TrainJob API for distributed PyTorch, JAX, DeepSpeed
  • Unified Observability: Zero-config GPU monitoring and native vLLM metrics

Architecture at a Glance: 2.25 vs 3.3

Architecture comparison: RHOAI 2.25 vs 3.3 showing removed, unchanged, and new components

This side-by-side comparison is the most important slide in the entire session. Here is what moved, what is new, and what is gone:

Model Serving:

  • ModelMesh REMOVED → RawDeployment (NEW)
  • KServe Serverless REMOVED → LLM-d (NEW)

Networking:

  • OpenShift Routes REMOVED → Gateway API (NEW)
  • Service Mesh v2 REMOVED → Service Mesh v3 embedded (NEW)

Authentication:

  • oauth-proxy REMOVED → kube-rbac-proxy (NEW)
  • Authorino (standalone) REMOVED → RHCL / Authorino + Limitador (NEW)

Operators:

  • Codeflare REMOVED → KubeRay
  • Embedded Kueue REMOVED → RH Build of Kueue (NEW)
  • cert-manager now REQUIRED

What Is Removed

Components removed in RHOAI 3.x and their replacements

Components you must migrate away from before upgrading:

  • ModelMesh (multi-model serving) → RawDeployment
  • KServe Serverless (Knative-based) → RawDeployment or LLM-d
  • oauth-proxy → kube-rbac-proxy
  • OpenShift Serverless operator → Not needed, uninstall
  • Service Mesh v2 → Service Mesh v3 (embedded in OCP 4.19)
  • Standalone Authorino → Red Hat Connectivity Link
  • Codeflare Operator → KubeRay (standalone)
  • Embedded Kueue → Red Hat Build of Kueue

Critical data warning: Llama Stack SQLite is replaced by PostgreSQL, with complete data loss during the transition. No automated migration exists.
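The removal list reads naturally as a lookup table. As a minimal sketch, assuming you keep your own inventory of components in use (the inventory labels and the `migration_actions` helper are hypothetical, not part of any Red Hat tooling), the replacements above can be turned into a to-do list:

```python
# Removed component -> replacement, taken from the list above.
REPLACEMENTS = {
    "ModelMesh": "RawDeployment",
    "KServe Serverless": "RawDeployment or LLM-d",
    "oauth-proxy": "kube-rbac-proxy",
    "OpenShift Serverless operator": "not needed, uninstall",
    "Service Mesh v2": "Service Mesh v3",
    "Standalone Authorino": "Red Hat Connectivity Link",
    "Codeflare Operator": "KubeRay",
    "Embedded Kueue": "Red Hat Build of Kueue",
}


def migration_actions(in_use):
    """Return (component, replacement) pairs for removed components in use.

    Components that are not on the removal list are skipped.
    """
    return [(name, REPLACEMENTS[name]) for name in in_use if name in REPLACEMENTS]


if __name__ == "__main__":
    for component, replacement in migration_actions(["ModelMesh", "oauth-proxy"]):
        print(f"{component} -> {replacement}")
```

The output preserves the order of your inventory, which keeps the list stable between runs as items are remediated.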

What Breaks If You Do Nothing

What breaks if you upgrade without preparation (these are not edge cases)

This is the scariest slide. If you run RHOAI in production today, at least some of these apply to you:

  • ModelMesh / KServe Serverless models → HTTP 503 errors, endpoints stop responding
  • Custom workbench images (built for oauth-proxy) → Redirection loops, users cannot log in
  • Running workbenches not stopped before migration → Redirection loops, no warning
  • Kueue left in “Managed” state → Unrecoverable cluster instability, full restore required
  • Llama Stack data → Permanently lost, no automated migration
  • All Route-based URLs → Stop working: bookmarks, scripts, monitoring, CI/CD, DNS

Assessing Your Exposure

Impact assessment: minimal, moderate, and high impact categories

The rhai-cli assessment tool will tell you exactly where you stand:

Minimal impact: Standard RawDeployment models, default workbench images, no custom auth

Moderate impact: Custom workbench images, Kueue for batch scheduling, Llama Stack

High impact: ModelMesh or Serverless models, custom oauth-proxy configurations, Service Mesh v2 dependencies, distributed inference workloads
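Before running the tool, the three categories can be approximated by hand. The sketch below is only an illustration of the rule "the worst matching category wins"; the feature labels are hypothetical shorthand for the items listed above and are not rhai-cli identifiers.

```python
# Hypothetical labels for the traits named in each impact category.
HIGH_IMPACT = {
    "modelmesh",
    "kserve-serverless",
    "custom-oauth-proxy",
    "service-mesh-v2",
    "distributed-inference",
}
MODERATE_IMPACT = {"custom-workbench-images", "kueue", "llama-stack"}


def impact_level(features):
    """Classify an environment by the most severe category it touches."""
    features = set(features)
    if features & HIGH_IMPACT:
        return "high"
    if features & MODERATE_IMPACT:
        return "moderate"
    return "minimal"
```

A single high-impact trait is enough to put the whole environment in the high category, which matches how the slide groups them.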

Migration Paths: 2.25.x to 3.3.z

Migration path diagram: supported sources (2.25.4+) to supported target (3.3.latest)

The migration path is strict:

  • Supported sources: RHOAI 2.25.4 and later (2.25.4, 2.25.5, 2.25.latest) → ALLOWED to 3.3.latest
  • Blocked: RHOAI 2.25.3 and earlier (known vulnerabilities, migration blocked)
  • Unsupported targets: 3.3.0, 3.3.1 (outdated, with reproducibility risks). Only 3.3.latest (e.g., 3.3.2) is a valid target
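These rules can be encoded as a small guard in a pre-flight script. This is a sketch under stated assumptions: the function name is hypothetical, and because "3.3.latest" is a moving target, the caller has to pass in whatever the latest 3.3 z-stream is at the time.

```python
def _parse(version: str) -> tuple[int, ...]:
    """Turn '2.25.4' into (2, 25, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def migration_allowed(source: str, target: str, latest_3_3: str) -> bool:
    """Apply the session's rules: source 2.25.4+, target 3.3.latest only.

    latest_3_3 is supplied by the caller (an assumption of this sketch).
    """
    src, tgt = _parse(source), _parse(target)
    # Source must be on the 2.25 stream at z-stream 4 or later.
    source_ok = src[:2] == (2, 25) and src >= (2, 25, 4)
    # Target must be the latest 3.3 z-stream, and never 3.3.0 or 3.3.1.
    target_ok = (
        tgt[:2] == (3, 3)
        and tgt >= (3, 3, 2)
        and tgt == _parse(latest_3_3)
    )
    return source_ok and target_ok
```

For example, with 3.3.2 as the latest z-stream, a move from 2.25.5 to 3.3.2 passes, while 2.25.3 as a source or 3.3.1 as a target is rejected.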

Support Timeline

Support timeline: RHOAI 2.25 EUS until April 2027, migration path from 2.25.4+ to 3.3.latest

Key dates and rules, under the motto “Proper Planning Prevents Poor Performance”:

  • RHOAI 2.25: Extended Update Support until April 2027 (security patches and CVE fixes via z-stream releases)
  • RHOAI 3.3.2: First release supporting migration from 2.25.x (latest)
  • All new features and capabilities exclusive to 3.x stream
  • Migration path: from 2.25.4 and later → to 3.3.latest
  • Migration to 3.3.0 or 3.3.1 is NOT supported

Reference links from the session:

Model Serving Consolidation

Model serving consolidation: two modes removed, two modes forward

The model serving stack simplifies from four options to two:

  • ModelMesh ✗ → RawDeployment (standard models)
  • KServe Serverless ✗ → RawDeployment or LLM-d (if distributed)
  • RawDeployment → stays, also feeds into LLM-d for distributed LLMs

The red warning at the bottom says it all: Unconverted models return HTTP 503 after upgrade. There is no grace period and no fallback: if you have not migrated your ModelMesh or KServe Serverless models to RawDeployment before upgrading, they simply stop serving.

Why ModelMesh and KServe Were Removed

Model serving consolidation details: reasons for removal and going forward options

The reasoning behind the removals:

Removed:

  • ModelMesh: not designed for LLM inference topologies
  • KServe Serverless: incompatible with Service Mesh v3 and Gateway API

Going forward:

  • RawDeployment: standard Kubernetes deployments, no external dependencies
  • LLM-d: distributed inference with intelligent routing (requires RHCL)

Critical: All ModelMesh and Serverless models must be converted before migration. There is no automated conversion path.

Auth and Operator Changes

Authentication and operator changes: oauth-proxy to kube-rbac-proxy, operator uninstall/install matrix

The authentication model changes completely:

Authentication: oauth-proxy → kube-rbac-proxy (with external IdP support)

Important: Custom workbench images built for oauth-proxy must be rebuilt, because the image data paths are not the same for kube-rbac-proxy. The security proxy changed completely, so any workbench image that hardcoded oauth-proxy paths or configurations will fail with redirection loops after migration.

Operators to uninstall:

  • ✗ OpenShift Serverless
  • ✗ Service Mesh v2
  • ✗ Authorino

Required install:

  • ✓ cert-manager

Optional installs:

  • RHCL (Red Hat Connectivity Link, for rate limiting and policy)
  • JobSet (for batch workloads)
  • LWS (LeaderWorkerSet, for distributed training)
  • RHBoK (Red Hat Build of Kueue, for job scheduling)

Gateway API Replaces Routes

Gateway API replaces Routes: why the change, what it means, requirements

The move from OpenShift Routes to Gateway API is the foundation for LLM-d, MaaS, and future multi-tenancy:

Why the change:

  • Routes are feature-frozen in Kubernetes
  • Gateway API is the CNCF standard for traffic management
  • Required for LLM-d intelligent routing and AI Gateway capabilities

What it means for you:

  • All endpoint URLs change: bookmarks, scripts, firewall rules need updating
  • Dashboard routing now uses Gateway API internally
  • Bare metal environments may need MetalLB for Gateway traffic

Requires: OpenShift 4.19.9+ (Service Mesh v3 embedded)
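Because every endpoint URL changes, it helps to inventory where the old Route host names are still referenced before the cutover. The following is a minimal sketch that scans a piece of text (a script, a CI/CD config, a monitoring rule) for known old host names. The helper name and the example host names are hypothetical; you would supply the host names of your own Routes.

```python
import re


def find_stale_references(text, old_hosts):
    """Return (line_number, host) for every old host name found in text."""
    if not old_hosts:
        return []
    # Escape each host so dots match literally, then match any of them.
    pattern = re.compile("|".join(re.escape(host) for host in old_hosts))
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    # Hypothetical script content and host name, for illustration only.
    script = "curl https://model-a.apps.old.example.com/v1/models\n"
    print(find_stale_references(script, ["model-a.apps.old.example.com"]))
```

Running this over your repositories gives a concrete list of bookmarks-in-code to update once the Gateway API endpoints are known.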

Networking Flow: 2.25 vs 3.3

Networking flow comparison: Routes + Service Mesh v2 + oauth-proxy vs Gateway API + RHCL + LLM-d + kube-rbac-proxy

The request path simplifies significantly:

RHOAI 2.25: Client → OpenShift Route (dashed) → Service Mesh v2 (dashed) → Model Pod with oauth-proxy (dashed)

RHOAI 3.3: Client → Gateway API → LLM-d intelligent routing → Model Pod with kube-rbac-proxy, plus RHCL for policy/rate limiting

Every dashed component in 2.25 is removed. The 3.3 path is cleaner, with LLM-d providing KV-cache-aware routing and RHCL (Authorino + Limitador) handling policy and rate limiting at the gateway level.

Migration Strategy: No Rollback

One of the most important points from the session: there is no rollback path. If the migration fails or something goes wrong, the only solution is a full restore from backup.

The recommended migration strategy is a parallel install:

  1. Stand up RHOAI 3.x alongside 2.x, which means running twice the workload temporarily
  2. Migrate workloads one by one: move individual models, pipelines, and notebooks from the old environment to the new one
  3. Validate each workload before decommissioning its 2.x counterpart
  4. Only then remove the 2.x installation

This is not an in-place upgrade you can undo. Plan for the parallel capacity, budget the extra GPU/compute time, and take a full cluster backup before touching anything.

Realistic Timelines

Realistic timelines: small, medium, and large/regulated migration durations by phase

How long will this actually take? The session provided honest timelines:

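| Phase            | Small     | Medium     | Large / Regulated |
|------------------|-----------|------------|-------------------|
| Assessment       | 2-3 days  | 1-2 weeks  | 2-4 weeks         |
| Preparation      | 1-2 weeks | 3-6 weeks  | 6-12 weeks        |
| Rehearsal        | 1-2 days  | 1 week     | 1-2 weeks         |
| Migration window | 4-8 hours | 1-2 days   | 2-5 days          |
| Stabilization    | 1 week    | 2-4 weeks  | 4-8 weeks         |
| Total            | 3-5 weeks | 2-3 months | 4-7 months        |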

Environment definitions:

  • Small: 1-2 teams, no custom images, few models, no compliance
  • Medium: 3-10 teams, some custom images, multiple models, standard change management
  • Large / Regulated: 10+ teams, many custom images, extensive integrations, compliance requirements

For large enterprises with compliance requirements, you are looking at up to 7 months end-to-end. Start planning now: RHOAI 2.25 EUS runs out in April 2027.
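Working backward from the EUS end date makes the urgency concrete. The sketch below is a rough planning aid, not an official deadline calculator. It assumes the end of April 2027 as the cutoff (the session gave only the month), uses the upper bound of each total from the table, and rounds the small case (5 weeks) up to 2 months.

```python
from datetime import date

# Assumption: the session only said "April 2027"; the last day is used here.
EUS_END = date(2027, 4, 30)

# Upper bounds of the "Total" row, in whole months (small rounded up from 5 weeks).
WORST_CASE_MONTHS = {"small": 2, "medium": 3, "large": 7}


def months_before(deadline: date, months: int) -> date:
    """First day of the month that lies `months` months before the deadline's month."""
    index = deadline.year * 12 + (deadline.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def latest_start(size: str) -> date:
    """Latest month to begin so the worst-case total still ends before EUS runs out."""
    return months_before(EUS_END, WORST_CASE_MONTHS[size])


if __name__ == "__main__":
    for size in WORST_CASE_MONTHS:
        print(size, latest_start(size).strftime("%B %Y"))
```

Returning the first day of the month errs on the early side, which is the safer direction for a deadline with no slack.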

Prerequisites Checklist

Prerequisites checklist: platform, operator cleanup, workload preparation, backup

Before you start the migration, every item on this checklist must be complete:

Platform:

  • OpenShift Container Platform 4.19.9 or higher
  • cert-manager operator installed
  • MetalLB deployed (bare metal environments)

Operator cleanup:

  • Uninstall: OpenShift Serverless, Service Mesh v2, standalone Authorino
  • Set Kueue management state to Removed ⚠️

Workload preparation:

  • Convert all ModelMesh and Serverless models to RawDeployment
  • Rebuild custom workbench images for kube-rbac-proxy
  • Stop all running workbenches
  • Archive Llama Stack data (if applicable)

Backup:

  • Full cluster backup mandatory for in-place migration
  • Backup all Persistent Volume Claims

Critical insight from the session: Kueue is the most impactful item on this list. If Kueue is left in “Managed” state during migration, it causes unrecoverable cluster instability requiring a full restore. Also, do not just take backups: test the restore of your backups before starting the migration. An untested backup is not a backup.
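A checklist like this is easier to enforce when it is tracked as data rather than in someone's head. As a minimal sketch, the items above can be modeled with two conditional entries (MetalLB for bare metal, Llama Stack archiving where applicable). The `Item` class, the `only_if` flags, and the `outstanding` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    group: str
    name: str
    only_if: str = ""  # hypothetical environment flag; empty means it always applies


CHECKLIST = [
    Item("Platform", "OpenShift Container Platform 4.19.9 or higher"),
    Item("Platform", "cert-manager operator installed"),
    Item("Platform", "MetalLB deployed", only_if="bare-metal"),
    Item("Operator cleanup", "OpenShift Serverless uninstalled"),
    Item("Operator cleanup", "Service Mesh v2 uninstalled"),
    Item("Operator cleanup", "Standalone Authorino uninstalled"),
    Item("Operator cleanup", "Kueue management state set to Removed"),
    Item("Workload preparation", "ModelMesh and Serverless models converted to RawDeployment"),
    Item("Workload preparation", "Custom workbench images rebuilt for kube-rbac-proxy"),
    Item("Workload preparation", "All running workbenches stopped"),
    Item("Workload preparation", "Llama Stack data archived", only_if="llama-stack"),
    Item("Backup", "Full cluster backup taken"),
    Item("Backup", "All Persistent Volume Claims backed up"),
]


def outstanding(done, environment=frozenset()):
    """List the applicable checklist items that are not yet marked done."""
    return [
        item.name
        for item in CHECKLIST
        if (not item.only_if or item.only_if in environment)
        and item.name not in done
    ]
```

The migration should not start until `outstanding(...)` returns an empty list for your environment.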

rhai-cli Assessment Tool

rhai-cli lint output: CRITICAL, WARNING, and INFO findings for migration readiness

The rhai-cli tool is available as a container image and serves as your migration readiness companion. Run it against your cluster to get a clear picture:

rhai-cli lint --target-version 3.3.2

Example output:

  • ✗ CRITICAL - kserve: Serverless mode enabled but will be removed
  • ✗ CRITICAL - modelMesh: ModelMesh enabled but will be removed
  • ⚠ WARNING - servicemesh-v2: No longer required
  • ✓ INFO - cert-manager: Prerequisite met

The workflow is iterative: Run → Remediate → Re-run → Repeat until all CRITICAL and WARNING items are resolved.

How rhai-cli Works

rhai-cli assessment tool: what it does and how to read the output

The tool is available as a container image with multiple helpers to convert and prepare YAML, fully baked and ready to use:

What it does:

  • Scans your entire cluster for migration blockers
  • Identifies every artifact requiring modification
  • Non-intrusive: diagnostic only, changes nothing

How to read the output:

  • Prohibited: upgrade not possible, do not continue
  • Critical: blocker, component will fail after upgrade
  • Warning: potential issue, review and remediate
  • Info: no action needed

Run it multiple times to track your progress as you resolve issues. The full review takes a few loops to get everything clean.
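The four severity levels form an ordered scale, and the overall verdict is driven by the worst finding. The sketch below illustrates that triage logic only. It does not parse real rhai-cli output, whose exact format is not documented here; the `Severity` enum, the `(severity, component)` tuples, and the verdict strings are assumptions made for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Ordered from harmless to show-stopping, following the four levels above."""
    INFO = 0
    WARNING = 1
    CRITICAL = 2
    PROHIBITED = 3


def verdict(findings):
    """Summarize a list of (Severity, component) findings by the worst one."""
    worst = max((severity for severity, _ in findings), default=Severity.INFO)
    if worst == Severity.PROHIBITED:
        return "stop"       # upgrade not possible, do not continue
    if worst == Severity.CRITICAL:
        return "blocked"    # a component will fail after the upgrade
    if worst == Severity.WARNING:
        return "review"     # remediate, then re-run the assessment
    return "ready"


if __name__ == "__main__":
    # Findings modeled on the example shown earlier in the article.
    findings = [
        (Severity.CRITICAL, "kserve"),
        (Severity.CRITICAL, "modelMesh"),
        (Severity.WARNING, "servicemesh-v2"),
        (Severity.INFO, "cert-manager"),
    ]
    print(verdict(findings))
```

Each remediation loop should move the verdict one step closer to "ready"; if it does not, the fix did not address what the tool flagged.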

Full rhai-cli Output in Action

rhai-cli full terminal output: comprehensive cluster scan showing all checks

rhai-cli full output continued: summary with Total: 40, Passed: 12, Warnings: 16, Failed: 12, Prohibited: 0

The real-world output is dense. In this demo cluster, the tool found:

  • Total: 40 checks (Passed: 12, Warnings: 16, Failed: 12, Prohibited: 0)
  • CRITICAL findings: Llama Stack data will be lost, notebooks referencing HardwareProfiles that do not exist on the cluster, Kueue management state issues
  • WARNING findings: Deprecated AcceleratorProfiles being auto-migrated to HardwareProfiles, custom workbench images needing verification, running workbenches that must be stopped
  • Notebook image analysis: 61 custom images, 4 incomplete, 0 incompatible; each needs user verification for kube-rbac-proxy compatibility

The summary at the bottom: FAIL, blocking findings detected. This cluster is not ready for migration until all critical items are resolved.

Before You Start: Open a Support Case with Red Hat

Before you start: open a proactive support case with Red Hat, 3-step workflow

Red Hat strongly recommends opening a proactive support ticket before any migration attempt. They will assign dedicated people to oversee and review results, assess your environment, inventory components, and provide targeted advice for your specific setup.

Why:

  • The migration process can be complex, and expert guidance reduces risk
  • Mandatory for in-place migration
  • Red Hat will assign people to review your rhai-cli output

What to include in the ticket:

  • Your rhai-cli assessment output
  • Your chosen migration strategy (side-by-side or in-place)
  • Inventory of components and workloads in scope

What you get:

  • Targeted advice for your specific environment
  • Guidance on migration sequencing and risk areas
  • Access to Red Hat expertise for complex migrations

The recommended 3-step workflow:

  1. Run rhai-cli: get your assessment
  2. Open support case: share your results
  3. Plan together: get targeted guidance

Component Migration at a Glance

Component migration effort matrix: pre-upgrade, upgrade, post-upgrade actions

Here is the effort breakdown per component:

| Component      | Effort   | Key Action                                  |
|----------------|----------|---------------------------------------------|
| Model serving  | High     | Convert all models to RawDeployment         |
| Workbenches    | Moderate | Rebuild custom images, stop all workbenches |
| Ray training   | Moderate | Run migration script, remove Codeflare      |
| TrustyAI       | Moderate | Backup metrics and data before upgrade      |
| AI Pipelines   | Low      | Update RBAC roles if customized             |
| Model Registry | Low      | Verify pods after upgrade                   |
| Feature Store  | Low      | Verify status after upgrade                 |
| Llama Stack    | High     | Archive data, delete and recreate CRs       |
| Kueue          | Low      | Set to Removed, reinstall via RHBoK after   |

The two High effort items, Model serving and Llama Stack, are the ones that require the most planning: Model serving because every model must be converted to RawDeployment, and Llama Stack because all data will be lost during migration.

Execution: Best Practices and Pitfalls

A key insight from the session: rerun rhai-cli after each remediation step to verify that one less thing is flagged. The iterative loop is essential: do not try to fix everything at once and hope it works.

And a critical reminder: GitOps is not a backup. Your Git repository contains your desired state, but not your persistent data, PVC contents, model weights, or notebook artifacts. A proper OADP or Velero backup is mandatory.

Key Pitfalls to Avoid

Key pitfalls to avoid: lessons from migration testing

These are hard lessons from real migration testing:

  • 🚫 Do not upgrade with Kueue in “Managed” state: it causes unrecoverable cluster instability
  • 🚫 Do not leave ModelMesh or Serverless models unconverted: all unconverted models return HTTP 503 after upgrade
  • 🚫 Do not skip the backup for in-place migration: there is no rollback capability, so the backup is your only recovery path
  • 🚫 Do not forget to stop running workbenches: unmigrated running workbenches cause redirection loops
  • 🚫 Do not run the Ray pre-upgrade script too early: it removes Codeflare, and delaying the upgrade after this creates a security gap
  • 🚫 Do not assume your backup restores successfully: an untested backup is not a backup, so verify on a separate cluster first

How to Know You Are Done

How to know you are done: migration verification checklist

Your migration is complete when all of these are verified:

  • ✅ Workbenches: all start successfully, users log in without redirection loops
  • ✅ Model serving: all endpoints return correct inference responses with real requests
  • ✅ Pipelines: on-demand and scheduled runs execute successfully
  • ✅ Persistent data: notebooks, model artifacts intact and accessible
  • ✅ GPU workloads: schedule and run correctly
  • ✅ Monitoring and integrations: alerting operational, CI/CD, API gateways, DNS, firewalls updated
  • ✅ RBAC and quotas: role bindings and resource limits intact
  • ✅ Users and documentation: notified of new access points, runbooks and training materials updated

Key Takeaways

Key takeaways: what to remember from this session

  • This is a migration, not an upgrade: nearly every platform layer has been re-architected
  • You have time, but start planning now: RHOAI 2.25 is supported until April 2027, so use that window to prepare
  • Use rhai-cli to assess your specific exposure: it tells you exactly what needs to change in your environment
  • Engage Red Hat support early: open a case before you start, and include your rhai-cli assessment
  • Choose your strategy carefully: side-by-side minimizes risk, in-place has no rollback
  • Persistent data and pipelines: plan validation in advance, and check before and after the migration to ensure nothing was lost
  • Minimum source: RHOAI 2.25.4+ → target 3.3.latest only (not 3.3.0 or 3.3.1)
  • OpenShift 4.19.9+ required: this may be a multi-step upgrade from your current version
  • Llama Stack data is permanently lost: back up before migrating, there is no automated path
  • Test in non-production first, especially the ModelMesh → RawDeployment and oauth-proxy → kube-rbac-proxy transitions

Resources and Next Steps

Resources and next steps: migration documentation and rhai-cli assessment tool links

Speakers presenting the resources slide at Georgia World Congress Center
