Getting Started with RHEL AI: Installation and GPU Setup

Luca Berton
#rhel-ai #installation-guide #gpu-setup #nvidia-cuda #amd-rocm #kickstart #podman #instructlab

Red Hat Enterprise Linux AI (RHEL AI) simplifies enterprise AI deployment by bundling the essential components needed to build, train, and deploy machine learning models at scale. In this guide, we’ll walk through the installation process and configure GPU acceleration on NVIDIA and AMD hardware.

Prerequisites

Before you begin, ensure you have (as covered in Chapter 2 of Practical RHEL AI):

Step 1: Update Your System

Start by updating your RHEL system to the latest packages:

sudo dnf update -y
sudo dnf install -y git curl wget
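
Before moving on, it can be worth confirming that the tools from this step actually landed on your PATH. The following is a minimal sketch, not part of RHEL AI itself; the helper name missing_tools is made up for illustration.

```shell
# Print the name of every listed command that is not on PATH.
# missing_tools is a hypothetical helper, not a RHEL AI command.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the three tools installed in Step 1.
missing=$(missing_tools git curl wget)
if [ -z "$missing" ]; then
    echo "all required tools are installed"
else
    echo "missing tools:" $missing
fi
```

If anything is reported as missing, re-running the dnf install command above is usually enough.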

Step 2: Install RHEL AI

Red Hat provides RHEL AI through its enterprise repositories. The book covers multiple installation methods:

Option A: Standard Installation

sudo subscription-manager repos --enable rhel-9-for-x86_64-appstream-rpms
sudo dnf install -y rhel-ai
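
If the install step cannot find the package, a likely cause is that the repository was never enabled. As a rough sketch, the check below reads the output of dnf repolist --enabled and looks for the AppStream repo id used above. The helper name repo_enabled is hypothetical, and the sketch assumes the repo id appears in the first column of that output.

```shell
# Succeed when the given repo id appears in the first column of the
# `dnf repolist --enabled` output supplied on stdin.
# repo_enabled is a hypothetical helper name.
repo_enabled() {
    awk -v id="$1" '$1 == id { found = 1 } END { exit !found }'
}

if command -v dnf >/dev/null 2>&1; then
    if dnf repolist --enabled 2>/dev/null | repo_enabled rhel-9-for-x86_64-appstream-rpms; then
        echo "AppStream repository is enabled"
    else
        echo "AppStream repository is not enabled"
    fi
else
    echo "dnf not found; skipping repository check"
fi
```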

Option B: Kickstart for Bare Metal (Chapter 2)

For automated bare metal installs, use Kickstart snippets provided in the book.

Option C: Cloud Templates

The book provides cloud templates for AWS, Azure, and GCP deployments.

This installs the core RHEL AI components including:

Step 3: Configure GPU Acceleration

For NVIDIA GPUs:

Install NVIDIA GPU drivers and CUDA toolkit:

# Add NVIDIA repository
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo

# Install NVIDIA drivers and CUDA
sudo dnf install -y cuda-toolkit nvidia-driver

# Verify installation
nvidia-smi
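
The default nvidia-smi table is easy to read but awkward to script against. As a sketch, the query mode of nvidia-smi (--query-gpu with --format=csv,noheader) emits one comma-separated line per GPU, which can then be turned into a short report. The helper name format_gpu_report is made up here, and the sketch assumes GPU names do not themselves contain ", ".

```shell
# Turn "index, name, memory.total, driver_version" CSV lines from
# nvidia-smi into a one-line-per-GPU report.
# format_gpu_report is a hypothetical helper name.
format_gpu_report() {
    awk -F', ' '{ printf "GPU %s: %s, %s total, driver %s\n", $1, $2, $3, $4 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,memory.total,driver_version \
               --format=csv,noheader | format_gpu_report
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

No lines in the report usually means the driver is installed but not loaded, which the troubleshooting section covers.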

For AMD MI300X:

Install ROCm (Radeon Open Compute):

# Add AMD ROCm repository
sudo dnf config-manager --add-repo https://repo.radeon.com/rocm/rhel9/rocm.repo

# Install ROCm runtime
sudo dnf install -y rocm-core rocm-runtime

# Verify installation
rocm-smi

Step 4: Verify GPU Access

Confirm your GPU is properly configured:

# Check available GPUs
rhel-ai gpu list

# Test GPU functionality
rhel-ai gpu test
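
It can also help to check what the PCI bus reports, independent of any driver. The sketch below classifies lspci output by vendor. It is an illustration only: detect_gpu_vendors is a hypothetical helper, and the device-class strings it matches are assumptions about how GPUs and accelerators are typically labelled by lspci.

```shell
# Read `lspci` output on stdin and print "nvidia", "amd", or "none".
# Only display/accelerator device classes are considered, so AMD CPU
# host bridges are not mistaken for GPUs.
# detect_gpu_vendors is a hypothetical helper name.
detect_gpu_vendors() {
    grep -E 'VGA compatible controller|3D controller|Display controller|Processing accelerators' |
    awk '/NVIDIA/ { nvidia = 1 }
         /AMD\/ATI|Advanced Micro Devices/ { amd = 1 }
         END {
             if (nvidia) print "nvidia"
             if (amd) print "amd"
             if (!nvidia && !amd) print "none"
         }'
}

if command -v lspci >/dev/null 2>&1; then
    vendors=$(lspci 2>/dev/null | detect_gpu_vendors)
    echo "GPU vendors found:" $vendors
else
    echo "lspci not found (it ships in the pciutils package)"
fi
```

If the hardware shows up here but not in nvidia-smi or rocm-smi, the problem is most likely on the driver side rather than the hardware side.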

Step 5: Initialize RHEL AI

Create your RHEL AI working directory and initialize the environment:

mkdir -p ~/rhel-ai
cd ~/rhel-ai

# Initialize RHEL AI environment
rhel-ai init

This creates:

Step 6: Download Your First Model

RHEL AI comes pre-configured with access to open-source models. Download a model for testing:

# List available models
rhel-ai model list

# Download Granite model (recommended)
rhel-ai model download granite-8b

# Verify download
rhel-ai model list --local
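
Model downloads are large, so a quick free-space check before downloading can save a failed transfer. The sketch below uses plain df. The 20 GiB threshold is an assumed placeholder rather than the real size of any particular model, and the helper names free_kb and has_free_space are made up for illustration.

```shell
# Print the free space, in KiB, of the filesystem holding directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when directory $1 has at least $2 KiB free.
has_free_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

target=${HOME:-/}
required_kb=$((20 * 1024 * 1024))   # assumed 20 GiB placeholder

if has_free_space "$target" "$required_kb"; then
    echo "enough free space in $target"
else
    echo "less than 20 GiB free in $target"
fi
```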

Hardware Sizing Guide

Choose the right GPU for your workload:

GPU Model    | Memory   | Best For                       | Cost
NVIDIA A100  | 40/80 GB | Multi-user inference, training | High
NVIDIA H100  | 80 GB    | Large model training           | Very High
AMD MI300X   | 192 GB   | Mixed workloads                | Very High
NVIDIA A10   | 24 GB    | Single-user development        | Medium
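
To relate these memory figures to a model, a back-of-the-envelope estimate helps: the weights alone take roughly (parameters) x (bytes per parameter). The sketch below applies that to an 8-billion-parameter model at 2 bytes per parameter (16-bit weights). It is only a rough lower bound: activations, KV cache, and (for training) optimizer state all come on top. The helper names are hypothetical, and the A100 is taken at its smaller 40 GB variant.

```shell
# Weights-only footprint in GB: billions of parameters x bytes per parameter.
weights_gb() {
    echo $(( $1 * $2 ))
}

# Print each GPU from the table whose memory is at least $1 GB.
gpus_that_fit() {
    for entry in A10:24 A100:40 H100:80 MI300X:192; do
        name=${entry%%:*}
        mem=${entry##*:}
        [ "$mem" -ge "$1" ] && printf '%s\n' "$name"
    done
    return 0
}

needed=$(weights_gb 8 2)   # 8B parameters, 2 bytes each
echo "weights need about ${needed} GB"
gpus_that_fit "$needed"
```

For an 8B model the weights come to about 16 GB, which is why even the 24 GB A10 is listed as workable for single-user development.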

Troubleshooting Common Issues

Issue: GPU not detected

# Restart GPU service
sudo systemctl restart nvidia-persistenced

# Reload drivers
sudo modprobe -r nvidia_uvm
sudo modprobe nvidia_uvm
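
Before reloading anything, it helps to see which GPU kernel modules are actually loaded. The sketch below filters /proc/modules for NVIDIA modules and also reports nouveau, the open-source driver that conflicts with the proprietary NVIDIA one. The helper name gpu_kernel_modules is made up for illustration.

```shell
# Read /proc/modules-style input on stdin and print the names of
# NVIDIA modules plus nouveau, if present.
# gpu_kernel_modules is a hypothetical helper name.
gpu_kernel_modules() {
    awk '$1 ~ /^nvidia/ || $1 == "nouveau" { print $1 }'
}

if [ -r /proc/modules ]; then
    loaded=$(gpu_kernel_modules < /proc/modules)
    if [ -n "$loaded" ]; then
        echo "loaded GPU modules:" $loaded
    else
        echo "no NVIDIA or nouveau modules are loaded"
    fi
else
    echo "/proc/modules is not readable on this system"
fi
```

If nouveau appears in the list, it generally needs to be blacklisted before the NVIDIA modules will load.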

Issue: CUDA version mismatch

# Check installed CUDA version
nvcc --version

# Update to latest CUDA
sudo dnf update -y cuda-toolkit
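
A mismatch usually means the toolkit is newer than the highest CUDA version the installed driver supports, which nvidia-smi prints in its header as "CUDA Version". The sketch below extracts both numbers and compares them. The helper names are hypothetical, the parsing assumes the usual wording of the nvcc and nvidia-smi output, and sort -V is assumed to be available (it is in GNU coreutils).

```shell
# Extract "12.4" from the `nvcc --version` line "... release 12.4, V12.4.131".
toolkit_version() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Extract "12.4" from the nvidia-smi header "... CUDA Version: 12.4 ...".
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed when version $1 <= version $2 (version-aware comparison).
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if command -v nvcc >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
    toolkit=$(nvcc --version | toolkit_version)
    driver=$(nvidia-smi | driver_cuda_version)
    if version_le "$toolkit" "$driver"; then
        echo "toolkit $toolkit is within driver support (up to $driver)"
    else
        echo "toolkit $toolkit is newer than the driver supports ($driver)"
    fi
else
    echo "nvcc or nvidia-smi not found; skipping version check"
fi
```

When the toolkit turns out to be newer, updating the driver is typically the fix, not updating the toolkit further.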

Issue: Insufficient GPU memory

Enable CPU offload so the workload can spill into system memory:

export DEEPSPEED_OFFLOAD=1
export DEEPSPEED_OFFLOAD_DEVICE=cpu
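
Before changing any settings, it is worth checking how much GPU memory is actually in use, since a leftover process can hold memory that a new job needs. As a sketch, the nvidia-smi query mode can report used and total memory per GPU. The helper name gpu_mem_usage is made up for illustration.

```shell
# Read "index, memory.used, memory.total" CSV lines (MiB, no units)
# and print the percentage of memory in use per GPU.
# gpu_mem_usage is a hypothetical helper name.
gpu_mem_usage() {
    awk -F', ' '$3 > 0 { printf "GPU %s: %d%% of %d MiB used\n", $1, $2 * 100 / $3, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,memory.used,memory.total \
               --format=csv,noheader,nounits | gpu_mem_usage
else
    echo "nvidia-smi not found; cannot read GPU memory usage"
fi
```

If a GPU is nearly full while nothing should be running, plain nvidia-smi lists the processes holding the memory.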

Next Steps

Now that RHEL AI is installed and the GPU is configured, you’re ready to:

  1. Fine-tune models using InstructLab
  2. Serve models with vLLM
  3. Monitor performance with Prometheus and Grafana
  4. Scale across clusters with Kubernetes


Ready to deploy enterprise AI? With RHEL AI installed and the GPU configured, you have a solid foundation for building production-grade AI solutions. In the next article, we’ll explore InstructLab for fine-tuning models tailored to your organization’s needs.


📚 Get the Complete Installation Guide

This article only scratches the surface!

Practical RHEL AI provides everything you need for a successful deployment:

🚀 Pre-Order Now - Available March 2026

Get Practical RHEL AI from Apress and deploy production-ready AI on Red Hat Enterprise Linux with confidence.
