
Building a Private AI Cloud with RHEL AI and InstructLab

Luca Berton • 1 min read
#rhel-ai#instructlab#private-cloud#ai#enterprise

🏢 Your Own AI Cloud

For regulated industries such as healthcare, finance, defense, and government, sending data to OpenAI or Anthropic isn't an option. RHEL AI and InstructLab let you build a fully private AI cloud, with custom models trained on your own data.

Architecture

RHEL AI (Base Platform)
  ├── InstructLab (Model Customization)
  │   ├── Taxonomy-based training
  │   └── Synthetic data generation
  ├── vLLM (Model Serving)
  ├── Granite Models (Base Foundation)
  └── GPU Management (NVIDIA drivers + container toolkit)

Installation

# RHEL AI bootable container image
sudo bootc switch registry.redhat.io/rhel-ai/rhel-ai-nvidia:1.4

# Initialize InstructLab
ilab config init
ilab model download --repository instructlab/granite-7b-lab

# Serve the base model
ilab model serve

Custom Model Training

1. Define Your Knowledge

Create taxonomy entries for your domain:

# taxonomy/knowledge/company/policies/qna.yaml
created_by: platform-team
domain: company_policies
seed_examples:
  - question: What is the data retention policy?
    answer: |
      Data must be retained for 7 years for financial records,
      3 years for operational data, and deleted within 30 days
      upon customer request per GDPR Article 17.
  - question: What are the approved cloud regions?
    answer: |
      Production workloads must run in EU-West-1 (Ireland) or
      EU-Central-1 (Frankfurt). US regions require CISO approval.
document:
  repo: https://gitlab.internal/policies
  commit: abc123
  patterns:
    - "*.md"
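A malformed taxonomy entry is the most common cause of a failed generation run, so it can be worth sanity-checking the shape of each entry first. Below is a minimal sketch of such a check; it only mirrors the fields used in the example above and is not the official InstructLab schema (`ilab` performs its own validation during generation):

```python
# Minimal sanity check for a qna.yaml-style taxonomy entry.
# NOTE: mirrors the fields in the example above; not the official
# InstructLab schema -- ilab validates authoritatively at generate time.

def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in a taxonomy entry dict."""
    problems = []
    for key in ("created_by", "domain", "seed_examples", "document"):
        if key not in entry:
            problems.append(f"missing top-level key: {key}")
    for i, ex in enumerate(entry.get("seed_examples", [])):
        for field in ("question", "answer"):
            if not ex.get(field, "").strip():
                problems.append(f"seed_examples[{i}]: empty {field}")
    return problems

entry = {
    "created_by": "platform-team",
    "domain": "company_policies",
    "seed_examples": [
        {"question": "What is the data retention policy?",
         "answer": "7 years for financial records..."},
        {"question": "What are the approved cloud regions?",
         "answer": ""},  # deliberately broken to show detection
    ],
    "document": {"repo": "https://gitlab.internal/policies"},
}

print(validate_entry(entry))  # → ['seed_examples[1]: empty answer']
```

Running this before `ilab data generate` catches empty answers and missing keys early, instead of partway through a long synthetic-data run.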

2. Generate Synthetic Training Data

ilab data generate \
  --taxonomy-path ./taxonomy \
  --num-instructions 1000 \
  --model granite-7b-lab

InstructLab generates diverse question-answer pairs from your seed examples, multiplying roughly 10 examples into 1,000+ training samples.

3. Train

ilab model train \
  --model-path models/granite-7b-lab \
  --data-path generated_data \
  --num-epochs 5 \
  --effective-batch-size 16 \
  --device cuda

4. Evaluate and Deploy

# Test the model
ilab model evaluate --model models/granite-7b-trained

# Serve in production
ilab model serve \
  --model-path models/granite-7b-trained \
  --host 0.0.0.0 \
  --port 8000
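The served model speaks an OpenAI-compatible API, so any standard client can query it. Here is a stdlib-only sketch; the base URL matches the `--host`/`--port` values above, and the model name `granite-7b-trained` is an assumption that should match whatever name the server registers:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # matches the serve command above

def build_payload(prompt: str, model: str = "granite-7b-trained") -> dict:
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # keep policy answers close to the training data
    }

def extract_answer(response: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response."""
    return response["choices"][0]["message"]["content"]

def ask(prompt: str) -> str:
    """POST to the live endpoint; requires `ilab model serve` to be running."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_answer(json.load(resp))

# With the server running:
#   print(ask("What is the data retention policy?"))
```

Because the API shape is OpenAI-compatible, existing tooling (the `openai` Python client, LangChain, etc.) can point at this endpoint by overriding the base URL.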

Production Deployment on Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-llm
  namespace: ai-platform
spec:
  replicas: 2
  selector:
    matchLabels:
      app: private-llm
  template:
    metadata:
      labels:
        app: private-llm
    spec:
      containers:
      - name: vllm
        image: registry.internal/vllm:latest
        args:
        - "--model=/models/granite-7b-company"
        - "--max-model-len=8192"
        resources:
          limits:
            nvidia.com/gpu: "1"
        volumeMounts:
        - name: models
          mountPath: /models
      volumes:
      - name: models
        persistentVolumeClaim:
          claimName: model-storage

Why Private AI?

  • Data sovereignty: Your data never leaves your infrastructure
  • Compliance: Meet GDPR, HIPAA, SOX requirements by design
  • Customization: Models fine-tuned on your domain knowledge can outperform general-purpose models on your tasks
  • Cost predictability: No per-token API costs, just infrastructure
  • Availability: No dependency on external API providers
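To make the cost-predictability point concrete, here is a back-of-the-envelope comparison. Every figure below is an illustrative assumption, not vendor pricing; the point is that API spend scales with token volume while self-hosted spend stays flat:

```python
# Back-of-the-envelope monthly cost comparison.
# All figures are illustrative assumptions, not vendor pricing.

tokens_per_month = 500_000_000        # assumed workload: 500M tokens/month
api_price_per_1k_tokens = 0.01        # assumed blended API price (USD)
gpu_node_monthly_cost = 3_000.0       # assumed one self-hosted GPU node (USD)

api_cost = tokens_per_month / 1_000 * api_price_per_1k_tokens
private_cost = gpu_node_monthly_cost  # flat, regardless of token volume

print(f"API cost:     ${api_cost:,.0f}/month")      # scales with usage
print(f"Private cost: ${private_cost:,.0f}/month")  # flat
```

At these assumed numbers the API bill is $5,000/month and grows linearly with traffic, while the private node stays at $3,000/month; double the token volume and only the API column changes.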

Building a private AI cloud? I help organizations deploy RHEL AI and InstructLab for custom AI platforms. Get in touch.


Luca Berton

AI & Cloud Advisor with 18+ years experience. Author of 8 technical books, creator of Ansible Pilot, and instructor at CopyPasteLearn Academy. Speaker at KubeCon EU & Red Hat Summit 2026.
