
Securing AI Workloads: Container Isolation for LLM Inference

Protect AI inference workloads with container security best practices. SELinux, seccomp profiles, read-only filesystems, and GPU isolation strategies.

Luca Berton
· 2 min read

AI workloads present unique security challenges. GPU passthrough breaks container isolation assumptions. Model weights are valuable intellectual property. Training data may contain sensitive information. Standard container security is necessary but not sufficient.

GPU Security Challenges

When you pass a GPU into a container, you are granting direct hardware access. This bypasses many of the isolation guarantees that containers provide:

  • Shared GPU memory: without proper isolation, one container could read another's GPU memory
  • Driver vulnerabilities: GPU drivers run in kernel space with full system access
  • Side-channel attacks: GPU cache-timing attacks can leak information across containers

Isolation Strategies

NVIDIA MIG (Multi-Instance GPU): hardware-level GPU partitioning. Each partition gets dedicated memory and compute. This is the strongest isolation for multi-tenant GPU sharing.

# Kubernetes pod requesting a MIG partition
resources:
  limits:
    nvidia.com/mig-3g.40gb: 1
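
Before a pod can request a slice like the one above, the partitions must exist on the node. A host-side sketch using nvidia-smi, assuming an A100 80GB where the 3g.40gb profile is available (the set of profiles varies by GPU model):

```shell
# Enable MIG mode on GPU 0 (may require stopping GPU clients or a reset)
sudo nvidia-smi -i 0 -mig 1

# List the MIG profiles this GPU supports
sudo nvidia-smi mig -lgip

# Create a 3g.40gb GPU instance plus its default compute instance (-C)
sudo nvidia-smi mig -cgi 3g.40gb -C

# Confirm the new MIG devices are enumerated
nvidia-smi -L
```

On Kubernetes, the NVIDIA GPU Operator's mig-manager normally automates this partitioning from a node label, so manual nvidia-smi calls are mostly useful for bare-metal testing.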

gVisor with GPU support: Google's application kernel provides stronger syscall filtering. Limited GPU support is available but adds latency.
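
In Kubernetes, gVisor is selected per pod through a RuntimeClass. A minimal sketch, assuming runsc is installed on the nodes and registered with containerd (the handler name and image are assumptions):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc                 # must match the runtime configured in containerd
---
apiVersion: v1
kind: Pod
metadata:
  name: inference-sandboxed
spec:
  runtimeClassName: gvisor     # run this pod under the gVisor application kernel
  containers:
    - name: inference
      image: registry.example.com/inference:latest
```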

Confidential containers: run AI workloads in hardware-encrypted enclaves using AMD SEV-SNP or Intel TDX. The host cannot inspect container memory, even with root access.
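
Confidential Containers are also selected via a RuntimeClass. A sketch, assuming the Confidential Containers operator is installed and exposes a kata-qemu-snp class for AMD SEV-SNP (class names vary by installation; a TDX cluster would use a different handler):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-confidential
spec:
  runtimeClassName: kata-qemu-snp   # one hardware-encrypted VM per pod (SEV-SNP)
  containers:
    - name: inference
      image: registry.example.com/inference:latest
```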

Protecting Model Weights

Model weights are your competitive advantage. Protect them:

# Pod-level: mount model weights read-only from encrypted storage
volumes:
  - name: model-weights
    persistentVolumeClaim:
      claimName: encrypted-models
      readOnly: true

# Container-level: lock down the runtime
securityContext:
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]

Use Kubernetes secrets management with Vault or External Secrets Operator for API keys and model access tokens.
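ce
With the External Secrets Operator, a token can be synced from Vault into a regular Kubernetes Secret. A sketch, assuming the operator's v1beta1 API, a SecretStore named vault-backend, and a Vault KV path of ai/inference (all three are hypothetical names):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: model-api-token
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # hypothetical SecretStore pointing at Vault
    kind: SecretStore
  target:
    name: model-api-token        # Kubernetes Secret created by the operator
  data:
    - secretKey: token
      remoteRef:
        key: ai/inference        # hypothetical Vault KV path
        property: api-token
```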

Network Policies for AI Workloads

AI inference services should have strict network boundaries:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: inference-isolation
spec:
  podSelector:
    matchLabels:
      app: inference-server
  policyTypes: ["Ingress", "Egress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: api-gateway
      ports:
        - port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              role: model-store

No internet access for inference pods. No lateral movement. Only the API gateway can reach them.

Supply Chain Security for AI

AI supply chains include model registries, training data pipelines, and framework dependencies. Apply SLSA and Sigstore practices to:

  • Sign model artifacts with Cosign
  • Verify model provenance before deployment
  • Scan container images for vulnerabilities
  • Use SBOM generation for compliance
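
The first two bullets can be sketched with the Cosign CLI, assuming the model has been packaged and pushed as an OCI artifact at a hypothetical registry path:

```shell
# Generate a signing key pair (writes cosign.key / cosign.pub)
cosign generate-key-pair

# Sign the model artifact in the registry
cosign sign --key cosign.key registry.example.com/models/llm-weights:v1

# Verify the signature before admitting the model into production
cosign verify --key cosign.pub registry.example.com/models/llm-weights:v1
```

In-cluster, the same verification can be enforced automatically at admission time with a policy engine instead of by hand.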

Runtime Monitoring

Monitor AI workloads for security anomalies using eBPF-based tools:

  • Unexpected network connections from inference pods
  • File system writes in read-only containers
  • Unusual GPU utilization patterns (potential cryptomining)
  • Model endpoint scanning or prompt injection attempts
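
With an eBPF-based engine such as Falco, the first two anomalies above can be expressed as rules. A sketch (the rule names, the inference-server image match, and the model-store IP list are assumptions, not defaults):

```yaml
- list: model_store_ips
  items: ["10.0.0.5"]            # placeholder IP of the model-store service

- rule: Unexpected outbound connection from inference pod
  desc: Inference containers should only reach the model store
  condition: >
    outbound
    and container.image.repository endswith "inference-server"
    and not fd.sip in (model_store_ips)
  output: Unexpected egress from inference pod (dest=%fd.name)
  priority: WARNING

- rule: Write inside read-only inference container
  desc: A write suggests escape from the read-only root filesystem
  condition: >
    open_write
    and container.image.repository endswith "inference-server"
  output: File opened for writing in read-only container (file=%fd.name)
  priority: ERROR
```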

AI security is an evolving field. The fundamentals of least privilege, defense in depth, and monitoring still apply. The specifics around GPU isolation and model protection require specialized attention.
