If you run GPU training, HPC, or latency-sensitive workloads on OpenShift, the default pod networking model (overlay + kernel networking path) can become the bottleneck. Two technologies help you get closer to bare-metal behavior inside pods:
When you combine them and manage the host stack with the NVIDIA Network Operator, you get a repeatable, Kubernetes-native way to unlock a high-performance data plane for distributed AI and HPC.
SR-IOV lets one physical NIC port (the physical function, or PF) present multiple virtual functions (VFs). Each VF behaves like its own NIC function and can be attached to a pod via Multus as a secondary interface.
Why that matters:
Think of SR-IOV as "give the pod a real NIC personality."
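To make that concrete, here is a hedged sketch of how VFs are typically carved out on OpenShift with an SriovNetworkNodePolicy. The policy name, node selector label, PF name, and VF count below are placeholder assumptions; the resulting resource is normally exposed on matching nodes as openshift.io/<resourceName>.

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovNetworkNodePolicy
    metadata:
      name: policy-fast-vfs                 # hypothetical policy name
      namespace: openshift-sriov-network-operator
    spec:
      resourceName: sriov_vf                # surfaces as openshift.io/sriov_vf on matching nodes
      nodeSelector:
        feature.node.kubernetes.io/network-sriov.capable: "true"   # placeholder node label
      numVfs: 8                             # how many VFs to create on the selected PF
      nicSelector:
        pfNames: ["ens1f0"]                 # placeholder PF name; match your NIC port
      deviceType: netdevice                 # kernel netdevice VFs (vs. vfio-pci for DPDK-style use)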
RDMA changes the rules of networking by enabling direct memory access semantics for data movement. In practice, it can deliver:
- fewer data copies between application memory and the NIC
- lower latency and jitter
- substantially lower CPU overhead for moving data
That CPU savings is a big deal for GPU clusters: if the node burns CPU on networking, you pay twice, with slower training/inference and fewer CPU cycles left for input pipelines.
RDMA is commonly used with:
- RoCE (RDMA over Converged Ethernet)
- InfiniBand
Using RDMA on top of an SR-IOV VF is a popular design because you get both: a dedicated, isolated NIC function for the pod and RDMA's low-overhead data path running over it.
This is especially valuable for distributed AI training, HPC, and other latency-sensitive workloads.
In short: SR-IOV gives you the lane; RDMA makes the lane extremely fast.
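If you are using the OpenShift SR-IOV Network Operator, combining the two usually comes down to marking the VF pool as RDMA-capable in the node policy sketched above; isRdma is the operator's field for this, and the rest of the policy stays the same:

    spec:
      deviceType: netdevice
      isRdma: true          # VFs from this pool are exposed together with their RDMA devices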
The NVIDIA Network Operator is the "make it boring" part of this story.
Instead of manually installing and maintaining the networking stack across nodes (drivers, RDMA components, device plugins, and related configuration), the operator helps you manage it declaratively and consistently at cluster scale.
In real-world operations, that translates to:
You still need the OpenShift networking pieces (like Multus and the SR-IOV Network Operator), but NVIDIA's operator handles the NVIDIA/Mellanox-focused RDMA enablement and exposure.
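For orientation, the operator is driven by a NicClusterPolicy custom resource; a deliberately minimal sketch follows. The image name and version are placeholders, you would take the real values from NVIDIA's documentation for your OpenShift release:

    apiVersion: mellanox.com/v1alpha1
    kind: NicClusterPolicy
    metadata:
      name: nic-cluster-policy
    spec:
      ofedDriver:                       # containerized driver stack for the NVIDIA NICs
        image: doca-driver              # placeholder image name
        repository: nvcr.io/nvidia/mellanox
        version: "<driver-version>"     # placeholder; pick a version supported for your cluster

The same CR is where you would enable additional components (for example an RDMA device plugin or secondary-network helpers) if your design needs them.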
A common pattern looks like this:
- Cluster network (default CNI): used for normal pod-to-service traffic, API calls, image pulls, etc.
- High-performance secondary network (Multus): a NetworkAttachmentDefinition (NAD) attaches an SR-IOV VF to pods (a sketch follows below).
- RDMA enabled on the VF: pods that request the VF can use RDMA-capable libraries (depending on your stack and workload).
This keeps Kubernetes networking sane while providing a dedicated fast path for the workloads that need it.
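On OpenShift, the NAD is usually generated for you from a SriovNetwork object owned by the SR-IOV operator. A hedged sketch, where the network name, target namespace, and IPAM range are placeholder assumptions:

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovNetwork
    metadata:
      name: sriov-net                           # becomes the NAD name that pods reference
      namespace: openshift-sriov-network-operator
    spec:
      resourceName: sriov_vf                    # must match the node policy's resourceName
      networkNamespace: rdma-apps               # hypothetical namespace where the NAD is created
      ipam: |
        {
          "type": "whereabouts",
          "range": "192.168.100.0/24"
        }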
Before you start, verify these basics:
You usually express the "plumbing" in two layers:
Create and manage: the cluster-scoped plumbing (SR-IOV node policies, the SriovNetwork/NAD, and the operator configuration).
Request: what each workload asks for (the network annotation plus the VF resource on the pod spec).
Here's an intentionally simplified sketch of what a workload request often resembles:
    apiVersion: v1
    kind: Pod
    metadata:
      name: rdma-workload
      annotations:
        k8s.v1.cni.cncf.io/networks: sriov-net
    spec:
      containers:
      - name: app
        image: your-image
        resources:
          limits:
            example.com/sriov_vf: "1"

Your actual resource name and NAD will differ, but the pattern is the same: "attach network + request VF."
If you want SR-IOV/RDMA to pay off, focus on the things that usually dominate results:
These are the issues that tend to burn time:
- VF not appearing on the node: check BIOS SR-IOV + IOMMU, NIC firmware settings, and the SR-IOV policy reconciliation.
- Pod schedules but no RDMA capability inside: confirm you're attaching the right VF type and that the RDMA device plugin / components are present on the node.
- Performance is "meh": start with NUMA topology and CPU pinning (see the sketch after this list). Then validate MTU and (for RoCE) lossless fabric settings.
- Mixing modes on the same ports: in many designs you'll dedicate specific NIC ports to a given mode (SR-IOV/RDMA vs. general networking) to keep behavior predictable.
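For the NUMA/CPU-pinning point above, one concrete lever is giving the pod a Guaranteed QoS class with whole CPUs, so that a node configured with the static CPU manager and Topology Manager can align CPUs, the VF, and hugepages on the same NUMA node. A sketch of the container's resources stanza; the CPU/memory sizes and resource names are placeholders:

    resources:
      requests:
        cpu: "8"                      # whole CPUs so the static CPU manager can pin them
        memory: 32Gi
        hugepages-1Gi: 16Gi           # only if the workload actually uses hugepages
        example.com/sriov_vf: "1"
      limits:
        cpu: "8"                      # requests == limits -> Guaranteed QoS class
        memory: 32Gi
        hugepages-1Gi: 16Gi
        example.com/sriov_vf: "1"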
Use SR-IOV + RDMA when you need:
Consider alternatives when:
SR-IOV and RDMA are about giving the right workloads a fast lane: fewer copies, lower jitter, lower CPU overhead, and better scaling under real distributed traffic. The NVIDIA Network Operator helps you operationalize this at scale, so the cluster stays manageable while your GPUs spend more time doing actual compute instead of waiting on the network.