Everything works until the operator rolls out OFED, and suddenly your Intel card and NVIDIA stack start fighting over the same kernel plumbing. Here's what's actually colliding, why it happens, and how to stop it fast.
This usually happens when NVIDIA Network Operator deploys a containerized OFED/DOCA driver and it collides with an Intel NIC driver stack on the same node (often ice / i40e).
A very common symptom is an auxiliary.ko module conflict (only one auxiliary module can be loaded):
"module auxiliary is in use by: ice" / "duplicate symbol … owned by kernel"
Option 1: if you don't strictly need MOFED/DOCA features on that node, omit spec.ofedDriver entirely from your NicClusterPolicy. IBM explicitly calls out that defining ofedDriver makes the operator create MOFED pods; omitting it skips them and uses OS-provided drivers instead.
Note: If you use host/inbox drivers (no DOCA-OFED), you may need extra host packages (linux-generic on Ubuntu, kernel-modules-extra on RHEL-based) and rdma-core for inbox RDMA.
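The package names in the note can be turned into a small lookup keyed on the `ID` field from /etc/os-release. This is only a sketch: the function name `inbox_packages_for` is hypothetical, and the list of RHEL-based IDs is an assumption that may not cover your distro.

```shell
# Map an os-release ID to the host packages named in the note above.
# Assumption: rhel, centos, rocky and almalinux are treated as "RHEL-based".
inbox_packages_for() {
  case "$1" in
    ubuntu)
      echo "linux-generic rdma-core" ;;
    rhel|centos|rocky|almalinux)
      echo "kernel-modules-extra rdma-core" ;;
    *)
      # Unknown distro: print nothing and signal failure to the caller.
      return 1 ;;
  esac
}

# Usage on a node:
#   . /etc/os-release
#   inbox_packages_for "$ID" || echo "no package mapping for $ID"
```

Whether these packages are actually required depends on the kernel flavor and on which inbox modules your image already ships.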
A NicClusterPolicy without ofedDriver looks like this:

    apiVersion: mellanox.com/v1alpha1
    kind: NicClusterPolicy
    metadata:
      name: nic-cluster-policy
    spec:
      # IMPORTANT: do NOT define ofedDriver here
      sriovDevicePlugin:
        # keep SR-IOV, device plugins, etc.

Option 2: if you must run DOCA-OFED/MOFED, avoid Intel's out-of-tree/DKMS NIC driver package that drops a competing auxiliary.ko.
The conflict NVIDIA users describe is exactly this: Intel's driver stack holds the auxiliary module, and MOFED fails when it tries to unload and reload it.
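To see whether a node is running an out-of-tree build, you can look at where the module file lives. The sketch below assumes the common layout in which in-tree modules sit under /lib/modules/<kernel>/kernel/ while DKMS or vendor packages install under updates/ or extra/. The function name `module_origin` is hypothetical, and the path convention can differ between distros.

```shell
# Classify a kernel module file path as in-tree or out-of-tree.
# Assumption: <kernel>/kernel/ holds distro modules, while
# <kernel>/updates/ and <kernel>/extra/ hold DKMS or vendor builds.
module_origin() {
  mo_path=${1#/usr}                  # accept /usr/lib/modules as well
  case "$mo_path" in
    /lib/modules/*) ;;
    *) echo "unknown"; return 0 ;;
  esac
  mo_rest=${mo_path#/lib/modules/}   # <kernel>/<subdir>/...
  mo_rest=${mo_rest#*/}              # <subdir>/...
  case "${mo_rest%%/*}" in           # first directory below the kernel version
    kernel)        echo "in-tree" ;;
    updates|extra) echo "out-of-tree" ;;
    *)             echo "unknown" ;;
  esac
}

# Usage on a node (modinfo -n prints the module's file name):
#   for m in ice i40e auxiliary; do
#     p=$(modinfo -n "$m" 2>/dev/null) && echo "$m: $(module_origin "$p")"
#   done
```

An "out-of-tree" result for ice, i40e, or auxiliary suggests the node is using the Intel package rather than the distro driver.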
In practice, avoiding the conflict usually means:

- Use the in-tree ice/i40e driver that matches your distro kernel.
- Remove the out-of-tree package that installs a competing auxiliary.ko.

Option 3: in NicClusterPolicy, set selectors so the SR-IOV device plugin only exposes vendor 15b3 (Mellanox/NVIDIA). NVIDIA's own full example shows vendor filtering in the SR-IOV plugin config.
    spec:
      sriovDevicePlugin:
        config: |
          {
            "resourceList": [
              {
                "resourcePrefix": "nvidia.com",
                "resourceName": "hostdev",
                "selectors": {
                  "vendors": ["15b3"]
                }
              }
            ]
          }

Run these commands on the affected node:
    # Check kernel messages for collision indicators
    # (ice is matched as a whole word so that "device" is not a hit)
    dmesg | grep -Ei 'auxiliary|duplicate symbol|openibd|\bice\b|i40e'
    # See which modules are using auxiliary
    lsmod | grep auxiliary
    # Check which path/version of auxiliary is loaded
    modinfo auxiliary

| Approach | Best For |
|---|---|
| Option 1: Omit ofedDriver | Nodes that don't need MOFED/DOCA features |
| Option 2: Remove Intel out-of-tree driver | When DOCA-OFED is required but Intel NICs are present |
| Option 3: Vendor filtering | Mixed environments where you want SR-IOV only on Mellanox devices |
The key insight: the auxiliary.ko kernel module can only be loaded once, and both Intel's out-of-tree driver and NVIDIA's OFED/DOCA stack want to provide their own version. Choose the approach that matches your workload requirements.
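If you go with vendor filtering, it can help to confirm which PCI vendor IDs a node actually carries before writing the selector. The sketch below parses lines in the format printed by `lspci -nn`, where each device ends with a `[vendor:device]` pair; 15b3 is the Mellanox/NVIDIA vendor ID used in the selector above and 8086 is Intel's. Both function names are hypothetical.

```shell
# Extract the PCI vendor ID from one line of `lspci -nn` output.
# The class code (for example [0200]) has no colon, so it is not matched.
pci_vendor_id() {
  printf '%s\n' "$1" |
    sed -n 's/.*\[\([0-9a-f]\{4\}\):[0-9a-f]\{4\}\].*/\1/p'
}

# Succeed only if the device would pass a "vendors": ["15b3"] selector.
matches_mellanox_selector() {
  [ "$(pci_vendor_id "$1")" = "15b3" ]
}

# Usage on a node:
#   lspci -nn | grep -i ethernet | while IFS= read -r line; do
#     if matches_mellanox_selector "$line"; then
#       echo "exposed: $line"
#     else
#       echo "ignored: $line"
#     fi
#   done
```

This only mirrors the vendor match. The real device plugin may apply further selectors, so treat the output as a sanity check rather than a prediction of what will be advertised.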