A Dell PowerEdge host that hangs during POST with P2PBridge.SL.x on screen looks like it is naming the failed part. It usually is not. Here is how to read that FQDD correctly and isolate the actual fault instead of chasing the wrong component.
What P2PBridge.SL.x Actually Is
P2PBridge.SL.x is an iDRAC Fully Qualified Device Descriptor (FQDD) — Dell’s naming scheme for addressing individual components in inventory, logs, and RACADM output. It identifies a PCI-to-PCI bridge or an internal PCIe-switch path, the same way BIOS.Embedded.1-1 identifies the system BIOS and iDRAC.Embedded.1-1 identifies the iDRAC controller itself.
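As a rough illustration of how these names break down, the sketch below splits an FQDD into its dot-separated parts. The field names (`device_type`, `location`, `instance`) are labels chosen for this example, not Dell terminology, and the three-part layout is an assumption based only on the examples quoted above.

```python
from typing import NamedTuple


class Fqdd(NamedTuple):
    device_type: str  # e.g. "P2PBridge", "BIOS", "iDRAC"
    location: str     # e.g. "SL", "Embedded"
    instance: str     # e.g. "2", "1-1"


def parse_fqdd(fqdd: str) -> Fqdd:
    """Split an FQDD such as 'P2PBridge.SL.2' into three labelled parts.

    Assumes the three-part, dot-separated layout seen in this article's
    examples; any other shape raises ValueError instead of guessing.
    """
    parts = fqdd.strip().split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected FQDD layout: {fqdd!r}")
    return Fqdd(*parts)


print(parse_fqdd("P2PBridge.SL.2"))
print(parse_fqdd("iDRAC.Embedded.1-1"))
```

Keeping the instance as a string is deliberate, since values such as 1-1 are not plain integers.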
The important distinction: an FQDD names an address in the PCIe topology, not necessarily the component that is broken. A host stuck at this entry during POST is almost always stuck during PCIe enumeration or link training, and the actual fault can sit anywhere along that path:
- BOSS controller or one of its M.2 drives
- BOSS connector or cable
- PCIe riser or a downstream card
- Internal PCIe switch
- CPU PCIe root port
- System board
Dell’s own error-code matrix associates UEFI0067 with PCIe link-training failure and PCI1318 with a fatal PCIe bus/device/function error. Search for both codes alongside the FQDD when you pull logs.
Multiple Switch Domains: SL.1 Through SL.4
On platforms with several internal PCIe switches, iDRAC inventory typically lists them as P2PBridge.SL.1 through P2PBridge.SL.4. These represent four separate internal PCIe switch/bridge domains — not Ethernet switches, and not four instances of the same fault. During POST, BIOS enumerates all four in turn along with whatever is attached downstream of each: GPUs, NICs, the BOSS controller, risers, cables, or backplane connections.
Which pattern you see across the four domains changes the diagnosis:
- Only one SL switch is missing or reporting errors — the fault is most likely downstream of that specific switch. Focus on that domain’s cabling and attached devices, not the others.
- A switch enumerates fine but POST hangs immediately after — this points to a downstream endpoint failing PCIe link training, not the switch itself.
- All four switches disappear or fail together — suspect a component common to all of them: BIOS, the CPU’s PCIe root complex, the motherboard, power delivery, or the PCIe-switch board itself.
- BOSS is missing from inventory: do not jump straight to replacing the BOSS controller. Trace which SL.x domain BOSS sits behind first; a switch-side fault can make BOSS disappear without BOSS being at fault.
- Only P2PBridge.SL.2 is missing, and specifically on a PowerEdge XE9680: see the inventory-reporting caveat below before treating it as a hardware fault.
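The pattern list above can be sketched as a small lookup, which is handy when triaging several hosts at once. This is an illustration only: the function name, its parameters, and the hint wording are invented for this example, and the set of present domains is something you would fill in by hand from the inventory output.

```python
def classify_pattern(present, boss_present=True, model="", expected=(1, 2, 3, 4)):
    """Map an observed SL.x inventory pattern to the next thing to check.

    present: iterable of SL domain numbers that appear in inventory.
    All names and hint strings here are hypothetical, not a Dell API.
    """
    expected_set = set(expected)
    missing = sorted(expected_set - set(present))
    hints = []
    if not missing:
        hints.append("All domains enumerate: if POST still hangs, suspect a "
                     "downstream endpoint failing link training")
    elif missing == [2] and "XE9680" in model.upper():
        hints.append("Only SL.2 missing on XE9680: check the inventory-reporting "
                     "caveat before assuming a hardware fault")
    elif len(missing) == len(expected_set):
        hints.append("All domains missing: suspect a shared component (BIOS, CPU "
                     "root complex, motherboard, power delivery, switch board)")
    elif len(missing) == 1:
        hints.append(f"Only SL.{missing[0]} missing: focus on that domain's "
                     "cabling and attached devices")
    else:
        hints.append("Several but not all domains missing: not covered by the "
                     "list above, isolate each domain separately")
    if not boss_present:
        hints.append("BOSS missing: trace which SL.x domain it sits behind "
                     "before replacing the controller")
    return hints


for hint in classify_pattern({1, 3, 4}, boss_present=False, model="PowerEdge XE9680"):
    print(hint)
```

The partial-failure branch is deliberately non-committal, because the list above does not say what two or three missing domains imply.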
Isolation Sequence
Work through this in order. Each step is designed to narrow the fault domain before you start pulling hardware.
1. Preserve the evidence before clearing anything
racadm getsel
racadm lclog view
racadm hwinventory
Search the output for P2PBridge.SL, BOSS, PCI1318, UEFI0056, UEFI0066, and UEFI0067. RACADM’s Lifecycle log view and hardware inventory export are what let you correlate the FQDD to a specific bus/device/function before the state gets cleared by a reboot.
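If you have saved the command output to text files, the search can be scripted. The sketch below is a minimal example, assuming plain-text output; the sample text is made up for illustration and does not reproduce the real RACADM log layout.

```python
SEARCH_TERMS = ("P2PBridge.SL", "BOSS", "PCI1318", "UEFI0056", "UEFI0066", "UEFI0067")


def find_matches(text, terms=SEARCH_TERMS):
    """Return (line_number, line) pairs for lines mentioning any search term."""
    lowered_terms = [term.lower() for term in terms]
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered_line = line.lower()
        if any(term in lowered_line for term in lowered_terms):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample text, NOT real RACADM output.
sample = """\
Message ID = (unrelated entry)
Message    = Something else happened
Message ID = PCI1318
FQDD       = P2PBridge.SL.2
"""

for number, line in find_matches(sample):
    print(f"{number}: {line}")
```

Matching is case-insensitive and keeps line numbers so that each hit can be read in context in the saved file.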
2. Reset iDRAC, then do a real cold power cycle
Reset iDRAC first (racadm racreset), then power-cycle the host. A warm reboot is not enough, because PCIe switch and link state can persist across it. Shut the host down, remove both power feeds, hold the power button for roughly 15 seconds to drain residual power, reconnect, and boot. This is the step that most often resolves a transient link-training failure with no hardware fault behind it.
3. Isolate BOSS
Power off completely, reseat the BOSS module/controller and its M.2 drives, then retry POST. If the hang persists, temporarily remove or disable BOSS and boot from alternate media to rule it out entirely. Dell’s BOSS-S1 guidance specifically calls out updating BOSS firmware, reseating the adapter, and replacing it if it stays absent from UEFI after that.
4. Minimum-to-POST, one switch domain at a time
Remove every nonessential PCIe card and riser, boot, then add components back one at a time until the hang reappears — but do it by SL.x domain rather than by slot number if the platform exposes multiple internal switches. Disconnect the downstream devices behind one switch domain, test, then move to the next. Test without BOSS, GPUs, optional NICs, and removable risers where the platform supports it. Check the installation and service manual for your exact chassis first — PCIe slot numbering, riser assignment, and slot-priority rules are tied to which CPU socket each slot is wired to, and that mapping varies by model.
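To keep the add-back order straight, it can help to write the plan down before touching hardware. The sketch below generates such a plan from a domain-to-device mapping. The mapping shown is entirely hypothetical; the real one has to come from the installation and service manual for your chassis.

```python
def build_test_plan(domain_devices):
    """Return POST test steps: a bare baseline, then one domain re-added per step.

    domain_devices: dict mapping a domain label to the devices behind it.
    """
    steps = ["Baseline: all nonessential PCIe devices removed, cold boot"]
    installed = []
    for domain in sorted(domain_devices):
        devices = domain_devices[domain]
        installed.extend(devices)
        steps.append(
            f"Re-add {domain} devices ({', '.join(devices)}), cold boot; "
            f"installed so far: {', '.join(installed)}"
        )
    return steps


# Hypothetical mapping for illustration only.
example = {
    "SL.1": ["GPU riser A"],
    "SL.2": ["BOSS", "NIC 1"],
}

for index, step in enumerate(build_test_plan(example)):
    print(f"{index}. {step}")
```

The first step at which the hang reappears names the domain to investigate further.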
5. Update the matched firmware stack
Pull the firmware versions Dell has validated together for the exact service tag, not just “latest”:
- iDRAC / Lifecycle Controller
- BIOS
- BOSS firmware
- CPLD / backplane
- Internal PCIe-switch firmware, when listed for the platform
- Firmware for any downstream GPU, NIC, or storage card in the affected riser
Firmware mismatches between BIOS and an internal PCIe switch are a common and easily overlooked cause of link-training failures that look like a hardware fault.
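Checking the stack against a validated baseline amounts to an exact comparison per component. The sketch below assumes you have typed both sets of versions into dictionaries by hand; the component names and version strings are placeholders, not real Dell releases.

```python
def find_mismatches(installed, validated):
    """Return {component: (installed_version, validated_version)} for each difference.

    Compares for exact equality rather than "newer than", because the goal
    is the combination validated together, not the latest release.
    """
    mismatches = {}
    for component, wanted in validated.items():
        have = installed.get(component)  # None means the component was not reported
        if have != wanted:
            mismatches[component] = (have, wanted)
    return mismatches


# Placeholder data for illustration only.
validated = {"BIOS": "1.0.0", "iDRAC": "2.0.0", "PCIe switch": "3.0.0"}
installed = {"BIOS": "1.0.0", "iDRAC": "2.1.0"}

for component, (have, wanted) in find_mismatches(installed, validated).items():
    print(f"{component}: installed {have}, validated {wanted}")
```

A component that is missing from the installed set is reported too, which matters here because a switch that fails to enumerate may not report a firmware version at all.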
6. Hardware conclusion
If the server still hangs at P2PBridge.SL.x with BOSS removed and every nonessential PCIe device disconnected, the fault has moved out of the removable-component category. If it is consistently one specific SL.x domain that keeps failing even with its downstream devices disconnected, that isolates the fault to the corresponding switch board, midplane segment, or riser rather than the system as a whole — replace that component before considering a full system-board swap.
Special Case: PowerEdge XE9680
Dell documents a specific XE9680 behavior where P2PBridge.SL.2 can disappear from the firmware inventory after an iDRAC update, and a cold boot or iDRAC reboot restores the entry. That is an inventory-reporting artifact, not evidence of a hardware failure — do not let a missing inventory entry on this platform send you down the hardware-replacement path on its own. Confirm against an actual POST hang and the matching PCI1318/UEFI00xx event before treating it as a fault.
What You Need Before Escalating to Dell
If the sequence above does not resolve it, the exact server model plus the complete PCI1318 or UEFI00xx event — including the bus, device, and function fields, not just the error code — is what Dell support needs to map P2PBridge.SL.x to the physical BOSS, riser, switch, or system-board path. Pulling that detail with racadm lclog view before you open the ticket will save at least one round trip.
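Pulling the bus, device, and function fields out of a saved event can be scripted with a regular expression. The sketch below is a minimal example; the wording of the sample message is an assumption, so adjust the pattern to whatever your log actually shows.

```python
import re

# Assumed wording: "... bus <n> device <n> function <n>"; adapt if your log differs.
BDF_PATTERN = re.compile(
    r"bus\s+(\w+)\s+device\s+(\w+)\s+function\s+(\w+)", re.IGNORECASE
)


def extract_bdf(message):
    """Return (bus, device, function) as strings, or None if not present."""
    match = BDF_PATTERN.search(message)
    return match.groups() if match else None


# Hypothetical message text for illustration.
sample_message = "A fatal error was detected on a component at bus 94 device 0 function 0."

print(extract_bdf(sample_message))
```

The values are kept as strings because the log may print them in either decimal or hexadecimal, and the ticket should quote them exactly as logged.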
References
- Dell Lifecycle Controller: Easy-to-Use System Component Names
- Dell PowerEdge 15G Error Code Matrix
- iDRAC9 RACADM CLI Reference: lclog
- iDRAC9 RACADM CLI Reference: hwinventory
- Dell EMC BOSS-S1 User’s Guide
- PowerEdge: Troubleshooting PCIe Device Detection Issues
- iDRAC9 7.20.30.50 Release Notes: Monitoring and Alerting
- iDRAC9 7.20.80.50 Release Notes: Monitoring and Alerting
About the Author
I am Luca Berton, AI and Cloud Advisor. I work at the intersection of enterprise infrastructure, platform engineering, and AI deployments.
