Description
I’m trying to bring up an Alveo U200 on a Dell PowerEdge R750 running AlmaLinux 9.6 (kernel 5.14.0-570.39.1.el9_6.x86_64). XRT 2025.1 (2.19.194) installs fine and the xocl/xclmgmt modules load, but the PCIe bus only ever shows the JTAG/UART function (Device ID 903f). The expected accelerator functions (Device IDs 5004/5005) never appear, so XRT can’t detect the card.
Environment
- Server: Dell PowerEdge R750, BIOS 1.14.1, both CPU sockets populated, PSU 1400 W
- OS: AlmaLinux 9.6 (GLIBC 2.34)
- Kernel: 5.14.0-570.39.1.el9_6.x86_64
- XRT: 2.19.194 (branch 2025.1), installed from RPM; `/opt/xilinx/xrt/bin/xbmgmt --version` matches
Key commands and outputs
- `sudo lspci -vd 10ee:`

      b1:00.0 Serial controller: Xilinx Corporation Device 903f (prog-if 01 [16450])
          Subsystem: Xilinx Corporation Device 0007
          Flags: fast devsel, NUMA node 1
          Memory at dbc10000 (32-bit, non-prefetchable) [size=4K]
          Memory at dbc00000 (64-bit, non-prefetchable) [size=64K]
          Capabilities: [40] Power Management version 3
          Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
          Capabilities: [70] Express Endpoint, MSI 00
          Capabilities: [100] Advanced Error Reporting
          Capabilities: [1c0] Secondary PCI Express

- `sudo lspci -vvv -s b1:00.0`

      b1:00.0 Serial controller: Xilinx Corporation Device 903f (prog-if 01 [16450])
          Subsystem: Xilinx Corporation Device 0007
          Control: I/O+ Mem+ BusMaster-
          Status: Cap+ … INTx-
          NUMA node: 1
          Region 0: Memory at dbc10000 (32-bit, non-prefetchable) [size=4K]
          Region 2: Memory at dbc00000 (64-bit, non-prefetchable) [size=64K]
          …
          LnkSta: Speed 8GT/s (ok), Width x8 (downgraded)
          …

- `sudo /opt/xilinx/xrt/bin/xbmgmt examine`

      System Configuration
        OS Name      : Linux
        Release      : 5.14.0-570.39.1.el9_6.x86_64
        Machine      : x86_64
        CPU Cores    : 64
        Memory       : 514906 MB
        Distribution : AlmaLinux 9.6 (Sage Margay)
        GLIBC        : 2.34
        Model        : PowerEdge R750
        BIOS Vendor  : Dell Inc.
        BIOS Version : 1.14.1

      XRT
        Version      : 2.19.194
        Branch       : 2025.1
        Hash         : 7d8151e6ee73c6ec2e99501a58c9c2eca6cc68ce
        Hash Date    : 2025-05-18 11:45:44
        xocl         : 2.19.194, 7d8151e6ee73c6ec2e99501a58c9c2eca6cc68ce
        xclmgmt      : 2.19.194, 7d8151e6ee73c6ec2e99501a58c9c2eca6cc68ce

      Device(s) Present
        0 devices found

- `sudo lsmod | grep -E "xocl|xclmgmt"`

      xocl     2609152  0
      xclmgmt  1417216  0
Current symptoms
- `lspci` never shows Device ID 5004/5005, only 903f.
- BusMaster remains disabled.
- `xbmgmt examine` reports “0 devices found”; no `/dev/xclmgmt*` nodes.
- No driver is bound under `/sys/bus/pci/devices/0000:b1:00.0/driver`.
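The symptoms above can also be cross-checked straight from sysfs, independent of `lspci` and XRT. Below is a minimal sketch, assuming the standard Linux sysfs PCI layout (`vendor`, `device`, `config`, and the `driver` symlink under each device directory). The function name `pci_function_state` and its `sysfs_root` parameter are illustrative only and are not part of XRT.

```python
import os
import struct


def pci_function_state(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Summarize one PCI function: IDs, bound driver, Bus Master Enable bit.

    `bdf` is the full address with domain, e.g. "0000:b1:00.0".
    `sysfs_root` is only a parameter so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    dev_dir = os.path.join(sysfs_root, bdf)

    def read_id(name):
        # sysfs exposes IDs as text such as "0x10ee\n"
        with open(os.path.join(dev_dir, name)) as f:
            return int(f.read().strip(), 16)

    # The PCI Command register is the 16-bit little-endian word at
    # config-space offset 0x04; bit 2 is Bus Master Enable.
    with open(os.path.join(dev_dir, "config"), "rb") as f:
        header = f.read(6)
    (command,) = struct.unpack_from("<H", header, 4)

    # When a driver is bound, "driver" is a symlink whose last path
    # component is the driver name; otherwise the link is absent.
    driver_link = os.path.join(dev_dir, "driver")
    driver = None
    if os.path.islink(driver_link):
        driver = os.path.basename(os.readlink(driver_link))

    return {
        "vendor": read_id("vendor"),
        "device": read_id("device"),
        "driver": driver,
        "bus_master": bool(command & 0x0004),
    }
```

Calling `pci_function_state("0000:b1:00.0")` on the affected host should, if the `lspci` output above is accurate, report vendor `0x10ee`, device `0x903f`, no driver, and `bus_master` false.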
Troubleshooting done
- Installed SC/CMC/base/validate RPMs for U200 (`xilinx-u200-gen3x16-xdma-*`).
- Rebuilt DKMS modules (`sudo dkms remove/install xrt/2.19.194`).
- Verified `xocl`/`xclmgmt` load (`modprobe`).
- Cold booted (shutdown, power cords removed for several minutes, power reapplied).
- Reseated U200 card, reattached 8‑pin AUX power (checked firmly seated).
- No USB cable attached to the card.
- Tried `xbmgmt program --base --device <BDF>`, but it fails because the card isn’t detected beyond 903f.
Suspicions / Questions
- Could AUX power still be an issue despite reseating? Does the R750 need a specific power profile for Alveo cards?
- Is the card stuck in golden/safe mode that only exposes the JTAG function, requiring JTAG reprogramming?
- Are there BIOS or PCIe settings (slot bifurcation, hot-plug, etc.) that could prevent BusMaster from enabling?
- Any additional diagnostics recommended (e.g., `lspci -xxxx`, forcing golden image reload, different XRT version, different OS)?
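On the PCIe-settings question, the `Width x8 (downgraded)` link status from `lspci -vvv` can be read back after each BIOS or slot change without parsing `lspci` output. This is a rough sketch, assuming the kernel exposes the `current_link_speed`, `current_link_width`, `max_link_speed`, and `max_link_width` attributes for the device in sysfs. The function name `pcie_link_state` and the `sysfs_root` parameter are hypothetical helpers, not part of XRT. Note that the comparison is against the maximum the endpoint itself advertises, which says nothing about what the slot is wired for.

```python
import os


def pcie_link_state(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Report negotiated vs. advertised PCIe link parameters for one function.

    `bdf` is the full address with domain, e.g. "0000:b1:00.0".
    """
    dev_dir = os.path.join(sysfs_root, bdf)

    def read_attr(name):
        with open(os.path.join(dev_dir, name)) as f:
            return f.read().strip()

    current_width = int(read_attr("current_link_width"))
    max_width = int(read_attr("max_link_width"))

    return {
        # Speed strings are returned verbatim because their exact
        # formatting differs between kernel versions.
        "current_speed": read_attr("current_link_speed"),
        "max_speed": read_attr("max_link_speed"),
        "current_width": current_width,
        "max_width": max_width,
        # True when the link trained narrower than the endpoint advertises.
        "width_downgraded": current_width < max_width,
    }
```

Running `pcie_link_state("0000:b1:00.0")` before and after changing a bifurcation setting or moving the card to another slot would show whether the negotiated width changes.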
Has anyone seen an Alveo (U200 or otherwise) that only enumerates as Device 903f with BusMaster disabled? Any pointers on recovering it to the normal 5004/5005 functions would be greatly appreciated.
Thanks!