Fix DLRMv3 fbgemm_gpu loading with hardcoded Bazel paths #2439

Merged
hanyunfan merged 11 commits into mlcommons:master from ssam18:fix-dlrmv3-fbgemm-hardcoded-paths on Mar 30, 2026

Conversation

ssam18 (Contributor) commented Jan 14, 2026

Problem

The DLRMv3 harness was using hardcoded Bazel build paths specific to Meta's internal build system:

torch.ops.load_library("//deeplearning/fbgemm/fbgemm_gpu:sparse_ops")
torch.ops.load_library("//deeplearning/fbgemm/fbgemm_gpu:sparse_ops_cpu")

This caused failures when running outside that environment:

FAILED to load sparse_ops_cpu in position: Could not load this library: /deeplearning/fbgemm/fbgemm_gpu:sparse_ops
FAILED to load sparse_ops_cpu in jagged: Could not load this library: /deeplearning/fbgemm/fbgemm_gpu:sparse_ops
FAILED to load sparse_ops_cpu in jagged tensors: Could not load this library: /deeplearning/fbgemm/fbgemm_gpu:sparse_ops
FAILED to load sparse_ops_cpu in hstu attention: Could not load this library: /deeplearning/fbgemm/fbgemm_gpu:sparse_ops

Fixes #2429

Replace internal Bazel build paths with proper Python imports to fix library loading failures outside Meta's build environment.
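
The diff itself is not reproduced in this conversation. As a rough sketch of the approach described above, the following assumes that the pip-installed `fbgemm_gpu` package registers its operators (under `torch.ops.fbgemm`) as a side effect of being imported, so a plain Python import can stand in for `torch.ops.load_library` with a Bazel label. The helper name `try_import_ops` is hypothetical and not taken from the PR.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_import_ops(module_name: str = "fbgemm_gpu") -> bool:
    """Import a package that registers custom operators on import.

    Returns True if the import succeeded and False otherwise. The default
    module name assumes the pip-installed `fbgemm_gpu` package; this helper
    is an illustration, not the code merged in the PR.
    """
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        # ImportError: the package is not installed.
        # OSError: a shared library inside the package failed to load.
        logger.warning("FAILED to import %s: %s", module_name, exc)
        return False
    return True
```

A caller could invoke `try_import_ops()` once at module import time and fall back to a CPU-only code path, or raise a clearer error, when it returns `False`.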

ssam18 requested a review from a team as a code owner on January 14, 2026 02:04

github-actions bot (Contributor) commented Jan 14, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

ssam18 (Contributor, Author) commented Jan 14, 2026

Signing the CLA now via the Google form.

ssam18 (Contributor, Author) commented Jan 14, 2026

recheck

LinjianMa (Contributor) commented

I have verified the PR works as expected. @hanyunfan @mrmhodak can we merge the PR?

mrmhodak previously approved these changes Feb 9, 2026
hanyunfan previously approved these changes Mar 9, 2026

hanyunfan (Contributor) commented

@pgmpablo157321 Could you or someone else help merge this one? It got stuck at the CLA check.

SamareshSingh commented

recheck

hanyunfan merged commit 2ee7190 into mlcommons:master on Mar 30, 2026
17 checks passed
github-actions bot locked and limited conversation to collaborators on Mar 30, 2026

Development

Successfully merging this pull request may close these issues.

DLRMv3 FAILED to load sparse_ops_cpu in jagged

7 participants