[FEA] Add New Unsupervised Learning Example #371

alexbarghi-nv · 2025-12-12T02:43:32Z

Adds a new unsupervised learning example that can learn embeddings. Closes #364.

copy-pr-bot · 2025-12-12T02:43:36Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

greptile-apps · 2026-01-05T16:20:03Z

Greptile Summary

Added two new example scripts for unsupervised learning on smaller datasets like ogbn-mag:

mag_lp_mnmg.py: Multi-GPU link prediction example that trains a heterogeneous GNN with encoder/decoder architecture using betweenness centrality as edge features, then exports learned embeddings and labels to parquet files
xgb.py: XGBoost classifier example that loads the exported embeddings and trains a multi-class classifier, demonstrating how to use GNN embeddings for downstream tasks

The examples provide a complete workflow from unsupervised embedding learning to supervised classification, addressing issue #364's request for single-GPU examples on smaller datasets.

Confidence Score: 5/5

This PR is safe to merge with no critical issues found
The code follows established patterns from existing examples in the repository, implements a well-structured ML pipeline with proper distributed training setup, includes comprehensive error handling, and correctly uses distributed feature stores with global indexing
No files require special attention

Important Files Changed

Filename	Overview
python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py	Adds comprehensive multi-GPU link prediction example with encoder/decoder architecture, betweenness centrality features, and embedding export to parquet
python/cugraph-pyg/cugraph_pyg/examples/xgb.py	Adds XGBoost classifier example that loads embeddings from parquet files and trains a multi-class classifier

Sequence Diagram

sequenceDiagram
    participant User
    participant mag_lp_mnmg as mag_lp_mnmg.py
    participant GraphStore
    participant Model
    participant Output as Parquet Files
    participant xgb as xgb.py
    participant XGBoost

    User->>mag_lp_mnmg: Run with torchrun
    mag_lp_mnmg->>mag_lp_mnmg: Initialize distributed workers (NCCL, cuGraph, WholeGraph)
    mag_lp_mnmg->>GraphStore: Load ogbn-mag dataset
    mag_lp_mnmg->>GraphStore: Add nodes and edges (with reverse edges)
    mag_lp_mnmg->>GraphStore: Calculate betweenness centrality
    mag_lp_mnmg->>GraphStore: Add betweenness features to edges
    mag_lp_mnmg->>Model: Create Classifier with Encoder/Decoder
    mag_lp_mnmg->>Model: Train with LinkNeighborLoader
    mag_lp_mnmg->>Model: Evaluate on test set
    mag_lp_mnmg->>Model: Generate paper embeddings
    mag_lp_mnmg->>Output: Export embeddings (x) and labels (y) to parquet
    mag_lp_mnmg->>mag_lp_mnmg: Shutdown workers
    
    User->>xgb: Run with --data_dir
    xgb->>xgb: Create LocalCUDACluster
    xgb->>Output: Read embeddings (x) and labels (y)
    xgb->>xgb: Join data and split train/test
    xgb->>XGBoost: Train multi-class classifier
    xgb->>XGBoost: Evaluate on test set
    xgb->>User: Display accuracy results

greptile-apps

Additional Comments (4)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 96 (link)

logic: global_rank is not defined in the scope of the Classifier.__init__ method

To fix this, you'll need to pass global_rank as a parameter to the Classifier constructor.

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 111 (link)

logic: global_rank is not defined in the scope of the Classifier.__init__ method

To fix this, you'll need to pass global_rank as a parameter to the Classifier constructor.

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 278 (link)

logic: global_rank is not defined in the scope of the train function

To fix this, you'll need to pass global_rank as a parameter to the train function.

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 687-690 (link)

logic: doubled the embedding instead of adding x_paper residual
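
For the `global_rank` scope comments above, one possible shape of the fix is to pass the rank in explicitly instead of reading a module-level name. The sketch below is illustrative only: this `Classifier` and `train` are simplified, hypothetical stand-ins (no PyTorch model, made-up parameters), not the code in mag_lp_mnmg.py. The only external fact it relies on is that `torchrun` sets the `RANK` environment variable for each worker.

```python
import os


class Classifier:
    """Hypothetical stand-in for the example's model wrapper."""

    def __init__(self, hidden_channels, global_rank):
        # The rank is stored on the instance rather than looked up from
        # a global that may not be defined when the class is constructed.
        self.hidden_channels = hidden_channels
        self.global_rank = global_rank

    def is_main_process(self):
        return self.global_rank == 0


def train(model, num_epochs, global_rank):
    """Hypothetical training loop that receives the rank as an argument."""
    messages = []
    for epoch in range(num_epochs):
        message = f"[rank {global_rank}] epoch {epoch} done"
        if global_rank == 0:
            print(message)  # only rank 0 prints, to avoid duplicate logs
        messages.append(message)
    return messages


def rank_from_env():
    # torchrun exports RANK per worker; fall back to 0 for a plain run.
    return int(os.environ.get("RANK", "0"))


if __name__ == "__main__":
    rank = rank_from_env()
    model = Classifier(hidden_channels=64, global_rank=rank)
    train(model, num_epochs=2, global_rank=rank)
```

Passing the rank this way also makes both pieces callable from a test without setting up a distributed environment.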

2 files reviewed, 4 comments

greptile-apps

Additional Comments (1)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 681-684 (link)

logic: duplicates x_dict["paper"] addition - should add x_paper residual instead

2 files reviewed, 1 comment

greptile-apps

Additional Comments (3)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 201 (link)

style: missing @torch.no_grad() decorator - inference should disable gradient computation for performance and memory efficiency

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 643 (link)

logic: drop_last=True causes embeddings for the last batch to not be computed, leaving uninitialized values in the pre-allocated tensor at line 646-648

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 272 (link)

style: global_rank accessed from global scope - consider passing it as a parameter for better function isolation and testability
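
The `drop_last=True` comment above can be illustrated without PyTorch. This is a sketch under stated assumptions: `batch_indices` and `unwritten_rows` are hypothetical helpers that mimic how a loader partitions indices, not the loader used in mag_lp_mnmg.py. It shows which rows of a pre-allocated output would never be written when the trailing partial batch is dropped.

```python
def batch_indices(num_items, batch_size, drop_last):
    """Yield index batches the way a typical data loader partitions them."""
    for start in range(0, num_items, batch_size):
        batch = list(range(start, min(start + batch_size, num_items)))
        if drop_last and len(batch) < batch_size:
            return  # the trailing partial batch is skipped entirely
        yield batch


def unwritten_rows(num_items, batch_size, drop_last):
    """Return the indices of a pre-allocated output that no batch fills."""
    written = [False] * num_items
    for batch in batch_indices(num_items, batch_size, drop_last):
        for i in batch:
            written[i] = True
    return [i for i, done in enumerate(written) if not done]


print(unwritten_rows(10, 4, drop_last=True))   # [8, 9]
print(unwritten_rows(10, 4, drop_last=False))  # []
```

With 10 items and a batch size of 4, the last two rows are only covered when the partial batch is kept, which is why an inference pass that fills a pre-allocated tensor generally wants `drop_last=False`.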

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

2 files reviewed, 3 comments

greptile-apps

Additional Comments (2)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 300 (link)

syntax: Typo: 'torchrun` should be `torchrun` (missing opening backtick)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 547-548 (link)

logic: train_sz and test_sz are tensors, need conversion to int for slicing

2 files reviewed, 2 comments

greptile-apps

Additional Comments (3)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 548-549 (link)

logic: Tensor used as index instead of scalar value. train_sz and test_sz are tensors but need to be converted to integers for slicing.

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 492 (link)

syntax: "betweeness" is misspelled

python/cugraph-pyg/cugraph_pyg/examples/xgb.py, line 92 (link)

logic: Type consistency issue: predictions_computed is a cupy array but dfy_test_computed is likely still a cudf Series/DataFrame

2 files reviewed, 3 comments

alexbarghi-nv · 2026-01-05T22:57:04Z

Additional Comments (3)

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 548-549 (link)
logic: Tensor used as index instead of scalar value. train_sz and test_sz are tensors but need to be converted to integers for slicing.

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py, line 492 (link)
syntax: "betweeness" is misspelled

python/cugraph-pyg/cugraph_pyg/examples/xgb.py, line 92 (link)
logic: Type consistency issue: predictions_computed is a cupy array but dfy_test_computed is likely still a cudf Series/DataFrame

2 files reviewed, 3 comments

These are not issues - I've tested both files.

greptile-apps · 2026-01-05T22:59:48Z

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

tingyu66

LGTM. Just left some nitpicks.

tingyu66 · 2026-01-22T02:03:02Z

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py

+        pred_true_pos += (
+            ((y_pred > 0.5).float() == 1.0) & (y_true.float() == 1.0)
+        ).sum()
+        pred_false_pos += (
+            ((y_pred > 0.5).float() == 1.0) & (y_true.float() == 0.0)
+        ).sum()
+        pred_true_neg += (
+            ((y_pred <= 0.5).float() == 1.0) & (y_true.float() == 0.0)
+        ).sum()
+        pred_false_neg += (
+            ((y_pred <= 0.5).float() == 1.0) & (y_true.float() == 1.0)
+        ).sum()


Do we need .float() here? Could it be simplified to ((y_pred > 0.5) & (y_true == 1)).sum()?
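
A quick check of the suggested simplification, as a sketch: the tensors below are made-up values, and `y_true` is assumed to hold exactly 0/1 labels as in the snippet above. Under that assumption, the plain boolean masks give the same counts as the `.float() == 1.0` form.

```python
import torch

# Made-up predictions and 0/1 labels, for illustration only.
y_pred = torch.tensor([0.9, 0.2, 0.7, 0.4, 0.5])
y_true = torch.tensor([1, 0, 0, 1, 1])

pos = y_pred > 0.5   # boolean mask of predicted positives
neg = y_pred <= 0.5  # boolean mask of predicted negatives

# Simplified form: combine boolean masks directly, no float round trip.
true_pos = (pos & (y_true == 1)).sum()
false_pos = (pos & (y_true == 0)).sum()
true_neg = (neg & (y_true == 0)).sum()
false_neg = (neg & (y_true == 1)).sum()

# Form used in the diff, kept here only for comparison.
orig_true_pos = (
    ((y_pred > 0.5).float() == 1.0) & (y_true.float() == 1.0)
).sum()
orig_false_neg = (
    ((y_pred <= 0.5).float() == 1.0) & (y_true.float() == 1.0)
).sum()

print(true_pos.item(), false_pos.item(), true_neg.item(), false_neg.item())
```

The four counts always sum to the number of samples as long as `y_pred` contains no NaNs, which is a cheap sanity check to keep next to either form.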

tingyu66 · 2026-01-22T02:11:21Z

python/cugraph-pyg/cugraph_pyg/examples/mag_lp_mnmg.py

+    model,
+    optimizer,
+    wm_optimizer,
+    neg_ratio,


Do we need the negative ratio inside train() or test()?

alexbarghi-nv added 3 commits December 7, 2025 21:08

initial write

37cfc3a

initial write

2f4a522

add new unsupervised learning example

8fcad61

alexbarghi-nv self-assigned this Dec 12, 2025

alexbarghi-nv added the "feature request" and "non-breaking" labels Dec 12, 2025

alexbarghi-nv and others added 4 commits December 12, 2025 15:53

add xgb trainer

9a28f7b

add info

b7abfa2

updates

2e6248a

Merge branch 'main' into add-mag-examples

be59989

alexbarghi-nv marked this pull request as ready for review January 5, 2026 16:17

alexbarghi-nv requested a review from a team as a code owner January 5, 2026 16:17

alexbarghi-nv added 2 commits January 5, 2026 08:19

remove print statements

1275645

Merge branch 'add-mag-examples' of https://github.com/alexbarghi-nv/c…

277d527

…ugraph-gnn into add-mag-examples

greptile-apps bot reviewed Jan 5, 2026

fix bug

a7abd43

greptile-apps bot reviewed Jan 5, 2026

more cleanup, bug fixes

3c62acf

greptile-apps bot reviewed Jan 5, 2026

fix warning message

be6a7c6

greptile-apps bot reviewed Jan 5, 2026

Merge branch 'main' into add-mag-examples

99ff493

Merge branch 'main' into add-mag-examples

79b149f

tingyu66 approved these changes Jan 22, 2026
