GCN embeddings different on subsequent queries with GPUs #5539
Hi, I observed a strange phenomenon while studying the graph embeddings produced by a GCN model. After training a model, I collect the learned graph embeddings for a graph classification task. For the same data point, the embeddings from two subsequent inference passes on GPU differ by a substantial margin (errors in the range of 0.1). Surprisingly, this does not happen when I collect the embeddings while running inference on CPU: there the embeddings are exactly the same. To reproduce this, I used the code from one of the PyG tutorial notebooks and observed the same behavior. Excerpt from the exported notebook:

```
#!/usr/bin/env python
# coding: utf-8

# In[1]:
# Install required packages.
import os

# In[2]:
import torch
dataset = TUDataset(root='data/TUDataset', name='MUTAG')
print()
data = dataset[0]  # Get the first graph object.
print()
# Gather some statistics about the first graph.
print(f'Number of nodes: {data.num_nodes}')

# In[3]:
torch.manual_seed(12345)
train_dataset = dataset[:150]
print(f'Number of training graphs: {len(train_dataset)}')

# In[4]:
from torch_geometric.loader import DataLoader
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=False)
for step, data in enumerate(train_loader):

# In[5]:
from torch.nn import Linear
class GCN(torch.nn.Module):
model = GCN(hidden_channels=64)

# In[6]:
#from IPython.display import Javascript
#model = GCN(hidden_channels=64)
def test_cpu(model,loader):
def test_gpu(model,loader):
model = model.to('cuda')
for epoch in range(1, 50):

# In[7]:
_,embedding_gpu1= test_gpu(model,test_loader)

# In[14]:
embedding_gpu2 == embedding_gpu1
tensor([[ True,  True, False,  ...,  True, False, False],

# In[16]:
embedding_cpu1 == embedding_cpu2
tensor([[True, True, True,  ..., True, True, True],
```

Can you let me know what is going wrong? I also tried `seed_everything`, but that did not help.
Replies: 1 comment
Scatter based on `edge_index` is a non-deterministic operation and would result in numerical instabilities in CUDA. For message passing layers, deterministic aggregation is only guaranteed when using `SparseTensor`.
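As a rough illustration of why the accumulation order matters, here is a plain-Python sketch. It is not PyG's CUDA kernel: `scatter_add` below is a hypothetical helper and the message values are made up. Floating-point addition is not associative, so adding the same messages into a node in a different order can change the last bits of the result. On a GPU, a scatter that accumulates with atomic adds does not fix that order from run to run.

```python
import math


def scatter_add(values, index, num_targets, order):
    """Add values[i] into out[index[i]], visiting the entries in `order`."""
    out = [0.0] * num_targets
    for i in order:
        out[index[i]] += values[i]
    return out


# Hypothetical messages: three go to node 0, one goes to node 1.
values = [0.1, 0.2, 0.3, 1.0]
index = [0, 0, 0, 1]

# Two visiting orders stand in for two GPU runs whose adds land differently.
out_a = scatter_add(values, index, num_targets=2, order=[0, 1, 2, 3])
out_b = scatter_add(values, index, num_targets=2, order=[2, 1, 0, 3])

# Exact comparison, like `embedding_gpu2 == embedding_gpu1` in the question.
bitwise_equal = [a == b for a, b in zip(out_a, out_b)]
# Tolerance-based comparison of the same results.
close = [math.isclose(a, b, rel_tol=1e-9) for a, b in zip(out_a, out_b)]

print(out_a)          # [0.6000000000000001, 1.0]
print(out_b)          # [0.6, 1.0]
print(bitwise_equal)  # [False, True]
print(close)          # [True, True]
```

This only demonstrates the mechanism; it does not by itself account for the size of the differences reported above. When comparing GPU embeddings, a tolerance-based check such as `torch.allclose` is usually more informative than elementwise `==`. For the deterministic path mentioned in the reply, PyG provides the `ToSparseTensor` transform, which replaces `edge_index` with an `adj_t` attribute that can be passed to the layers instead.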