
HeteroData silently accepts numpy arrays. Then fails during training with cryptic message #10597

@Gabriel-Kissin

Description

πŸ› Describe the bug

Hi PyG team, thanks for the great package! The other day I spent a good while debugging an issue that I think could be caught much earlier with better error messages.

When creating a `HeteroData` object from networkx/pandas (I'm guessing this is a fairly common workflow), numpy arrays can accidentally end up as node features instead of torch tensors. PyG silently accepts these numpy arrays, and the error only surfaces deep in model execution with an unhelpful message.
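For context, here's a minimal sketch of how the arrays sneak in (the DataFrame here is made up, not my actual data): pandas' `.to_numpy()` hands back an ndarray, which then gets assigned as node features as-is.

```python
# Sketch: pandas feature extraction yields numpy ndarrays, not torch tensors.
import numpy as np
import pandas as pd

# Hypothetical node feature table
users = pd.DataFrame(np.random.randn(5, 3), columns=['age', 'karma', 'score'])

features = users[['age', 'karma', 'score']].to_numpy()  # ndarray, not a tensor
print(type(features))  # <class 'numpy.ndarray'>
# Assigning `features` to data['user'].x at this point reproduces the bug below.
```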

MRE

```python
import numpy as np
import torch
import torch_geometric
from torch_geometric.data import HeteroData
from torch_geometric.nn import SAGEConv, to_hetero
from torch_geometric.transforms import ToUndirected

# Create HeteroData with numpy arrays, forgetting to convert to tensors
data = HeteroData()
data['user'].x = np.random.randn(100, 64)  # numpy array, not a tensor
data['item'].x = np.random.randn(100, 64)  # numpy array, not a tensor
data['user', 'likes', 'item'].edge_index = torch.randint(0, 100, (2, 200))

print("validate():", data.validate())  # Returns True although the dtypes are invalid

device = torch_geometric.device('auto')
print(device)  # e.g. mps on my machine
data = data.to(device)  # silently fails to move the numpy arrays

# Check what actually happened
print("x device after .to():", data['user'].x.device)  # Still shows 'cpu'
print("x type:", type(data['user'].x))  # numpy.ndarray!

# Try to use it in a model and it fails
data = ToUndirected()(data)

class Encoder(torch.nn.Module):
    def __init__(self, hidden_channels):
        super().__init__()
        self.conv = SAGEConv((-1, -1), hidden_channels)

    def forward(self, x, edge_index):
        return self.conv(x, edge_index).relu()

encoder = Encoder(32)
encoder = to_hetero(encoder, data.metadata(), aggr='sum')
encoder = encoder.to(device)

out = encoder(data.x_dict, data.edge_index_dict)
```

The encoder gets as far as the aggregation stage and then fails with a cryptic `AttributeError: 'NoneType' object has no attribute 'dim'` 😕.

There are several places where this could be caught or handled earlier:

  1. On assignment: `data['user'].x = numpy_array` should either auto-convert to a torch tensor or raise a clear error.
  2. In `.validate()`: should check that all `x`, `edge_index`, etc. are actually torch tensors - i.e. check dtypes as well as structure.
  3. In `.to(device)`: if it encounters nested numpy arrays, it should either auto-convert them to tensors or raise a clear error instead of silently skipping them.
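To illustrate suggestion 2, here's a rough sketch of what such a check could look like. The helper name (`assert_all_tensors`) is mine, not part of the PyG API, and to keep it self-contained it operates on plain dicts shaped like `x_dict` / `edge_index_dict` rather than on `HeteroData` itself:

```python
# Sketch of a type check like the one .validate() could perform.
# `assert_all_tensors` is a hypothetical name, not an existing PyG function.
import numpy as np
import torch

def assert_all_tensors(store_dicts):
    """Raise TypeError naming every attribute that is not a torch.Tensor."""
    problems = []
    for dict_name, d in store_dicts.items():
        for key, value in d.items():
            if not isinstance(value, torch.Tensor):
                problems.append(
                    f"{dict_name}[{key!r}] is {type(value).__name__}, "
                    f"expected torch.Tensor"
                )
    if problems:
        raise TypeError("Non-tensor attributes found:\n  " + "\n  ".join(problems))

# One numpy array slipped in alongside a proper tensor:
x_dict = {'user': np.random.randn(4, 8), 'item': torch.randn(4, 8)}
try:
    assert_all_tensors({'x_dict': x_dict})
except TypeError as e:
    print(e)  # names the offending 'user' ndarray explicitly
```

An error like this, raised at validation time, would have pointed straight at the problem instead of surfacing deep inside message passing.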

Converting to a torch tensor is trivial once a clear error message points at the problem, but it's much harder to debug from a message like the one I got...
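For anyone hitting this in the meantime, the workaround really is a one-liner per attribute (a sketch, with one gotcha worth knowing about dtypes):

```python
# Workaround sketch: coerce numpy arrays to torch tensors before building the model.
import numpy as np
import torch

def to_tensor(value):
    """Convert numpy arrays to torch tensors; pass tensors through unchanged."""
    if isinstance(value, np.ndarray):
        return torch.from_numpy(value)
    return value

x = to_tensor(np.random.randn(100, 64))
print(type(x))   # <class 'torch.Tensor'>
print(x.dtype)   # torch.float64 -- from_numpy preserves numpy's float64;
                 # call .float() if the model expects float32
```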

Thought this was worth sharing to save other people some time 😃

Versions

PyTorch version: 2.9.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 26.2 (arm64)
GCC version: Could not collect
Clang version: 17.0.0 (clang-1700.0.13.3)
CMake version: Could not collect
Libc version: N/A

Python version: 3.13.1 (main, Dec  3 2024, 17:59:52) [Clang 16.0.0 (clang-1600.0.26.4)] (64-bit runtime)
Python platform: macOS-26.2-arm64-arm-64bit-Mach-O
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU:
Apple M3 Pro

Versions of relevant libraries:
[pip3] flake8==7.3.0
[pip3] mypy==1.18.2
[pip3] mypy_extensions==1.1.0
[pip3] numpy==2.3.4
[pip3] torch==2.9.0
[pip3] torch-geometric==2.7.0
[conda] Could not collect
