In `graphium/features/featurizer.py`, line 634:

```python
def mol_to_adj_and_features(
    mol: Union[str, dm.Mol],
    atom_property_list_onehot: List[str] = [],
    atom_property_list_float: List[Union[str, Callable]] = [],
    conformer_property_list: List[str] = [],
    edge_property_list: List[str] = [],
    add_self_loop: bool = False,
    explicit_H: bool = False,
    use_bonds_weights: bool = False,
    pos_encoding_as_features: Dict[str, Any] = None,
    dtype: np.dtype = np.float16,
    mask_nan: Union[str, float, type(None)] = "raise",
) -> Union[
    coo_matrix,
    Union[Tensor, None],
    Union[Tensor, None],
    Dict[str, Tensor],
    Union[Tensor, None],
    Dict[str, Tensor],
]:
```

graphium seems to use `np.float16` as the default dtype for this method. However, `mol_to_adj_and_features` calls `mol_to_adjacency_matrix` (line 791):

```python
def mol_to_adjacency_matrix(
    mol: dm.Mol,
    use_bonds_weights: bool = False,
    add_self_loop: bool = False,
    dtype: np.dtype = np.float32,
) -> coo_matrix:
```

which has a default dtype of `np.float32`.
The problem is that in `mol_to_adjacency_matrix`, the adjacency matrix is converted to a sparse matrix:

```python
if len(adj_val) > 0:  # ensure tensor is not empty
    adj = coo_matrix(
        (torch.as_tensor(adj_val), torch.as_tensor(adj_idx).T.reshape(2, -1)),
        shape=(mol.GetNumAtoms(), mol.GetNumAtoms()),
        dtype=dtype,
    )
```

which causes, in my environment:

```
ValueError: scipy.sparse does not support dtype float16. The only supported types are: bool, int8, uint8, int16, uint16, int32, uint32, int64, uint64, longlong, ulonglong, float32, float64, longdouble, complex64, complex128, clongdouble.
```
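The failure can be reproduced outside graphium with SciPy alone. A minimal sketch (on recent SciPy versions the stricter check rejects `float16` at construction; older releases may accept it, hence the `try/except`):

```python
import numpy as np
from scipy.sparse import coo_matrix

data = np.ones(2)
rows, cols = [0, 1], [1, 0]

# Recent SciPy rejects float16 at construction time
try:
    coo_matrix((data, (rows, cols)), shape=(2, 2), dtype=np.float16)
    print("float16 accepted by this SciPy version")
except ValueError as exc:
    print(f"float16 rejected: {exc}")

# float32 is in scipy.sparse's supported dtype list, so this works
adj = coo_matrix((data, (rows, cols)), shape=(2, 2), dtype=np.float32)
print(adj.dtype)
```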
As far as I know, this has been discussed in SciPy (scipy/scipy#7408), and in recent versions the dtype checks have become stricter (scipy/scipy#20207).
I believe this can be fixed simply by defaulting to `np.float32` instead. However, if using small dtypes for memory efficiency is critical, the workaround would be more involved.
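One possible shape of the fix, sketched on a toy 2x2 adjacency matrix (`sparse_safe_dtype` is a hypothetical helper, not part of graphium): coerce `float16` to `float32` before handing the dtype to `coo_matrix`.

```python
import numpy as np
from scipy.sparse import coo_matrix

def sparse_safe_dtype(dtype):
    """Hypothetical helper: upcast float16 to float32, since
    scipy.sparse does not support half precision."""
    dtype = np.dtype(dtype)
    return np.dtype(np.float32) if dtype == np.float16 else dtype

# Toy adjacency values/indices, analogous to adj_val/adj_idx
# in mol_to_adjacency_matrix, but with the dtype coerced first
adj_val = [1.0, 1.0]
rows, cols = [0, 1], [1, 0]
adj = coo_matrix(
    (adj_val, (rows, cols)),
    shape=(2, 2),
    dtype=sparse_safe_dtype(np.float16),
)
print(adj.dtype)  # float32
```

This keeps the caller-facing `np.float16` default intact while ensuring the sparse matrix itself only ever sees a supported dtype.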