Commit 3a4e007

go_uniprot: add sequence len to docstring
1 parent 78a38de commit 3a4e007

1 file changed: 7 additions, 1 deletion

chebai/preprocessing/datasets/go_uniprot.py

Lines changed: 7 additions & 1 deletion
@@ -56,10 +56,16 @@ class _GOUniProtDataExtractor(_DynamicDataset, ABC):
     Args:
         dynamic_data_split_seed (int, optional): The seed for random data splitting. Defaults to 42.
         splits_file_path (str, optional): Path to the splits CSV file. Defaults to None.
-        **kwargs: Additional keyword arguments passed to XYBaseDataModule.
+        max_sequence_length (int, optional): Specifies the maximum allowed sequence length for a protein, with a
+            default of 1002. During data preprocessing, any proteins exceeding this length will be excluded from further
+            processing.
+        **kwargs: Additional keyword arguments passed to DynamicDataset and XYBaseDataModule.

     Attributes:
         dynamic_data_split_seed (int): The seed for random data splitting, default is 42.
+        max_sequence_length (int, optional): Specifies the maximum allowed sequence length for a protein, with a
+            default of 1002. During data preprocessing, any proteins exceeding this length will be excluded from further
+            processing.
         splits_file_path (Optional[str]): Path to the CSV file containing split assignments.
     """

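The new docstring entry describes a length cutoff applied during preprocessing. The sketch below illustrates that behaviour only; it is not the code in go_uniprot.py. The function name `filter_by_sequence_length` and the `(protein_id, sequence)` tuple layout are hypothetical, and treating "exceeding" as strictly greater than the limit is an assumption drawn from the docstring wording.

```python
from typing import Iterable, List, Tuple


def filter_by_sequence_length(
    proteins: Iterable[Tuple[str, str]],
    max_sequence_length: int = 1002,  # default value quoted in the docstring
) -> List[Tuple[str, str]]:
    """Keep only proteins whose sequence does not exceed the limit.

    Hypothetical helper: the real preprocessing step may be structured
    differently (for example, operating on a DataFrame).
    """
    # Assumption: a sequence of exactly max_sequence_length residues is kept.
    return [
        (protein_id, sequence)
        for protein_id, sequence in proteins
        if len(sequence) <= max_sequence_length
    ]


# Made-up example records: one at the limit, one above it, one short.
example = [
    ("P1", "M" * 1002),
    ("P2", "M" * 1003),
    ("P3", "MKT"),
]
kept_ids = [protein_id for protein_id, _ in filter_by_sequence_length(example)]
print(kept_ids)  # ['P1', 'P3']
```

With the default limit, only the 1003-residue record is dropped; passing a smaller `max_sequence_length` tightens the cutoff in the same way.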
