Commit 01d4c91

Add info about the file used
1 parent 757042f commit 01d4c91

1 file changed: +3 -0 lines changed

training/text_classification.py

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,9 @@ def main():
     - stop_words="english"
     - min_df=5
     """
+    # this file is available in RNA Resources -> RNAcentral -> RNAcentral References folder on Google Drive.
+    # articles up to line 6295 were extracted using the export_data.py script. The last 400 articles were
+    # manually reviewed by the team.
     df = pd.read_csv("data.csv")
     print(df["rna_related"].value_counts(), "\n")
     # rna_related
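
The added comment says that rows up to line 6295 came from export_data.py and the rest were reviewed manually. One way to make that split visible next to the `rna_related` counts is sketched below. This is only an illustration, not part of the commit: the helper `tag_provenance` and the `source` column are hypothetical names, and the sketch assumes that "line 6295" counts the CSV header as line 1, so file line N corresponds to DataFrame row N - 2.

```python
import pandas as pd

# Taken from the comment added in this commit.
LAST_SCRIPT_LINE = 6295


def tag_provenance(df, last_script_line=LAST_SCRIPT_LINE):
    """Return a copy of df with a hypothetical 'source' column.

    Assumption: line 1 of the CSV is the header, so file line N is
    DataFrame row N - 2 (0-based position).
    """
    last_script_row = last_script_line - 2
    out = df.copy()
    out["source"] = [
        "export_data.py" if pos <= last_script_row else "manual_review"
        for pos in range(len(out))
    ]
    return out


# Tiny stand-in for data.csv so the sketch runs without the real file.
demo = pd.DataFrame({"rna_related": [1, 0, 1, 1, 0]})

# With last_script_line=4, file lines 2-4 (rows 0-2) count as script output.
tagged = tag_provenance(demo, last_script_line=4)

# Class balance per provenance group.
print(pd.crosstab(tagged["source"], tagged["rna_related"]))
```

With the real file, `tag_provenance(pd.read_csv("data.csv"))` would apply the same rule using the 6295 boundary from the comment.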

0 commit comments
