Commit 3d1785f

Remove Local Variables in biostars_qa Dataset Preprocessing Scripts (#2609)
Addresses comment regarding recent pull request: #2353 (comment)
1 parent 232a877 commit 3d1785f

1 file changed: +1 -7 lines changed

data/datasets/biostars_qa/get_biostars_dataset.py

Lines changed: 1 addition & 7 deletions
@@ -13,20 +13,14 @@ def get_biostars_dataset(start_idx=9557161, accept_threshold=1000000, sleep=0.1,
     Download BioStarts data set from the official API using GET requests
 
     Args:
-        start_idx (int): The identifier (UID) of the post to retrieve
+        start_idx (int): The identifier (UID) of the post to retrieve; 9557161 was the last post included in the dataset
         accept_threshold (int): stop if this many posts with "has_accepted" true are retrieved
         sleep (float): Amount of time to sleep between requests
         folder (string): folder to store responses as JSON files
     Returns:
         Nothing. Content is saved to individual JSON files for each post.
     """
 
-    # There is a large number gap in post IDs the numbers skip from 9463943 to 494831
-    # Post ID: 9557161 was the last post included in the dataset
-    start_idx = 9557161
-    accept_threshold = 1000000
-    sleep = 0.1
-
     headers = {"Content-Type": "application/json"}
 
     has_accepted_count = 0
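
The removed assignments mattered because reassigning a parameter inside the function body discards whatever the caller passed. Before this commit, the `start_idx`, `accept_threshold`, and `sleep` arguments therefore had no effect. The sketch below illustrates this. The function names `fetch_overridden` and `fetch_respecting_args` are hypothetical stand-ins and are not code from the repository.

```python
def fetch_overridden(start_idx=9557161, accept_threshold=1000000, sleep=0.1):
    # Pattern before the commit (hypothetical stand-in): the parameters are
    # reassigned in the body, so any values supplied by the caller are lost.
    start_idx = 9557161
    accept_threshold = 1000000
    sleep = 0.1
    return start_idx, accept_threshold, sleep


def fetch_respecting_args(start_idx=9557161, accept_threshold=1000000, sleep=0.1):
    # Pattern after the commit (hypothetical stand-in): the defaults live only
    # in the signature, so caller-supplied values are used as given.
    return start_idx, accept_threshold, sleep


if __name__ == "__main__":
    print(fetch_overridden(start_idx=42, sleep=1.0))       # (9557161, 1000000, 0.1)
    print(fetch_respecting_args(start_idx=42, sleep=1.0))  # (42, 1000000, 1.0)
```

Keeping the defaults only in the signature leaves the documented behavior unchanged when the function is called with no arguments.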
