Commit 4abe599

updated README
1 parent 11899e4 commit 4abe599

1 file changed: +3 -3 lines changed

pull_requests_and_issues/README.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
-This folder contains steps necessary to reproduce dataset of Issued and Pull Requests.
+This folder contains steps necessary to reproduce dataset of Issues and Pull Requests.
 
-## Ray scluster
+## Ray sluster
 
 Most of the steps are designed to be executed on a Ray cluster. If the code is not run on the AI Toolkit, one must implement its own cluster provisioning and management. Specifically, the scaling up and down of the Ray cluster should be implemented in `ray_server.py`, or the cluster needs to be scaled up elsewhere, and the `scale_cluster` function may not have any effect. Additionally, all paths are intended to be accessible from all cluster nodes.
 
@@ -15,7 +15,7 @@ All configuration is in the `cfg.py`. Configs needed to change would be:
 Downloads evnets from the GHArchive. Done on one thread and with a delay in order to not overvelm the server.
 
 ## 1_parse_issue_and_pr_events.ipynb
-Extracts Issues and PRs information from the events, groups events by Issue or PR id, combines them into Issues or PR and splits to Issue and PRs.
+Extracts Issues and PRs information from the events, groups events by Issue or PR id, combines them into Issues or PR and splits to Issue dataset and PRs data for further processing.
 - `issues` dataset is stored by default in `root_path/issues_prs_grouped`
 - `pull requests` are stored by default in `root_path/pr_grouped` for further processing

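The grouping step described for `1_parse_issue_and_pr_events.ipynb` can be illustrated with a minimal sketch. This is not the notebook's code. It assumes each event is a dict shaped like a GitHub event record, with a `created_at` timestamp and a `payload` holding either an `issue` or a `pull_request` object that has an `id`. The function names `event_key`, `group_events` and `split_groups` are hypothetical.

```python
from collections import defaultdict


def event_key(event):
    """Return ("pr", id) or ("issue", id) for an event, or None if it has neither.

    Assumption: the payload carries a "pull_request" or an "issue" object with an "id".
    """
    payload = event.get("payload", {})
    if "pull_request" in payload:
        return ("pr", payload["pull_request"]["id"])
    if "issue" in payload:
        return ("issue", payload["issue"]["id"])
    return None


def group_events(events):
    """Group events by (kind, id), skipping events unrelated to issues or PRs."""
    groups = defaultdict(list)
    for event in events:
        key = event_key(event)
        if key is not None:
            groups[key].append(event)
    return groups


def split_groups(groups):
    """Split grouped events into two dicts: issues and pull requests.

    Events within each group are ordered by their "created_at" string,
    which sorts chronologically when all timestamps share one ISO 8601 format.
    """
    issues, prs = {}, {}
    for (kind, item_id), evs in groups.items():
        ordered = sorted(evs, key=lambda e: e.get("created_at", ""))
        target = prs if kind == "pr" else issues
        target[item_id] = ordered
    return issues, prs
```

In the real pipeline the two resulting collections would be written to the paths configured in `cfg.py`. This sketch keeps them in memory and leaves storage out.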