Commit 6ee9374

Expands docs for install, usage, and miner API documentation
1 parent 5580347 commit 6ee9374

1 file changed

+70
-27
lines changed

README.md

Lines changed: 70 additions & 27 deletions
@@ -1,51 +1,94 @@
1-
# GitHub and Travis mining utility
1+
# GitHub API Mining Utility
2+
3+
This is a simplified repository miner based on [caiusb/miner-utils](https://github.com/caiusb/miner-utils), and targeting the GitHub REST API (v3).
24

35
## Installation
46

5-
Run `pip install "git+https://github.com/caiusb/miner-utils"`
7+
### Requirements
8+
The following must be installed and available for the mining utility:
9+
* [Python 3](https://www.python.org/downloads/)
10+
* [`pip`](https://pypi.org/project/pip/)
11+
12+
To verify that these packages are installed and updated, use the following commands in a terminal/console:
13+
```bash
14+
python --version
15+
# example:
16+
# > python --version
17+
# Python 3.7.4
18+
19+
pip --version
20+
# example:
21+
# > pip --version
22+
# pip 20.2.3 from /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pip (python 3.7)
23+
```
24+
25+
### Installing the mining utility
26+
To install the mining utility into a Python global environment, run the following command in a terminal/console:
27+
```bash
28+
pip install "git+https://github.com/EPICLab/miner-utils"
29+
```
30+
31+
To install the mining utility into an enhanced shell like IPython or the Jupyter notebook, run the following command in a code cell:
32+
```python
33+
!pip install 'git+https://github.com/EPICLab/miner-utils'
34+
```
635

736
## Usage
837

9-
### Instantiating a miner
38+
The GitHub REST API (v3) has rate limits for the number of resource objects that can be requested in a given timeframe.
1039

11-
To instantiate a GitHub miner, simply call the constructor:
40+
For API requests using Basic Authentication or OAuth, you can make up to 5000 requests per hour. Authenticated requests are associated with the authenticated user, regardless of whether Basic Authentication or an OAuth token was used. This means that all OAuth applications authorized by a user share the same quota of 5000 requests per hour when they authenticate with different tokens owned by the same user.
1241

13-
```
14-
gh = GitHub();
15-
```
42+
For unauthenticated requests, the rate limit allows for up to 60 requests per hour. Unauthenticated requests are associated with the originating IP address, and not the user making the requests.
1643

17-
The constructor takes 2 optional arguments, a username and a token. It is recommended that you use them, in order to greatly reduce the time it takes to collect the data:
44+
For more information on GitHub's rate limiting policy, see the [rate limiting documentation](https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting).
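To make these quotas concrete, the sketch below estimates how many one-hour rate-limit windows a mining job would span. The limits of 5000 and 60 requests per hour come from the text above; the helper name `windows_needed` and the job size of 12,000 requests are illustrative assumptions, not part of the mining utility.

```python
import math

AUTHENTICATED_LIMIT = 5000   # requests per hour, as stated above
UNAUTHENTICATED_LIMIT = 60   # requests per hour, as stated above


def windows_needed(total_requests, limit_per_hour):
    """Return how many one-hour rate-limit windows a job of this size spans."""
    if total_requests < 0 or limit_per_hour <= 0:
        raise ValueError("total_requests must be >= 0 and limit_per_hour > 0")
    return math.ceil(total_requests / limit_per_hour)


# Hypothetical job: 12,000 GET requests.
print(windows_needed(12000, AUTHENTICATED_LIMIT))    # 3
print(windows_needed(12000, UNAUTHENTICATED_LIMIT))  # 200
```

Under these assumptions, the same job spans 3 windows when authenticated and 200 when not, which is why providing a token is worthwhile for anything beyond small queries.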
1845

19-
```
20-
gh = GitHub(username, token)
21-
```
46+
### Obtaining a GitHub authentication token
47+
The GitHub REST API (v3) originally supported Basic Authentication using either a username/password or username/token. However, authentication using username/password is currently being deprecated and will be completely removed as of November 13, 2020 at 16:00 UTC ([GitHub Developer release note](https://developer.github.com/changes/2020-02-14-deprecating-password-auth/)).
2248

23-
To instantiate a Travis miner, simply call the constructor:
49+
Follow the GitHub documentation, ["Creating a personal access token"](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token), to obtain a personal access token (PAT) with `(no scope)` set (i.e. leave all of the scope fields unchecked), which allows read-only access to public information.
2450

25-
```
26-
tr = Travis()
27-
```
51+
> **WARNING**: Treat your tokens like passwords and keep them secret. When using the GitHub API Mining Utility, set the token during instantiation, but do not publish the token in any Python programs or IPython/Jupyter notebooks.
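
One common way to follow the warning above is to keep the token out of the source file and read it from the environment at runtime. The sketch below assumes two environment variable names, `GITHUB_USERNAME` and `GITHUB_TOKEN`, and a helper called `load_credentials`; all three are chosen for this example and are not something the mining utility reads or provides on its own.

```python
import os


def load_credentials(env=os.environ):
    """Read the GitHub username and token from environment variables.

    GITHUB_USERNAME and GITHUB_TOKEN are names chosen for this sketch,
    not names that the mining utility looks up by itself.
    """
    username = env.get("GITHUB_USERNAME")
    token = env.get("GITHUB_TOKEN")
    if not username or not token:
        raise RuntimeError("Set GITHUB_USERNAME and GITHUB_TOKEN before running.")
    return username, token


# Usage, once both variables are set in the shell or notebook environment:
# from minerutils import GitHub
# username, token = load_credentials()
# gh = GitHub(username, token)
```

With this pattern the notebook or script can be published as-is, because the token only ever lives in the local environment.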
2852
29-
The constructor also takes 1 optional authentication token:
53+
### Instantiating the GitHub API Mining Utility
54+
To create an instance of the GitHub API Mining Utility in either a Python environment or an IPython/Jupyter notebook, run the following commands:
55+
```python
56+
from minerutils import GitHub
3057

58+
gh = GitHub(username, token)
59+
60+
# example:
61+
# gh = GitHub(username='nelsonni', token='b123c123d123e123')
3162
```
32-
tr = Travis(token)
63+
64+
### Interacting with the GitHub API Mining Utility
65+
Once the GitHub API Mining Utility has been instantiated, you can interact with the GitHub REST API through GET requests that take the following format:
66+
```python
67+
gh.get(url, params, headers)
68+
69+
# example (these are equivalent):
70+
# gh.get("/repos/scala/scala/pulls", params={'state': 'all'})
71+
# gh.get("/repos/scala/scala/pulls?state=all")
3372
```
3473

35-
### Calling the API
74+
The examples above get all of the pull requests for the specified project (e.g. `scala/scala`). The `params` and `headers` arguments are optional, but useful for passing a parameter or query for a particular resource. Both arguments take a map of `(key, value)` pairs (a Python `dict`) for the values that you want to pass to the GitHub API endpoint. The alternative is to embed the parameters directly into the `url` (as demonstrated in the second example above).
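To see why the two forms are equivalent, the sketch below encodes a `params` map into a query string using the standard library. `with_query` is a hypothetical helper written for this illustration; the miner may assemble its requests differently internally.

```python
from urllib.parse import urlencode


def with_query(url, params=None):
    """Append params to url as a query string, as in the second example above."""
    if not params:
        return url
    # Use "&" if the URL already carries a query string, "?" otherwise.
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(params)


print(with_query("/repos/scala/scala/pulls", {"state": "all"}))
# /repos/scala/scala/pulls?state=all
```

Passing a map is usually easier to read and lets the encoder escape special characters, while the embedded form is convenient for copying a URL straight from the GitHub documentation.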
3675

37-
Both miners have a similar API. To perform a get request, use:
76+
For all available GitHub REST API (v3) resources, including `url` and `params` values, refer to the [GitHub Docs: REST API](https://docs.github.com/en/rest/reference) site.
3877

39-
```
40-
gh.get("/repos/scala/scala/pulls", params={'state': 'all'})
41-
```
78+
### Python 3
79+
This miner is written in Python 3, and should be run in a Python 3.x environment. If you attempt to run it in a Python 2 environment, runtime errors will warn that the `urllib.parse` module cannot be imported (this is because the `urlparse` module was renamed to `urllib.parse` in Python 3).
4280

43-
The example above, gets all the pull requests for the specified project. Consult the documentation of the service that you are using to determine what resources are available. If you need to pass a parameter, or a query, use the `params` argument. It takes a map (key, value pairs) of the arguments that you want to pass. Alternatively, you can pass the parameters in the url directly, like this:
81+
## Commands Documentation
4482

45-
```
46-
gh.get("/repos/scala/scala/pulls?state=all")
47-
```
83+
| Command | Return Type | Description |
84+
| :------ | ----------- | ----------: |
85+
| `printConfig()` | `None` | Prints the symbols table associated with the GitHub API Mining Utility instance, including authentication values. |
86+
| `get(url, params={}, headers={}, perPage=100)` | `list` | Calls the GitHub REST API (v3) using GET requests that include the authentication parameters (if provided during instantiation), any `params` pairs (if provided), any `headers` pairs (if provided), and paginates the results based on the `perPage` rate. This call respects the GitHub REST API (v3) rate limits (included in 403 status code responses) to determine when the rate limit has been exhausted, and will sleep until the limit has been reset. |
87+
| `getRepoRoot(repo)` | `string` | Accepts a `repo` parameter in the form of a map containing `username` and `repo` key-value pairs, and returns a GitHub URL of the form `https://api.github.com/{username}/{repo}`. |
88+
| `getRemainingRateLimit()` | `int` | Obtains the numerical count of the remaining GitHub REST API (v3) calls allowed before reaching the rate limit. |
89+
| `printRemainingRateLimit()` | `None` | Prints the numerical count of the remaining GitHub API (v3) calls allowed before reaching the rate limit. |
90+
| `repoExists(user, repo)` | `bool` | Calls the GitHub REST API (v3) using a GET request with a URL of the form `https://api.github.com/repos/{user}/{repo}` and indicates whether that response was successful (i.e. whether the repository exists on GitHub). |
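The `get` entry above describes sleeping until the rate limit has been reset. The sketch below shows one way that wait could be computed, assuming the response headers are available as a plain dictionary. `X-RateLimit-Remaining` and `X-RateLimit-Reset` are headers that GitHub includes in its REST API responses; `seconds_until_reset` is a hypothetical helper and is not the utility's actual implementation.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to sleep before retrying, or 0 if quota remains.

    `headers` is assumed to be a plain dict using GitHub's header casing;
    X-RateLimit-Reset holds the reset time in UTC epoch seconds.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0, reset_at - now)


# Example with a fixed clock: quota exhausted, reset 60 seconds away.
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1600000060"}
print(seconds_until_reset(exhausted, now=1600000000))  # 60
```

A caller could pass the result to `time.sleep()` before retrying the request.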
4891

4992
## License
5093

51-
This project is licensed under the MIT License - see the LICENSE.md file for details
94+
This project is licensed under the MIT License - see the LICENSE.md file for details.
