| 1 | -# nodestream-plugin-github |
| 1 | +# nodestream-plugin-github |
| 2 | + |
| 3 | +# Overview |
| 4 | +This plugin provides nodestream extractors that pull GitHub data from the REST API |
| 5 | +for ingestion in nodestream pipelines. |
| 6 | + |
| 7 | + |
| 8 | +# Setup Neo4j |
| 9 | +1. Download and install Neo4j: https://neo4j.com/docs/desktop-manual/current/installation/download-installation/ |
| 10 | +1. Create and start a database (version 5.7.0): https://neo4j.com/docs/desktop-manual/current/operations/create-dbms/ |
| 11 | +1. Install APOC: https://neo4j.com/docs/apoc/5/installation/ |
| 12 | + |
| 13 | +# Create GitHub credentials |
| 14 | +1. Create a GitHub user access token: https://docs.github.com/en/[email protected]/apps/creating-github-apps/authenticating-with-a-github-app/generating-a-user-access-token-for-a-github-app |
| 15 | +NOTE: This token will be used in your `.env` file. |
| 16 | + |
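Before wiring the token into a pipeline, it can help to confirm that it authenticates. The sketch below is not part of the plugin: `build_user_request` is a hypothetical helper, and it assumes a GitHub Enterprise Server instance serves its REST API under `/api/v3` while github.com uses `api.github.com`. It only builds a `GET /user` request with standard-library calls and does not send anything.

```python
import urllib.request


def build_user_request(hostname: str, token: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a GET /user request for checking a token.

    Hypothetical helper for illustration; the plugin does not expose it.
    """
    # github.com serves its REST API from api.github.com, while a GitHub
    # Enterprise Server instance serves it from /api/v3 on its own hostname.
    if hostname == "github.com":
        base_url = "https://api.github.com"
    else:
        base_url = f"https://{hostname}/api/v3"

    return urllib.request.Request(
        f"{base_url}/user",
        headers={
            # The access token is sent as a bearer token.
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            # GitHub rejects API requests that lack a User-Agent header.
            "User-Agent": user_agent,
        },
    )
```

Passing the returned request to `urllib.request.urlopen` should give a 200 response when the token is valid and a 401 error otherwise.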
| 17 | +# Install and run the app |
| 18 | +1. Install python3: https://www.python.org/downloads/ |
| 19 | +1. Install poetry: https://python-poetry.org/docs/#installation |
| 20 | +1. Install nodestream: https://nodestream-proj.github.io/nodestream/0.5/docs/tutorial/ |
| 21 | +1. Generate a new nodestream project |
| 22 | +1. Add `nodestream-github` to the dependencies in your nodestream project's `pyproject.toml` file. |
| 23 | +1. Install necessary dependencies: `poetry install` |
| 24 | +1. In `nodestream.yaml` add the following: |
| 25 | +```yaml |
| 26 | +plugins: |
| 27 | +  - name: github |
| 28 | +    config: |
| 29 | +      github_hostname: github.example.com |
| 30 | +      auth_token: !env GITHUB_ACCESS_TOKEN |
| 31 | +      user_agent: skip-jbristow-test |
| 32 | +      per_page: 100 |
| 33 | +      collecting: |
| 34 | +        all_public: True |
| 35 | +      rate_limit_per_minute: 225 |
| 36 | +    targets: |
| 37 | +      - my-db: |
| 38 | +    pipelines: |
| 39 | +      - name: github_repos |
| 40 | +      - name: github_teams |
| 41 | +targets: |
| 42 | +  database: neo4j |
| 43 | +  uri: bolt://localhost:7687 |
| 44 | +  username: neo4j |
| 45 | +  password: neo4j123 |
| 46 | +``` |
| 47 | +1. Set the `GITHUB_ACCESS_TOKEN` environment variable in your terminal session. |
| 48 | +1. Verify nodestream has loaded the pipelines: `poetry run nodestream show` |
| 49 | +1. Use nodestream to run the pipelines: `poetry run nodestream run <pipeline-name> --target my-db` |
| 50 | + |
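The last steps above can be scripted. The following is a minimal sketch, not something shipped with the plugin: `build_run_command` and `run_pipelines` are hypothetical names, and the pipeline and target names are the ones from the example configuration. It checks that `GITHUB_ACCESS_TOKEN` is set and then invokes the same `poetry run nodestream run` command once per pipeline.

```python
import os
import subprocess


def build_run_command(pipeline: str, target: str) -> list[str]:
    """Return the argv for running one pipeline against one target."""
    return ["poetry", "run", "nodestream", "run", pipeline, "--target", target]


def run_pipelines(pipelines: list[str], target: str) -> None:
    """Run each pipeline in order, stopping at the first failure."""
    # Fail early with a clear message instead of letting the pipeline
    # start without credentials.
    if not os.environ.get("GITHUB_ACCESS_TOKEN"):
        raise RuntimeError("GITHUB_ACCESS_TOKEN is not set")

    for pipeline in pipelines:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_run_command(pipeline, target), check=True)
```

Calling `run_pipelines(["github_repos", "github_teams"], "my-db")` from the project directory would run both example pipelines in sequence.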
| 51 | +# Using make |
| 52 | +1. Install make (e.g. `brew install make`) |
| 53 | +1. Run `make run` |
| 54 | + |
| 55 | + |
| 56 | +# Authors |
| 57 | +* Jon Bristow |
| 58 | +* Zach Probst |