| 1 | +# Meltano GitHub -> CrateDB example |
| 2 | + |
| 3 | +## About |
| 4 | + |
| 5 | +Acquire repository metadata from the GitHub API and insert it into CrateDB |
| 6 | +database tables, using [meltano-target-cratedb]. |
| 7 | + |
| 8 | +It follows the canonical example demonstrated in the [Meltano Getting Started Tutorial]. |
| 9 | + |
| 10 | +## Configuration |
| 11 | + |
| 12 | +### tap-github |
| 13 | + |
| 14 | +To access the GitHub API, you need an authentication token. You can |
| 15 | +create one at [GitHub Developer Settings » Tokens]. |
| 16 | + |
| 17 | +To configure the recipe, store the token in the `TAP_GITHUB_AUTH_TOKEN` |
| 18 | +environment variable, either interactively or by creating a dotenv |
| 19 | +configuration file named `.env`. |
| 20 | + |
| 21 | +```shell |
| 22 | +TAP_GITHUB_AUTH_TOKEN='ghp_hmQR3XTFWkfIcuyjRTBuVrRt6mnL1j2mMPT8' |
| 23 | +``` |
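As a pre-flight check before running the pipeline, the following minimal sketch looks for the token in the process environment first and then in the text of a `.env` file. The helper names `parse_dotenv` and `find_token` are hypothetical, the lookup order is an assumption of this sketch, and the parser only handles plain `KEY=VALUE` lines, so it is not a substitute for the dotenv handling Meltano itself performs.

```python
def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and a single layer of quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_token(environ, dotenv_text=""):
    """Return the token from `environ`, else from the dotenv text, else None."""
    name = "TAP_GITHUB_AUTH_TOKEN"
    return environ.get(name) or parse_dotenv(dotenv_text).get(name)
```

A typical call would pass `os.environ` and the contents of `.env`, for example `find_token(os.environ, open(".env").read())`, and treat a `None` result as "token not configured".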
| 24 | + |
| 25 | +Then, in `meltano.yml`, identify the `tap-github` section in `plugins.extractors`, |
| 26 | +and adjust the value of `config.repositories` to correspond to the repository |
| 27 | +you intend to scrape. |
| 28 | + |
| 29 | +### target-cratedb |
| 30 | + |
| 31 | +In the `target-cratedb` section of `plugins.loaders`, adjust |
| 32 | +`config.sqlalchemy_url` to match your database connection settings. |
| 33 | + |
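To illustrate what a value for `config.sqlalchemy_url` might look like, here is a small sketch that assembles a connection URL from its parts. It assumes the `crate://` scheme of the CrateDB SQLAlchemy dialect, CrateDB's default HTTP port 4200, and the default `crate` user; the function name `cratedb_url` is hypothetical. Check the [meltano-target-cratedb] documentation for the exact form your setup needs.

```python
from urllib.parse import quote


def cratedb_url(host="localhost", port=4200, user="crate", password=""):
    """Build a SQLAlchemy-style URL; defaults are assumptions for a local instance."""
    # Percent-encode credentials so characters like "@" or ":" do not break the URL.
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    return f"crate://{auth}@{host}:{port}/"


print(cratedb_url())
```

The percent-encoding step matters mostly for passwords containing reserved characters, which would otherwise be misread as URL delimiters.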
| 34 | + |
| 35 | +## Usage |
| 36 | + |
| 37 | +Install dependencies. |
| 38 | +```shell |
| 39 | +meltano install |
| 40 | +``` |
| 41 | + |
| 42 | +Invoke data transfer to JSONL files. |
| 43 | +```shell |
| 44 | +meltano run tap-github target-jsonl |
| 45 | +cat github-to-cratedb/output/commits.jsonl |
| 46 | +``` |
| 47 | + |
| 48 | +Invoke data transfer to the CrateDB database. |
| 49 | +```shell |
| 50 | +meltano run tap-github target-cratedb |
| 51 | +``` |
| 52 | + |
| 53 | +## Screenshot |
| 54 | + |
| 55 | +Enjoy the release notes. |
| 56 | +```sql |
| 57 | +SELECT repo, tag_name, body FROM melty.releases ORDER BY tag_name DESC; |
| 58 | +``` |
| 59 | + |
| 60 | + |
| 61 | + |
| 62 | +## Troubleshooting |
| 63 | + |
| 64 | +If you see an error like the following, verify the GitHub authentication |
| 65 | +token stored in the `TAP_GITHUB_AUTH_TOKEN` environment variable. |
| 66 | +```python |
| 67 | +singer_sdk.exceptions.RetriableAPIError: 401 Client Error: b'{"message":"This endpoint requires you to be authenticated.","documentation_url":"https://docs.github.com/graphql/guides/forming-calls-with-graphql#authenticating-with-graphql"}' (Reason: Unauthorized) for path: /graphql cmd_type=elb consumer=False name=tap-github producer=True stdio=stderr string_id=tap-github |
| 68 | +``` |
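One way to verify the token outside of Meltano is to send it to the GitHub REST API directly. The sketch below uses only the Python standard library and assumes that an authenticated `GET https://api.github.com/user` request is a reasonable probe: a 200 response means the token was accepted, a 401 means it was rejected. The function names are hypothetical, and the probe does not check whether the token carries the scopes that `tap-github` needs.

```python
import urllib.error
import urllib.request

GITHUB_USER_ENDPOINT = "https://api.github.com/user"


def build_token_check_request(token):
    """Prepare an authenticated request without sending it."""
    return urllib.request.Request(
        GITHUB_USER_ENDPOINT,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def token_is_accepted(token, timeout=10):
    """Return True on HTTP 200, False on HTTP 401; re-raise any other HTTP error."""
    request = build_token_check_request(token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 401:
            return False
        raise
```

Calling `token_is_accepted(os.environ["TAP_GITHUB_AUTH_TOKEN"])` requires network access; a `False` result points to the same problem as the 401 error shown above.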
| 69 | + |
| 70 | +## Development |
| 71 | +To link the sandbox to a development installation of [meltano-target-cratedb], |
| 72 | +configure the `pip_url` of the component like this: |
| 73 | +```yaml |
| 74 | +pip_url: --editable=/path/to/sources/meltano-target-cratedb |
| 75 | +``` |
| 76 | + |
| 77 | + |
| 78 | +[GitHub Developer Settings » Tokens]: https://github.com/settings/tokens |
| 79 | +[Meltano Getting Started Tutorial]: https://docs.meltano.com/getting-started/part1 |
| 80 | +[meltano-target-cratedb]: https://github.com/crate-workbench/meltano-target-cratedb |
| 81 | +[tap-github]: https://hub.meltano.com/extractors/tap-github/ |
| 82 | +[target-jsonl]: https://hub.meltano.com/loaders/target-jsonl/ |