Commit 51d04dd

docs: updated readme, contributing, and release docs (#120)
1 parent c347760 commit 51d04dd

4 files changed: 205 additions & 227 deletions

README.md

Lines changed: 13 additions & 208 deletions
@@ -4,226 +4,31 @@ Airbyte Python CDK is a framework for building Airbyte API Source Connectors. It
 classes and helpers that make it easy to build a connector against an HTTP API (REST, GraphQL, etc),
 or a generic Python source connector.

-## Usage
+## Building Connectors with the CDK

-If you're looking to build a connector, we highly recommend that you
+If you're looking to build a connector, we highly recommend that you first
 [start with the Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview).
 It should be enough for 90% connectors out there. For more flexible and complex connectors, use the
 [low-code CDK and `SourceDeclarativeManifest`](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview).

-If that doesn't work, then consider building on top of the
-[lower-level Python CDK itself](https://docs.airbyte.com/connector-development/cdk-python/).
-
-### Quick Start
-
-To get started on a Python CDK based connector or a low-code connector, you can generate a connector
-project from a template:
-
-```bash
-# from the repo root
-cd airbyte-integrations/connector-templates/generator
-./generate.sh
-```
-
-### Example Connectors
-
-**HTTP Connectors**:
-
-- [Stripe](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/)
-- [Salesforce](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/)
-
-**Python connectors using the bare-bones `Source` abstraction**:
-
-- [Google Sheets](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-google-sheets/google_sheets_source/google_sheets_source.py)
-
-This will generate a project with a type and a name of your choice and put it in
-`airbyte-integrations/connectors`. Open the directory with your connector in an editor and follow
-the `TODO` items.
+For more information on building connectors, please see the [Connector Development](https://docs.airbyte.com/connector-development/) guide on [docs.airbyte.com](https://docs.airbyte.com).

 ## Python CDK Overview

 Airbyte CDK code is within `airbyte_cdk` directory. Here's a high level overview of what's inside:

-- `connector_builder`. Internal wrapper that helps the Connector Builder platform run a declarative
-manifest (low-code connector). You should not use this code directly. If you need to run a
-`SourceDeclarativeManifest`, take a look at
-[`source-declarative-manifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest)
-connector implementation instead.
-- `destinations`. Basic Destination connector support! If you're building a Destination connector in
-Python, try that. Some of our vector DB destinations like `destination-pinecone` are using that
-code.
-- `models` expose `airbyte_protocol.models` as a part of `airbyte_cdk` package.
-- `sources/concurrent_source` is the Concurrent CDK implementation. It supports reading data from
-streams concurrently per slice / partition, useful for connectors with high throughput and high
-number of records.
-- `sources/declarative` is the low-code CDK. It works on top of Airbyte Python CDK, but provides a
-declarative manifest language to define streams, operations, etc. This makes it easier to build
-connectors without writing Python code.
-- `sources/file_based` is the CDK for file-based sources. Examples include S3, Azure, GCS, etc.
+- `airbyte_cdk/connector_builder`. Internal wrapper that helps the Connector Builder platform run a declarative manifest (low-code connector). You should not use this code directly. If you need to run a `SourceDeclarativeManifest`, take a look at [`source-declarative-manifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest) connector implementation instead.
+- `airbyte_cdk/cli/source_declarative_manifest`. This module defines the `source-declarative-manifest` (aka "SDM") connector execution logic and associated CLI.
+- `airbyte_cdk/destinations`. Basic Destination connector support! If you're building a Destination connector in Python, try that. Some of our vector DB destinations like `destination-pinecone` are using that code.
+- `airbyte_cdk/models` expose `airbyte_protocol.models` as a part of `airbyte_cdk` package.
+- `airbyte_cdk/sources/concurrent_source` is the Concurrent CDK implementation. It supports reading data from streams concurrently per slice / partition, useful for connectors with high throughput and high number of records.
+- `airbyte_cdk/sources/declarative` is the low-code CDK. It works on top of Airbyte Python CDK, but provides a declarative manifest language to define streams, operations, etc. This makes it easier to build connectors without writing Python code.
+- `airbyte_cdk/sources/file_based` is the CDK for file-based sources. Examples include S3, Azure, GCS, etc.

 ## Contributing

-Thank you for being interested in contributing to Airbyte Python CDK! Here are some guidelines to
-get you started:
-
-- We adhere to the [code of conduct](/CODE_OF_CONDUCT.md).
-- You can contribute by reporting bugs, posting github discussions, opening issues, improving
-[documentation](/docs/), and submitting pull requests with bugfixes and new features alike.
-- If you're changing the code, please add unit tests for your change.
-- When submitting issues or PRs, please add a small reproduction project. Using the changes in your
-connector and providing that connector code as an example (or a satellite PR) helps!
-
-### First time setup
-
-Install the project dependencies and development tools:
-
-```bash
-poetry install --all-extras
-```
-
-Installing all extras is required to run the full suite of unit tests.
-
-#### Running tests locally
-
-- Iterate on the CDK code locally
-- Run tests via `poetry run poe unit-test-with-cov`, or `python -m pytest -s unit_tests` if you want
-to pass pytest options.
-- Run `poetry run poe check-local` to lint all code, type-check modified code, and run unit tests
-with coverage in one command.
-
-To see all available scripts, run `poetry run poe`.
-
-#### Formatting the code
-
-- Iterate on the CDK code locally
-- Run `poetry run ruff format` to format your changes.
-
-To see all available `ruff` options, run `poetry run ruff`.
-
-##### Autogenerated files
-
-Low-code CDK models are generated from `sources/declarative/declarative_component_schema.yaml`. If
-the iteration you are working on includes changes to the models or the connector generator, you
-might want to regenerate them. In order to do that, you can run:
-
-```bash
-poetry run poe build
-```
-
-This will generate the code generator docker image and the component manifest files based on the
-schemas and templates.
-
-#### Testing
-
-All tests are located in the `unit_tests` directory. Run `poetry run poe unit-test-with-cov` to run
-them. This also presents a test coverage report. For faster iteration with no coverage report and
-more options, `python -m pytest -s unit_tests` is a good place to start.
-
-#### Building and testing a connector with your local CDK
-
-When developing a new feature in the CDK, you may find it helpful to run a connector that uses that
-new feature. You can test this in one of two ways:
-
-- Running a connector locally
-- Building and running a source via Docker
-
-##### Installing your local CDK into a local Python connector
-
-Open the connector's `pyproject.toml` file and replace the line with `airbyte_cdk` with the
-following:
-
-```toml
-airbyte_cdk = { path = "../../../airbyte-cdk/python/airbyte_cdk", develop = true }
-```
-
-Then, running `poetry update` should reinstall `airbyte_cdk` from your local working directory.
-
-##### Building a Python connector in Docker with your local CDK installed
-
-_Pre-requisite: Install the
-[`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_
-
-You can build your connector image with the local CDK using
-
-```bash
-# from the airbytehq/airbyte base directory
-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
-```
-
-Note that the local CDK is injected at build time, so if you make changes, you will have to run the
-build command again to see them reflected.
-
-##### Running Connector Acceptance Tests for a single connector in Docker with your local CDK installed
-
-_Pre-requisite: Install the
-[`airbyte-ci` CLI](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md)_
-
-To run acceptance tests for a single connectors using the local CDK, from the connector directory,
-run
-
-```bash
-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test
-```
-
-#### When you don't have access to the API
-
-There may be a time when you do not have access to the API (either because you don't have the
-credentials, network access, etc...) You will probably still want to do end-to-end testing at least
-once. In order to do so, you can emulate the server you would be reaching using a server stubbing
-tool.
-
-For example, using [mockserver](https://www.mock-server.com/), you can set up an expectation file
-like this:
-
-```json
-{
-  "httpRequest": {
-    "method": "GET",
-    "path": "/data"
-  },
-  "httpResponse": {
-    "body": "{\"data\": [{\"record_key\": 1}, {\"record_key\": 2}]}"
-  }
-}
-```
-
-Assuming this file has been created at `secrets/mock_server_config/expectations.json`, running the
-following command will allow to match any requests on path `/data` to return the response defined in
-the expectation file:
-
-```bash
-docker run -d --rm -v $(pwd)/secrets/mock_server_config:/config -p 8113:8113 --env MOCKSERVER_LOG_LEVEL=TRACE --env MOCKSERVER_SERVER_PORT=8113 --env MOCKSERVER_WATCH_INITIALIZATION_JSON=true --env MOCKSERVER_PERSISTED_EXPECTATIONS_PATH=/config/expectations.json --env MOCKSERVER_INITIALIZATION_JSON_PATH=/config/expectations.json mockserver/mockserver:5.15.0
-```
-
-HTTP requests to `localhost:8113/data` should now return the body defined in the expectations file.
-To test this, the implementer either has to change the code which defines the base URL for Python
-source or update the `url_base` from low-code. With the Connector Builder running in docker, you
-will have to use domain `host.docker.internal` instead of `localhost` as the requests are executed
-within docker.
-
-#### Publishing a new version to PyPi
-
-Python CDK has a
-[GitHub workflow](https://github.com/airbytehq/airbyte/actions/workflows/publish-cdk-command-manually.yml)
-that manages the CDK changelog, making a new release for `airbyte_cdk`, publishing it to PyPI, and
-then making a commit to update (and subsequently auto-release)
-[`source-declarative-m anifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest)
-and Connector Builder (in the platform repository).
-
-> [!Note]: The workflow will handle the `CHANGELOG.md` entry for you. You should not add changelog
-> lines in your PRs to the CDK itself.
+For instructions on how to contribute, please see our [Contributing Guide](docs/CONTRIBUTING.md).

-> [!Warning]: The workflow bumps version on it's own, please don't change the CDK version in
-> `pyproject.toml` manually.
+## Release Management

-1. You only trigger the release workflow once all the PRs that you want to be included are already
-merged into the `master` branch.
-2. The
-[`Publish CDK Manually`](https://github.com/airbytehq/airbyte/actions/workflows/publish-cdk-command-manually.yml)
-workflow from master using `release-type=major|manor|patch` and setting the changelog message.
-3. When the workflow runs, it will commit a new version directly to master branch.
-4. The workflow will bump the version of `source-declarative-manifest` according to the
-`release-type` of the CDK, then commit these changes back to master. The commit to master will
-kick off a publish of the new version of `source-declarative-manifest`.
-5. The workflow will also add a pull request to `airbyte-platform-internal` repo to bump the
-dependency in Connector Builder.
+Please see the [Release Management](docs/RELEASES.md) guide for information on how to perform releases and pre-releases.
