Conversation
extras: "postgres postgis parquet duckdb cli filesystem"
needs_postgres: true

# Clickhouse OSS (TODO: test with minio s3)
ClickHouse OSS is disabled for now. It would be good to make it work without any remote services, for example by testing it with S3 via MinIO.
hmmmm why? It only uses the local filesystem. I'm running tests on it locally with the network disconnected. We have a compose file for it:
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    ports:
      - "9000:9000"
      - "8123:8123"
    environment:
      - CLICKHOUSE_DB=dlt_data
      - CLICKHOUSE_USER=loader
      - CLICKHOUSE_PASSWORD=loader
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
    volumes:
      - clickhouse_data:/var/lib/clickhouse/
      - clickhouse_logs:/var/log/clickhouse-server/
    restart: unless-stopped
    healthcheck:
      test: [ "CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8123/ping" ]
      interval: 3s
      timeout: 5s
      retries: 5
volumes:
  clickhouse_data:
  clickhouse_logs:
I think it uses the local ClickHouse but depends on s3, az, etc. Are you sure you can run the full test suite without a network connection? I will uncomment it again and have another look tomorrow, but I think it will not work without secrets.
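For the "test with minio s3" TODO, a MinIO service could sit next to ClickHouse in the same compose file so the S3 code path stays local. This is only a sketch under assumptions: the service name, ports, credentials, and healthcheck below are placeholders, not part of this PR.

```yaml
services:
  # hypothetical local S3 stand-in for ClickHouse staging tests;
  # credentials and ports are placeholders, not from the PR
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    ports:
      - "9002:9000"   # S3 API (remapped to avoid the clickhouse 9000 port)
      - "9001:9001"   # web console
    environment:
      - MINIO_ROOT_USER=minioadmin
      - MINIO_ROOT_PASSWORD=minioadmin
    volumes:
      - minio_data:/data
    healthcheck:
      test: [ "CMD", "mc", "ready", "local" ]
      interval: 3s
      timeout: 5s
      retries: 5
volumes:
  minio_data:
```

Tests would then point their s3 endpoint at `http://localhost:9002` instead of a real bucket.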
- name: postgres and redshift
  # TODO: check whether duckdb and all filesystem drivers need to be here; if so, explain why
  destinations: "[\"redshift\", \"postgres\", \"duckdb\", \"dummy\"]"
  filesystem_drivers: "[\"memory\", \"file\", \"r2\", \"s3\", \"gs\", \"az\", \"abfss\", \"gdrive\"]" # excludes sftp
I don't think we need all these drivers here. I took this from test_destinations.yml, so it is a pretty ancient part of this setup, and I was not sure whether any combinations of the above destinations need these bucket settings. But afaik filesystem_drivers only applies to filesystem tests, right?
Those were left there after I extracted filesystem to a separate workflow. I think you can leave memory and file; let's see what happens.
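If only the local drivers are kept, the matrix entry discussed above could shrink to something like this. A sketch of the suggested trim, not the final config:

```yaml
- name: postgres and redshift
  destinations: "[\"redshift\", \"postgres\", \"duckdb\", \"dummy\"]"
  # trimmed per the suggestion: keep only the drivers that need no remote services
  filesystem_drivers: "[\"memory\", \"file\"]"
```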
extras: "postgres redshift postgis s3 gs az parquet duckdb"
with: ",adbc"

# Qdrant (disabled, TODO: explain why)
This workflow is currently disabled in the repo. I'd like to add a note here explaining exactly why (because I don't know).
I think we lost our test account. We only test the local version now.
rudolfix left a comment:
thanks for doing this!
branches: [ master, devel ]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
so this workflow will cancel all the workflows it started?
yes it will, it's pretty cool!
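The cancellation behavior comes from pairing that group key with `cancel-in-progress`. A minimal sketch, assuming the flag is set (the actual workflow may word it differently):

```yaml
concurrency:
  # all runs for the same PR (or the same ref for pushes) share one group;
  # a newly started run cancels the in-progress one for that group,
  # which also takes down the reusable workflows it called
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
```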
authorize_run_from_fork:
  name: check if fork and if so whether secrets are available
  # run when label is assigned OR when we are not a fork
  if: ${{ github.event.label.name == 'ci from fork' || (github.event.pull_request.head.repo.full_name == github.repository && (github.event.action == 'opened' || github.event.action == 'synchronize')) }}
Now that pull_request_target is deactivated, this authorization is defunct (github.event.pull_request.head.repo.full_name == github.repository is always true).
We can enable PRs from forks in a separate ticket.
# Destination and Sources local tests, do not provide secrets
# Other tests that do not require remote connections
#
test_destinations_local:
Those run in parallel, right?
Yes, all tests that have their conditions met start immediately, so you have a bunch of tests running in parallel, which also includes matrices.
needs: test_common
uses: ./.github/workflows/test_sources_local.yml

test_plus:
Maybe those 3 workflows should only depend on lint?
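That suggestion would amount to swapping the `needs` key. A hypothetical sketch, where the job and file names follow the snippets in this thread and the `secrets: inherit` line is an assumption:

```yaml
test_plus:
  # start as soon as linting passes instead of waiting for test_common
  needs: lint
  uses: ./.github/workflows/test_plus.yml
  secrets: inherit
```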
post_install_commands: "poetry run pip install sqlalchemy==2.0.18" # minimum version required by `pyiceberg`
additional_tests: "poetry run pytest tests/cli"

- name: postgres, duckdb
We were also running the dummy destination.
Not in the local tests, but I moved it here now from the remote tests (test_destinations.yml), where it also was before.
filesystem_drivers: "[\"memory\"]"
extras: "motherduck s3 gs az parquet"

# MSSQL
MSSQL was always running the full suite, and this one is fast. We should do the same for remote postgres.
poetry run pytest tests/load --ignore tests/load/sources
poetry run pytest tests/cli
env:
  DESTINATION__POSTGRES__CREDENTIALS: postgresql://loader:loader@localhost:5432/dlt_data
to this new file, see the comment above. There is one step in the workflow that copies these back in.
* use both pull request and pull request target on destination workflows
* remove additional triggers
* marks one test as smoke test and only runs this for the time being
* only run one test in common, needs to be reverted later
* run common tests only on linter success
* fix common workflow
* only start workflows on call (do not call them yet)
* test master workflow
* remove docs changes step from lint
* remove local destinations docs change
* rename master trigger workflows
* change concurrency key
* try other dependencies
* add destination tests with authorize step
* remove authorize and docs step from destination tests
* fix destination test
* rename main workflow
* test inherit secrets
* add more workflows to main file
* fix starting conditions for some workflows
* rename plus tests matrix job
* remove concurrency settings for now
* add first remote destinations workflow version
* move some more remote destinations
* remove pytest args
* try to fix extras string
* add more remote destination tests
* rename some workflows and add concurrency settings to main workflow
* move test_destinations
* fix link to called workflow
* add better main workflow labels; move clickhouse remote tests
* create local destinations test
* disabled some workflows
* disable clickhouse oss for now; split duckdb and postgres local tests into own matrix job
* copy ssh agent key
* move all local destination secrets into template secrets file
* small fixes
* enable all tests again
* fix local tests
* add missing openai dep
* try to fix qdrant creds
* fix qdrant server / local file differentiation
* fix cli test
* change workflow dependencies
* remove telemetry info and other small changes
* run dummy destination with the local tests
* remove duckdb from remote tests, always run all mssql and postgres tests
* enable clickhouse oss
* fix condition for always running all tests
* move cli commands to postgres tests
* rename clickhouse-compose to be inline with other services
* fix clickhouse local credentials and disable tests which require staging destinations
* adapt postgres to postgres example to new fixture
* fix clickhouse excluded configs
* update essential test handling
* skip gcs compat test for local clickhouse tests
Description
We need to be more economical with our CI minutes. Proposed changes are:
TODO before finalizing
Review checklist