feat(etl-api, etl-validation): connection validation endpoints #458
base: main
@codex review
Thanks for your contribution. Given our production setup, this validation is good but not bulletproof. I am not sure we want to bring this much code into the project right now, but if we see the need, we will definitely come back to this PR. I will use your ideas and improvements to the test harnesses though; those are nice improvements, and thanks a lot for them!
💡 Codex Review
Here are some automated review suggestions for this pull request.
pg_config
    .validate()
    .await
    .map_err(|e| SourcesDbError::Database(sqlx::Error::Configuration(Box::new(e))))?;
Return source validation failures as client errors
Validation failures from pg_config.validate() are wrapped into SourcesDbError::Database(...) here, which SourceError::status_code returns as HTTP 500 with the generic "internal server error" body. The OpenAPI annotation for this endpoint advertises 400 for failed connection tests, and the new ValidationError types distinguish misconfiguration from server faults. With the current mapping, bad credentials, non‑logical wal_level, or missing replication slots will be reported as 500s with no actionable message, preventing clients from telling user errors apart from real server failures.
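One way to act on this suggestion is to give validation failures their own variant on the API error type, so that the status code is chosen by what went wrong rather than by the wrapper type. The following is a minimal sketch under stated assumptions: the `ValidationError` and `SourceError` definitions below are hypothetical stand-ins written for this example, not the types from the PR, and status codes are plain `u16` values to keep the block dependency-free.

```rust
/// Hypothetical stand-in for the PR's `ValidationError`; variants are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum ValidationError {
    /// The user-supplied configuration is wrong (bad credentials,
    /// non-logical wal_level, missing replication slot, ...).
    Misconfigured(String),
    /// The validator itself failed for a reason the user cannot fix.
    Internal(String),
}

/// Hypothetical API-level error for the sources routes.
#[derive(Debug, Clone, PartialEq)]
pub enum SourceError {
    /// A connection test failed; carries the validator's reason.
    Validation(ValidationError),
    /// A genuine database fault on the API side.
    Database(String),
}

impl SourceError {
    /// Misconfiguration is the caller's problem (400); everything else is ours (500).
    pub fn status_code(&self) -> u16 {
        match self {
            SourceError::Validation(ValidationError::Misconfigured(_)) => 400,
            SourceError::Validation(ValidationError::Internal(_)) => 500,
            SourceError::Database(_) => 500,
        }
    }

    /// Only user-actionable reasons are echoed back; server faults stay generic.
    pub fn body(&self) -> String {
        match self {
            SourceError::Validation(ValidationError::Misconfigured(reason)) => reason.clone(),
            _ => "internal server error".to_string(),
        }
    }
}
```

With a mapping like this, a non-logical `wal_level` would reach the client as a 400 carrying the validator's message, while a real database fault would still produce the generic 500.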
destination_config.validate().await.map_err(|e| {
    DestinationsDbError::Database(sqlx::Error::Configuration(Box::new(std::io::Error::new(
        std::io::ErrorKind::Other,
        e,
    ))))
Destination validation errors mapped to 500 internal error
Destination connection validation errors are converted into DestinationsDbError::Database(sqlx::Error::Configuration(..)), which DestinationError renders as HTTP 500 with the generic "internal server error" message. The route’s OpenAPI docs include a 400 response for connection test failures, and ValidationError provides specific, user‑actionable reasons (e.g., bad BigQuery key, unreachable Iceberg catalog). Currently all such user errors surface as 500s without details, making clients unable to tell misconfiguration from server faults.
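On the destination side, the `map_err` wrapping could be replaced by a `From` conversion, so that `?` carries the validator's reason straight into a client-facing error. The sketch below only illustrates that idea: `ValidationError`, `DestinationError`, `validate_destination`, and `test_connection` are all hypothetical names defined here, and the check is synchronous and trivial where the real validation is async and talks to BigQuery or Iceberg.

```rust
/// Hypothetical stand-in for the PR's `ValidationError`; variants are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum ValidationError {
    InvalidCredentials(String),
    Unreachable(String),
}

/// Hypothetical API error for the destinations routes.
#[derive(Debug, Clone, PartialEq)]
pub enum DestinationError {
    /// Rendered as HTTP 400 with the contained message.
    BadRequest(String),
}

impl From<ValidationError> for DestinationError {
    fn from(err: ValidationError) -> Self {
        // Keep the user-actionable reason instead of hiding it behind a 500.
        match err {
            ValidationError::InvalidCredentials(detail) => {
                DestinationError::BadRequest(format!("invalid credentials: {detail}"))
            }
            ValidationError::Unreachable(detail) => {
                DestinationError::BadRequest(format!("destination unreachable: {detail}"))
            }
        }
    }
}

/// Hypothetical validator: only checks that a key was supplied at all.
fn validate_destination(service_account_key: &str) -> Result<(), ValidationError> {
    if service_account_key.trim().is_empty() {
        return Err(ValidationError::InvalidCredentials(
            "service account key is empty".to_string(),
        ));
    }
    Ok(())
}

/// Hypothetical handler body: `?` applies the `From` conversion above.
pub fn test_connection(service_account_key: &str) -> Result<&'static str, DestinationError> {
    validate_destination(service_account_key)?;
    Ok("connection ok")
}
```

The design choice here is that the conversion lives in one place, so every route that calls a validator reports failures the same way.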
@iambriccardo Thanks for the feedback! If the issue scope has changed or is no longer needed, should we close issue #30?
Your implementation is perfectly fine. The main issue I was referring to is that our replicator instances are running in the same network as the API (which performs the validation). However, we should not make any assumptions about that. For example, the API might reach the systems while the replicators cannot for some other reason. This is why the validation is helpful from a UX standpoint, but allowing the pipeline to fail and report its own problem is also a valid solution. It results in slightly worse UX but requires no code changes.
@iambriccardo
What kind of change does this PR introduce?
Fixes #30
What is the new behavior?
New Endpoints
- POST /v1/sources/test-connection
- POST /v1/destinations/test-connection
Created a new etl-validation crate with validation logic separated from API concerns:
- SourceValidator trait for PostgreSQL validation
- DestinationValidator trait for BigQuery and Iceberg validation
Another case for etl-validation is that the validation logic can be reused before a pipeline even starts; with the current implementation, pipeline start is where all potential source/destination errors appear.
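To illustrate why keeping validation behind a trait makes it reusable outside the API, here is a rough sketch of the shape such a trait could take. It is not the PR's code: the `validate` signature, the `PgSourceConfig` fields, and the `preflight` helper are assumptions made for this example, and the method is synchronous here (the PR's `validate()` is awaited) so the block stays dependency-free. A real implementation would query the database for `wal_level` rather than read it from a field.

```rust
/// Hypothetical error type; the PR's `ValidationError` has its own variants.
#[derive(Debug, Clone, PartialEq)]
pub enum ValidationError {
    Misconfigured(String),
}

/// Hypothetical shape of the source validation trait.
pub trait SourceValidator {
    fn validate(&self) -> Result<(), ValidationError>;
}

/// Hypothetical PostgreSQL source config; field names are illustrative.
pub struct PgSourceConfig {
    pub host: String,
    /// Stand-in for the value a real validator would read from the server.
    pub wal_level: String,
}

impl SourceValidator for PgSourceConfig {
    fn validate(&self) -> Result<(), ValidationError> {
        if self.host.is_empty() {
            return Err(ValidationError::Misconfigured(
                "host must not be empty".to_string(),
            ));
        }
        if self.wal_level != "logical" {
            return Err(ValidationError::Misconfigured(format!(
                "wal_level is '{}', expected 'logical'",
                self.wal_level
            )));
        }
        Ok(())
    }
}

/// The same check can be called from an API handler or before a pipeline starts.
pub fn preflight(source: &dyn SourceValidator) -> Result<(), ValidationError> {
    source.validate()
}
```

Because `preflight` only depends on the trait, a caller that is not the API could run the identical check from its own network position.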
Additional context
Test Infrastructure Improvements
- TestApp Drop implementation handles cleanup errors gracefully instead of panicking
- drop_pg_database logs warnings instead of panicking on connection failures
- catch_unwind prevents cleanup failures from crashing the test suite
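The cleanup pattern described in this list can be sketched roughly as follows. This is an illustration, not the PR's test harness: the `TestApp` fields, `drop_database`, and `try_cleanup` are hypothetical, and the failure is simulated with a flag instead of a real Postgres connection. The one real API used is `std::panic::catch_unwind`.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical stand-in for the PR's `TestApp`.
pub struct TestApp {
    pub database_name: String,
    /// Simulates a cleanup failure such as an unreachable database.
    pub cleanup_should_panic: bool,
}

/// Hypothetical cleanup step; the PR's `drop_pg_database` talks to Postgres.
fn drop_database(app: &TestApp) {
    if app.cleanup_should_panic {
        panic!("could not connect to drop {}", app.database_name);
    }
}

/// Runs cleanup and reports whether it finished without panicking.
pub fn try_cleanup(app: &TestApp) -> bool {
    catch_unwind(AssertUnwindSafe(|| drop_database(app))).is_ok()
}

impl Drop for TestApp {
    fn drop(&mut self) {
        // Log and move on: a panic escaping from `drop` while another panic
        // is already unwinding would abort the whole test process.
        if !try_cleanup(self) {
            eprintln!(
                "warning: cleanup of {} failed; continuing",
                self.database_name
            );
        }
    }
}
```

The panic message from the failed cleanup is still printed by the default panic hook, so the failure stays visible in the test output even though it no longer takes the suite down.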