feat: skip config validation during discovery for declarative sources that don't use DynamicSchemaLoader#464
Conversation
…cSchemaLoader Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
|
Original prompt from Aaron: |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…ation control Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…class Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Aaron ("AJ") Steers (aaronsteers)
left a comment
There was a problem hiding this comment.
I've done a few rounds of review and I think this is looking well. I'll open up now to other reviewers and mark ready for review.
| if cmd == "spec": | ||
| message = AirbyteMessage(type=Type.SPEC, spec=source_spec) | ||
| yield from [ | ||
| yield from ( |
There was a problem hiding this comment.
aside - Note: I corrected Devin's implementation to use generator comprehension instead of list comprehension, and Devin applies in these adjacent locations as well. I think this is a positive change, calling out though to explain why other code paths are touched.
More about generator comprehensions here: https://stackoverflow.com/a/47826
…ver property Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
Show resolved
Hide resolved
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
|
/autofix
|
Aaron ("AJ") Steers (aaronsteers)
left a comment
There was a problem hiding this comment.
I am moving this back to draft status and will dive deeper later when time permits.
Directionally, I think this is doable and makes sense, and most of the code here in the PR is correct, I believe. But I want to add some tests and make sure we're getting the failure behaviors we expect.
Will move future work to this PR with me as author:
Feel free to drop comments/suggestions here on this PR and I'll make sure to consider them in future work.
| check_config_against_spec: bool = True | ||
| """Configure whether `check_config_against_spec_or_exit()` needs to be called.""" | ||
|
|
||
| check_config_during_discover: bool = False |
There was a problem hiding this comment.
the value and the default mentionned in the comment don't align.
| ) -> Iterable[AirbyteMessage]: | ||
| self.set_up_secret_filter(config, source_spec.connectionSpecification) | ||
| if self.source.check_config_against_spec: | ||
| if not self.source.check_config_during_discover: |
There was a problem hiding this comment.
why is this changing?
|
Closing due to inactivity for more than 7 days. |
|
Devin is archived and cannot be woken up. Please unarchive Devin if you want to continue using it. |
Preface from AJ
This came up in the S.H.I.T. list and it's something Ben Church (@bnchrch) and I have discussed a bit. It came up again today in Slack, raised as an inquiry from Michel. I decided to have Devin to a spike, and it looks like this could be close to shippable.
What changes:
--discoverwithout a config, unless they useDynamicSchemaLoader, in which case, they will require config.discoververb. Other verbs are unchanged.Prior art:
Devin-Created Summary: Allow Airbyte sources to run discovery unprivileged with DynamicSchemaLoader
This PR allows Airbyte sources to run discovery unprivileged if the source API doesn't need auth in order to provide the catalog info.
Implementation
Added automatic detection of
DynamicSchemaLoaderto skip config validation during discovery. When a source uses the dynamic schema feature, we assume that the schema endpoint might not require authentication and automatically skip config validation during discovery.Modified the AirbyteEntrypoint to make the
--configparameter optional for the discover command. This allows running discovery without providing a config file when the source doesn't require config validation.This enables sources with dynamic schemas to provide catalog information without authentication when the schema endpoint doesn't require it.
Testing
Added unit tests to verify that:
DynamicSchemaLoaderskip config validation during discoveryDynamicSchemaLoaderstill require config validationLink to Devin run: https://app.devin.ai/sessions/e6aa19df336347919b6cabcff7143a1c
Requested by: Aaron ("AJ") Steers (aj@airbyte.io)