feat(cli): add source-smoke-test CLI #996

Merged
Aaron ("AJ") Steers (aaronsteers) merged 9 commits into main from devin/1773820262-smoke-test-source-extraction
Mar 18, 2026

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Mar 18, 2026

Summary

Extracts the smoke-test source from PR #969 into its own dedicated submodule at airbyte/cli/smoke_test_source/, so it can be reviewed and merged independently from the broader universal connector work.

The new source-smoke-test CLI entrypoint is registered in pyproject.toml.

Related: #995 (ops repo test harness delivery), #969 (parent universal connector PR)

Structure:

  • airbyte/cli/smoke_test_source/_scenarios.py: 15 predefined test scenarios (basic types, nulls, naming edge cases, wide tables, unicode, large batches, etc.)
  • airbyte/cli/smoke_test_source/source.py: SourceSmokeTest class (extends CDK Source)
  • airbyte/cli/smoke_test_source/run.py: thin CLI entry point

Structural change (cli.py → cli/ package):

Adding the smoke_test_source/ submodule required converting airbyte/cli.py (a module) into airbyte/cli/ (a package), which would shadow the original module. To resolve this:

  • airbyte/cli.py → airbyte/cli/pyab.py (preserves git history via git mv)
  • airbyte/cli/__init__.py re-exports cli for backwards compatibility
  • Entry points updated to airbyte.cli.pyab:cli
  • docs/generate.py updated to reference the new path
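
For reviewers who want to see why the re-export keeps old import paths working, here is a minimal, self-contained sketch. It builds a throwaway package called demo_pkg (a hypothetical stand-in, not the real airbyte package) with the same module-to-package layout described above, then imports cli through both the old and the new path. The file contents are assumptions made for illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Callable


def build_demo_package(root: Path) -> None:
    """Create demo_pkg/cli/ containing pyab.py and a re-exporting __init__.py."""
    cli_dir = root / "demo_pkg" / "cli"
    cli_dir.mkdir(parents=True)
    (root / "demo_pkg" / "__init__.py").write_text("")
    # Stand-in for the moved module (airbyte/cli/pyab.py in this PR).
    (cli_dir / "pyab.py").write_text("def cli():\n    return 'pyab cli invoked'\n")
    # Stand-in for airbyte/cli/__init__.py: re-export so `from demo_pkg.cli import cli`
    # keeps working after the module became a package.
    (cli_dir / "__init__.py").write_text(
        "from demo_pkg.cli.pyab import cli\n\n__all__ = ['cli']\n"
    )


def load_both_paths() -> tuple[Callable[[], str], Callable[[], str]]:
    """Import `cli` via the package path and via the new module path."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            old_style = importlib.import_module("demo_pkg.cli").cli
            new_style = importlib.import_module("demo_pkg.cli.pyab").cli
            return old_style, new_style
        finally:
            sys.path.remove(tmp)


old_style_cli, new_style_cli = load_both_paths()
print(old_style_cli is new_style_cli)
```

Both import paths resolve to the same function object, which is the property the backwards-compatibility re-export relies on.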

Other fixes included:

  • Type validation for scenario_filter and custom_scenarios config fields (guards against None/wrong types)
  • _get_connector_name() fix: uses maxsplit=1 and rsplit() for correct Docker image name parsing
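
The type guards in the first bullet might look roughly like the sketch below. The function name validate_scenario_config, the error wording, and the assumed shapes (scenario_filter as a list of strings, custom_scenarios as a list of objects) are illustrative assumptions, not code copied from this PR.

```python
from typing import Any


def validate_scenario_config(config: dict[str, Any]) -> list[str]:
    """Return human-readable errors for the two optional config fields.

    Hypothetical helper: None (or a missing key) is treated as "not set",
    and any other non-conforming value is reported as an error.
    """
    errors: list[str] = []

    scenario_filter = config.get("scenario_filter")
    if scenario_filter is not None:
        is_string_list = isinstance(scenario_filter, list) and all(
            isinstance(name, str) for name in scenario_filter
        )
        if not is_string_list:
            errors.append("scenario_filter must be a list of strings")

    custom_scenarios = config.get("custom_scenarios")
    if custom_scenarios is not None:
        is_object_list = isinstance(custom_scenarios, list) and all(
            isinstance(item, dict) for item in custom_scenarios
        )
        if not is_object_list:
            errors.append("custom_scenarios must be a list of objects")

    return errors
```

Returning a list of errors instead of raising on the first one lets a check() call report every config problem in a single message.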

Review & Testing Checklist for Human

  • Verify pyairbyte --help and pyab --help still work — the CLI module was converted from airbyte/cli.py to a package at airbyte/cli/. Entry points now resolve via airbyte.cli.pyab:cli. Confirm both commands work after pip install -e .
  • Verify source-smoke-test spec runs — install in dev mode and run source-smoke-test spec to confirm the entrypoint resolves and outputs a valid connector spec JSON
  • Spot-check docs/generate.py — the pdoc path changed from airbyte/cli.py to airbyte/cli/pyab.py. Run poe docs-generate to confirm docs still build correctly
  • Review scenario data for correctness — spot-check a few predefined scenarios (e.g., null_handling, column_naming_edge_cases, large_batch_stream) to confirm the JSON schemas match the inline records
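
The last checklist item could be partly automated. Below is a rough sketch of such a spot-check, assuming each scenario is a dict with a "json_schema" key (holding a "properties" mapping) and a "records" key. Those key names and the helper find_schema_mismatches are assumptions for illustration; the real scenario layout in _scenarios.py may differ. The check is deliberately shallow (top-level keys and JSON types only).

```python
from typing import Any, Callable

# Map JSON Schema type names to Python-side checks.
# bool is excluded from integer/number because bool subclasses int in Python.
_JSON_TYPE_CHECKS: dict[str, Callable[[Any], bool]] = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
}


def find_schema_mismatches(scenario: dict[str, Any]) -> list[str]:
    """List record fields that are undeclared or have an unexpected JSON type."""
    properties = scenario["json_schema"].get("properties", {})
    problems: list[str] = []
    for index, record in enumerate(scenario["records"]):
        for key, value in record.items():
            if key not in properties:
                problems.append(f"record {index}: '{key}' is not declared in the schema")
                continue
            declared = properties[key].get("type", [])
            allowed = [declared] if isinstance(declared, str) else list(declared)
            checks = [_JSON_TYPE_CHECKS[t] for t in allowed if t in _JSON_TYPE_CHECKS]
            if checks and not any(check(value) for check in checks):
                problems.append(f"record {index}: '{key}' does not match type {allowed}")
    return problems


# Hypothetical scenario with two deliberate mistakes in the second record.
example = {
    "json_schema": {
        "properties": {
            "id": {"type": "integer"},
            "name": {"type": ["null", "string"]},
        }
    },
    "records": [
        {"id": 1, "name": None},
        {"id": "2", "name": "x", "extra": True},
    ],
}
for problem in find_schema_mismatches(example):
    print(problem)
```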

Notes

  • ~840 lines added with no tests. The scenarios and source logic (check, discover, read, _get_all_scenarios) are untested. Determine if tests should be added before merge or tracked as follow-up.
  • The scenario data uses dict[str, Any] throughout (carried over from PR #969). A future improvement could type these with Pydantic models or dataclasses.
  • Business logic lives under cli/ per explicit request, though the project convention prefers CLI modules as thin wrappers. This is an intentional placement decision.
  • No Dockerfile is included — containerization will be handled by the companion monorepo scaffold PR.
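
As a starting point for the typing follow-up mentioned in the notes, a dataclass-based scenario could look something like this. The class name Scenario and its three fields are assumptions chosen for illustration; the real scenarios may carry more fields than shown here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Scenario:
    """Hypothetical typed replacement for a dict[str, Any] scenario."""

    name: str
    json_schema: dict[str, Any]
    records: tuple[dict[str, Any], ...] = ()

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "Scenario":
        """Validate an untyped scenario dict and convert it."""
        name = raw.get("name")
        if not isinstance(name, str) or not name:
            raise ValueError("scenario 'name' must be a non-empty string")
        schema = raw.get("json_schema")
        if not isinstance(schema, dict):
            raise ValueError(f"scenario '{name}': 'json_schema' must be an object")
        records = raw.get("records", [])
        if not isinstance(records, list):
            raise ValueError(f"scenario '{name}': 'records' must be a list")
        return cls(name=name, json_schema=schema, records=tuple(records))


scenario = Scenario.from_dict(
    {"name": "basic_types", "json_schema": {"type": "object"}, "records": [{"id": 1}]}
)
print(scenario.name, len(scenario.records))
```

Validating once at construction time would let check(), discover(), and read() rely on the fields being present instead of re-checking dict keys.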

Summary by CodeRabbit

  • New Features

    • Added a Smoke Test Source that emits synthetic datasets across diverse predefined scenarios (nested data, nulls, edge-case names, wide tables, high-volume streams).
    • Added a CLI command to run predefined or custom scenarios with configurable large-batch generation and selective fast/slow stream options.
  • Bug Fixes

    • Fixed Docker image name parsing so tags and registries are handled correctly.
  • Documentation

    • Added module-level docs and public exports to surface the smoke test source in tooling and docs.

Link to Devin session: https://app.devin.ai/sessions/9c72389579884c06bf18cef11c4550e8
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Note

Auto-merge may have been disabled. Please check the PR status to confirm.

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773820262-smoke-test-source-extraction' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773820262-smoke-test-source-extraction'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a Smoke Test source to the PyAirbyte CLI: package headers, predefined and runtime-configurable scenarios with helpers, a SourceSmokeTest implementation (spec/check/discover/read), a CLI entrypoint, and a pyproject script entry.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| CLI package header | airbyte/cli/__init__.py | Added module docstring/copyright, imports cli from airbyte.cli.pyab, and exports cli. |
| Smoke test package init | airbyte/cli/smoke_test_source/__init__.py | New package initializer exporting SourceSmokeTest and documenting the package as experimental. |
| Predefined scenarios & helpers | airbyte/cli/smoke_test_source/_scenarios.py | New module adding PREDEFINED_SCENARIOS, HIGH_VOLUME_SCENARIO_NAMES, generate_large_batch_records() and get_scenario_records() with large-batch generation and many scenario fixtures. |
| Source implementation | airbyte/cli/smoke_test_source/source.py | New SourceSmokeTest and helper _build_streams_from_scenarios() implementing spec(), _get_all_scenarios(), check(), discover(), and read(); supports custom scenarios, validation, and emits synthetic Airbyte messages. |
| CLI entrypoint | airbyte/cli/smoke_test_source/run.py | New launcher exposing run() that calls airbyte_cdk.entrypoint.launch(SourceSmokeTest(), ...). |
| Project scripts / CLI mapping | pyproject.toml | Updated pyairbyte/pyab script targets to airbyte.cli.pyab:cli and added source-smoke-test = "airbyte.cli.smoke_test_source.run:run". |
| CLI behavior tweak | airbyte/cli/pyab.py | Adjusted _get_connector_name parsing to handle registries with ports using split(..., maxsplit=1) and rsplit(..., maxsplit=1). |
| Docs generator | docs/generate.py | Switched documented module from airbyte/cli.py to airbyte/cli/pyab.py. |
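
To illustrate the "CLI behavior tweak" row, here is a small sketch of why a registry with a port breaks naive colon splitting and how rsplit/split with maxsplit=1 avoids it. The function connector_name_from_image is a hypothetical stand-in and is not the actual _get_connector_name implementation; the digest handling is an extra assumption beyond what the PR describes.

```python
def connector_name_from_image(image: str) -> str:
    """Extract the bare connector name from a Docker image reference.

    Hypothetical helper for illustration only.
    """
    # Keep only the final path segment, dropping any registry (which may
    # itself contain a ':port') and any namespace.
    last_segment = image.rsplit("/", maxsplit=1)[-1]
    # Drop an optional '@sha256:...' digest, then an optional ':tag'.
    without_digest = last_segment.split("@", maxsplit=1)[0]
    return without_digest.split(":", maxsplit=1)[0]


# A naive split on the first colon picks up the registry host instead of the name.
image_with_port = "localhost:5000/airbyte/source-faker:latest"
print(image_with_port.split(":")[0])
print(connector_name_from_image(image_with_port))
```

Stripping the path first means the only colon left to consider is the tag separator, so the registry port can no longer be mistaken for it.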

Sequence Diagram

sequenceDiagram
    participant User as User/CLI
    participant Launcher as Launcher (entrypoint)
    participant Source as SourceSmokeTest
    participant Scenarios as Scenarios Module
    participant Catalog as AirbyteCatalog
    participant Records as Record Stream

    User->>Launcher: invoke `source-smoke-test`
    Launcher->>Source: instantiate & invoke lifecycle (spec/check/discover/read)

    User->>Source: spec()
    Source-->>User: ConnectorSpecification

    User->>Source: check(config)
    Source->>Scenarios: _get_all_scenarios(config)
    Scenarios-->>Source: combined scenarios
    Source-->>User: AirbyteConnectionStatus

    User->>Source: discover(config)
    Source->>Scenarios: _build_streams_from_scenarios(scenarios)
    Scenarios-->>Source: AirbyteStream objects
    Source-->>User: AirbyteCatalog

    User->>Source: read(config, catalog, state)
    Source->>Scenarios: get_scenario_records(scenario)
    Scenarios-->>Source: test records (or generated batches)
    Source->>Records: emit AirbyteMessage records
    Records-->>User: streaming records
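
The flow in the diagram can be approximated with plain functions and dicts, which may help when reasoning about it without the CDK installed. Everything in this sketch is an assumption made for illustration: the two toy scenarios, the function names, and the use of plain dicts shaped like Airbyte RECORD messages instead of the CDK model classes that the real SourceSmokeTest uses.

```python
import time
from typing import Any, Iterator

# Toy stand-ins for predefined scenarios (not the real PREDEFINED_SCENARIOS).
TOY_SCENARIOS: list[dict[str, Any]] = [
    {
        "name": "basic_types",
        "json_schema": {"type": "object", "properties": {"id": {"type": "integer"}}},
        "records": [{"id": 1}, {"id": 2}],
    },
    {
        "name": "null_handling",
        "json_schema": {"type": "object", "properties": {"id": {"type": ["null", "integer"]}}},
        "records": [{"id": None}],
    },
]


def get_all_scenarios(config: dict[str, Any]) -> list[dict[str, Any]]:
    """Combine filtered toy scenarios with any custom ones from the config."""
    wanted = config.get("scenario_filter")
    predefined = [s for s in TOY_SCENARIOS if not wanted or s["name"] in wanted]
    return predefined + list(config.get("custom_scenarios") or [])


def check(config: dict[str, Any]) -> dict[str, str]:
    """Succeed only if the config selects at least one scenario."""
    if not get_all_scenarios(config):
        return {"status": "FAILED", "message": "no scenarios selected"}
    return {"status": "SUCCEEDED"}


def discover(config: dict[str, Any]) -> dict[str, Any]:
    """Expose one stream per scenario."""
    streams = [
        {"name": s["name"], "json_schema": s["json_schema"]}
        for s in get_all_scenarios(config)
    ]
    return {"streams": streams}


def read(config: dict[str, Any], selected_streams: set[str]) -> Iterator[dict[str, Any]]:
    """Yield one RECORD-shaped message per record of each selected stream."""
    for scenario in get_all_scenarios(config):
        if scenario["name"] not in selected_streams:
            continue
        for record in scenario["records"]:
            yield {
                "type": "RECORD",
                "record": {
                    "stream": scenario["name"],
                    "data": record,
                    "emitted_at": int(time.time() * 1000),
                },
            }


for message in read({}, {"basic_types"}):
    print(message["record"]["stream"], message["record"]["data"])
```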

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Want me to flag specific functions for deeper review (e.g., large-batch generation or read() emission), wdyt?

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and concisely describes the main change: adding a new CLI entrypoint for the smoke-test source. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.91%, which is sufficient. The required threshold is 80.00%. |


coderabbitai[bot]

This comment was marked as resolved.

Co-Authored-By: AJ Steers <aj@airbyte.io>
@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review March 18, 2026 08:04

@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title from "feat(cli): extract smoke-test source as dedicated cli submodule" to "feat(cli): add source-smoke-test as cli submodule" on Mar 18, 2026
@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title from "feat(cli): add source-smoke-test as cli submodule" to "feat(cli): add source-smoke-test CLI" on Mar 18, 2026

github-actions bot commented Mar 18, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  ±0   343 ✅ ±0   5m 44s ⏱️ +2s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 3052f25. ± Comparison against base commit 71597ca.

♻️ This comment has been updated with latest results.

…config type validation

Co-Authored-By: AJ Steers <aj@airbyte.io>

github-actions bot commented Mar 18, 2026

PyTest Results (Full)

413 tests  ±0   395 ✅ ±0   25m 16s ⏱️ +44s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 3052f25. ± Comparison against base commit 71597ca.

♻️ This comment has been updated with latest results.

devin-ai-integration bot and others added 2 commits March 18, 2026 18:07

@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit 803fc93 into main Mar 18, 2026
22 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1773820262-smoke-test-source-extraction branch March 18, 2026 18:49