
feat(cli): add datahub init --sso for browser-based SSO login#16715

Merged
shirshanka merged 9 commits into master from worktree-init-sso-browser
Mar 25, 2026

@shirshanka

Summary

  • Adds --sso flag to datahub init that opens a Chromium browser via Playwright, lets the user complete SSO (OIDC/SAML), then automatically captures the session and generates a personal access token — no manual copy-paste needed.
  • Playwright is an optional dependency behind the sso extra (pip install 'acryl-datahub[sso]'). If not installed, the command prints clear step-by-step instructions (supports pip, uv, etc.).
  • No server-side changes required.
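
The optional-dependency behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the code from `sso_cli.py`: the function names and the exact message wording are assumptions, and the real CLI may raise a click exception rather than `RuntimeError`.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def playwright_install_hint() -> str:
    """Build step-by-step install instructions (wording is illustrative)."""
    return (
        "Browser-based SSO login needs Playwright, which is not installed.\n"
        "  1. pip install 'acryl-datahub[sso]'\n"
        "     (or: uv pip install 'acryl-datahub[sso]')\n"
        "  2. playwright install chromium\n"
    )


def require_playwright() -> None:
    """Hypothetical guard to call before starting the browser flow."""
    if not is_module_available("playwright"):
        raise RuntimeError(playwright_install_hint())
```

Checking with `find_spec` avoids importing Playwright just to discover that it is missing, so the rest of the CLI stays usable without the `sso` extra.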

Usage

# One-time setup
pip install 'acryl-datahub[sso]'
playwright install chromium

# SSO login
datahub init --sso --host https://your-instance.example.com/gms
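
The `--host` value above points at the GMS endpoint, while the browser has to open the web frontend. The sketch below shows one way the frontend URL might be derived, assuming the only rule is to strip a trailing `/gms` path segment. The function name `derive_frontend_url` is hypothetical, and the actual derivation in the PR may handle more cases (for example local deployments that use different ports).

```python
from urllib.parse import urlsplit, urlunsplit


def derive_frontend_url(gms_url: str) -> str:
    """Strip a trailing '/gms' segment to get the assumed frontend base URL."""
    parts = urlsplit(gms_url)
    path = parts.path.rstrip("/")
    if path.endswith("/gms"):
        path = path[: -len("/gms")]
    # Query string and fragment are dropped; they are not part of a base URL.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(derive_frontend_url("https://your-instance.example.com/gms"))
```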

Files changed

| File | What |
| --- | --- |
| sso_cli.py (new) | Playwright browser flow: open → wait for actor cookie → extract cookies → generate token via GraphQL |
| entrypoints.py | --sso click option, SSO branch in init(), mutual exclusivity validation |
| setup.py | sso extra: playwright>=1.40.0 |
| INIT_AGENT_CONTEXT.md | Document --sso flag and prerequisites |
| test_init_cli.py | 5 tests: incompatible flags, happy path, duration, URL derivation |
| test_sso_cli.py (new) | 4 tests: cookie extraction, timeout, missing cookie, GraphQL error |
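
The "open → wait for actor cookie → extract cookies" part of the flow might look something like the sketch below. It is an assumption-laden illustration rather than the contents of `sso_cli.py`: the helper names, the 300-second default timeout, and the polling interval are all made up here. Only the Playwright calls (`sync_playwright`, `chromium.launch`, `new_context`, `new_page`, `goto`, `cookies`, `wait_for_timeout`) are real sync-API methods.

```python
import time
from typing import Any, Callable, Dict, List, Optional

Cookie = Dict[str, Any]


def find_cookie(cookies: List[Cookie], name: str) -> Optional[str]:
    """Return the value of the first cookie with the given name, if any."""
    for cookie in cookies:
        if cookie.get("name") == name:
            return cookie.get("value")
    return None


def wait_for_cookie(
    get_cookies: Callable[[], List[Cookie]],
    name: str = "actor",
    timeout_s: float = 300.0,
    poll_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_cookies() until the named cookie appears or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        value = find_cookie(get_cookies(), name)
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No '{name}' cookie appeared within {timeout_s:.0f}s")
        sleep(poll_s)


def run_browser_login(login_url: str) -> List[Cookie]:
    """Open Chromium, let the user finish SSO, and return the session cookies."""
    # Imported lazily so this module loads without the optional dependency.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        try:
            context = browser.new_context()
            page = context.new_page()
            page.goto(login_url)
            # Use Playwright's own timer so its event loop keeps running.
            wait_for_cookie(
                context.cookies,
                sleep=lambda s: page.wait_for_timeout(s * 1000),
            )
            return context.cookies()
        finally:
            browser.close()
```

Keeping the polling logic free of Playwright makes the timeout and missing-cookie cases easy to unit test with a fake `get_cookies` callable.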

Test plan

  • ./gradlew :metadata-ingestion:lintFix passes
  • pytest tests/unit/cli/test_init_cli.py tests/unit/cli/test_sso_cli.py — 39 tests pass
  • Manual: clean venv without playwright → clear error message with install instructions
  • Manual: datahub init --sso --host https://acryl.acryl.io/gms → browser opens, SSO completes, token saved, datahub graphql query succeeds
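
The "incompatible flags" tests and the mutual exclusivity validation mentioned in this description could be approximated by a check like the one below. The PR text does not say which options conflict with `--sso`, so the flag names used in the usage example (`--token`, `--username`, `--password`) are purely hypothetical, and the real code presumably reports the error through click rather than `ValueError`.

```python
from typing import Dict, List, Optional


def conflicting_options(sso: bool, others: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of options that were set alongside --sso."""
    if not sso:
        return []
    return sorted(flag for flag, value in others.items() if value)


def validate_sso_options(sso: bool, others: Dict[str, Optional[str]]) -> None:
    """Raise if --sso is combined with any other credential option."""
    conflicts = conflicting_options(sso, others)
    if conflicts:
        raise ValueError("--sso cannot be combined with: " + ", ".join(conflicts))


# Hypothetical option values as a CLI framework might pass them in.
validate_sso_options(True, {"--token": None, "--username": None, "--password": None})
```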

🤖 Generated with Claude Code

@github-actions

Linear: ING-2022

@github-actions github-actions bot added the ingestion label (PR or Issue related to the ingestion of metadata) on Mar 22, 2026
@codecov

codecov bot commented Mar 22, 2026

Codecov Report

❌ Patch coverage is 93.33333% with 6 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| metadata-ingestion/src/datahub/cli/sso_cli.py | 91.54% | 6 Missing ⚠️ |


@datahub-connector-tests

datahub-connector-tests bot commented Mar 22, 2026

Connector Tests Results

All connector tests passed for commit 5d87b3a


To skip connector tests, add the skip-connector-tests label (org members only).

Autogenerated by the connector-tests CI pipeline.

shirshanka and others added 5 commits March 23, 2026 15:09
Users authenticating via SSO (OIDC/SAML) can now configure the CLI
without manual token copy-paste. The `--sso` flag opens a Chromium
browser via Playwright, lets the user complete SSO, then automatically
extracts the session and generates a personal access token.

Flow: browser opens → user completes SSO → CLI captures actor cookie →
generates token via GraphQL → writes ~/.datahubenv.
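
The last two steps of this flow (token generation and writing the config) might be sketched as below. The mutation and input field names are written from memory of DataHub's GraphQL schema and should be checked against the server's schema before use. The helper names, the default duration, and the token name are assumptions, and the HTTP call and the actual file write are left out.

```python
from typing import Any, Dict

# Assumed shape of the mutation; verify against the DataHub GraphQL schema.
CREATE_TOKEN_MUTATION = """
mutation createAccessToken($input: CreateAccessTokenInput!) {
  createAccessToken(input: $input) {
    accessToken
  }
}
"""


def build_token_request(
    actor_urn: str, name: str, duration: str = "ONE_MONTH"
) -> Dict[str, Any]:
    """Build the JSON body for a personal access token request."""
    return {
        "query": CREATE_TOKEN_MUTATION,
        "variables": {
            "input": {
                "type": "PERSONAL",
                "actorUrn": actor_urn,
                "duration": duration,
                "name": name,
            }
        },
    }


def extract_token(response: Dict[str, Any]) -> str:
    """Pull the token out of a GraphQL response, failing loudly on errors."""
    if response.get("errors"):
        raise RuntimeError(f"createAccessToken failed: {response['errors']}")
    return response["data"]["createAccessToken"]["accessToken"]


def datahubenv_config(gms_url: str, token: str) -> Dict[str, Any]:
    """Return the structure that would be serialized to ~/.datahubenv as YAML."""
    return {"gms": {"server": gms_url, "token": token}}
```

The request would be posted to the frontend's GraphQL endpoint with the captured session cookies attached, which is why no server-side change is needed.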

No server-side changes required. Playwright is an optional dependency
behind the `sso` extra (`pip install 'acryl-datahub[sso]'`). Clear
install instructions are shown if it's missing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CI fixes:
- Add upper bound to playwright dependency (<2.0.0) to satisfy
  check-python-deps pinning rules
- Run mdPrettierWrite to fix markdown formatting in INIT_AGENT_CONTEXT.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds `--support` flag (used with `--sso`) that navigates to
`/support/authenticate` instead of `/authenticate`, enabling the
support team to log into customer instances for debugging.

Usage: datahub init --sso --support --host https://customer.acryl.io/gms
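
The path switch described in this commit is small enough to show in a few lines. The function name `login_url` and its signature are hypothetical; only the two paths come from the commit message.

```python
def login_url(frontend_url: str, support: bool = False) -> str:
    """Pick the authentication path to open in the browser."""
    path = "/support/authenticate" if support else "/authenticate"
    return frontend_url.rstrip("/") + path


print(login_url("https://customer.acryl.io", support=True))
```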

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

After SSO login, query existing access tokens for the user and print a
count of CLI tokens with a link to manage them in the UI. This gives
users visibility into token accumulation without auto-revoking tokens
that may be in use elsewhere. Also removes the broken _revoke_old_cli_tokens
call that would crash at runtime and the unused List import.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
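
The token-count notice from this commit could be approximated as follows. How the PR recognizes a "CLI token" is not stated, so the name-prefix rule below is an assumption, as are the function names and the `/settings/tokens` UI path used in the message.

```python
from typing import Dict, List


def count_cli_tokens(tokens: List[Dict[str, str]], prefix: str = "cli") -> int:
    """Count tokens whose name starts with the (assumed) CLI prefix."""
    return sum(1 for t in tokens if t.get("name", "").lower().startswith(prefix))


def token_notice(count: int, frontend_url: str) -> str:
    """Build a notice pointing the user at the token management page."""
    if count == 0:
        return ""
    noun = "token" if count == 1 else "tokens"
    # The UI path is an assumption; adjust to the instance's settings page.
    manage_url = frontend_url.rstrip("/") + "/settings/tokens"
    return f"You have {count} CLI {noun}. Manage them at {manage_url}"
```

Printing a count instead of revoking anything matches the commit's stated goal of not breaking tokens that may still be in use elsewhere.
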
shirshanka and others added 3 commits March 24, 2026 10:17
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
@alwaysmeticulous

alwaysmeticulous bot commented Mar 24, 2026

🔴 Meticulous spotted visual differences in 18 of 1853 screens tested: view and approve differences detected.

Meticulous evaluated ~8 hours of user flows against your PR.

Last updated for commit 5d87b3a. This comment will update as new commits are pushed.

- Move createAccessToken error handling from _warn_about_existing_cli_tokens
  to browser_sso_login where it belongs
- Fix tab/space mixing in browser try/finally block
- Remove redundant browser.close() in inner except (finally handles it)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov

codecov bot commented Mar 25, 2026

Bundle Report

Changes will decrease total bundle size by 284.35kB (-1.24%) ⬇️. This is within the configured threshold ✅

Detailed changes

| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 22.68MB | -284.35kB (-1.24%) ⬇️ |

Affected Assets, Files, and Routes:

Assets changed for bundle datahub-react-web-esm:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | -365.44kB | 12.45MB | -2.85% |
| assets/flinklogo-*.svg (New) | 81.09kB | 81.09kB | 100.0% 🚀 |

@shirshanka shirshanka merged commit 611c3e8 into master Mar 25, 2026
86 of 89 checks passed
@shirshanka shirshanka deleted the worktree-init-sso-browser branch March 25, 2026 05:09

Labels

ingestion: PR or Issue related to the ingestion of metadata
needs-review: Label for PRs that need review from a maintainer.


3 participants