Commit d2cecd6

docs: add OAuth implementation sprint plan
3-task breakdown covering foundation + HTTP client changes, M2M provider, and U2M provider with full test coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent f478b62 commit d2cecd6

2 files changed: +172 -99 lines changed
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
+# Sprint Plan: OAuth U2M + M2M Implementation
+
+**Sprint dates:** 2026-03-07 to 2026-03-20
+**Sprint goal:** Implement complete OAuth 2.0 authentication (U2M + M2M) for the Rust ADBC driver as described in the [OAuth design doc](oauth-u2m-m2m-design.md).
+
+---
+
+## Story
+
+**Title:** Implement OAuth U2M and M2M authentication
+
+**Description:**
+Add OAuth 2.0 support to the Rust ADBC driver, covering:
+- M2M (Client Credentials) flow for service principal authentication
+- U2M (Authorization Code + PKCE) flow for interactive browser-based login
+- Shared infrastructure: OIDC discovery, token lifecycle management, file-based token caching
+- Integration with `Database` config and `DatabricksHttpClient` (two-phase init via `OnceLock`)
+- ODBC-aligned numeric config values (`AuthMech`/`Auth_Flow` scheme)
+
+**Acceptance Criteria:**
+- [ ] M2M flow works end-to-end: `Database` config -> `new_connection()` -> `get_auth_header()` returns valid Bearer token from client credentials exchange
+- [ ] U2M flow works end-to-end: browser launch -> callback server captures code -> PKCE token exchange -> cached token reused on subsequent connections
+- [ ] Token refresh state machine works: FRESH (no-op) -> STALE (background refresh) -> EXPIRED (blocking refresh)
+- [ ] File-based token cache persists U2M tokens across connections with correct permissions (0o600)
+- [ ] `DatabricksHttpClient` supports two-phase init: created without auth, auth set later via `OnceLock`
+- [ ] All config options from the design doc are parseable via `set_option()`
+- [ ] Invalid config combinations fail with clear error messages at `new_connection()`
+- [ ] `cargo fmt`, `cargo clippy -- -D warnings`, `cargo test` all pass
+
+---
+
+## Sub-Tasks
+
+### Task 1: Foundation + HTTP Client Changes
+
+**Scope:** Shared OAuth infrastructure and modifications to existing code to support two-phase auth initialization.
+
+**Files to create:**
+- `src/auth/oauth/mod.rs`: module root, re-exports
+- `src/auth/oauth/token.rs`: `OAuthToken` struct with `is_expired()`, `is_stale()`, JSON serialization
+- `src/auth/oauth/oidc.rs`: `OidcEndpoints` struct, `discover()` function hitting `{host}/oidc/.well-known/oauth-authorization-server`
+- `src/auth/oauth/cache.rs`: `TokenCache` with `load()`/`save()`, SHA-256 hashed filenames, 0o600 permissions
+- `src/auth/oauth/token_store.rs`: `TokenStore` with `RwLock<Option<OAuthToken>>`, `AtomicBool` for refresh coordination, FRESH/STALE/EXPIRED state machine
+
+**Files to modify:**
+- `Cargo.toml`: add `oauth2 = "5"`, `sha2 = "0.10"`, `open = "5"`, `dirs = "5"`
+- `src/client/http.rs`: change `auth_provider` from `Arc<dyn AuthProvider>` to `OnceLock<Arc<dyn AuthProvider>>`, add `set_auth_provider()`, update `execute()` to read from `OnceLock`
+- `src/database.rs`: add `AuthMechanism`/`AuthFlow` enums (with `TryFrom<i64>`), new config fields (`auth_mechanism`, `auth_flow`, `auth_client_id`, `auth_client_secret`, `auth_scopes`, `auth_token_endpoint`, `auth_redirect_port`), `set_option()`/`get_option_string()` for all new keys
+- `src/auth/mod.rs`: add `pub mod oauth;`, remove old `pub use oauth::OAuthCredentials`, delete `src/auth/oauth.rs` (replaced by directory module)
+
+**Tests:**
+- `token.rs`: `test_token_fresh_not_expired`, `test_token_stale_threshold`, `test_token_expired_within_buffer`, `test_token_serialization_roundtrip`
+- `oidc.rs`: `test_discover_workspace_endpoints`, `test_discover_invalid_response`, `test_discover_http_error`
+- `cache.rs`: `test_cache_key_deterministic`, `test_cache_save_load_roundtrip`, `test_cache_missing_file`, `test_cache_file_permissions`, `test_cache_corrupted_file`
+- `token_store.rs`: `test_store_fresh_token_no_refresh`, `test_store_expired_triggers_blocking_refresh`, `test_store_concurrent_refresh_single_fetch`, `test_store_stale_returns_current_token`
+- `http.rs`: `test_execute_without_auth_works_before_auth_set`, `test_execute_fails_before_auth_set`, `test_execute_succeeds_after_auth_set`, `test_set_auth_provider_twice_panics_or_errors`
+- `database.rs`: `test_set_auth_mechanism_valid`, `test_set_auth_mechanism_invalid`, `test_set_auth_flow_valid`, `test_set_auth_flow_invalid`, `test_new_connection_missing_mechanism`, `test_new_connection_oauth_missing_flow`, `test_new_connection_pat_missing_token`
+
+**Definition of done:** All foundation modules compile, unit tests pass, `DatabricksHttpClient` two-phase init works, database config parsing works for all auth options.
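
The two-phase init that Task 1 asks of `DatabricksHttpClient` can be sketched with `std::sync::OnceLock` alone. This is a minimal illustration, not the driver's code: the `AuthProvider` trait shape, the `StaticToken` provider, the `HttpClient` struct, and the `String` error type are all stand-ins assumed for the example. It also picks the "errors" branch of `test_set_auth_provider_twice_panics_or_errors`, which the plan leaves open.

```rust
use std::sync::{Arc, OnceLock};

/// Stand-in for the driver's `AuthProvider` trait (shape assumed).
pub trait AuthProvider: Send + Sync {
    fn get_auth_header(&self) -> Result<String, String>;
}

/// Hypothetical provider that returns a fixed bearer token.
pub struct StaticToken(pub String);

impl AuthProvider for StaticToken {
    fn get_auth_header(&self) -> Result<String, String> {
        Ok(format!("Bearer {}", self.0))
    }
}

/// Simplified client: built without auth, auth attached exactly once later.
#[derive(Default)]
pub struct HttpClient {
    auth_provider: OnceLock<Arc<dyn AuthProvider>>,
}

impl HttpClient {
    /// Phase 1: construct the client with no provider installed.
    pub fn new() -> Self {
        Self::default()
    }

    /// Phase 2: attach the provider. A second call is rejected, not overwritten.
    pub fn set_auth_provider(&self, provider: Arc<dyn AuthProvider>) -> Result<(), String> {
        self.auth_provider
            .set(provider)
            .map_err(|_| "auth provider already set".to_string())
    }

    /// Authenticated path: fails until the provider has been set.
    pub fn auth_header(&self) -> Result<String, String> {
        self.auth_provider
            .get()
            .ok_or_else(|| "auth provider not set".to_string())?
            .get_auth_header()
    }
}
```

Because `OnceLock::set` takes `&self`, the client can already be shared behind an `Arc` (for example, handed to an OAuth provider for unauthenticated token requests) before the provider is installed.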
+
+---
+
+### Task 2: M2M Provider (Client Credentials)
+
+**Scope:** Implement the M2M OAuth flow using `oauth2::BasicClient` for client credentials exchange.
+
+**Files to create:**
+- `src/auth/oauth/m2m.rs`: `ClientCredentialsProvider` implementing `AuthProvider`
+
+**Files to modify:**
+- `src/auth/oauth/mod.rs`: add `pub mod m2m;`, re-export `ClientCredentialsProvider`
+- `src/auth/mod.rs`: re-export `ClientCredentialsProvider`
+- `src/database.rs`: wire `AuthFlow::ClientCredentials` match arm in `new_connection()` to create `ClientCredentialsProvider`
+
+**Implementation details:**
+- Construct `oauth2::BasicClient` from OIDC-discovered endpoints
+- Implement `oauth2` HTTP adapter that routes through `DatabricksHttpClient::execute_without_auth()`
+- Use `TokenStore` for token lifecycle (no disk cache for M2M)
+- `get_auth_header()` -> `TokenStore::get_or_refresh()` -> `client.exchange_client_credentials()` if needed
+
+**Tests:**
+- `m2m.rs`: `test_m2m_token_exchange`, `test_m2m_auto_refresh`, `test_m2m_oidc_discovery`
+- Wiremock integration (`tests/`): `test_m2m_full_flow_discovery_and_token_exchange`, `test_m2m_token_refresh_on_expiry`, `test_m2m_discovery_failure_propagates`
+- Database validation: `test_new_connection_client_credentials_missing_secret`
+
+**Definition of done:** M2M flow works end-to-end from `Database` config through `new_connection()` to `get_auth_header()`. Wiremock tests verify the full OIDC discovery -> token exchange -> refresh cycle.
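
The refresh cycle in Task 2 rests on the FRESH/STALE/EXPIRED lifecycle and on the `AtomicBool` that keeps concurrent callers from refreshing at once. A rough sketch of both pieces follows. The two threshold constants, the `classify` function, and the `RefreshGate` type are hypothetical names and values chosen for illustration; the plan does not state the real thresholds.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
pub enum TokenState {
    Fresh,
    Stale,
    Expired,
}

/// Assumed thresholds for illustration; the design doc's values may differ.
const EXPIRY_BUFFER: Duration = Duration::from_secs(40);
const STALE_WINDOW: Duration = Duration::from_secs(300);

/// Classify a token by the time remaining until `expires_at`.
pub fn classify(now: Instant, expires_at: Instant) -> TokenState {
    // Saturates to zero when `expires_at` is already in the past.
    let remaining = expires_at.saturating_duration_since(now);
    if remaining <= EXPIRY_BUFFER {
        TokenState::Expired
    } else if remaining <= STALE_WINDOW {
        TokenState::Stale
    } else {
        TokenState::Fresh
    }
}

/// Guard that lets exactly one caller run a refresh at a time.
#[derive(Default)]
pub struct RefreshGate {
    refreshing: AtomicBool,
}

impl RefreshGate {
    /// Returns true only for the single caller that should perform the refresh.
    pub fn try_begin(&self) -> bool {
        self.refreshing
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Marks the refresh as finished so a later refresh can start.
    pub fn finish(&self) {
        self.refreshing.store(false, Ordering::Release);
    }
}
```

In this shape, a STALE token would be returned immediately while the caller that wins `try_begin()` starts the background refresh; callers that lose the race keep using the current token.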
+
+---
+
+### Task 3: U2M Provider (Authorization Code + PKCE)
+
+**Scope:** Implement the U2M OAuth flow with browser-based login, PKCE, callback server, and token caching.
+
+**Files to create:**
+- `src/auth/oauth/callback.rs`: `CallbackServer` with localhost HTTP listener, state validation, timeout
+- `src/auth/oauth/u2m.rs`: `AuthorizationCodeProvider` implementing `AuthProvider`
+
+**Files to modify:**
+- `src/auth/oauth/mod.rs`: add `pub mod callback;`, `pub mod u2m;`, re-export `AuthorizationCodeProvider`
+- `src/auth/mod.rs`: re-export `AuthorizationCodeProvider`
+- `src/database.rs`: wire `AuthFlow::Browser` match arm in `new_connection()` to create `AuthorizationCodeProvider`
+
+**Implementation details:**
+- On creation: try loading cached token via `TokenCache::load()`
+- If cached token has valid `refresh_token`: store in `TokenStore`, refresh on first `get_auth_header()` if stale/expired
+- If no cache: generate PKCE via `PkceCodeChallenge::new_random_sha256()`, build auth URL via `client.authorize_url()`, start `CallbackServer`, launch browser via `open::that()`, wait for callback, exchange code via `client.exchange_code().set_pkce_verifier()`
+- Save token to cache after every successful acquisition/refresh
+- Refresh uses `client.exchange_refresh_token()` through `DatabricksHttpClient::execute_without_auth()`
+- If refresh token is expired, fall back to full browser flow
+
+**Tests:**
+- `callback.rs`: `test_callback_captures_code`, `test_callback_validates_state`, `test_callback_timeout`
+- `u2m.rs`: `test_u2m_refresh_token_flow`, `test_u2m_cache_hit`, `test_u2m_cache_miss_with_expired_refresh`
+- Wiremock integration: `test_u2m_refresh_token_full_flow`
+- Database validation: `test_new_connection_token_passthrough_missing_token`
+- E2E (ignored): `test_m2m_end_to_end`, `test_u2m_end_to_end`
+
+**Definition of done:** U2M flow works end-to-end. Token cache persists across connections. Refresh path works via wiremock tests. Browser flow verified manually via `#[ignore]` E2E test.
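
For the cache persistence in Task 3, the 0o600 requirement can be met with the standard library on Unix. The sketch below is an assumption-laden illustration: `save_private` and `load_private` are hypothetical helpers, not the planned `TokenCache` API, and the SHA-256 filename hashing and JSON encoding are left out.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Write `contents` to `path` so that only the owner can read or write it (Unix only).
pub fn save_private(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600) // applied only when the file is newly created
        .open(path)?;
    // Tighten permissions on a pre-existing file as well.
    file.set_permissions(fs::Permissions::from_mode(0o600))?;
    file.write_all(contents)
}

/// Read a cached file back; a missing file is reported as `Ok(None)`.
pub fn load_private(path: &Path) -> io::Result<Option<Vec<u8>>> {
    match fs::read(path) {
        Ok(bytes) => Ok(Some(bytes)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

Treating a missing file as `Ok(None)` lines up with `test_cache_missing_file`: a cache miss falls through to the browser flow rather than surfacing as an error.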

rust/docs/designs/oauth-u2m-m2m-design.md

Lines changed: 54 additions & 99 deletions
@@ -180,7 +180,7 @@ pub(crate) struct TokenStore {
 ```

 **Contract:**
-- `get_or_refresh(refresh_fn)`: Returns a valid token. If STALE, spawns background refresh via `std::thread::spawn` and returns current token. If EXPIRED, blocks caller until refresh completes.
+- `get_or_refresh(refresh_fn)`: Returns a valid token. If STALE, spawns background refresh via `tokio::spawn` and returns current token. If EXPIRED, blocks caller until refresh completes.
 - Thread-safe: `RwLock` for read-heavy access, `AtomicBool` to prevent concurrent refresh.
 - Only one refresh runs at a time; concurrent callers receive the current (stale) token.

@@ -291,29 +291,29 @@ sequenceDiagram
     participant DB as Database
     participant M2M as ClientCredsProvider
     participant OIDC as OIDC Discovery
-    participant TE as Token Endpoint
+    participant TokenEP as Token Endpoint

     App->>DB: new_connection()
     DB->>M2M: new(host, client_id, client_secret, scopes)
     M2M->>OIDC: GET {host}/oidc/.well-known/oauth-authorization-server
     OIDC-->>M2M: OidcEndpoints

-    Note over App,TE: First get_auth_header() call
+    Note over App,TokenEP: First get_auth_header() call
     App->>M2M: get_auth_header()
-    M2M->>TE: client.exchange_client_credentials()
-    TE-->>M2M: access_token (no refresh_token)
+    M2M->>TokenEP: client.exchange_client_credentials()
+    TokenEP-->>M2M: access_token (no refresh_token)
     M2M-->>App: "Bearer {access_token}"

-    Note over App,TE: Subsequent calls (token FRESH)
+    Note over App,TokenEP: Subsequent calls (token FRESH)
     App->>M2M: get_auth_header()
     M2M-->>App: "Bearer {cached_token}"

-    Note over App,TE: Token becomes STALE
+    Note over App,TokenEP: Token becomes STALE
     App->>M2M: get_auth_header()
     M2M->>M2M: Spawn background refresh
     M2M-->>App: "Bearer {current_token}"
-    M2M->>TE: client.exchange_client_credentials() (background)
-    TE-->>M2M: new access_token
+    M2M->>TokenEP: client.exchange_client_credentials() (background)
+    TokenEP-->>M2M: new access_token
 ```

 ### Token Exchange (M2M)
@@ -362,65 +362,47 @@ impl TokenCache {

 ## Configuration Options

-Following the Databricks ODBC driver's two-level authentication scheme, configuration uses `AuthMech` (mechanism) and `Auth_Flow` (OAuth flow type) as the primary selectors. Both are **required** -- no auto-detection.
+Authentication is configured via a single `databricks.auth.type` string option that selects the authentication method. This replaces the ODBC-style two-level `AuthMech`/`Auth_Flow` numeric scheme with self-describing string values.

-### Rust Enums
+### Rust Enum

 ```rust
-/// Authentication mechanism -- top-level selector.
-/// Config values match the ODBC driver's AuthMech numeric codes.
+/// Authentication type -- single selector for the authentication method.
 #[derive(Debug, Clone, PartialEq)]
-#[repr(u8)]
-pub enum AuthMechanism {
-    /// Personal access token (no OAuth). Config value: 0
-    Pat = 0,
-    /// OAuth 2.0 -- requires AuthFlow to select the specific flow. Config value: 11
-    OAuth = 11,
-}
-
-/// OAuth authentication flow -- selects the specific OAuth grant type.
-/// Config values match the ODBC driver's Auth_Flow numeric codes.
-/// Only applicable when AuthMechanism is OAuth.
-#[derive(Debug, Clone, PartialEq)]
-#[repr(u8)]
-pub enum AuthFlow {
-    /// Use a pre-obtained OAuth access token directly. Config value: 0
-    TokenPassthrough = 0,
-    /// M2M: client credentials grant for service principals. Config value: 1
-    ClientCredentials = 1,
-    /// U2M: browser-based authorization code + PKCE. Config value: 2
-    Browser = 2,
+pub enum AuthType {
+    /// Personal access token.
+    AccessToken,
+    /// M2M: client credentials grant for service principals.
+    OAuthM2m,
+    /// U2M: browser-based authorization code + PKCE.
+    OAuthU2m,
 }
 ```

 ### Authentication Selection

 | Option | Type | Values | Required | Description |
 |--------|------|--------|----------|-------------|
-| `databricks.auth.mechanism` | Int | `0` (PAT), `11` (OAuth) | **Yes** | Authentication mechanism (matches ODBC `AuthMech`) |
-| `databricks.auth.flow` | Int | `0` (token passthrough), `1` (client credentials), `2` (browser) | **Yes** (when mechanism=`11`) | OAuth flow type (matches ODBC `Auth_Flow`) |
+| `databricks.auth.type` | String | `access_token`, `oauth_m2m`, `oauth_u2m` | **Yes** | Authentication method |

-**Values aligned with ODBC driver:**
-
-| `mechanism` | `flow` | ODBC `AuthMech` | ODBC `Auth_Flow` | Description |
-|-------------|--------|-----------------|-------------------|-------------|
-| `0` | -- | -- | -- | Personal access token |
-| `11` | `0` | 11 | 0 | Pre-obtained OAuth access token |
-| `11` | `1` | 11 | 1 | M2M: service principal |
-| `11` | `2` | 11 | 2 | U2M: browser-based auth code + PKCE |
+| Value | Description |
+|-------|-------------|
+| `access_token` | Personal access token |
+| `oauth_m2m` | M2M: client credentials for service principals |
+| `oauth_u2m` | U2M: browser-based authorization code + PKCE |

 ### Credential and OAuth Options

 | Option | Type | Default | Required For | Description |
 |--------|------|---------|-------------|-------------|
-| `databricks.access_token` | String | -- | mechanism=`0`, flow=`0` | Access token (PAT or OAuth) |
-| `databricks.auth.client_id` | String | `"databricks-cli"` (flow=`2`) | flow=`1` (required), flow=`2` (optional) | OAuth client ID |
-| `databricks.auth.client_secret` | String | -- | flow=`1` | OAuth client secret |
-| `databricks.auth.scopes` | String | `"all-apis offline_access"` (flow=`2`), `"all-apis"` (flow=`1`) | No | Space-separated OAuth scopes |
+| `databricks.access_token` | String | -- | `access_token` | Personal access token |
+| `databricks.auth.client_id` | String | `"databricks-cli"` (`oauth_u2m`) | `oauth_m2m` (required), `oauth_u2m` (optional) | OAuth client ID |
+| `databricks.auth.client_secret` | String | -- | `oauth_m2m` | OAuth client secret |
+| `databricks.auth.scopes` | String | `"all-apis offline_access"` (`oauth_u2m`), `"all-apis"` (`oauth_m2m`) | No | Space-separated OAuth scopes |
 | `databricks.auth.token_endpoint` | String | Auto-discovered via OIDC | No | Override OIDC-discovered token endpoint |
 | `databricks.auth.redirect_port` | String | `"8020"` | No | Localhost port for browser callback server |

-Both `mechanism` and `flow` are mandatory -- no auto-detection. This makes configuration explicit and predictable, matching the ODBC driver's approach where `AuthMech` and `Auth_Flow` are always specified.
+`databricks.auth.type` is mandatory -- no auto-detection. This makes configuration explicit and predictable.

 ---
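
The string-to-enum mapping for `databricks.auth.type` in this hunk could look roughly like the following. It is a sketch only: the `String` error type stands in for the driver's own error, and treating the values as case-sensitive is an assumption the hunk does not confirm.

```rust
use std::convert::TryFrom;

/// Authentication type, mirroring the enum added in the hunk above.
#[derive(Debug, Clone, PartialEq)]
pub enum AuthType {
    AccessToken,
    OAuthM2m,
    OAuthU2m,
}

impl TryFrom<&str> for AuthType {
    // Assumption: the real driver would return its own error type here.
    type Error = String;

    fn try_from(value: &str) -> Result<Self, Self::Error> {
        match value {
            "access_token" => Ok(AuthType::AccessToken),
            "oauth_m2m" => Ok(AuthType::OAuthM2m),
            "oauth_u2m" => Ok(AuthType::OAuthU2m),
            // Anything else, including the old numeric codes, is rejected.
            other => Err(format!(
                "invalid databricks.auth.type '{other}': expected access_token, oauth_m2m, or oauth_u2m"
            )),
        }
    }
}
```

Rejecting unknown strings at parse time is what lets `set_option()` fail early, while a missing value is only detectable later at `new_connection()`.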

@@ -447,9 +429,9 @@ The `AuthProvider::get_auth_header()` trait method is synchronous, but OAuth tok
 // 1. Create HTTP client (no auth yet)
 let http_client = Arc::new(DatabricksHttpClient::new(self.http_config.clone())?);

-// 2. Create auth provider based on mechanism + flow enums
+// 2. Create auth provider based on auth type
 // (see database.rs section below for full match logic)
-let auth_provider: Arc<dyn AuthProvider> = /* match on AuthMechanism/AuthFlow */;
+let auth_provider: Arc<dyn AuthProvider> = /* match on AuthType */;

 // 3. Set auth on HTTP client
 http_client.set_auth_provider(auth_provider);
@@ -488,11 +470,10 @@ impl DatabricksHttpClient {

 | Scenario | Error Kind | Behavior |
 |----------|-----------|----------|
-| Missing `databricks.auth.mechanism` | `invalid_argument()` | Fail at `new_connection()` |
-| Missing `databricks.auth.flow` when mechanism=`11` | `invalid_argument()` | Fail at `new_connection()` |
-| Invalid numeric value for mechanism or flow | `invalid_argument()` | Fail at `set_option()` |
-| Missing `client_id` or `client_secret` for flow=`1` | `invalid_argument()` | Fail at `new_connection()` |
-| Missing `access_token` for mechanism=`0` or flow=`0` | `invalid_argument()` | Fail at `new_connection()` |
+| Missing `databricks.auth.type` | `invalid_argument()` | Fail at `new_connection()` |
+| Invalid value for `databricks.auth.type` | `invalid_argument()` | Fail at `set_option()` |
+| Missing `client_id` or `client_secret` for `oauth_m2m` | `invalid_argument()` | Fail at `new_connection()` |
+| Missing `access_token` for `access_token` type | `invalid_argument()` | Fail at `new_connection()` |
 | OIDC discovery HTTP failure | `io()` | Fail at provider creation |
 | Token endpoint returns error | `io()` | Fail at `get_auth_header()` |
 | Browser callback timeout (120s) | `io()` | Fail at provider creation |
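
The `new_connection()` rows of the table above amount to a required-field check per auth type. One way that check might be shaped is sketched below. The `AuthConfig` fields and the `String` error are assumptions made for the example, since the diff shows only the `validate(&self.access_token)` call site and not the struct itself.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AuthType {
    AccessToken,
    OAuthM2m,
    OAuthU2m,
}

/// Hypothetical grouping of auth options; field names are assumptions.
#[derive(Debug, Default)]
pub struct AuthConfig {
    pub auth_type: Option<AuthType>,
    pub client_id: Option<String>,
    pub client_secret: Option<String>,
}

impl AuthConfig {
    /// Returns the selected auth type once its required options are present.
    pub fn validate(&self, access_token: &Option<String>) -> Result<AuthType, String> {
        // No auto-detection: the type itself must have been set explicitly.
        let auth_type = self.auth_type.ok_or("databricks.auth.type is required")?;
        match auth_type {
            AuthType::AccessToken if access_token.is_none() => {
                Err("databricks.access_token is required for access_token".to_string())
            }
            AuthType::OAuthM2m if self.client_id.is_none() || self.client_secret.is_none() => {
                Err("databricks.auth.client_id and databricks.auth.client_secret are required for oauth_m2m".to_string())
            }
            // oauth_u2m has no mandatory credential: client_id falls back to a default.
            _ => Ok(auth_type),
        }
    }
}
```

Keeping all of these checks in one function means every `invalid_argument()` case that fires at `new_connection()` is produced before any provider is constructed or any network call is made.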
@@ -518,65 +499,39 @@ Two-phase auth initialization via `OnceLock` (see [Concurrency Model](#concurren

 **New fields on `Database`:**
 ```rust
-auth_mechanism: Option<AuthMechanism>,
-auth_flow: Option<AuthFlow>,
-auth_client_id: Option<String>,
-auth_client_secret: Option<String>,
-auth_scopes: Option<String>,
-auth_token_endpoint: Option<String>,
-auth_redirect_port: Option<u16>,
+auth_config: AuthConfig, // groups all auth-related options
 ```

-`set_option` parses numeric config values into the enums:
+`set_option` parses the auth type string:
 ```rust
-"databricks.auth.mechanism" => {
-    let v = Self::parse_int_option(&value)
-        .ok_or_else(|| /* error: expected integer */)?;
-    self.auth_mechanism = Some(AuthMechanism::try_from(v)?); // 0 -> Pat, 11 -> OAuth
-}
-"databricks.auth.flow" => {
-    let v = Self::parse_int_option(&value)
-        .ok_or_else(|| /* error: expected integer */)?;
-    self.auth_flow = Some(AuthFlow::try_from(v)?); // 0 -> TokenPassthrough, 1 -> ClientCredentials, 2 -> Browser
+"databricks.auth.type" => {
+    let v = value.as_ref();
+    self.auth_config.auth_type = Some(AuthType::try_from(v)?);
+    // "access_token" -> AccessToken, "oauth_m2m" -> OAuthM2m, "oauth_u2m" -> OAuthU2m
 }
 ```

-**Modified `new_connection()`:** Two-phase initialization with enum matching:
+**Modified `new_connection()`:** Two-phase initialization with auth type matching:

 ```rust
 // Phase 1: Create HTTP client (no auth yet)
 let http_client = Arc::new(DatabricksHttpClient::new(self.http_config.clone())?);

-// Phase 2: Create auth provider based on mechanism + flow
-let mechanism = self.auth_mechanism.as_ref()
-    .ok_or_else(|| /* error: databricks.auth.mechanism is required */)?;
+// Phase 2: Create auth provider based on auth type
+let auth_type = self.auth_config.validate(&self.access_token)?;

-let auth_provider: Arc<dyn AuthProvider> = match mechanism {
-    AuthMechanism::Pat => {
+let auth_provider: Arc<dyn AuthProvider> = match auth_type {
+    AuthType::AccessToken => {
         let token = self.access_token.as_ref()
-            .ok_or_else(|| /* error: access_token required for mechanism=0 */)?;
+            .ok_or_else(|| /* error: access_token required */)?;
         Arc::new(PersonalAccessToken::new(token))
     }
-    AuthMechanism::OAuth => {
-        let flow = self.auth_flow.as_ref()
-            .ok_or_else(|| /* error: databricks.auth.flow required when mechanism=11 */)?;
-        match flow {
-            AuthFlow::TokenPassthrough => {
-                // No auto-refresh -- token is used as-is until it expires.
-                // Matches ODBC behavior where expired tokens require the caller
-                // to provide a new token via SQLSetConnectAttr.
-                let token = self.access_token.as_ref()
-                    .ok_or_else(|| /* error: access_token required for flow=0 */)?;
-                Arc::new(PersonalAccessToken::new(token))
-            }
-            AuthFlow::ClientCredentials => Arc::new(
-                ClientCredentialsProvider::new(host, client_id, client_secret, http_client.clone())?
-            ),
-            AuthFlow::Browser => Arc::new(
-                AuthorizationCodeProvider::new(host, client_id, http_client.clone())?
-            ),
-        }
-    }
+    AuthType::OAuthM2m => Arc::new(
+        ClientCredentialsProvider::new(host, client_id, client_secret, http_client.clone())?
+    ),
+    AuthType::OAuthU2m => Arc::new(
+        AuthorizationCodeProvider::new(host, client_id, http_client.clone())?
+    ),
 };

 // Phase 3: Wire auth into HTTP client
