
docs(rust): OAuth U2M and M2M authentication design #319

Merged
vikrantpuppala merged 6 commits into adbc-drivers:main from vikrantpuppala:stack/oauth-u2m-m2m-design
Mar 13, 2026

Conversation

@vikrantpuppala
Collaborator

@vikrantpuppala vikrantpuppala commented Mar 7, 2026

🥞 Stacked PR



Summary

  • Design document for OAuth U2M (authorization code + PKCE) and M2M (client credentials) authentication flows
  • Sprint plan for the 2-week implementation sprint

Key design decisions

  • Single databricks.auth.type string config (access_token, oauth_m2m, oauth_u2m) instead of ODBC-style numeric AuthMech/Auth_Flow
  • TokenStore state machine: FRESH → STALE (background refresh via tokio::task::spawn_blocking) → EXPIRED (blocking refresh)
  • TokenCache for U2M disk persistence at ~/.config/databricks-adbc/oauth/ with SHA-256 hashed filenames
  • OIDC discovery for endpoint URLs; oauth2 crate for protocol-level operations
  • Two-phase DatabricksHttpClient initialization via OnceLock to break circular auth dependency
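As a rough illustration of the TokenStore state machine above, the FRESH/STALE/EXPIRED classification might be computed like this; the 80% stale threshold and the `classify` function name are assumptions for the sketch, not the driver's actual values:

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum TokenState {
    Fresh,   // serve the cached token directly
    Stale,   // serve the cached token, kick off a background refresh
    Expired, // block the caller until refresh completes
}

// Classify a token by elapsed lifetime. The stale threshold is a fraction of
// the initial TTL computed once at acquisition; 80% is a hypothetical value.
fn classify(acquired_at: Instant, initial_ttl: Duration, now: Instant) -> TokenState {
    let elapsed = now.duration_since(acquired_at);
    if elapsed >= initial_ttl {
        TokenState::Expired
    } else if elapsed >= initial_ttl.mul_f64(0.8) {
        TokenState::Stale
    } else {
        TokenState::Fresh
    }
}
```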

Key files

  • rust/docs/designs/oauth-u2m-m2m-design.md
  • rust/docs/designs/oauth-sprint-plan.md

This pull request was AI-assisted by Isaac.

@vikrantpuppala vikrantpuppala changed the title docs: add OAuth U2M and M2M design document docs(rust): OAuth U2M and M2M authentication design + sprint plan Mar 7, 2026
@vikrantpuppala vikrantpuppala changed the title docs(rust): OAuth U2M and M2M authentication design + sprint plan [PECOBLR-2089] docs(rust): OAuth U2M and M2M authentication design + sprint plan Mar 9, 2026
/// Config values match the ODBC driver's AuthMech numeric codes.
#[derive(Debug, Clone, PartialEq)]
#[repr(u8)]
pub enum AuthMechanism {
Collaborator

this looks specific to JDBC; is this also followed in other drivers like ADBC and Python?

Collaborator Author

Good point — this is specific to the ODBC driver's config scheme. For ADBC, we adopted the same numeric codes (AuthMech=0/11, Auth_Flow=0/1/2) to stay aligned with ODBC since both drivers are maintained together and we want a consistent config surface for users switching between them. The Python SDK uses string-based config (auth_type="databricks-oauth-m2m") which is a different pattern — we intentionally chose the ODBC-aligned numeric approach for the Rust driver since it also backs the ODBC bridge layer.

Collaborator Author

Actually, rethinking this — you're right that numeric codes are an ODBC-ism that doesn't belong in the ADBC API. Updated to a single string-based databricks.auth.type option:

databricks.auth.type = "access_token"    # Personal access token
databricks.auth.type = "oauth_m2m"       # Client credentials (service principal)
databricks.auth.type = "oauth_u2m"       # Authorization code + PKCE (browser)

One key-value instead of two, self-documenting, no magic numbers. The ODBC bridge layer can map its own AuthMech/Auth_Flow DSN values to these strings internally.

Changes across the stack: design doc (this PR), AuthType enum + AuthConfig (PR #321), database.rs option parsing + new_connection() matching (PRs #321-#324), E2E tests (PR #323).
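For the ODBC bridge mapping mentioned above, a hypothetical translation from numeric DSN values to the string option could look like the following; the function name and the exact set of accepted combinations are assumptions, not code from the PR:

```rust
/// Hypothetical helper the ODBC bridge layer could use to translate its
/// (AuthMech, Auth_Flow) DSN values into the string-based option.
/// Returns None for unsupported combinations.
fn map_odbc_auth(auth_mech: u8, auth_flow: Option<u8>) -> Option<&'static str> {
    match (auth_mech, auth_flow) {
        (0, _) => Some("access_token"),        // personal access token
        (11, Some(0)) => Some("access_token"), // OAuth token passthrough
        (11, Some(1)) => Some("oauth_m2m"),    // client credentials
        (11, Some(2)) => Some("oauth_u2m"),    // browser-based auth code + PKCE
        _ => None,
    }
}
```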


| Component | Mechanism | Guarantee |
|-----------|-----------|-----------|
| `TokenStore.token` | `std::sync::RwLock` | Multiple readers, single writer |
Collaborator

parking_lot::RwLock would be better in high-concurrency scenarios

Collaborator Author

Considered this — parking_lot::RwLock is better under high contention, but TokenStore is accessed once per HTTP request (read lock fast path), with writes only during token refresh. That's not a high-contention scenario, so std::sync::RwLock is sufficient here and avoids adding an extra dependency. If we see contention in benchmarks later we can revisit.
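A minimal sketch of the read-heavy access pattern described above, assuming a simplified TokenStore that holds only the token string:

```rust
use std::sync::RwLock;

struct TokenStore {
    token: RwLock<String>,
}

impl TokenStore {
    fn new(initial: String) -> Self {
        TokenStore { token: RwLock::new(initial) }
    }

    // Fast path, taken once per HTTP request: a shared read lock,
    // so concurrent requests never block each other.
    fn current(&self) -> String {
        self.token.read().unwrap().clone()
    }

    // Rare path: an exclusive write lock held only while swapping in
    // a freshly refreshed token.
    fn replace(&self, new_token: String) {
        *self.token.write().unwrap() = new_token;
    }
}
```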

```

**Contract:**
- `get_or_refresh(refresh_fn)`: Returns a valid token. If STALE, spawns background refresh via `std::thread::spawn` and returns current token. If EXPIRED, blocks caller until refresh completes.
Collaborator

Could use tokio::spawn instead, which would be cheaper.

Collaborator Author

Good call — updated token_store.rs to use tokio::task::spawn_blocking instead of std::thread::spawn. This reuses tokio's blocking thread pool rather than spawning a new OS thread for each background refresh. The driver already has a tokio runtime available since both M2M and U2M providers use tokio::task::block_in_place + Handle::current().block_on(). Change is in PR #320.
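A dependency-free sketch of the single-refresh guarantee: an AtomicBool gates scheduling, so concurrent callers keep using the current token while at most one refresh runs. The real driver hands the refresh to tokio::task::spawn_blocking; a plain OS thread stands in below only to keep the example self-contained.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Attempt to start a background refresh. Returns false if one is already
// in flight, in which case the caller just keeps the current (stale) token.
fn try_start_refresh(
    refreshing: &Arc<AtomicBool>,
    do_refresh: impl FnOnce() + Send + 'static,
) -> bool {
    // compare_exchange wins for exactly one caller; everyone else bails out.
    if refreshing
        .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_err()
    {
        return false;
    }
    let flag = Arc::clone(refreshing);
    std::thread::spawn(move || {
        do_refresh();
        flag.store(false, Ordering::Release); // allow the next refresh
    });
    true
}
```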

```

**Contract:**
- Binds to `localhost:{port}` (default 8020)
Collaborator

What if the provided port is in use?

Collaborator Author

It is also configurable via `databricks.auth.redirect_port`. We will add an auto-increment fallback in a follow-up.
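The auto-increment fallback mentioned here could be sketched as follows; the function name, attempt count, and loopback-only binding are assumptions, not the eventual implementation:

```rust
use std::net::TcpListener;

// Try the configured redirect port first, then the next few ports,
// returning the first listener that binds successfully.
fn bind_redirect_listener(start_port: u16, attempts: u16) -> std::io::Result<TcpListener> {
    let mut last_err = None;
    for port in start_port..start_port.saturating_add(attempts) {
        match TcpListener::bind(("127.0.0.1", port)) {
            Ok(listener) => return Ok(listener),
            Err(e) => last_err = Some(e), // port busy; try the next one
        }
    }
    Err(last_err.unwrap_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::AddrInUse, "no free redirect port")
    }))
}
```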

vikrantpuppala and others added 5 commits March 9, 2026 12:30
This implements SQLSTATE propagation from Databricks server errors to ODBC
clients, replacing generic HY000 with specific error codes like 42601
(syntax error) and 42S02 (table not found).

## Changes

### databricks-adbc/rust/src/error.rs
- Added `extract_sqlstate_from_message()` to parse "SQLSTATE: XXXXX" from
  server error messages
- Added `map_error_code_to_sqlstate()` to map error codes like
  PARSE_SYNTAX_ERROR→42601, TABLE_OR_VIEW_NOT_FOUND→42S02
- Added `sqlstate_str_to_array()` helper to convert strings to c_char arrays
- Added comprehensive unit tests for all new functions

### databricks-adbc/rust/src/client/sea.rs
- Updated error handling in `wait_for_completion()` to set SQLSTATE on errors
- Tries 3 sources in order: sql_state field, extracted from message, mapped
  from error_code
- Preserves error message while adding SQLSTATE metadata

### databricks-adbc/rust/src/types/sea.rs
- Added optional `sql_state` field to `ServiceError` struct for future server
  support

### databricks-odbc/src/adbc_error_utils.cpp
- **No changes needed** - Already correctly reads `error->sqlstate` and
  propagates to DriverException

## Test Results

- All Rust unit tests pass (122 passed)
- Standalone test verifies SQLSTATE extraction works correctly
- Error message "SQLSTATE: 42601" correctly extracted and propagated

## Exit Criteria

✓ Syntax errors return SQLSTATE 42601
✓ Table not found returns SQLSTATE 42S02
✓ Generic HY000 only used for truly unknown errors
✓ Code compiles successfully in both repos

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Design for OAuth 2.0 authentication in the Rust ADBC driver covering
Authorization Code + PKCE (U2M) and Client Credentials (M2M) flows,
including token refresh state machine, file-based caching, and OIDC
discovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use oauth2 crate for PKCE, token exchange, client credentials,
  and refresh token flows. Eliminates hand-rolled pkce.rs module.
- Reuse DatabricksHttpClient for token endpoint calls via
  execute_without_auth(), giving unified retry/timeout/pooling.
- Two-phase initialization: HTTP client created first, auth provider
  set later via OnceLock (matching SeaClient's reader_factory pattern).
- OAuth providers route token requests through the shared HTTP client
  with a custom oauth2 HTTP function adapter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add AuthMechanism enum: 0=Pat, 11=OAuth (matches ODBC AuthMech)
- Add AuthFlow enum: 0=TokenPassthrough, 1=ClientCredentials, 2=Browser
  (matches ODBC Auth_Flow)
- Both mechanism and flow are mandatory, no auto-detection
- Accept numeric values only, parsed via TryFrom
- Use unified DatabricksHttpClient with two-phase init
- Adopt oauth2 crate for protocol-level operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes:
- databricks.oauth.token_endpoint -> databricks.auth.token_endpoint
- Config type Int/String -> Int (numeric only)
- Clarify oauth2 HTTP adapter needs thin conversion layer
- Architecture diagram shows M2M/U2M using execute_without_auth()
- Token passthrough (flow=0) documents no auto-refresh
- Stale threshold uses initial_TTL computed once at acquisition
- Deduplicate http.rs changes (reference Concurrency section)

Test strategy additions:
- Wiremock integration tests for full M2M flow with mocked HTTP
- Database config validation tests for enum parsing and new_connection
- HTTP client two-phase init tests for OnceLock lifecycle

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vikrantpuppala vikrantpuppala force-pushed the stack/oauth-u2m-m2m-design branch from cf7d895 to 4dd8f51 on March 12, 2026 07:10
@vikrantpuppala
Collaborator Author

Range-diff: main (cf7d895 -> 4dd8f51)
rust/docs/designs/oauth-u2m-m2m-design.md
@@ -184,7 +184,7 @@
 +```
 +
 +**Contract:**
-+- `get_or_refresh(refresh_fn)`: Returns a valid token. If STALE, spawns background refresh via `std::thread::spawn` and returns current token. If EXPIRED, blocks caller until refresh completes.
++- `get_or_refresh(refresh_fn)`: Returns a valid token. If STALE, spawns background refresh via `tokio::spawn` and returns current token. If EXPIRED, blocks caller until refresh completes.
 +- Thread-safe: `RwLock` for read-heavy access, `AtomicBool` to prevent concurrent refresh.
 +- Only one refresh runs at a time; concurrent callers receive the current (stale) token.
 +
@@ -295,29 +295,29 @@
 +    participant DB as Database
 +    participant M2M as ClientCredsProvider
 +    participant OIDC as OIDC Discovery
-+    participant TE as Token Endpoint
++    participant TokenEP as Token Endpoint
 +
 +    App->>DB: new_connection()
 +    DB->>M2M: new(host, client_id, client_secret, scopes)
 +    M2M->>OIDC: GET {host}/oidc/.well-known/oauth-authorization-server
 +    OIDC-->>M2M: OidcEndpoints
 +
-+    Note over App,TE: First get_auth_header() call
++    Note over App,TokenEP: First get_auth_header() call
 +    App->>M2M: get_auth_header()
-+    M2M->>TE: client.exchange_client_credentials()
-+    TE-->>M2M: access_token (no refresh_token)
++    M2M->>TokenEP: client.exchange_client_credentials()
++    TokenEP-->>M2M: access_token (no refresh_token)
 +    M2M-->>App: "Bearer {access_token}"
 +
-+    Note over App,TE: Subsequent calls (token FRESH)
++    Note over App,TokenEP: Subsequent calls (token FRESH)
 +    App->>M2M: get_auth_header()
 +    M2M-->>App: "Bearer {cached_token}"
 +
-+    Note over App,TE: Token becomes STALE
++    Note over App,TokenEP: Token becomes STALE
 +    App->>M2M: get_auth_header()
 +    M2M->>M2M: Spawn background refresh
 +    M2M-->>App: "Bearer {current_token}"
-+    M2M->>TE: client.exchange_client_credentials() (background)
-+    TE-->>M2M: new access_token
++    M2M->>TokenEP: client.exchange_client_credentials() (background)
++    TokenEP-->>M2M: new access_token
 +```
 +
 +### Token Exchange (M2M)
rust/src/client/sea.rs
@@ -0,0 +1,64 @@
+diff --git a/rust/src/client/sea.rs b/rust/src/client/sea.rs
+--- a/rust/src/client/sea.rs
++++ b/rust/src/client/sea.rs
+             match current_response.status.state {
+                 StatementState::Succeeded => return Ok(current_response),
+                 StatementState::Failed => {
+-                    let error_msg = current_response
+-                        .status
+-                        .error
+-                        .as_ref()
++                    let service_error = current_response.status.error.as_ref();
++                    let error_msg = service_error
+                         .and_then(|e| e.message.clone())
+                         .unwrap_or_else(|| "Unknown error".to_string());
+-                    return Err(DatabricksErrorHelper::io().message(error_msg));
++
++                    // Create error with message
++                    let mut error = DatabricksErrorHelper::io().message(&error_msg);
++
++                    // Try to set SQLSTATE from server error in order of preference:
++                    // 1. sql_state field if present
++                    // 2. Extract from error message (server includes "SQLSTATE: XXXXX")
++                    // 3. Map from error_code
++                    if let Some(err) = service_error {
++                        let mut sqlstate_set = false;
++
++                        // Check if server provides sql_state field
++                        if let Some(ref sql_state) = err.sql_state {
++                            if sql_state.len() == 5 {
++                                if let Some(sqlstate) =
++                                    crate::error::sqlstate_str_to_array(sql_state)
++                                {
++                                    error = error.sqlstate(sqlstate);
++                                    sqlstate_set = true;
++                                }
++                            }
++                        }
++
++                        // If not, try to extract from error message
++                        if !sqlstate_set {
++                            if let Some(sqlstate) =
++                                crate::error::extract_sqlstate_from_message(&error_msg)
++                            {
++                                error = error.sqlstate(sqlstate);
++                                sqlstate_set = true;
++                            }
++                        }
++
++                        // If still no SQLSTATE, map from error_code
++                        if !sqlstate_set {
++                            if let Some(ref code) = err.error_code {
++                                if let Some(sqlstate) =
++                                    crate::error::map_error_code_to_sqlstate(code)
++                                {
++                                    error = error.sqlstate(sqlstate);
++                                }
++                            }
++                        }
++                    }
++
++                    return Err(error);
+                 }
+                 StatementState::Canceled => {
+                     return Err(
\ No newline at end of file
rust/src/error.rs
@@ -0,0 +1,184 @@
+diff --git a/rust/src/error.rs b/rust/src/error.rs
+--- a/rust/src/error.rs
++++ b/rust/src/error.rs
+ /// A convenient alias for Results with Databricks errors.
+ pub type Result<T> = std::result::Result<T, Error>;
+ 
++/// Extract SQLSTATE from error message text.
++///
++/// Many Databricks error messages include "SQLSTATE: XXXXX" in the text.
++/// This function extracts the 5-character SQLSTATE code if present.
++pub fn extract_sqlstate_from_message(message: &str) -> Option<[std::os::raw::c_char; 5]> {
++    // Look for pattern "SQLSTATE: XXXXX" or "SQLSTATE:XXXXX"
++    let sqlstate_pattern = "SQLSTATE:";
++    if let Some(start_idx) = message.find(sqlstate_pattern) {
++        let after_pattern = &message[start_idx + sqlstate_pattern.len()..];
++        // Skip optional whitespace
++        let trimmed = after_pattern.trim_start();
++        // Extract exactly 5 characters
++        if trimmed.len() >= 5 {
++            let sqlstate_str = &trimmed[..5];
++            // Verify it looks like a SQLSTATE (alphanumeric)
++            if sqlstate_str.chars().all(|c| c.is_ascii_alphanumeric()) {
++                return sqlstate_str_to_array(sqlstate_str);
++            }
++        }
++    }
++    None
++}
++
++/// Convert a 5-character SQLSTATE string to a c_char array.
++pub fn sqlstate_str_to_array(sqlstate_str: &str) -> Option<[std::os::raw::c_char; 5]> {
++    let bytes = sqlstate_str.as_bytes();
++    if bytes.len() != 5 {
++        return None;
++    }
++    Some([
++        bytes[0] as std::os::raw::c_char,
++        bytes[1] as std::os::raw::c_char,
++        bytes[2] as std::os::raw::c_char,
++        bytes[3] as std::os::raw::c_char,
++        bytes[4] as std::os::raw::c_char,
++    ])
++}
++
++/// Map Databricks server error codes to ANSI SQL SQLSTATE codes.
++///
++/// This function converts Databricks-specific error codes (e.g., PARSE_SYNTAX_ERROR,
++/// TABLE_OR_VIEW_NOT_FOUND) into standardized 5-character SQLSTATE codes as defined
++/// by the SQL standard and ODBC specification.
++///
++/// Returns a 5-byte array suitable for the ADBC error sqlstate field, or `None`
++/// if the error code is not recognized.
++pub fn map_error_code_to_sqlstate(error_code: &str) -> Option<[std::os::raw::c_char; 5]> {
++    let sqlstate_str = match error_code {
++        // Syntax errors - SQLSTATE 42601
++        "PARSE_SYNTAX_ERROR" => "42601",
++
++        // Table/view not found - SQLSTATE 42S02
++        "TABLE_OR_VIEW_NOT_FOUND" => "42S02",
++
++        // Column not found - SQLSTATE 42S22
++        "COLUMN_NOT_FOUND" | "UNRESOLVED_COLUMN" => "42S22",
++
++        // Division by zero - SQLSTATE 22012
++        "DIVIDE_BY_ZERO" => "22012",
++
++        // Invalid argument/data - SQLSTATE 22023
++        "INVALID_PARAMETER_VALUE" | "INVALID_ARGUMENT" => "22023",
++
++        // Data type mismatch - SQLSTATE 42804
++        "DATATYPE_MISMATCH" => "42804",
++
++        // Duplicate key - SQLSTATE 23000
++        "DUPLICATE_KEY" => "23000",
++
++        // Access denied - SQLSTATE 42000
++        "PERMISSION_DENIED" | "ACCESS_DENIED" => "42000",
++
++        // Numeric value out of range - SQLSTATE 22003
++        "NUMERIC_VALUE_OUT_OF_RANGE" | "ARITHMETIC_OVERFLOW" => "22003",
++
++        // Unknown/unmapped error codes return None
++        _ => return None,
++    };
++
++    sqlstate_str_to_array(sqlstate_str)
++}
++
+ #[cfg(test)]
+ mod tests {
+     use super::*;
+         assert_eq!(adbc_error.status, adbc_core::error::Status::NotImplemented);
+         assert!(adbc_error.message.contains("Databricks"));
+     }
++
++    #[test]
++    fn test_map_error_code_to_sqlstate() {
++        use super::map_error_code_to_sqlstate;
++
++        // Test syntax error mapping
++        let sqlstate = map_error_code_to_sqlstate("PARSE_SYNTAX_ERROR").unwrap();
++        assert_eq!(
++            std::str::from_utf8(unsafe {
++                std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++            })
++            .unwrap(),
++            "42601"
++        );
++
++        // Test table not found mapping
++        let sqlstate = map_error_code_to_sqlstate("TABLE_OR_VIEW_NOT_FOUND").unwrap();
++        assert_eq!(
++            std::str::from_utf8(unsafe {
++                std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++            })
++            .unwrap(),
++            "42S02"
++        );
++
++        // Test column not found mapping
++        let sqlstate = map_error_code_to_sqlstate("COLUMN_NOT_FOUND").unwrap();
++        assert_eq!(
++            std::str::from_utf8(unsafe {
++                std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++            })
++            .unwrap(),
++            "42S22"
++        );
++
++        // Test unmapped error code returns None
++        assert!(map_error_code_to_sqlstate("UNKNOWN_ERROR_CODE").is_none());
++    }
++
++    #[test]
++    fn test_error_with_sqlstate() {
++        use super::map_error_code_to_sqlstate;
++
++        let sqlstate = map_error_code_to_sqlstate("PARSE_SYNTAX_ERROR").unwrap();
++        let error = DatabricksErrorHelper::invalid_argument()
++            .message("syntax error near 'SELECT'")
++            .sqlstate(sqlstate);
++
++        let adbc_error = error.to_adbc();
++        assert_eq!(
++            adbc_error.status,
++            adbc_core::error::Status::InvalidArguments
++        );
++        assert!(adbc_error.message.contains("syntax error"));
++
++        // Verify SQLSTATE is set correctly
++        let sqlstate_str = std::str::from_utf8(unsafe {
++            std::slice::from_raw_parts(adbc_error.sqlstate.as_ptr() as *const u8, 5)
++        })
++        .unwrap();
++        assert_eq!(sqlstate_str, "42601");
++    }
++
++    #[test]
++    fn test_extract_sqlstate_from_message() {
++        use super::extract_sqlstate_from_message;
++
++        // Test extraction with space after colon
++        let msg = "Error: Something went wrong. SQLSTATE: 42601 (line 1, pos 26)";
++        let sqlstate = extract_sqlstate_from_message(msg).unwrap();
++        let sqlstate_str = std::str::from_utf8(unsafe {
++            std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++        })
++        .unwrap();
++        assert_eq!(sqlstate_str, "42601");
++
++        // Test extraction without space after colon
++        let msg2 = "Error message. SQLSTATE:42S02 some more text";
++        let sqlstate2 = extract_sqlstate_from_message(msg2).unwrap();
++        let sqlstate_str2 = std::str::from_utf8(unsafe {
++            std::slice::from_raw_parts(sqlstate2.as_ptr() as *const u8, 5)
++        })
++        .unwrap();
++        assert_eq!(sqlstate_str2, "42S02");
++
++        // Test message without SQLSTATE
++        let msg3 = "Error without sqlstate";
++        assert!(extract_sqlstate_from_message(msg3).is_none());
++    }
+ }
\ No newline at end of file
rust/src/types/sea.rs
@@ -0,0 +1,12 @@
+diff --git a/rust/src/types/sea.rs b/rust/src/types/sea.rs
+--- a/rust/src/types/sea.rs
++++ b/rust/src/types/sea.rs
+     pub error_code: Option<String>,
+     #[serde(default)]
+     pub message: Option<String>,
++    /// SQL state code if provided by server (typically in the message, but may be a separate field)
++    #[serde(default)]
++    pub sql_state: Option<String>,
+ }
+ 
+ /// Manifest describing the result set structure.
\ No newline at end of file

Reproduce locally: `git range-diff 53f137a..cf7d895 4115e5f..4dd8f51`. Disable: `git config gitstack.push-range-diff false`.

@vikrantpuppala vikrantpuppala force-pushed the stack/oauth-u2m-m2m-design branch from 4dd8f51 to d2cecd6 on March 12, 2026 07:34
@vikrantpuppala
Collaborator Author

Range-diff: main (4dd8f51 -> d2cecd6)
rust/docs/designs/oauth-u2m-m2m-design.md
@@ -366,34 +366,20 @@
 +
 +## Configuration Options
 +
-+Following the Databricks ODBC driver's two-level authentication scheme, configuration uses `AuthMech` (mechanism) and `Auth_Flow` (OAuth flow type) as the primary selectors. Both are **required** -- no auto-detection.
++Authentication is configured via a single `databricks.auth.type` string option that selects the authentication method. This replaces the ODBC-style two-level `AuthMech`/`Auth_Flow` numeric scheme with self-describing string values.
 +
-+### Rust Enums
++### Rust Enum
 +
 +```rust
-+/// Authentication mechanism -- top-level selector.
-+/// Config values match the ODBC driver's AuthMech numeric codes.
++/// Authentication type -- single selector for the authentication method.
 +#[derive(Debug, Clone, PartialEq)]
-+#[repr(u8)]
-+pub enum AuthMechanism {
-+    /// Personal access token (no OAuth). Config value: 0
-+    Pat = 0,
-+    /// OAuth 2.0 -- requires AuthFlow to select the specific flow. Config value: 11
-+    OAuth = 11,
-+}
-+
-+/// OAuth authentication flow -- selects the specific OAuth grant type.
-+/// Config values match the ODBC driver's Auth_Flow numeric codes.
-+/// Only applicable when AuthMechanism is OAuth.
-+#[derive(Debug, Clone, PartialEq)]
-+#[repr(u8)]
-+pub enum AuthFlow {
-+    /// Use a pre-obtained OAuth access token directly. Config value: 0
-+    TokenPassthrough = 0,
-+    /// M2M: client credentials grant for service principals. Config value: 1
-+    ClientCredentials = 1,
-+    /// U2M: browser-based authorization code + PKCE. Config value: 2
-+    Browser = 2,
++pub enum AuthType {
++    /// Personal access token.
++    AccessToken,
++    /// M2M: client credentials grant for service principals.
++    OAuthM2m,
++    /// U2M: browser-based authorization code + PKCE.
++    OAuthU2m,
 +}
 +```
 +
@@ -401,30 +387,26 @@
 +
 +| Option | Type | Values | Required | Description |
 +|--------|------|--------|----------|-------------|
-+| `databricks.auth.mechanism` | Int | `0` (PAT), `11` (OAuth) | **Yes** | Authentication mechanism (matches ODBC `AuthMech`) |
-+| `databricks.auth.flow` | Int | `0` (token passthrough), `1` (client credentials), `2` (browser) | **Yes** (when mechanism=`11`) | OAuth flow type (matches ODBC `Auth_Flow`) |
-+
-+**Values aligned with ODBC driver:**
++| `databricks.auth.type` | String | `access_token`, `oauth_m2m`, `oauth_u2m` | **Yes** | Authentication method |
 +
-+| `mechanism` | `flow` | ODBC `AuthMech` | ODBC `Auth_Flow` | Description |
-+|-------------|--------|-----------------|-------------------|-------------|
-+| `0` | -- | -- | -- | Personal access token |
-+| `11` | `0` | 11 | 0 | Pre-obtained OAuth access token |
-+| `11` | `1` | 11 | 1 | M2M: service principal |
-+| `11` | `2` | 11 | 2 | U2M: browser-based auth code + PKCE |
++| Value | Description |
++|-------|-------------|
++| `access_token` | Personal access token |
++| `oauth_m2m` | M2M: client credentials for service principals |
++| `oauth_u2m` | U2M: browser-based authorization code + PKCE |
 +
 +### Credential and OAuth Options
 +
 +| Option | Type | Default | Required For | Description |
 +|--------|------|---------|-------------|-------------|
-+| `databricks.access_token` | String | -- | mechanism=`0`, flow=`0` | Access token (PAT or OAuth) |
-+| `databricks.auth.client_id` | String | `"databricks-cli"` (flow=`2`) | flow=`1` (required), flow=`2` (optional) | OAuth client ID |
-+| `databricks.auth.client_secret` | String | -- | flow=`1` | OAuth client secret |
-+| `databricks.auth.scopes` | String | `"all-apis offline_access"` (flow=`2`), `"all-apis"` (flow=`1`) | No | Space-separated OAuth scopes |
++| `databricks.access_token` | String | -- | `access_token` | Personal access token |
++| `databricks.auth.client_id` | String | `"databricks-cli"` (`oauth_u2m`) | `oauth_m2m` (required), `oauth_u2m` (optional) | OAuth client ID |
++| `databricks.auth.client_secret` | String | -- | `oauth_m2m` | OAuth client secret |
++| `databricks.auth.scopes` | String | `"all-apis offline_access"` (`oauth_u2m`), `"all-apis"` (`oauth_m2m`) | No | Space-separated OAuth scopes |
 +| `databricks.auth.token_endpoint` | String | Auto-discovered via OIDC | No | Override OIDC-discovered token endpoint |
 +| `databricks.auth.redirect_port` | String | `"8020"` | No | Localhost port for browser callback server |
 +
-+Both `mechanism` and `flow` are mandatory -- no auto-detection. This makes configuration explicit and predictable, matching the ODBC driver's approach where `AuthMech` and `Auth_Flow` are always specified.
++`databricks.auth.type` is mandatory -- no auto-detection. This makes configuration explicit and predictable.
 +
 +---
 +
@@ -451,9 +433,9 @@
 +// 1. Create HTTP client (no auth yet)
 +let http_client = Arc::new(DatabricksHttpClient::new(self.http_config.clone())?);
 +
-+// 2. Create auth provider based on mechanism + flow enums
++// 2. Create auth provider based on auth type
 +//    (see database.rs section below for full match logic)
-+let auth_provider: Arc<dyn AuthProvider> = /* match on AuthMechanism/AuthFlow */;
++let auth_provider: Arc<dyn AuthProvider> = /* match on AuthType */;
 +
 +// 3. Set auth on HTTP client
 +http_client.set_auth_provider(auth_provider);
@@ -492,11 +474,10 @@
 +
 +| Scenario | Error Kind | Behavior |
 +|----------|-----------|----------|
-+| Missing `databricks.auth.mechanism` | `invalid_argument()` | Fail at `new_connection()` |
-+| Missing `databricks.auth.flow` when mechanism=`11` | `invalid_argument()` | Fail at `new_connection()` |
-+| Invalid numeric value for mechanism or flow | `invalid_argument()` | Fail at `set_option()` |
-+| Missing `client_id` or `client_secret` for flow=`1` | `invalid_argument()` | Fail at `new_connection()` |
-+| Missing `access_token` for mechanism=`0` or flow=`0` | `invalid_argument()` | Fail at `new_connection()` |
++| Missing `databricks.auth.type` | `invalid_argument()` | Fail at `new_connection()` |
++| Invalid value for `databricks.auth.type` | `invalid_argument()` | Fail at `set_option()` |
++| Missing `client_id` or `client_secret` for `oauth_m2m` | `invalid_argument()` | Fail at `new_connection()` |
++| Missing `access_token` for `access_token` type | `invalid_argument()` | Fail at `new_connection()` |
 +| OIDC discovery HTTP failure | `io()` | Fail at provider creation |
 +| Token endpoint returns error | `io()` | Fail at `get_auth_header()` |
 +| Browser callback timeout (120s) | `io()` | Fail at provider creation |
@@ -522,65 +503,39 @@
 +
 +**New fields on `Database`:**
 +```rust
-+auth_mechanism: Option<AuthMechanism>,
-+auth_flow: Option<AuthFlow>,
-+auth_client_id: Option<String>,
-+auth_client_secret: Option<String>,
-+auth_scopes: Option<String>,
-+auth_token_endpoint: Option<String>,
-+auth_redirect_port: Option<u16>,
++auth_config: AuthConfig,  // groups all auth-related options
 +```
 +
-+`set_option` parses numeric config values into the enums:
++`set_option` parses the auth type string:
 +```rust
-+"databricks.auth.mechanism" => {
-+    let v = Self::parse_int_option(&value)
-+        .ok_or_else(|| /* error: expected integer */)?;
-+    self.auth_mechanism = Some(AuthMechanism::try_from(v)?);  // 0 -> Pat, 11 -> OAuth
-+}
-+"databricks.auth.flow" => {
-+    let v = Self::parse_int_option(&value)
-+        .ok_or_else(|| /* error: expected integer */)?;
-+    self.auth_flow = Some(AuthFlow::try_from(v)?);  // 0 -> TokenPassthrough, 1 -> ClientCredentials, 2 -> Browser
++"databricks.auth.type" => {
++    let v = value.as_ref();
++    self.auth_config.auth_type = Some(AuthType::try_from(v)?);
++    // "access_token" -> AccessToken, "oauth_m2m" -> OAuthM2m, "oauth_u2m" -> OAuthU2m
 +}
 +```
 +
-+**Modified `new_connection()`:** Two-phase initialization with enum matching:
++**Modified `new_connection()`:** Two-phase initialization with auth type matching:
 +
 +```rust
 +// Phase 1: Create HTTP client (no auth yet)
 +let http_client = Arc::new(DatabricksHttpClient::new(self.http_config.clone())?);
 +
-+// Phase 2: Create auth provider based on mechanism + flow
-+let mechanism = self.auth_mechanism.as_ref()
-+    .ok_or_else(|| /* error: databricks.auth.mechanism is required */)?;
++// Phase 2: Create auth provider based on auth type
++let auth_type = self.auth_config.validate(&self.access_token)?;
 +
-+let auth_provider: Arc<dyn AuthProvider> = match mechanism {
-+    AuthMechanism::Pat => {
++let auth_provider: Arc<dyn AuthProvider> = match auth_type {
++    AuthType::AccessToken => {
 +        let token = self.access_token.as_ref()
-+            .ok_or_else(|| /* error: access_token required for mechanism=0 */)?;
++            .ok_or_else(|| /* error: access_token required */)?;
 +        Arc::new(PersonalAccessToken::new(token))
 +    }
-+    AuthMechanism::OAuth => {
-+        let flow = self.auth_flow.as_ref()
-+            .ok_or_else(|| /* error: databricks.auth.flow required when mechanism=11 */)?;
-+        match flow {
-+            AuthFlow::TokenPassthrough => {
-+                // No auto-refresh -- token is used as-is until it expires.
-+                // Matches ODBC behavior where expired tokens require the caller
-+                // to provide a new token via SQLSetConnectAttr.
-+                let token = self.access_token.as_ref()
-+                    .ok_or_else(|| /* error: access_token required for flow=0 */)?;
-+                Arc::new(PersonalAccessToken::new(token))
-+            }
-+            AuthFlow::ClientCredentials => Arc::new(
-+                ClientCredentialsProvider::new(host, client_id, client_secret, http_client.clone())?
-+            ),
-+            AuthFlow::Browser => Arc::new(
-+                AuthorizationCodeProvider::new(host, client_id, http_client.clone())?
-+            ),
-+        }
-+    }
++    AuthType::OAuthM2m => Arc::new(
++        ClientCredentialsProvider::new(host, client_id, client_secret, http_client.clone())?
++    ),
++    AuthType::OAuthU2m => Arc::new(
++        AuthorizationCodeProvider::new(host, client_id, http_client.clone())?
++    ),
 +};
 +
 +// Phase 3: Wire auth into HTTP client
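Phase 3 is where the `OnceLock`-based late binding from the PR summary comes in: the HTTP client is built first without auth, handed to the OAuth provider (which needs it to fetch tokens), and only then has its auth slot filled. The sketch below shows that shape under stated assumptions; the trait and struct names are illustrative stand-ins, not the driver's actual API.

```rust
use std::sync::{Arc, OnceLock};

// Illustrative sketch of breaking the circular auth dependency with OnceLock.
// `AuthProvider`, `HttpClient`, and `StaticToken` are hypothetical names.

trait AuthProvider: Send + Sync {
    fn token(&self) -> String;
}

struct HttpClient {
    // Empty at construction; filled exactly once after the provider exists.
    auth: OnceLock<Arc<dyn AuthProvider>>,
}

impl HttpClient {
    fn new() -> Self {
        HttpClient { auth: OnceLock::new() }
    }

    // Phase 3: wire the provider in after it has been constructed.
    fn set_auth(&self, provider: Arc<dyn AuthProvider>) {
        self.auth
            .set(provider)
            .map_err(|_| ())
            .expect("auth provider already set");
    }

    fn auth_header(&self) -> Option<String> {
        self.auth.get().map(|p| format!("Bearer {}", p.token()))
    }
}

struct StaticToken(String);
impl AuthProvider for StaticToken {
    fn token(&self) -> String {
        self.0.clone()
    }
}

fn main() {
    let http = Arc::new(HttpClient::new()); // Phase 1: no auth yet
    // Phase 2: a real OAuth provider would hold http.clone() to fetch tokens.
    let provider = Arc::new(StaticToken("t0ken".into()));
    http.set_auth(provider); // Phase 3
    assert_eq!(http.auth_header().as_deref(), Some("Bearer t0ken"));
    println!("ok");
}
```

Requests issued before Phase 3 (i.e., the provider's own token-endpoint calls) simply see `auth.get() == None` and go out unauthenticated, which is exactly what an OAuth token request needs.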
rust/src/client/sea.rs
@@ -0,0 +1,64 @@
+diff --git a/rust/src/client/sea.rs b/rust/src/client/sea.rs
+--- a/rust/src/client/sea.rs
++++ b/rust/src/client/sea.rs
+             match current_response.status.state {
+                 StatementState::Succeeded => return Ok(current_response),
+                 StatementState::Failed => {
+-                    let error_msg = current_response
+-                        .status
+-                        .error
+-                        .as_ref()
++                    let service_error = current_response.status.error.as_ref();
++                    let error_msg = service_error
+                         .and_then(|e| e.message.clone())
+                         .unwrap_or_else(|| "Unknown error".to_string());
+-                    return Err(DatabricksErrorHelper::io().message(error_msg));
++
++                    // Create error with message
++                    let mut error = DatabricksErrorHelper::io().message(&error_msg);
++
++                    // Try to set SQLSTATE from server error in order of preference:
++                    // 1. sql_state field if present
++                    // 2. Extract from error message (server includes "SQLSTATE: XXXXX")
++                    // 3. Map from error_code
++                    if let Some(err) = service_error {
++                        let mut sqlstate_set = false;
++
++                        // Check if server provides sql_state field
++                        if let Some(ref sql_state) = err.sql_state {
++                            if sql_state.len() == 5 {
++                                if let Some(sqlstate) =
++                                    crate::error::sqlstate_str_to_array(sql_state)
++                                {
++                                    error = error.sqlstate(sqlstate);
++                                    sqlstate_set = true;
++                                }
++                            }
++                        }
++
++                        // If not, try to extract from error message
++                        if !sqlstate_set {
++                            if let Some(sqlstate) =
++                                crate::error::extract_sqlstate_from_message(&error_msg)
++                            {
++                                error = error.sqlstate(sqlstate);
++                                sqlstate_set = true;
++                            }
++                        }
++
++                        // If still no SQLSTATE, map from error_code
++                        if !sqlstate_set {
++                            if let Some(ref code) = err.error_code {
++                                if let Some(sqlstate) =
++                                    crate::error::map_error_code_to_sqlstate(code)
++                                {
++                                    error = error.sqlstate(sqlstate);
++                                }
++                            }
++                        }
++                    }
++
++                    return Err(error);
+                 }
+                 StatementState::Canceled => {
+                     return Err(
\ No newline at end of file
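The three-step SQLSTATE preference order in the hunk above (server `sql_state` field, then `SQLSTATE: XXXXX` embedded in the message, then a static `error_code` map) can be sketched standalone. This is a simplified illustration: it returns plain `String`s rather than the `c_char` array the real code produces, and only a couple of entries from the error-code map are reproduced.

```rust
// Simplified sketch of the SQLSTATE resolution chain from the diff above.
// Types are reduced to strings; the mapping table is abbreviated.

fn extract_from_message(message: &str) -> Option<String> {
    // Look for "SQLSTATE:" followed by optional whitespace and 5 alphanumerics.
    let idx = message.find("SQLSTATE:")?;
    let rest = message[idx + "SQLSTATE:".len()..].trim_start();
    let code: String = rest.chars().take(5).collect();
    (code.len() == 5 && code.chars().all(|c| c.is_ascii_alphanumeric())).then_some(code)
}

fn map_code(error_code: &str) -> Option<&'static str> {
    match error_code {
        "PARSE_SYNTAX_ERROR" => Some("42601"),
        "TABLE_OR_VIEW_NOT_FOUND" => Some("42S02"),
        _ => None,
    }
}

fn resolve_sqlstate(
    sql_state: Option<&str>,
    message: &str,
    error_code: Option<&str>,
) -> Option<String> {
    sql_state
        .filter(|s| s.len() == 5)        // 1. explicit field wins
        .map(str::to_owned)
        .or_else(|| extract_from_message(message)) // 2. embedded in message
        .or_else(|| error_code.and_then(map_code).map(str::to_owned)) // 3. code map
}

fn main() {
    assert_eq!(
        resolve_sqlstate(Some("22012"), "x", Some("PARSE_SYNTAX_ERROR")).as_deref(),
        Some("22012")
    );
    assert_eq!(
        resolve_sqlstate(None, "boom SQLSTATE: 42601 (line 1)", None).as_deref(),
        Some("42601")
    );
    assert_eq!(
        resolve_sqlstate(None, "boom", Some("TABLE_OR_VIEW_NOT_FOUND")).as_deref(),
        Some("42S02")
    );
    println!("ok");
}
```

Ordering the fallbacks this way means a server that starts populating `sql_state` directly takes precedence automatically, without changing the message-parsing or code-mapping paths.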
rust/src/error.rs
@@ -0,0 +1,184 @@
+diff --git a/rust/src/error.rs b/rust/src/error.rs
+--- a/rust/src/error.rs
++++ b/rust/src/error.rs
+ /// A convenient alias for Results with Databricks errors.
+ pub type Result<T> = std::result::Result<T, Error>;
+ 
++/// Extract SQLSTATE from error message text.
++///
++/// Many Databricks error messages include "SQLSTATE: XXXXX" in the text.
++/// This function extracts the 5-character SQLSTATE code if present.
++pub fn extract_sqlstate_from_message(message: &str) -> Option<[std::os::raw::c_char; 5]> {
++    // Look for pattern "SQLSTATE: XXXXX" or "SQLSTATE:XXXXX"
++    let sqlstate_pattern = "SQLSTATE:";
++    if let Some(start_idx) = message.find(sqlstate_pattern) {
++        let after_pattern = &message[start_idx + sqlstate_pattern.len()..];
++        // Skip optional whitespace
++        let trimmed = after_pattern.trim_start();
++        // Extract exactly 5 characters
++        if trimmed.len() >= 5 {
++            let sqlstate_str = &trimmed[..5];
++            // Verify it looks like a SQLSTATE (alphanumeric)
++            if sqlstate_str.chars().all(|c| c.is_ascii_alphanumeric()) {
++                return sqlstate_str_to_array(sqlstate_str);
++            }
++        }
++    }
++    None
++}
++
++/// Convert a 5-character SQLSTATE string to a c_char array.
++pub fn sqlstate_str_to_array(sqlstate_str: &str) -> Option<[std::os::raw::c_char; 5]> {
++    let bytes = sqlstate_str.as_bytes();
++    if bytes.len() != 5 {
++        return None;
++    }
++    Some([
++        bytes[0] as std::os::raw::c_char,
++        bytes[1] as std::os::raw::c_char,
++        bytes[2] as std::os::raw::c_char,
++        bytes[3] as std::os::raw::c_char,
++        bytes[4] as std::os::raw::c_char,
++    ])
++}
++
++/// Map Databricks server error codes to ANSI SQL SQLSTATE codes.
++///
++/// This function converts Databricks-specific error codes (e.g., PARSE_SYNTAX_ERROR,
++/// TABLE_OR_VIEW_NOT_FOUND) into standardized 5-character SQLSTATE codes as defined
++/// by the SQL standard and ODBC specification.
++///
++/// Returns a 5-byte array suitable for the ADBC error sqlstate field, or `None`
++/// if the error code is not recognized.
++pub fn map_error_code_to_sqlstate(error_code: &str) -> Option<[std::os::raw::c_char; 5]> {
++    let sqlstate_str = match error_code {
++        // Syntax errors - SQLSTATE 42601
++        "PARSE_SYNTAX_ERROR" => "42601",
++
++        // Table/view not found - SQLSTATE 42S02
++        "TABLE_OR_VIEW_NOT_FOUND" => "42S02",
++
++        // Column not found - SQLSTATE 42S22
++        "COLUMN_NOT_FOUND" | "UNRESOLVED_COLUMN" => "42S22",
++
++        // Division by zero - SQLSTATE 22012
++        "DIVIDE_BY_ZERO" => "22012",
++
++        // Invalid argument/data - SQLSTATE 22023
++        "INVALID_PARAMETER_VALUE" | "INVALID_ARGUMENT" => "22023",
++
++        // Data type mismatch - SQLSTATE 42804
++        "DATATYPE_MISMATCH" => "42804",
++
++        // Duplicate key - SQLSTATE 23000
++        "DUPLICATE_KEY" => "23000",
++
++        // Access denied - SQLSTATE 42000
++        "PERMISSION_DENIED" | "ACCESS_DENIED" => "42000",
++
++        // Numeric value out of range - SQLSTATE 22003
++        "NUMERIC_VALUE_OUT_OF_RANGE" | "ARITHMETIC_OVERFLOW" => "22003",
++
++        // Unknown/unmapped error codes return None
++        _ => return None,
++    };
++
++    sqlstate_str_to_array(sqlstate_str)
++}
++
+ #[cfg(test)]
+ mod tests {
+     use super::*;
+         assert_eq!(adbc_error.status, adbc_core::error::Status::NotImplemented);
+         assert!(adbc_error.message.contains("Databricks"));
+     }
++
++    #[test]
++    fn test_map_error_code_to_sqlstate() {
++        use super::map_error_code_to_sqlstate;
++
++        // Test syntax error mapping
++        let sqlstate = map_error_code_to_sqlstate("PARSE_SYNTAX_ERROR").unwrap();
++        assert_eq!(
++            std::str::from_utf8(unsafe {
++                std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++            })
++            .unwrap(),
++            "42601"
++        );
++
++        // Test table not found mapping
++        let sqlstate = map_error_code_to_sqlstate("TABLE_OR_VIEW_NOT_FOUND").unwrap();
++        assert_eq!(
++            std::str::from_utf8(unsafe {
++                std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++            })
++            .unwrap(),
++            "42S02"
++        );
++
++        // Test column not found mapping
++        let sqlstate = map_error_code_to_sqlstate("COLUMN_NOT_FOUND").unwrap();
++        assert_eq!(
++            std::str::from_utf8(unsafe {
++                std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++            })
++            .unwrap(),
++            "42S22"
++        );
++
++        // Test unmapped error code returns None
++        assert!(map_error_code_to_sqlstate("UNKNOWN_ERROR_CODE").is_none());
++    }
++
++    #[test]
++    fn test_error_with_sqlstate() {
++        use super::map_error_code_to_sqlstate;
++
++        let sqlstate = map_error_code_to_sqlstate("PARSE_SYNTAX_ERROR").unwrap();
++        let error = DatabricksErrorHelper::invalid_argument()
++            .message("syntax error near 'SELECT'")
++            .sqlstate(sqlstate);
++
++        let adbc_error = error.to_adbc();
++        assert_eq!(
++            adbc_error.status,
++            adbc_core::error::Status::InvalidArguments
++        );
++        assert!(adbc_error.message.contains("syntax error"));
++
++        // Verify SQLSTATE is set correctly
++        let sqlstate_str = std::str::from_utf8(unsafe {
++            std::slice::from_raw_parts(adbc_error.sqlstate.as_ptr() as *const u8, 5)
++        })
++        .unwrap();
++        assert_eq!(sqlstate_str, "42601");
++    }
++
++    #[test]
++    fn test_extract_sqlstate_from_message() {
++        use super::extract_sqlstate_from_message;
++
++        // Test extraction with space after colon
++        let msg = "Error: Something went wrong. SQLSTATE: 42601 (line 1, pos 26)";
++        let sqlstate = extract_sqlstate_from_message(msg).unwrap();
++        let sqlstate_str = std::str::from_utf8(unsafe {
++            std::slice::from_raw_parts(sqlstate.as_ptr() as *const u8, 5)
++        })
++        .unwrap();
++        assert_eq!(sqlstate_str, "42601");
++
++        // Test extraction without space after colon
++        let msg2 = "Error message. SQLSTATE:42S02 some more text";
++        let sqlstate2 = extract_sqlstate_from_message(msg2).unwrap();
++        let sqlstate_str2 = std::str::from_utf8(unsafe {
++            std::slice::from_raw_parts(sqlstate2.as_ptr() as *const u8, 5)
++        })
++        .unwrap();
++        assert_eq!(sqlstate_str2, "42S02");
++
++        // Test message without SQLSTATE
++        let msg3 = "Error without sqlstate";
++        assert!(extract_sqlstate_from_message(msg3).is_none());
++    }
+ }
\ No newline at end of file
rust/src/types/sea.rs
@@ -0,0 +1,12 @@
+diff --git a/rust/src/types/sea.rs b/rust/src/types/sea.rs
+--- a/rust/src/types/sea.rs
++++ b/rust/src/types/sea.rs
+     pub error_code: Option<String>,
+     #[serde(default)]
+     pub message: Option<String>,
++    /// SQL state code if provided by server (typically in the message, but may be a separate field)
++    #[serde(default)]
++    pub sql_state: Option<String>,
+ }
+ 
+ /// Manifest describing the result set structure.
\ No newline at end of file

Reproduce locally: git range-diff f1b352f..4dd8f51 4115e5f..d2cecd6 | Disable: git config gitstack.push-range-diff false

3-task breakdown covering foundation + HTTP client changes,
M2M provider, and U2M provider with full test coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vikrantpuppala vikrantpuppala force-pushed the stack/oauth-u2m-m2m-design branch from d2cecd6 to 250ff3d Compare March 12, 2026 13:49
@vikrantpuppala vikrantpuppala changed the title [PECOBLR-2089] docs(rust): OAuth U2M and M2M authentication design + sprint plan [PECOBLR-2089] docs(rust/oauth): OAuth U2M and M2M authentication design Mar 12, 2026
@vikrantpuppala vikrantpuppala changed the title [PECOBLR-2089] docs(rust/oauth): OAuth U2M and M2M authentication design docs(rust/oauth): OAuth U2M and M2M authentication design Mar 13, 2026
@vikrantpuppala vikrantpuppala changed the title docs(rust/oauth): OAuth U2M and M2M authentication design docs(rust): OAuth U2M and M2M authentication design Mar 13, 2026
@vikrantpuppala vikrantpuppala merged commit 5fe4084 into adbc-drivers:main Mar 13, 2026
19 of 25 checks passed