Conversation
Add a new `odbc-ffi` feature that exposes extern "C" functions for ODBC catalog operations (SQLTables, SQLColumns, SQLGetTypeInfo, SQLPrimaryKeys, etc.), returning flat Arrow result sets via the Arrow C Data Interface.

New files:
- metadata/schemas.rs: JDBC/ODBC result set schema definitions
- metadata/type_info.rs: static type info catalog for SQLGetTypeInfo
- metadata/service.rs: MetadataService trait + ConnectionMetadataService
- ffi/mod.rs, error.rs, handle.rs, odbc.rs: C FFI surface
- spec/odbc-metadata-ffi-design.md: design specification

Also adds PK/FK parsers to parse.rs, public accessors on Connection, and an INTERVAL type mapping aligned with the Databricks JDBC driver.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Soundness fixes:
- Remove the unsound 'static lifetime from handle_to_service(); use an explicit lifetime parameter 'a instead
- Remove the unsound 'static lifetime from c_str_to_option/c_str_to_str
- Remove dead odbc_connection_create() that always failed
- Gate the client()/runtime_handle() accessors behind the odbc-ffi feature

Correctness fixes:
- Fix FK update_rule/delete_rule: use SQL_NO_ACTION (3) instead of SQL_RESTRICT (1), since Databricks FKs are informational
- Fix TYPE_INFO_SCHEMA: use Int16 instead of Boolean for CASE_SENSITIVE, UNSIGNED_ATTRIBUTE, FIXED_PREC_SCALE, AUTO_UNIQUE_VALUE per the ODBC spec
- Document the SQL_ALL_TYPES == 0 ambiguity in odbc_get_type_info

Code quality:
- Extract build_type_info_batch() as a shared function, removing the duplicated test helper
- Remove the `let _ = n;` anti-pattern in get_tables/get_primary_keys/get_foreign_keys
- Add design notes documenting the INTERVAL/ARRAY/MAP/STRUCT type code decisions and the TIMESTAMP/TIMESTAMP_NTZ dual-entry behavior
- Add a test for get_type_info(93) returning both TIMESTAMP entries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
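As a sketch of the lifetime fix: the helper below mirrors the `c_str_to_option` name from the commit, with the borrow tied to a caller-supplied lifetime `'a` instead of being unsoundly promoted to `'static`. The body is an illustrative reconstruction, not the driver's actual code.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Convert a possibly-null C string pointer to an Option<&str>.
///
/// The returned borrow is tied to the lifetime 'a chosen by the caller,
/// who must guarantee the foreign buffer outlives it. Returning &'static
/// here would let the borrow outlive the buffer, which is unsound.
unsafe fn c_str_to_option<'a>(ptr: *const c_char) -> Option<&'a str> {
    if ptr.is_null() {
        return None;
    }
    // CStr::from_ptr borrows the foreign buffer; to_str validates UTF-8.
    CStr::from_ptr(ptr).to_str().ok()
}
```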
The static type info catalog (SQLGetTypeInfo data) is ODBC-specific and belongs in the ODBC wrapper layer, not in the Arrow-native ADBC driver. The caller is expected to handle type conversions.

Removed:
- src/metadata/type_info.rs (static TYPE_INFO_ENTRIES catalog)
- TYPE_INFO_SCHEMA from schemas.rs
- get_type_info from the MetadataService trait and implementation
- odbc_get_type_info FFI function
- build_type_info_batch and append_opt_bool helpers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The FFI layer is not ODBC-specific — any library or application can call
these functions to get flat Arrow RecordBatches of catalog metadata.
Renames:
- Feature: odbc-ffi → metadata-ffi
- File: ffi/odbc.rs → ffi/catalog.rs
- Types: OdbcConnectionHandle → FfiConnectionHandle,
OdbcFfiStatus → FfiStatus, OdbcFfiError → FfiError
- Functions: odbc_* → metadata_*
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the intermediate Arrow → Rust structs → Arrow conversion in the FFI path. The metadata service now returns raw RecordBatch data directly from Databricks, letting the caller handle column renaming and type mapping.

- Rewrite service.rs: remove the MetadataService trait, stub methods, and all parsing/rebuilding logic; add collect_batches() using concat_batches
- Delete schemas.rs (only used by the old service.rs for schema definitions)
- Trim ffi/catalog.rs from 16 to 9 FFI functions (remove table_types, statistics, special_columns, procedures, procedure_columns, table_privileges, column_privileges)
- Clean up parse.rs: remove the PrimaryKeyInfo and ForeignKeyInfo structs and parse_primary_keys/parse_foreign_keys (only used by the old service.rs)
- ADBC get_objects path (connection.rs → builder.rs) unchanged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tation

Rewrite the spec to reflect the raw Arrow pass-through architecture:
- No MetadataService trait, just ConnectionMetadataService
- No schemas.rs or type_info.rs (deleted)
- 9 FFI functions (down from 16), 6 catalog methods
- Feature renamed from odbc-ffi to metadata-ffi
- Caller handles column renaming, type mapping, reshaping

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…error

Add panic safety (catch_unwind) to the handle management functions, and clear_last_error at FFI entry points so callers never see stale errors. Add unit tests for error buffer behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
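A minimal sketch of the pattern this commit describes, assuming a thread-local `String` buffer and a plain integer status code (the driver's actual `FfiStatus`/`FfiError` types and function names differ):

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

thread_local! {
    // Thread-local so concurrent callers never observe each other's errors.
    static LAST_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

fn clear_last_error() {
    LAST_ERROR.with(|e| *e.borrow_mut() = None);
}

fn set_last_error(msg: impl Into<String>) {
    LAST_ERROR.with(|e| *e.borrow_mut() = Some(msg.into()));
}

fn last_error() -> Option<String> {
    LAST_ERROR.with(|e| e.borrow().clone())
}

/// Wrap a fallible body so panics become an error status instead of
/// unwinding across the FFI boundary (which is undefined behavior).
/// Clearing the buffer on entry means callers never see a stale error.
fn ffi_guard(body: impl FnOnce() -> Result<(), String>) -> i32 {
    clear_last_error();
    match catch_unwind(AssertUnwindSafe(body)) {
        Ok(Ok(())) => 0,
        Ok(Err(msg)) => {
            set_last_error(msg);
            1
        }
        Err(_) => {
            set_last_error("panic in FFI call");
            1
        }
    }
}
```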
…tches Replace the collect-all-then-re-wrap pattern with direct streaming: - service.rs returns Box<dyn ResultReader> instead of RecordBatch - Remove collect_batches() and multi-catalog orchestration in get_columns - catalog.rs uses ResultReaderAdapter to stream through FFI - list_columns now takes Option<&str> catalog, uses SHOW COLUMNS IN ALL CATALOGS - build_show_columns handles optional catalog (no longer returns Result) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d FFI handle

- ResultReaderAdapter: schema, iteration, empty, error propagation, schema error
- export_reader: empty reader, multiple batches, schema error
- handle_panic: &str, String, and unknown panic types
- handle.rs: null connection sets a proper error message

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
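The three `handle_panic` cases in the test list above correspond to the standard panic-payload downcast chain; a hedged stand-in follows (the driver's actual helper name and fallback text may differ):

```rust
use std::any::Any;

/// Turn a payload from std::panic::catch_unwind into a printable message.
/// Panics via panic!("literal") carry &str; format-string panics carry
/// String; anything else (panic_any) gets a generic fallback.
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic payload".to_string()
    }
}
```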
rust/src/ffi/catalog.rs
(Outdated)

```rust
//! `extern "C"` catalog metadata functions.
```
this file is not only for catalog?
rust/src/ffi/catalog.rs
(Outdated)
file name is misleading
changed to metadata.rs
rust/src/ffi/catalog.rs
(Outdated)

```rust
/// Export a RecordBatch as an FFI_ArrowArrayStream.
///
/// The caller is responsible for releasing the stream.
```
is this safe? why do we need this function?
yes, needed for required params (PK/FK)
rust/src/ffi/catalog.rs
(Outdated)

```rust
let reader: Box<dyn RecordBatchReader + Send> = Box::new(RecordBatchIterator::new(
    vec![Ok(batch)].into_iter(),
    schema,
));
```
we are collecting all batches first so this is not really streaming, can we use ResultReaderAdapter that we already have?
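The reviewer's point can be illustrated with simplified stand-in types: the real `ResultReaderAdapter` bridges the driver's `ResultReader` to Arrow's `RecordBatchReader`, pulling one batch at a time instead of collecting everything up front. Here batches are stand-in `Vec<i64>`s so the sketch is self-contained.

```rust
/// Simplified stand-in for the driver's ResultReader trait.
trait ResultReader {
    fn next_batch(&mut self) -> Option<Result<Vec<i64>, String>>;
}

/// Adapter exposing a ResultReader through the Iterator interface.
/// Each call to next() pulls exactly one batch from the source, so
/// nothing is buffered ahead of the consumer.
struct ResultReaderAdapter<R: ResultReader> {
    inner: R,
}

impl<R: ResultReader> Iterator for ResultReaderAdapter<R> {
    type Item = Result<Vec<i64>, String>;
    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next_batch()
    }
}

/// Toy source producing two batches, then end-of-stream.
struct TwoBatches {
    n: usize,
}

impl ResultReader for TwoBatches {
    fn next_batch(&mut self) -> Option<Result<Vec<i64>, String>> {
        if self.n < 2 {
            self.n += 1;
            Some(Ok(vec![self.n as i64]))
        } else {
            None
        }
    }
}
```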
rust/src/ffi/catalog.rs
(Outdated)

```rust
/// - `conn` must be a valid handle from `metadata_connection_from_ref()`
/// - `out` must point to a valid, writable `FFI_ArrowArrayStream`
#[no_mangle]
pub unsafe extern "C" fn metadata_get_catalogs(
```
none of the extern "C" functions in catalog.rs use std::panic::catch_unwind.
rust/src/lib.rs
(Outdated)

```rust
// Metadata FFI — additional extern "C" functions for catalog metadata
// when built with `cargo build --features metadata-ffi`
#[cfg(feature = "metadata-ffi")]
pub mod ffi;
```
should this be `pub(crate) mod ffi`?
rust/src/metadata/service.rs
(Outdated)

```rust
use std::sync::Arc;

/// Collect all batches from an `ExecuteResult` into a single `RecordBatch`.
fn collect_batches(result: ExecuteResult) -> Result<RecordBatch> {
```
why? let's stream directly, see existing result set adapter pattern
Done, returns `Box<dyn ResultReader>` directly
rust/src/metadata/service.rs
(Outdated)

```rust
let mut all_batches = Vec::new();
let mut schema = None;

for cat in &catalogs_to_query {
```
this is sequential, any reason why? eventually we should move to all catalogs.
rust/src/metadata/service.rs
(Outdated)

```rust
///
/// Executes metadata SQL queries and returns raw Arrow `RecordBatch` results
/// directly from the server, without intermediate parsing or schema reshaping.
pub struct ConnectionMetadataService {
```
should we gate this with `#[cfg(feature = "metadata-ffi")]`?
- Rename ffi/catalog.rs → ffi/metadata.rs (the file handles all metadata, not just catalogs)
- Change `pub mod ffi` to `pub(crate) mod ffi` (no need to expose FFI internals)
- Gate metadata/service.rs with `#[cfg(feature = "metadata-ffi")]`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…for None

- Remove wildcard/ALL CATALOGS handling from build_show_columns
- Catalog is now required; the service returns an empty stream when None
- SeaClient validates that catalog is present before building SQL

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Overall: good progress from v1, with streaming now going through `ResultReaderAdapter`. My main concern is the layering; I'd recommend removing the intermediate service layer. Detailed comments on individual files below.

This comment was generated with GitHub MCP.
rust/src/metadata/service.rs
(Outdated)
This entire file can be removed. Every method follows the same pattern:
```rust
pub fn get_X(&self, ...) -> Result<Box<dyn ResultReader + Send>> {
    let result = self.runtime.block_on(self.client.list_X(&self.session_id, ...))?;
    Ok(result.reader)
}
```

The FFI functions in metadata.rs can call `Connection` directly; it already holds client, session_id, and runtime. The only non-trivial logic is `get_columns`'s empty-catalog guard (3 lines), which can live in the FFI function itself.
rust/src/ffi/handle.rs
(Outdated)
With the service removed, this file can also go. Instead of:

```
metadata_connection_from_ref(conn_ptr) → Box<Service> → opaque handle
metadata_get_tables(handle, ...) → handle_to_service → svc.get_tables()
metadata_connection_free(handle)
```

the FFI functions can just take `*const Connection`:

```rust
pub unsafe extern "C" fn metadata_get_tables(
    conn: *const Connection,
    ...,
    out: *mut FFI_ArrowArrayStream,
) -> FfiStatus {
    let conn = &*conn;
    let result = conn.runtime_handle().block_on(
        conn.client().list_tables(conn.session_id(), ...)
    )?;
    export_reader(result.reader, out)
}
```

This eliminates the create/free lifecycle entirely. The ODBC side already has the `Connection*` via `adbc_connection_.private_data`.
Additional service layer removed
```rust
.message("catalog is required for SHOW COLUMNS (ALL CATALOGS not yet supported)")
})?;

/// Build SHOW COLUMNS command.
```
build_show_columns changed from returning `Result<String>` to panicking via `expect()`. This is a regression: library code shouldn't panic on invalid input. While catch_unwind in the FFI layer would catch it, unwinding through Rust code has edge cases and shouldn't be relied on for normal error handling.

Two options:
- Revert to `Result<String>` (preferred)
- Take `catalog: &str` (not `Option`) since it's always required; then the type system enforces it and there's nothing to panic on
changed to taking catalog: &str
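A sketch of the resolved signature: with `catalog: &str` required by the type system, there is no missing-catalog state to panic on and no `Result` needed. The exact SQL shape and parameter set here are assumptions for illustration, not the driver's actual builder.

```rust
/// Build a SHOW COLUMNS statement (illustrative SQL shape).
/// Requiring `catalog: &str` in the signature makes a missing catalog
/// unrepresentable, so no expect()/panic path and no Result is needed.
fn build_show_columns(catalog: &str, schema: &str, table: &str) -> String {
    // Backtick-quote the identifiers, as Databricks SQL does.
    format!("SHOW COLUMNS IN `{catalog}`.`{schema}`.`{table}`")
}
```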
```rust
/// Catalog is required — callers must pass `Some(catalog)`.
async fn list_columns(
    &self,
    session_id: &str,
```
`list_columns` changed from `catalog: &str` to `catalog: Option<&str>`, but SeaClient immediately errors on None. If catalog is always required at the client level, the trait should enforce it with `&str`.
The None → empty result case belongs at the call site (the FFI function), not pushed into the trait.
The ODBC C++ side will need declarations for FfiStatus, FfiError, and all extern "C" functions. Can you add a databricks_metadata_ffi.h header? Or set up cbindgen to generate one. Without it the ODBC side has to manually mirror these declarations, which is error-prone as the surface evolves.
Good point, added
```rust
pub unsafe extern "C" fn metadata_get_foreign_keys(
    conn: FfiConnectionHandle,
    catalog: *const c_char,
    schema: *const c_char,
```
MockReader and SchemaErrorReader are duplicated between here and rust/src/reader/mod.rs tests. Consider extracting to a shared #[cfg(test)] module.
| "ARRAY" => 2003, // JDBC ARRAY | ||
| "MAP" => 2000, // JDBC JAVA_OBJECT | ||
| "STRUCT" => 2002, // JDBC STRUCT | ||
| "INTERVAL" => 12, // JDBC VARCHAR (matches JDBC driver) |
There was a problem hiding this comment.
The INTERVAL → VARCHAR mapping is a good fix but unrelated to the metadata FFI feature. Should this be a separate commit to keep the diff focused?
I think this is a minor change, we can keep it
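For context, the codes in the diff above are the standard `java.sql.Types` constants (ARRAY = 2003, JAVA_OBJECT = 2000, STRUCT = 2002, VARCHAR = 12). A small helper equivalent to the mapping shown, as an illustration rather than the driver's exact function:

```rust
/// Map a Databricks type name to a JDBC type code (java.sql.Types).
/// INTERVAL maps to VARCHAR (12) to match the Databricks JDBC driver,
/// which reports interval values as strings.
fn jdbc_type_code(type_name: &str) -> Option<i32> {
    match type_name {
        "ARRAY" => Some(2003),  // java.sql.Types.ARRAY
        "MAP" => Some(2000),    // java.sql.Types.JAVA_OBJECT
        "STRUCT" => Some(2002), // java.sql.Types.STRUCT
        "INTERVAL" => Some(12), // java.sql.Types.VARCHAR
        _ => None,
    }
}
```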
…, add C header

- Remove service.rs and handle.rs: FFI functions now take `*const Connection` directly instead of going through ConnectionMetadataService + an opaque handle. Eliminates ~300 lines of indirection that didn't carry its weight.
- Change the `list_columns` trait method: `catalog: Option<&str>` → `catalog: &str`, since the client always requires it. The empty-catalog guard moves to the FFI layer.
- Fix `build_show_columns`: takes a `catalog: &str` parameter instead of panicking via `expect()` on missing builder state. The type system enforces the requirement.
- Add `databricks_metadata_ffi.h`: a C header with declarations for all FFI types and functions, so the ODBC C++ side doesn't manually mirror Rust declarations.
- Deduplicate test mocks: extract MockReader, ErrorReader, SchemaErrorReader to a shared `reader/test_utils.rs` module, used by both reader and FFI tests.
- Update the design spec to reflect the simplified architecture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Adds a `metadata-ffi` feature flag exposing `extern "C"` functions for catalog metadata operations (GetCatalogs, GetSchemas, GetTables, GetColumns, GetPrimaryKeys, GetForeignKeys)
- Results are streamed as a `RecordBatchReader` via the Arrow C Data Interface (`FFI_ArrowArrayStream`), using `ResultReaderAdapter` to bridge `ResultReader` → `RecordBatchReader` without buffering
- Adds `ConnectionMetadataService` (a thin pass-through to `DatabricksClient`), C FFI handle/error management with `catch_unwind` panic safety, and a thread-local error buffer
- `list_columns` accepts an optional catalog and uses `SHOW COLUMNS IN ALL CATALOGS` server-side when the catalog is None/wildcard (no client-side multi-catalog orchestration)

Files

New
- `ffi/mod.rs`, `ffi/catalog.rs`, `ffi/error.rs`, `ffi/handle.rs`: C FFI surface (9 extern "C" functions)
- `metadata/service.rs`: `ConnectionMetadataService` returning a streaming `Box<dyn ResultReader>`
- `spec/odbc-metadata-ffi-design.md`: design specification

Modified
- `client/mod.rs`: `list_columns` catalog `&str` → `Option<&str>`
- `client/sea.rs`: handle the optional catalog in `list_columns`
- `metadata/sql.rs`: `build_show_columns` supports ALL CATALOGS, returns `String` (not `Result`)
- `connection.rs`: updated `list_columns` call sites
- `reader/mod.rs`: added `ResultReaderAdapter` unit tests

Data Flow

No intermediate `collect_batches` or `concat_batches`: batches stream lazily from network to caller.

Test plan
- Unit tests pass (`cargo test`)
- `cargo clippy -- -D warnings` clean
- `cargo fmt` clean

🤖 Generated with Claude Code