docs: add Thrift protocol test suite design #74

Merged
eric-wang-1990 merged 25 commits into main from docs/thrift-protocol-test-suite-design
Jan 9, 2026

@eric-wang-1990
Collaborator

Add comprehensive design document for ADBC driver Thrift protocol test suite.

Key features:

  • Language-agnostic test specifications (~300 test cases across 16 categories)
  • Standalone Go proxy server for failure injection testing
  • Multi-language support strategy (C#, Java, C++, Go)
  • Extractable design for future common repository

Initial implementation targets C# ADBC driver with plans to extend to:

  • Java (JDBC driver)
  • C++ (ODBC driver)
  • Go (ADBC driver)

The design enables comprehensive testing of:

  • Session lifecycle and management
  • Statement execution (sync/async)
  • Metadata operations
  • Arrow format and compression
  • CloudFetch results and failure scenarios
  • Parameterized queries
  • Error handling and recovery
  • Concurrency and edge cases

Directory structure:

  • docs/designs/thrift-protocol-tests/ - Design and specifications
  • test-infrastructure/proxy-server/ - Standalone proxy (to be implemented)

Related to runtime Thrift tests and ADBC E2E tests, but focuses on comprehensive protocol compliance and failure scenario testing.

Contributor

@lidavidm left a comment

## Architecture

```
┌─────────────────────────────────────────┐
Collaborator Author

Use mermaid graph

- **C# ADBC**: This repository
- **Java JDBC**: Separate repository
- **C++ ODBC**: Separate repository
- **Go ADBC**: Part of Apache Arrow ADBC
Collaborator Author

Not necessarily Go ADBC, there is a Go driver.


## Getting Started

### For Reviewers
Collaborator Author

This part should not be in the README; it belongs in the PR description.

- [ ] C++ test implementation (future)
- [ ] Go test implementation (future)

## Timeline
Collaborator Author

Remove this part, put in PR description

- **Weeks 9-10**: C# robustness tests (concurrency, edge cases)
- **Future**: Adapt for Java, C++, Go drivers

## Questions?
Collaborator Author

remove this

### High-Level Architecture

```
┌─────────────────────────────────────────────────────────┐
Collaborator Author

Change to mermaid graph


### Overview

| Part | Category | Test Count | Priority | Initial Lang |
Collaborator Author

Remove the Initial Lang column as it is not important


### Rollout Plan

**Phase 1: C# Implementation (Weeks 1-10)**
Collaborator Author

10 weeks is too long; fit it into a 2-week sprint. Same for the other places.

- Convert ASCII diagram to Mermaid graph for better rendering
- Change 'Go ADBC' to 'Go driver' for accuracy
- Remove 'For Reviewers' section (moved to PR description)
- Remove 'Timeline' section (moved to PR description)
- Add section explaining relationship to existing adbc-drivers/validation suite

Addresses review comments from @lidavidm and @eric-wang-1990
@eric-wang-1990
Collaborator Author

Review Feedback Addressed

Thank you for the review! I've addressed all the comments:

@lidavidm's feedback on validation repository

✅ Added a "Relationship to Existing Validation" section explaining how this Thrift protocol test suite complements the existing adbc-drivers/validation suite. They serve different purposes:

  • Validation suite: ADBC API feature validation and SQL query correctness
  • This suite: Databricks Thrift protocol compliance and failure scenario testing

My own review comments

  • Mermaid graph (line 43): Converted ASCII diagram to Mermaid format
  • Go driver (line 85): Changed "Go ADBC" to "Go driver" for accuracy
  • For Reviewers section (line 130): Removed - this belongs in PR description
  • Timeline section (line 175): Removed - this belongs in PR description

All changes pushed in commit 2d0b4c2.

- Convert ASCII diagram to Mermaid graph in design.md
- Remove 'Initial Lang' column from test categories table

Addresses review comments from @eric-wang-1990
@eric-wang-1990
Collaborator Author

Additional Review Feedback Addressed

Addressed additional comments on design.md:

  • Mermaid graph in design.md (line 130): Converted ASCII diagram to Mermaid format for consistency
  • Removed 'Initial Lang' column: Removed from test categories table as it's not important for the design

Changes pushed in commit f8ab173.

### Example: Cross-Language Test Implementation

**Specification** (shared):
```markdown
Collaborator

I know this can be generated by AI easily, but I'm wondering if we can define a format that avoids AI code generation for each test case.

- Injects failures based on configuration
- Logs all traffic for debugging

**Configuration Example**:
Collaborator

Is YAML the best way to inject failures? Or should we create an injection API on the proxy service, which might be more flexible?

Just like how we set up mocks.

Enhance proxy server configuration with 24 real-world failure scenarios based on Databricks customer issues discovered through Glean search.

Key additions:
- CloudFetch failures (6 scenarios): expired links, Azure 403, SSL errors, timeouts
- Connection reset errors (4 scenarios): TLS handshake, mid-fetch resets, SSL errors
- Session management issues (5 scenarios): premature timeouts, invalid handles, DBR/CP mismatch
- Protocol violations (3 scenarios): MaxRows exceeded, retry limits, communication failures
- Rate limiting and network failures (4 scenarios)

All scenarios include:
- JIRA ticket references for traceability
- Priority levels (P0/P1/P2) based on customer impact
- Detailed descriptions and trigger configurations
- Production-tuned probability settings

Top priority scenarios (P0):
- PECOBLR-1131: CloudFetch expired link refetching
- BL-13580: Connection reset during large result fetch
- ES-610899: Invalid SessionHandle errors
- BL-13239: CloudFetch timeout without clear errors

This significantly enhances the test suite's ability to catch real production issues before they impact customers.
@eric-wang-1990
Collaborator Author

Enhanced Proxy Server Design with Production Failure Scenarios

Added comprehensive failure scenario configuration based on real Databricks customer issues discovered through internal Glean search.

What's Added

24 production-validated failure scenarios with JIRA references across 6 categories:

| Category | Count | Key Issues |
|----------|-------|------------|
| 🔴 CloudFetch Failures | 6 | PECOBLR-1131, ES-1624602, BL-13239, ES-1539484 |
| 🔴 Connection Resets | 4 | BL-13580, BL-14202, ES-1657027, ES-1498241 |
| 🔴 Session Management | 5 | ES-610899, ES-1661289, XTA-11040, ES-1608485 |
| 🟡 Protocol Issues | 3 | PECO-2524, BL-14014, ES-1559149 |
| 🟢 Rate Limiting | 2 | Generic timeout scenarios |
| 🟢 Network Failures | 2 | Generic network issues |

Top Priority (P0) Scenarios

  1. cloudfetch_expired_link (PECOBLR-1131)
    • Most common CloudFetch issue in OSS JDBC driver
  2. connection_reset_during_fetch (BL-13580)
    • Data loss in large JDBC result sets
  3. invalid_session_handle (ES-610899)
    • Query failures when session invalidated
  4. cloudfetch_timeout (BL-13239)
    • Timeout without clear error during CloudFetch

Why This Matters

  • Real-world validated: Every scenario is based on actual customer tickets
  • Traceable: JIRA IDs link test failures back to original issues
  • Prioritized: P0/P1/P2 priorities guide implementation order
  • Actionable: Detailed trigger configurations ready for proxy implementation

This ensures the test suite catches issues that have actually impacted production customers, not just theoretical edge cases.

Changes pushed in commit dffa378.

### Non-Goals

- ❌ **Server-side testing**: Use existing runtime tests for ThriftServer behavior
- ❌ **Performance benchmarking**: Focus on correctness, not performance optimization
Collaborator Author

Performance test can be added later.

- ❌ Failure scenarios (expired links, network timeouts)
- ❌ Concurrent operation handling

**Go ADBC Driver Tests (reference):**
Collaborator Author

remove this, we do not have a go adbc driver for databricks

Contributor

(It needs to be ported over here though.)

…injection

Addresses two design questions from review:

1. Test Specification Format (line 168):
   - Added structured YAML format option alongside Markdown
   - YAML is machine-readable, enables code generation tools
   - Markdown remains for human-readable documentation
   - Avoids dependency on AI code generation

2. Failure Injection Approach (line 294):
   - Added API-based programmatic approach alongside YAML
   - YAML for declarative/static scenarios (90% of cases)
   - HTTP API for dynamic/stateful scenarios (10% of cases)
   - Includes complete workflow examples:
     * Step-by-step YAML-based workflow
     * C# and Go code examples with API
     * Multiple configuration strategies
     * Proxy API endpoints

Key benefits:
- Flexibility: Choose right tool for each scenario
- Simplicity: YAML for common cases, API for complex ones
- Familiarity: Similar to mocking frameworks
- Testability: Easy to switch configs per test suite
@eric-wang-1990
Collaborator Author

Addressed @jadewang-db Review Feedback

Thanks for the excellent design questions! I've enhanced the design to address both concerns:

1. Test Specification Format (Line 168)

Question: Can we avoid AI code gen dependency?

Solution: Added structured YAML format option alongside Markdown:

# Machine-readable test specification
test_id: TEST-SESSION-001
name: Basic OpenSession
steps:
  - action: open_session
    params:
      credentials: "{{test_credentials}}"
  - action: assert_not_null
    target: session_handle

Benefits:

  • ✅ Code generation tools can parse YAML directly
  • ✅ Validation tools can check test completeness
  • ✅ Markdown still available for documentation
  • ✅ No AI dependency for test implementation
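
As a rough illustration of how such a spec could be executed without generated code, the sketch below interprets the steps above in Python. The spec is written as the dict a YAML loader would produce. `FakeDriver`, the `ACTIONS` table, and `run_spec` are hypothetical names introduced only for this example; they are not part of the design.

```python
# The YAML spec above, written as the dict a YAML loader would produce.
SPEC = {
    "test_id": "TEST-SESSION-001",
    "name": "Basic OpenSession",
    "steps": [
        {"action": "open_session", "params": {"credentials": "{{test_credentials}}"}},
        {"action": "assert_not_null", "target": "session_handle"},
    ],
}


class FakeDriver:
    """Stand-in for a real driver binding (hypothetical)."""

    def open_session(self, credentials):
        return f"session-for-{credentials}"


def _open_session(ctx, step, driver):
    creds = step.get("params", {})["credentials"]
    # Substitute {{name}} placeholders with variables supplied by the runner.
    for key, value in ctx["vars"].items():
        creds = creds.replace("{{" + key + "}}", value)
    ctx["session_handle"] = driver.open_session(creds)


def _assert_not_null(ctx, step, driver):
    target = step["target"]
    if ctx.get(target) is None:
        raise AssertionError(f"{target} is null")


# One handler per action keyword; a new keyword needs one new entry here.
ACTIONS = {"open_session": _open_session, "assert_not_null": _assert_not_null}


def run_spec(spec, driver, variables):
    """Run each step in order and return the accumulated context."""
    ctx = {"vars": dict(variables)}
    for step in spec["steps"]:
        ACTIONS[step["action"]](ctx, step, driver)
    return ctx
```

Each language binding would supply its own handler table, so the spec file stays the single shared artifact.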

2. Failure Injection Approach (Line 294)

Question: Is YAML best? Or should we use an injection API?

Solution: Support both approaches (similar to mocking frameworks):

YAML for static scenarios (90% of cases):

failure_scenarios:
  - name: "cloudfetch_expired_link"
    trigger: "after_requests"
    count: 1
    action: "expire_cloud_link"

HTTP API for dynamic scenarios (10% of cases):

var proxy = new ProxyControlClient("http://localhost:8081");
await proxy.ConfigureFailure(new FailureScenario {
    Name = "dynamic_failure",
    Trigger = new AfterRequestsTrigger { Count = 1 },
    Action = new ExpireCloudLinkAction()
});

Proxy API Endpoints:

  • POST /api/v1/failures - Configure failure
  • GET /api/v1/failures - List active failures
  • DELETE /api/v1/failures/{name} - Clear specific failure
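
A minimal sketch of the state those three endpoints would manage, written in Python with no web framework so it can be read in isolation. `FailureRegistry` and the status codes it returns are assumptions made for illustration; the thread does not specify them.

```python
PREFIX = "/api/v1/failures"


class FailureRegistry:
    """In-memory store behind the three endpoints listed above (hypothetical)."""

    def __init__(self):
        self._failures = {}

    def handle(self, method, path, body=None):
        """Dispatch one request and return (status_code, payload)."""
        if path == PREFIX and method == "POST":
            if not body or "name" not in body:
                return 400, {"error": "name is required"}
            self._failures[body["name"]] = body
            return 201, body
        if path == PREFIX and method == "GET":
            return 200, list(self._failures.values())
        if path.startswith(PREFIX + "/") and method == "DELETE":
            name = path[len(PREFIX) + 1:]
            if self._failures.pop(name, None) is None:
                return 404, {"error": f"unknown failure {name!r}"}
            return 204, None
        return 404, {"error": "unsupported method or path"}
```

A real server would wrap `handle` in whatever HTTP layer the proxy uses; the dispatch logic stays the same.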

Workflow Example:

  1. Start proxy: go run main.go --config cloudfetch-failures.yaml
  2. Run tests: dotnet test --filter Category=CloudFetch
  3. Proxy injects failures based on YAML config
  4. Or use API for dynamic scenarios during test execution

Benefits:

  • ✅ Simple YAML for most cases (version controlled, easy to review)
  • ✅ Flexible API for complex/dynamic scenarios
  • ✅ Similar to mocking frameworks (familiar pattern)
  • ✅ Best of both worlds

Changes pushed in commit b7f4c81.

Address concern that YAML-based approach would require restarting proxy for each test.

**Key Change**: Hybrid approach where:
- YAML defines all failure scenarios (version controlled)
- HTTP API enables/disables scenarios dynamically
- Proxy starts ONCE, tests control it via API
- No restart needed between tests

**Benefits**:
- ✅ No proxy restart overhead
- ✅ Fast test execution
- ✅ Clean test isolation (enable/disable per test)
- ✅ YAML scenarios are reviewed and version controlled
- ✅ API provides flexibility without restart complexity

**Updated API Endpoints**:
- POST /api/v1/scenarios/{name}/enable - Enable YAML-defined scenario
- POST /api/v1/scenarios/{name}/disable - Disable scenario
- GET /api/v1/scenarios - List all defined scenarios
- GET /api/v1/scenarios/active - List active scenarios

**Example Workflow**:
1. Start proxy once: go run main.go --config all-scenarios.yaml
2. Test 1: Enable scenario "cloudfetch_expired_link" via API
3. Test 2: Enable scenario "session_timeout" via API (same proxy!)
4. No restart needed

This addresses the complexity concern while maintaining the benefits of both approaches.
@eric-wang-1990
Collaborator Author

Improved Design: Hybrid Approach (No Restart Needed!)

Great catch! Restarting the proxy for each test would indeed be impractical. I've updated the design to use a hybrid approach:

The Problem You Identified

❌ Pure YAML approach: Restart proxy for each test configuration
❌ Too slow and complicated

The Solution: Hybrid Approach

YAML defines scenarios (version controlled, reviewable):

# all-scenarios.yaml - loaded once at startup
failure_scenarios:
  - name: "cloudfetch_expired_link"
    trigger: "after_requests"
    action: "expire_cloud_link"
  - name: "session_timeout_premature"
    trigger: "after_duration"
    action: "invalidate_session"
  # ... all 24 scenarios defined

API controls activation (no restart needed):

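public class ProxyScenarioTests
{
    // Control client for the proxy, which is started once before the test run.
    private readonly ProxyControlClient proxy = new ProxyControlClient("http://localhost:8081");

    [Fact]
    public async Task Test1()
    {
        // Enable scenario from YAML
        await proxy.EnableScenario("cloudfetch_expired_link");
        try
        {
            // Run test...
        }
        finally
        {
            // Disable even if the test throws, so the next test starts clean
            await proxy.DisableScenario("cloudfetch_expired_link");
        }
    }

    [Fact]
    public async Task Test2()
    {
        // Different scenario, same proxy server!
        await proxy.EnableScenario("session_timeout_premature");
        try
        {
            // Run test...
        }
        finally
        {
            await proxy.DisableScenario("session_timeout_premature");
        }
    }
}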

Key Advantages

  • Start proxy ONCE: go run main.go --config all-scenarios.yaml
  • No restart between tests
  • Fast execution: No startup overhead
  • Clean isolation: Each test enables/disables its scenarios
  • YAML benefits: Version controlled, reviewed, documented
  • API flexibility: Dynamic control without complexity

Updated API Endpoints

  • POST /api/v1/scenarios/{name}/enable - Enable YAML-defined scenario
  • POST /api/v1/scenarios/{name}/disable - Disable scenario
  • GET /api/v1/scenarios - List all defined scenarios
  • GET /api/v1/scenarios/active - List currently active scenarios

This gives us the best of both worlds: declarative YAML definitions with imperative API control.
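
The enable/disable pairing described above can be sketched as a context manager, here in Python. `ProxyControl` and `http_transport` are hypothetical helpers for illustration only. The transport is injected so the pairing logic can be exercised without a running proxy.

```python
from contextlib import contextmanager
import urllib.request


class ProxyControl:
    """Thin client for the scenario endpoints listed above (hypothetical)."""

    def __init__(self, send):
        # `send` is any callable taking (method, path).
        self._send = send

    @contextmanager
    def scenario(self, name):
        self._send("POST", f"/api/v1/scenarios/{name}/enable")
        try:
            yield
        finally:
            # Always disable, even if the test body raises,
            # so the next test starts from a clean proxy state.
            self._send("POST", f"/api/v1/scenarios/{name}/disable")


def http_transport(base_url):
    """Build a `send` callable that talks to a live proxy control port."""

    def send(method, path):
        req = urllib.request.Request(base_url + path, data=b"", method=method)
        urllib.request.urlopen(req).close()

    return send
```

A test would then wrap its body in `with ProxyControl(http_transport("http://localhost:8081")).scenario("cloudfetch_expired_link"):`, which keeps the disable call from being skipped when an assertion fails.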

Changes pushed in commit d7730dc.

eric-wang-1990 and others added 4 commits December 16, 2025 15:14
- Remove "Questions?" section from README (move to PR description)
- Update performance benchmarking as future work (not hard non-goal)
- Remove [Initial]/[Future] labels from Mermaid diagram
- Update implementation plan to 2-week sprint format (from 10 weeks)
- Update all timeline references for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Key design decisions based on review feedback:
- Remove trigger logic (WHEN) - YAML only defines WHAT to inject
- API controls activation - scenarios apply to next matching request
- Remove probability/random triggers - tests should be deterministic
- Focus on action definitions with operation matching

Schema includes:
- 7 action types (return_error, delay, close_connection, etc.)
- Optional operation matching (OpenSession, FetchResults, etc.)
- Complete examples for CloudFetch, connection, session failures
- Clear validation rules and field requirements
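
One way to read "scenarios apply to next matching request" is as a one-shot match, sketched below in Python. `Scenario` and `ScenarioEngine` are hypothetical names, and the choice to disarm a scenario after it fires once is an assumption of this sketch rather than something the commit states.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Scenario:
    name: str
    action: str                      # e.g. "return_error", "delay", "close_connection"
    operation: Optional[str] = None  # None matches any Thrift operation


@dataclass
class ScenarioEngine:
    defined: Dict[str, Scenario]
    _armed: List[str] = field(default_factory=list)

    def enable(self, name: str) -> None:
        if name not in self.defined:
            raise KeyError(name)
        if name not in self._armed:
            self._armed.append(name)

    def on_request(self, operation: str) -> Optional[Scenario]:
        """Return the first armed scenario matching this operation and disarm it."""
        for name in self._armed:
            scenario = self.defined[name]
            if scenario.operation in (None, operation):
                self._armed.remove(name)  # safe: we return immediately
                return scenario
        return None
```

Because nothing here is random or time-based, the same sequence of API calls and requests always yields the same injected failures.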

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Removed from examples:
- priority (P0/P1/P2) - not needed for functionality
- retryable - adds complexity, not essential
- at_byte - advanced parameter, keep it simple

Kept essential fields:
- name, description, action (required)
- operation (useful for targeting)
- jira (useful for traceability to production issues)
- Action-specific parameters (error_code, duration, etc.)
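
The field rules above lend themselves to a small validator. The Python sketch below is illustrative only: it covers just the three action types named in this thread, and the mapping from each action to its parameters is an assumption.

```python
REQUIRED = ("name", "description", "action")
OPTIONAL = ("operation", "jira")

# Only the action types named in the thread; the full set of seven is not listed there.
# Which parameters belong to which action is assumed for this sketch.
ACTION_PARAMS = {
    "return_error": ("error_code", "error_message"),
    "delay": ("duration",),
    "close_connection": (),
}


def validate_scenario(scenario):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key in REQUIRED:
        if not scenario.get(key):
            errors.append(f"missing required field: {key}")
    action = scenario.get("action")
    if action and action not in ACTION_PARAMS:
        errors.append(f"unknown action: {action}")
    allowed = set(REQUIRED) | set(OPTIONAL) | set(ACTION_PARAMS.get(action, ()))
    for key in scenario:
        if key not in allowed:
            errors.append(f"unexpected field: {key}")
    return errors
```

Rejecting unexpected fields means a removed key such as `priority` is flagged if it lingers in an old scenario file.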

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Removed from all proxy configuration examples:
- trigger, probability, count (trigger logic)
- priority, min_duration, max_duration (metadata)
- retryable, at_row, at_byte, phase, context, factor, retry_after, at_percent (complex parameters)

Updated to simplified format:
- name, jira, description (core fields)
- operation (optional, for targeting specific Thrift operations)
- action (required, what to inject)
- Action-specific parameters (error_code, duration, error_message, etc.)

Also updated:
- Removed priority columns from failure scenario summary table
- Updated implementation notes to reflect deterministic approach
- Changed "Top Priority" to "Recommended" scenarios (no P0/P1 labels)

Count: 21 simplified scenarios (down from 24 after removing duplicates)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
eric-wang-1990 added a commit that referenced this pull request Dec 18, 2025
Implements comprehensive tests for Thrift protocol session and statement
operations using the new call verification API.

Session Lifecycle Tests (5 tests):
- BasicSession_OpensAndCloses - Verifies OpenSession → CloseSession
- Session_ExecutesQuery_WithProperSequence - Full query sequence
- Session_WithMultipleStatements_TracksAllOperations - Multiple statements per session
- Session_CloseOperationCalled_AfterEachStatement - Resource cleanup verification
- Session lifecycle management

Statement Execution Tests (6 tests):
- SimpleQuery_ExecutesWithExpectedSequence - Basic ExecuteStatement flow
- LongRunningQuery_PollsOperationStatus - GetOperationStatus polling
- StatementWithFetchResults_CallsExpectedMethods - FetchResults verification
- MultipleStatements_EachHasOwnOperation - Concurrent statement handling
- Statement_OperationLifecycle_ProperSequence - Verify call ordering

Key Features:
- Uses VerifyThriftCallsAsync for sequence validation
- Tests method counts, existence, and ordering
- Validates proper resource cleanup (CloseOperation)
- Covers both fast (directResult) and slow (polling) queries
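
To show what sequence, count, and ordering checks amount to, here is a small Python sketch over a recorded list of Thrift method names. It is not how `VerifyThriftCallsAsync` is implemented; the helper names are hypothetical and the recorded log in the usage note is made up.

```python
from collections import Counter


def is_ordered_subsequence(expected, recorded):
    """True if every expected call appears in `recorded` in the same relative order."""
    it = iter(recorded)
    # `name in it` consumes the iterator up to the match, so order is enforced.
    return all(name in it for name in expected)


def count_calls(recorded):
    """Number of times each Thrift method was observed."""
    return Counter(recorded)
```

For a polling query recorded as `OpenSession, ExecuteStatement, GetOperationStatus, GetOperationStatus, FetchResults, CloseOperation, CloseSession`, the ordering check passes for `OpenSession → ExecuteStatement → FetchResults → CloseOperation → CloseSession`, and the count for `GetOperationStatus` is 2.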

Based on PR #74 design doc test categories.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
eric-wang-1990 added four more commits that referenced this pull request on Dec 19, 2025, each with the same commit message.
…erns

- Change proxy from Go to Python/mitmproxy with HTTPS interception
- Add Thrift call verification section with real-world test examples
- Document complete TLS certificate trust configuration required for driver
- Update API endpoints to match actual Flask implementation
- Change from sprint-based to timeless phased approach
- Add rationale for mitmproxy choice over alternatives
- Update architecture diagram and component responsibilities
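
For readers unfamiliar with mitmproxy, an addon is a plain Python class whose hook methods receive each flow. The sketch below is a guess at the general shape, not the addon from this PR: `ConnectionResetAddon`, the `/cloudfetch/` URL marker, and the `armed` flag are all hypothetical, and how the control API flips `armed` is left out.

```python
class ConnectionResetAddon:
    """mitmproxy addon sketch: kill the next request whose URL contains a marker."""

    def __init__(self, url_marker):
        self.url_marker = url_marker
        self.armed = False  # a control API would set this to True

    def request(self, flow):
        # mitmproxy calls this hook for every client request.
        if self.armed and self.url_marker in flow.request.pretty_url:
            self.armed = False  # one-shot, so tests stay deterministic
            flow.kill()         # drop the connection without responding


# mitmproxy picks up addons from this module-level list
# when the file is loaded with `mitmdump -s <file>`.
addons = [ConnectionResetAddon(url_marker="/cloudfetch/")]
```

Because the class never imports mitmproxy itself, it can be unit-tested with a fake flow object that exposes `request.pretty_url` and `kill()`.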
…entation

- Consolidate docs/designs/thrift-protocol-tests/README.md into test-infrastructure/README.md
- Update test-infrastructure/README.md with:
  - Comprehensive overview and architecture
  - Actual mitmproxy implementation details (not Go)
  - Complete directory structure and component descriptions
  - Getting started guide and usage examples
  - Remove implementation status tracking (timeless doc)
- Rewrite test-infrastructure/proxy-server/README.md with:
  - Actual mitmproxy usage instructions
  - Flask API endpoint documentation
  - CloudFetch failure scenarios table with JIRA references
  - Complete configuration examples for C# tests
  - Architecture and debugging sections
- Remove redundant docs/designs/thrift-protocol-tests/README.md
- Delete docs/designs/thrift-protocol-tests/design.md (1198 lines)
  - Architecture and rationale already covered in test-infrastructure/README.md
  - Most useful content already consolidated
- Delete docs/designs/thrift-protocol-tests/proxy-config-schema.md (443 lines)
  - Obsolete: describes YAML config but we use hardcoded Python scenarios
  - API endpoints already documented in proxy-server/README.md
- Update both READMEs to remove references to deleted docs
- Add mitmproxy docs reference

Result: 2 focused READMEs instead of 4 documents
- test-infrastructure/README.md: Overview, architecture, getting started
- test-infrastructure/proxy-server/README.md: Proxy usage, API reference, debugging
Add back important sections from deleted design.md:
- Implementation Plan: 5 phases (Foundation → Critical Tests → Comprehensive → Advanced → Cross-Driver)
- Key Design Decisions: Rationale for language-agnostic specs, mitmproxy choice, C# first approach
- Alternatives Considered: Why we chose mitmproxy over Go/C#/Java/custom solutions

These sections provide valuable context for future development and decision-making.
**Phase 1: Foundation Infrastructure**
- Design document and architecture definition
- Directory structure (`test-infrastructure/` with `proxy-server/` and `tests/`)
- Proxy server implementation (mitmproxy-based) with:
Collaborator

Where can we get the detailed list of test cases?

Collaborator Author

I refactored to add those in the /specs folder

Restore additional important sections from deleted design.md:

1. Goals and Non-Goals:
   - Goals: Comprehensive coverage, driver behavior focus, language-agnostic specs,
     failure scenario testing, extractable design
   - Non-Goals: Server-side testing, performance benchmarking, load testing,
     protocol design changes

2. Alternatives Considered:
   - Alternative 1: Mock server (rejected - want real server testing)
   - Alternative 2: Separate test suites per driver (rejected - leads to inconsistency)
   - Alternative 3: Single language implementation (rejected - interop complexity)

These provide context for design decisions and future contributors.
…ions

Add machine-readable YAML format for test specifications to enable
cross-language implementation:

New files:
- test-infrastructure/specs/cloudfetch.yaml
  - 3 implemented tests: CLOUDFETCH-001, CLOUDFETCH-002, CLOUDFETCH-003
  - 7 planned tests: CLOUDFETCH-004 through CLOUDFETCH-010
  - Maps directly to proxy scenarios in mitmproxy_addon.py
  - Includes steps, assertions, driver configs, and JIRA references

- test-infrastructure/specs/README.md
  - Documentation for YAML structure and usage
  - Examples for C# and Java implementations
  - Assertion types and measurement types reference
  - Validation instructions

Updates:
- test-infrastructure/README.md
  - Add Test Specifications component section
  - Update directory structure to show specs/
  - Add quick link to specs documentation
  - Add YAML example in Components section

Benefits:
- Single source of truth for test behavior across C#, Java, C++, Go
- Machine-readable format can validate implementations
- Direct mapping to proxy scenarios
- Easy to see all tests and their status at a glance
- Can potentially generate test stubs for new languages
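
As a sketch of the "validate implementations" and "status at a glance" benefits, the Python below summarizes a spec index and checks its scenario references. The field names `id`, `status`, and `proxy_scenario` are assumptions; the actual layout of cloudfetch.yaml is not shown in this thread.

```python
from collections import Counter


def build_cloudfetch_index():
    """CLOUDFETCH-001..010 with the implemented/planned split from the commit message."""
    return [
        {"id": f"CLOUDFETCH-{n:03d}", "status": "implemented" if n <= 3 else "planned"}
        for n in range(1, 11)
    ]


def status_summary(tests):
    """Count tests per status value."""
    return Counter(t["status"] for t in tests)


def missing_scenarios(tests, known_scenarios):
    """IDs of tests that reference a proxy scenario the proxy does not define."""
    return [
        t["id"]
        for t in tests
        if t.get("proxy_scenario") and t["proxy_scenario"] not in known_scenarios
    ]
```

Running `missing_scenarios` in CI would catch a spec that points at a scenario name the proxy no longer provides.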
Add back the ~300 test cases across 16 categories table that was
accidentally removed during consolidation. This shows the full scope
of planned testing including:
- 3 Critical categories (Session, Statement, CloudFetch, Error Handling)
- 8 High priority categories
- 4 Medium priority categories
- 1 Low priority category (Performance)

Note CloudFetch has 3 tests implemented with 17 more planned.
Add Background section that provides context for the test infrastructure:

1. Thrift Protocol Overview:
   - Protocol hierarchy: Hive → Spark → Databricks extensions
   - 20 key RPC operations across Session, Execution, Results, Metadata
   - Databricks extensions: CloudFetch, Direct Results, Arrow Streaming, Parameterized Queries

2. Current Test Coverage Analysis:
   - What's tested: E2E, unit tests, CloudFetch happy path
   - Coverage gaps: Metadata operations, parameterized queries, protocol negotiation,
     failure scenarios, concurrency, direct results

This context helps developers understand what we're testing and why.
Add comprehensive table of all 16 planned test suites (~300 tests):

Implemented:
- cloudfetch.yaml (3 tests implemented, 7 planned)

Planned YAML files to create:
- session-lifecycle.yaml (15 tests, Critical)
- statement-execution.yaml (25 tests, Critical)
- metadata-operations.yaml (40 tests, High)
- arrow-format.yaml (20 tests, High)
- direct-results.yaml (15 tests, High)
- parameterized-queries.yaml (20 tests, High)
- result-fetching.yaml (15 tests, High)
- error-handling.yaml (30 tests, Critical)
- timeout-cleanup.yaml (12 tests, Medium)
- concurrency.yaml (15 tests, Medium)
- protocol-versions.yaml (12 tests, Medium)
- security.yaml (15 tests, High)
- performance.yaml (10 tests, Low)
- edge-cases.yaml (36 tests, Medium)

This provides a clear roadmap for future YAML spec creation.
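
The planned counts above can be tallied per priority with a few lines of Python. The numbers are copied from the list; the helper name is hypothetical, and cloudfetch.yaml is left out because its priority is not given in that list.

```python
from collections import defaultdict

# (test count, priority) for each planned spec file, as listed above.
PLANNED = {
    "session-lifecycle.yaml": (15, "Critical"),
    "statement-execution.yaml": (25, "Critical"),
    "metadata-operations.yaml": (40, "High"),
    "arrow-format.yaml": (20, "High"),
    "direct-results.yaml": (15, "High"),
    "parameterized-queries.yaml": (20, "High"),
    "result-fetching.yaml": (15, "High"),
    "error-handling.yaml": (30, "Critical"),
    "timeout-cleanup.yaml": (12, "Medium"),
    "concurrency.yaml": (15, "Medium"),
    "protocol-versions.yaml": (12, "Medium"),
    "security.yaml": (15, "High"),
    "performance.yaml": (10, "Low"),
    "edge-cases.yaml": (36, "Medium"),
}


def totals_by_priority(suites):
    """Sum planned test counts per priority label."""
    totals = defaultdict(int)
    for count, priority in suites.values():
        totals[priority] += count
    return dict(totals)
```

The 14 planned files sum to 280 tests: 70 Critical, 125 High, 75 Medium, and 10 Low. The CloudFetch tests come on top of that figure.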

See [specs/README.md](./specs/README.md) for full documentation.

**Planned Test Suites (~300 test cases across 16 categories):**
Collaborator Author

This can probably be removed, since we already have detailed tests in specs/README.md.


| Type | Description | Save As |
|------|-------------|---------|
| `thrift_method` | Count calls to Thrift method | Variable name |
Collaborator Author

The thrift_method type should also take a method name, such as GetCatalogs.


## Assertion Types

| Type | Description | Parameters |
Collaborator Author

This needs to be reworked. We should not predefine the types; just use equations/formulas.
E.g. for testing a GetColumns metadata call we want to verify:

  1. No error is thrown
  2. The result schema matches our golden results

eric-wang-1990 and others added 6 commits January 8, 2026 17:08
- Rework assertions to use equations/formulas instead of predefined types
- Add method name examples (FetchResults, GetCatalogs, etc.) to measurements
- Remove duplicate test suites table from main README (now in specs/README.md)
- Update cloudfetch.yaml to use new assertion format (no_error, formulas)
- Add GetColumns validation example per feedback
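
A formula-based assertion can be evaluated without `eval()` by walking the parsed expression. The Python sketch below shows one possible approach under stated assumptions: the formula syntax (Python-style comparisons and arithmetic) and the measurement names are hypothetical, not the format used in cloudfetch.yaml.

```python
import ast
import operator

_BIN = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}
_CMP = {ast.Eq: operator.eq, ast.NotEq: operator.ne, ast.Lt: operator.lt,
        ast.LtE: operator.le, ast.Gt: operator.gt, ast.GtE: operator.ge}


def evaluate(formula, measurements):
    """Evaluate a comparison/arithmetic formula over named measurements."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, str)):
            return node.value
        if isinstance(node, ast.Name):
            return measurements[node.id]  # KeyError for unknown measurements
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN:
            return _BIN[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Compare):
            left = walk(node.left)
            # Handle chained comparisons such as 1 < x <= 3.
            for op, comparator in zip(node.ops, node.comparators):
                if type(op) not in _CMP:
                    raise ValueError("unsupported comparison")
                right = walk(comparator)
                if not _CMP[type(op)](left, right):
                    return False
                left = right
            return True
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return walk(ast.parse(formula, mode="eval"))
```

Anything outside names, constants, arithmetic, and comparisons raises `ValueError`, so a spec file cannot smuggle in function calls or attribute access.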
Resolved conflict in test-infrastructure/proxy-server/README.md by accepting version from main
Add missing Apache License headers to:
- test-infrastructure/README.md
- test-infrastructure/specs/README.md
- test-infrastructure/specs/cloudfetch.yaml

This fixes the Apache RAT check failures in CI.
@eric-wang-1990 merged commit 4c0598b into main on Jan 9, 2026
2 checks passed
@eric-wang-1990 deleted the docs/thrift-protocol-test-suite-design branch on January 9, 2026 02:08

3 participants