Skip to content

Conversation

@oetiker
Copy link

@oetiker oetiker commented Sep 19, 2025

DBD::Pg Async Query Ownership and Data Preservation Fixes

Caveat! I developed this code and the tests tests using Claude code in order to fix to #105 and mojolicious/mojo#2276

Overview

This document describes the comprehensive fixes for DBD::Pg's asynchronous query handling, addressing GitHub issue #105 and related problems with PG_OLDQUERY_WAIT data loss.

Problems Addressed

1. GitHub Issue #105: Statement Destruction Cancels Unrelated Async Queries

Problem: Destroying any statement handle would cancel the active async query of another statement, even if the destroyed statement never executed an async query.

Root Cause: The dbd_st_destroy() and dbd_st_finish() functions unconditionally cleared the global imp_dbh->async_sth pointer and called handle_old_async() without checking if the statement being destroyed actually owned the async query.

Impact: Applications using multiple statement handles with async queries would experience random query cancellations when unrelated statements were destroyed or went out of scope.

2. PG_OLDQUERY_WAIT Data Loss

Problem: Using PG_OLDQUERY_WAIT to wait for a previous async query would successfully wait but would discard the query results instead of preserving them.

Root Cause: The handle_old_async() function would consume and discard results when waiting, with no mechanism to preserve them for the owning statement.

Impact: Data loss - results from async queries would be irretrievably lost when another query used PG_OLDQUERY_WAIT.

3. Missing Ownership Verification

Problem: Any statement handle could call pg_result() and retrieve another statement's async results, violating ownership semantics.

Root Cause: The pg_db_result() function did not verify that the calling statement was the one that initiated the async query.

Impact: Race conditions, incorrect data retrieval, and unpredictable behavior in multi-statement applications.

Technical Background

PostgreSQL's async query limitations:

  • Only ONE async query can be active per database connection at any time
  • DBD::Pg provides a statement-level API despite the connection-level limitation
  • The module tracks ownership via imp_dbh->async_sth pointing to the statement that owns the current async query

The Fixes

Fix 1: Ownership Checking in Statement Lifecycle Functions

Location: dbdimp.c - dbd_st_finish() (lines 3977-3980)

/* Only clear async_sth if THIS statement is the current async statement */
if (imp_dbh->async_sth == imp_sth) {
    imp_dbh->async_sth = NULL;
    imp_dbh->async_status = 0;
}

Location: dbdimp.c - dbd_st_destroy() (lines 4122-4127, 4194-4197)

/* Only call handle_old_async if THIS statement owns the async query */
if (imp_dbh->async_status && imp_dbh->async_sth == imp_sth) {
    handle_old_async(aTHX_ sth, imp_dbh, PG_OLDQUERY_WAIT);
}

/* Only clear async_sth if THIS statement owns it */
if (imp_dbh->async_sth == imp_sth) {
    imp_dbh->async_sth = NULL;
}

Fix 2: Auto-Retrieve Mechanism for PG_OLDQUERY_WAIT

Location: dbdimp.c - handle_old_async() (line ~5715)

Instead of discarding results, the function now:

  1. Retrieves the pending async query results
  2. Stores them in the owning statement's structure
  3. Sets async_status = 100 to indicate auto-retrieved results
  4. Preserves row count and result data for later access

Fix 3: Ownership Verification in pg_db_result()

Location: dbdimp.c - pg_db_result() (lines 5292-5397)

Added comprehensive ownership checking:

/* Determine if called from statement or database handle */
D_imp_xxh(h);
imp_sth_t *imp_sth = NULL;
if (DBIc_TYPE(imp_xxh) == DBIt_ST) {
    imp_sth = (imp_sth_t*)imp_xxh;
}

/* Check ownership BEFORE consuming the result */
if (imp_sth && imp_dbh->async_sth && imp_sth != imp_dbh->async_sth) {
    pg_error(aTHX_ h, PGRES_FATAL_ERROR,
             "pg_result() called on wrong statement handle - "
             "this statement does not own the async query\n");
    return -2;
}

Fix 4: Auto-Retrieved Result Handling

Location: dbdimp.c - pg_db_result()

Special handling for auto-retrieved results:

/* Check if this statement had auto-retrieved results from PG_OLDQUERY_WAIT */
if (imp_sth && imp_sth->async_status == STH_ASYNC_AUTORETRIEVED) {
    /* Results were auto-retrieved - return the stored row count */
    imp_sth->async_status = STH_NO_ASYNC;  /* Clear the auto-retrieved flag */
    return imp_sth->rows;
}
else if (imp_sth && imp_sth->async_status == STH_ASYNC_AUTOERROR) {
    /* This statement has auto-retrieved error result */
    /* Report the error from the stored result */
    ...
}

New Async Status Values

The implementation uses named constants for async_status values on statement handles:

enum {
    STH_ASYNC_AUTOERROR = -2,    /* PG_OLDQUERY_WAIT auto-retrieved an error result */
    STH_ASYNC_CANCELLED = -1,
    STH_NO_ASYNC,                /* 0 - No async query */
    STH_ASYNC,                   /* 1 - Active async query */
    STH_ASYNC_PREPARE,           /* 2 - Async prepare in progress */
    STH_ASYNC_AUTORETRIEVED = 100 /* PG_OLDQUERY_WAIT auto-retrieved results */
};

Test Coverage

The fix includes comprehensive test coverage in t/08async_regression.t with 7 test scenarios:

  1. Basic ownership - Prevents wrong statements from stealing results
  2. $dbh->pg_result ownership - Database handle can retrieve, finished statements cannot
  3. Statement destruction - Destroying unrelated statements doesn't affect async queries
  4. Cross-statement interference - Multiple statements don't interfere
  5. Interleaved operations - OLDQUERY_WAIT preserves data correctly
  6. Error attribution - Errors are properly attributed to correct statements
  7. Data preservation - OLDQUERY_WAIT auto-retrieves and preserves data

Behavior Changes

Before the Fix

  • Destroying any statement could cancel another's async query
  • PG_OLDQUERY_WAIT would discard previous query results
  • Any statement could steal another's async results
  • Unpredictable behavior with multiple async statements

After the Fix

  • Only the owning statement can affect its async query
  • PG_OLDQUERY_WAIT auto-retrieves and preserves results
  • Strict ownership verification prevents result theft
  • Predictable, safe multi-statement async operations

Migration Notes

The fixes are backward compatible with one exception:

  • Code that relied on the broken behavior of stealing async results from wrong statements will now correctly fail with an error

@oetiker oetiker changed the title DBD::Pg Async Query Ownership and Data Preservation Fixes (Fix for #105 and hopefully https://github.com/mojolicious/mojo/issues/2276) DBD::Pg Async Query Ownership and Data Preservation Fixes (Fix for #105 and https://github.com/mojolicious/mojo/issues/2276) Sep 19, 2025
@oetiker oetiker changed the title DBD::Pg Async Query Ownership and Data Preservation Fixes (Fix for #105 and https://github.com/mojolicious/mojo/issues/2276) DBD::Pg Async Query Ownership and Data Preservation Fixes Sep 19, 2025
@oetiker
Copy link
Author

oetiker commented Oct 20, 2025

ping ?

@turnstep
Copy link
Contributor

Please pull the recent async-related changes and retest.

@esabol
Copy link
Contributor

esabol commented Dec 12, 2025

Also, this PR has merge conflicts. Please rebase/resolve, @oetiker.

@oetiker
Copy link
Author

oetiker commented Dec 13, 2025

Also, this PR has merge conflicts. Please rebase/resolve, @oetiker.

I will have to re-evaluate after the changes ...

oetiker and others added 2 commits December 15, 2025 09:49
- Fix statement ownership checks in dbd_st_finish and dbd_st_destroy
- Add auto-retrieval of results with PG_OLDQUERY_WAIT instead of discarding
- Add ownership verification in pg_db_result to prevent result stealing
- Handle $dbh->do() async queries (no statement handle) correctly
- Add NULL check for async_sth in handle_old_async cancel path
- Add named constants STH_ASYNC_AUTORETRIEVED and STH_ASYNC_AUTOERROR

Fixes GitHub Issue bucardo#105

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@oetiker oetiker force-pushed the oetiker-async-regressions branch from e40cc86 to fe5c547 Compare December 15, 2025 09:14
oetiker and others added 2 commits December 15, 2025 10:44
Add missing 'use lib' statement to load from blib/ instead of
system-installed DBD::Pg.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
The async connect tests create new connections but were hardcoding
an empty password. Now uses $ENV{DBI_PASS} when set, falling back
to empty string for environments with trust authentication.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@oetiker
Copy link
Author

oetiker commented Dec 15, 2025

The PR is rebased and the entire testsuite is passing including the new testfile.
Note I added a fix for a test which was trying to connect without password even when a password was supplied.

@esabol
Copy link
Contributor

esabol commented Dec 15, 2025

@turnstep : Could you approve the CI workflow to run for this PR?

@esabol
Copy link
Contributor

esabol commented Dec 15, 2025

The PR is rebased and the entire testsuite is passing including the new testfile. Note I added a fix for a test which was trying to connect without password even when a password was supplied.

Thank you, @oetiker ! Looks promising.

@esabol esabol mentioned this pull request Dec 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants