Skip to content

Conversation

suchapalaver
Copy link
Collaborator

@suchapalaver suchapalaver commented Jul 21, 2025

Note on fix for test_retry_unaggregated_fees: Correcting RAV Mock Behavior

Problem Summary

The test test_retry_unaggregated_fees was failing with "Expected notify to be triggered" after recent testcontainers migration work. Investigation revealed the root cause was an incorrectly implemented mock behavior that didn't align with documented expectations.

Root Cause Analysis

Original Mock Behavior (Incorrect)

The original MockSenderAllocation was sending successful RAV responses that did not clear unaggregated fees:

sender_account.cast(SenderAccountMessage::UpdateReceiptFees(
    AllocationId::Legacy(AllocationIdCore::from(ALLOCATION_ID_0)),
    ReceiptFees::RavRequestResponse(
        UnaggregatedReceipts {
            value: *self.next_unaggregated_fees_value.borrow(), // ❌ NOT clearing fees!
            last_id: 0,
            counter: 0,
        },
        Ok(Some(signed_rav.into())),
    ),
))?;

This behavior was inconsistent with documented expectations. A comment in the codebase explicitly states:

// trigger rav request
// set the unnagregated fees to zero and the rav to the amount

Retry Logic Analysis

The sender account retry mechanism works as follows:

  1. Enters retry state when: sender is denied AND deny condition is reached
  2. Continues retrying when: sender remains denied AND deny condition still reached
  3. Stops retrying when: sender is denied BUT deny condition no longer reached

The deny condition is: total_potential_fees = pending_ravs + unaggregated_fees > balance

Why the Test Was Passing Before

The test was artificially passing because:

  1. Mock sent successful RAV response but didn't clear unaggregated fees
  2. Sender remained blocked since unaggregated_fees still exceeded the balance limit
  3. Retry mechanism continued, triggering a second retry attempt
  4. Test expected 2 retries and got them due to the incorrectly persistent deny condition

Solution Implemented

Added Comprehensive Documentation

Added detailed documentation throughout the code to explain:

  • MockSenderAllocation behavior: Documents that the mock follows TAP protocol by clearing unaggregated fees
  • Retry logic: Explains when retries continue vs. stop based on deny conditions
  • Test expectations: Clarifies why only one retry should occur when RAV succeeds
  • deny_condition_reached(): Documents the core logic determining when senders are blocked

Fixed Mock Behavior

Updated the mock to correctly implement the documented behavior:

sender_account.cast(SenderAccountMessage::UpdateReceiptFees(
    AllocationId::Legacy(AllocationIdCore::from(ALLOCATION_ID_0)),
    ReceiptFees::RavRequestResponse(
        UnaggregatedReceipts {
            value: 0, // ✅ Correctly clear unaggregated fees to zero
            last_id: 0,
            counter: 0,
        },
        Ok(Some(RavInformation {
            allocation_id: ALLOCATION_ID_0,
            value_aggregate: current_value, // ✅ Create RAV for the full amount
        })),
    ),
))?;

Updated Test Expectations

Modified the test to expect the correct behavior:

  • First retry triggers and succeeds, clearing unaggregated fees and resolving deny condition
  • No second retry occurs because the underlying issue is resolved
// wait to try again so it's outside the buffer
tokio::time::sleep(RETRY_DURATION).await;
assert_triggered!(triggered_rav_request);

// wait to check that no additional retry happens since the first RAV
// successfully cleared the unaggregated fees and resolved the deny condition
tokio::time::sleep(RETRY_DURATION).await;
assert_not_triggered!(triggered_rav_request); // ✅ Expect no second retry

Impact and Justification

Why This Fix Is Correct

  1. Aligns with documented behavior: Implements the explicit comment "set the unnagregated fees to zero"
  2. Matches real-world TAP behavior: Successful RAV requests should clear unaggregated fees
  3. Follows proper retry logic: Retries should stop when underlying condition is resolved
  4. Improves test accuracy: Tests now verify correct behavior rather than incorrect mock artifacts

Behavior Change

  • Before: Mock artificially kept sender blocked, causing unnecessary retries
  • After: Mock correctly unblocks sender when RAV request succeeds, stopping retries appropriately

This change makes the test more accurate in representing real system behavior where successful RAV processing resolves the underlying deny condition and stops unnecessary retries.

@suchapalaver suchapalaver force-pushed the suchapalaver/test/replace-sqlx-test-with-testcontainers branch from 1e0a2e0 to 4a57d33 Compare July 21, 2025 20:51
@suchapalaver suchapalaver force-pushed the suchapalaver/test/replace-sqlx-test-with-testcontainers branch 3 times, most recently from c3e0181 to 5d01552 Compare July 21, 2025 21:17
@coveralls
Copy link

coveralls commented Jul 21, 2025

Pull Request Test Coverage Report for Build 16432285415

Details

  • 709 of 718 (98.75%) changed or added relevant lines in 13 files are covered.
  • 55 unchanged lines in 6 files lost coverage.
  • Overall coverage increased (+5.2%) to 74.629%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/tap-agent/src/tap/context/receipt.rs 383 386 99.22%
crates/test-assets/src/lib.rs 105 111 94.59%
Files with Coverage Reduction New Missed Lines %
crates/watcher/src/lib.rs 1 95.83%
crates/service/src/tap/checks/value_check.rs 2 95.02%
crates/tap-agent/src/agent/sender_allocation.rs 3 79.6%
crates/service/src/tap/checks/deny_list_check.rs 6 86.49%
crates/tap-agent/src/agent/sender_account.rs 6 85.75%
crates/test-assets/src/lib.rs 37 88.75%
Totals Coverage Status
Change from base Build 16427537367: 5.2%
Covered Lines: 11816
Relevant Lines: 15833

💛 - Coveralls

@suchapalaver suchapalaver force-pushed the suchapalaver/test/replace-sqlx-test-with-testcontainers branch 2 times, most recently from 4a95373 to d924db1 Compare July 21, 2025 21:45
@suchapalaver suchapalaver marked this pull request as ready for review July 22, 2025 01:31
@suchapalaver suchapalaver force-pushed the suchapalaver/test/replace-sqlx-test-with-testcontainers branch from 5466624 to 8f26845 Compare July 22, 2025 01:34
Copy link
Collaborator

@neithanmo neithanmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great contribution to the testing infrastructure
I also loved the documentation, that is really helpful!

/// This creates an isolated PostgreSQL container and database for testing.
/// The container will be kept alive as long as the returned TestDatabase
/// instance is not dropped.
pub async fn setup_shared_test_db() -> TestDatabase {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is magic!
I will add something like this to graph-agent for sure!

@suchapalaver suchapalaver merged commit 12bf3e2 into main Jul 22, 2025
12 checks passed
@suchapalaver suchapalaver deleted the suchapalaver/test/replace-sqlx-test-with-testcontainers branch July 22, 2025 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants