Conversation

@LeoPatOZ
Collaborator

@LeoPatOZ LeoPatOZ commented Sep 10, 2025

  • Would be nice to have some 'setup_test' helper functions

@LeoPatOZ LeoPatOZ changed the title Basic Integration Tests Basic Integration Tests For Live Processing Sep 10, 2025
@LeoPatOZ LeoPatOZ requested a review from 0xNeshi September 11, 2025 07:42
Collaborator

@0xNeshi 0xNeshi left a comment

Good job!

}

#[async_trait]
impl EventCallback for EventCounter {
Collaborator

I actually love the callback approach you took with EventCallback - instead of storing the callback itself, you store a callback struct type that implements this callback trait 👍
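
For readers who have not opened the diff, the pattern being praised can be sketched roughly as follows. This is a simplified, synchronous, std-only stand-in: the PR's real trait is async (via `#[async_trait]`), and the method name `on_event`, the `Dispatcher` type, and the `count_three_events` helper are assumptions made up for illustration, not the crate's actual API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical, synchronous stand-in for the PR's async `EventCallback` trait.
trait EventCallback: Send + Sync {
    fn on_event(&self, event: &str) -> Result<(), String>;
}

/// Callback struct that only counts the events it receives.
struct EventCounter {
    count: Arc<AtomicUsize>,
}

impl EventCallback for EventCounter {
    fn on_event(&self, _event: &str) -> Result<(), String> {
        self.count.fetch_add(1, Ordering::SeqCst);
        Ok(())
    }
}

/// Hypothetical dispatcher that owns the callback as a trait object
/// instead of storing a bare closure.
struct Dispatcher {
    callback: Arc<dyn EventCallback>,
}

impl Dispatcher {
    /// Passes every event to the stored callback, stopping at the first error.
    fn dispatch_all(&self, events: &[&str]) -> Result<(), String> {
        for event in events {
            self.callback.on_event(event)?;
        }
        Ok(())
    }
}

/// Feeds three events through the dispatcher and returns the final count.
fn count_three_events() -> usize {
    let count = Arc::new(AtomicUsize::new(0));
    let dispatcher = Dispatcher {
        callback: Arc::new(EventCounter { count: Arc::clone(&count) }),
    };
    dispatcher
        .dispatch_all(&["a", "b", "c"])
        .expect("the counter callback never fails");
    count.load(Ordering::SeqCst)
}
```

Because the test keeps its own `Arc` to the counter, it can read the count after the callback struct has been handed to the dispatcher.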

Collaborator Author

Yes, I thought it was simpler. Glad you like it 😄

}

#[tokio::test]
async fn test_live_scanning_with_slow_processor() -> anyhow::Result<()> {
Collaborator

Just found out that there's a race condition in this test; it fails intermittently.

Collaborator

@0xNeshi 0xNeshi Sep 11, 2025

The race condition seems to appear when aborting the scanner thread. Because it is slow to process the pending events (events are emitted every ~50ms, while the scanner "runs" every ~100ms), occasionally it doesn't manage to handle the last event before being aborted, causing the assertion below it to fail (in my case it failed because the actual number of processed events was 2, not 3). It could easily happen that the actual number of processed events is just 1.

The solution could be one of the following:

  1. Add an additional sleep(200) to allow the scanner thread to "catch up". <- 150ms (the difference in delay) + 50ms buffer just in case; still could theoretically have a race condition in the right circumstances, but much less likely.
  2. Take into account that the last event might not be processed, so update the assertion to state something like assert!((1..=3).contains(&processed.load(Ordering::SeqCst))); <- this makes the test an "approximation" of valid behavior, and a poor approximation at that.
  3. Do you have any alternative ideas?

Wdyt?
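
One possible answer to option 3, sketched under assumptions: instead of sleeping for a guessed duration, the test could poll the counter until it reaches the expected value or a generous timeout expires. The example below uses plain std threads rather than the tokio task from the PR, and `wait_for_count` and `run_slow_processor` are hypothetical helpers that do not exist in the codebase.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Polls `counter` until it reaches `expected` or `timeout` elapses.
/// Returns the last value observed, so the caller can assert on it.
fn wait_for_count(counter: &AtomicUsize, expected: usize, timeout: Duration) -> usize {
    let deadline = Instant::now() + timeout;
    loop {
        let seen = counter.load(Ordering::SeqCst);
        if seen >= expected || Instant::now() >= deadline {
            return seen;
        }
        thread::sleep(Duration::from_millis(5));
    }
}

/// Simulates a slow processor: three events, each taking ~100ms to handle.
fn run_slow_processor() -> usize {
    let processed = Arc::new(AtomicUsize::new(0));
    let worker_counter = Arc::clone(&processed);
    let worker = thread::spawn(move || {
        for _ in 0..3 {
            thread::sleep(Duration::from_millis(100));
            worker_counter.fetch_add(1, Ordering::SeqCst);
        }
    });

    // Wait on the observable result, bounded by a timeout, rather than
    // sleeping for a fixed guess before stopping the worker.
    let seen = wait_for_count(&processed, 3, Duration::from_secs(5));
    worker.join().expect("worker thread panicked");
    seen
}
```

The timeout only bounds the failure case. On the happy path the helper returns as soon as the third event is counted, so the test is not slowed down by a large safety margin.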

Collaborator Author

I think for now I will add a sleep(200).
Thanks for looking into this. Perhaps once we have channels we can expose a method to check whether the channels are empty? I'm not sure if this is relevant or needed, but my reasoning is that if a process wants to stop the scanner for any reason, it must first ensure that any callbacks have finished executing.

Collaborator

We'll try to make the channels testable.

I think we should ensure the scanner does its job, but I don't think we can make assumptions about what happens when someone intentionally kills the whole process.

Collaborator Author

I was thinking of something more like a pulse check:

let pending = scanner.has_pending_callbacks();

or something like:

scanner.safe_abort();

Collaborator

What would the test assertions look like for each?
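
To make the question concrete, here is one way the assertions might look for each proposal. Everything in it is hypothetical: `ToyScanner` is a std-only toy built on a thread and an `mpsc` channel, not the real scanner, and `has_pending_callbacks` and `safe_abort` are only the names floated above, with semantics assumed for this sketch (a pending counter, and an abort that drains the queue before returning).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Toy scanner for illustration only; NOT the scanner type from this PR.
struct ToyScanner {
    tx: Sender<u64>,
    pending: Arc<AtomicUsize>,
    processed: Arc<AtomicUsize>,
    worker: JoinHandle<()>,
}

impl ToyScanner {
    /// Starts a worker that spends `delay` on each event it receives.
    fn start(delay: Duration) -> Self {
        let (tx, rx) = mpsc::channel::<u64>();
        let pending = Arc::new(AtomicUsize::new(0));
        let processed = Arc::new(AtomicUsize::new(0));
        let (p, d) = (Arc::clone(&pending), Arc::clone(&processed));
        let worker = thread::spawn(move || {
            // The loop ends once the sender is dropped and the queue is empty.
            for _event in rx {
                thread::sleep(delay); // simulate a slow callback
                d.fetch_add(1, Ordering::SeqCst);
                p.fetch_sub(1, Ordering::SeqCst);
            }
        });
        Self { tx, pending, processed, worker }
    }

    /// Queues one event and marks it as pending.
    fn emit(&self, event: u64) {
        self.pending.fetch_add(1, Ordering::SeqCst);
        self.tx.send(event).expect("worker is alive");
    }

    /// Pulse check: true while any emitted event has not finished its callback.
    fn has_pending_callbacks(&self) -> bool {
        self.pending.load(Ordering::SeqCst) > 0
    }

    /// Stops accepting events, lets the worker drain the queue, and
    /// returns how many events were processed in total.
    fn safe_abort(self) -> usize {
        let ToyScanner { tx, processed, worker, .. } = self;
        drop(tx);
        worker.join().expect("worker thread panicked");
        processed.load(Ordering::SeqCst)
    }
}

/// Variant 1: poll the pulse check, then stop the scanner.
fn pulse_check_demo() -> (bool, usize) {
    let scanner = ToyScanner::start(Duration::from_millis(20));
    for event in 0..3 {
        scanner.emit(event);
    }
    while scanner.has_pending_callbacks() {
        thread::sleep(Duration::from_millis(5));
    }
    let still_pending = scanner.has_pending_callbacks();
    (still_pending, scanner.safe_abort())
}

/// Variant 2: rely on `safe_abort` alone to drain before stopping.
fn safe_abort_demo() -> usize {
    let scanner = ToyScanner::start(Duration::from_millis(20));
    for event in 0..3 {
        scanner.emit(event);
    }
    scanner.safe_abort()
}
```

With the pulse check, the test waits until `has_pending_callbacks()` returns false and then asserts that the processed count is exactly 3. With `safe_abort`, the test asserts the exact count immediately after the call returns. A real test would also bound the polling loop with a timeout so a stuck scanner fails the test instead of hanging it.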

}

#[tokio::test]
async fn test_live_scanning_basic() -> anyhow::Result<()> {
Collaborator

@0xNeshi 0xNeshi Sep 11, 2025

Red alert: I just had this test fail randomly. It seems to have a race condition too.

My guess is that the scanner_handle.abort() call is again to blame.
There's probably a race condition in test_live_scanning_multiple_events too.

Collaborator

I feel like there must be a more reliable way to write these tests to avoid this problem, but I'd leave that for when we're writing tests for channels.

Base automatically changed from unit-tests to main September 12, 2025 07:05
Collaborator

@0xNeshi 0xNeshi left a comment

approving in advance

@0xNeshi 0xNeshi merged commit ccc6dd0 into main Sep 12, 2025
6 checks passed
@0xNeshi 0xNeshi deleted the integration-tests branch September 12, 2025 07:41