Conversation
Currently if we have an error during message processing, we will try again and again indefinitely. Try to keep things in batch while isolating errors.
|
@copilot can you write tests for this PR ? We should test a stream with 5 messages where on message is breaking at insertion time in the database. |
|
@alexgarel I've opened a new pull request, #263, to work on those changes. Once the pull request is ready, I'll request review from you. |
0a6c7d8 to
0ca7af4
Compare
|
the #263 brings some fixes, so do not merge it yet. |
…insertion (#263) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Alex Garel <alex@openfoodfacts.org>
There was a problem hiding this comment.
Pull request overview
This PR improves redis_listener error handling so that message-processing failures don’t cause the listener to retry the same failing batch indefinitely, by splitting batches to isolate problematic messages and scheduling per-message retries.
Changes:
- Refactors retry/backoff logic (introduces
redis_error_count, parameterizedget_retry_interval). - Adds batch-splitting and per-message retry scheduling via
items_to_retry. - Extends test coverage to verify batch splitting and updates existing retry-related tests.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
query/events.py |
Adds per-message retry queue + batch splitting logic in redis_listener, refactors retry/backoff handling. |
query/events_test.py |
Adds a new test for batch splitting on insertion failure and updates retry/backoff tests for the new counters/signatures. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
query/events_test.py
Outdated
| len(call.args[1][0][1]) for call in messages_received.call_args_list | ||
| ] | ||
| assert 5 in stream_message_counts | ||
| # we have one call that isolate ou problematic message |
There was a problem hiding this comment.
Typo in comment: "ou" → "our".
| # we have one call that isolate ou problematic message | |
| # we have one call that isolate our problematic message |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
Hi @alexgarel . I get the idea here but I think we might need to move the logic inside the import_with_filter function somehow so that it also copes with scheduled updates (that run overnight). I'm not sure how we would do that though so maybe what you have proposed is better than nothing... |
Currently if we have an error during message processing, we will try again and again indefinitely.
Try to keep things in batch while isolating errors.
see also https://openfoodfacts.slack.com/archives/C02LDQDDD/p1772184346701689