@eyw520
Member

@eyw520 eyw520 commented Oct 31, 2025

No description provided.

@vercel
Contributor

vercel bot commented Oct 31, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| dev.ferndocs.com | Ready | Preview | Nov 3, 2025 3:25pm |
| fern-dashboard | Ready | Preview | Nov 3, 2025 3:25pm |
| fern-dashboard-dev | Ready | Preview | Nov 3, 2025 3:25pm |
| ferndocs.com | Ready | Preview | Nov 3, 2025 3:25pm |
| preview.ferndocs.com | Ready | Preview | Nov 3, 2025 3:25pm |
| prod-assets.ferndocs.com | Ready | Preview | Nov 3, 2025 3:25pm |
| prod.ferndocs.com | Ready | Preview | Nov 3, 2025 3:25pm |

1 Skipped Deployment

| Project | Deployment | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| fern-platform | Ignored | | Nov 3, 2025 3:25pm |

Comment on lines 69 to 78
def has_interrupt_message(self) -> tuple[bool, str | None]:
    messages = self.receive_messages(max_messages=10)

    for msg in messages:
        body = msg["body"]
        if body.get("type") == "INTERRUPT":
            LOGGER.info(f"Found INTERRUPT message in queue")
            return True, msg["receipt_handle"]

    return False, None
Contributor

@vercel vercel bot Oct 31, 2025

The has_interrupt_message() method receives up to 10 messages per call but never deletes the non-INTERRUPT ones, so they stay in the queue, accumulate, and are delivered again on later polls.

📝 Patch Details
diff --git a/servers/fai-lambda/fai-scribe/src/utils/sqs_client.py b/servers/fai-lambda/fai-scribe/src/utils/sqs_client.py
index 7df5e881d..f5703da45 100644
--- a/servers/fai-lambda/fai-scribe/src/utils/sqs_client.py
+++ b/servers/fai-lambda/fai-scribe/src/utils/sqs_client.py
@@ -68,14 +68,21 @@ class SQSClient:
 
     def has_interrupt_message(self) -> tuple[bool, str | None]:
         messages = self.receive_messages(max_messages=10)
+        interrupt_receipt = None
 
         for msg in messages:
             body = msg["body"]
+            receipt_handle = msg["receipt_handle"]
+            
             if body.get("type") == "INTERRUPT":
                 LOGGER.info(f"Found INTERRUPT message in queue")
-                return True, msg["receipt_handle"]
+                interrupt_receipt = receipt_handle
+            else:
+                # Delete non-INTERRUPT messages to prevent queue accumulation
+                self.delete_message(receipt_handle)
+                LOGGER.debug(f"Deleted processed {body.get('type', 'UNKNOWN')} message")
 
-        return False, None
+        return interrupt_receipt is not None, interrupt_receipt
 
     def get_resume_messages(self) -> list[dict[str, Any]]:
         messages = self.receive_messages(max_messages=10)
@@ -83,8 +90,14 @@ class SQSClient:
 
         for msg in messages:
             body = msg["body"]
+            receipt_handle = msg["receipt_handle"]
+            
             if body.get("type") == "RESUME":
-                resume_messages.append({"body": body, "receipt_handle": msg["receipt_handle"]})
+                resume_messages.append({"body": body, "receipt_handle": receipt_handle})
+            else:
+                # Delete non-RESUME messages to prevent queue accumulation
+                self.delete_message(receipt_handle)
+                LOGGER.debug(f"Deleted processed {body.get('type', 'UNKNOWN')} message")
 
         if resume_messages:
             LOGGER.info(f"Found {len(resume_messages)} RESUME messages in queue")

Analysis

SQS message accumulation in has_interrupt_message() and get_resume_messages()

What fails: SQSClient.has_interrupt_message() and SQSClient.get_resume_messages() receive messages of every type but return receipt handles only for the type they look for (INTERRUPT or RESUME). No other message can ever be deleted, so the rest accumulate in the queue.

How to reproduce:

# Put a mix of message types on the SQS queue, then poll repeatedly.
client = SQSClient(queue_url)  # queue_url: URL of the queue under test
client.has_interrupt_message()  # Receives up to 10 messages; no non-INTERRUPT message is deleted
client.has_interrupt_message()  # Receives the same non-INTERRUPT messages again once their visibility timeout has expired

Result: Non-INTERRUPT messages (RESUME and any other type) remain in the queue after each poll and are received again on later calls once their visibility timeout expires, causing message accumulation and duplicate processing.
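The redelivery described above can be reproduced without AWS. The following is a minimal sketch, assuming a hypothetical in-memory `FakeQueue` with a simulated clock in place of SQS; the free function `has_interrupt_message` mirrors the pre-patch control flow but is not the repository's `SQSClient` method.

```python
from __future__ import annotations

import itertools


class FakeQueue:
    """Hypothetical in-memory stand-in for an SQS queue (illustration only)."""

    def __init__(self, visibility_timeout: float = 30.0):
        self.visibility_timeout = visibility_timeout
        self.now = 0.0  # simulated clock, in seconds
        self._messages: dict[int, list] = {}  # receipt handle -> [body, visible_at]
        self._ids = itertools.count()

    def send(self, body: dict) -> None:
        self._messages[next(self._ids)] = [body, 0.0]

    def receive(self, max_messages: int = 10) -> list[dict]:
        out = []
        for handle, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= self.now and len(out) < max_messages:
                # A received message is hidden until the visibility timeout elapses.
                entry[1] = self.now + self.visibility_timeout
                out.append({"body": body, "receipt_handle": handle})
        return out

    def delete(self, receipt_handle: int) -> None:
        self._messages.pop(receipt_handle, None)


def has_interrupt_message(queue: FakeQueue) -> tuple[bool, int | None]:
    """Mirrors the pre-patch logic: nothing is deleted here."""
    for msg in queue.receive(max_messages=10):
        if msg["body"].get("type") == "INTERRUPT":
            return True, msg["receipt_handle"]
    return False, None


queue = FakeQueue(visibility_timeout=30.0)
queue.send({"type": "RESUME"})
queue.send({"type": "OTHER"})

first = has_interrupt_message(queue)  # receives both messages, deletes neither
in_flight = queue.receive()           # both are still hidden, so nothing comes back
queue.now += 31.0                     # the visibility timeout elapses
redelivered = queue.receive()         # both messages are delivered again
```

In this sketch an immediate second poll sees nothing, because the messages are still in flight; the duplicates appear only after the timeout has passed.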

Expected: Every message a method receives should be deleted from the queue once it has been handled, so that it is not redelivered after the SQS visibility timeout expires.

Root cause: Both methods call receive_messages() but leave messages of other types undeleted, which goes against the usual SQS practice of deleting each message after processing.
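The effect of the proposed patch can be checked in isolation. This is a rough sketch, assuming a hypothetical `StubClient` that exposes only `receive_messages()` and `delete_message()` and records which receipt handles were deleted; the free function follows the patched control flow from the diff but is not the repository's code.

```python
from __future__ import annotations


class StubClient:
    """Hypothetical stand-in exposing the two calls the patch relies on."""

    def __init__(self, messages: list[dict]):
        self._messages = messages
        self.deleted: list[str] = []  # receipt handles passed to delete_message()

    def receive_messages(self, max_messages: int = 10) -> list[dict]:
        return self._messages[:max_messages]

    def delete_message(self, receipt_handle: str) -> None:
        self.deleted.append(receipt_handle)


def has_interrupt_message(client: StubClient) -> tuple[bool, str | None]:
    """Same control flow as the patched method, written as a free function."""
    interrupt_receipt = None
    for msg in client.receive_messages(max_messages=10):
        receipt_handle = msg["receipt_handle"]
        if msg["body"].get("type") == "INTERRUPT":
            interrupt_receipt = receipt_handle
        else:
            # Non-INTERRUPT messages are deleted so they are not redelivered.
            client.delete_message(receipt_handle)
    return interrupt_receipt is not None, interrupt_receipt


client = StubClient([
    {"body": {"type": "RESUME"}, "receipt_handle": "rh-1"},
    {"body": {"type": "INTERRUPT"}, "receipt_handle": "rh-2"},
    {"body": {"type": "OTHER"}, "receipt_handle": "rh-3"},
])
found, receipt = has_interrupt_message(client)
```

One consequence worth weighing: with this control flow, a RESUME message received by has_interrupt_message() is deleted before get_resume_messages() can see it, if both methods poll the same queue. Whether that is acceptable depends on calling code that is not shown in this conversation.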

@eyw520 eyw520 merged commit 608a4d4 into app Nov 3, 2025
23 checks passed
@eyw520 eyw520 deleted the eden/scribe-implement-sqs-polling branch November 3, 2025 15:29