
Conversation


@the-asmit the-asmit commented Jan 7, 2026

📝 Description

Currently, network and backend errors during repository indexing are returned as generic errors, preventing agents from distinguishing retryable failures from permanent ones.

This PR improves error signalling during repository indexing by making retryability explicit, while keeping the existing indexing flow unchanged.

🔧 Changes Made

  • Distinguished retryable and non-retryable failures during repository indexing
  • Added explicit error_code values for common failure scenarios
  • Included a retryable flag in error responses to enable agent-side decision-making
  • Preserved existing indexing logic and database updates
  • Fixed a pre-existing bug where the response body could be consumed twice when parsing JSON
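As an illustrative sketch (not code from this PR; names are hypothetical), an agent consuming these responses can branch on the `retryable` flag:

```python
def should_retry(result: dict) -> bool:
    """Retry only failures the service explicitly marks retryable."""
    return result.get("status") == "error" and result.get("retryable", False)

# Example responses shaped like the ones described in this PR:
rate_limited = {"status": "error", "error_code": "RATE_LIMITED", "retryable": True}
bad_repo = {"status": "error", "error_code": "INVALID_REPOSITORY", "retryable": False}

assert should_retry(rate_limited)   # transient: back off and retry
assert not should_retry(bad_repo)   # permanent: surface to the user
```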

📷 Screenshots or Visual Changes (if applicable)

N/A — backend-only changes.

✅ Checklist

  • I have read the contributing guidelines.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added necessary documentation (if applicable).
  • Any dependent changes have been merged and published in downstream modules.

Summary by CodeRabbit

  • Bug Fixes
    • More consistent, structured error reporting during repository indexing with explicit error codes and retryable flags.
    • Better handling of rate limits and backend failures (recoverable cases marked retryable).
    • Resilient JSON parsing with safe fallback to avoid crashes on malformed payloads.
    • Clearer distinction between recoverable and non-recoverable indexing failures and improved, truncated error messages stored for diagnostics.



coderabbitai bot commented Jan 7, 2026

📝 Walkthrough

Walkthrough

Repo indexing now returns consistent, structured failure responses: HTTP 429, 5xx, other non-200 responses, ValueError, aiohttp.ClientError, and unexpected exceptions are mapped to explicit error_code and retryable flags; error text is truncated and stored in the repo record.

Changes

Cohort: Error handling updates
File(s): backend/app/services/codegraph/repo_service.py
Summary: Added resilient JSON parsing on 200; added handling for 429 (RATE_LIMITED, retryable), 5xx (BACKEND_ERROR, retryable), and other non-200 responses (INVALID_REQUEST, non-retryable); mapped ValueError → INVALID_REPOSITORY (non-retryable), aiohttp.ClientError → BACKEND_UNAVAILABLE (retryable), and generic exceptions → INDEXING_FAILED (non-retryable). Error text is truncated to 500 characters and saved to the DB.
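The status mapping above can be sketched as a small pure function (illustrative only; the actual logic lives inline in repo_service.py's response handling):

```python
def classify_http_failure(status: int) -> tuple[str, bool]:
    """Map a non-200 HTTP status to (error_code, retryable)."""
    if status == 429:
        return ("RATE_LIMITED", True)    # rate limited: retry after backoff
    if 500 <= status < 600:
        return ("BACKEND_ERROR", True)   # server fault: transient
    return ("INVALID_REQUEST", False)    # other non-200: caller error

print(classify_http_failure(429))   # → ('RATE_LIMITED', True)
print(classify_http_failure(503))   # → ('BACKEND_ERROR', True)
print(classify_http_failure(404))   # → ('INVALID_REQUEST', False)
```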

Sequence Diagram(s)

(omitted — changes are focused on error mapping/handling within a single service function and do not introduce a multi-component sequential flow that requires visualization.)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇
I hopped through bytes and named each fall,
Gave every error a code and call,
Truncated tales tucked in a nest,
Retry flags set for those that rest,
A quiet burrow for logs, blessed. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check (✅ Passed): The title 'Differentiate retryable and non-retryable indexing errors' accurately summarizes the main change: adding an explicit error_code and retryable flag to distinguish retryable from non-retryable failures during repository indexing.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00%, above the required threshold of 80.00%.
  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/services/codegraph/repo_service.py (1)

153-169: Consider handling 429 (rate limiting) as retryable.

The current logic treats all non-5xx failures as non-retryable, but HTTP 429 (Too Many Requests) indicates rate limiting and is inherently retryable after a backoff period. Classifying it as INVALID_REQUEST with retryable: false could cause permanent failures for transient conditions.

🐛 Suggested fix
+                    elif response.status == 429:
+                        error_text = await response.text()
+                        error_msg = error_text[:500]
+
+                        await self.supabase.table("indexed_repositories").update({
+                            "indexing_status": "failed",
+                            "last_error": error_msg,
+                            "updated_at": datetime.now().isoformat()
+                        }).eq("repository_full_name", repo_info['full_name']).eq(
+                            "is_deleted", False
+                        ).execute()
+
+                        return {
+                            "status": "error",
+                            "error_code": "RATE_LIMITED",
+                            "retryable": True,
+                            "message": "Rate limit exceeded. Please try again later."
+                        }
                     else:
                         error_text = await response.text()
🧹 Nitpick comments (2)
backend/app/services/codegraph/repo_service.py (2)

140-145: Missing updated_at field in database update.

The update at line 83 includes updated_at when changing status to "pending", but this update (and the one at lines 157-162) omits it when marking as "failed". This inconsistency causes stale timestamps.

♻️ Suggested fix
                         await self.supabase.table("indexed_repositories").update({
                             "indexing_status": "failed",
-                            "last_error": error_msg
+                            "last_error": error_msg,
+                            "updated_at": datetime.now().isoformat()
                         }).eq("repository_full_name", repo_info['full_name']).eq(
                             "is_deleted", False
                         ).execute()

Apply the same fix to lines 157-162.


177-184: Correct retryable classification; minor logging cleanup suggested.

Network errors are correctly marked as retryable. Per static analysis (Ruff TRY401): logging.exception() already captures the exception, so including {e} in the message is redundant.

♻️ Optional cleanup
         except aiohttp.ClientError as e:
-            logger.exception(f"Network error indexing {repo_input}: {e}")
+            logger.exception(f"Network error indexing {repo_input}")
             return {
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa66519 and ff7320f.

📒 Files selected for processing (1)
  • backend/app/services/codegraph/repo_service.py
🧰 Additional context used
🪛 Ruff (0.14.10)
backend/app/services/codegraph/repo_service.py

178-178: Redundant exception object included in logging.exception call

(TRY401)

🔇 Additional comments (2)
backend/app/services/codegraph/repo_service.py (2)

170-176: LGTM!

Correct classification—invalid repository format errors are user input issues and appropriately marked as non-retryable.


185-192: LGTM!

Defaulting unknown exceptions to non-retryable is a safe, conservative choice that prevents infinite retry loops for persistent failures.

@the-asmit (Author)

Yeah, you're right. 429 should indeed be treated as a retryable rate-limiting scenario.
I've added explicit handling for HTTP 429 with a retryable error signal.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0


⚠️ Outside diff range comments (1)
backend/app/services/codegraph/repo_service.py (1)

117-117: Critical bug: Response body consumed twice.

The ternary expression first calls await response.text() (which consumes the response body), then conditionally calls await response.json() (which requires the body to still be available). In aiohttp, a response body can only be read once, so this would fail when trying to parse JSON.

🐛 Proposed fix
-                    data = await response.json() if await response.text() else {}
+                    text = await response.text()
+                    data = await response.json() if text else {}

Or better yet, since you're expecting JSON:

-                    data = await response.json() if await response.text() else {}
+                    try:
+                        data = await response.json()
+                    except (aiohttp.ContentTypeError, ValueError):
+                        data = {}
🧹 Nitpick comments (2)
backend/app/services/codegraph/repo_service.py (2)

194-201: LGTM with optional logging improvement.

The ClientError handling correctly marks network failures as retryable and provides appropriate user feedback.

♻️ Optional: Remove redundant exception from log message

The logger.exception() method automatically includes exception details, so explicitly including {e} in the message is redundant (as flagged by static analysis).

-            logger.exception(f"Network error indexing {repo_input}: {e}")
+            logger.exception(f"Network error indexing {repo_input}")

136-169: Optional: Consider reducing duplication between error handlers.

The 429 and 5xx error handling blocks follow an identical pattern (extract error text, update DB, return structured response). While the current implementation is clear and explicit, you could optionally reduce duplication with a helper method.

♻️ Optional refactoring approach
async def _handle_indexing_error(
    self, 
    response, 
    repo_full_name: str, 
    error_code: str, 
    retryable: bool, 
    message: str
) -> Dict[str, Any]:
    """Helper to handle indexing errors consistently."""
    error_text = await response.text()
    error_msg = error_text[:500]
    
    await self.supabase.table("indexed_repositories").update({
        "indexing_status": "failed",
        "last_error": error_msg
    }).eq("repository_full_name", repo_full_name).eq(
        "is_deleted", False
    ).execute()
    
    return {
        "status": "error",
        "error_code": error_code,
        "retryable": retryable,
        "message": message
    }

Then use it like:

elif response.status == 429:
    return await self._handle_indexing_error(
        response, repo_info['full_name'],
        "RATE_LIMITED", True,
        "Rate limit exceeded. Please try again later."
    )
elif 500 <= response.status < 600:
    return await self._handle_indexing_error(
        response, repo_info['full_name'],
        "BACKEND_ERROR", True,
        "Indexing service encountered an internal error. Please try again later."
    )
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between ff7320f and 9893bd8.

📒 Files selected for processing (1)
  • backend/app/services/codegraph/repo_service.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/services/codegraph/repo_service.py (2)
frontend/src/lib/supabaseClient.ts (1)
  • supabase (12-12)
backend/app/agents/devrel/github/github_toolkit.py (1)
  • execute (101-141)
🪛 Ruff (0.14.10)
backend/app/services/codegraph/repo_service.py

195-195: Redundant exception object included in logging.exception call

(TRY401)

🔇 Additional comments (5)
backend/app/services/codegraph/repo_service.py (5)

136-152: LGTM! Proper handling for rate limits.

The 429 handling correctly marks the error as retryable, truncates the error message for storage, and provides a clear user-facing message. The database status update ensures failed attempts are tracked.


153-169: LGTM! Appropriate handling for server errors.

The 5xx handling correctly marks server errors as retryable and provides appropriate feedback. The implementation is consistent with the rate limit handling.


170-186: LGTM! Improved handling for invalid requests.

The error response now correctly extracts and stores the error text, and provides a structured response with retryable=False, which is appropriate for client errors (4xx status codes).


187-193: LGTM! Clear handling for invalid repository formats.

The ValueError handling correctly marks parsing failures as non-retryable and includes the specific error message to help users correct their input.


202-209: LGTM! Safe fallback for unexpected errors.

The general exception handler provides a safe default by marking unexpected failures as non-retryable, which is appropriate since these errors would typically require investigation rather than automatic retry.

@the-asmit (Author)

Good catch — fixed the response body consumption by safely parsing JSON without reading the body twice.
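The safe-parse pattern can be sketched without aiohttp (names here are illustrative, not the PR's code):

```python
import json

def parse_json_safely(text: str) -> dict:
    """Fall back to {} on empty or malformed payloads instead of raising,
    mirroring the resilient parsing described above (sketch only)."""
    try:
        data = json.loads(text) if text else {}
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return {}
    return data if isinstance(data, dict) else {}
```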


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @backend/app/services/codegraph/repo_service.py:
- Around line 191-196: The ValueError handler that returns the error dict must
also mark the repository DB record out of "pending" so retries aren't blocked;
before returning in the except ValueError block, call the same repository status
update routine used earlier to set status to a terminal state (e.g., "failed" or
"error") using repo_info when available or falling back to repo_input (or
parsing owner/repo from repo_input) so the DB is updated even when parsing
fails, then return the error dict.
🧹 Nitpick comments (3)
backend/app/services/codegraph/repo_service.py (3)

156-172: LGTM: Correct handling of server errors.

Server errors are appropriately marked as retryable. The database update and error response structure are consistent with other error paths.

Minor: Remove trailing whitespace

Line 159 appears to have trailing whitespace.

-                        error_msg = error_text[:500]
-                    
+                        error_msg = error_text[:500]
+

198-198: Optional: Remove redundant exception object from logging.exception.

The logging.exception method automatically includes exception details, making {e} redundant in the log message.

♻️ Proposed refactor
-            logger.exception(f"Network error indexing {repo_input}: {e}")
+            logger.exception(f"Network error indexing {repo_input}")

Based on static analysis hints.


139-189: Optional: Extract repeated error text extraction.

The pattern of extracting and truncating error text is repeated three times. Consider extracting it to reduce duplication.

♻️ Proposed refactor

Add a helper method near the top of the class:

async def _extract_error_text(self, response: aiohttp.ClientResponse, max_length: int = 500) -> str:
    """Extract and truncate error text from response."""
    error_text = await response.text()
    return error_text[:max_length]

Then use it in the error handling blocks:

                     elif response.status == 429:
-                        error_text = await response.text()
-                        error_msg = error_text[:500]
+                        error_msg = await self._extract_error_text(response)

                     elif 500 <= response.status < 600:
-                        error_text = await response.text()
-                        error_msg = error_text[:500]
+                        error_msg = await self._extract_error_text(response)
                     
                     else:
-                        error_text = await response.text()
-                        error_msg = error_text[:500]
+                        error_msg = await self._extract_error_text(response)
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 9893bd8 and 0ef76da.

📒 Files selected for processing (1)
  • backend/app/services/codegraph/repo_service.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/services/codegraph/repo_service.py (2)
frontend/src/lib/supabaseClient.ts (1)
  • supabase (12-12)
backend/app/agents/devrel/github/github_toolkit.py (1)
  • execute (101-141)
🪛 Ruff (0.14.10)
backend/app/services/codegraph/repo_service.py

198-198: Redundant exception object included in logging.exception call

(TRY401)

🔇 Additional comments (3)
backend/app/services/codegraph/repo_service.py (3)

117-120: LGTM: Resilient JSON parsing.

The fallback to an empty dict when JSON parsing fails is appropriate and allows the indexing to complete successfully even if the backend returns malformed JSON.


139-155: LGTM: Appropriate rate limiting handling.

The 429 response is correctly marked as retryable, and the database is properly updated with the failed status and error message.


173-189: LGTM: Appropriate handling of client errors.

Non-retryable client errors (4xx) are correctly distinguished from retryable server errors. The database update and error response are consistent.

Comment on lines +191 to +196
return {
"status": "error",
"error_code": "INVALID_REPOSITORY",
"retryable": False,
"message": str(e)
}

⚠️ Potential issue | 🔴 Critical

Critical: Database status not updated on ValueError.

If a database record was created or updated to "pending" status (lines 80-94) before the ValueError occurs, it will remain in "pending" state permanently. This breaks retry functionality because lines 71-75 will block subsequent attempts with "indexing in progress."

🐛 Proposed fix: Update database before returning
         except ValueError as e:
+            # Update database if record exists
+            try:
+                await self.supabase.table("indexed_repositories").update({
+                    "indexing_status": "failed",
+                    "last_error": str(e)[:500]
+                }).eq("repository_full_name", repo_info.get('full_name', repo_input)).eq(
+                    "is_deleted", False
+                ).execute()
+            except Exception:
+                logger.exception("Failed to update database after ValueError")
+            
             return {
                 "status": "error",
                 "error_code": "INVALID_REPOSITORY",
                 "retryable": False,
                 "message": str(e)
             }

Note: repo_info might not be available if parsing failed, so use repo_input as fallback or extract owner/repo before the try-except if possible.

Committable suggestion skipped: line range outside the PR's diff.


Comment on lines +199 to +204
return {
"status": "error",
"error_code": "BACKEND_UNAVAILABLE",
"retryable": True,
"message": "Indexing service is currently unavailable. Please try again later."
}

⚠️ Potential issue | 🔴 Critical

Critical: Database status not updated on ClientError.

Similar to the ValueError case, if a database record was created or updated to "pending" status (lines 80-94) before the ClientError occurs, it will remain in "pending" state permanently, blocking retry attempts.

🐛 Proposed fix: Update database before returning
         except aiohttp.ClientError as e:
             logger.exception(f"Network error indexing {repo_input}: {e}")
+            
+            # Update database if record exists
+            try:
+                error_msg = f"Network error: {str(e)}"[:500]
+                await self.supabase.table("indexed_repositories").update({
+                    "indexing_status": "failed",
+                    "last_error": error_msg
+                }).eq("repository_full_name", repo_info.get('full_name', repo_input)).eq(
+                    "is_deleted", False
+                ).execute()
+            except Exception:
+                logger.exception("Failed to update database after ClientError")
+            
             return {
                 "status": "error",
                 "error_code": "BACKEND_UNAVAILABLE",
                 "retryable": True,
                 "message": "Indexing service is currently unavailable. Please try again later."
             }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +207 to +212
return {
"status": "error",
"error_code": "INDEXING_FAILED",
"retryable": False,
"message": "Indexing failed due to an unexpected error."
}

⚠️ Potential issue | 🔴 Critical

Critical: Database status not updated on general Exception.

If a database record was created or updated to "pending" status (lines 80-94) before an unexpected exception occurs, it will remain in "pending" state permanently, blocking retry attempts.

🐛 Proposed fix: Update database before returning
         except Exception:
             logger.exception(f"Failed to index {repo_input}")
+            
+            # Update database if record exists
+            try:
+                await self.supabase.table("indexed_repositories").update({
+                    "indexing_status": "failed",
+                    "last_error": "Indexing failed due to an unexpected error."
+                }).eq("repository_full_name", repo_info.get('full_name', repo_input)).eq(
+                    "is_deleted", False
+                ).execute()
+            except Exception:
+                logger.exception("Failed to update database after unexpected exception")
+            
             return {
                 "status": "error",
                 "error_code": "INDEXING_FAILED",
                 "retryable": False,
                 "message": "Indexing failed due to an unexpected error."
             }

Committable suggestion skipped: line range outside the PR's diff.

@the-asmit (Author)

Good catch — this is a valid issue.

The pending-state inconsistency on exceptions pre-existed and affects broader state-management logic in index_repo. Since this PR is focused on clarifying retryable vs non-retryable error signaling, I’d prefer to address this in a follow-up PR (or tracked issue) to keep the scope contained and reviewable.

Happy to work on that separately.

@the-asmit (Author) commented Jan 7, 2026

Hey @Aditya30ag @smokeyScraper
This PR focuses on clarifying retryable vs permanent failure modes during repository indexing (including explicit handling for HTTP 429).

During review, a pre-existing critical bug was also surfaced, which I fixed: the response body was being consumed twice (response.text() followed by response.json()), which can break JSON parsing in aiohttp. That fix is now included.

CodeRabbit also flagged a broader issue where the DB state may remain pending on certain exceptions. This behavior pre-existed and affects core state-management logic, so I’ve intentionally not expanded the scope here. Happy to address that in a follow-up PR.

The changes are ready for review.

@Aditya30ag (Contributor)

@the-asmit can you please address the CodeRabbit AI suggestions?

@the-asmit (Author)

Hey @Aditya30ag, thanks for the suggestion — I understand the concern.
This PR currently addresses:

  • Clear differentiation between retryable and permanent indexing failures (including explicit handling for HTTP 429), and
  • Fixes a pre-existing bug where the response body could be consumed twice when parsing JSON in aiohttp.

The remaining CodeRabbit suggestion regarding updating DB state on all exception paths affects broader indexing state-management logic (e.g. handling of pending → failed transitions) and was present prior to this PR. Addressing it would expand the scope significantly and deserves a focused change of its own.

I’m happy to take that up in a follow-up PR.
This PR can be merged for now; in the meantime, I'll address the issue CodeRabbit flagged in a separate change.
