Fix classification fallback bug and enhance all prompts (Issue #93) #191

LinKon12 · 2025-12-15T22:17:23Z

Closes #93

📝 Description

This PR fixes a critical bug where the bot responded to every message whenever a classification API call failed (rate limits, errors). The fallback path returned needs_devrel: True, so each failed classification triggered a reply and caused spam. The fallback now returns needs_devrel: False, so the bot ignores the message when classification fails.

Additionally, all prompts were enhanced per the secondary objective of issue #93: improved clarity, added examples, and stricter activation rules for the classification prompt.

🔧 Changes Made

- Fixed critical bug in classification_router.py: Changed fallback from needs_devrel: True to False, so the bot now safely ignores messages when classification fails
- Enhanced DEVREL_TRIAGE_PROMPT: Added strict activation rules (bot tag OR direct questions), 18 examples (vs 3), explicit ignore cases
- Improved EXTRACT_SEARCH_QUERY_PROMPT: Added 8 examples and comprehensive extraction guidelines (was only 3 lines)
- Enhanced REACT_SUPERVISOR_PROMPT: Added iteration limits (max 5), decision logic, and 5 reasoning examples
- Improved RESPONSE_PROMPT: Added 6 response type templates and quality checklist
- Enhanced CONVERSATION_SUMMARY_PROMPT: Added structured format with 4 sections and merge strategy

📷 Screenshots or Visual Changes (if applicable)

N/A - Backend logic changes only

🤝 Collaboration

Solo work

✅ Checklist

I have read the contributing guidelines.
I have added tests that prove my fix is effective or that my feature works.
I have added necessary documentation (if applicable).
Any dependent changes have been merged and published in downstream modules.

Summary by CodeRabbit

Improvements
- More structured, rule-driven assistant prompts with explicit decision steps, iteration limits, and clearer action choices.
- Expanded response formatting and content guidelines for clearer, actionable replies and quality checks.
- Improved search-query extraction and conversation summarization for more consistent, concise results.
- Stricter classification/triage defaults and fallback behavior to reduce false positives and better prioritize items.
- Improved error logging to include tracebacks for easier diagnostics.
Chores
- Pinned Discord library dependency to a specific version.


coderabbitai · 2025-12-15T22:17:34Z

Warning

Rate limit exceeded

@LinKon12 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 22 minutes and 53 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 77e64ab and b40cd7a.

📒 Files selected for processing (1)

backend/app/agents/devrel/prompts/response_prompt.py (2 hunks)

Walkthrough

This PR tightens DevRel activation rules (prefer ignoring non-relevant messages), updates classification defaults to not trigger DevRel, expands multiple DevRel prompts into structured, rule-based templates with iteration and formatting controls, pins discord-py to ==2.4.0, and changes Discord cog error logging to use exception tracebacks.

Changes

| Cohort / File(s) | Summary |
|---|---|
| DevRel prompt suite: `backend/app/agents/devrel/prompts/react_prompt.py`, `backend/app/agents/devrel/prompts/response_prompt.py`, `backend/app/agents/devrel/prompts/search_prompt.py`, `backend/app/agents/devrel/prompts/summarization_prompt.py` | Replaced concise prompts with expanded, structured templates: REACT adds Think→Act→Observe loop, iteration limits, TOOL RESULTS block, formalized actions and DECISION LOGIC; RESPONSE adds multi-section context placeholders, CONTENT/DISCORD formatting rules, response-type templates, and a quality checklist; SEARCH tightens extraction rules with examples and edge-case handling; SUMMARIZATION changes to a merge-aware, sectioned summary template with explicit fields and examples. |
| Classification logic: `backend/app/classification/prompt.py`, `backend/app/classification/classification_router.py` | Rewrote triage prompt to strict activation rules (respond only when mentioned or for direct project questions), added CRITICAL IGNORE list and new decision criteria; router now guards non-JSON LLM outputs, logs warnings, and defaults to `needs_devrel: False`, `priority: "low"` on fallback. |
| Discord startup & dependency: `backend/main.py`, `pyproject.toml` | Changed Discord cog error logging to use `logger.exception` (traceback) and pinned `discord-py` dependency from a range to `==2.4.0`. |
| Minor formatting: `backend/app/database/falkor/code-graph-backend/api/index.py` | Removed an extra blank line before Flask app instantiation; no behavioral change. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Classifier
  participant Agent
  participant Tools
  participant Responder

  User->>Classifier: Send message (chat / mention / question)
  Classifier->>Classifier: Apply STRICT ACTIVATION RULES
  alt Should respond (mention or project-specific)
    Classifier-->>Agent: needs_devrel:true, priority
    Agent->>Agent: Run REACT supervisor loop (Think → Act → Observe, up to 5 iterations)
    Agent->>Tools: Invoke selected tool (web_search / faq_handler / github_toolkit / onboarding)
    Tools-->>Agent: Tool Results (TOOL RESULTS FROM PREVIOUS ACTIONS)
    Agent->>Responder: Final task result / reasoning
    Responder-->>User: Formatted response (per RESPONSE_PROMPT rules)
  else Should ignore
    Classifier-->>Responder: needs_devrel:false, priority: low
    Responder-->>User: No action (or ignored)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas needing extra attention:
- Cross-file prompt consistency: placeholders and variable names referenced by callers (REACT ↔ RESPONSE ↔ SEARCH ↔ SUMMARIZATION).
- Classification router changes: JSON parsing guard, fallback defaults, and logged reasoning — ensure calling code handles new defaults.
- Tool/action definitions in REACT prompt (e.g., github_toolkit) — confirm tooling implementations and examples match runtime tool signatures.
- Dependency pinning to ==2.4.0 — verify CI and runtime compatibility.

Possibly related PRs

[feat]: implement github contributor recommendation tool #110 — Adds/expands github_toolkit and contributor-recommendation formatting; likely overlaps with REACT/RESPONSE prompt changes.
updated docs and added setup video guide #152 — Modifies discord-py dependency pinning; directly related to the pyproject.toml change.
[refactor]: migrate to cog-based command architecture #76 — Changes to CONVERSATION_SUMMARY_PROMPT and summarization logic; related to summarization_prompt.py edits.

Suggested reviewers

smokeyScraper
chandansgowda

Poem

🐰 I hopped through rules and prompts today,
I tuned each loop and showed the way.
Now I answer only when I'm called,
I search, I summarize, but mostly — stalled.
A tidy hop, a careful stay. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Out of Scope Changes check | ⚠️ Warning | Changes include unrelated modifications: logging enhancement in main.py, dependency pinning in pyproject.toml, and a whitespace-only change in index.py that are outside the scope of Issue #93. | Revert the logging change in main.py, dependency version pinning in pyproject.toml, and whitespace formatting in index.py to keep the PR focused on classification and prompt enhancements per Issue #93. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: fixing a classification fallback bug and enhancing prompts to address Issue #93. |
| Linked Issues check | ✅ Passed | The PR successfully implements the primary objective (strict classification activation rules via DEVREL_TRIAGE_PROMPT) and secondary objective (enhancement of multiple prompts) from Issue #93. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (3)

pyproject.toml (1)
29-29: Exact version pinning may be overly restrictive.

Pinning discord-py to ==2.4.0 prevents automatic pickup of patch releases (e.g., 2.4.1) that may contain security fixes. Consider using a compatible release constraint like ~=2.4.0 (equivalent to >=2.4.0,<2.5.0) unless there's a specific reason for exact pinning.
-    "discord-py (==2.4.0)",
+    "discord-py (>=2.4.0,<2.5.0)",
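To illustrate the difference between the exact pin and the suggested range, here is a minimal sketch that compares plain `X.Y.Z` version strings using only the standard library. The helper names are hypothetical and the parsing is deliberately simplified; it does not implement full PEP 440 (pre-releases, post-releases, epochs).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' version string into a tuple of ints (simplified)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_exact_pin(version: str, pinned: str = "2.4.0") -> bool:
    """Mimic '==2.4.0': only the pinned release is accepted."""
    return parse_version(version) == parse_version(pinned)


def satisfies_compatible_range(version: str) -> bool:
    """Mimic '>=2.4.0,<2.5.0': any 2.4.x patch release is accepted."""
    return (2, 4, 0) <= parse_version(version) < (2, 5, 0)


# Candidate versions are illustrative; they are not claims about published releases.
for candidate in ("2.4.0", "2.4.1", "2.5.0"):
    print(candidate, satisfies_exact_pin(candidate), satisfies_compatible_range(candidate))
```

Under the range constraint a hypothetical 2.4.1 patch release would be picked up, while the exact pin accepts 2.4.0 only.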
backend/app/classification/classification_router.py (2)
51-53: Consider using logging.exception for better debugging.

Per static analysis, logging.exception automatically includes the traceback, which aids debugging classification failures.
         except Exception as e:
-            logger.error(f"Triage error: {str(e)}")
+            logger.exception("Triage error: %s", e)
             return self._fallback_triage(message)
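The difference between the two logging calls can be seen in a small standalone sketch. It assumes a stand-in `classify` function that always fails and a hypothetical logger name; neither is taken from classification_router.py.

```python
import io
import logging

logger = logging.getLogger("triage_demo")  # hypothetical logger name
logger.propagate = False


def classify(message: str) -> dict:
    """Stand-in for the LLM call; always fails in this demo."""
    raise RuntimeError("rate limit exceeded")


def capture(log_call) -> str:
    """Run classify(), log the failure via log_call, and return the logged text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        try:
            classify("hello")
        except RuntimeError as e:
            log_call(e)  # called while the exception is still being handled
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


# logger.error records only the formatted message.
error_output = capture(lambda e: logger.error(f"Triage error: {e}"))
# logger.exception records the message plus the active traceback.
exception_output = capture(lambda e: logger.exception("Triage error"))

print(error_output)
print(exception_output)
```

`logger.exception` must be called from inside an `except` block, since it reads the exception currently being handled.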
36-37: Move import json to top of file.

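Importing json inside the function re-runs the import statement on every call. After the first call this is only a cached lookup in sys.modules, so the cost is small, but standard practice is to import at module level.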

Add at the top of the file with other imports:
import json
Then remove line 36.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8eeacad and 2233f29.

⛔ Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

backend/app/agents/devrel/prompts/react_prompt.py (2 hunks)
backend/app/agents/devrel/prompts/response_prompt.py (2 hunks)
backend/app/agents/devrel/prompts/search_prompt.py (1 hunks)
backend/app/agents/devrel/prompts/summarization_prompt.py (2 hunks)
backend/app/classification/classification_router.py (1 hunks)
backend/app/classification/prompt.py (2 hunks)
backend/app/database/falkor/code-graph-backend/api/index.py (0 hunks)
backend/main.py (2 hunks)
pyproject.toml (1 hunks)

💤 Files with no reviewable changes (1)

backend/app/database/falkor/code-graph-backend/api/index.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-06-08T13:08:48.469Z

Learnt from: smokeyScraper
Repo: AOSSIE-Org/Devr.AI PR: 72
File: backend/app/agents/shared/classification_router.py:0-0
Timestamp: 2025-06-08T13:08:48.469Z
Learning: The user plans to migrate the JSON parsing in backend/app/agents/shared/classification_router.py from manual JSON extraction to using Pydantic parser for better validation and type safety.

Applied to files:

backend/app/classification/classification_router.py

🪛 Ruff (0.14.8)

backend/app/classification/classification_router.py

51-51: Do not catch blind exception: Exception

(BLE001)

52-52: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

52-52: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🔇 Additional comments (9)

backend/app/classification/classification_router.py (2)

39-45: Core bug fix looks good.

Changing the default from True to False directly addresses the critical issue in #93. When JSON parsing succeeds but fields are missing, the bot will now safely ignore the message rather than spam responses.

55-64: Fallback behavior fix is correct.

The change from needs_devrel: True to needs_devrel: False in the fallback path is the right approach—better to miss a message than to spam users when classification fails.
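A minimal sketch of the fail-closed behavior described in these two comments, assuming the router receives the LLM output as a JSON string. The function names, the `reasoning` and `original_message` fields, and the log messages are hypothetical and are not taken from classification_router.py.

```python
import json
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def fallback_triage(message: str) -> Dict[str, Any]:
    """Fail closed: when classification is unavailable, do not activate DevRel."""
    return {
        "needs_devrel": False,
        "priority": "low",
        "reasoning": "Classification failed; message ignored",  # hypothetical field
        "original_message": message,  # hypothetical field
    }


def parse_triage(raw_response: str, message: str) -> Dict[str, Any]:
    """Parse the classifier's JSON output, defaulting to 'ignore' on any gap."""
    try:
        data = json.loads(raw_response)
    except (json.JSONDecodeError, TypeError):
        logger.warning("Classifier returned non-JSON output; using fallback")
        return fallback_triage(message)
    if not isinstance(data, dict):
        logger.warning("Classifier output is not a JSON object; using fallback")
        return fallback_triage(message)
    result = fallback_triage(message)
    # Only an explicit JSON true activates the bot; missing fields keep safe defaults.
    result["needs_devrel"] = data.get("needs_devrel", False) is True
    result["priority"] = data.get("priority", "low")
    result["reasoning"] = data.get("reasoning", "")
    return result


print(parse_triage("429 Too Many Requests", "hi all"))
print(parse_triage('{"needs_devrel": true, "priority": "high"}', "How do I set up the project?"))
```

The design choice is that every failure path (non-JSON text, wrong JSON type, missing keys) ends in the same "ignore" result, so a classifier outage cannot cause replies.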

backend/app/agents/devrel/prompts/search_prompt.py (2)

1-48: Well-structured prompt with comprehensive guidance.

The expanded prompt provides clear extraction rules, diverse examples covering common scenarios (setup, errors, contribution, APIs), and explicit edge-case handling. The 2-6 word constraint with max 10 words is reasonable for search queries.

41-45: Consider the "general inquiry" fallback behavior.

Returning "general inquiry" as a search query for greeting-only messages may not yield useful search results. Ensure downstream code handles this sentinel value appropriately, or consider returning an empty string to signal no actionable query.
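One way downstream code could handle the sentinel is sketched below. It assumes the extracted query arrives as a plain string; the function and constant names are hypothetical, and the 10-word cap is the maximum mentioned in the review above.

```python
from typing import Optional

NO_QUERY_SENTINEL = "general inquiry"  # value the prompt returns for greeting-only messages
MAX_QUERY_WORDS = 10  # upper bound on query length mentioned in the review


def normalize_search_query(raw_query: str) -> Optional[str]:
    """Return a cleaned query, or None when there is nothing worth searching for."""
    query = " ".join(raw_query.split())  # trim and collapse whitespace
    if not query or query.lower() == NO_QUERY_SENTINEL:
        return None
    words = query.split(" ")
    return " ".join(words[:MAX_QUERY_WORDS])  # truncate overly long queries


for raw in ("General Inquiry", "  docker   setup error ", ""):
    print(repr(raw), "->", normalize_search_query(raw))
```

Returning `None` lets the caller skip the web search entirely instead of searching for the literal sentinel text.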

backend/app/classification/prompt.py (2)

7-39: Excellent stricter activation rules.

The two-pronged activation criteria (explicit bot tag OR direct project question) with the "err on the side of NOT responding" decision logic directly addresses the false-positive reduction goal. The KEY example distinguishing general questions from project-specific ones (Lines 28-29) is particularly helpful for the LLM.

48-95: Comprehensive examples covering both positive and negative cases.

The 18 examples (9 SHOULD RESPOND, 9 SHOULD IGNORE) provide clear guidance across various scenarios including greetings, statements vs questions, community-answered questions, and reactions. This should significantly improve classification accuracy.

backend/app/agents/devrel/prompts/summarization_prompt.py (1)

1-57: Well-structured summarization prompt with clear guidance.

The prompt is well-organized with distinct sections, explicit merging instructions, and helpful good/bad examples. The 250-word limit and prioritization of recent context are sensible choices for maintaining useful conversation summaries.

backend/app/agents/devrel/prompts/react_prompt.py (1)

1-104: Comprehensive ReAct prompt with clear decision logic and guardrails.

The prompt effectively implements the ReAct reasoning pattern with well-defined actions, priority-based decision logic, and iteration limits. The examples cover diverse scenarios and the "When NOT to use tools" section helps prevent common agent loops.
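The iteration limit can also be enforced in code rather than only in the prompt. The sketch below is a hypothetical Think → Act → Observe loop: `decide` stands in for the LLM supervisor call, the `"complete"` action name is an assumption, and the tool registry is illustrative rather than the project's actual tool signatures.

```python
from typing import Callable, Dict, List, Tuple

MAX_ITERATIONS = 5  # iteration limit stated in the PR description

Decision = Tuple[str, str]  # (action name, action input)


def run_supervisor(
    message: str,
    decide: Callable[[str, List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
) -> List[str]:
    """Run a Think -> Act -> Observe loop, stopping at 'complete' or the limit."""
    observations: List[str] = []
    for _ in range(MAX_ITERATIONS):
        action, action_input = decide(message, observations)  # Think
        if action == "complete" or action not in tools:
            break  # finished, or the supervisor asked for an unknown tool
        result = tools[action](action_input)  # Act
        observations.append(f"{action}: {result}")  # Observe
    return observations


def always_search(message: str, observations: List[str]) -> Decision:
    """Worst-case supervisor that never decides it is done."""
    return ("web_search", message)


demo_tools = {"web_search": lambda query: f"results for {query!r}"}
trace = run_supervisor("how do I set up the project?", always_search, demo_tools)
print(len(trace))
```

Even with a supervisor that never returns `"complete"`, the loop ends after `MAX_ITERATIONS` tool calls, which guards against the agent loops the review mentions.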

backend/app/agents/devrel/prompts/response_prompt.py (1)

24-144: Well-structured response prompt with comprehensive guidelines.

The content guidelines, response type templates, and quality checklist provide excellent structure for generating consistent, user-friendly responses. The DevRel tone guidance and handling of incomplete information are particularly valuable additions.

backend/app/agents/devrel/prompts/response_prompt.py

backend/main.py

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2233f29 and 3f8a42f.

📒 Files selected for processing (1)

backend/main.py (2 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-06-08T13:31:11.572Z

Learnt from: smokeyScraper
Repo: AOSSIE-Org/Devr.AI PR: 72
File: backend/app/agents/devrel/nodes/handle_web_search_node.py:31-42
Timestamp: 2025-06-08T13:31:11.572Z
Learning: In backend/app/agents/devrel/tools/search_tool.py, the TavilySearchTool.search() method has partial error handling for missing API key, AttributeError, ConnectionError, and TimeoutError, but lacks a comprehensive Exception catch-all block, so calling functions may still need additional error handling for other potential exceptions.

Applied to files:

backend/main.py

🪛 Ruff (0.14.8)

backend/main.py

53-53: Redundant exception object included in logging.exception call

(TRY401)

backend/main.py

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f8a42f and e47abca.

📒 Files selected for processing (1)

backend/app/agents/devrel/prompts/response_prompt.py (2 hunks)

🔇 Additional comments (1)

backend/app/agents/devrel/prompts/response_prompt.py (1)

1-145: LGTM! Comprehensive prompt expansion with clear guidelines.

The expanded prompt template significantly improves response quality guidance with:

Structured content guidelines covering synthesis, actionability, tone, and length

Clear Discord-specific formatting rules

Six specialized response type templates

Practical quality checklist

The multi-section approach provides the AI model with explicit instructions for generating consistent, high-quality responses while maintaining appropriate tone and formatting for the Discord platform.

backend/app/agents/devrel/prompts/response_prompt.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

backend/main.py (1)
52-53: Good fix addressing past feedback; consider removing unused exception variable.

The switch to logger.exception() correctly addresses previous review comments and will now log the full stack trace for debugging. The exception handling also properly catches all extension-related failures via commands.ExtensionError.

However, the exception variable e on line 52 is no longer used since logger.exception() automatically includes exception details. You can simplify by removing the variable capture:

Apply this diff to remove the unused variable:
-            except (ImportError, commands.ExtensionError) as e:
+            except (ImportError, commands.ExtensionError):
                 logger.exception("Failed to load Discord cog extension")

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e47abca and 77e64ab.

📒 Files selected for processing (1)

backend/main.py (2 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-06-08T13:31:11.572Z

Learnt from: smokeyScraper
Repo: AOSSIE-Org/Devr.AI PR: 72
File: backend/app/agents/devrel/nodes/handle_web_search_node.py:31-42
Timestamp: 2025-06-08T13:31:11.572Z
Learning: In backend/app/agents/devrel/tools/search_tool.py, the TavilySearchTool.search() method has partial error handling for missing API key, AttributeError, ConnectionError, and TimeoutError, but lacks a comprehensive Exception catch-all block, so calling functions may still need additional error handling for other potential exceptions.

Applied to files:

backend/main.py

🪛 Ruff (0.14.8)

backend/main.py

52-52: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

Fix classification fallback bug and enhance all prompts (Issue AOSSIE…

2233f29

…-Org#93)

coderabbitai bot reviewed Dec 15, 2025

backend/app/agents/devrel/prompts/response_prompt.py (resolved)

backend/app/agents/devrel/prompts/response_prompt.py (resolved)

backend/main.py (outdated, resolved)

LinKon12 added 2 commits December 17, 2025 02:17

Fix: Catch all extension errors and use logger.exception

3f8a42f

Fix: Remove formatting inconsistencies in response prompt

e47abca

coderabbitai bot reviewed Dec 16, 2025

backend/main.py (outdated, resolved)

backend/main.py (outdated, resolved)

Fix: Remove redundant exception argument from logger.exception

77e64ab

coderabbitai bot reviewed Dec 16, 2025

backend/app/agents/devrel/prompts/response_prompt.py (outdated, resolved)

coderabbitai bot reviewed Dec 16, 2025


Fix: Add backticks to command example to match formatting guidelines

b40cd7a
