Description
Summary
E2E tests for the signal-decision engine revealed that keyword matching is highly unreliable. Only the "ASAP" keyword works consistently, while other keywords ("urgent", "think", "careful", "immediate") fail sporadically.
Severity
Critical (Functionality)
Symptom
Keyword matching shows inconsistent behavior:
- ✅ "ASAP" keyword: Works consistently
- ❌ Other keywords: Fail sporadically, even when present in queries
- ❌ Same keywords succeed in some queries but fail in others
Test Results
Tests affected:
- keyword-routing: 17% accuracy (1/6 pass)
- rule-condition-logic: 33% accuracy (2/6 pass)
Example Failures
| Query | Keywords | Expected | Actual | Status |
|---|---|---|---|---|
| "We need this done ASAP" | asap | thinking_decision | thinking_decision | ✅ PASS |
| "This is URGENT" | urgent | thinking_decision | (empty) | ❌ FAIL |
| "Please think carefully" | think, careful | thinking_decision | thinking_decision | ✅ PASS |
| "Think about this urgent problem" | think, urgent | thinking_decision | (empty) | ❌ FAIL |
Pattern: Same keywords succeed in some queries but fail in others (inconsistent behavior).
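As a point of comparison, the sketch below runs a plain case-insensitive substring check over the four example queries. It is a hypothetical baseline, not the engine's implementation: the function name `find_keywords` and the `KEYWORDS` tuple are assumptions made for illustration, with the keyword list taken from the acceptance criteria. Under this baseline every query in the table contains at least one configured keyword, so the empty results come from the matching signal rather than from the queries.

```python
# Hypothetical baseline for comparison; not the signal-decision engine's code.
# The keyword list mirrors the configured keywords named in this issue.
KEYWORDS = ("urgent", "immediate", "asap", "think", "careful")


def find_keywords(query: str, keywords=KEYWORDS) -> list[str]:
    """Return the configured keywords that occur in the query, ignoring case."""
    lowered = query.casefold()
    # Substring match: "careful" is also found inside "carefully".
    return [kw for kw in keywords if kw.casefold() in lowered]


# The four example queries from the failure table above.
QUERIES = [
    "We need this done ASAP",
    "This is URGENT",
    "Please think carefully",
    "Think about this urgent problem",
]

for query in QUERIES:
    print(f"{query!r}: {find_keywords(query)}")
```

Note that the table lists "careful" as a keyword for "Please think carefully", so this sketch uses substring matching. A strict word-boundary match would not find "careful" in that query, which is a design choice the real matcher needs to settle.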
Steps to Reproduce
Run the E2E test workflows:
Option 1: Run AIBrix/AI Gateway tests
# Trigger integration-test-k8s.yml workflow
# This runs tests for AIBrix and AI Gateway profiles
Option 2: Run Dynamic Config tests
# Trigger integration-test-dynamic-config.yml workflow
# This runs tests for Dynamic Config profile
Or run locally:
# AIBrix profile
make e2e-test E2E_PROFILE=aibrix
# AI Gateway profile
make e2e-test E2E_PROFILE=ai-gateway
# Dynamic Config profile
make e2e-test E2E_PROFILE=dynamic-config
All profiles show the same keyword matching issues.
Root Cause
The root cause is not yet confirmed; it appears to be an issue with the embedding-based signal. The keyword matching signal needs to work correctly and be covered by tests.
Acceptance Criteria
- All configured keywords (urgent, immediate, asap, think, careful) match consistently
- Case-insensitive matching works correctly (URGENT = urgent)
- Keywords are detected regardless of position in query
- Multiple keywords in same query are all detected
- keyword-routing test reaches 100% accuracy (6/6 pass)
- rule-condition-logic test reaches 100% accuracy (6/6 pass)
- Add unit tests for keyword matching with various query patterns
- Verify embedding-based keyword matching signal is working correctly
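The criteria above could be turned into table-driven unit tests along the following lines. This is a minimal sketch under stated assumptions: `match_keywords` is a hypothetical stand-in for the engine's keyword signal (a simple case-insensitive substring check), and every query in `CASES` other than those quoted in this issue is invented for illustration. In the real test suite the stand-in would be replaced by a call into the actual signal.

```python
import unittest

# Keywords named in the acceptance criteria of this issue.
CONFIGURED_KEYWORDS = ("urgent", "immediate", "asap", "think", "careful")


def match_keywords(query, keywords=CONFIGURED_KEYWORDS):
    """Hypothetical stand-in for the keyword signal: case-insensitive substring match."""
    lowered = query.casefold()
    return {kw for kw in keywords if kw in lowered}


# (query, expected keywords). Covers case, position, and multiple keywords.
CASES = [
    ("This is URGENT", {"urgent"}),                                   # upper case, end of query
    ("urgent: please review", {"urgent"}),                            # start of query
    ("We need an immediate answer", {"immediate"}),                   # middle of query
    ("We need this done ASAP", {"asap"}),
    ("Think about this urgent problem", {"think", "urgent"}),         # two keywords
    ("Please be careful and think it through", {"careful", "think"}),
    ("What is the capital of France?", set()),                        # no keyword present
]


class KeywordMatchingTest(unittest.TestCase):
    def test_keyword_detection(self):
        for query, expected in CASES:
            # subTest reports each failing query separately.
            with self.subTest(query=query):
                self.assertEqual(match_keywords(query), expected)


def accuracy(cases):
    """Fraction of cases whose detected keywords equal the expected set."""
    passed = sum(match_keywords(query) == expected for query, expected in cases)
    return passed / len(cases)
```

Saved as a module, the tests can be run with `python -m unittest`. The `accuracy` helper mirrors the pass-rate figure the E2E tests report, so the same cases can be tracked against the 100% target.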
Impact
This is a critical functionality issue as it breaks the core routing logic that depends on keyword detection. Queries that should trigger specific decision paths are not being routed correctly.
Related Issues
Part of the signal-decision engine backend issues affecting all deployment profiles (AIBrix, AI Gateway, Dynamic Config).