Skip to content

[Test] Keyword Matching Unreliable - Only 'ASAP' Works Consistently #713

@Xunzhuo

Description

@Xunzhuo

Summary

E2E tests for the signal-decision engine revealed that keyword matching is highly unreliable. Only the "ASAP" keyword works consistently, while other keywords ("urgent", "think", "careful", "immediate") fail sporadically.

Severity

Critical (Functionality)

Symptom

Keyword matching shows inconsistent behavior:

  • ✅ "ASAP" keyword: Works consistently
  • ❌ Other keywords: Fail sporadically, even when present in queries
  • ❌ Same keywords succeed in some queries but fail in others

Test Results

Tests affected:

  • keyword-routing: 17% accuracy (1/6 pass)
  • rule-condition-logic: 33% accuracy (2/6 pass)

Example Failures

Query Keywords Expected Actual Status
"We need this done ASAP" asap thinking_decision thinking_decision ✅ PASS
"This is URGENT" urgent thinking_decision (empty) ❌ FAIL
"Please think carefully" think, careful thinking_decision thinking_decision ✅ PASS
"Think about this urgent problem" think, urgent thinking_decision (empty) ❌ FAIL

Pattern: Same keywords succeed in some queries but fail in others (inconsistent behavior).

Steps to Reproduce

Run the E2E test workflows:

Option 1: Run AIBrix/AI Gateway tests

# Trigger integration-test-k8s.yml workflow
# This runs tests for AIBrix and AI Gateway profiles

Option 2: Run Dynamic Config tests

# Trigger integration-test-dynamic-config.yml workflow
# This runs tests for Dynamic Config profile

Or run locally:

# AIBrix profile
make e2e-test E2E_PROFILE=aibrix

# AI Gateway profile
make e2e-test E2E_PROFILE=ai-gateway

# Dynamic Config profile
make e2e-test E2E_PROFILE=dynamic-config

All profiles will show the same keyword matching issues.

Root Cause

This appears to be an embedding-based signal issue. The keyword matching signal needs to be working correctly and covered in tests.

Acceptance Criteria

  • All configured keywords (urgent, immediate, asap, think, careful) match consistently
  • Case-insensitive matching works correctly (URGENT = urgent)
  • Keywords are detected regardless of position in query
  • Multiple keywords in same query are all detected
  • keyword-routing test reaches 100% accuracy (6/6 pass)
  • rule-condition-logic test reaches 100% accuracy (6/6 pass)
  • Add unit tests for keyword matching with various query patterns
  • Verify embedding-based keyword matching signal is working correctly

Impact

This is a critical functionality issue as it breaks the core routing logic that depends on keyword detection. Queries that should trigger specific decision paths are not being routed correctly.

Related Issues

Part of the signal-decision engine backend issues affecting all deployment profiles (AIBrix, AI Gateway, Dynamic Config).

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions