Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Oct 22, 2025

Overview

This PR implements category-level jailbreak detection configuration, allowing administrators to enable/disable jailbreak protection and customize detection thresholds on a per-category basis. Previously, jailbreak detection was a global setting that applied uniformly to all requests. With this change, different categories can have different security policies based on their specific risk profiles and use cases.

Problem Statement

The global prompt_guard.enabled and prompt_guard.threshold settings applied jailbreak detection uniformly across all categories. However, different categories have different security requirements:

  • Public-facing categories (e.g., customer support, business advice) need strict jailbreak detection with high thresholds
  • Technical categories (e.g., code generation) may need relaxed thresholds to avoid false positives on code/technical jargon
  • Internal tools may need it disabled entirely for trusted environments

Without category-level control, administrators had to choose between:

  1. Enabling globally with one threshold (potentially blocking legitimate requests or missing threats)
  2. Disabling globally (leaving endpoints vulnerable)

Solution

New Configuration Fields

Added jailbreak_enabled and jailbreak_threshold as optional fields in category configuration:

prompt_guard:
  enabled: true       # Global default
  threshold: 0.7      # Global default threshold

categories:
  - name: customer_support
    jailbreak_enabled: true      # Explicitly enable for public-facing
    jailbreak_threshold: 0.9     # High threshold for strict detection
    model_scores:
      - model: qwen3
        score: 0.8

  - name: code_generation
    jailbreak_enabled: true      # Keep enabled
    jailbreak_threshold: 0.5     # Lower threshold to reduce false positives
    model_scores:
      - model: qwen3
        score: 0.9

  - name: general
    # No jailbreak_enabled or jailbreak_threshold specified
    # Inherits global settings (enabled: true, threshold: 0.7)
    model_scores:
      - model: qwen3
        score: 0.5

Behavior

  • When jailbreak_enabled is not specified: Category inherits from global prompt_guard.enabled
  • When jailbreak_enabled: true/false: Jailbreak detection is explicitly enabled/disabled for this category
  • When jailbreak_threshold is not specified: Category inherits from global prompt_guard.threshold
  • When jailbreak_threshold: 0.X: Uses category-specific threshold (0.0-1.0)
  • Category-specific settings always override global settings when explicitly configured

Threshold Tuning Guidelines

  • High threshold (0.8-0.95): Stricter detection, fewer false positives, for high-security categories
  • Medium threshold (0.6-0.8): Balanced detection, good for most use cases
  • Low threshold (0.4-0.6): More sensitive, catches more attacks, for technical categories with higher false positive tolerance

Implementation Details

  1. Configuration Structure (pkg/config/config.go):

    • Added JailbreakEnabled *bool field to Category struct
    • Added JailbreakThreshold *float32 field to Category struct
    • Implemented IsJailbreakEnabledForCategory(categoryName string) bool method
    • Implemented GetJailbreakThresholdForCategory(categoryName string) float32 method
    • Uses pointer types to distinguish between "not set" (nil) and explicitly set values
  2. Request Processing (pkg/extproc/request_handler.go):

    • Moved category classification before security checks to enable category-aware detection
    • Updated performSecurityChecks() to accept category name and use category-specific settings
    • Retrieves both enabled status and threshold based on category
    • Maintains backward compatibility when category cannot be determined
  3. Classifier (pkg/utils/classification/classifier.go):

    • Added CheckForJailbreakWithThreshold() method that accepts custom threshold
    • Added AnalyzeContentForJailbreakWithThreshold() for batch analysis with custom threshold
    • Original methods delegate to new threshold-aware methods for backward compatibility
  4. Testing (pkg/config/config_test.go):

    • Added 7 test cases for jailbreak_enabled configuration
    • Added 5 test cases for jailbreak_threshold configuration
    • Tests cover inheritance behavior, explicit overrides, and edge cases
    • All 132 tests pass

Documentation

  • Created config/examples/jailbreak_category_example.yaml with comprehensive examples and threshold tuning guidance
  • Updated website/docs/tutorials/content-safety/jailbreak-protection.md with category-level configuration
  • Updated website/docs/overview/categories/configuration.md with parameter documentation and examples
  • Updated website/docs/installation/configuration.md with inline comments
  • Updated config/config.yaml with inline comments explaining the feature
  • Updated README.md to mention fine-grained security control

Security Considerations

  • Default behavior unchanged: Existing configurations continue to work without modification
  • No silent failures: CodeQL analysis shows 0 security alerts
  • Best practice: Global jailbreak should be enabled by default, with category-specific overrides only where risk is managed differently
  • Explicit control: Administrators must explicitly configure per-category settings
  • Threshold monitoring: Documentation includes guidance on monitoring false positive/negative rates

Testing

  • ✅ All 132 config tests pass (12 new tests added)
  • ✅ New tests cover all inheritance and override scenarios for both enabled status and threshold
  • ✅ CodeQL security analysis: 0 alerts
  • ✅ Backward compatibility verified
  • ✅ YAML syntax validated
  • ✅ Go packages compile successfully

Migration Guide

No migration needed - this is a backward-compatible addition. Existing configurations will continue to work exactly as before. To use category-level jailbreak detection:

  1. Set your global defaults: prompt_guard.enabled: true/false and prompt_guard.threshold: 0.0-1.0
  2. Override per category as needed: jailbreak_enabled: true/false and/or jailbreak_threshold: 0.0-1.0
  3. Omit category-level settings to inherit the global defaults

Related Issues

Fixes #507

Original prompt

This section details on the original issue you should resolve

<issue_title>Allow jailbreak to be set at the category level</issue_title>
<issue_description>Is your feature request related to a problem? Please describe.

Currently, jailbreak prompt guard is based off a global setting for doing jailbreak detection.

Given the possible consequences and specifics for different categories, this should really be set at the category example.

Please allow setting jailbreak on/off at the category level.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #507


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

@netlify
Copy link

netlify bot commented Oct 22, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit ea50e6c
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68f88c6e7dee4400087dd50b
😎 Deploy Preview https://deploy-preview-508--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copilot AI changed the title [WIP] Add category level setting for jailbreak detection Allow jailbreak detection to be configured at the category level Oct 22, 2025
Copilot AI requested a review from Xunzhuo October 22, 2025 06:29
Copilot finished work on behalf of Xunzhuo October 22, 2025 06:29
@github-actions
Copy link

github-actions bot commented Oct 22, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 config

Owners: @rootfs
Files changed:

  • config/examples/jailbreak_category_example.yaml
  • config/config.yaml

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • README.md

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/config/config.go
  • src/semantic-router/pkg/config/config_test.go
  • src/semantic-router/pkg/extproc/request_handler.go
  • src/semantic-router/pkg/utils/classification/classifier.go
  • src/training/training_lora/classifier_model_fine_tuning_lora/go.mod
  • src/training/training_lora/pii_model_fine_tuning_lora/go.mod
  • src/training/training_lora/prompt_guard_fine_tuning_lora/go.mod

📁 website

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

  • website/docs/installation/configuration.md
  • website/docs/overview/categories/configuration.md
  • website/docs/tutorials/content-safety/jailbreak-protection.md

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Copy link
Member

@Xunzhuo Xunzhuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot good to go

@Xunzhuo
Copy link
Member

Xunzhuo commented Oct 22, 2025

@copilot please update relevant docs in site

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>
Copy link
Contributor Author

Copilot AI commented Oct 22, 2025

@copilot please update relevant docs in site

Updated documentation in commit f75a8c1:

  • Added category-level jailbreak configuration section to jailbreak-protection.md
  • Updated category configuration docs with jailbreak_enabled parameter explanation
  • Added security-focused configuration example showing public-facing vs internal tool settings
  • Updated global configuration docs with category override notes
  • Updated README to mention fine-grained security control

Copilot AI requested a review from Xunzhuo October 22, 2025 07:29
Copilot finished work on behalf of Xunzhuo October 22, 2025 07:29
@Xunzhuo
Copy link
Member

Xunzhuo commented Oct 22, 2025

@copilot threshold should be configurable in category level (override)

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>
Copy link
Contributor Author

Copilot AI commented Oct 22, 2025

@copilot threshold should be configurable in category level (override)

Implemented category-level threshold configuration in commit ea50e6c:

  • Added jailbreak_threshold field to Category struct (0.0-1.0)
  • Categories can now override global threshold per category
  • Added helper method GetJailbreakThresholdForCategory
  • Updated jailbreak detection to use category-specific thresholds
  • Added 5 comprehensive tests for threshold configuration
  • Updated documentation with threshold tuning guidelines:
    • High-security (0.8-0.95): Customer support, public-facing
    • Standard (0.6-0.8): General queries
    • Technical (0.4-0.6): Code generation, dev tools
  • Example configs show threshold tuning for different risk profiles

Copilot AI changed the title Allow jailbreak detection to be configured at the category level Allow jailbreak detection and threshold to be configured at the category level Oct 22, 2025
Copilot finished work on behalf of Xunzhuo October 22, 2025 07:50
@Xunzhuo
Copy link
Member

Xunzhuo commented Oct 22, 2025

@copilot it is ready for review and merge

Copilot finished work on behalf of Xunzhuo October 22, 2025 08:05
@Xunzhuo Xunzhuo marked this pull request as ready for review October 22, 2025 08:07
@Xunzhuo Xunzhuo merged commit bc4cc98 into main Oct 22, 2025
22 of 24 checks passed
rootfs pushed a commit to rootfs/semantic-router.bak that referenced this pull request Oct 23, 2025
…ory level (vllm-project#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
rootfs added a commit that referenced this pull request Oct 23, 2025
* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
rootfs added a commit that referenced this pull request Oct 23, 2025
* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
rootfs added a commit that referenced this pull request Oct 23, 2025
* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

* merge main to feat branch

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
rootfs added a commit that referenced this pull request Oct 23, 2025
* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
rootfs added a commit that referenced this pull request Oct 23, 2025
* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
rootfs added a commit that referenced this pull request Oct 23, 2025
* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

* merge main to feat branch

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
rootfs added a commit that referenced this pull request Oct 24, 2025
* refactor: Implement modular candle-binding architecture (#254)


- Restructure codebase into modular layers (core/, ffi/, model_architectures/, classifiers/)
- Add unified error handling and configuration loading systems
- Implement dual-path architecture for traditional and LoRA models
- Add comprehensive FFI layer with memory safety

Maintains backward compatibility while enabling future model integrations.

refactor: Implement modular candle-binding architecture

- Restructure codebase into modular layers (core/, ffi/, model_architectures/, classifiers/)
- Add unified error handling and configuration loading systems
- Implement dual-path architecture for traditional and LoRA models
- Add comprehensive FFI layer with memory safety

Maintains backward compatibility while enabling future model integrations.

Signed-off-by: OneZero-Y <[email protected]>

* feat:unit tests for candle refactoring (#296)

feat:unit tests for candle refactoring

feat:unit tests for candle refactoring

Signed-off-by: OneZero-Y <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>

* feat:support for two long-context embedding models (Qwen3-Embedding-0.6B and EmbeddingGemma-300M) (#453)

feat:support for two long-context embedding models (Qwen3-Embedding-0.6B and EmbeddingGemma-300M)

Signed-off-by: OneZero-Y <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>

* fix:Implement Comprehensive Rayon Parallelization for LoRA Classifiers (#464)

Signed-off-by: OneZero-Y <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>

* fix:Improve rust unit test and optimize concurrent tests with rayon (#471)

- Add 6 new unit test files
- Replace std::thread::spawn with rayon::par_iter

Signed-off-by: OneZero-Y <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>

* fix: resolve syntax errors after rebase

Signed-off-by: Huamin Chen <[email protected]>

* add additional update

Signed-off-by: Huamin Chen <[email protected]>

* Change label count params to c_int (#494)

Signed-off-by: carlory <[email protected]>

* update embedding setting in config (#489)

Signed-off-by: Huamin Chen <[email protected]>

* make CUDA and Flash Attention 2 optional features (#511)

Signed-off-by: OneZero-Y <[email protected]>

* fix: Fix duplicate UNIFIED_CLASSIFIER definition and optimize lock contention (#516)

- Remove duplicate UNIFIED_CLASSIFIER global state
- Optimize PARALLEL_LORA_ENGINE lock contention by using Arc clone

Signed-off-by: OneZero-Y <[email protected]>

* Merge main to candle refactoring (#523)

* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Candle refactoring to main (#524)

* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Merge candle refactoring 3 (#525)

* Update test description from Math to General (#483)

Signed-off-by: carlory <[email protected]>

* feat: add HuggingChat support (#477)

* add chat ui to dashboard and docker compose & refactor dashboard/backend/

Signed-off-by: JaredforReal <[email protected]>

* try fix network error

Signed-off-by: JaredforReal <[email protected]>

* more

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: bitliu <[email protected]>

* project: 2025 Q4 roadmap (#487)

* project: q4 roadmap

* project: q4 roadmap

* project: q4 roadmap

* more

* more

* more

* more

* feat: add shelleck precommit hook (#488)

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

* feat: add shelleck precommit hook

Signed-off-by: yuluo-yx <[email protected]>

---------

Signed-off-by: yuluo-yx <[email protected]>

* project: add q4 roadmap news (#495)

* fix missing shellcheck in pre-commit image (#497)

Signed-off-by: carlory <[email protected]>

* infra: update tools (#501)

Signed-off-by: yuluo-yx <[email protected]>

* feat(demo): enhance OpenShift demo scripts with improved UX (#478)

- Reduce model selection test to 4 categories (2×Model-A, 2×Model-B)
- Add new "Classification Examples" option calling curl-examples.sh
- Update reasoning examples to avoid cache hits from previous tests
- Remove benign examples from PII and Jailbreak tests (show only attacks)
- Enhance live-semantic-router-logs.sh with better color visibility:
  - Fix duplicate "WITH SCORE" text in classification output
  - Fix CACHE HIT background color extending over timestamp
  - Distinguish reasoning enabled vs disabled messages
  - Remove redundant "(standard routing)" text
  - Add background colors for Model-A/Model-B routing display

These improvements make the live demo clearer and more impactful for
presentations and demonstrations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: fix precommit Argument list too long error (#502)

Signed-off-by: yuluo-yx <[email protected]>

* feat: enforce milvus dial timeout if set (#503)

Signed-off-by: cryo <[email protected]>

* Add IETF draft publication: Multi-Provider Extensions for Agentic AI Inference APIs (#506)

* Initial plan

* Add new IETF draft publication for Multi-Provider Extensions for Agentic AI Inference APIs

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>

* Allow semantic cache similarity threshold to be set at the category level (#493)

* Initial plan

* Add category-level cache settings: enabled and similarity_threshold

Co-authored-by: rootfs <[email protected]>

* Add comprehensive tests for category-level cache settings

Co-authored-by: rootfs <[email protected]>

* Update config files and documentation for category-level cache settings

- Updated 7 config YAML files (development, production, testing, e2e, and 3 recipes) with commented examples of category-level cache settings
- Added comprehensive documentation section explaining category-level cache configuration
- Updated semantic cache overview and in-memory cache docs with category-level examples
- Added best practices for threshold selection and privacy considerations

Co-authored-by: rootfs <[email protected]>

* Remove duplicate code in FindSimilar functions

Refactored FindSimilar() to delegate to FindSimilarWithThreshold() with default threshold instead of duplicating the entire implementation. This eliminates 226 lines of duplicate code across inmemory_cache.go and milvus_cache.go.

Co-authored-by: rootfs <[email protected]>

* Update src/semantic-router/pkg/extproc/request_handler.go

Co-authored-by: Copilot <[email protected]>

* Revert changes from unsigned commit ae39fe2

Restored the classificationText empty check that was removed in the previous commit.

Co-authored-by: rootfs <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Allow jailbreak detection and threshold to be configured at the category level (#508)

* Initial plan

* Add category-level jailbreak detection configuration

Co-authored-by: Xunzhuo <[email protected]>

* Add documentation for category-level jailbreak settings

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation for category-level jailbreak detection

- Add category-level jailbreak configuration to jailbreak-protection.md
- Update category configuration docs with jailbreak_enabled parameter
- Add security-focused configuration example
- Update global configuration docs with category override notes
- Update README to mention fine-grained security control

Co-authored-by: Xunzhuo <[email protected]>

* Add category-level jailbreak threshold configuration

- Add JailbreakThreshold field to Category struct
- Add GetJailbreakThresholdForCategory helper method
- Create CheckForJailbreakWithThreshold and AnalyzeContentForJailbreakWithThreshold methods
- Update performSecurityChecks to use category-specific threshold
- Add 5 comprehensive tests for threshold configuration
- Update example configs with threshold tuning examples
- Update documentation with threshold configuration and tuning guidelines
- Add threshold tuning guide with recommendations for different category types

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Allow PII detection threshold to be set at the category level (#510)

* Initial plan

* Add category-level PII threshold support

Co-authored-by: Xunzhuo <[email protected]>

* Update documentation with API integration notes

Co-authored-by: Xunzhuo <[email protected]>

* Fix markdown linting issues

Co-authored-by: Xunzhuo <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* Fix: The caller information points to the wrapper function instead of the actual call location (#518)

Signed-off-by: carlory <[email protected]>

* feat: Implement hybrid cache that use in-memory index and milvus based doc store (#504)

* feat: add HNSW index to inmemory semantic cache and implement hybrid cache that use in-memory index and milvus based doc store

Signed-off-by: Huamin Chen <[email protected]>

* chore: run go mod tidy to clean up module dependencies

Signed-off-by: Huamin Chen <[email protected]>

* conditionally build candle cuda support

Signed-off-by: Huamin Chen <[email protected]>

* rebuild index upon restart

Signed-off-by: Huamin Chen <[email protected]>

* precommit fix

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* fix precommit

Signed-off-by: Huamin Chen <[email protected]>

* disable cuda build on ci

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

* review feedback

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

* merge main to feat branch

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>

* chore: fix unit test (#527)

* chore: fix unit test

Signed-off-by: Huamin Chen <[email protected]>

* fix go vet

Signed-off-by: Huamin Chen <[email protected]>

* fix ci

Signed-off-by: Huamin Chen <[email protected]>

* fix ci

Signed-off-by: Huamin Chen <[email protected]>

* split test-binding to two stages on ci

Signed-off-by: Huamin Chen <[email protected]>

* ignore test failure due to embeddinggemma restriction

Signed-off-by: Huamin Chen <[email protected]>

* reorder ci test sequences to avoid missing models

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

* refactor: Replace lazy_static with OnceLock for zero-cost concurrent reads based on review  (#528)

* refactor: Replace lazy_static with OnceLock for zero-cost concurrent reads based on review #266 (comment)

Signed-off-by: Huamin Chen <[email protected]>

* update tests

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

* chore: fix lint error (#530)

Signed-off-by: Huamin Chen <[email protected]>

* Fix lint error2 (#531)

* chore: fix lint error

Signed-off-by: Huamin Chen <[email protected]>

* chore: fix lint error

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: Huamin Chen <[email protected]>

---------

Signed-off-by: OneZero-Y <[email protected]>
Signed-off-by: Huamin Chen <[email protected]>
Signed-off-by: carlory <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: yuluo-yx <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Signed-off-by: cryo <[email protected]>
Co-authored-by: OneZero-Y <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Jared <[email protected]>
Co-authored-by: bitliu <[email protected]>
Co-authored-by: shown <[email protected]>
Co-authored-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: cryo <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: rootfs <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Xunzhuo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow jailbreak to be set at the category level

4 participants