Commit 292170b
feat(classifier): enable LoRA auto-detection for intent classification (#726)
* feat: enable LoRA auto-detection for intent classification (#724)
This commit implements automatic detection of LoRA (Low-Rank Adaptation)
models based on the presence of lora_config.json in the model directory.
Changes:
- Add LoRA auto-detection logic in Rust candle-binding layer
- Implement fallback to BERT base model when LoRA config is not found
- Add comprehensive test coverage for auto-detection mechanism
- Update default Helm values to use LoRA intent classification model
- Update AIBrix deployment values to use LoRA models
The auto-detection mechanism checks for lora_config.json during model
initialization and automatically switches between LoRA and base BERT
models without requiring explicit configuration changes.
Signed-off-by: Yossi Ovadia <[email protected]>
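
A minimal Rust sketch of the auto-detection described above, assuming the check is a plain file-existence test on the model directory. Only the file name `lora_config.json` comes from this commit message; the names `ModelKind` and `detect_model_kind` are hypothetical and are not taken from the candle-binding code.

```rust
use std::path::Path;

/// Which classifier backend to load (hypothetical enum, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum ModelKind {
    Lora,
    BaseBert,
}

/// Marker file name, as given in the commit message.
const LORA_CONFIG_FILE: &str = "lora_config.json";

/// Returns `ModelKind::Lora` when `lora_config.json` exists in `model_dir`,
/// and falls back to `ModelKind::BaseBert` otherwise.
pub fn detect_model_kind(model_dir: &Path) -> ModelKind {
    if model_dir.join(LORA_CONFIG_FILE).is_file() {
        ModelKind::Lora
    } else {
        ModelKind::BaseBert
    }
}
```

Because the decision depends only on the directory contents, swapping a model directory is enough to change the backend, which matches the "no explicit configuration changes" goal stated above.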
* fix: enable LoRA intent classification and optimize PII threshold
This commit fixes two critical issues affecting classification accuracy:
1. Fixed IsCategoryEnabled() to check the correct config field path:
   - Changed from c.Config.CategoryMappingPath (non-existent)
   - To c.Config.CategoryModel.CategoryMappingPath (correct)
   - This bug prevented LoRA classification from running in e2e tests
2. Raised the PII detection threshold from 0.7 to 0.9:
   - Reduces false positives from the aggressive LoRA PII model (PR #709)
   - Improves domain classification accuracy from 40.71% to 52.50%
   - Beats the ModernBERT baseline of ~50%
Updated e2e test configurations to use LoRA models with optimized
thresholds across ai-gateway and dynamic-config profiles.
Signed-off-by: Yossi Ovadia <[email protected]>
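
To illustrate why raising the threshold from 0.7 to 0.9 reduces false positives, here is a small Rust sketch of confidence filtering. The `PiiDetection` type, the `filter_by_threshold` function, and the use of `>=` for the comparison are assumptions made for illustration; the commit message does not show how the router applies the threshold.

```rust
/// A detected PII span with the model's confidence (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct PiiDetection {
    pub entity_type: String,
    pub confidence: f32,
}

/// Keeps only the detections whose confidence is at least `threshold`.
/// Assumption: the comparison is inclusive (`>=`).
pub fn filter_by_threshold(detections: &[PiiDetection], threshold: f32) -> Vec<PiiDetection> {
    detections
        .iter()
        .filter(|d| d.confidence >= threshold)
        .cloned()
        .collect()
}
```

Under this sketch, a detection scored 0.75 would be kept at threshold 0.7 but dropped at 0.9, so fewer low-confidence matches reach the routing step.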
* fix(ci): bump model cache version to pick up lora_config.json
Increment cache version from v15 to v16 to ensure CI downloads the
updated LoRA models that include lora_config.json files needed for
auto-detection.
Signed-off-by: Yossi Ovadia <[email protected]>
* chore: switch default config to use LoRA models with optimized thresholds
Update default configuration to use LoRA-based classification:
- Intent classification: lora_intent_classifier_bert-base-uncased_model
- PII detection: lora_pii_detector_bert-base-uncased_model with threshold 0.9
This aligns the default config with e2e test configurations for
consistency across all environments.
Signed-off-by: Yossi Ovadia <[email protected]>
* fix(e2e): add decision routes for all 14 LoRA categories in test profiles
The production-stack and llm-d E2E test profiles were failing with
0-1% domain classification accuracy because they only configured
decision routes for 1-2 categories while using LoRA intent classifiers
that classify into 14 categories.
When the classifier correctly identified categories like "biology",
"health", or "math", no matching decision existed, causing
"decision evaluation failed: no decision matched" errors.
Changes:
- production-stack: Added decision routes for all 14 categories
  (business, philosophy, biology, health, computer science,
  engineering, psychology, math, chemistry, physics, history,
  law, economics, other)
- llm-d: Added decision routes for all 14 categories with
  intelligent grouping (sciences, social sciences, humanities)
Results:
- production-stack domain classification: 1% → 53% accuracy (roughly 50x improvement)
- All 12 production-stack E2E tests now pass
This fix ensures LoRA auto-detection works properly by providing
decision routes for all categories that the classifier can identify.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
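
The failure mode fixed above can be sketched in Rust as a lookup from category to decision, plus a check for categories that have no route. The category list and the error text are taken from this commit message; `match_decision`, `missing_routes`, and the flat `HashMap` layout are hypothetical simplifications, and the router's real decision evaluation is likely more elaborate.

```rust
use std::collections::HashMap;

/// The 14 categories listed in the commit message.
pub const CATEGORIES: [&str; 14] = [
    "business", "philosophy", "biology", "health", "computer science",
    "engineering", "psychology", "math", "chemistry", "physics", "history",
    "law", "economics", "other",
];

/// Looks up the decision configured for `category`.
/// Returns an error when no route exists, mirroring the reported failure.
pub fn match_decision<'a>(
    routes: &'a HashMap<String, String>,
    category: &str,
) -> Result<&'a str, String> {
    routes
        .get(category)
        .map(String::as_str)
        .ok_or_else(|| "decision evaluation failed: no decision matched".to_string())
}

/// Lists the categories the classifier can emit that have no configured route.
pub fn missing_routes(routes: &HashMap<String, String>) -> Vec<&'static str> {
    CATEGORIES
        .iter()
        .copied()
        .filter(|c| !routes.contains_key(*c))
        .collect()
}
```

A check like `missing_routes` could run when a test profile is loaded, so a profile that covers only 1-2 categories is flagged before any request is classified.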
---------
Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Co-authored-by: Claude <[email protected]>
1 parent 9061431 commit 292170b
File tree
15 files changed (+641, -95 lines changed)
- .github/workflows
- candle-binding/src
  - classifiers/lora
  - core
  - ffi
  - model_architectures/lora
- config
- deploy
  - helm/semantic-router
  - kubernetes/aibrix/semantic-router-values
- e2e/profiles
  - ai-gateway
  - dynamic-config
  - llm-d
  - production-stack
- src/semantic-router/pkg/classification