
Commit 1bbfcac

fix(semantic_search): Update conditional compilation flags for embedders
Update conditional compilation flags to match the new embedding model selection logic:

- Replace target_env="musl" conditions with target_os conditions
- Update TextEmbedder trait implementation to use macOS/Windows condition
- Ensure consistent conditions across all files

🤖 Assisted by [Amazon Q Developer](https://aws.amazon.com/q/developer)
1 parent 5cee535 commit 1bbfcac
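To make the described change concrete, here is a minimal, self-contained Rust sketch (hypothetical, not code from this commit) contrasting the old `target_env="musl"` style of gating with the new `target_os`/`target_arch` predicates:

```rust
// Hypothetical illustration of the predicate change; the real crate gates
// imports, dependency tables, and match arms rather than a helper like this.

// Old style (removed by this commit): keyed on the C library in use.
// #[cfg(not(target_env = "musl"))]
// fn embedding_backend() -> &'static str { "candle" }

// New style: keyed on OS and CPU architecture.
#[cfg(not(target_arch = "aarch64"))]
fn embedding_backend() -> &'static str {
    "candle" // Candle is compiled in on non-ARM64 targets
}

#[cfg(target_arch = "aarch64")]
fn embedding_backend() -> &'static str {
    "bm25" // ARM64 targets fall back to the BM25 keyword embedder
}

fn main() {
    println!("backend: {}", embedding_backend());
}
```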

File tree

11 files changed: +773 −32 lines

Cargo.lock

Lines changed: 80 additions & 0 deletions
Some generated files are not rendered by default.

crates/semantic_search_client/Cargo.toml

Lines changed: 10 additions & 4 deletions
@@ -28,21 +28,27 @@ tokio.workspace = true
 # Vector search library
 hnsw_rs = "0.3.1"
 
-# Candle dependencies for text embeddings (used on all platforms)
+# BM25 implementation - works on all platforms including ARM
+bm25 = { version = "2.2.1", features = ["language_detection"] }
+
+# Common dependencies for all platforms
+anyhow = "1.0"
+
+# Candle dependencies - not used on arm64
+[target.'cfg(not(target_arch = "aarch64"))'.dependencies]
 candle-core = { version = "0.9.1", features = [] }
 candle-nn = "0.9.1"
 candle-transformers = "0.9.1"
 tokenizers = "0.21.1"
 hf-hub = { version = "0.4.2", default-features = false, features = ["rustls-tls", "tokio", "ureq"] }
-anyhow = "1.0"
 
 # Conditionally enable Metal on macOS
-[target.'cfg(target_os = "macos")'.dependencies.candle-core]
+[target.'cfg(all(target_os = "macos", not(target_arch = "aarch64")))'.dependencies.candle-core]
 version = "0.9.1"
 features = []
 
 # Conditionally enable CUDA on Linux and Windows
-[target.'cfg(any(target_os = "linux", target_os = "windows"))'.dependencies.candle-core]
+[target.'cfg(all(any(target_os = "linux", target_os = "windows"), not(target_arch = "aarch64")))'.dependencies.candle-core]
 version = "0.9.1"
 features = []
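The `[target.'cfg(...)']` tables above make Cargo pull in the Candle dependencies only when the predicate holds for the build target. The same predicates can be evaluated from Rust with the `cfg!` macro; a standalone sketch (an assumption for checking which table applies to a given build, not part of the crate):

```rust
// Standalone sketch: evaluate the same cfg predicates that gate the
// Cargo.toml dependency tables above, to see which apply to this build.
fn main() {
    let candle = cfg!(not(target_arch = "aarch64"));
    let metal_table = cfg!(all(target_os = "macos", not(target_arch = "aarch64")));
    let cuda_table = cfg!(all(
        any(target_os = "linux", target_os = "windows"),
        not(target_arch = "aarch64")
    ));
    println!("candle deps: {candle}, Metal table: {metal_table}, CUDA table: {cuda_table}");
}
```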

crates/semantic_search_client/README.md

Lines changed: 38 additions & 9 deletions
@@ -16,6 +16,7 @@ Rust library for managing semantic memory contexts with vector embeddings, enabl
 - **Progress Tracking**: Detailed progress reporting for long-running operations
 - **Parallel Processing**: Efficiently process large directories with parallel execution
 - **Memory Efficient**: Stream large files and directories without excessive memory usage
+- **Cross-Platform Compatibility**: Fallback mechanisms for all platforms and architectures
 
 ## Installation
 

@@ -74,6 +75,9 @@ cargo test
 
 To run tests with real embedders (which will download models), set the `MEMORY_BANK_USE_REAL_EMBEDDERS` environment variable:
 
+```bash
+MEMORY_BANK_USE_REAL_EMBEDDERS=1 cargo test
+```
 
 ## Core Concepts
 

@@ -96,10 +100,25 @@ Each context contains data points, which are individual pieces of text with asso
 
 ### Embeddings
 
-Text is converted to vector embeddings using state-of-the-art embedding models:
+Text is converted to vector embeddings using different backends based on platform and architecture:
+
+- **macOS/Windows**: Uses ONNX Runtime with FastEmbed by default
+- **Linux (non-ARM)**: Uses Candle for embeddings
+- **Linux (ARM64)**: Uses BM25 keyword-based embeddings as a fallback
+
+## Embedding Backends
 
-- **macOS/Windows**: Uses both ONNX Runtime with FastEmbed and Candle
-- **Linux**: Uses Candle for embeddings
+The library supports multiple embedding backends with automatic selection based on platform compatibility:
+
+1. **ONNX**: Fastest option, available on macOS and Windows
+2. **Candle**: Good performance, used on Linux (non-ARM)
+3. **BM25**: Fallback option based on keyword matching, used on Linux ARM64
+
+The default selection logic prioritizes performance where possible:
+- macOS/Windows: ONNX is the default
+- Linux (non-ARM): Candle is the default
+- Linux ARM64: BM25 is the default
+- ARM64: BM25 is the default
 
 ## Detailed Usage
 

@@ -216,17 +235,25 @@ client.remove_context_by_path("/path/to/indexed/directory", true)?;
 The library supports different embedding backends:
 
 ```rust
-// Use Candle (works on all platforms)
+// Use ONNX (fastest, used on macOS and Windows)
+#[cfg(any(target_os = "macos", target_os = "windows"))]
+let client = SemanticSearchClient::with_embedding_type(
+    "/path/to/storage",
+    EmbeddingType::Onnx,
+)?;
+
+// Use Candle (used on Linux non-ARM)
+#[cfg(all(target_os = "linux", not(target_arch = "aarch64")))]
 let client = SemanticSearchClient::with_embedding_type(
     "/path/to/storage",
     EmbeddingType::Candle,
 )?;
 
-// Use ONNX (macOS and Windows only)
-#[cfg(any(target_os = "macos", target_os = "windows"))]
+// Use BM25 (used on Linux ARM64)
+#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
 let client = SemanticSearchClient::with_embedding_type(
     "/path/to/storage",
-    EmbeddingType::Onnx,
+    EmbeddingType::BM25,
 )?;
 ```
 

@@ -253,12 +280,14 @@ let client = SemanticSearchClient::new_with_default_dir()?;
 - **Disk Space**: Persistent contexts store both the original text and vector embeddings
 - **Embedding Speed**: The first embedding operation may be slower as models are loaded
 - **Hardware Acceleration**: On macOS, Metal is used for faster embedding generation
+- **Platform Differences**: Performance may vary based on the selected embedding backend
 
 ## Platform-Specific Features
 
-- **macOS**: Uses Metal for hardware-accelerated embeddings via Candle and ONNX Runtime
+- **macOS**: Uses Metal for hardware-accelerated embeddings via ONNX Runtime and Candle
 - **Windows**: Uses optimized CPU execution via ONNX Runtime and Candle
-- **Linux**: Uses Candle for embeddings with CPU optimization
+- **Linux (non-ARM)**: Uses Candle for embeddings
+- **Linux ARM64**: Uses BM25 keyword-based embeddings as a fallback
 
 ## Error Handling
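The default-selection rules the README lists map directly onto cfg predicates. A minimal sketch of that mapping (a hypothetical helper, not the crate's API; it covers only the three platforms named above):

```rust
// Hypothetical helper mirroring the README's default-backend table.
#[cfg(any(target_os = "macos", target_os = "windows"))]
fn default_backend() -> &'static str {
    "onnx" // macOS/Windows: ONNX Runtime with FastEmbed
}

#[cfg(all(target_os = "linux", not(target_arch = "aarch64")))]
fn default_backend() -> &'static str {
    "candle" // Linux (non-ARM): Candle
}

#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
fn default_backend() -> &'static str {
    "bm25" // Linux ARM64: BM25 keyword fallback
}

fn main() {
    println!("default embedding backend: {}", default_backend());
}
```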

crates/semantic_search_client/src/client/embedder_factory.rs

Lines changed: 7 additions & 1 deletion
@@ -1,9 +1,11 @@
+#[cfg(not(target_arch = "aarch64"))]
+use crate::embedding::CandleTextEmbedder;
 #[cfg(test)]
 use crate::embedding::MockTextEmbedder;
 #[cfg(any(target_os = "macos", target_os = "windows"))]
 use crate::embedding::TextEmbedder;
 use crate::embedding::{
-    CandleTextEmbedder,
+    BM25TextEmbedder,
     EmbeddingType,
     TextEmbedderTrait,
 };

@@ -21,8 +23,10 @@ use crate::error::Result;
 #[cfg(any(target_os = "macos", target_os = "windows"))]
 pub fn create_embedder(embedding_type: EmbeddingType) -> Result<Box<dyn TextEmbedderTrait>> {
     let embedder: Box<dyn TextEmbedderTrait> = match embedding_type {
+        #[cfg(not(target_arch = "aarch64"))]
         EmbeddingType::Candle => Box::new(CandleTextEmbedder::new()?),
         EmbeddingType::Onnx => Box::new(TextEmbedder::new()?),
+        EmbeddingType::BM25 => Box::new(BM25TextEmbedder::new()?),
         #[cfg(test)]
         EmbeddingType::Mock => Box::new(MockTextEmbedder::new(384)),
     };

@@ -43,7 +47,9 @@ pub fn create_embedder(embedding_type: EmbeddingType) -> Result<Box<dyn TextEmbe
 #[cfg(not(any(target_os = "macos", target_os = "windows")))]
 pub fn create_embedder(embedding_type: EmbeddingType) -> Result<Box<dyn TextEmbedderTrait>> {
     let embedder: Box<dyn TextEmbedderTrait> = match embedding_type {
+        #[cfg(not(target_arch = "aarch64"))]
         EmbeddingType::Candle => Box::new(CandleTextEmbedder::new()?),
+        EmbeddingType::BM25 => Box::new(BM25TextEmbedder::new()?),
         #[cfg(test)]
         EmbeddingType::Mock => Box::new(MockTextEmbedder::new(384)),
     };
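The factory relies on `#[cfg]` applied to individual match arms, so the `Candle` arm is compiled out entirely on aarch64. A self-contained sketch of that pattern (the enum here is illustrative; whether the crate also gates the `Candle` variant itself is an assumption made so the sketch compiles on every target):

```rust
// Illustrative enum; gating the variant keeps the match exhaustive on
// every target, which is one way such code can stay compilable.
#[allow(dead_code)]
enum EmbeddingType {
    #[cfg(not(target_arch = "aarch64"))]
    Candle,
    BM25,
}

fn describe(embedding_type: EmbeddingType) -> &'static str {
    match embedding_type {
        // This arm (and the variant) vanish from aarch64 builds.
        #[cfg(not(target_arch = "aarch64"))]
        EmbeddingType::Candle => "candle embedder",
        EmbeddingType::BM25 => "bm25 keyword embedder",
    }
}

fn main() {
    println!("{}", describe(EmbeddingType::BM25));
}
```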

crates/semantic_search_client/src/client/implementation.rs

Lines changed: 0 additions & 5 deletions
@@ -108,11 +108,6 @@ impl SemanticSearchClient {
             // Continue with default config if initialization fails
         }
 
-        // Create the embedder using the factory
-        #[cfg(any(target_os = "macos", target_os = "windows"))]
-        let embedder = embedder_factory::create_embedder(embedding_type)?;
-
-        #[cfg(not(any(target_os = "macos", target_os = "windows")))]
         let embedder = embedder_factory::create_embedder(embedding_type)?;
 
         // Load metadata for persistent contexts
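This deletion works because both cfg variants of `create_embedder` expose the same signature, so the call site no longer needs its own gating. A minimal sketch of that idiom (names shortened; not the crate's code):

```rust
// Two cfg-gated definitions with one shared signature: exactly one is
// compiled per target, so the call site stays unconditional.
#[cfg(any(target_os = "macos", target_os = "windows"))]
fn create_embedder() -> &'static str {
    "factory with ONNX support"
}

#[cfg(not(any(target_os = "macos", target_os = "windows")))]
fn create_embedder() -> &'static str {
    "factory with Candle/BM25 support"
}

fn main() {
    let embedder = create_embedder(); // no #[cfg] needed here
    println!("{embedder}");
}
```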

crates/semantic_search_client/src/embedding/benchmark_test.rs

Lines changed: 40 additions & 9 deletions
@@ -5,15 +5,18 @@
 
 use std::env;
 
+#[cfg(any(target_os = "macos", target_os = "windows"))]
+use crate::embedding::TextEmbedder;
+#[cfg(any(target_os = "macos", target_os = "windows"))]
+use crate::embedding::onnx_models::OnnxModelType;
 use crate::embedding::{
-    CandleTextEmbedder,
-    ModelType,
+    BM25TextEmbedder,
     run_standard_benchmark,
 };
-#[cfg(any(target_os = "macos", target_os = "windows"))]
+#[cfg(not(target_arch = "aarch64"))]
 use crate::embedding::{
-    TextEmbedder,
-    onnx_models::OnnxModelType,
+    CandleTextEmbedder,
+    ModelType,
 };
 
 /// Helper function to check if real embedder tests should be skipped

@@ -34,6 +37,7 @@ fn should_skip_real_embedder_tests() -> bool {
 }
 
 /// Run benchmark for a Candle model
+#[cfg(not(target_arch = "aarch64"))]
 fn benchmark_candle_model(model_type: ModelType) {
     match CandleTextEmbedder::with_model_type(model_type) {
         Ok(embedder) => {

@@ -76,6 +80,27 @@ fn benchmark_onnx_model(model_type: OnnxModelType) {
     }
 }
 
+/// Run benchmark for BM25 model
+fn benchmark_bm25_model() {
+    match BM25TextEmbedder::new() {
+        Ok(embedder) => {
+            println!("Benchmarking BM25 model");
+            let results = run_standard_benchmark(&embedder);
+            println!(
+                "Model: {}, Embedding dim: {}, Single time: {:?}, Batch time: {:?}, Avg per text: {:?}",
+                results.model_name,
+                results.embedding_dim,
+                results.single_time,
+                results.batch_time,
+                results.avg_time_per_text()
+            );
+        },
+        Err(e) => {
+            println!("Failed to load BM25 model: {}", e);
+        },
+    }
+}
+
 /// Standardized benchmark test for all embedding models
 #[test]
 fn test_standard_benchmark() {

@@ -86,11 +111,17 @@ fn test_standard_benchmark() {
     println!("Running standardized benchmark tests for embedding models");
     println!("--------------------------------------------------------");
 
-    // Benchmark Candle models
-    benchmark_candle_model(ModelType::MiniLML6V2);
-    benchmark_candle_model(ModelType::MiniLML12V2);
+    // Benchmark BM25 model (available on all platforms)
+    benchmark_bm25_model();
+
+    // Benchmark Candle models (not available on arm64)
+    #[cfg(not(target_arch = "aarch64"))]
+    {
+        benchmark_candle_model(ModelType::MiniLML6V2);
+        benchmark_candle_model(ModelType::MiniLML12V2);
+    }
 
-    // Benchmark ONNX models
+    // Benchmark ONNX models (available on macOS and Windows)
     #[cfg(any(target_os = "macos", target_os = "windows"))]
     {
         benchmark_onnx_model(OnnxModelType::MiniLML6V2Q);
