Conversation

OneZero-Y
Contributor

What type of PR is this?

What's Changed

1. Simplified parallel_engine.rs Thread Handling

File: candle-binding/src/classifiers/lora/parallel_engine.rs

Before (complex manual threading):

let intent_results = Arc::new(Mutex::new(Vec::new()));
let pii_results = Arc::new(Mutex::new(Vec::new()));
let security_results = Arc::new(Mutex::new(Vec::new()));

let handles = vec![
    self.spawn_intent_task(texts_owned.clone(), Arc::clone(&intent_results)),
    self.spawn_pii_task(texts_owned.clone(), Arc::clone(&pii_results)),
    self.spawn_security_task(texts_owned.clone(), Arc::clone(&security_results)),
];

for handle in handles {
    handle.join().unwrap();
}

After (clean rayon parallelism):

use rayon::prelude::*;

let ((intent_results, pii_results), security_results) = rayon::join(
    || rayon::join(
        || self.intent_classifier.batch_classify(texts),
        || self.pii_classifier.batch_detect(texts),
    ),
    || self.security_classifier.batch_detect(texts),
);
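The destructuring above follows directly from rayon::join's return shape: each call returns a tuple of its two closures' results, so nesting one join inside another yields ((intent, pii), security). Below is a minimal, self-contained sketch of the same pattern, with placeholder closures standing in for the real classifiers (names and logic are illustrative, not from the PR):

use rayon::join;

fn main() {
    let texts = vec!["a", "bb", "ccc"];

    // join runs both closures, potentially on separate rayon worker threads,
    // and returns their results as a tuple; nesting gives ((A, B), C).
    let ((intent, pii), security) = join(
        || join(
            || texts.iter().map(|t| t.len()).collect::<Vec<_>>(), // stand-in for batch_classify
            || texts.iter().filter(|t| t.len() > 1).count(),      // stand-in for batch_detect
        ),
        || texts.iter().any(|t| t.contains('c')),                 // stand-in for the security detector
    );

    println!("{:?} {} {}", intent, pii, security);
}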

2. Parallelized Batch Processing in LoRA Classifiers

Files:

  • candle-binding/src/classifiers/lora/pii_lora.rs
  • candle-binding/src/classifiers/lora/intent_lora.rs
  • candle-binding/src/classifiers/lora/security_lora.rs

Change (1 line per file):

// Before
texts.iter().map(|text| self.detect(text)).collect()

// After  
texts.par_iter().map(|text| self.detect(text)).collect()
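Two details make this one-line swap work and are easy to miss in review: par_iter() is only available once rayon::prelude::* is in scope, and a parallel iterator of Result items can still be collected into a single Result<Vec<_>, _>, which is Err if any element fails and otherwise preserves input order. A small standalone sketch (detect here is a dummy stand-in, not the real PII detector):

use rayon::prelude::*;

// Dummy fallible per-item function standing in for self.detect(text).
fn detect(text: &str) -> Result<usize, String> {
    if text.is_empty() {
        Err("empty input".to_string())
    } else {
        Ok(text.len())
    }
}

fn main() {
    let texts = vec!["alice", "bob", "carol"];
    let results: Result<Vec<usize>, String> =
        texts.par_iter().map(|text| detect(text)).collect();
    // Prints Ok([5, 3, 5]) -- results keep the original input order
    // even though items are processed on multiple threads.
    println!("{:?}", results);
}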

3. Multi-Task Batch Classification Parallelization

File: candle-binding/src/model_architectures/lora/bert_lora.rs

Function: classify_batch_multi_task()

Before:

pub fn classify_batch_multi_task(&self, texts: &[&str]) -> Result<Vec<LoRAMultiTaskResult>> {
    // For now, process sequentially. In future, implement true batch processing
    texts.iter().map(|text| self.classify_multi_task(text)).collect()
}

After:

pub fn classify_batch_multi_task(&self, texts: &[&str]) -> Result<Vec<LoRAMultiTaskResult>> {
    texts.par_iter().map(|text| self.classify_multi_task(text)).collect()
}

4. Traditional Model Batch Forward Pass Parallelization

File: candle-binding/src/model_architectures/traditional/base_model.rs

Function: forward_batch()

Before:

pub fn forward_batch(&self, input_batch: &[Tensor], attention_batch: &[Tensor]) -> Result<Vec<Tensor>> {
    let mut results = Vec::with_capacity(input_batch.len());
    for (input_ids, attention_mask) in input_batch.iter().zip(attention_batch.iter()) {
        let output = self.forward(input_ids, attention_mask)?;
        results.push(output);
    }
    Ok(results)
}

After:

pub fn forward_batch(&self, input_batch: &[Tensor], attention_batch: &[Tensor]) -> Result<Vec<Tensor>> {
    input_batch
        .par_iter()
        .zip(attention_batch.par_iter())
        .map(|(input_ids, attention_mask)| self.forward(input_ids, attention_mask))
        .collect()
}
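One property worth noting, stated here as a hedged reading of the change since it assumes the model's forward pass is safe to call from several threads at once: zipping two indexed parallel iterators pairs elements positionally, so each input tensor still meets its own attention mask, the collected outputs come back in input order, and any per-item error surfaces as the overall Err. A sketch with plain integers standing in for Tensors:

use rayon::prelude::*;

// Dummy stand-in for self.forward(input_ids, attention_mask).
fn forward(input: &u32, mask: &u32) -> Result<u32, String> {
    Ok(input * mask)
}

fn main() {
    let inputs = vec![1u32, 2, 3, 4];
    let masks = vec![10u32, 10, 10, 10];

    let outputs: Result<Vec<u32>, String> = inputs
        .par_iter()
        .zip(masks.par_iter())
        .map(|(input, mask)| forward(input, mask))
        .collect();

    // Pairing is positional and output order matches input order.
    assert_eq!(outputs, Ok(vec![10, 20, 30, 40]));
}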

Testing

New Test

  • Added: candle-binding/src/classifiers/lora/parallel_engine_test.rs
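The contents of that test file are not reproduced in this PR description. As a hypothetical sketch of the kind of invariant worth asserting for this refactor (names and logic below are illustrative, not the actual test), the rayon-based batch path should produce the same results, in the same order, as the old sequential path:

#[cfg(test)]
mod parallel_equivalence_sketch {
    use rayon::prelude::*;

    // Dummy per-text function standing in for a classifier call.
    fn detect(text: &str) -> usize {
        text.chars().filter(|c| c.is_numeric()).count()
    }

    #[test]
    fn parallel_batch_matches_sequential_batch() {
        let texts = vec!["order 66", "no digits here", "pi is 3.14"];

        let sequential: Vec<usize> = texts.iter().map(|t| detect(t)).collect();
        let parallel: Vec<usize> = texts.par_iter().map(|t| detect(t)).collect();

        assert_eq!(sequential, parallel);
    }
}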

Which issue(s) this PR fixes:

part of #266

Release Notes: Yes/No


👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 candle-binding

Owners: @rootfs
Files changed:

  • candle-binding/src/classifiers/lora/parallel_engine_test.rs
  • candle-binding/Cargo.toml
  • candle-binding/src/classifiers/lora/intent_lora.rs
  • candle-binding/src/classifiers/lora/mod.rs
  • candle-binding/src/classifiers/lora/parallel_engine.rs
  • candle-binding/src/classifiers/lora/pii_lora.rs
  • candle-binding/src/classifiers/lora/security_lora.rs
  • candle-binding/src/model_architectures/embedding/qwen3_embedding_test.rs
  • candle-binding/src/model_architectures/lora/bert_lora.rs
  • candle-binding/src/model_architectures/traditional/base_model.rs

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@rootfs
Collaborator

rootfs commented Oct 17, 2025

@OneZero-Y thanks! I'll test this branch soonish.

rootfs merged commit e34204c into vllm-project:feat-candle-refactoring on Oct 17, 2025
3 of 4 checks passed
OneZero-Y deleted the feat/testing-for-candle-refactoring branch on October 18, 2025 06:56