Unified Rust runtime for embedding, reranking, and generation across local and remote model providers.
uni-xervo gives you one runtime and one API surface for mixed model stacks, so application code stays stable while you swap providers, models, and execution modes.
uni-xervo is built around three core ideas:
- Model aliases: your app requests models by stable names like `embed/default` or `generate/llm`.
- Provider abstraction: local and remote providers implement the same task traits.
- Runtime deduplication: equivalent model specs share one loaded instance, as sketched below.
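For example (a sketch; it assumes the optional fields shown in the catalog section below can be omitted, and the alias names are illustrative), two aliases pointing at the same provider and model resolve to a single loaded instance:

```json
[
  {
    "alias": "embed/search",
    "task": "embed",
    "provider_id": "local/candle",
    "model_id": "sentence-transformers/all-MiniLM-L6-v2"
  },
  {
    "alias": "embed/cluster",
    "task": "embed",
    "provider_id": "local/candle",
    "model_id": "sentence-transformers/all-MiniLM-L6-v2"
  }
]
```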
Core tasks:
- `embed` for vector embeddings
- `rerank` for relevance scoring
- `generate` for LLM text generation
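A minimal sketch of how the tasks surface in application code. The `embedding` and `generator` accessors appear in the examples below; the `reranker` accessor and the alias names are assumptions made for illustration:

```rust
use uni_xervo::runtime::ModelRuntime;

// Sketch: resolve one typed handle per core task by alias.
async fn task_handles(runtime: &ModelRuntime) -> Result<(), Box<dyn std::error::Error>> {
    let _embedder = runtime.embedding("embed/default").await?;  // embed
    let _reranker = runtime.reranker("rerank/default").await?;  // rerank (assumed accessor)
    let _generator = runtime.generator("generate/llm").await?;  // generate
    Ok(())
}
```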
Why uni-xervo:
- Keep product code provider-agnostic.
- Mix local and remote models in one runtime.
- Enforce config correctness with schema-backed option validation.
- Control startup behavior with lazy, eager, or background warmup (see the sketch after this list).
- Add retries/timeouts per model alias instead of hard-coding behavior.
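For instance, warmup is set per alias in the catalog. The `"lazy"` value appears in the catalog example below; the `"eager"` and `"background"` string values are assumed from the prose above:

```json
{
  "alias": "embed/default",
  "task": "embed",
  "provider_id": "local/candle",
  "model_id": "sentence-transformers/all-MiniLM-L6-v2",
  "warmup": "eager"
}
```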
Supported providers:

| Provider ID | Tasks | Cargo Feature |
|---|---|---|
| `local/candle` | embed | `provider-candle` |
| `local/fastembed` | embed | `provider-fastembed` |
| `local/mistralrs` | embed, generate | `provider-mistralrs` |
| `remote/openai` | embed, generate | `provider-openai` |
| `remote/gemini` | embed, generate | `provider-gemini` |
| `remote/vertexai` | embed, generate | `provider-vertexai` |
| `remote/mistral` | embed, generate | `provider-mistral` |
| `remote/anthropic` | generate | `provider-anthropic` |
| `remote/voyageai` | embed, rerank | `provider-voyageai` |
| `remote/cohere` | embed, rerank, generate | `provider-cohere` |
| `remote/azure-openai` | embed, generate | `provider-azure-openai` |
Use only the features you need.
```toml
[dependencies]
uni-xervo = { version = "0.1.0", default-features = false, features = ["provider-candle"] }
tokio = { version = "1", features = ["full"] }
```

Default feature set: `provider-candle`.
If you want local embeddings + OpenAI generation:

```toml
[dependencies]
uni-xervo = { version = "0.1.0", default-features = false, features = ["provider-candle", "provider-openai"] }
tokio = { version = "1", features = ["full"] }
```

GPU acceleration flag: `gpu-cuda` for CUDA-enabled builds.
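For example, a CUDA-enabled local build might combine the Candle provider with the GPU flag (a sketch using only the feature names listed above):

```toml
[dependencies]
uni-xervo = { version = "0.1.0", default-features = false, features = ["provider-candle", "gpu-cuda"] }
tokio = { version = "1", features = ["full"] }
```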
Minimal quick start: local embeddings with the Candle provider.

```rust
use uni_xervo::api::{ModelAliasSpec, ModelTask};
use uni_xervo::provider::candle::LocalCandleProvider;
use uni_xervo::runtime::ModelRuntime;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Describe one local embedding model behind a stable alias.
    let spec = ModelAliasSpec {
        alias: "embed/local".to_string(),
        task: ModelTask::Embed,
        provider_id: "local/candle".to_string(),
        model_id: "sentence-transformers/all-MiniLM-L6-v2".to_string(),
        revision: None,
        warmup: Default::default(),
        required: true,
        timeout: None,
        load_timeout: None,
        retry: None,
        options: serde_json::Value::Null,
    };

    // Register the provider, install the one-entry catalog, and build.
    let runtime = ModelRuntime::builder()
        .register_provider(LocalCandleProvider::new())
        .catalog(vec![spec])
        .build()
        .await?;

    // Resolve the alias to an embedder and embed a sample input.
    let embedder = runtime.embedding("embed/local").await?;
    let vectors = embedder.embed(vec!["hello world"]).await?;
    println!("vector dims = {}", vectors[0].len());
    Ok(())
}
```

Model catalogs are JSON arrays of `ModelAliasSpec`.
`model-catalog.json`:

```json
[
  {
    "alias": "embed/default",
    "task": "embed",
    "provider_id": "local/candle",
    "model_id": "sentence-transformers/all-MiniLM-L6-v2",
    "warmup": "lazy",
    "required": true,
    "options": null
  },
  {
    "alias": "generate/llm",
    "task": "generate",
    "provider_id": "remote/openai",
    "model_id": "gpt-4o-mini",
    "warmup": "lazy",
    "timeout": 30,
    "retry": {
      "max_attempts": 3,
      "initial_backoff_ms": 200
    },
    "options": {
      "api_key_env": "OPENAI_API_KEY"
    }
  }
]
```

Register both providers, load the catalog from file, and drive the generation alias:

```rust
use uni_xervo::provider::{LocalCandleProvider, RemoteOpenAIProvider};
use uni_xervo::runtime::ModelRuntime;
use uni_xervo::traits::GenerationOptions;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Register both providers and load the catalog from disk.
    let runtime = ModelRuntime::builder()
        .register_provider(LocalCandleProvider::new())
        .register_provider(RemoteOpenAIProvider::new())
        .catalog_from_file("model-catalog.json")?
        .build()
        .await?;

    // Resolve the generation alias and run a short chat-style exchange.
    let llm = runtime.generator("generate/llm").await?;
    let result = llm
        .generate(
            &[
                "You are a concise assistant.".to_string(),
                "Understood.".to_string(),
                "Explain what embeddings are in one paragraph.".to_string(),
            ],
            GenerationOptions {
                max_tokens: Some(200),
                temperature: Some(0.3),
                top_p: Some(0.9),
            },
        )
        .await?;
    println!("{}", result.text);
    Ok(())
}
```

Schema-backed validation:
- Catalog schema: `schemas/model-catalog.schema.json`
- Provider option schemas: `schemas/provider-options/*.schema.json`
- Unknown keys or wrong value types fail fast during runtime build/register.
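As a hypothetical illustration, an entry with a misspelled option key (`api_key_envv` is an invented typo) would fail fast when the runtime is built:

```json
{
  "alias": "generate/llm",
  "task": "generate",
  "provider_id": "remote/openai",
  "model_id": "gpt-4o-mini",
  "options": { "api_key_envv": "OPENAI_API_KEY" }
}
```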
Default remote credential env vars:
| Provider ID | Default credential env var | Extra required options |
|---|---|---|
| `remote/openai` | `OPENAI_API_KEY` | None |
| `remote/gemini` | `GEMINI_API_KEY` | None |
| `remote/vertexai` | `VERTEX_AI_TOKEN` | `project_id` option or `VERTEX_AI_PROJECT` |
| `remote/mistral` | `MISTRAL_API_KEY` | None |
| `remote/anthropic` | `ANTHROPIC_API_KEY` | None |
| `remote/voyageai` | `VOYAGE_API_KEY` | None |
| `remote/cohere` | `CO_API_KEY` | None |
| `remote/azure-openai` | `AZURE_OPENAI_API_KEY` | `resource_name` option |
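For example, before building a runtime that uses `remote/openai` and `remote/vertexai` (values are placeholders; the `gcloud` call is one common way to mint a Vertex access token):

```sh
export OPENAI_API_KEY="sk-..."
export VERTEX_AI_TOKEN="$(gcloud auth print-access-token)"
export VERTEX_AI_PROJECT="my-gcp-project"
```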
The repository includes a prefetch CLI target (`src/bin/prefetch.rs`) to pre-download local model artifacts:

```sh
cargo run --bin prefetch -- model-catalog.json --dry-run
cargo run --bin prefetch -- model-catalog.json
```

Remote providers are skipped by design because they do not cache local weights.
```sh
# Build
./scripts/build.sh

# Format + check + test
./scripts/test.sh

# Ignored integration tests (real providers)
./scripts/test-integration.sh
```

Integration tests for real providers are gated by `EXPENSIVE_TESTS=1` and relevant API credentials.
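For example (the credential value is a placeholder):

```sh
EXPENSIVE_TESTS=1 OPENAI_API_KEY="sk-..." ./scripts/test-integration.sh
```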
- Contributing guide: `CONTRIBUTING.md`
- Development guide: `DEVELOPMENT.md`
- Community guidelines: `COMMUNITY.md`
- Code of conduct: `CODE_OF_CONDUCT.md`
- Support guide: `SUPPORT.md`
- Security policy: `SECURITY.md`
- User guide: `docs/USER_GUIDE.md`
- Testing guide: `TESTING.md`
- Website docs: `website/`
Apache-2.0 (`LICENSE`).