Summary
SemanticSimilarity currently defaults to OpenAI's text-embedding-3-small. The check can be extended through BaseEmbeddingModel, but no pre-built alternatives ship with it. Add support for common embedding providers out of the box.
Motivation
Not all users have OpenAI API keys, and other evaluation tools such as Ragas already support multiple providers. Local embedding models (Sentence-Transformers) enable offline testing, which matters for CI/CD pipelines without API key management and for cost-sensitive users.
Implementation Guide
Steps
- Create pre-built embedding model implementations in libs/giskard-checks/src/giskard/checks/utils/embeddings.py
- Implement providers:
  - SentenceTransformerEmbedding: uses the sentence-transformers library (local, free)
- Document how to use the existing BaseEmbeddingModel for custom providers
- Add optional dependencies:
  [project.optional-dependencies]
  local-embeddings = ["sentence-transformers>=2.0,<4"]
- Update set_default_embedding_model() documentation
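As a rough illustration of the SentenceTransformerEmbedding provider named in the steps above, here is a minimal sketch. The real BaseEmbeddingModel interface is not shown in this issue, so the stand-in base class and its `embed` method name are assumptions. Only `SentenceTransformer(model_name)` and its `encode()` call come from the sentence-transformers library. The import is deferred so the package stays importable when the optional extra is not installed.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Sequence


class BaseEmbeddingModel(ABC):
    """Stand-in for the real base class; the `embed` method name is an assumption."""

    @abstractmethod
    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        ...


class SentenceTransformerEmbedding(BaseEmbeddingModel):
    """Local embedding provider backed by sentence-transformers (sketch)."""

    def __init__(self, model_name: str = "all-MiniLM-L6-v2") -> None:
        self.model_name = model_name
        self._model: Any = None  # loaded lazily on first use

    def _load(self) -> Any:
        if self._model is None:
            try:
                # Deferred import: only needed when this provider is actually used.
                from sentence_transformers import SentenceTransformer
            except ImportError as exc:
                raise ImportError(
                    "sentence-transformers is not installed; "
                    "install the 'local-embeddings' extra to use this provider."
                ) from exc
            self._model = SentenceTransformer(self.model_name)
        return self._model

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        # encode() returns one vector per input text; convert to plain floats.
        vectors = self._load().encode(list(texts))
        return [[float(x) for x in row] for row in vectors]
```

Loading the model lazily keeps construction cheap and moves the missing-dependency error to the first call, where the message can point at the optional extra.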
Example usage
from giskard.checks import SemanticSimilarity, set_default_embedding_model
from giskard.checks.utils.embeddings import SentenceTransformerEmbedding
# Use local embeddings (no API key needed)
set_default_embedding_model(SentenceTransformerEmbedding("all-MiniLM-L6-v2"))
check = SemanticSimilarity(reference="Hello world", threshold=0.8)
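For intuition about the threshold argument: similarity checks of this kind commonly score two embedding vectors by cosine similarity and pass when the score meets the threshold. Whether SemanticSimilarity uses exactly this metric is an assumption, and the helper names below are hypothetical, not part of giskard-checks.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def passes_threshold(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.8
) -> bool:
    """Hypothetical pass/fail rule: score must reach the threshold."""
    return cosine_similarity(a, b) >= threshold
```

Under this reading, swapping the embedding model changes the vectors and therefore the score distribution, so a threshold tuned for one model may need revisiting for another.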
Acceptance Criteria