Add default Rhesis embedding model and standardize terminology by EmanueleDeRossi1 · Pull Request #1355 · rhesis-ai/rhesis

EmanueleDeRossi1 · 2026-02-16T10:44:35Z

Summary

This PR replaces the inconsistent "LLM/llm" naming across the codebase with explicit "language model" and "embedding model" terminology, and adds a default Rhesis-hosted embedding model.

Key Changes

1. Terminology Standardization

Renamed get_model() → get_language_model() in SDK for clarity
Changed model_type enum value from llm → language
Renamed DEFAULT_GENERATION_MODEL → DEFAULT_LANGUAGE_MODEL_PROVIDER everywhere it is used
Consistent use of "language model" and "embedding model" throughout codebase
Updated 73 files across backend, frontend, SDK, and tests

2. Default Rhesis Embedding Model

Added RhesisEmbedder class in SDK with full implementation
New endpoint: /generate/embedding (same behavior as /generate/content)
Registered "rhesis" provider in embedding model factory
Changed default embedding provider from OpenAI to Rhesis
Backend now creates a default Rhesis embedding model during organization initialization
Stores both language_model_id and embedding_model_id in user settings
Uses RHESIS_API_KEY and RHESIS_BASE_URL environment variables
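
The provider registration and default selection listed above can be pictured roughly as follows. This is a minimal sketch rather than the SDK's actual code: the stub embedder classes and the simplified `get_embedding_model` signature are assumptions made for illustration. Only the names `RhesisEmbedder`, the "rhesis" provider key, `EMBEDDING_MODEL_REGISTRY`, and `DEFAULT_EMBEDDING_MODEL_PROVIDER` are taken from this PR.

```python
from typing import Callable, Dict, Optional


class RhesisEmbedder:
    """Stand-in for the SDK class (assumption: the real one calls the Rhesis API)."""

    def __init__(self, model_name: Optional[str] = None) -> None:
        self.model_name = model_name


class OpenAIEmbedder:
    """Stand-in for the previous default provider."""

    def __init__(self, model_name: Optional[str] = None) -> None:
        self.model_name = model_name


# Simplified registry: provider key -> constructor.
EMBEDDING_MODEL_REGISTRY: Dict[str, Callable[..., object]] = {
    "openai": OpenAIEmbedder,
    "rhesis": RhesisEmbedder,
}

# The PR changes this default from "openai" to "rhesis".
DEFAULT_EMBEDDING_MODEL_PROVIDER = "rhesis"


def get_embedding_model(provider: Optional[str] = None, model_name: Optional[str] = None):
    """Return an embedder for `provider`, falling back to the default provider."""
    key = provider or DEFAULT_EMBEDDING_MODEL_PROVIDER
    try:
        factory = EMBEDDING_MODEL_REGISTRY[key]
    except KeyError:
        raise ValueError(f"Unknown embedding provider: {key!r}") from None
    return factory(model_name=model_name)
```

With this shape, callers that pass no provider silently move from OpenAI to Rhesis, which is why tests that assert on the default path need updating.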

3. Database Migration

Added migration to update existing model_type values from "llm" to "language"
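
The core of such a data migration is a single UPDATE statement. The sketch below demonstrates it against an in-memory SQLite database so it can be run anywhere. The table name `model` and its columns are assumptions, and the real change is an Alembic revision against the project's database. If `model_type` is a database-level enum there, additional steps to alter the enum type would likely be needed.

```python
import sqlite3


def migrate_model_type(conn: sqlite3.Connection) -> int:
    """Rewrite legacy 'llm' rows to 'language'; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE model SET model_type = 'language' WHERE model_type = 'llm'"
    )
    conn.commit()
    return cur.rowcount


# Hypothetical table layout, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT, model_type TEXT)")
conn.executemany(
    "INSERT INTO model (name, model_type) VALUES (?, ?)",
    [("a", "llm"), ("b", "embedding"), ("c", "llm")],
)

changed_first = migrate_model_type(conn)
changed_again = migrate_model_type(conn)  # second run finds nothing left to change
```

Because the WHERE clause only matches the legacy value, running the statement twice is harmless.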

❗ Breaking Changes

Environment variable names change:

DEFAULT_GENERATION_MODEL → DEFAULT_LANGUAGE_MODEL_PROVIDER
DEFAULT_MODEL_NAME → DEFAULT_LANGUAGE_MODEL_NAME

New variables to add:

DEFAULT_EMBEDDING_MODEL_PROVIDER
DEFAULT_EMBEDDING_MODEL_NAME
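
For deployments, a small pre-flight check can flag environments that still use the old names. The helper below is a hypothetical sketch and not part of this PR. It treats the two new embedding variables as required, which is an assumption: the description does not say whether the backend falls back to built-in defaults when they are unset.

```python
from typing import List, Mapping

# Old name -> new name, as listed under Breaking Changes.
RENAMED = {
    "DEFAULT_GENERATION_MODEL": "DEFAULT_LANGUAGE_MODEL_PROVIDER",
    "DEFAULT_MODEL_NAME": "DEFAULT_LANGUAGE_MODEL_NAME",
}
NEW_VARS = ["DEFAULT_EMBEDDING_MODEL_PROVIDER", "DEFAULT_EMBEDDING_MODEL_NAME"]


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in an environment mapping."""
    problems = []
    for old, new in RENAMED.items():
        if old in env and new not in env:
            problems.append(f"{old} is set but {new} is not; rename it")
    for name in NEW_VARS:
        if name not in env:
            problems.append(f"{name} is not set")
    return problems
```

Calling `check_env(os.environ)` in a deploy script would list every variable that still needs attention.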

peqy

Critical issues to address before merge

Backend model_type values are inconsistent ("language" vs "language_model") across DB enum/migration, API schemas, and connection testing. This will cause 422s and/or make model connection tests fail.
SDK factory removed DEFAULT_PROVIDER but tests (and likely external users) still import it.
SDK embedding factory default provider changed to rhesis, but tests still assert OpenAI is used on the default path.
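
The first issue comes down to two layers disagreeing on the allowed strings. One way to avoid that class of bug is a single shared enum that every layer validates against, sketched below. The class name `ModelType`, the helper `parse_model_type`, and the "embedding" value are assumptions for illustration; only the "language" value is confirmed by this PR.

```python
from enum import Enum


class ModelType(str, Enum):
    """Single source of truth for model_type values (hypothetical)."""

    LANGUAGE = "language"
    EMBEDDING = "embedding"  # assumed value; not confirmed in this PR


def parse_model_type(raw: str) -> ModelType:
    """Validate a raw string; reject anything outside the enum."""
    try:
        return ModelType(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in ModelType)
        raise ValueError(
            f"invalid model type {raw!r}; expected one of: {allowed}"
        ) from None
```

Under this scheme a payload carrying "language_model" or the legacy "llm" is rejected at the boundary, which is the failure mode the review describes as a 422.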

Found 6 issues (6 critical).

apps/backend/src/rhesis/backend/app/schemas/model.py

apps/backend/src/rhesis/backend/app/routers/model.py

apps/backend/src/rhesis/backend/app/services/model_connection.py

sdk/src/rhesis/sdk/models/factory.py

tests/sdk/models/test_model_factory.py

tests/sdk/models/test_embedder_factory.py

peqy · 2026-02-16T11:47:20Z

@EmanueleDeRossi1 yep, confirmed — the backend API schema/routers now consistently use model_type="language" (instead of language_model), which should unblock the earlier 422/“invalid model type” issues.

Remaining blockers from my earlier notes still look outstanding in this draft:

sdk.models.factory still doesn’t define DEFAULT_PROVIDER (tests/imports still reference it)
get_embedding_model() now defaults to rhesis, but the default-path unit tests still patch/assert OpenAIEmbedder (and will fail unless updated/mocked)

peqy · 2026-02-16T11:51:33Z

@EmanueleDeRossi1 confirmed — the backend is now consistently using model_type="language" (schema + router fallback + connection service), so that earlier mismatch should be resolved.

The other two blockers I flagged still remain (missing DEFAULT_PROVIDER backward-compat alias in sdk.models.factory, and the embedding-factory tests still expecting OpenAI on the default path).

peqy · 2026-02-16T11:53:24Z

@EmanueleDeRossi1 confirmed — with 561d187 the model_type handling in ModelConnectionService is now consistently "language", so the earlier “invalid model type” path should be resolved.

The two remaining draft blockers from my earlier notes still apply: sdk.models.factory missing the DEFAULT_PROVIDER backward-compat alias, and the embedding factory tests still expecting OpenAI on the default path even though get_embedding_model() now defaults to rhesis.

peqy · 2026-02-16T12:02:27Z

@EmanueleDeRossi1 confirmed — after 46fed56 I don’t see any remaining DEFAULT_PROVIDER imports/usages in the repo, so this is resolved on our side.

One related follow-up: get_embedding_model() still documents the default as OpenAI and tests/sdk/models/test_embedder_factory.py::test_get_embedding_model_minimal_defaults still patches OpenAIEmbedder, but the default provider is now rhesis (DEFAULT_EMBEDDING_MODEL_PROVIDER). Likely needs aligning to avoid CI failures.
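
A default-path test that follows the new default could look roughly like this. It is a self-contained sketch: the tiny factory and registry defined here are stand-ins, and the real test in tests/sdk/models/test_embedder_factory.py would patch the SDK's own objects instead.

```python
from unittest.mock import MagicMock, patch

# Stand-ins for the SDK's factory module (assumptions for illustration).
DEFAULT_EMBEDDING_MODEL_PROVIDER = "rhesis"
EMBEDDING_MODEL_REGISTRY = {"openai": object, "rhesis": object}


def get_embedding_model(provider=None):
    """Look up the provider's constructor in the registry and call it."""
    return EMBEDDING_MODEL_REGISTRY[provider or DEFAULT_EMBEDDING_MODEL_PROVIDER]()


def test_get_embedding_model_minimal_defaults():
    fake_rhesis = MagicMock(name="RhesisEmbedder")
    fake_openai = MagicMock(name="OpenAIEmbedder")
    # Swap both constructors for mocks only for the duration of the call.
    with patch.dict(
        EMBEDDING_MODEL_REGISTRY, {"rhesis": fake_rhesis, "openai": fake_openai}
    ):
        result = get_embedding_model()
    # The default path should build the Rhesis embedder and leave OpenAI untouched.
    fake_rhesis.assert_called_once_with()
    fake_openai.assert_not_called()
    assert result is fake_rhesis.return_value
```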

- Rename get_model() → get_language_model() and get_embedder() → get_embedding_model()
- Rename ModelConfig → LanguageModelConfig and EmbedderConfig → EmbeddingModelConfig
- Keep deprecated aliases for backward compatibility
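
A deprecated alias of the kind this commit mentions is commonly written as a thin wrapper that warns and delegates. The following is a generic sketch of that pattern, not the SDK's code: the body of `get_language_model` is a placeholder that returns a plain dict, and its parameters are assumptions.

```python
import warnings
from typing import Optional


def get_language_model(provider: Optional[str] = None, model_name: Optional[str] = None) -> dict:
    """Placeholder for the renamed factory; returns a dict for illustration."""
    return {"provider": provider, "model_name": model_name}


def get_model(*args, **kwargs) -> dict:
    """Deprecated alias kept so existing callers keep working."""
    warnings.warn(
        "get_model() is deprecated; use get_language_model() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return get_language_model(*args, **kwargs)
```

The same approach applies to constants: keeping the old name bound to the new value lets existing imports keep working for a release or two.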

Add Rhesis as the default embedding model provider, following the same pattern as the language model:

Backend changes:
- Update constants to use consistent naming (DEFAULT_EMBEDDING_MODEL_PROVIDER)
- Create default Rhesis embedding model during organization initialization
- Store both language_model_id and embedding_model_id in user settings
- Update generate/embedding endpoint to use new constants

SDK changes:
- Implement complete RhesisEmbedder class with generate() and generate_batch()
- Add factory function for Rhesis embedding model
- Register "rhesis" provider in EMBEDDING_MODEL_REGISTRY
- Update DEFAULT_EMBEDDING_MODEL_PROVIDER from "openai" to "rhesis"

This enables users to use Rhesis-hosted embeddings by default while still allowing custom embedding model configuration.
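
To make the generate() / generate_batch() split concrete, here is an illustrative embedder built on the standard library. It is deliberately named `SketchEmbedder` because it is not the SDK's `RhesisEmbedder`. The request payload (`{"text": ...}`), the response field (`"embedding"`), and the Bearer authorization header are all assumptions, since the PR does not show the wire format. Only the endpoint path and the two environment variable names come from the description.

```python
import json
import os
import urllib.request
from typing import Callable, Dict, List, Optional

PostFn = Callable[[str, Dict[str, str], dict], dict]


def _http_post(url: str, headers: Dict[str, str], payload: dict) -> dict:
    """POST JSON and decode the JSON response using only the standard library."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


class SketchEmbedder:
    """Illustrative embedder; NOT the SDK's RhesisEmbedder."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        base_url: Optional[str] = None,
        post: PostFn = _http_post,  # injectable so tests need no network
    ) -> None:
        self.api_key = api_key or os.environ.get("RHESIS_API_KEY", "")
        self.base_url = (base_url or os.environ.get("RHESIS_BASE_URL", "")).rstrip("/")
        self._post = post

    def generate(self, text: str) -> List[float]:
        """Embed one text. Assumed shape: {"text": ...} -> {"embedding": [...]}."""
        body = self._post(
            f"{self.base_url}/generate/embedding",
            {"Authorization": f"Bearer {self.api_key}"},  # assumed auth scheme
            {"text": text},
        )
        return list(body["embedding"])

    def generate_batch(self, texts: List[str]) -> List[List[float]]:
        """Embed several texts with the simplest strategy: one request per text."""
        return [self.generate(text) for text in texts]
```

Injecting the `post` callable keeps the class testable offline; a real implementation might instead send the whole batch in one request if the endpoint supports it.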

EmanueleDeRossi1 changed the title ~~refactor: standardize model terminology and add default Rhesis embedding model~~ Add default Rhesis embedding model and standardize terminology Feb 16, 2026

peqy bot suggested changes Feb 16, 2026

EmanueleDeRossi1 marked this pull request as draft February 16, 2026 11:30

EmanueleDeRossi1 added this to the Release 16 (26.02.2026) milestone Feb 16, 2026

EmanueleDeRossi1 added 16 commits February 16, 2026 13:39

feat: consistently use 'language model' instead of 'llm model'

ebec42f

feat: use consistent naming for language and embedding model in tests and backend

f6420b7

feat(sdk): get_model to get_language_model for clarity

23e3f1e

- Renamed get_model() to get_language_model() across SDK
- Renamed DEFAULT_MODEL_NAME to DEFAULT_LANGUAGE_MODEL_NAME in all providers
- Renamed PROVIDER_REGISTRY to LANGUAGE_MODEL_PROVIDER_REGISTRY

feat(test): update tests for renaming in SDK

cfc4462

refactor: rename model config vars for clarity

3723754

Rename DEFAULT_GENERATION_MODEL → DEFAULT_LANGUAGE_MODEL_PROVIDER and DEFAULT_MODEL_NAME → DEFAULT_LANGUAGE_MODEL_NAME across all services

rename model_type to purpose

62af900

- rename model_type to purpose in _get_user_model and related functions to avoid confusion between model_type terminology (which refers to whether model is either language/embedding model)

refactor(frontend): change model_type from llm -> language

1ccaa23

refactor(sdk): change llm to language in model_type param

7fef3cf

change from 'language_model' to 'model' in schemas, routers and model connection

3f0ae62

fix: import in tests and remove unused aliases

5d96920

- use correct import (DEFAULT_LANGUAGE_MODEL_PROVIDER) in tests
- remove unused aliases (DEFAULT_MODELS, DEFAULT_PROVIDER)

fix(test): use rhesis default embedding model

f883d1d

fix: import

dbfc973

fix(test): 'get_model'-> 'get_language_model'

c633dbc

fix(alembic): resolve head conflict and add hash-like id to alembic filename

d410840

EmanueleDeRossi1 force-pushed the feature/user-default-embedding-model branch from e87e8d9 to d410840 February 16, 2026 12:47

EmanueleDeRossi1 added 3 commits February 16, 2026 14:11

fix: mock using get_language_model

a9e219b

style: reformat imports

565cf00

docs: update documentation with new get_language_model and get_embedding_model

a59a6f0

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 13:24 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 13:27 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 13:28 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 13:32 — with GitHub Actions Inactive

feat(frontend): add rhesis default embedding model

8424e88

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 14:11 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 14:14 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 14:15 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 14:18 — with GitHub Actions Inactive

EmanueleDeRossi1 added 3 commits February 16, 2026 15:40

fix: add DEFAULT_EMBEDDING_MODEL_PROVIDER and DEFAULT_EMBEDDING_MODEL_NAME in infrastructure, docs and github workflow files

696bcef

feat: add new migration to add defaul Rhesis embedding model to all existing organizations

d0bbe41

fix(test): add missing description parameter

b21228b

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 15:50 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 15:52 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 15:53 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 15:54 — with GitHub Actions Inactive

EmanueleDeRossi1 temporarily deployed to dev February 16, 2026 15:57 — with GitHub Actions Inactive

EmanueleDeRossi1 wants to merge 23 commits into main from feature/user-default-embedding-model
