feat: Enable dimension passing to litellm #1942
Conversation
Please make sure all the checkboxes are checked:
Walkthrough
Adds optional …

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py (1)
89-107: Consider documenting the dimensions parameter behavior. The docstring comprehensively covers error handling and context window behavior, but doesn't mention that the embedding call optionally includes the dimensions parameter when configured. Consider adding a brief note about this behavior for completeness.
📝 Suggested docstring enhancement
```diff
 async def embed_text(self, text: List[str]) -> List[List[float]]:
     """
     Embed a list of text strings into vector representations.

     If the input exceeds the model's context window, the method will recursively
     split the input and combine the results. It handles both mock and live embedding
     scenarios, logging errors for any encountered exceptions, and raising specific
     exceptions for context window issues and embedding failures.
+
+    When dimensions is configured, it is passed to the embedding API (for models that
+    support it) to control the output vector dimensionality.

     Parameters:
     -----------
```

As per coding guidelines: undocumented function definitions are assumed incomplete.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code
Files:
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
⚙️ CodeRabbit configuration file
**/*.py: When reviewing Python code for this project:
- Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
- As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
- As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
- As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
- As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.
Files:
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
cognee/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use shared logging utilities from cognee.shared.logging_utils in Python code
Files:
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
cognee/{modules,infrastructure,tasks}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Co-locate feature-specific helpers under their respective package (modules/, infrastructure/, or tasks/)
Files:
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: End-to-End Tests / Test multi tenancy with different situations in Cognee
- GitHub Check: End-to-End Tests / Test dataset database deletion in Cognee
- GitHub Check: End-to-End Tests / Conversation sessions test (FS)
- GitHub Check: End-to-End Tests / Test Cognify - Edge Centered Payload
- GitHub Check: End-to-End Tests / Test Feedback Enrichment
- GitHub Check: End-to-End Tests / Conversation sessions test (Redis)
- GitHub Check: End-to-End Tests / Concurrent Subprocess access test
- GitHub Check: End-to-End Tests / Test dataset database handlers in Cognee
- GitHub Check: End-to-End Tests / Test Entity Extraction
- GitHub Check: End-to-End Tests / Test graph edge ingestion
- GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
- GitHub Check: End-to-End Tests / Test adding of label for data in Cognee
- GitHub Check: End-to-End Tests / Test using different async databases in parallel in Cognee
- GitHub Check: End-to-End Tests / Server Start Test
- GitHub Check: End-to-End Tests / S3 Bucket Test
- GitHub Check: End-to-End Tests / Deduplication Test
- GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
- GitHub Check: Basic Tests / Run Integration Tests
- GitHub Check: Basic Tests / Run Simple Examples
- GitHub Check: Basic Tests / Run Unit Tests
- GitHub Check: CLI Tests / CLI Functionality Tests
- GitHub Check: CLI Tests / CLI Integration Tests
```python
embedding_kwargs = {
    "model": self.model,
    "input": text,
    "api_key": self.api_key,
    "api_base": self.endpoint,
    "api_version": self.api_version,
}

# Pass through target embedding dimensions when supported
if self.dimensions is not None:
    embedding_kwargs["dimensions"] = self.dimensions

response = await litellm.aembedding(**embedding_kwargs)
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's check the file content to understand the context
cat -n cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py | head -150
```

Repository: topoteretes/cognee
Length of output: 6543
🏁 Script executed:
```shell
# Check if litellm is available and get version info
pip show litellm 2>/dev/null | grep -E "Version|Name"
```

Repository: topoteretes/cognee
Length of output: 148
🏁 Script executed:
```shell
# Check the aembedding function signature if available
python3 << 'EOF'
try:
    import inspect
    import litellm
    sig = inspect.signature(litellm.aembedding)
    print("aembedding signature:")
    print(sig)
    print("\nParameters:")
    for param_name, param in sig.parameters.items():
        print(f"  {param_name}: {param.annotation if param.annotation != inspect.Parameter.empty else 'no annotation'}")
except Exception as e:
    print(f"Error: {e}")
EOF
```

Repository: topoteretes/cognee
Length of output: 148
🌐 Web query:
litellm aembedding dimensions parameter API documentation
💡 Result:
The LiteLLM embedding API (embedding / aembedding) accepts an optional integer parameter dimensions which requests the output embedding dimensionality. It’s documented as an optional field (used with OpenAI/Azure text-embedding-3+ models and some provider-specific models) and appears in the aembedding examples. See LiteLLM docs for details and supported-model notes. [1][2]
Sources
[1] LiteLLM aembedding / embedding docs (parameters list, includes dimensions).
[2] LiteLLM “supported embedding” / provider notes (dimensions usage and examples).
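For illustration, a minimal sketch of such a call, assuming litellm is installed, network access, and an OpenAI API key in the environment (the model name and dimension value here are example choices, not this PR's configuration):

```python
import asyncio

import litellm


async def main():
    # Request 512-dimensional vectors from a model that honors the optional
    # dimensions parameter (e.g. OpenAI's text-embedding-3-small).
    response = await litellm.aembedding(
        model="openai/text-embedding-3-small",
        input=["hello world"],
        dimensions=512,
    )
    print(len(response.data[0]["embedding"]))  # expected: 512


asyncio.run(main())
```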
🌐 Web query:
OpenAI embedding models dimensions parameter support text-embedding-3
💡 Result:
Yes — the text-embedding-3 models support a dimensions parameter. Summary:
- text-embedding-3-large: default 3072, can be shortened to 1024 or 256 (supports 3072, 1024, 256). [1][2]
- text-embedding-3-small: default 1536, can be returned at 512 or 1536 (supports 1536, 512). [1][3]
References:
- OpenAI model page (text-embedding-3-large / embeddings). [1]
- OpenAI announcement describing the dimensions parameter and shortening examples. [2]
- Weaviate docs listing available dimensions per model. [3]
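For comparison, a minimal sketch of the same shortening done directly with the OpenAI Python SDK (assumes openai>=1.0 and OPENAI_API_KEY set; 256 is one of the documented values above):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask text-embedding-3-large for 256-dimensional vectors instead of the
# default 3072; the API shortens the embedding server-side.
resp = client.embeddings.create(
    model="text-embedding-3-large",
    input="hello world",
    dimensions=256,
)
print(len(resp.data[0].embedding))  # expected: 256
```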
Add docstring documentation for the dimensions parameter.
The implementation correctly constructs embedding kwargs and conditionally passes the dimensions parameter to litellm.aembedding. LiteLLM's API supports the dimensions parameter (confirmed for version 1.76.0+), and the default model (openai/text-embedding-3-large) accepts it with valid values of 3072, 1024, or 256.
However, the class docstring (lines 35-43) does not document the dimensions parameter or its behavior. Update the docstring to explain the following (a sketch appears after this list):
- That dimensions controls output embedding dimensionality
- That it's passed through to the underlying embedding API when supported by the model
- The valid range or values for the default model
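A minimal sketch of what that docstring addition could look like, assuming the class name matches the file and the default model noted above (the wording is illustrative, not the project's final text):

```python
class LiteLLMEmbeddingEngine:
    """
    Embedding engine backed by litellm.aembedding.

    dimensions (Optional[int]): Target output embedding dimensionality. When set,
    the value is passed through to the underlying embedding API for models that
    support it (requires LiteLLM 1.76.0+); when None, the model default is used.
    The default model, openai/text-embedding-3-large, accepts 3072, 1024, or 256.
    """
```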
🤖 Prompt for AI Agents
In cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
around lines 35 to 43 and implementation at 114-125, the class docstring is
missing documentation for the dimensions parameter; update the class docstring
to state that dimensions is an optional int controlling output embedding
dimensionality, that when set it is passed through to the underlying LiteLLM
embedding API (litellm.aembedding) only if the model supports it, and document
the valid values for the default model (openai/text-embedding-3-large accepts
3072, 1024, or 256). Also note the default behavior when dimensions is None (use
model default) and mention LiteLLM version requirement (1.76.0+) for dimensions
support.
Resolve conflict in LiteLLMEmbeddingEngine.py by combining:
- Dimensions parameter support from COG-2988
- Timeout wrapper from dev for preventing indefinite hangs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py`:
- Around line 96-99: embed_text currently treats any non-list as a scalar and
wraps it into [text], causing tuples/other iterables to become a single prompt
and break JSON serialization; change the normalization so only str is treated as
a scalar (if isinstance(text, str): text = [text]) and for other iterable inputs
convert to a list via text = list(text), and update the embed_text type
hint/docstring to indicate it accepts Sequence[str] (or Iterable[str]) rather
than only list.
```python
# Handle case when a single string is passed instead of a list
if not isinstance(text, list):
    text = [text]
```
Fix regression for non-list iterables in embed_text.
Lines 96–99 wrap any non-list input into a single-element list, so tuples and other iterables now become a single prompt instead of multiple prompts; this is a behavior change and can send a tuple into JSON serialization. Treat only str as a scalar, normalize other iterables via list(...), and update the type hint/docstring accordingly.
🔧 Proposed fix

```diff
-from typing import List, Optional
+from typing import List, Optional, Sequence, Union
@@
-    async def embed_text(self, text: List[str]) -> List[List[float]]:
+    async def embed_text(self, text: Union[str, Sequence[str]]) -> List[List[float]]:
@@
-        - text (List[str]): A list of text prompts for which to generate embeddings.
+        - text (Union[str, Sequence[str]]): A single prompt or a list/sequence of prompts.
@@
-        if not isinstance(text, list):
-            text = [text]
+        if isinstance(text, str):
+            text = [text]
+        else:
+            text = list(text)
```

Docs: https://docs.python.org/3/library/typing.html#typing.Sequence
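For illustration, a standalone sketch of the proposed normalization (normalize_prompts is a hypothetical helper name, not code from the PR):

```python
from typing import List, Sequence, Union


def normalize_prompts(text: Union[str, Sequence[str]]) -> List[str]:
    # Only a plain string is treated as a single prompt; any other sequence
    # (list, tuple, ...) is materialized into a list of prompts.
    if isinstance(text, str):
        return [text]
    return list(text)


assert normalize_prompts("hello") == ["hello"]
assert normalize_prompts(("a", "b")) == ["a", "b"]  # tuple stays two prompts
assert normalize_prompts(["a", "b"]) == ["a", "b"]
```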
🤖 Prompt for AI Agents
In `@cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py`
around lines 96 - 99, embed_text currently treats any non-list as a scalar and
wraps it into [text], causing tuples/other iterables to become a single prompt
and break JSON serialization; change the normalization so only str is treated as
a scalar (if isinstance(text, str): text = [text]) and for other iterable inputs
convert to a list via text = list(text), and update the embed_text type
hint/docstring to indicate it accepts Sequence[str] (or Iterable[str]) rather
than only list.
Description
Acceptance Criteria
Type of Change
Screenshots/Videos (if applicable)
Pre-submission Checklist
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.
Summary by CodeRabbit
New Features
Compatibility & Bug Fixes