-
Notifications
You must be signed in to change notification settings - Fork 392
feat: specify text embedding dim #6097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Greptile OverviewGreptile SummaryThis PR implements support for specifying custom embedding dimensions for both LM Studio and Transformers text embedding providers, addressing issue #5555. Key Changes:
Issues Found:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant Provider
participant Descriptor
participant Embedder
participant API
User->>Provider: get_text_embedder(model, dimensions)
Provider->>Descriptor: Create descriptor with dimensions
Descriptor->>Descriptor: __post_init__() - validate dimensions
alt LM Studio
Descriptor->>API: Check if model in _models
alt Model supports overriding
Descriptor->>Descriptor: Set supports_overriding_dimensions=True
else Model doesn't support
Descriptor-->>User: ValueError
end
else Transformers
Descriptor->>API: AutoConfig.from_pretrained()
API-->>Descriptor: hidden_size
alt dimensions > hidden_size
Descriptor-->>User: ValueError
end
end
User->>Descriptor: get_dimensions()
alt dimensions specified
Descriptor-->>User: Return specified dimensions
else dimensions not specified
alt LM Studio
Descriptor->>API: embeddings.create("dimension probe")
API-->>Descriptor: embedding vector
Descriptor-->>User: Return detected dimensions
else Transformers
Descriptor->>API: AutoConfig.from_pretrained()
API-->>Descriptor: hidden_size
Descriptor-->>User: Return model dimensions
end
end
User->>Descriptor: instantiate()
Descriptor->>Embedder: Create embedder
alt LM Studio
Embedder->>API: Pass dimensions if supports_overriding_dimensions else omit
else Transformers
Embedder->>API: Pass truncate_dim=dimensions
end
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 files reviewed, 2 comments
| return | ||
| if self.dimensions <= 0: | ||
| raise ValueError("Embedding dimensions must be a positive integer.") | ||
| dimensions = AutoConfig.from_pretrained(self.model, trust_remote_code=True).hidden_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model download happens in __post_init__ during validation
Calling AutoConfig.from_pretrained() in __post_init__ will download the model config every time a descriptor is created, even when validation isn't needed (e.g., when dimensions is None). This causes unnecessary network calls and delays during normal usage.
Move the model config fetch to get_dimensions() method where it's already called, or cache it after first use.
Prompt To Fix With AI
This is a comment left during a code review.
Path: daft/ai/transformers/protocols/text_embedder.py
Line: 36:36
Comment:
Model download happens in `__post_init__` during validation
Calling `AutoConfig.from_pretrained()` in `__post_init__` will download the model config every time a descriptor is created, even when validation isn't needed (e.g., when `dimensions` is `None`). This causes unnecessary network calls and delays during normal usage.
Move the model config fetch to `get_dimensions()` method where it's already called, or cache it after first use.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AutoConfig.from_pretrained() only runs when dimensions is not None. The early return in __post_init__ prevents any model config lookup when no validation is needed.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6097 +/- ##
===========================================
- Coverage 72.91% 43.41% -29.50%
===========================================
Files 973 909 -64
Lines 126196 112757 -13439
===========================================
- Hits 92016 48956 -43060
- Misses 34180 63801 +29621
🚀 New features to boost your workflow:
|
|
@greptile-apps re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 files reviewed, 1 comment
Changes Made
Related Issues
Closes #5555.