-
Notifications
You must be signed in to change notification settings - Fork 167
new: use cuda if available #537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds a Device enum (CPU, CUDA, AUTO) and updates many public constructors and ONNX-related method signatures to accept Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Areas to pay extra attention:
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
fastembed/sparse/minicoil.py (1)
95-99: Excellent docstring updates!The docstring has been properly updated to reflect the new parameter type and clearly explains when
device_idsshould be used with the differentcudavalues. This level of documentation consistency should be applied to other files as well.Consider updating docstrings in the other files (
custom_text_cross_encoder.py,sparse_text_embedding.py,custom_text_embedding.py) to match this level of detail for thecudaparameter documentation.fastembed/sparse/splade_pp.py (1)
88-89: Fix incomplete docstring.The docstring for the
cudaparameter is incomplete - it says "Defaults to Device." but should specify "Defaults to Device.AUTO."- cuda (Union[bool, Device], optional): Whether to use cuda for inference. Mutually exclusive with `providers` - Defaults to Device. + cuda (Union[bool, Device], optional): Whether to use cuda for inference. Mutually exclusive with `providers` + Defaults to Device.AUTO.fastembed/late_interaction_multimodal/onnx_multimodal_model.py (1)
65-65: ```shell
#!/bin/bashLocate Device enum/class definition across codebase
rg -n "class Device" -C5
Show any Enum base classes to confirm how Device is defined
rg -n "class .*Enum" -C3 --type py
Show all occurrences of Device.AUTO to see its context
rg -n "Device.AUTO" -C3
Inspect cuda parameter usage in onnx_multimodal_model.py
rg -n "def init" -C5 fastembed/late_interaction_multimodal/onnx_multimodal_model.py
rg -n "cuda" -C3 fastembed/late_interaction_multimodal/onnx_multimodal_model.py</blockquote></details> </blockquote></details> <details> <summary>📜 Review details</summary> **Configuration used: CodeRabbit UI** **Review profile: CHILL** **Plan: Pro** <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between acec31277b78b24200ca900ba58c69d729abb663 and 1c438fccf8dd870a2a5bebef16410f48b9278af8. </details> <details> <summary>📒 Files selected for processing (23)</summary> * `fastembed/common/onnx_model.py` (2 hunks) * `fastembed/common/types.py` (2 hunks) * `fastembed/image/image_embedding.py` (2 hunks) * `fastembed/image/onnx_embedding.py` (3 hunks) * `fastembed/image/onnx_image_model.py` (3 hunks) * `fastembed/late_interaction/colbert.py` (3 hunks) * `fastembed/late_interaction/late_interaction_text_embedding.py` (2 hunks) * `fastembed/late_interaction_multimodal/colpali.py` (3 hunks) * `fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py` (2 hunks) * `fastembed/late_interaction_multimodal/onnx_multimodal_model.py` (4 hunks) * `fastembed/parallel_processor.py` (2 hunks) * `fastembed/rerank/cross_encoder/custom_text_cross_encoder.py` (2 hunks) * `fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py` (3 hunks) * `fastembed/rerank/cross_encoder/onnx_text_model.py` (4 hunks) * `fastembed/rerank/cross_encoder/text_cross_encoder.py` (2 hunks) * `fastembed/sparse/bm42.py` (3 hunks) * `fastembed/sparse/minicoil.py` (3 hunks) * `fastembed/sparse/sparse_text_embedding.py` (2 hunks) * `fastembed/sparse/splade_pp.py` (3 hunks) * `fastembed/text/custom_text_embedding.py` (3 hunks) * `fastembed/text/onnx_embedding.py` (3 hunks) * `fastembed/text/onnx_text_model.py` (3 hunks) * `fastembed/text/text_embedding.py` (2 hunks) </details> <details> <summary>⏰ Context from checks skipped due to timeout of 90000ms (15)</summary> * GitHub Check: Python 3.9.x on windows-latest test * GitHub Check: Python 3.11.x on macos-latest test * GitHub Check: Python 3.11.x on windows-latest test * GitHub Check: Python 3.10.x on windows-latest test * GitHub Check: Python 3.13.x on windows-latest test * GitHub Check: Python 3.11.x on ubuntu-latest test * GitHub Check: Python 3.13.x on macos-latest test * GitHub Check: Python 3.13.x on ubuntu-latest test * GitHub Check: Python 3.12.x on windows-latest test * GitHub Check: Python 3.10.x on macos-latest test * GitHub Check: Python 3.12.x on macos-latest test * GitHub Check: Python 3.9.x on ubuntu-latest test * GitHub Check: Python 3.10.x on ubuntu-latest test * GitHub Check: Python 3.12.x on ubuntu-latest test * GitHub Check: Python 3.9.x on macos-latest test </details> <details> <summary>🔇 Additional comments (56)</summary><blockquote> <details> <summary>fastembed/common/types.py (1)</summary> `29-32`: **LGTM! Clean Device enum implementation.** The enum correctly inherits from both `str` and `Enum`, which allows values to be used as strings while maintaining type safety. The three device options (CPU, CUDA, AUTO) provide clear and explicit device specification. </details> <details> <summary>fastembed/rerank/cross_encoder/custom_text_cross_encoder.py (2)</summary> `5-5`: **LGTM! Correct import of Device enum.** --- `18-18`: ```shell #!/bin/bash set -e # Locate the Device class definition rg -n "class Device" . # Find all references to Device.AUTO rg -n "Device.AUTO" . # Inspect usage of the `cuda` parameter in the rerank module rg -n "cuda" fastembed/rerankfastembed/sparse/sparse_text_embedding.py (2)
5-5: LGTM! Correct import of Device enum.
60-60: Consistent parameter update with other files.The cuda parameter type change maintains the same pattern as other files in the codebase. The Union type preserves backward compatibility while enabling the new Device enum functionality.
fastembed/text/custom_text_embedding.py (2)
1-1: LGTM! Correct imports for the new parameter type.Both
UnionandDeviceare properly imported to support the newcudaparameter type.Also applies to: 14-14
35-35: Consistent parameter type update.The cuda parameter follows the same Union type pattern established across the codebase, maintaining backward compatibility.
fastembed/sparse/minicoil.py (2)
13-13: LGTM! Correct import of Device enum.
75-75: Consistent parameter type update.The cuda parameter follows the established Union type pattern across the codebase.
fastembed/sparse/bm42.py (2)
65-65: LGTM! Improved device specification flexibility.The change from a boolean
cudaparameter toUnion[bool, Device]withDevice.AUTOdefault provides more explicit device control while maintaining backward compatibility. The updated docstrings correctly reflect the new parameter types and usage.Also applies to: 83-87
12-12: Verify the Device enum definition and usage.The import looks correct, but let's verify the Device enum exists with the expected values and is properly used throughout the codebase.
#!/bin/bash # Verify Device enum exists and has expected values echo "Checking Device enum definition..." ast-grep --pattern 'class Device($$$)' echo -e "\nChecking Device enum values..." rg -A 10 "class Device" --type py echo -e "\nVerifying Device.AUTO, Device.CPU, Device.CUDA usage..." rg "Device\.(AUTO|CPU|CUDA)" --type py -A 2 -B 2fastembed/text/onnx_embedding.py (2)
3-3: LGTM! Clean import of Device enum.The import follows the established pattern and properly includes the Device enum alongside other type definitions.
205-205: LGTM! Consistent implementation of device enum.The parameter type change and documentation updates are well-implemented. The docstring clearly explains when
device_idsshould be used with the new Device enum values.Also applies to: 221-225
fastembed/image/onnx_image_model.py (2)
11-11: LGTM! Proper Device enum import.The import is correctly placed and follows the established pattern for including the Device enum.
56-56: LGTM! Consistent method signature updates.Both
_load_onnx_modeland_embed_imagesmethods have been consistently updated to use the new Device enum. The type changes maintain backward compatibility while providing enhanced device specification capabilities.Also applies to: 98-98
fastembed/sparse/splade_pp.py (2)
6-6: LGTM! Correct Device enum import.The import follows the established pattern and properly includes the Device enum.
72-72: LGTM! Proper parameter type update.The
cudaparameter type change anddevice_idsdocumentation are correctly implemented and consistent with the pattern across other files.Also applies to: 91-92
fastembed/late_interaction/late_interaction_text_embedding.py (2)
5-5: LGTM! Consistent Device enum import.The import properly includes the Device enum alongside other type definitions following the established pattern.
57-57: ```shell
#!/bin/bashShow the full signature of Colbert.init to verify cuda and device_ids support
echo ">> Full Colbert.init signature and parameters:"
rg -A15 "def init" -n fastembed/late_interaction/colbert.py</details> <details> <summary>fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py (2)</summary> `5-5`: **LGTM: Import statement updated correctly.** The `Device` enum is properly imported alongside the existing `NumpyArray` type. --- `60-60`: **Verify Device enum definition and backward compatibility.** The change from `bool` to `Union[bool, Device]` with default `Device.AUTO` looks correct and maintains backward compatibility. However, please verify that the `Device` enum is properly defined and that `Device.AUTO` is an appropriate default value. ```shell #!/bin/bash # Verify Device enum definition and usage echo "Searching for Device enum definition..." ast-grep --pattern 'class Device' echo "Checking Device enum values..." rg -A 10 "class Device" --type py echo "Verifying Device.AUTO usage..." rg "Device\.AUTO" --type pyfastembed/parallel_processor.py (3)
11-11: LGTM: Union import added correctly.The
Uniontype is properly imported to support the new parameter type annotation.
13-13: LGTM: Device import added correctly.The
Deviceenum is properly imported from the common types module.
99-99: LGTM: Parameter type updated consistently.The
cudaparameter type change frombooltoUnion[bool, Device]with defaultDevice.AUTOis consistent with the broader refactoring pattern. The parameter is correctly passed to worker processes.fastembed/image/image_embedding.py (2)
4-4: LGTM: Device import added alongside existing types.The
Deviceenum is properly imported alongsideNumpyArrayfrom the common types module.
54-54: LGTM: Parameter type updated consistently.The
cudaparameter type change maintains backward compatibility while enabling the newDeviceenum usage. The parameter is correctly passed to the underlying embedding model.fastembed/rerank/cross_encoder/text_cross_encoder.py (3)
1-1: LGTM: Union import added for type annotation.The
Uniontype is properly imported to support the new parameter type annotation.
5-5: LGTM: Device import added correctly.The
Deviceenum is properly imported from the common types module.
60-60: LGTM: Parameter type updated consistently.The
cudaparameter type change frombooltoUnion[bool, Device]with defaultDevice.AUTOfollows the established pattern and is correctly passed to the underlying cross encoder model.fastembed/text/text_embedding.py (2)
5-5: LGTM: Device import added alongside existing types.The
Deviceenum is properly imported alongside other common types likeNumpyArrayandOnnxProvider.
85-85: LGTM: Parameter type updated consistently.The
cudaparameter type change maintains backward compatibility while enabling the newDeviceenum usage. The parameter is correctly passed to the underlying embedding model.fastembed/image/onnx_embedding.py (3)
4-4: LGTM: Device enum import added correctly.The import statement properly adds the
Deviceenum from the common types module.
69-69: LGTM: Parameter type and default updated appropriately.The change from
bool = FalsetoUnion[bool, Device] = Device.AUTOmaintains backward compatibility while providing more explicit device control. TheDevice.AUTOdefault is more user-friendly than requiring explicitFalse.
85-89: LGTM: Documentation updated accurately.The docstring updates correctly reflect the new parameter types and usage conditions, clearly stating when
device_idsshould be used with the variouscudavalues.fastembed/late_interaction/colbert.py (3)
7-7: LGTM: Consistent Device enum import.Import follows the same pattern as other updated files.
125-125: LGTM: Parameter signature updated consistently.The cuda parameter change maintains the same pattern as other files in this refactor.
141-145: LGTM: Documentation consistently updated.Docstrings match the pattern established in other files, correctly documenting the new parameter usage.
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (3)
7-7: LGTM: Device import added consistently.Follows the established pattern for this refactor.
84-84: LGTM: Parameter type updated consistently.Maintains the same signature pattern as other refactored classes.
100-104: LGTM: Documentation updated consistently.Docstring changes match the pattern and accurately describe the new parameter usage.
fastembed/common/onnx_model.py (3)
12-12: LGTM: Device enum import added to core module.Properly imports the Device enum for use in the core ONNX model logic.
61-61: LGTM: Core parameter signature updated.The updated signature in the core
_load_onnx_modelmethod establishes the foundation for the Device enum usage across the codebase.
66-81: LGTM: Enhanced device selection logic.The updated logic provides several improvements:
- Queries available providers once at the start for efficiency
- Clearly distinguishes explicit CUDA requests (
TrueorDevice.CUDA) from auto-detection- Provides helpful warnings for mutually exclusive parameters
- Handles
Device.AUTOmode intelligently by checking CUDA availability- Maintains backward compatibility with boolean values
The logic correctly prioritizes explicit user choices while providing smart defaults.
fastembed/late_interaction_multimodal/colpali.py (3)
8-8: LGTM: Device import added consistently.Follows the established import pattern for this refactor.
52-52: LGTM: Parameter signature updated consistently.Maintains the same pattern as all other refactored classes in the codebase.
68-72: LGTM: Documentation updated consistently.Docstring changes follow the established pattern and accurately describe the parameter usage conditions.
fastembed/text/onnx_text_model.py (3)
10-10: LGTM: Import addition is consistent.The
Deviceimport is correctly added to support the new union type parameter.
109-109: ```shell
#!/bin/bashShow the full init signature of ParallelWorkerPool in parallel_processor.py
sed -n '90,120p' fastembed/parallel_processor.py
--- `55-55`: ```shell #!/bin/bash # Display the full _load_onnx_model signature in the parent class rg -n "def _load_onnx_model" -C20 fastembed/common/onnx_model.pyfastembed/rerank/cross_encoder/onnx_text_model.py (3)
4-4: LGTM: Import additions are correct.Both
UnionandDeviceimports are properly added to support the new parameter type.Also applies to: 15-15
34-34: LGTM: Consistent parameter type update.The parameter type change is consistent with the broader refactoring and provides better device specification flexibility.
95-95: LGTM: Consistent parameter type update.The parameter type change follows the established pattern and maintains consistency across the codebase.
fastembed/late_interaction_multimodal/onnx_multimodal_model.py (4)
14-14: LGTM: Import addition is consistent.The
Deviceimport correctly supports the new union type parameter.
65-65: LGTM: Consistent parameter type update.The parameter type change maintains consistency with the broader refactoring effort.
121-121: LGTM: Consistent parameter type update.The parameter type change follows the established pattern across the codebase.
188-188: LGTM: Consistent parameter type update.The parameter type change maintains consistency with other methods in the refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
fastembed/common/onnx_model.py(2 hunks)fastembed/common/types.py(2 hunks)fastembed/image/image_embedding.py(2 hunks)fastembed/image/onnx_embedding.py(3 hunks)fastembed/image/onnx_image_model.py(3 hunks)fastembed/late_interaction/colbert.py(3 hunks)fastembed/late_interaction/late_interaction_text_embedding.py(2 hunks)fastembed/late_interaction_multimodal/colpali.py(3 hunks)fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py(2 hunks)fastembed/late_interaction_multimodal/onnx_multimodal_model.py(4 hunks)fastembed/parallel_processor.py(2 hunks)fastembed/rerank/cross_encoder/custom_text_cross_encoder.py(2 hunks)fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py(3 hunks)fastembed/rerank/cross_encoder/onnx_text_model.py(4 hunks)fastembed/rerank/cross_encoder/text_cross_encoder.py(2 hunks)fastembed/sparse/bm42.py(3 hunks)fastembed/sparse/minicoil.py(3 hunks)fastembed/sparse/sparse_text_embedding.py(2 hunks)fastembed/sparse/splade_pp.py(3 hunks)fastembed/text/custom_text_embedding.py(3 hunks)fastembed/text/onnx_embedding.py(3 hunks)fastembed/text/onnx_text_model.py(3 hunks)fastembed/text/text_embedding.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- fastembed/text/onnx_text_model.py
- fastembed/image/onnx_image_model.py
- fastembed/late_interaction_multimodal/colpali.py
- fastembed/rerank/cross_encoder/text_cross_encoder.py
- fastembed/parallel_processor.py
- fastembed/text/onnx_embedding.py
- fastembed/common/types.py
- fastembed/sparse/sparse_text_embedding.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-12T10:48:30.978Z
Learnt from: joein
Repo: qdrant/fastembed PR: 574
File: fastembed/sparse/sparse_embedding_base.py:2-2
Timestamp: 2025-11-12T10:48:30.978Z
Learning: In fastembed codebase, when using numpy NDArray types in dataclass fields, keep Union syntax instead of PEP 604 pipe operator (|) because dataclasses evaluate annotations at runtime and NDArray types don't support the __or__ operator. Add a comment explaining the constraint.
Applied to files:
fastembed/common/onnx_model.pyfastembed/image/image_embedding.pyfastembed/rerank/cross_encoder/onnx_text_cross_encoder.pyfastembed/rerank/cross_encoder/custom_text_cross_encoder.pyfastembed/text/text_embedding.pyfastembed/text/custom_text_embedding.py
🧬 Code graph analysis (15)
fastembed/sparse/minicoil.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/image/onnx_embedding.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/common/onnx_model.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/image/image_embedding.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (2)
fastembed/common/onnx_model.py (1)
OnnxOutputContext(20-23)fastembed/common/types.py (1)
Device(29-32)
fastembed/sparse/bm42.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/sparse/splade_pp.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/rerank/cross_encoder/custom_text_cross_encoder.py (2)
fastembed/common/model_description.py (1)
BaseModelDescription(24-31)fastembed/common/types.py (1)
Device(29-32)
fastembed/text/text_embedding.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/text/custom_text_embedding.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/rerank/cross_encoder/onnx_text_model.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/late_interaction_multimodal/onnx_multimodal_model.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/late_interaction/colbert.py (1)
fastembed/common/types.py (1)
Device(29-32)
fastembed/late_interaction/late_interaction_text_embedding.py (1)
fastembed/common/types.py (1)
Device(29-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Python 3.13.x on windows-latest test
- GitHub Check: Python 3.13.x on macos-latest test
- GitHub Check: Python 3.9.x on windows-latest test
- GitHub Check: Python 3.9.x on macos-latest test
🔇 Additional comments (16)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (1)
1-1: Device enum integration forcudalooks consistentImporting
Union/Device, updatingcudatoUnion[bool, Device] = Device.AUTO, and aligning the docstrings with the newcuda/device_idssemantics are all coherent and match howcudais passed into_load_onnx_modeland_rerank_pairs. No issues from this change on this path.Also applies to: 7-7, 84-85, 100-104
fastembed/text/custom_text_embedding.py (1)
1-1: CUDADevicesupport wired correctly intoCustomTextEmbeddingThe updated typing/imports and the new
cuda: Union[bool, Device] = Device.AUTOparameter are correctly forwarded toOnnxTextEmbedding, so this class stays aligned with the new device-selection API.Also applies to: 14-14, 35-35
fastembed/late_interaction/late_interaction_text_embedding.py (1)
5-5: Late-interaction wrapper correctly adoptsDevice-typedcudaImporting
Deviceand changingcudatoUnion[bool, Device] = Device.AUTOin the constructor, then passing it through to the selected embedding model, keeps this facade consistent with the rest of the API without altering other behavior.Also applies to: 57-57
fastembed/sparse/minicoil.py (1)
13-13: MiniCOIL CUDA configuration matches new Device semanticsThe switch to
cuda: Union[bool, Device] = Device.AUTOplus the updated docstrings (including valid combinations fordevice_ids) are consistent with howcuda/device_idsare later passed into_load_onnx_modeland_embed_documents. No further changes needed here.Also applies to: 82-82, 102-106
fastembed/sparse/bm42.py (1)
12-12: Bm42 now correctly exposes Device-awarecudaconfigurationThe new
cuda: Union[bool, Device] = Device.AUTOsignature and corresponding docstring updates (including allowed values for use withdevice_ids) are coherent with howcudais later consumed. Looks good.Also applies to: 73-73, 91-95
fastembed/late_interaction/colbert.py (1)
8-8: Colbert constructor correctly aligned with Device-basedcudaAPIThe import of
Device, thecuda: Union[bool, Device] = Device.AUTOsignature, and the refreshed docstrings forcuda/device_idsare all in sync with the usage inload_onnx_modelandembed/query_embed. No issues spotted.Also applies to: 146-146, 162-166
fastembed/text/text_embedding.py (1)
5-5: TextEmbedding now defaults to automatic device selectionImporting
Deviceand updatingcudatoUnion[bool, Device] = Device.AUTO, then passing it through to all registered embedding implementations, cleanly centralizes the “use CUDA if available” behavior at this facade. Callers who need CPU-only can still passcuda=FalseorDevice.CPU.Also applies to: 85-85
fastembed/common/onnx_model.py (1)
63-90: LGTM on the device selection logic.The provider selection logic correctly handles all combinations:
- Explicit CUDA request (
cuda=Trueorcuda=Device.CUDA) → CUDAExecutionProvider- Auto-detection (
cuda=Device.AUTO) with available CUDA → CUDAExecutionProvider- Explicit CPU (
cuda=Device.CPUorcuda=False) or AUTO without CUDA → CPUExecutionProviderThe backward compatibility with boolean values is preserved.
fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py (1)
54-79: LGTM!The
cudaparameter type update is consistent with the broader API changes, and the value is correctly forwarded to the underlying model implementation.fastembed/rerank/cross_encoder/custom_text_cross_encoder.py (1)
12-36: LGTM!The parameter type change aligns with the codebase-wide update, and all arguments are correctly forwarded to the parent class.
fastembed/image/onnx_embedding.py (1)
63-106: LGTM!The parameter type change, docstring updates, and forwarding logic are all correctly implemented and consistent with the PR-wide changes.
fastembed/image/image_embedding.py (1)
48-73: LGTM!The
cudaparameter type update follows the consistent pattern across the codebase, with correct forwarding to the underlying embedding model.fastembed/rerank/cross_encoder/onnx_text_model.py (2)
4-4: LGTM! Import additions support the new device typing.The addition of
UnionandDeviceimports is correct and necessary for the updated type annotations.Also applies to: 15-15
28-46: No verification issues found — parent class and ParallelWorkerPool already support Union[bool, Device].The
cuda: Union[bool, Device] = Device.AUTOparameter signature is consistent across the codebase. The parent classOnnxModel._load_onnx_model()andParallelWorkerPool.__init__()already use the same type signature, and all type handling is correct (proper enum comparisons at lines 71 and 84 in OnnxModel). No breaking boolean checks exist.fastembed/late_interaction_multimodal/onnx_multimodal_model.py (2)
14-14: LGTM! Device import added correctly.The
Deviceimport is properly added to support the updated type annotations.Unionwas already imported on line 5.
59-77: LGTM! Method signatures updated consistently.All three methods (
_load_onnx_model,_embed_documents,_embed_images) correctly update thecudaparameter toUnion[bool, Device] = Device.AUTO. The parameters are properly passed through to parent class andParallelWorkerPool.Note: Type compatibility verification requested in the previous file review applies here as well.
Also applies to: 115-170, 186-241
tbung
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs rebase and all Union removed, I didn't mark them all.
fastembed/common/onnx_model.py
Outdated
| from dataclasses import dataclass | ||
| from pathlib import Path | ||
| from typing import Any, Generic, Iterable, Optional, Sequence, Type, TypeVar | ||
| from typing import Any, Generic, Iterable, Optional, Sequence, Type, TypeVar, Union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs rebase and removal of Union and Optional
fastembed/common/onnx_model.py
Outdated
| threads: Optional[int], | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
fastembed/image/image_embedding.py
Outdated
| threads: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
fastembed/image/onnx_embedding.py
Outdated
| threads: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
fastembed/image/onnx_image_model.py
Outdated
| threads: Optional[int], | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
fastembed/image/onnx_image_model.py
Outdated
| parallel: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
| threads: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
| threads: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
| threads: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
| threads: Optional[int] = None, | ||
| providers: Optional[Sequence[OnnxProvider]] = None, | ||
| cuda: bool = False, | ||
| cuda: Union[bool, Device] = Device.AUTO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
fastembed/sparse/splade_pp.py (1)
93-94: Fix docstring: "Defaults to Device." should be "Defaults to Device.AUTO."The docstring states "Defaults to Device." but the code at line 77 shows the default is
Device.AUTO. This inconsistency was already flagged in a previous review.Apply this diff to fix the docstring:
- cuda (Union[bool, Device], optional): Whether to use cuda for inference. Mutually exclusive with `providers` - Defaults to Device. + cuda (Union[bool, Device], optional): Whether to use cuda for inference. Mutually exclusive with `providers`. + Defaults to Device.AUTO.fastembed/text/onnx_embedding.py (1)
3-241: OnnxTextEmbedding correctly centralizes Device-aware cuda (minor docstyle nit)The switch to
cuda: bool | Device = Device.AUTO, storingself.cuda, and passing it through in bothembedandload_onnx_modelis consistent and matches the intended “auto CUDA if available” behavior across the stack. TheOnnxProvider/Deviceimport consolidation also looks fine.Minor nit: the docstring still refers to
cuda (Union[bool, Device], ...)while the code usesbool | Device. Consider aligning the docstring type notation with the code style for clarity.
🧹 Nitpick comments (4)
fastembed/text/onnx_text_model.py (2)
49-68: Consider updating method docstrings.If
_load_onnx_modelhas a docstring documenting thecudaparameter, it should be updated to reflect the new type (Union[bool, Device]) and default value (Device.AUTO). Similar to the comprehensive docstring updates infastembed/image/onnx_embedding.py(lines 86-90).
103-117: Consider updating method docstrings.If
_embed_documentshas a docstring documenting thecudaparameter, it should be updated to reflect the new type (Union[bool, Device]) and default value (Device.AUTO).fastembed/image/onnx_image_model.py (2)
50-69: Consider updating method docstrings.If
_load_onnx_modelhas a docstring documenting thecudaparameter, it should be updated to reflect the new type and default value, similar to the updates infastembed/image/onnx_embedding.py.
92-106: Consider updating method docstrings.If
_embed_imageshas a docstring documenting thecudaparameter, it should be updated to reflect the new type (Union[bool, Device]) and default value (Device.AUTO).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
fastembed/common/onnx_model.py(2 hunks)fastembed/common/types.py(1 hunks)fastembed/image/image_embedding.py(2 hunks)fastembed/image/onnx_embedding.py(3 hunks)fastembed/image/onnx_image_model.py(3 hunks)fastembed/late_interaction/colbert.py(3 hunks)fastembed/late_interaction/late_interaction_text_embedding.py(2 hunks)fastembed/late_interaction_multimodal/colpali.py(3 hunks)fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py(2 hunks)fastembed/late_interaction_multimodal/onnx_multimodal_model.py(4 hunks)fastembed/parallel_processor.py(2 hunks)fastembed/rerank/cross_encoder/custom_text_cross_encoder.py(2 hunks)fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py(3 hunks)fastembed/rerank/cross_encoder/onnx_text_model.py(3 hunks)fastembed/rerank/cross_encoder/text_cross_encoder.py(2 hunks)fastembed/sparse/bm42.py(3 hunks)fastembed/sparse/minicoil.py(3 hunks)fastembed/sparse/sparse_text_embedding.py(2 hunks)fastembed/sparse/splade_pp.py(3 hunks)fastembed/text/custom_text_embedding.py(2 hunks)fastembed/text/onnx_embedding.py(3 hunks)fastembed/text/onnx_text_model.py(3 hunks)fastembed/text/text_embedding.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
- fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py
- fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py
- fastembed/image/image_embedding.py
- fastembed/common/onnx_model.py
- fastembed/rerank/cross_encoder/onnx_text_model.py
- fastembed/common/types.py
- fastembed/rerank/cross_encoder/text_cross_encoder.py
- fastembed/text/text_embedding.py
- fastembed/late_interaction/late_interaction_text_embedding.py
- fastembed/late_interaction_multimodal/onnx_multimodal_model.py
- fastembed/rerank/cross_encoder/custom_text_cross_encoder.py
🧰 Additional context used
🧬 Code graph analysis (12)
fastembed/sparse/bm42.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/parallel_processor.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/image/onnx_image_model.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/late_interaction/colbert.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/sparse/minicoil.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/text/onnx_embedding.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/text/onnx_text_model.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/sparse/sparse_text_embedding.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/sparse/splade_pp.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/image/onnx_embedding.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/late_interaction_multimodal/colpali.py (1)
fastembed/common/types.py (1)
Device(10-13)
fastembed/text/custom_text_embedding.py (1)
fastembed/common/types.py (1)
Device(10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.13.x on windows-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.13.x on macos-latest test
🔇 Additional comments (16)
fastembed/sparse/splade_pp.py (2)
6-6: LGTM!The Device import is correctly added to support the new type annotation.
77-77: LGTM!The parameter signature correctly updates the cuda parameter to accept both bool and Device types with Device.AUTO as the default, enabling automatic GPU detection.
fastembed/text/custom_text_embedding.py (1)
13-35: CustomTextEmbedding cuda signature aligns with new Device APIImporting
Deviceand wideningcudatobool | Device = Device.AUTO, then forwarding it unchanged toOnnxTextEmbedding, is consistent and preserves existing behavior for bool callers while enabling the enum. No issues spotted.fastembed/sparse/sparse_text_embedding.py (1)
5-61: SparseTextEmbedding correctly forwards Device-aware cuda to underlying modelsThe
Deviceimport andcuda: bool | Device = Device.AUTOparameter, forwarded ascuda=cudawhen instantiating registry models, look correct and keep the wrapper thin and consistent with the new device API.fastembed/sparse/minicoil.py (1)
13-121: MiniCOIL Device-aware cuda integration looks consistentThe
cuda: bool | Device = Device.AUTOparameter, updated docstring, and propagation ofself.cudaintoload_onnx_modeland_embed_documentsall look coherent and in line with the rest of the refactor. No issues found.fastembed/sparse/bm42.py (1)
12-113: Bm42 cuda parameter refactor is correctly wiredThe change to
cuda: bool | Device = Device.AUTO, along with storingself.cudaand passing it through toload_onnx_modeland_embed_documents, is consistent with the new Device enum usage. The updated docstring matches the signature. Looks good.fastembed/late_interaction/colbert.py (1)
8-184: Colbert’s Device-aware cuda parameter is consistent with other ONNX text modelsThe import of
Device, the widenedcuda: bool | Device = Device.AUTOsignature, and its propagation intoload_onnx_modeland_embed_documentsall look correct and match the pattern used in other text embeddings. No additional issues detected.fastembed/late_interaction_multimodal/colpali.py (1)
8-89: ColPali Device-aware cuda wiring for text and image embeddings looks goodUsing
cuda: bool | Device = Device.AUTO, storing it onself.cuda, and forwarding it consistently inload_onnx_model,embed_text, andembed_imagekeeps the multimodal behavior intact while enabling the new Device enum. No issues noticed.fastembed/parallel_processor.py (1)
13-111: ParallelWorkerPool's cuda default is part of a consistent internal API design – no public API impact
ParallelWorkerPoolis an internal class not exported in the public API, andWorkeris an abstract base for internal use only. The single internalWorkersubclass (Bm25Worker) receives thecudaparameter via**kwargsand forwards it correctly, making it compatible with theDevice.AUTOdefault regardless of parameter type. This change aligns with the library's systematic adoption ofDevice.AUTOacross all public embedding APIs, and poses no breaking changes to external code.Likely an incorrect or invalid review comment.
fastembed/text/onnx_text_model.py (2)
10-10: LGTM! Device import added correctly.The import statement properly includes the new Device enum type needed for the cuda parameter updates.
55-55: No issue found. The project's minimum Python version is 3.10.0 (python = ">=3.10.0"in pyproject.toml), which fully supports PEP 604 union syntax. The type annotationbool | Deviceis compatible with the project's Python version requirement.Likely an incorrect or invalid review comment.
fastembed/image/onnx_image_model.py (2)
11-11: LGTM! Device import added correctly.The import statement properly includes the new Device enum type. The past review comments suggesting
Unionare addressed by using the modern PEP 604 syntax (bool | Device).
56-56: LGTM! Type signatures updated consistently.The cuda parameter type has been updated from
booltobool | Devicewith a default ofDevice.AUTOin both_load_onnx_modeland_embed_images. The parameters are properly propagated through the call chains.Also applies to: 100-100
fastembed/image/onnx_embedding.py (3)
4-4: LGTM! Device import added correctly.The import statement properly includes the new Device enum type needed for the cuda parameter updates.
70-70: LGTM! Type signature updated consistently.The cuda parameter type has been properly updated to
bool | Devicewith a default ofDevice.AUTO, consistent with the changes across other modules.
86-90: Excellent docstring updates!The docstring updates are comprehensive and clearly document:
- The new
Union[bool, Device]type- Valid values including
Device.AUTOandDevice.CUDA- Mutual exclusivity with the
providersparameter- Integration with
device_idsparameterThis is a good model for docstring updates in the other files.
* new: use cuda if available * fix: fix warning msg * fix: add missing import
No description provided.