Skip to content

Conversation

@joein
Copy link
Member

@joein joein commented Nov 24, 2025

  • Implement
  • Add tests

@coderabbitai
Copy link

coderabbitai bot commented Nov 24, 2025

📝 Walkthrough

Walkthrough

Introduces controlled exposure and application of ONNX Runtime SessionOptions. OnnxModel gains EXPOSED_SESSION_OPTIONS, _select_exposed_session_options, and add_extra_session_options; _load_onnx_model accepts extra_session_options and applies validated options to ort.SessionOptions before creating an InferenceSession. Multiple model classes (text, image, sparse, rerank, late-interaction, multimodal) extract _extra_session_options from kwargs and thread them through load_onnx_model, embedding calls, and parallel worker params. Tests added to assert enable_cpu_mem_arena behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Files to focus on:
    • fastembed/common/onnx_model.py: EXPOSED_SESSION_OPTIONS, _select_exposed_session_options, add_extra_session_options, and integration into _load_onnx_model.
    • Model classes that initialize and propagate self._extra_session_options (fastembed/text/, fastembed/image/, fastembed/sparse/, fastembed/late_interaction, fastembed/rerank/, fastembed/late_interaction_multimodal/): confirm consistent propagation and correct forwarding into parallel worker params.
    • Tests (tests/*_onnx_embeddings.py, tests/test_late_interaction_embeddings.py, tests/test_sparse_embeddings.py, tests/test_text_cross_encoder.py): validate they assert the intended session option and that test setup matches production defaults.
    • Parameter merging and in-place mutation concerns: ensure add_extra_session_options validates keys, types, and safely mutates SessionOptions only for exposed options.
    • None vs empty-dict handling and behavior when unsupported keys are passed.

Suggested reviewers

  • generall

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.18% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'new: expose some onnx session options' directly describes the main change in the PR, which is exposing ONNX session options across multiple embedding and model classes.
Description check ✅ Passed The description indicates implementation is complete and tests have been added, which aligns with the changeset that includes both code changes and test additions.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cpu-mem-arena

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 298912a and 2ef4d17.

📒 Files selected for processing (5)
  • tests/test_image_onnx_embeddings.py (1 hunks)
  • tests/test_late_interaction_embeddings.py (1 hunks)
  • tests/test_sparse_embeddings.py (2 hunks)
  • tests/test_text_cross_encoder.py (1 hunks)
  • tests/test_text_onnx_embeddings.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_sparse_embeddings.py (3)
tests/test_image_onnx_embeddings.py (2)
  • test_session_options (169-175)
  • model_cache (35-56)
tests/test_text_onnx_embeddings.py (2)
  • test_session_options (199-205)
  • model_cache (80-101)
tests/test_attention_embeddings.py (1)
  • model_cache (16-38)
tests/test_text_cross_encoder.py (5)
tests/test_image_onnx_embeddings.py (2)
  • test_session_options (169-175)
  • model_cache (35-56)
tests/test_late_interaction_embeddings.py (2)
  • test_session_options (314-320)
  • model_cache (159-180)
tests/test_sparse_embeddings.py (2)
  • test_session_options (294-300)
  • model_cache (90-111)
tests/test_text_onnx_embeddings.py (2)
  • test_session_options (199-205)
  • model_cache (80-101)
fastembed/rerank/cross_encoder/text_cross_encoder.py (1)
  • TextCrossEncoder (15-163)
tests/test_image_onnx_embeddings.py (1)
fastembed/image/image_embedding.py (1)
  • ImageEmbedding (11-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (6)
tests/test_text_cross_encoder.py (1)

127-134: LGTM! Session options test aligns with existing patterns.

The test correctly validates that enable_cpu_mem_arena defaults to True and can be explicitly disabled via constructor argument. The implementation mirrors other session option tests across the test suite.

tests/test_late_interaction_embeddings.py (1)

313-320: LGTM! Session options test correctly validates late-interaction embedding behavior.

The test properly verifies session option propagation for LateInteractionTextEmbedding, consistent with the broader test suite.

tests/test_sparse_embeddings.py (2)

80-85: LGTM! Caching the BM42 model improves CI efficiency.

Adding "Qdrant/bm42-all-minilm-l6-v2-attentions" to the cache list prevents redundant downloads during test runs.


286-300: LGTM! Session options test covers multiple sparse embedding models.

The test correctly validates enable_cpu_mem_arena behavior across three sparse embedding models, ensuring consistent session option propagation.

tests/test_image_onnx_embeddings.py (1)

168-175: LGTM! Session options test validates image embedding behavior.

The test correctly verifies that enable_cpu_mem_arena can be controlled via the ImageEmbedding constructor and propagates to the underlying ONNX session.

tests/test_text_onnx_embeddings.py (1)

198-205: LGTM! Session options test completes coverage for text embeddings.

The test properly validates that enable_cpu_mem_arena defaults to True and can be explicitly disabled through the TextEmbedding constructor.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
fastembed/image/onnx_image_model.py (1)

127-135: Fix critical None-unpacking bug in parallel embedding.

Line 133 unpacks extra_session_options with **extra_session_options, which raises TypeError: 'NoneType' object is not a mapping when extra_session_options is None (the default).

Apply this diff to fix the unpacking:

             start_method = "forkserver" if "forkserver" in get_all_start_methods() else "spawn"
             params = {
                 "model_name": model_name,
                 "cache_dir": cache_dir,
                 "providers": providers,
                 "local_files_only": local_files_only,
                 "specific_model_path": specific_model_path,
-                **extra_session_options,
+                **(extra_session_options or {}),
                 **kwargs,
             }
fastembed/image/onnx_embedding.py (1)

174-186: Missing extra_session_options propagation to _embed_images.

The _extra_session_options are not passed to _embed_images, meaning parallel workers won't receive the session options. This breaks the feature for parallel embedding scenarios.

Apply this diff to propagate the session options:

         yield from self._embed_images(
             model_name=self.model_name,
             cache_dir=str(self.cache_dir),
             images=images,
             batch_size=batch_size,
             parallel=parallel,
             providers=self.providers,
             cuda=self.cuda,
             device_ids=self.device_ids,
             local_files_only=self._local_files_only,
             specific_model_path=self._specific_model_path,
+            extra_session_options=self._extra_session_options,
             **kwargs,
         )
fastembed/text/onnx_text_model.py (1)

136-148: Critical: Fix unpacking of Optional dict.

Unpacking extra_session_options directly will fail type checking and cause a runtime error if it's None. The parameter is Optional[dict[str, Any]], so it must be guarded before unpacking.

Apply this diff to fix the issue:

             params = {
                 "model_name": model_name,
                 "cache_dir": cache_dir,
                 "providers": providers,
                 "local_files_only": local_files_only,
                 "specific_model_path": specific_model_path,
-                **extra_session_options,
+                **(extra_session_options or {}),
                 **kwargs,
             }
fastembed/rerank/cross_encoder/onnx_text_model.py (1)

124-132: Critical: Fix unpacking of Optional dict.

Unpacking extra_session_options directly will fail type checking and cause a runtime error if it's None. The parameter is Optional[dict[str, Any]], so it must be guarded before unpacking.

Apply this diff to fix the issue:

             params = {
                 "model_name": model_name,
                 "cache_dir": cache_dir,
                 "providers": providers,
                 "local_files_only": local_files_only,
                 "specific_model_path": specific_model_path,
-                **extra_session_options,
+                **(extra_session_options or {}),
                 **kwargs,
             }
fastembed/late_interaction_multimodal/onnx_multimodal_model.py (2)

150-158: Critical: Fix unpacking of Optional dict.

Unpacking extra_session_options directly will fail type checking and cause a runtime error if it's None. The parameter is Optional[dict[str, Any]], so it must be guarded before unpacking.

Apply this diff to fix the issue:

             params = {
                 "model_name": model_name,
                 "cache_dir": cache_dir,
                 "providers": providers,
                 "local_files_only": local_files_only,
                 "specific_model_path": specific_model_path,
-                **extra_session_options,
+                **(extra_session_options or {}),
                 **kwargs,
             }

219-227: Critical: Fix unpacking of Optional dict.

Unpacking extra_session_options directly will fail type checking and cause a runtime error if it's None. The parameter is Optional[dict[str, Any]], so it must be guarded before unpacking.

Apply this diff to fix the issue:

             params = {
                 "model_name": model_name,
                 "cache_dir": cache_dir,
                 "providers": providers,
                 "local_files_only": local_files_only,
                 "specific_model_path": specific_model_path,
-                **extra_session_options,
+                **(extra_session_options or {}),
                 **kwargs,
             }
🧹 Nitpick comments (1)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (1)

88-89: Consider removing the redundant explicit parameter.

The extra_session_options parameter is explicitly declared but never used directly—it's only accessed via kwargs in _select_exposed_session_options(kwargs) on line 115. This creates redundancy and triggers the static analysis warning.

Apply this diff to remove the redundant parameter:

         device_id: Optional[int] = None,
         specific_model_path: Optional[str] = None,
-        extra_session_options: Optional[dict[str, Any]] = None,
         **kwargs: Any,
     ):

Alternatively, if you prefer to keep explicit parameters for documentation, use it directly instead of extracting from kwargs:

         self.providers = providers
         self.lazy_load = lazy_load
-        self._extra_session_options = self._select_exposed_session_options(kwargs)
+        if extra_session_options is None:
+            extra_session_options = {}
+        self._extra_session_options = extra_session_options
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 533b54c and 0e17bea.

📒 Files selected for processing (13)
  • fastembed/common/onnx_model.py (4 hunks)
  • fastembed/image/onnx_embedding.py (2 hunks)
  • fastembed/image/onnx_image_model.py (4 hunks)
  • fastembed/late_interaction/colbert.py (3 hunks)
  • fastembed/late_interaction_multimodal/colpali.py (4 hunks)
  • fastembed/late_interaction_multimodal/onnx_multimodal_model.py (6 hunks)
  • fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (4 hunks)
  • fastembed/rerank/cross_encoder/onnx_text_model.py (4 hunks)
  • fastembed/sparse/bm42.py (3 hunks)
  • fastembed/sparse/minicoil.py (1 hunks)
  • fastembed/sparse/splade_pp.py (3 hunks)
  • fastembed/text/onnx_embedding.py (3 hunks)
  • fastembed/text/onnx_text_model.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
fastembed/text/onnx_embedding.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
fastembed/sparse/minicoil.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
fastembed/late_interaction_multimodal/colpali.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
fastembed/late_interaction/colbert.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
fastembed/image/onnx_embedding.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
fastembed/sparse/bm42.py (1)
fastembed/common/onnx_model.py (1)
  • _select_exposed_session_options (122-131)
🪛 GitHub Actions: Tests
fastembed/image/onnx_image_model.py

[error] 127-127: TypeError: 'NoneType' object is not a mapping while constructing ONNX session options (extra_session_options is None).

fastembed/common/onnx_model.py

[error] 146-146: TypeError: 'NoneType' object is not iterable in add_extra_session_options (extra_session_options is None).

🪛 GitHub Actions: type-checkers
fastembed/text/onnx_text_model.py

[error] 146-146: Unpacked dict entry 5 has incompatible type "Optional[dict[str, Any]]"; expected "SupportsKeysAndGetItem[str, Any]" [dict-item]

fastembed/image/onnx_image_model.py

[error] 133-133: Unpacked dict entry 5 has incompatible type "Optional[dict[str, Any]]"; expected "SupportsKeysAndGetItem[str, Any]" [dict-item]

fastembed/common/onnx_model.py

[error] 105-105: Argument 2 to "add_extra_session_options" of "OnnxModel" has incompatible type "Optional[dict[str, Any]]"; expected "dict[str, Any]" [arg-type]

fastembed/rerank/cross_encoder/onnx_text_model.py

[error] 130-130: Unpacked dict entry 5 has incompatible type "Optional[dict[str, Any]]"; expected "SupportsKeysAndGetItem[str, Any]" [dict-item]

fastembed/late_interaction_multimodal/onnx_multimodal_model.py

[error] 156-156: Unpacked dict entry 5 has incompatible type "Optional[dict[str, Any]]"; expected "SupportsKeysAndGetItem[str, Any]" [dict-item]


[error] 225-225: Unpacked dict entry 5 has incompatible type "Optional[dict[str, Any]]"; expected "SupportsKeysAndGetItem[str, Any]" [dict-item]

🪛 Ruff (0.14.5)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py

88-88: Unused method argument: extra_session_options

(ARG002)

🔇 Additional comments (22)
fastembed/common/onnx_model.py (3)

27-28: LGTM! Session options exposure mechanism introduced.

The EXPOSED_SESSION_OPTIONS tuple provides a whitelist of configurable ONNX Runtime session options, starting with enable_cpu_mem_arena. This is a sound approach to controlled exposure of session options.


121-131: LGTM! Helper method correctly filters exposed options.

The _select_exposed_session_options classmethod correctly filters kwargs to extract only the exposed session options.


65-65: Type signature is correct here.

The extra_session_options: Optional[dict[str, Any]] = None parameter correctly accepts None. The issue is in add_extra_session_options which needs to handle None (see related comment on lines 133-151).

fastembed/sparse/splade_pp.py (2)

102-102: LGTM! Extra session options correctly initialized.

The _extra_session_options attribute is properly initialized via _select_exposed_session_options, which extracts exposed session options from kwargs.


137-137: LGTM! Extra session options correctly propagated.

The _extra_session_options are correctly threaded through both the ONNX model loading path (line 137) and the embedding path (line 173), ensuring session options are applied in both direct and parallel worker scenarios.

Also applies to: 173-173

fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (1)

115-115: LGTM! Extra session options correctly propagated.

The _extra_session_options are correctly initialized and threaded through both the ONNX model loading (line 155) and reranking (line 198) paths.

Also applies to: 155-155, 198-198

fastembed/image/onnx_image_model.py (1)

58-59: LGTM! Extra session options correctly forwarded to parent.

The extra_session_options parameter is properly declared and forwarded to the parent class's _load_onnx_model method.

Also applies to: 67-67

fastembed/sparse/bm42.py (2)

106-106: LGTM! Extra session options correctly initialized.

The _extra_session_options attribute is properly initialized via _select_exposed_session_options.


150-150: LGTM! Extra session options correctly propagated.

The _extra_session_options are correctly threaded through both the ONNX model loading (line 150) and embedding (line 317) paths.

Also applies to: 317-317

fastembed/image/onnx_embedding.py (1)

101-101: LGTM! Extra session options correctly initialized and passed to model loading.

The _extra_session_options are properly initialized and forwarded to _load_onnx_model.

Also applies to: 138-138

fastembed/text/onnx_embedding.py (2)

236-236: LGTM! Extra session options correctly initialized.

The _extra_session_options attribute is properly initialized via _select_exposed_session_options.


294-294: LGTM! Extra session options correctly propagated.

The _extra_session_options are correctly threaded through both the embedding path (line 294) and the ONNX model loading path (line 331).

Also applies to: 331-331

fastembed/late_interaction_multimodal/colpali.py (4)

83-83: LGTM!

The initialization of _extra_session_options correctly filters kwargs using the exposed session options pattern.


121-130: LGTM!

The session options are correctly propagated to the ONNX model loading path.


232-245: LGTM!

The session options are correctly propagated through the text embedding path.


268-281: LGTM!

The session options are correctly propagated through the image embedding path.

fastembed/late_interaction/colbert.py (3)

146-146: LGTM!

The initialization of _extra_session_options correctly filters kwargs using the exposed session options pattern, consistent with other classes in this PR.


178-187: LGTM!

The session options are correctly propagated to the ONNX model loading path.


229-242: LGTM!

The session options are correctly propagated through the document embedding path.

fastembed/text/onnx_text_model.py (1)

49-68: LGTM!

The extra_session_options parameter is correctly added and propagated to the superclass.

fastembed/rerank/cross_encoder/onnx_text_model.py (1)

28-47: LGTM!

The extra_session_options parameter is correctly added and propagated to the superclass.

fastembed/late_interaction_multimodal/onnx_multimodal_model.py (1)

59-81: LGTM!

The extra_session_options parameter is correctly added and propagated to the superclass.

Comment on lines +120 to +121
self._extra_session_options = self._select_exposed_session_options(kwargs)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Extra session options initialized but not propagated.

The _extra_session_options are initialized here but never passed to _load_onnx_model (line 151-158) or _embed_documents (lines 211-227, 235-249). This means the session options won't be applied for MiniCOIL models.

Apply this diff to propagate the session options:

     def load_onnx_model(self) -> None:
         self._load_onnx_model(
             model_dir=self._model_dir,
             model_file=self.model_description.model_file,
             threads=self.threads,
             providers=self.providers,
             cuda=self.cuda,
             device_id=self.device_id,
+            extra_session_options=self._extra_session_options,
         )

Also add to the embed method (around line 211):

         yield from self._embed_documents(
             model_name=self.model_name,
             cache_dir=str(self.cache_dir),
             documents=documents,
             batch_size=batch_size,
             parallel=parallel,
             providers=self.providers,
             cuda=self.cuda,
             device_ids=self.device_ids,
             k=self.k,
             b=self.b,
             avg_len=self.avg_len,
             is_query=False,
             local_files_only=self._local_files_only,
             specific_model_path=self._specific_model_path,
+            extra_session_options=self._extra_session_options,
             **kwargs,
         )

And similarly for query_embed (around line 235).

Committable suggestion skipped: line range outside the PR's diff.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
fastembed/common/onnx_model.py (1)

134-152: Fix the None-handling bug flagged in previous review.

As noted in the previous review, this method needs to handle None input properly. The type signature should be Optional[dict[str, Any]] with an early return guard.

Additionally, consider using ValueError instead of assert for the validation on lines 148-150, as assertions can be disabled with Python's -O flag and may not provide clear error messages in production.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e17bea and f458562.

📒 Files selected for processing (5)
  • fastembed/common/onnx_model.py (4 hunks)
  • fastembed/image/onnx_image_model.py (4 hunks)
  • fastembed/late_interaction_multimodal/onnx_multimodal_model.py (6 hunks)
  • fastembed/rerank/cross_encoder/onnx_text_model.py (4 hunks)
  • fastembed/text/onnx_text_model.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • fastembed/image/onnx_image_model.py
  • fastembed/rerank/cross_encoder/onnx_text_model.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
🔇 Additional comments (5)
fastembed/common/onnx_model.py (2)

122-132: LGTM!

The filtering logic correctly extracts only the exposed session options from the kwargs dictionary.


65-66: LGTM!

The parameter is correctly typed as Optional[dict[str, Any]] and the None-check before calling add_extra_session_options appropriately guards against the method's current limitation.

Also applies to: 105-106

fastembed/text/onnx_text_model.py (1)

49-67: LGTM!

The parameter is correctly added and forwarded to the superclass, maintaining consistency with the parent class API.

fastembed/late_interaction_multimodal/onnx_multimodal_model.py (2)

59-77: LGTM!

The parameter is correctly threaded through to the superclass.


115-128: LGTM!

The parameter signatures are correctly typed and consistently applied to both text and image embedding methods.

Also applies to: 186-199

Comment on lines +230 to +231
if extra_session_options is not None:
params.update(extra_session_options)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical bug: session options are incorrectly flattened into params.

Same issue as in _embed_documents: the dictionary structure must be preserved when passing to workers.

Apply this diff:

-            if extra_session_options is not None:
-                params.update(extra_session_options)
+            if extra_session_options is not None:
+                params["extra_session_options"] = extra_session_options
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if extra_session_options is not None:
params.update(extra_session_options)
if extra_session_options is not None:
params["extra_session_options"] = extra_session_options
🤖 Prompt for AI Agents
In fastembed/late_interaction_multimodal/onnx_multimodal_model.py around lines
230-231, the code currently flattens extra_session_options into params with
params.update(extra_session_options); instead preserve the dictionary structure
by adding the options as a nested entry (e.g., params['extra_session_options']
or the same key used in _embed_documents) instead of updating params, so the
worker receives a single dict field containing the session options rather than
merging its keys into the top-level params.

Comment on lines +149 to +150
if extra_session_options is not None:
params.update(extra_session_options)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical bug: session options are incorrectly flattened into params.

Using params.update(extra_session_options) flattens the session options dictionary into top-level keys in params. This breaks the propagation to worker processes because:

  1. extra_session_options = {"enable_cpu_mem_arena": True} becomes params["enable_cpu_mem_arena"] = True
  2. Workers receive **params, so they get kwargs["enable_cpu_mem_arena"] = True instead of kwargs["extra_session_options"]
  3. Workers cannot pass these correctly to _load_onnx_model(extra_session_options=...)

Apply this diff to preserve the dictionary structure:

-            if extra_session_options is not None:
-                params.update(extra_session_options)
+            if extra_session_options is not None:
+                params["extra_session_options"] = extra_session_options
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if extra_session_options is not None:
params.update(extra_session_options)
if extra_session_options is not None:
params["extra_session_options"] = extra_session_options
🤖 Prompt for AI Agents
In fastembed/text/onnx_text_model.py around lines 149-150, the code flattens
extra_session_options into params which breaks worker propagation; instead of
params.update(extra_session_options) preserve the dictionary by assigning it to
a single key (e.g. params["extra_session_options"] = extra_session_options) so
that workers receive extra_session_options as a dict and can call
_load_onnx_model(extra_session_options=...) correctly; remove the update call
and add the single-key assignment, keeping params None-safe if needed.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (2)

111-115: Session options extraction placement is fine; consider moving before super().__init__

Using _select_exposed_session_options(kwargs) and storing the result in self._extra_session_options is a clean way to centralize ONNX session tuning. If _select_exposed_session_options mutates kwargs (e.g., via pop), you might want to call it before super().__init__(...) so base classes never see these special keys, and to keep the pattern consistent with other ONNX-based models, if any. Not a blocker, just a small robustness/readability tweak to consider.


186-199: Guard against duplicate extra_session_options between attribute and **kwargs

Passing extra_session_options=self._extra_session_options while also forwarding **kwargs means a caller who explicitly passes extra_session_options into rerank_pairs(...) will hit a “multiple values for keyword argument 'extra_session_options'” error.

If you want to support per-call overrides, consider resolving it here, e.g.:

-        yield from self._rerank_pairs(
+        extra_session_options = kwargs.pop(
+            "extra_session_options", self._extra_session_options
+        )
+        yield from self._rerank_pairs(
             model_name=self.model_name,
             cache_dir=str(self.cache_dir),
             pairs=pairs,
             batch_size=batch_size,
             parallel=parallel,
             providers=self.providers,
             cuda=self.cuda,
             device_ids=self.device_ids,
             local_files_only=self._local_files_only,
             specific_model_path=self._specific_model_path,
-            extra_session_options=self._extra_session_options,
+            extra_session_options=extra_session_options,
             **kwargs,
         )

If per-call overrides are not intended, it may still be worth documenting that extra_session_options is constructor-only to avoid confusion.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f458562 and 298912a.

📒 Files selected for processing (3)
  • fastembed/image/onnx_embedding.py (3 hunks)
  • fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (3 hunks)
  • fastembed/sparse/minicoil.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • fastembed/sparse/minicoil.py
  • fastembed/image/onnx_embedding.py
🔇 Additional comments (1)
fastembed/rerank/cross_encoder/onnx_text_cross_encoder.py (1)

146-155: ONNX session options plumbing into _load_onnx_model looks correct

Forwarding extra_session_options=self._extra_session_options into _load_onnx_model cleanly connects constructor-time configuration with actual session creation; this should give consistent behavior across eager and lazy loads. No issues spotted with this change.



class OnnxModel(Generic[T]):
EXPOSED_SESSION_OPTIONS = ("enable_cpu_mem_arena",)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't expect this to be expanded at runtime it could be an enum like

from enum import StrEnum, auto

class ExposedSessionOptions(StrEnum):
    enable_cpu_mem_arena = auto()

or similar, would make type hinting nicer and still allows in checks and iteration.

Nvm, StrEnum was added with 3.11.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it can be emulated with ExposedSessionOptions(str, Enum), but feels like a bit too much, users won't see this enum anyway

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's not worth it to emulate it that way, it was a low hanging fruit with StrEnum built in but definitely not worth reimplementing it (also just (str, Enum) won't give you auto iirc so it'd be even more work). I just misremembered when StrEnum was added.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could type it as a tuple[Literal[...]] but again, not worth the effort for something that's only internal.

@joein joein merged commit ec0e312 into main Nov 25, 2025
17 checks passed
@joein joein deleted the cpu-mem-arena branch November 25, 2025 10:49
kacperlukawski pushed a commit to kacperlukawski/fastembed that referenced this pull request Dec 15, 2025
* new: expose some onnx session options

* fix: fix extra session options is None case

* fix: fix missing params

* new: add tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants