Commit 84d6d1a

revert(cli): Revert the default reranker to NaiveReranker. (#277)
* revert(cli): default to `NaiveReranker`
* Auto generate docs
1 parent 747e8a0 commit 84d6d1a

4 files changed: 13 additions and 23 deletions

doc/VectorCode-cli.txt

Lines changed: 7 additions & 12 deletions
@@ -358,18 +358,13 @@ most `n` documents. A larger value of `query_multiplier` guarantees the return
 of `n` documents, but with the risk of including too many less-relevant chunks
 that may affect the document selection. Default: `-1` (any negative value means
 selecting documents based on all indexed chunks); - `reranker`string, the
-reranking method to use. Currently supports `CrossEncoderReranker` (default,
-using sentence-transformers cross-encoder
-<https://sbert.net/docs/package_reference/cross_encoder/cross_encoder.html> )
-and `NaiveReranker` (sort chunks by the "distance" between the embedding
-vectors). Note: If you’re using a good embedding model (eg. a hosted service
-from OpenAI, or a LLM-based embedding model like Qwen3-Embedding-0.6B
-<https://huggingface.co/Qwen/Qwen3-Embedding-0.6B>), you may get better results
-if you use `NaiveReranker` here because a good embedding model may understand
-texts better than a mediocre reranking model. - `reranker_params`dictionary,
-similar to `embedding_params`. The options passed to the reranker class
-constructor. For `CrossEncoderReranker`, these are the options passed to the
-`CrossEncoder`
+reranking method to use. Currently supports `NaiveReranker` (sort chunks by the
+"distance" between the embedding vectors) and `CrossEncoderReranker` (using
+sentence-transformers cross-encoder
+<https://sbert.net/docs/package_reference/cross_encoder/cross_encoder.html> ).
+- `reranker_params`dictionary, similar to `embedding_params`. The options
+passed to the reranker class constructor. For `CrossEncoderReranker`, these are
+the options passed to the `CrossEncoder`
 <https://sbert.net/docs/package_reference/cross_encoder/cross_encoder.html#id1>
 class. For example, if you want to use a non-default model, you can use the
 following: `json { "reranker_params": { "model_name_or_path": "your_model_here"
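
Because `CrossEncoderReranker` is no longer the default after this commit, a project that relied on it would presumably need to select it explicitly. The following is a minimal sketch of such a configuration, built as a Python dictionary and serialized to JSON. The keys come from the documentation text in this hunk and the model name is the same placeholder used there; where the resulting file should be stored is not covered by this diff, so the sketch only prints it.

```python
import json

# Keys are taken from the documented options; the model name is a placeholder.
explicit_config = {
    "reranker": "CrossEncoderReranker",
    "reranker_params": {"model_name_or_path": "your_model_here"},
}

# Serialize to the JSON text that a configuration file would contain.
config_text = json.dumps(explicit_config, indent=2)
print(config_text)
```

Leaving out the `reranker` key altogether should now result in `NaiveReranker`, according to the changed default in `src/vectorcode/cli_utils.py`.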

docs/cli.md

Lines changed: 4 additions & 9 deletions
@@ -311,16 +311,11 @@ The JSON configuration file may hold the following values:
 guarantees the return of `n` documents, but with the risk of including too
 many less-relevant chunks that may affect the document selection. Default:
 `-1` (any negative value means selecting documents based on all indexed chunks);
-- `reranker`: string, the reranking method to use. Currently supports
-`CrossEncoderReranker` (default, using
+- `reranker`: string, the reranking method to use. Currently supports `NaiveReranker`
+(sort chunks by the "distance" between the embedding vectors) and
+`CrossEncoderReranker` (using
 [sentence-transformers cross-encoder](https://sbert.net/docs/package_reference/cross_encoder/cross_encoder.html)
-) and `NaiveReranker` (sort chunks by the "distance" between the embedding
-vectors).
-Note: If you're using a good embedding model (eg. a hosted service from OpenAI, or
-a LLM-based embedding model like
-[Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)), you
-may get better results if you use `NaiveReranker` here because a good embedding
-model may understand texts better than a mediocre reranking model.
+).
 - `reranker_params`: dictionary, similar to `embedding_params`. The options
 passed to the reranker class constructor. For `CrossEncoderReranker`, these
 are the options passed to the
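
The description of `NaiveReranker` ("sort chunks by the "distance" between the embedding vectors") can be illustrated with a small sketch. This is not VectorCode's implementation: the function name `naive_rerank`, the `(document_path, distance)` input shape, and the choice to rank each document by its closest chunk are all assumptions made for illustration.

```python
from collections import defaultdict


def naive_rerank(chunks: list[tuple[str, float]], n: int) -> list[str]:
    """Return up to n document paths, ordered by their closest chunk.

    `chunks` holds hypothetical (document_path, distance) pairs, where a
    smaller distance means the chunk embedding is closer to the query.
    """
    best = defaultdict(lambda: float("inf"))
    for path, distance in chunks:
        # Keep the smallest distance seen for each document.
        best[path] = min(best[path], distance)
    # Sort documents by ascending best distance and keep the first n.
    return sorted(best, key=best.__getitem__)[:n]


results = naive_rerank(
    [("a.py", 0.42), ("b.py", 0.10), ("a.py", 0.05), ("c.py", 0.90)], n=2
)
print(results)
```

A ranking of this kind needs no second model, which is one plausible reason for it to be the lighter default; a cross-encoder would instead score each query and chunk pair with a separate model.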

src/vectorcode/cli_utils.py

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ class Config:
     overlap_ratio: float = 0.2
     query_multiplier: int = -1
     query_exclude: list[Union[str, os.PathLike]] = field(default_factory=list)
-    reranker: Optional[str] = "CrossEncoderReranker"
+    reranker: Optional[str] = "NaiveReranker"
     reranker_params: dict[str, Any] = field(default_factory=lambda: {})
     check_item: Optional[str] = None
     use_absolute_path: bool = False
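
The effect of this one-line change can be sketched with a reduced dataclass. Only a few of the fields visible in the hunk are reproduced, and `config_from_dict` is a hypothetical helper standing in for whatever loader the project actually uses; the point is just that a key missing from the user's configuration falls back to the dataclass default.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Config:
    # Only some of the fields shown in the hunk; the real class has more.
    overlap_ratio: float = 0.2
    query_multiplier: int = -1
    reranker: Optional[str] = "NaiveReranker"
    reranker_params: dict[str, Any] = field(default_factory=dict)


def config_from_dict(values: dict[str, Any]) -> Config:
    # Hypothetical helper: keys absent from `values` keep the dataclass defaults.
    return Config(**values)


default_config = config_from_dict({})
custom_config = config_from_dict({"reranker": "CrossEncoderReranker"})
print(default_config.reranker, custom_config.reranker)
```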

tests/test_cli_utils.py

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ async def test_config_import_from_missing_keys():
 assert config.chunk_size == 2500
 assert config.overlap_ratio == 0.2
 assert config.query_multiplier == -1
-assert config.reranker == "CrossEncoderReranker"
+assert config.reranker == "NaiveReranker"
 assert config.reranker_params == {}
 assert config.db_settings is None
