Skip to content

Commit 04ff531

Browse files
unit testing for deepsearcher (#237)
* finished testing agent and loader * finished testing embeddings * added llm testings * added utils testing * finalizing unit testing for all integrations * added testing trigger commands in docs and CONTRIBUTING * finalizing unit testing for deepsearcher * handled env vars in tests * using milvus lite to avoid connection failing issues
1 parent 2c2be93 commit 04ff531

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

71 files changed

+9167
-394
lines changed

CONTRIBUTING.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -102,6 +102,53 @@ Syncing the environment manually is especially useful for ensuring your editor h
102102
For more detailed information about dependency locking and syncing, refer to the [offical Locking and syncing documentation](https://docs.astral.sh/uv/concepts/projects/sync/).
103103

104104

105+
## Running Tests
106+
107+
Before submitting your pull request, make sure to run the test suite to ensure your changes haven't introduced any regressions.
108+
109+
### Installing Test Dependencies
110+
111+
First, ensure you have pytest installed. If you haven't installed the development dependencies yet, you can do so with:
112+
113+
```shell
114+
uv sync --all-extras --dev
115+
```
116+
117+
This will install all development dependencies and optional dependencies including pytest and other testing tools.
118+
119+
### Running the Tests
120+
121+
To run all tests in the `tests` directory:
122+
123+
```shell
124+
uv run pytest tests
125+
```
126+
127+
For more verbose output that shows individual test results:
128+
129+
```shell
130+
uv run pytest tests -v
131+
```
132+
133+
You can also run tests for specific directories or files. For example:
134+
135+
```shell
136+
# Run tests in a specific directory
137+
uv run pytest tests/embedding
138+
139+
# Run tests in a specific file
140+
uv run pytest tests/embedding/test_bedrock_embedding.py
141+
142+
# Run a specific test class
143+
uv run pytest tests/embedding/test_bedrock_embedding.py::TestBedrockEmbedding
144+
145+
# Run a specific test method
146+
uv run pytest tests/embedding/test_bedrock_embedding.py::TestBedrockEmbedding::test_init_default
147+
```
148+
149+
The `-v` flag (verbose mode) provides more detailed output, showing each test case and its result individually. This is particularly useful when you want to see which specific tests are passing or failing.
150+
151+
105152
## Developer Certificate of Origin (DCO)
106153

107154
All contributions require a sign-off, acknowledging the [Developer Certificate of Origin](https://developercertificate.org/).

docs/contributing/index.md

Lines changed: 159 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,159 @@
1+
# Contributing to DeepSearcher
2+
3+
We welcome contributions from everyone. This document provides guidelines to make the contribution process straightforward.
4+
5+
6+
## Pull Request Process
7+
8+
1. Fork the repository and create your branch from `master`.
9+
2. Make your changes.
10+
3. Run tests and linting to ensure your code meets the project's standards.
11+
4. Update documentation if necessary.
12+
5. Submit a pull request.
13+
14+
15+
## Linting and Formatting
16+
17+
Keeping a consistent style for code, code comments, commit messages, and PR descriptions will greatly accelerate your PR review process.
18+
We require you to run code linter and formatter before submitting your pull requests:
19+
20+
To check the coding styles:
21+
22+
```shell
23+
make lint
24+
```
25+
26+
To fix the coding styles:
27+
28+
```shell
29+
make format
30+
```
31+
Our CI pipeline also runs these checks automatically on all pull requests to ensure code quality and consistency.
32+
33+
34+
## Development Environment Setup with uv
35+
36+
DeepSearcher uses [uv](https://github.com/astral-sh/uv) as the recommended package manager. uv is a fast, reliable Python package manager and installer. The project's `pyproject.toml` is configured to work with uv, which will provide faster dependency resolution and package installation compared to traditional tools.
37+
38+
### Install Project in Development Mode(aka Editable Installation)
39+
40+
1. Install uv if you haven't already:
41+
Follow the [offical installation instructions](https://docs.astral.sh/uv/getting-started/installation/).
42+
43+
2. Clone the repository and navigate to the project directory:
44+
```shell
45+
git clone https://github.com/zilliztech/deep-searcher.git && cd deep-searcher
46+
```
47+
3. Synchronize and install dependencies:
48+
```shell
49+
uv sync
50+
source .venv/bin/activate
51+
```
52+
`uv sync` will install all dependencies specified in `uv.lock` file. And the `source .venv/bin/activate` command will activate the virtual environment.
53+
54+
- (Optional) To install all optional dependencies:
55+
```shell
56+
uv sync --all-extras --dev
57+
```
58+
59+
- (Optional) To install specific optional dependencies:
60+
```shell
61+
# Take optional `ollama` dependency for example
62+
uv sync --extra ollama
63+
```
64+
For more optional dependencies, refer to the `[project.optional-dependencies]` part of `pyproject.toml` file.
65+
66+
67+
68+
### Adding Dependencies
69+
70+
When you need to add new dependencies to the `pyproject.toml` file, you can use the following commands:
71+
72+
```shell
73+
uv add <package_name>
74+
```
75+
DeepSearcher uses optional dependencies to keep the default installation lightweight. Optional features can be installed using the syntax `deepsearcher[<extra>]`. To add a dependency to an optional extra, use the following command:
76+
77+
```shell
78+
uv add <package_name> --optional <extra>
79+
```
80+
For more details, refer to the [offical Managing dependencies documentation](https://docs.astral.sh/uv/concepts/projects/dependencies/).
81+
82+
### Dependencies Locking
83+
84+
For development, we use lockfiles to ensure consistent dependencies. You can use
85+
```shell
86+
uv lock --check
87+
```
88+
to verify if your lockfile is up-to-date with your project dependencies.
89+
90+
When you modify or add dependencies in the project, the lockfile will be automatically updated the next time you run a uv command. You can also explicitly update the lockfile using:
91+
```shell
92+
uv lock
93+
```
94+
95+
While the environment is synced automatically, it may also be explicitly synced using uv sync:
96+
```shell
97+
uv sync
98+
```
99+
Syncing the environment manually is especially useful for ensuring your editor has the correct versions of dependencies.
100+
101+
102+
For more detailed information about dependency locking and syncing, refer to the [offical Locking and syncing documentation](https://docs.astral.sh/uv/concepts/projects/sync/).
103+
104+
105+
## Running Tests
106+
107+
Before submitting your pull request, make sure to run the test suite to ensure your changes haven't introduced any regressions.
108+
109+
### Installing Test Dependencies
110+
111+
First, ensure you have pytest installed. If you haven't installed the development dependencies yet, you can do so with:
112+
113+
```shell
114+
uv sync --all-extras --dev
115+
```
116+
117+
This will install all development dependencies and optional dependencies including pytest and other testing tools.
118+
119+
### Running the Tests
120+
121+
To run all tests in the `tests` directory:
122+
123+
```shell
124+
uv run pytest tests
125+
```
126+
127+
For more verbose output that shows individual test results:
128+
129+
```shell
130+
uv run pytest tests -v
131+
```
132+
133+
You can also run tests for specific directories or files. For example:
134+
135+
```shell
136+
# Run tests in a specific directory
137+
uv run pytest tests/embedding
138+
139+
# Run tests in a specific file
140+
uv run pytest tests/embedding/test_bedrock_embedding.py
141+
142+
# Run a specific test class
143+
uv run pytest tests/embedding/test_bedrock_embedding.py::TestBedrockEmbedding
144+
145+
# Run a specific test method
146+
uv run pytest tests/embedding/test_bedrock_embedding.py::TestBedrockEmbedding::test_init_default
147+
```
148+
149+
The `-v` flag (verbose mode) provides more detailed output, showing each test case and its result individually. This is particularly useful when you want to see which specific tests are passing or failing.
150+
151+
152+
## Developer Certificate of Origin (DCO)
153+
154+
All contributions require a sign-off, acknowledging the [Developer Certificate of Origin](https://developercertificate.org/).
155+
Add a `Signed-off-by` line to your commit message:
156+
157+
```text
158+
Signed-off-by: Your Name <your.email@example.com>
159+
```

mkdocs.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,8 @@ nav:
6060
- "Development Mode": installation/development.md
6161
- "FAQ":
6262
- "FAQ": faq/index.md
63+
- Contribution Guide:
64+
- "Contribution Guide": contributing/index.md
6365
- Usage:
6466
- "Usage": usage/index.md
6567
- "Quick Start": usage/quick_start.md

pyproject.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@ dev = [
3737
"mkdocs-jupyter>=0.25.0",
3838
"mkdocs-click>=0.8.1",
3939
"mkdocstrings[python]>=0.27.0",
40+
"pytest>=8.3.5",
4041
]
4142

4243
[project.optional-dependencies]

tests/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
# Tests for the deepsearcher package

tests/agent/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
# Tests for the agent module

tests/agent/test_base.py

Lines changed: 142 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,142 @@
1+
import unittest
2+
from unittest.mock import MagicMock
3+
import numpy as np
4+
5+
from deepsearcher.llm.base import BaseLLM, ChatResponse
6+
from deepsearcher.embedding.base import BaseEmbedding
7+
from deepsearcher.vector_db.base import BaseVectorDB, RetrievalResult, CollectionInfo
8+
9+
10+
class MockLLM(BaseLLM):
11+
"""Mock LLM implementation for testing agents."""
12+
13+
def __init__(self, predefined_responses=None):
14+
"""
15+
Initialize the MockLLM.
16+
17+
Args:
18+
predefined_responses: Dictionary mapping prompt substrings to responses
19+
"""
20+
self.chat_called = False
21+
self.last_messages = None
22+
self.predefined_responses = predefined_responses or {}
23+
24+
def chat(self, messages, **kwargs):
25+
"""Mock implementation of chat that returns predefined responses or a default response."""
26+
self.chat_called = True
27+
self.last_messages = messages
28+
29+
if self.predefined_responses:
30+
message_content = messages[0]["content"] if messages else ""
31+
for key, response in self.predefined_responses.items():
32+
if key in message_content:
33+
return ChatResponse(content=response, total_tokens=10)
34+
35+
return ChatResponse(content="This is a test answer", total_tokens=10)
36+
37+
def literal_eval(self, text):
38+
"""Mock implementation of literal_eval."""
39+
# Default implementation returns a list with test_collection
40+
# Override this in specific tests if needed
41+
if text.strip().startswith("[") and text.strip().endswith("]"):
42+
# Return the list as is if it's already in list format
43+
try:
44+
import ast
45+
return ast.literal_eval(text)
46+
except:
47+
pass
48+
49+
return ["test_collection"]
50+
51+
52+
class MockEmbedding(BaseEmbedding):
53+
"""Mock embedding model implementation for testing agents."""
54+
55+
def __init__(self, dimension=8):
56+
"""Initialize the MockEmbedding with a specific dimension."""
57+
self._dimension = dimension
58+
59+
@property
60+
def dimension(self):
61+
"""Return the dimension of the embedding model."""
62+
return self._dimension
63+
64+
def embed_query(self, text):
65+
"""Mock implementation that returns a random vector of the specified dimension."""
66+
return np.random.random(self._dimension).tolist()
67+
68+
def embed_documents(self, documents):
69+
"""Mock implementation that returns random vectors for each document."""
70+
return [np.random.random(self._dimension).tolist() for _ in documents]
71+
72+
73+
class MockVectorDB(BaseVectorDB):
74+
"""Mock vector database implementation for testing agents."""
75+
76+
def __init__(self, collections=None):
77+
"""
78+
Initialize the MockVectorDB.
79+
80+
Args:
81+
collections: List of collection names to initialize with
82+
"""
83+
self.default_collection = "test_collection"
84+
self.search_called = False
85+
self.insert_called = False
86+
self._collections = []
87+
88+
if collections:
89+
for collection in collections:
90+
self._collections.append(
91+
CollectionInfo(collection_name=collection, description=f"Test collection {collection}")
92+
)
93+
else:
94+
self._collections = [
95+
CollectionInfo(collection_name="test_collection", description="Test collection for testing")
96+
]
97+
98+
def search_data(self, collection, vector, top_k=10, **kwargs):
99+
"""Mock implementation that returns test results."""
100+
self.search_called = True
101+
self.last_search_collection = collection
102+
self.last_search_vector = vector
103+
self.last_search_top_k = top_k
104+
105+
return [
106+
RetrievalResult(
107+
embedding=vector,
108+
text=f"Test result {i} for collection {collection}",
109+
reference=f"test_reference_{collection}_{i}",
110+
metadata={"a": i, "wider_text": f"Wider context for test result {i} in collection {collection}"}
111+
)
112+
for i in range(min(3, top_k))
113+
]
114+
115+
def insert_data(self, collection, chunks):
116+
"""Mock implementation of insert_data."""
117+
self.insert_called = True
118+
self.last_insert_collection = collection
119+
self.last_insert_chunks = chunks
120+
return True
121+
122+
def init_collection(self, dim, collection, **kwargs):
123+
"""Mock implementation of init_collection."""
124+
return True
125+
126+
def list_collections(self, dim=None):
127+
"""Mock implementation that returns the list of collections."""
128+
return self._collections
129+
130+
def clear_db(self, collection):
131+
"""Mock implementation of clear_db."""
132+
return True
133+
134+
135+
class BaseAgentTest(unittest.TestCase):
136+
"""Base test class for agent tests with common setup."""
137+
138+
def setUp(self):
139+
"""Set up test fixtures for agent tests."""
140+
self.llm = MockLLM()
141+
self.embedding_model = MockEmbedding(dimension=8)
142+
self.vector_db = MockVectorDB()

0 commit comments

Comments
 (0)