You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+22-30Lines changed: 22 additions & 30 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,10 +5,11 @@
5
5
The Dial RAG answers user questions using information from the documents provided by user. It supports the following document formats: PDF, DOC/DOCX, PPT/PPTX, TXT and other plain text formats such as code files. Also, it supports PDF and JPEG, PNG and other image formats for the image understanding.
6
6
7
7
The Dial RAG implements several retrieval methods to find the relevant information:
8
-
***Description retriever** - uses vision model to generate page images descriptions and perform search on them. Supports different vision models, like `gpt-4o-mini`, `gemini-1.5-flash-002` or `anthropic.claude-v3-haiku`.
9
-
***Multimodal retriever** - uses multimodal embedding models for pages images search. Supports different multimodal models, like [`azure-ai-vision-embeddings`](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/concept-image-retrieval), [Google `multimodalembedding@001`](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings) or [`amazon.titan-embed-image-v1`](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-multiemb-models.html)
10
-
***Semantic retriever** - uses [text embedding model](https://huggingface.co/epam/bge-small-en) to find the relevant information in the documents.
11
-
***Keyword retriever** - uses [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) algorithm to find the relevant information in the documents.
8
+
9
+
-**Description retriever** - uses vision model to generate page images descriptions and perform search on them. Supports different vision models, like `gpt-4o-mini`, `gemini-1.5-flash-002` or `anthropic.claude-v3-haiku`.
10
+
-**Multimodal retriever** - uses multimodal embedding models for pages images search. Supports different multimodal models, like [`azure-ai-vision-embeddings`](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/concept-image-retrieval), [Google `multimodalembedding@001`](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings) or [`amazon.titan-embed-image-v1`](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-multiemb-models.html)
11
+
-**Semantic retriever** - uses [text embedding model](https://huggingface.co/epam/bge-small-en) to find the relevant information in the documents.
12
+
-**Keyword retriever** - uses [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) algorithm to find the relevant information in the documents.
12
13
13
14
The Dial RAG is intended to be used in the [Dial](https://github.com/epam/ai-dial) environment. It uses the [Dial Core](https://github.com/epam/ai-dial-core) to access the LLMs and other services.
14
15
@@ -18,17 +19,18 @@ The Dial RAG is intended to be used in the [Dial](https://github.com/epam/ai-dia
18
19
19
20
Following environment variables are required to set for the deployment configuration:
20
21
21
-
|Variable|Description|
22
-
|---|---|
23
-
|`DIAL_URL`| url to the dial core |
24
-
|`DIAL_RAG__INDEX_STORAGE__USE_DIAL_FILE_STORAGE`| set to **True** to store indexes in the Dial File Storage instead of the local file storage |
|`DIAL_RAG__INDEX_STORAGE__USE_DIAL_FILE_STORAGE`| set to **True** to store indexes in the Dial File Storage instead of the local file storage |
25
26
26
27
### Configuration files
27
28
28
29
The Dial RAG provides a set of configuration files with predefined settings for different environments. The configuration files are located in the `config` directory.
29
30
You can set the environment variable `DIAL_RAG__CONFIG_PATH` to point to the required configuration file depending on the Dial environment and available models.
30
31
31
32
The following configuration files are available in the `config` directory:
33
+
32
34
-`config/aws_description.yaml` - AWS environment with description retriever, which uses `Claude Haiku 4.5` model for page images descriptions and `Claude Sonnet 3.5` for the answer generation.
33
35
-`config/aws_embedding.yaml` - AWS environment with multimodal retriever, which uses `amazon.titan-embed-image-v1` model for page images embeddings and `Claude Sonnet 3.5` for the answer generation.
34
36
-`config/azure_description.yaml` - Azure environment with description retriever, which uses `GPT-4.1 mini` model for page images descriptions and `GPT-4.1` for the answer generation.
@@ -39,14 +41,12 @@ The following configuration files are available in the `config` directory:
39
41
40
42
If you are running the Dial RAG in a different environment, you can create your own configuration file based on one of the provided files and set the `DIAL_RAG__CONFIG_PATH` environment variable to point to it. If you need a small change in the configuration (for example to change the model name), you can point the `DIAL_RAG__CONFIG_PATH` to the existing file and override the required settings using the environment variables. See the [Additional environment variables](#additional-environment-variables) section for the list of available settings.
41
43
42
-
43
44
### Logging configuration environment variables
44
45
45
-
|Variable|Default|Description|
46
-
|---|---|---|
47
-
|`LOG_LEVEL`|`INFO`| Log level for the application. |
48
-
|`LOG_LEVEL_OVERRIDE`|`{}`| Allows to override log level for specific modules. Example: `LOG_LEVEL_OVERRIDE='{"dial_rag": "DEBUG", "urllib3": "ERROR" }'`|
|`LOG_LEVEL`|`INFO`| Log level for the application. |
49
+
|`LOG_LEVEL_OVERRIDE`|`{}`| Allows to override log level for specific modules. Example: `LOG_LEVEL_OVERRIDE='{"dial_rag": "DEBUG", "urllib3": "ERROR" }'`|
50
50
51
51
### Additional environment variables
52
52
@@ -247,19 +247,17 @@ Dial RAG supports following commands in messages:
247
247
248
248
### Attach
249
249
250
-
`/attach <url>` - allows to provide an url to the attached document in the message body. Is equivalent to the setting `messages[i].custom_content.attachments[j].url` in the [Dial API](https://epam-rail.com/dial_api#/paths/~1openai~1deployments~1%7BDeployment%20Name%7D~1chat~1completions/post).
250
+
`/attach <url>` - allows to provide an url to the attached document in the message body. Is equivalent to the setting `messages[i].custom_content.attachments[j].url` in the [Dial API](https://dialx.ai/dial_api).
251
251
252
252
The `/attach` command is useful to attach the document which is available in the Internet and is not uploaded to the Dial File Storage.
253
253
254
-
255
254
### Debug commands
256
255
257
256
Dial RAG supports following debug commands if the option `ENABLE_DEBUG_COMMANDS` is set to `true`.
258
257
259
-
*`/model <model>` - allows to override the chat model used for the answer generation. Should be a deployment name of a chat model in available the Dial.
260
-
*`/query_model <model>` - allows to override the model used to summarize the chat history to the standalone question. Should be a deployment name of a chat model in available the Dial. The model should support `tool calls`.
261
-
*`/profile` - generates CPU profile report for the request. The report will be available as an attachment in the `Profiler` stage.
262
-
258
+
-`/model <model>` - allows to override the chat model used for the answer generation. Should be a deployment name of a chat model in available the Dial.
259
+
-`/query_model <model>` - allows to override the model used to summarize the chat history to the standalone question. Should be a deployment name of a chat model in available the Dial. The model should support `tool calls`.
260
+
-`/profile` - generates CPU profile report for the request. The report will be available as an attachment in the `Profiler` stage.
263
261
264
262
## Developer environment
265
263
@@ -275,7 +273,6 @@ poetry install
275
273
276
274
This will install all requirements for running the package, linting, formatting and tests.
277
275
278
-
279
276
Alternatively, if you have [uv](https://docs.astral.sh/uv/) installed, you can use it to create the environment with required version of Python and poetry:
280
277
281
278
```sh
@@ -291,14 +288,12 @@ If you want to use poetry from the uv with make commands, you can set the `POETR
291
288
POETRY="uvx poetry@2.2.1" make install
292
289
```
293
290
294
-
295
291
### IDE configuration
296
292
297
293
The recommended IDE is [VSCode](https://code.visualstudio.com/).
298
294
Open the project in VSCode and install the recommended extensions.
299
295
300
-
This project uses [Ruff](https://docs.astral.sh/ruff/) as a linter and formatter. To configure it for your IDE follow the instructions in https://docs.astral.sh/ruff/editors/setup/.
301
-
296
+
This project uses [Ruff](https://docs.astral.sh/ruff/) as a linter and formatter. To configure it for your IDE follow the instructions in <https://docs.astral.sh/ruff/editors/setup/>.
302
297
303
298
### Make on Windows
304
299
@@ -314,7 +309,7 @@ The command definitions inside Makefile should be cross-platform to keep the dev
314
309
315
310
### Environment Variables
316
311
317
-
Copy `.env.example` to `.env` and customize it for your environment for the development process. See the [Configuration](#Configuration) section for the list of environment variables.
312
+
Copy `.env.example` to `.env` and customize it for your environment for the development process. See the [Configuration](#configuration) section for the list of environment variables.
318
313
319
314
## Run
320
315
@@ -360,8 +355,6 @@ The `docker_compose_local` folder contains the Docker Compose file and auxiliary
360
355
docker-compose up --build dial-rag
361
356
```
362
357
363
-
364
-
365
358
## Lint
366
359
367
360
Run the linting before committing:
@@ -395,12 +388,12 @@ make docker_test
395
388
Some of the tests marked with the `@e2e_test` decorator utilize cached results located in the `./tests/cache` directory. By default, these tests will use cached values. During test execution, you may encounter warning or failure messages such as `Failed: There is no response found in cache, use environment variable REFRESH=True to update` This indicates that some logic has changed and that the cached responses are out of date.
396
389
397
390
These tests can be executed using environment variables, or nox sessions:
391
+
398
392
- `make test` (or `nox -s test`) - usual test run, executed on CI. The test uses *ONLY* the cached responses from LLM. If cache missing, test throws an exception.
399
393
- `REFRESH=True make test` (or `nox -s test -- --refresh`) - This flag will delete all unused cache files, and stores new ones required by the executed tests.
400
394
401
395
To use the `REFRESH` flag, you need to have running dial-core on `DIAL_CORE_HOST` (default "localhost:8080") with `DIAL_CORE_API_KEY` (default "dial_api_key").
402
396
403
-
404
397
## Clean
405
398
406
399
To remove the virtual environment and build artifacts:
@@ -411,10 +404,9 @@ make clean
411
404
412
405
## Update docs
413
406
414
-
This project uses [settings-doc](https://github.com/radeklat/settings-doc) to generate the [Configuration](#Configuration) section of this documentation from the Pydantic settings.
407
+
This project uses [settings-doc](https://github.com/radeklat/settings-doc) to generate the [Configuration](#configuration) section of this documentation from the Pydantic settings.
0 commit comments