
Commit a56172f

Update llm-app templates (#9526)

Authored and committed by szymondudycz (Manul from Pathway)
GitOrigin-RevId: 4718886d0a2a760580966169774e438e510ea6fb
1 parent: cbd32a5

File tree: 21 files changed, +391 −130 lines

templates/adaptive_rag/README.md

Lines changed: 11 additions & 12 deletions

````diff
@@ -19,7 +19,7 @@ To learn more about building & deploying RAG applications with Pathway, includin
 ## Introduction
 This app relies on modules provided under `pathway.xpacks.llm`.
 
-BaseRAGQuestionAnswerer is the base class to build RAG applications with Pathway vector store and Pathway xpack components.
+`BaseRAGQuestionAnswerer` is the base class to build RAG applications with Pathway vector store and Pathway xpack components.
 It is meant to get you started with your RAG application right away.
 
 Here, we extend the `BaseRAGQuestionAnswerer` to implement the adaptive retrieval and reply to requests in the endpoint `/v2/answer`.
@@ -54,21 +54,20 @@ Here some examples of what can be modified.
 
 ### LLM Model
 
-You can choose any of the GPT-3.5 Turbo, GPT-4, or GPT-4 Turbo models proposed by Open AI.
-You can find the whole list on their [models page](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo).
+You can choose any of the models offered by Open AI, like GPT-5, GPT-4.1, or GPT-4o.
+You can find the whole list on their [models page](https://platform.openai.com/docs/models).
 
-You simply need to change the `model` to the one you want to use:
+You simply need to change the `model` to the one you want to use, e.g., to use GPT-5:
 ```yaml
 $llm: !pw.xpacks.llm.llms.OpenAIChat
-  model: "gpt-3.5-turbo"
+  model: "gpt-5"
   retry_strategy: !pw.udfs.ExponentialBackoffRetryStrategy
     max_retries: 6
-  cache_strategy: !pw.udfs.DiskCache
-  temperature: 0.05
+  cache_strategy: !pw.udfs.DefaultCache
   capacity: 8
 ```
 
-The default model is `gpt-3.5-turbo`
+The default model is `gpt-4.1-mini`.
 
 You can also use different provider, by using different class from [Pathway LLM xpack](https://pathway.com/developers/user-guide/llm-xpack/overview),
 e.g. here is configuration for locally run Mistral model.
@@ -95,19 +94,19 @@ port: 8000
 
 ### Cache
 
-You can configure whether you want to enable cache, to avoid repeated API accesses, and where the cache is stored.
+You can configure whether you want to enable cache or persistence, to avoid repeated API accesses, and where the cache is stored.
 Default values:
 ```yaml
-with_cache: True
-cache_backend: !pw.persistence.Backend.filesystem
+persistence_mode: !pw.PersistenceMode.UDF_CACHING
+persistence_backend: !pw.persistence.Backend.filesystem
   path: ".Cache"
 ```
 
 ### Data sources
 
 You can configure the data sources by changing `$sources` in `app.yaml`.
 You can add as many data sources as you want. You can have several sources of the same kind, for instance, several local sources from different folders.
-The sections below describe how to configure local, Google Drive and Sharepoint source, but you can use any input [connector](https://pathway.com/developers/user-guide/connecting-to-data/connectors) from Pathway package.
+The sections below describe how to configure local, Google Drive and Sharepoint source, and you can check the examples of YAML configuration in our [user guide](https://pathway.com/developers/templates/yaml-snippets/data-sources-examples/). While these are not described in this Section, you can also use any input [connector](https://pathway.com/developers/user-guide/connecting-to-data/connectors) from Pathway package.
 
 By default, the app uses a local data source to read documents from the `data` folder.
 
````
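For template users upgrading an existing configuration, the cache settings move from the old `with_cache`/`cache_backend` pair to the persistence keys. A before/after sketch of the relevant `app.yaml` fragment, assembled only from the defaults quoted in this README's cache section:

```yaml
# Before this commit (now deprecated):
# with_cache: True
# cache_backend: !pw.persistence.Backend.filesystem
#   path: ".Cache"

# After this commit:
persistence_mode: !pw.PersistenceMode.UDF_CACHING
persistence_backend: !pw.persistence.Backend.filesystem
  path: ".Cache"
```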

templates/adaptive_rag/app.py

Lines changed: 36 additions & 5 deletions

````diff
@@ -1,4 +1,5 @@
 import logging
+from warnings import warn
 
 import pathway as pw
 from dotenv import load_dotenv
@@ -25,17 +26,47 @@ class App(BaseModel):
     host: str = "0.0.0.0"
     port: int = 8000
 
-    with_cache: bool = True
+    with_cache: bool | None = None  # deprecated
+    persistence_backend: pw.persistence.Backend | None = None
+    persistence_mode: pw.PersistenceMode | None = pw.PersistenceMode.UDF_CACHING
     terminate_on_error: bool = False
 
     def run(self) -> None:
-        server = QASummaryRestServer(self.host, self.port, self.question_answerer)
-        server.run(
-            with_cache=self.with_cache,
+        server = QASummaryRestServer(  # noqa: F841
+            self.host, self.port, self.question_answerer
+        )
+
+        if self.persistence_mode is None:
+            if self.with_cache is True:
+                warn(
+                    "`with_cache` is deprecated. Please use `persistence_mode` instead.",
+                    DeprecationWarning,
+                )
+                persistence_mode = pw.PersistenceMode.UDF_CACHING
+            else:
+                persistence_mode = None
+        else:
+            persistence_mode = self.persistence_mode
+
+        if persistence_mode is not None:
+            if self.persistence_backend is None:
+                persistence_backend = pw.persistence.Backend.filesystem("./Cache")
+            else:
+                persistence_backend = self.persistence_backend
+            persistence_config = pw.persistence.Config(
+                persistence_backend,
+                persistence_mode=persistence_mode,
+            )
+        else:
+            persistence_config = None
+
+        pw.run(
+            persistence_config=persistence_config,
             terminate_on_error=self.terminate_on_error,
+            monitoring_level=pw.MonitoringLevel.NONE,
         )
 
-    model_config = ConfigDict(extra="forbid")
+    model_config = ConfigDict(extra="forbid", arbitrary_types_allowed=True)
 
 
 if __name__ == "__main__":
````
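The backward-compatibility fallback added to `App.run` can be exercised in isolation. The sketch below mirrors that branching with a stand-in enum so it runs without Pathway installed; `resolve_persistence_mode` and the local `PersistenceMode` are illustrative names, not part of the template:

```python
from enum import Enum, auto
from warnings import warn


class PersistenceMode(Enum):
    """Stand-in for pw.PersistenceMode (only the member used below)."""

    UDF_CACHING = auto()


def resolve_persistence_mode(with_cache, persistence_mode):
    # Mirrors the branching in App.run: an explicit persistence_mode wins;
    # a legacy `with_cache=True` maps to UDF caching with a deprecation
    # warning; otherwise persistence stays off.
    if persistence_mode is not None:
        return persistence_mode
    if with_cache is True:
        warn(
            "`with_cache` is deprecated. Please use `persistence_mode` instead.",
            DeprecationWarning,
        )
        return PersistenceMode.UDF_CACHING
    return None


# Legacy configuration: warns and falls back to UDF caching.
print(resolve_persistence_mode(with_cache=True, persistence_mode=None))
# → PersistenceMode.UDF_CACHING
```

Note that the deprecation path only fires when `persistence_mode` is explicitly null, since the field's default is already `UDF_CACHING`.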

templates/adaptive_rag/app.yaml

Lines changed: 16 additions & 6 deletions

````diff
@@ -47,7 +47,7 @@ $sources:
 # https://pathway.com/developers/templates/rag-customization/llm-chats
 
 $llm: !pw.xpacks.llm.llms.OpenAIChat
-  model: "gpt-4o-mini"
+  model: "gpt-4.1-mini"
   retry_strategy: !pw.udfs.ExponentialBackoffRetryStrategy
     max_retries: 6
   cache_strategy: !pw.udfs.DefaultCache {}
@@ -56,22 +56,25 @@ $llm: !pw.xpacks.llm.llms.OpenAIChat
 
 # Specifies the embedder model for converting text into embeddings.
 $embedder: !pw.xpacks.llm.embedders.OpenAIEmbedder
-  model: "text-embedding-ada-002"
+  model: "text-embedding-3-small"
   cache_strategy: !pw.udfs.DefaultCache {}
+  retry_strategy: !pw.udfs.ExponentialBackoffRetryStrategy {}
 
 # Defines the splitter settings for dividing text into smaller chunks.
 $splitter: !pw.xpacks.llm.splitters.TokenCountSplitter
   max_tokens: 400
 
 # Configures the parser for processing and extracting information from documents.
 $parser: !pw.xpacks.llm.parsers.DoclingParser
+  async_mode: "fully_async"
+  chunk: false
   cache_strategy: !pw.udfs.DefaultCache {}
 
 # Sets up the retriever factory for indexing and retrieving documents.
-$retriever_factory: !pw.stdlib.indexing.BruteForceKnnFactory
+$retriever_factory: !pw.indexing.UsearchKnnFactory
   reserved_space: 1000
   embedder: $embedder
-  metric: !pw.stdlib.indexing.BruteForceKnnMetricKind.COS
+  metric: !pw.indexing.USearchMetricKind.COS
 
 # Manages the storage and retrieval of documents for the RAG template.
 $document_store: !pw.xpacks.llm.document_store.DocumentStore
@@ -96,8 +99,15 @@ question_answerer: !pw.xpacks.llm.question_answering.AdaptiveRAGQuestionAnswerer
 # host: "0.0.0.0"
 # port: 8000
 
-# Activate on-disk caching for UDFs for which `cache_strategy` is set
-# with_cache: true
+# By default, caching is enabled for UDFs with cache_strategy set.
+# You can disable it by uncommenting the following line.
+# persistence_mode: null
+# You can also set persistence_mode to !pw.PersistenceMode.PERSISTING to enable persistence
+# across restarts.
+# By default, when enabled, Cache is stored in .Cache directory.
+# You can customize the location by uncommenting and modifying the following lines:
+# persistence_backend: !pw.persistence.Backend.filesystem
+#   path: ".Cache"
 
 # If `terminate_on_error` is true then the program will terminate whenever any error is encountered.
 # Defaults to false, uncomment the following line if you want to set it to true
````
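The comments added to `app.yaml` in this commit note that persistence across restarts (rather than only UDF-result caching) can be enabled by uncommenting and adjusting those keys; a sketch of the resulting fragment, using only the values the comments themselves mention:

```yaml
persistence_mode: !pw.PersistenceMode.PERSISTING
persistence_backend: !pw.persistence.Backend.filesystem
  path: ".Cache"
```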

templates/document_indexing/README.md

Lines changed: 3 additions & 3 deletions

````diff
@@ -40,7 +40,7 @@ Finally, the embeddings are indexed with the capabilities of Pathway's machine-l
 ## Pipeline Organization
 
 This folder contains several objects:
-- `main.py`, the pipeline code using Pathway and written in Python;
+- `app.py`, the pipeline code using Pathway and written in Python;
 - `app.yaml`, the file containing configuration of the pipeline, like embedding model, sources, or the server address;
 - `requirements.txt`, the textfile denoting the pip dependencies for running this pipeline. It can be passed to `pip install -r ...` to install everything that is needed to launch the pipeline locally;
 - `Dockerfile`, the Docker configuration for running the pipeline in the container;
@@ -96,9 +96,9 @@ cache_backend: !pw.persistence.Backend.filesystem
 
 You can configure the data sources by changing `$sources` in `app.yaml`.
 You can add as many data sources as you want. You can have several sources of the same kind, for instance, several local sources from different folders.
-The sections below describe how to configure local, Google Drive and Sharepoint source, but you can use any input [connector](https://pathway.com/developers/user-guide/connecting-to-data/connectors) from Pathway package.
+The sections below describe how to configure local, Google Drive and Sharepoint source, and you can check the examples of YAML configuration in our [user guide](https://pathway.com/developers/templates/yaml-snippets/data-sources-examples/). While these are not described in this Section, you can also use any input [connector](https://pathway.com/developers/user-guide/connecting-to-data/connectors) from Pathway package.
 
-By default, the app uses a local data source to read documents from the `data` folder.
+By default, the app uses a local data source to read documents from the `files-from-indexing` folder.
 
 #### Local Data Source
 
````

templates/document_indexing/app.py

Lines changed: 35 additions & 5 deletions

````diff
@@ -1,4 +1,5 @@
 import logging
+from warnings import warn
 
 import pathway as pw
 from dotenv import load_dotenv
@@ -25,17 +26,46 @@ class App(BaseModel):
     host: str = "0.0.0.0"
     port: int = 8000
 
-    with_cache: bool = True
+    with_cache: bool | None = None  # deprecated
+    persistence_backend: pw.persistence.Backend | None = None
+    persistence_mode: pw.PersistenceMode | None = pw.PersistenceMode.UDF_CACHING
     terminate_on_error: bool = False
 
     def run(self) -> None:
-        server = DocumentStoreServer(self.host, self.port, self.document_store)
-        server.run(
-            with_cache=self.with_cache,
+        server = DocumentStoreServer(  # noqa: F841
+            self.host, self.port, self.document_store
+        )
+        if self.persistence_mode is None:
+            if self.with_cache is True:
+                warn(
+                    "`with_cache` is deprecated. Please use `persistence_mode` instead.",
+                    DeprecationWarning,
+                )
+                persistence_mode = pw.PersistenceMode.UDF_CACHING
+            else:
+                persistence_mode = None
+        else:
+            persistence_mode = self.persistence_mode
+
+        if persistence_mode is not None:
+            if self.persistence_backend is None:
+                persistence_backend = pw.persistence.Backend.filesystem("./Cache")
+            else:
+                persistence_backend = self.persistence_backend
+            persistence_config = pw.persistence.Config(
+                persistence_backend,
+                persistence_mode=persistence_mode,
+            )
+        else:
+            persistence_config = None
+
+        pw.run(
+            persistence_config=persistence_config,
             terminate_on_error=self.terminate_on_error,
+            monitoring_level=pw.MonitoringLevel.NONE,
         )
 
-    model_config = ConfigDict(extra="forbid")
+    model_config = ConfigDict(extra="forbid", arbitrary_types_allowed=True)
 
 
 if __name__ == "__main__":
````
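Both updated `app.py` files report the deprecated `with_cache` field through Python's standard `warnings` machinery, so the message can be captured, silenced, or escalated with stdlib tools alone; a minimal, Pathway-free illustration:

```python
import warnings


def use_legacy_option():
    # Emits the same category and message shape as the templates'
    # deprecation notice for `with_cache`.
    warnings.warn(
        "`with_cache` is deprecated. Please use `persistence_mode` instead.",
        DeprecationWarning,
    )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # ensure DeprecationWarning is recorded
    use_legacy_option()

print(len(caught), caught[0].category.__name__)  # → 1 DeprecationWarning
```

Running with `python -W error::DeprecationWarning` would instead turn the warning into an exception, which is a common way to flush out deprecated settings in CI.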

templates/document_indexing/app.yaml

Lines changed: 14 additions & 5 deletions

````diff
@@ -51,14 +51,16 @@ $splitter: !pw.xpacks.llm.splitters.TokenCountSplitter
   max_tokens: 400
 
 # Configures the parser for processing and extracting information from documents.
-$parser: !pw.xpacks.llm.parsers.UnstructuredParser
+$parser: !pw.xpacks.llm.parsers.DoclingParser
+  async_mode: "fully_async"
+  chunk: false
   cache_strategy: !pw.udfs.DefaultCache {}
 
 # Sets up the retriever factory for indexing and retrieving documents.
-$retriever_factory: !pw.stdlib.indexing.BruteForceKnnFactory
+$retriever_factory: !pw.indexing.UsearchKnnFactory
   reserved_space: 1000
   embedder: $embedder
-  metric: !pw.stdlib.indexing.BruteForceKnnMetricKind.COS
+  metric: !pw.indexing.USearchMetricKind.COS
 
 # Manages the storage and retrieval of documents for the RAG template.
 document_store: !pw.xpacks.llm.document_store.DocumentStore
@@ -71,8 +73,15 @@ document_store: !pw.xpacks.llm.document_store.DocumentStore
 # host: "0.0.0.0"
 # port: 8000
 
-# Activate on-disk caching for UDFs for which `cache_strategy` is set
-# with_cache: true
+# By default, caching is enabled for UDFs with cache_strategy set.
+# You can disable it by uncommenting the following line.
+# persistence_mode: null
+# You can also set persistence_mode to !pw.PersistenceMode.PERSISTING to enable persistence
+# across restarts.
+# By default, when enabled, Cache is stored in .Cache directory.
+# You can customize the location by uncommenting and modifying the following lines:
+# persistence_backend: !pw.persistence.Backend.filesystem
+#   path: ".Cache"
 
 # If `terminate_on_error` is true then the program will terminate whenever any error is encountered.
 # Defaults to false, uncomment the following line if you want to set it to true
````
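Both `app.yaml` files switch the retriever to a USearch-backed KNN factory with the cosine metric. As a mental model of the ranking such an index computes, here is a pure-Python sketch of cosine-similarity KNN (an illustration only, not the USearch implementation, which uses an approximate index rather than exhaustive scoring):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def knn(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = sorted(
        vectors.items(),
        key=lambda kv: cosine_similarity(query, kv[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]


# Toy 2-d "embeddings": a and b point roughly the same way, c is orthogonal.
docs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(knn([1.0, 0.05], docs, k=2))  # → ['a', 'b']
```

Because cosine similarity ignores vector magnitude, only the direction of the embedding matters, which is the usual choice for text-embedding retrieval.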

templates/multimodal_rag/README.md

Lines changed: 12 additions & 12 deletions

````diff
@@ -89,24 +89,24 @@ Here some examples of what can be modified.
 
 ### LLM Model
 
-This template by default uses two llm models - GPT-3.5 Turbo for answering queries and GPT-4o for parsing tables and images.
+This template by default uses two llm models - GPT-4.1-mini for answering queries and GPT-4o for parsing tables and images.
 
-You can replace GPT-3.5 Turbo with other Open AI models, like GPT-4, or GPT-4 Turbo.
-You can find the whole list on their [models page](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo).
+You can replace either of them with other Open AI models, like GPT-4.1 or GPT-5, but keep in mind that the model used for parsing needs to support image input.
+You can find the whole list on their [models page](https://platform.openai.com/docs/models).
 
-You simply need to change the `model` to the one you want to use:
+To change the model of the answering llm, you simply need to change the `model` in the `$llm` variable to the one you want to use, e.g. to use `GPT-5` set:
 ```yaml
 $llm: !pw.xpacks.llm.llms.OpenAIChat
-  model: "gpt-3.5-turbo"
+  model: "gpt-5"
   retry_strategy: !pw.udfs.ExponentialBackoffRetryStrategy
     max_retries: 6
-  cache_strategy: !pw.udfs.DiskCache
-  temperature: 0.05
+  cache_strategy: !pw.udfs.DefaultCache {}
+  temperature: 0
   capacity: 8
 ```
 
 You can also use different provider, by using different class from [Pathway LLM xpack](https://pathway.com/developers/user-guide/llm-xpack/overview),
-e.g. here is configuration for locally run Mistral model.
+e.g. here is configuration for locally run Mistral model with Ollama.
 
 ```yaml
 $llm: !pw.xpacks.llm.llms.LiteLLMChat
@@ -132,19 +132,19 @@ port: 8000
 
 ### Cache
 
-You can configure whether you want to enable cache, to avoid repeated API accesses, and where the cache is stored.
+You can configure whether you want to enable cache or persistence, to avoid repeated API accesses, and where the cache is stored.
 Default values:
 ```yaml
-with_cache: True
-cache_backend: !pw.persistence.Backend.filesystem
+persistence_mode: !pw.PersistenceMode.UDF_CACHING
+persistence_backend: !pw.persistence.Backend.filesystem
   path: ".Cache"
 ```
 
 ### Data sources
 
 You can configure the data sources by changing `$sources` in `app.yaml`.
 You can add as many data sources as you want. You can have several sources of the same kind, for instance, several local sources from different folders.
-The sections below describe how to configure local, Google Drive and Sharepoint source, but you can use any input [connector](https://pathway.com/developers/user-guide/connecting-to-data/connectors) from Pathway package.
+The sections below describe how to configure local, Google Drive and Sharepoint source, and you can check the examples of YAML configuration in our [user guide](https://pathway.com/developers/templates/yaml-snippets/data-sources-examples/). While these are not described in this Section, you can also use any input [connector](https://pathway.com/developers/user-guide/connecting-to-data/connectors) from Pathway package.
 
 By default, the app uses a local data source to read documents from the `data` folder.
 
````