Commit b84498d

Pathway-Dev and Manul from Pathway committed
Daily Pathway examples refresh
Co-authored-by: Manul from Pathway <github.manul@pathway.com>
GitOrigin-RevId: d167364ba193ef0e370866b315f069ac9afc1bff
1 parent c36bdca commit b84498d

File tree

3 files changed: +44 −44 lines changed


examples/notebooks/tutorials/asynctransformer.ipynb

Lines changed: 35 additions & 35 deletions
@@ -162,11 +162,11 @@
 "output_type": "stream",
 "text": [
 " | value | ret | __time__ | __diff__\n",
-"^Z3QWT29... | 2 | 3 | 1741239254830 | 1\n",
-"^3CZ78B4... | 2 | 3 | 1741239254830 | 1\n",
-"^YYY4HAB... | 6 | 7 | 1741239255232 | 1\n",
-"^3HN31E1... | 6 | 7 | 1741239255232 | 1\n",
-"^X1MXHYY... | 12 | 13 | 1741239255830 | 1\n"
+"^Z3QWT29... | 2 | 3 | 1741325601406 | 1\n",
+"^3CZ78B4... | 2 | 3 | 1741325601406 | 1\n",
+"^YYY4HAB... | 6 | 7 | 1741325601806 | 1\n",
+"^3HN31E1... | 6 | 7 | 1741325601806 | 1\n",
+"^X1MXHYY... | 12 | 13 | 1741325602406 | 1\n"
 ]
 }
 ],
@@ -410,10 +410,10 @@
 "output_type": "stream",
 "text": [
 " | group | value | ret | __time__ | __diff__\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239264432 | 1\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239264632 | -1\n",
-"^Z3QWT29... | 2 | 3 | 4 | 1741239264632 | 1\n",
-"^YYY4HAB... | 1 | 2 | 3 | 1741239264832 | 1\n"
+"^Z3QWT29... | 2 | 1 | 2 | 1741325610972 | 1\n",
+"^Z3QWT29... | 2 | 1 | 2 | 1741325611172 | -1\n",
+"^Z3QWT29... | 2 | 3 | 4 | 1741325611172 | 1\n",
+"^YYY4HAB... | 1 | 2 | 3 | 1741325611372 | 1\n"
 ]
 }
 ],
@@ -535,14 +535,14 @@
 "output_type": "stream",
 "text": [
 " | group | value | ret | __time__ | __diff__\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239267032 | 1\n",
-"^3HN31E1... | 4 | 3 | 4 | 1741239267032 | 1\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239267132 | -1\n",
-"^3HN31E1... | 4 | 3 | 4 | 1741239267132 | -1\n",
-"^Z3QWT29... | 2 | 4 | 5 | 1741239267132 | 1\n",
-"^3HN31E1... | 4 | 2 | 3 | 1741239267132 | 1\n",
-"^YYY4HAB... | 1 | 2 | 3 | 1741239267232 | 1\n",
-"^3CZ78B4... | 3 | 2 | 3 | 1741239267232 | 1\n"
+"^Z3QWT29... | 2 | 1 | 2 | 1741325613568 | 1\n",
+"^3HN31E1... | 4 | 3 | 4 | 1741325613568 | 1\n",
+"^Z3QWT29... | 2 | 1 | 2 | 1741325613668 | -1\n",
+"^3HN31E1... | 4 | 3 | 4 | 1741325613668 | -1\n",
+"^Z3QWT29... | 2 | 4 | 5 | 1741325613668 | 1\n",
+"^3HN31E1... | 4 | 2 | 3 | 1741325613668 | 1\n",
+"^YYY4HAB... | 1 | 2 | 3 | 1741325613768 | 1\n",
+"^3CZ78B4... | 3 | 2 | 3 | 1741325613768 | 1\n"
 ]
 }
 ],
@@ -575,14 +575,14 @@
 "output_type": "stream",
 "text": [
 " | group | value | ret | __time__ | __diff__\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239268286 | 1\n",
-"^3CZ78B4... | 3 | 1 | 2 | 1741239268286 | 1\n",
-"^3CZ78B4... | 3 | 1 | 2 | 1741239268386 | -1\n",
-"^3CZ78B4... | 3 | 2 | 3 | 1741239268386 | 1\n",
-"^3HN31E1... | 4 | 2 | 3 | 1741239268486 | 1\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239268588 | -1\n",
-"^Z3QWT29... | 2 | 4 | 5 | 1741239268588 | 1\n",
-"^YYY4HAB... | 1 | 2 | 3 | 1741239268688 | 1\n"
+"^Z3QWT29... | 2 | 1 | 2 | 1741325614808 | 1\n",
+"^3CZ78B4... | 3 | 1 | 2 | 1741325614810 | 1\n",
+"^3CZ78B4... | 3 | 1 | 2 | 1741325614910 | -1\n",
+"^3CZ78B4... | 3 | 2 | 3 | 1741325614910 | 1\n",
+"^3HN31E1... | 4 | 2 | 3 | 1741325615010 | 1\n",
+"^Z3QWT29... | 2 | 1 | 2 | 1741325615110 | -1\n",
+"^Z3QWT29... | 2 | 4 | 5 | 1741325615110 | 1\n",
+"^YYY4HAB... | 1 | 2 | 3 | 1741325615208 | 1\n"
 ]
 }
 ],
@@ -625,10 +625,10 @@
 "output_type": "stream",
 "text": [
 " | group | value | ret | __time__ | __diff__\n",
-"^YYY4HAB... | 1 | 2 | 3 | 1741239270150 | 1\n",
-"^Z3QWT29... | 2 | 4 | 5 | 1741239270150 | 1\n",
-"^3CZ78B4... | 3 | 2 | 3 | 1741239270150 | 1\n",
-"^3HN31E1... | 4 | 2 | 3 | 1741239270150 | 1\n"
+"^YYY4HAB... | 1 | 2 | 3 | 1741325616682 | 1\n",
+"^Z3QWT29... | 2 | 4 | 5 | 1741325616682 | 1\n",
+"^3CZ78B4... | 3 | 2 | 3 | 1741325616682 | 1\n",
+"^3HN31E1... | 4 | 2 | 3 | 1741325616682 | 1\n"
 ]
 }
 ],
@@ -672,12 +672,12 @@
 "output_type": "stream",
 "text": [
 " | group | value | ret | __time__ | __diff__\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239271402 | 1\n",
-"^3HN31E1... | 4 | 3 | 4 | 1741239271402 | 1\n",
-"^Z3QWT29... | 2 | 1 | 2 | 1741239271502 | -1\n",
-"^3HN31E1... | 4 | 3 | 4 | 1741239271502 | -1\n",
-"^YYY4HAB... | 1 | 2 | 3 | 1741239271602 | 1\n",
-"^3CZ78B4... | 3 | 2 | 3 | 1741239271602 | 1\n"
+"^Z3QWT29... | 2 | 1 | 2 | 1741325617944 | 1\n",
+"^3HN31E1... | 4 | 3 | 4 | 1741325617944 | 1\n",
+"^Z3QWT29... | 2 | 1 | 2 | 1741325618044 | -1\n",
+"^3HN31E1... | 4 | 3 | 4 | 1741325618044 | -1\n",
+"^YYY4HAB... | 1 | 2 | 3 | 1741325618142 | 1\n",
+"^3CZ78B4... | 3 | 2 | 3 | 1741325618142 | 1\n"
 ]
 }
 ],

examples/notebooks/tutorials/consistency.ipynb

Lines changed: 1 addition & 1 deletion
@@ -323,7 +323,7 @@
 "name": "stderr",
 "output_type": "stream",
 "text": [
-"INFO:pathway_engine.connectors.monitoring:subscribe-0: Done writing 0 entries, time 1741239490546. Current batch writes took: 0 ms. All writes so far took: 0 ms.\n"
+"INFO:pathway_engine.connectors.monitoring:subscribe-0: Done writing 0 entries, time 1741325837406. Current batch writes took: 0 ms. All writes so far took: 0 ms.\n"
 ]
 },
 {

examples/notebooks/tutorials/rag-evaluations.ipynb

Lines changed: 8 additions & 8 deletions
@@ -61,7 +61,7 @@
 "source": [
 "Pathway streamlines the process of building RAG applications with always up-to-date knowledge. It empowers you to connect your LLM to live data sources and eliminates the need for separate ETL pipelines for knowledge management.\n",
 "\n",
-"However, simply building and deploying a RAG app isn't enough, and evaluations shouldn't be treated as an afterthought. In Pathway, we rely on frequent evaluation runs to keep our offerings reliable. This also prevents us from introducing any silent bugs into the pipeline. \n",
+"However, simply building and deploying a RAG app isn't enough, and evaluations shouldn't be treated as an afterthought. In Pathway, we rely on frequent evaluation runs to keep our offerings reliable. This also prevents us from introducing any silent bugs into the pipeline.\n",
 "This guide offers a simplified look at how we evaluate our RAG solutions at Pathway. For a detailed view of the full pipeline, including additional evaluation components and logging, check out the [complete CI workflow](https://github.com/pathwaycom/pathway/tree/main/integration_tests/rag_evals).\n",
 "\n",
 "You need to ensure that your RAG application delivers accurate and reliable results with YOUR data. This is where our blog post dives in. You will explore RAG evaluations, create synthetic test data if necessary, and learn how to optimize your Pathway RAG app.\n",
@@ -106,11 +106,11 @@
 "metadata": {},
 "source": [
 "Some of the retrieval metrics are:\n",
-"- `Hit@k`: Measures the proportion of times that the relevant item appears in the top-K retrieved results. This can be also mentioned as `\"Context Recall\"`, that is assuming there is only one relevant document. \n",
+"- `Hit@k`: Measures the proportion of times that the relevant item appears in the top-K retrieved results. This can be also mentioned as `\"Context Recall\"`, that is assuming there is only one relevant document.\n",
 "- `Context Recall`: Focuses on the comprehensiveness of the retrieved context, measuring the proportion of all relevant documents in the corpus that are successfully retrieved. It is formally defined as `(Number of Relevant Items Retrieved) / (Total Number of Relevant Items in Corpus)`. In simpler terms, recall tells you \"Of all the relevant documents that could have been retrieved, how many were actually retrieved?\". High recall signifies that your retrieval system is good at finding most of the relevant context available.\n",
-"- `Context Precision`: This metric focuses on the quality of the retrieved context by measuring the proportion of retrieved documents that are actually relevant to the query. Formally, it is calculated as `(Number of Relevant Items Retrieved) / (Total Number of Items Retrieved)`. In contrast to `Hit@k` (or \"Context Recall\") which emphasizes retrieving at least one relevant item within the top-K results, precision evaluates the relevance concentration within the retrieved set. Essentially, precision answers: \"Of all documents retrieved, how many were relevant?\". \n",
-"- `Mean Reciprocal Rank (MRR)`: Evaluates the ranking of retrieved documents by focusing on the position of the first relevant document in the ranked list. For each query, the Reciprocal Rank (RR) is calculated as 1 / rank, where rank is the position of the first relevant document. If there are no relevant documents in the retrieved list, RR is 0. MRR is then the mean of these reciprocal ranks across a set of queries. Generally, you shouldn't stress about this metric in your RAG application. This is largely because the benefit of having the most relevant context ranked at the top is less critical for LLMs. \n",
-"- `Normalized Discounted Cumulative Gain (NDCG)`: A ranking-based metric that evaluates the quality of retrieved results by considering both relevance and position. Unlike Hit@k and MRR, which primarily focus on whether relevant items appear at the top, NDCG assigns higher importance to highly relevant documents appearing earlier in the ranked list. This metric can be useful when you have more than one relevant items and their relevancy has float labels instead of booleans. \n",
+"- `Context Precision`: This metric focuses on the quality of the retrieved context by measuring the proportion of retrieved documents that are actually relevant to the query. Formally, it is calculated as `(Number of Relevant Items Retrieved) / (Total Number of Items Retrieved)`. In contrast to `Hit@k` (or \"Context Recall\") which emphasizes retrieving at least one relevant item within the top-K results, precision evaluates the relevance concentration within the retrieved set. Essentially, precision answers: \"Of all documents retrieved, how many were relevant?\".\n",
+"- `Mean Reciprocal Rank (MRR)`: Evaluates the ranking of retrieved documents by focusing on the position of the first relevant document in the ranked list. For each query, the Reciprocal Rank (RR) is calculated as 1 / rank, where rank is the position of the first relevant document. If there are no relevant documents in the retrieved list, RR is 0. MRR is then the mean of these reciprocal ranks across a set of queries. Generally, you shouldn't stress about this metric in your RAG application. This is largely because the benefit of having the most relevant context ranked at the top is less critical for LLMs.\n",
+"- `Normalized Discounted Cumulative Gain (NDCG)`: A ranking-based metric that evaluates the quality of retrieved results by considering both relevance and position. Unlike Hit@k and MRR, which primarily focus on whether relevant items appear at the top, NDCG assigns higher importance to highly relevant documents appearing earlier in the ranked list. This metric can be useful when you have more than one relevant items and their relevancy has float labels instead of booleans.\n",
 "\n",
 "As for the generation metrics:\n",
 "- `Faithfulness`: Evaluates how grounded the LLM's answer is in the retrieved context. It measures whether the claims in the generated answer are supported by the provided context. Penalizes the hallucinations.\n",
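As an aside from the diff itself: the retrieval metrics described in the hunk above follow standard definitions, and a minimal Python sketch (illustrative helper functions, not part of the notebook or of any evaluation library) might look like:

```python
def hit_at_k(relevant_id, retrieved_ids, k):
    """1.0 if the (single) relevant document appears in the top-k results, else 0.0."""
    return 1.0 if relevant_id in retrieved_ids[:k] else 0.0

def reciprocal_rank(relevant_ids, retrieved_ids):
    """1/rank of the first relevant document in the ranked list; 0.0 if none retrieved."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def context_precision(relevant_ids, retrieved_ids):
    """Fraction of retrieved documents that are relevant."""
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for d in retrieved_ids if d in relevant_ids)
    return hits / len(retrieved_ids)

def context_recall(relevant_ids, retrieved_ids):
    """Fraction of all relevant documents that were retrieved."""
    if not relevant_ids:
        return 0.0
    retrieved = set(retrieved_ids)
    hits = sum(1 for d in relevant_ids if d in retrieved)
    return hits / len(relevant_ids)
```

MRR is then the mean of `reciprocal_rank` over a query set; with a single relevant document per query, `hit_at_k` and `context_recall` at cutoff k coincide, as the hunk notes.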
@@ -779,7 +779,7 @@
 "Calculate the evaluation metrics with our selected metrics.\n",
 "\n",
 "We introduced few modifications on top of the default RAGAS settings, namely:\n",
-"- We completely ignored semantic similarity in the answer correctness, we found that it usually gives \"false positives\" and unnecessarily rewards bad predictions\\*. \n",
+"- We completely ignored semantic similarity in the answer correctness, we found that it usually gives \"false positives\" and unnecessarily rewards bad predictions\\*.\n",
 "- We modified `answer_correctness_metric`'s prompt to be more forgiving and not look for the exact same words.\n",
 "- We increased `beta` parameter of the correctness to favor the recall rather than precision. We reward if LLM has more of relevant documents in the context. This is because LLM can choose to ignore irrelevant documents (False positive in context) which diminishes the importance of the precision.\n",
 "\n",
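The `beta` parameter mentioned in the hunk above comes from the F-beta score, where beta > 1 weights recall more heavily than precision. A minimal sketch of the formula (a hypothetical standalone helper, not RAGAS's actual implementation):

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R).

    beta = 1 gives the familiar F1; beta > 1 rewards recall over precision,
    which is the effect described above for answer correctness.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, with precision 0.5 and recall 1.0, raising beta from 1 to 2 lifts the score from 2/3 to 5/6, illustrating how a larger beta rewards the high-recall configuration.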
@@ -1525,9 +1525,9 @@
 "id": "102",
 "metadata": {},
 "source": [
-"Hmm, seems like this embedder cannot quite work as well as the previous one. Weirdly, faithfulness score dropped significantly, maybe the ordering of the chunks is the reason. \n",
+"Hmm, seems like this embedder cannot quite work as well as the previous one. Weirdly, faithfulness score dropped significantly, maybe the ordering of the chunks is the reason.\n",
 "\n",
-"Let's see if we can improve the performance with the prompt. "
+"Let's see if we can improve the performance with the prompt."
 ]
 },
 {
