Commit 0d7435f

Merge pull request #250262 from ChenJieting/jieting/tool_related_doc
update VectorDB tools docs and add a faq.md for tools
2 parents b2d09fe + 3c6e582 commit 0d7435f

7 files changed

+193
-23
lines changed

articles/machine-learning/prompt-flow/tools-reference/faiss-index-lookup-tool.md

Lines changed: 4 additions & 2 deletions
@@ -18,13 +18,15 @@ Faiss Index Lookup is a tool tailored for querying within a user-provided Faiss-
1818

1919
## Prerequisites
2020
- Prepare an accessible path on Azure Blob Storage. Here's the guide if a new storage account needs to be created: [Azure Storage Account](../../../storage/common/storage-account-create.md).
21-
- Create related Faiss-based index files on Azure Blob Storage. We support the LangChain format (index.faiss + index.pkl) for the index files, which can be prepared either by employing our EmbeddingStore SDK or following the quick guide from [LangChain documentation](https://python.langchain.com/docs/modules/data_connection/vectorstores/integrations/faiss). Please refer to [the sample notebook for creating Faiss index](https://aka.ms/pf-sample-build-faiss-index) for building index using EmbeddingStore SDK.
21+
- Create related Faiss-based index files on Azure Blob Storage. We support the LangChain format (index.faiss + index.pkl) for the index files, which can be prepared either by using the promptflow-vectordb SDK or by following the quick guide in the [LangChain documentation](https://python.langchain.com/docs/modules/data_connection/vectorstores/integrations/faiss). To build the index with the promptflow-vectordb SDK, see [the sample notebook for creating a Faiss index](https://aka.ms/pf-sample-build-faiss-index).
2222
- Depending on where you put your index files, the identity used by the promptflow runtime must be granted certain roles. See [Steps to assign an Azure role](../../../role-based-access-control/role-assignments-steps.md):
2323

2424
| Location | Role |
2525
| ---- | ---- |
2626
| workspace datastores or workspace default blob | AzureML Data Scientist |
2727
| other blobs | Storage Blob Data Reader |
28+
> [!NOTE]
29+
> When switching legacy tools to code-first mode, if you encounter the "'embeddingstore.tool.faiss_index_lookup.search' is not found" error, refer to the [Troubleshoot Guidance](./troubleshoot-guidance.md).
2830
2931
## Inputs
3032

@@ -38,7 +40,7 @@ The tool accepts the following inputs:
3840

3941
## Outputs
4042

41-
The following is an example for JSON format response returned by the tool, which includes the top-k scored entities. The entity follows a generic schema of vector search result provided by our EmbeddingStore SDK. For the Faiss Index Search, the following fields are populated:
43+
The following is an example of the JSON response returned by the tool, which includes the top-k scored entities. Each entity follows the generic vector search result schema provided by the promptflow-vectordb SDK. For the Faiss Index Search, the following fields are populated:
4244

4345
| Field Name | Type | Description |
4446
| ---- | ---- | ----------- |

articles/machine-learning/prompt-flow/tools-reference/troubleshoot-guidance.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
---
2+
title: Troubleshoot guidance
3+
titleSuffix: Azure Machine Learning
4+
description: This article addresses frequent questions about tool usage.
5+
services: machine-learning
6+
ms.service: machine-learning
7+
ms.subservice: core
8+
ms.topic: reference
9+
author: ChenJieting
10+
ms.author: chenjieting
11+
ms.reviewer: lagayhar
12+
ms.date: 09/05/2023
13+
---
14+
15+
# Troubleshoot guidance
16+
17+
This article addresses frequent questions about tool usage.
18+
19+
## Error "package tool is not found" occurs when updating the flow for the code-first experience
20+
21+
When you update flows for the code-first experience, if a flow uses any of these tools (Faiss Index Lookup, Vector Index Lookup, Vector DB Lookup, Content Safety (Text)), you might encounter an error message like the following:
22+
23+
<code><i>Package tool 'embeddingstore.tool.faiss_index_lookup.search' is not found in the current environment.</i></code>
24+
25+
To resolve the issue, you have two options:
26+
27+
- **Option 1**
28+
- Update your runtime to the latest version.
29+
- Select "Raw file mode" to switch to the raw code view, then open the "flow.dag.yaml" file.
30+
![how-to-switch-to-raw-file-mode](../media/faq/switch-to-raw-file-mode.png)
31+
- Update the tool names.
32+
![how-to-update-tool-name](../media/faq/update-tool-name.png)
33+
34+
| Tool | New tool name |
35+
| ---- | ---- |
36+
| Faiss Index Lookup tool | promptflow_vectordb.tool.faiss_index_lookup.FaissIndexLookup.search |
37+
| Vector Index Lookup | promptflow_vectordb.tool.vector_index_lookup.VectorIndexLookup.search |
38+
| Vector DB Lookup | promptflow_vectordb.tool.vector_db_lookup.VectorDBLookup.search |
39+
| Content Safety (Text) | content_safety_text.tools.content_safety_text_tool.analyze_text |
40+
- Save the "flow.dag.yaml" file.
41+
42+
- **Option 2**
43+
- Update your runtime to the latest version.
44+
- Remove the old tool and create a new one.
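
The renaming in Option 1 could also be scripted. The following is a minimal Python sketch, assuming the old tool names appear verbatim in "flow.dag.yaml" as they do in the error messages quoted in the tool articles. The YAML fragment and the `rename_tools` helper are hypothetical and only for illustration. The old name for Content Safety (Text) isn't given here, so it's left out of the mapping.

```python
# Old package-tool names (as quoted in the "is not found" errors) mapped to
# the new tool names from the table in Option 1.
TOOL_RENAMES = {
    "embeddingstore.tool.faiss_index_lookup.search":
        "promptflow_vectordb.tool.faiss_index_lookup.FaissIndexLookup.search",
    "embeddingstore.tool.vector_index_lookup.search":
        "promptflow_vectordb.tool.vector_index_lookup.VectorIndexLookup.search",
    "embeddingstore.tool.vector_db_lookup.search":
        "promptflow_vectordb.tool.vector_db_lookup.VectorDBLookup.search",
}


def rename_tools(yaml_text: str) -> str:
    """Return the flow definition text with old tool names replaced."""
    for old_name, new_name in TOOL_RENAMES.items():
        yaml_text = yaml_text.replace(old_name, new_name)
    return yaml_text


# Hypothetical fragment of a flow.dag.yaml file that uses a legacy tool name.
sample = """nodes:
- name: lookup
  type: python
  source:
    type: package
    tool: embeddingstore.tool.faiss_index_lookup.search
"""

updated = rename_tools(sample)
print(updated)
```

To apply this to a real flow, you would read the file contents, pass them through `rename_tools`, and write the result back. Keep a backup of the original file, because plain text replacement doesn't validate the YAML.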

articles/machine-learning/prompt-flow/tools-reference/vector-db-lookup-tool.md

Lines changed: 140 additions & 20 deletions
@@ -19,68 +19,188 @@ Vector DB Lookup is a vector search tool that allows users to search top k simil
1919
| Name | Description |
2020
| --- | --- |
2121
| Azure Cognitive Search | Microsoft's cloud search service with built-in AI capabilities that enrich all types of information to help identify and explore relevant content at scale. |
22+
| Qdrant | Qdrant is a vector similarity search engine that provides a production-ready service with a convenient API to store, search and manage points (i.e. vectors) with an additional payload. |
23+
| Weaviate | Weaviate is an open source vector database that stores both objects and vectors. This allows for combining vector search with structured filtering. |
2224

23-
This tool adds support for more vector databases, including Pinecone, Weaviete, Qdrant etc.
25+
This tool will support more vector databases.
2426

2527
> [!IMPORTANT]
2628
> Prompt flow is currently in public preview. This preview is provided without a service-level agreement, and is not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
2729
> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
2830
2931
## Prerequisites
3032

31-
The tool searches data from a third-party vector database. To use it, you should create resources in advance and establish connections between the tool and the resource.
33+
The tool searches data from a third-party vector database. To use it, create the resources in advance and establish a connection between the tool and each resource.
3234

3335
- **Azure Cognitive Search:**
3436
- Create resource [Azure Cognitive Search](../../../search/search-create-service-portal.md).
35-
- Add "CognitiveSearchConnection" connection. Fill "API key" field with "Primary admin key" from "Keys" section of created resource, and fill "Api Base" field with the URL, the URL format is `https://{your_serive_name}.search.windows.net`.
37+
- Add a "Cognitive search" connection. Fill the "API key" field with the "Primary admin key" from the "Keys" section of the created resource, and fill the "API base" field with the URL. The URL format is `https://{your_service_name}.search.windows.net`.
38+
39+
- **Qdrant:**
40+
- Follow the [installation guide](https://qdrant.tech/documentation/quick-start/) to deploy Qdrant to a self-maintained cloud server.
41+
- Add a "Qdrant" connection. Fill the "API base" field with the address of your self-maintained cloud server, and fill the "API key" field.
42+
43+
- **Weaviate:**
44+
- Follow the [installation guide](https://weaviate.io/developers/weaviate/installation) to deploy Weaviate to a self-maintained instance.
45+
- Add a "Weaviate" connection. Fill the "API base" field with the address of your self-maintained instance, and fill the "API key" field.
46+
47+
> [!NOTE]
48+
> When switching legacy tools to code-first mode, if you encounter the "'embeddingstore.tool.vector_db_lookup.search' is not found" error, refer to the [Troubleshoot Guidance](./troubleshoot-guidance.md).
3649
3750
## Inputs
3851

39-
The following are available input parameters:
52+
The tool accepts the following inputs:
4053
- **Azure Cognitive Search:**
4154

4255
| Name | Type | Description | Required |
4356
| ---- | ---- | ----------- | -------- |
44-
| connection | CognitiveSearchConnection | The created workspace connection for accessing to Cognitive Search service endpoint. | Yes |
57+
| connection | CognitiveSearchConnection | The connection created for accessing the Cognitive Search endpoint. | Yes |
4558
| index_name | string | The index name created in Cognitive Search resource. | Yes |
46-
| text_field | string | The text field name. The returned text filed will populate the result of text. | No |
59+
| text_field | string | The text field name. The returned text field populates the text of the output. | No |
4760
| vector_field | string | The vector field name. The target vector is searched in this vector field. | Yes |
48-
| search_params | dict | The search parameters. It's key-value pairs. Except for parameters in the tool input list mentioned above, additional search parameters can be formed into a JSON object as search_params. For example, use `{"select": ""}` as search_params to select the returned fields, use `{"search": "", "queryType": "", ""semanticConfiguration": "", "queryLanguage": ""}` to perform a hybrid search. | No |
61+
| search_params | dict | The search parameters, as key-value pairs. In addition to the parameters in the tool input list above, more search parameters can be passed as a JSON object in search_params. For example, use `{"select": ""}` to select the returned fields, or use `{"search": ""}` to perform a [hybrid search](../../../search/search-get-started-vector.md#hybrid-search). | No |
4962
| search_filters | dict | The search filters, as key-value pairs. The input format is like `{"filter": ""}`. | No |
50-
| vector | list | The target vector to be queried, which can be generated by the LLM tool. | Yes |
63+
| vector | list | The target vector to be queried, which can be generated by the Embedding tool. | Yes |
5164
| top_k | int | The count of top-scored entities to return. Default value is 3 | No |
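
As an illustration of the `search_params` input for Azure Cognitive Search, the following Python sketch combines the `search` and `select` keys mentioned in the table. The `build_search_params` helper, the query text, and the field names `id` and `content` are hypothetical. It assumes that `select` takes a comma-separated string of field names, which matches the string placeholder shown in the table.

```python
import json


def build_search_params(query_text, select_fields=None):
    """Build a search_params dict for a hybrid (text plus vector) query."""
    # "search" carries the text query that turns the lookup into a hybrid search.
    params = {"search": query_text}
    if select_fields:
        # "select" limits the returned fields; assumed to be comma-separated.
        params["select"] = ",".join(select_fields)
    return params


params = build_search_params("sample query", ["id", "content"])
print(json.dumps(params))
```

The printed JSON object is what you would supply as the `search_params` input of the tool.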
5265

66+
- **Qdrant:**
5367

54-
## Outputs
68+
| Name | Type | Description | Required |
69+
| ---- | ---- | ----------- | -------- |
70+
| connection | QdrantConnection | The connection created for accessing the Qdrant server. | Yes |
71+
| collection_name | string | The name of the collection created on the self-maintained cloud server. | Yes |
72+
| text_field | string | The text field name. The returned text field populates the text of the output. | No |
73+
| search_params | dict | The search parameters, passed as a JSON object. For example, use `{"params": {"hnsw_ef": 0, "exact": false, "quantization": null}}` as search_params. | No |
74+
| search_filters | dict | The search filters, as key-value pairs. The input format is like `{"filter": {"should": [{"key": "", "match": {"value": ""}}]}}`. | No |
75+
| vector | list | The target vector to be queried, which can be generated by the Embedding tool. | Yes |
76+
| top_k | int | The count of top-scored entities to return. Default value is 3 | No |
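
To make the nested Qdrant inputs concrete, here is a minimal Python sketch that builds `search_params` and `search_filters` in the shapes shown in the table. The `build_qdrant_inputs` helper, the payload key `category`, its value `docs`, and the `hnsw_ef` value of 128 are hypothetical choices for illustration, not recommended settings.

```python
import json


def build_qdrant_inputs(key, value, hnsw_ef=128, exact=False):
    """Return (search_params, search_filters) in the shapes the tool expects."""
    # Mirrors {"params": {"hnsw_ef": ..., "exact": ..., "quantization": null}}.
    search_params = {
        "params": {"hnsw_ef": hnsw_ef, "exact": exact, "quantization": None}
    }
    # Mirrors {"filter": {"should": [{"key": "", "match": {"value": ""}}]}}.
    search_filters = {
        "filter": {"should": [{"key": key, "match": {"value": value}}]}
    }
    return search_params, search_filters


search_params, search_filters = build_qdrant_inputs("category", "docs")
# json.dumps renders Python None and False as JSON null and false.
print(json.dumps(search_params))
print(json.dumps(search_filters))
```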
5577

56-
The following is an example JSON format response returned by the tool, which includes the top-k scored entities. The entity follows a generic schema of vector search result provided by our EmbeddingStore SDK.
78+
- **Weaviate:**
79+
80+
| Name | Type | Description | Required |
81+
| ---- | ---- | ----------- | -------- |
82+
| connection | WeaviateConnection | The connection created for accessing Weaviate. | Yes |
83+
| class_name | string | The class name. | Yes |
84+
| text_field | string | The text field name. The returned text field populates the text of the output. | No |
85+
| vector | list | The target vector to be queried, which can be generated by the Embedding tool. | Yes |
86+
| top_k | int | The count of top-scored entities to return. Default value is 3 | No |
5787

58-
**Azure Cognitive Search:**
88+
## Outputs
5989

60-
For the Azure Cognitive Search, the following fields are populated:
90+
The following are examples of the JSON response returned by the tool, which includes the top-k scored entities. Each entity follows the generic vector search result schema provided by the promptflow-vectordb SDK.
91+
- **Azure Cognitive Search:**
6192

62-
| Field Name | Type | Description |
63-
|-----------------|--------|-------------------------------------------------------------------|
64-
| vector | list | vector of the entity, the vector field name is specified in input |
65-
| text | string | text of the entity, the text field name is specified in input |
66-
| score | float | @search.score from the original entity, which evaluates the similarity between the entity and the query vector |
67-
| original_entity | dict | the original response json from search REST API |
93+
For Azure Cognitive Search, the following fields are populated:
6894

95+
| Field Name | Type | Description |
96+
| ---- | ---- | ----------- |
97+
| original_entity | dict | the original response json from search REST API|
98+
| score | float | @search.score from the original entity, which evaluates the similarity between the entity and the query vector |
99+
| text | string | text of the entity|
100+
| vector | list | vector of the entity|
69101

102+
<details>
103+
<summary>Output</summary>
104+
70105
```json
71106
[
72107
{
73108
"metadata": null,
74109
"original_entity": {
75110
"@search.score": 0.5099789,
76111
"id": "",
77-
"your_text_filed_name": "text",
112+
"your_text_field_name": "sample text1",
78113
"your_vector_field_name": [-0.40517663431890405, 0.5856996257406859, -0.1593078462266455, -0.9776269170785785, -0.6145604369828972],
79114
"your_additional_field_name": ""
80115
},
81116
"score": 0.5099789,
82-
"text": "text",
117+
"text": "sample text1",
83118
"vector": [-0.40517663431890405, 0.5856996257406859, -0.1593078462266455, -0.9776269170785785, -0.6145604369828972]
84119
}
85120
]
86121
```
122+
</details>
123+
124+
- **Qdrant:**
125+
126+
For Qdrant, the following fields are populated:
127+
128+
| Field Name | Type | Description |
129+
| ---- | ---- | ----------- |
130+
| original_entity | dict | the original response json from search REST API|
131+
| metadata | dict | payload from the original entity|
132+
| score | float | score from the original entity, which evaluates the similarity between the entity and the query vector|
133+
| text | string | text of the payload|
134+
| vector | list | vector of the entity|
135+
136+
<details>
137+
<summary>Output</summary>
138+
139+
```json
140+
[
141+
{
142+
"metadata": {
143+
"text": "sample text1"
144+
},
145+
"original_entity": {
146+
"id": 1,
147+
"payload": {
148+
"text": "sample text1"
149+
},
150+
"score": 1,
151+
"vector": [0.18257418, 0.36514837, 0.5477226, 0.73029673],
152+
"version": 0
153+
},
154+
"score": 1,
155+
"text": "sample text1",
156+
"vector": [0.18257418, 0.36514837, 0.5477226, 0.73029673]
157+
}
158+
]
159+
```
160+
</details>
161+
162+
- **Weaviate:**
163+
164+
For Weaviate, the following fields are populated:
165+
166+
| Field Name | Type | Description |
167+
| ---- | ---- | ----------- |
168+
| original_entity | dict | the original response json from search REST API|
169+
| score | float | certainty from the original entity, which evaluates the similarity between the entity and the query vector|
170+
| text | string | text in the original entity|
171+
| vector | list | vector of the entity|
172+
173+
<details>
174+
<summary>Output</summary>
175+
176+
```json
177+
[
178+
{
179+
"metadata": null,
180+
"original_entity": {
181+
"_additional": {
182+
"certainty": 1,
183+
"distance": 0,
184+
"vector": [
185+
0.58,
186+
0.59,
187+
0.6,
188+
0.61,
189+
0.62
190+
]
191+
},
192+
"text": "sample text1."
193+
},
194+
"score": 1,
195+
"text": "sample text1.",
196+
"vector": [
197+
0.58,
198+
0.59,
199+
0.6,
200+
0.61,
201+
0.62
202+
]
203+
}
204+
]
205+
```
206+
</details>
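
Because all three backends populate the same top-level `score`, `text`, and `vector` fields, downstream code can treat the results uniformly. The following Python sketch parses the Qdrant sample output shown earlier and keeps the entities at or above a score threshold. The `top_texts` helper and the threshold of 0.5 are hypothetical.

```python
import json

# The Qdrant sample output from this article, as a JSON string.
response_json = """
[
  {
    "metadata": {"text": "sample text1"},
    "original_entity": {
      "id": 1,
      "payload": {"text": "sample text1"},
      "score": 1,
      "vector": [0.18257418, 0.36514837, 0.5477226, 0.73029673],
      "version": 0
    },
    "score": 1,
    "text": "sample text1",
    "vector": [0.18257418, 0.36514837, 0.5477226, 0.73029673]
  }
]
"""


def top_texts(entities, min_score=0.0):
    """Return (text, score) pairs for entities scoring at least min_score."""
    return [(e["text"], e["score"]) for e in entities if e["score"] >= min_score]


entities = json.loads(response_json)
results = top_texts(entities, min_score=0.5)
print(results)
```

Reading only the top-level fields keeps the code independent of the backend-specific layout inside `original_entity`.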

articles/machine-learning/prompt-flow/tools-reference/vector-index-lookup-tool.md

Lines changed: 3 additions & 1 deletion
@@ -29,6 +29,8 @@ Vector index lookup is a tool tailored for querying within an Azure Machine Lear
2929
| ---- | ---- |
3030
| workspace datastores or workspace default blob | AzureML Data Scientist |
3131
| other blobs | Storage Blob Data Reader |
32+
> [!NOTE]
33+
> When switching legacy tools to code-first mode, if you encounter the "'embeddingstore.tool.vector_index_lookup.search' is not found" error, refer to the [Troubleshoot Guidance](./troubleshoot-guidance.md).
3234
3335
## Inputs
3436

@@ -42,7 +44,7 @@ The tool accepts the following inputs:
4244

4345
## Outputs
4446

45-
The following is an example for JSON format response returned by the tool, which includes the top-k scored entities. The entity follows a generic schema of vector search result provided by our EmbeddingStore SDK. For the Vector Index Search, the following fields are populated:
47+
The following is an example of the JSON response returned by the tool, which includes the top-k scored entities. Each entity follows the generic vector search result schema provided by the promptflow-vectordb SDK. For the Vector Index Search, the following fields are populated:
4648

4749
| Field Name | Type | Description |
4850
| ---- | ---- | ----------- |

articles/machine-learning/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -656,6 +656,8 @@
656656
href: ./prompt-flow/tools-reference/vector-db-lookup-tool.md
657657
- name: SERP API tool
658658
href: ./prompt-flow/tools-reference/serp-api-tool.md
659+
- name: Troubleshoot Guidance
660+
href: ./prompt-flow/tools-reference/troubleshoot-guidance.md
659661
- name: Retrieval Augmented Generation (RAG)
660662
items:
661663
- name: What is RAG
