Commit bd32174

Merge pull request #283420 from MicrosoftDocs/main
Publish to live, Friday 4 AM PST, 8/2
2 parents 5143cff + fa30df2 commit bd32174

3,599 files changed: +4,225 −3,925 lines


articles/ai-studio/concepts/connections.md

Lines changed: 2 additions & 2 deletions

@@ -26,9 +26,9 @@ You can [create connections](../how-to/connections-add.md) to Azure AI services

  :::image type="content" source="../media/prompt-flow/llm-tool-connection.png" alt-text="Screenshot of a connection used by the LLM tool in prompt flow." lightbox="../media/prompt-flow/llm-tool-connection.png":::

- As another example, you can [create a connection](../how-to/connections-add.md) to an Azure AI Search resource. The connection can then be used by prompt flow tools such as the Vector DB Lookup tool.
+ As another example, you can [create a connection](../how-to/connections-add.md) to an Azure AI Search resource. The connection can then be used by prompt flow tools such as the Index Lookup tool.

- :::image type="content" source="../media/prompt-flow/vector-db-lookup-tool-connection.png" alt-text="Screenshot of a connection used by the Vector DB Lookup tool in prompt flow." lightbox="../media/prompt-flow/vector-db-lookup-tool-connection.png":::
+ :::image type="content" source="../media/prompt-flow/index-lookup-tool-connection.png" alt-text="Screenshot of a connection used by the Index Lookup tool in prompt flow." lightbox="../media/prompt-flow/index-lookup-tool-connection.png":::

  ## Connections to non-Microsoft services

Lines changed: 179 additions & 0 deletions

---
title: Troubleshoot guidance for prompt flow
titleSuffix: Azure AI Studio
description: This article addresses frequent questions about prompt flow usage.
manager: scottpolly
ms.service: azure-ai-studio
ms.topic: reference
author: lgayhardt
ms.author: lagayhar
ms.reviewer: chenjieting
ms.date: 07/31/2024
---

# Troubleshoot guidance for prompt flow

This article addresses frequent questions about prompt flow usage.

## Compute session related issues

### Run failed because of "No module named XXX"

This type of error indicates that the compute session lacks the required packages. If you're using a default environment, make sure the image of your compute session is using the latest version. If you're using a custom base image, make sure you installed all the required packages in your Docker context.
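
To check quickly which imports are actually available inside an environment, you can run a short probe in the same Python interpreter the session uses. This is an illustrative sketch (the `missing_modules` helper and the sample module names are hypothetical, not part of prompt flow); note that a pip distribution name can differ from its import name (for example, `pyyaml` installs the module `yaml`):

```python
from importlib.util import find_spec

def missing_modules(module_names):
    """Return the modules that can't be imported in the current environment."""
    return [name for name in module_names if find_spec(name) is None]

# "json" and "os" ship with Python; "no_such_module_xyz" stands in for a
# package that was never installed in the session image.
print(missing_modules(["json", "os", "no_such_module_xyz"]))  # → ['no_such_module_xyz']
```

Any name this reports is a candidate for your requirements.txt or Dockerfile.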

### Where to find the serverless instance used by a compute session?

You can view the serverless instance used by a compute session on the compute session list tab of the compute page. To learn more about managing serverless instances, see [Manage a compute session](./create-manage-compute-session.md#manage-a-compute-session).

### Compute session failures using a custom base image

## Flow run related issues

### How to find the raw inputs and outputs of the LLM tool for further investigation?

In prompt flow, on the flow page with a successful run and on the run detail page, you can find the raw inputs and outputs of the LLM tool in the output section. Select the **View full output** button to view the full output.

:::image type="content" source="../media/prompt-flow/view-full-output.png" alt-text="Screenshot that shows view full output on the LLM node." lightbox = "../media/prompt-flow/view-full-output.png":::

The `Trace` section includes each request and response to the LLM tool. You can check the raw message sent to the LLM model and the raw response from the LLM model.

:::image type="content" source="../media/prompt-flow/trace-large-language-model-tool.png" alt-text="Screenshot that shows the raw request sent to the LLM model and the response from the LLM model." lightbox = "../media/prompt-flow/trace-large-language-model-tool.png":::

### How to fix a 429 error from Azure OpenAI?

You might encounter a 429 error from Azure OpenAI, which means you've reached the Azure OpenAI rate limit. You can check the error message in the output section of the LLM node. Learn more about the [Azure OpenAI rate limit](../../ai-services/openai/quotas-limits.md).

:::image type="content" source="../media/prompt-flow/429-rate-limit.png" alt-text="Screenshot that shows the 429 rate limit error from Azure OpenAI." lightbox = "../media/prompt-flow/429-rate-limit.png":::
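
While waiting for a quota increase, a common client-side mitigation is to retry with exponential backoff. The sketch below is a generic illustration, not prompt flow or Azure OpenAI SDK code; `call_with_backoff`, `flaky_call`, and the `RuntimeError` carrying "429" are all stand-ins for your real client call and its rate-limit exception:

```python
import time

def call_with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff when it signals rate limiting.

    `call` is any zero-argument function; here a RuntimeError containing
    "429" stands in for the rate-limit error a real client would raise.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError as err:
            if "429" not in str(err) or attempt == max_retries - 1:
                raise
            # Wait 1s, 2s, 4s, ... before retrying, so bursts spread out
            # under the tokens-per-minute and requests-per-minute limits.
            time.sleep(base_delay * 2 ** attempt)

# Simulate an endpoint that rejects the first two calls with a 429.
attempts = {"n": 0}
def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429: rate limit exceeded")
    return "ok"

print(call_with_backoff(flaky_call, base_delay=0.01))  # → ok
```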

### Identify which node consumes the most time

1. Check the compute session logs.

1. Try to find the following warning log format:

   `{node_name} has been running for {duration} seconds.`

   For example:

   - **Case 1:** The Python script node runs for a long time.

     :::image type="content" source="../media/prompt-flow/runtime-timeout-running-for-long-time.png" alt-text="Screenshot that shows a timeout run sign." lightbox = "../media/prompt-flow/runtime-timeout-running-for-long-time.png":::

     In this case, you can find that `PythonScriptNode` was running for a long time (almost 300 seconds). Then you can check the node details to see what the problem is.

   - **Case 2:** The LLM node runs for a long time.

     :::image type="content" source="../media/prompt-flow/runtime-timeout-by-language-model-timeout.png" alt-text="Screenshot that shows timeout logs caused by an LLM timeout." lightbox = "../media/prompt-flow/runtime-timeout-by-language-model-timeout.png":::

     In this case, if you find the message `request canceled` in the logs, it might be because the OpenAI API call is taking too long and exceeding the timeout limit.

     An OpenAI API timeout could be caused by a network issue or a complex request that requires more processing time. For more information, see [OpenAI API timeout](https://help.openai.com/en/articles/6897186-timeout).

     Wait a few seconds and retry your request. This action usually resolves any network issues.

     If retrying doesn't work, check whether you're using a long context model, such as `gpt-4-32k`, and have set a large value for `max_tokens`. If so, the behavior is expected because your prompt might generate a long response that takes longer than the interactive mode's upper threshold. In this situation, we recommend trying `Bulk test` because this mode doesn't have a timeout setting.

1. If you can't find anything in the logs to indicate that it's a specific node issue:

   - Contact the prompt flow team ([promptflow-eng](mailto:[email protected])) with the logs. We try to identify the root cause.
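
If you prefer to scan downloaded session logs programmatically, the warning format above is easy to parse. This is a hypothetical helper (the `slowest_node` function and the sample log lines are illustrative, not a prompt flow API):

```python
import re

# Matches the session warning format: "{node_name} has been running for {duration} seconds."
PATTERN = re.compile(r"(\w+) has been running for ([\d.]+) seconds")

def slowest_node(log_text):
    """Return (node_name, max_duration_seconds) for the slowest node, or None."""
    durations = {}
    for name, seconds in PATTERN.findall(log_text):
        durations[name] = max(durations.get(name, 0.0), float(seconds))
    if not durations:
        return None
    return max(durations.items(), key=lambda item: item[1])

logs = """\
[WARNING] PythonScriptNode has been running for 120.5 seconds.
[WARNING] PythonScriptNode has been running for 295.0 seconds.
[WARNING] llm_node has been running for 30.2 seconds.
"""
print(slowest_node(logs))  # → ('PythonScriptNode', 295.0)
```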

## Flow deployment related issues

### Upstream request timeout issue when consuming the endpoint

If you use the CLI or SDK to deploy the flow, you might encounter a timeout error. By default, `request_timeout_ms` is 5000 (5 seconds). You can specify a maximum of 5 minutes, which is 300,000 ms. The following example shows how to specify the request timeout in the deployment YAML file. To learn more, see the [deployment schema](../../machine-learning/reference-yaml-deployment-managed-online.md).

```yaml
request_settings:
  request_timeout_ms: 300000
```

### OpenAI API hits authentication error

If you regenerate your Azure OpenAI key and manually update the connection used in prompt flow, you might encounter errors like "Unauthorized. Access token is missing, invalid, audience is incorrect or have expired." when you invoke an existing endpoint that was created before the key was regenerated.

This error occurs because the connections used in the endpoints/deployments aren't automatically updated. Any change to a key or secret in a deployment should be made by a manual update, which aims to avoid impacting an online production deployment through an unintentional offline operation.

- If the endpoint was deployed in the studio UI, you can just redeploy the flow to the existing endpoint by using the same deployment name.
- If the endpoint was deployed by using the SDK or CLI, you need to make a modification to the deployment definition, such as adding a dummy environment variable, and then use `az ml online-deployment update` to update your deployment.

### Vulnerability issues in prompt flow deployments

For prompt flow runtime related vulnerabilities, the following approaches can help mitigate them:

- Update the dependency packages in the requirements.txt file in your flow folder.
- If you're using a customized base image for your flow, update the prompt flow runtime to the latest version, rebuild your base image, and then redeploy the flow.

For any other vulnerabilities of managed online deployments, Azure AI fixes the issues monthly.

### "MissingDriverProgram Error" or "Could not find driver program in the request"

If you deploy your flow and encounter the following error, it might be related to the deployment environment.

```text
'error':
{
    'code': 'BadRequest',
    'message': 'The request is invalid.',
    'details':
        {'code': 'MissingDriverProgram',
         'message': 'Could not find driver program in the request.',
         'details': [],
         'additionalInfo': []
        }
}
```

```text
Could not find driver program in the request
```

There are two ways to fix this error.

- (Recommended) You can find the container image URI on your custom environment detail page and set it as the flow base image in the flow.dag.yaml file. When you deploy the flow in the UI, just select **Use environment of current flow definition**, and the backend service creates the customized environment based on this base image and `requirements.txt` for your deployment. Learn more about [the environment specified in the flow definition](./flow-deploy.md#requirements-text-file).

- You can fix this error by adding `inference_config` to your custom environment definition.

The following is an example of a customized environment definition.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: pf-customized-test
build:
  path: ./image_build
  dockerfile_path: Dockerfile
description: promptflow customized runtime
inference_config:
  liveness_route:
    port: 8080
    path: /health
  readiness_route:
    port: 8080
    path: /health
  scoring_route:
    port: 8080
    path: /score
```

### Model response taking too long

Sometimes, you might notice that the deployment is taking too long to respond. Several potential factors can cause this behavior:

- The model used in the flow isn't powerful enough (for example, use GPT 3.5 instead of text-ada).
- The index query isn't optimized and is taking too long.
- The flow has many steps to process.

Consider optimizing the endpoint with these factors in mind to improve the performance of the model.

### Unable to fetch deployment schema

After you deploy the endpoint and want to test it in the **Test** tab on the deployment detail page, if the **Test** tab shows **Unable to fetch deployment schema**, you can try the following two methods to mitigate this issue:

:::image type="content" source="../media/prompt-flow/unable-to-fetch-deployment-schema.png" alt-text="Screenshot of the error unable to fetch deployment schema in the Test tab on the deployment detail page." lightbox = "../media/prompt-flow/unable-to-fetch-deployment-schema.png":::

- Make sure you granted the correct permission to the endpoint identity. Learn more about [how to grant permission to the endpoint identity](./flow-deploy.md#grant-permissions-to-the-endpoint).
- It might be because you ran your flow in an old version of the runtime and then deployed the flow, so the deployment used the environment of that old runtime version as well. To update the runtime, follow [Update a runtime on the UI](./create-manage-compute-session.md#upgrade-compute-instance-runtime), rerun the flow in the latest runtime, and then deploy the flow again.

### Access denied to list workspace secret

If you encounter an error like "Access denied to list workspace secret", check whether you granted the correct permission to the endpoint identity. Learn more about [how to grant permission to the endpoint identity](./flow-deploy.md#grant-permissions-to-the-endpoint).