Commit de257aa

feat(genapi): update troubleshooting
1 parent d2b9b1b commit de257aa


pages/generative-apis/troubleshooting/fixing-common-issues.mdx

Lines changed: 10 additions & 0 deletions
@@ -111,15 +111,25 @@ Below are common issues that you may encounter when using Generative APIs, their
- The model goes into an infinite loop while processing the input (which is a known structural issue with several AI models)

### Solution
+For queries that are too long to process:
- Set a stricter **maximum token limit** to prevent overly long responses.
- Reduce the size of the input tokens, or split the input into multiple API requests.
- Use [Managed Inference](/managed-inference/), where no query timeout is enforced.

+For queries where the model enters an infinite loop (more frequent when using **structured output**):
+- Set `temperature` to the default value recommended for the model. These values can be found in the [Generative APIs Playground](https://console.scaleway.com/generative-api/models/fr-par/playground) when selecting the model. Avoid using a temperature of `0`, as this can lock the model into outputting the same most probable token over and over.
+- Ensure the `top_p` parameter is not set too low (the recommended value is the default, `1`).
+- Add a `presence_penalty` value to your request (`0.5` is a good starting value). This option helps the model choose tokens other than the one it is looping on, although it may impact accuracy on tasks that require repeating multiple similar outputs.
+- Use more recent models, which are usually better optimized to avoid loops, especially when using structured output.
+- Optimize your system prompt to provide clearer and simpler tasks. Currently, JSON output accuracy still relies on heuristics that constrain models to output only valid JSON tokens, and therefore depends on the prompts given. As a counter-example, providing contradictory requirements to a model - such as `Never output JSON` in the system prompt and `response_format` set to `json_schema` in the query - may lead the model to never output the closing JSON bracket `}`.
+
## Structured output (e.g., JSON) is not working correctly

### Description
- Structured output response contains invalid JSON
- Structured output response is valid JSON but content is less relevant
+- Structured output response never ends (loops over characters such as `"`, `\t`, or `\n`).
+- For this issue, see the infinite loop solutions in [504 Gateway Timeout](#504-gateway-timeout)

### Causes
- Incorrect field naming in the request, such as using `"format"` instead of the correct `"response_format"` field.
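
To make the guidance added in this commit easier to apply, here is a minimal sketch of a request that combines the parameters mentioned in the new lines (`max_tokens`, `temperature`, `top_p`, `presence_penalty`, and `response_format`). It assumes the OpenAI-compatible Python client pointed at the Scaleway Generative APIs endpoint; the endpoint URL, model name, and exact values are illustrative assumptions, not part of the commit.

```python
# Minimal sketch, not part of the commit: the endpoint URL, model name, and
# parameter values below are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",  # assumed OpenAI-compatible Generative APIs endpoint
    api_key="SCW_SECRET_KEY",                # replace with your IAM secret key
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",           # hypothetical model name; pick one from the Playground
    messages=[
        {"role": "system", "content": "Extract the requested fields and answer in JSON."},
        {"role": "user", "content": "Order #42: 3 items, shipped to Paris."},
    ],
    max_tokens=512,          # stricter limit so long generations cannot run into the gateway timeout
    temperature=0.7,         # the model's default from the Playground; avoid 0, which encourages token loops
    top_p=1,                 # keep the default; very low values make loops more likely
    presence_penalty=0.5,    # nudges the model away from the token it keeps repeating
    response_format={"type": "json_object"},  # structured output; keep the system prompt consistent with it
)

print(response.choices[0].message.content)
```

If the output still loops or times out, splitting the input into several smaller requests or moving to Managed Inference, as noted in the diff above, remains the fallback.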
