Commit de9b23e

fpagnybene2k1 and Benedikt Rollik authored
feat(genapi): update faq explaining how multi-turn conversation works… (#4820)
* feat(genapi): update faq explaining how multi-turn conversation works with llm

* Apply suggestions from code review

---------

Co-authored-by: Benedikt Rollik <[email protected]>
1 parent 3d7f5e4 commit de9b23e

File tree

1 file changed: +39 -0 lines changed


pages/generative-apis/troubleshooting/fixing-common-issues.mdx

Lines changed: 39 additions & 0 deletions
@@ -141,6 +141,45 @@ The embedding model you are using generates vector representations with a fixed
```
- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy, for instance, the `sentence-t5-xxl` model, which represents vectors with `768` dimensions.

## Previous messages are not taken into account by the model

### Causes
- Previous messages are not sent to the model
- The content sent exceeds the maximum context window for this model

### Solution
- LLMs are completely "stateless" and thus do not store previous messages or conversations. When building a chatbot application, for example, each time the user sends a new message, all preceding messages in the conversation need to be sent through the API payload. Example payload for a multi-turn conversation:
```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key=os.getenv("SCW_SECRET_KEY")
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    # Send the full conversation history with every request.
    messages=[
        {
            "role": "user",
            "content": "What is the solution to 1+1= ?"
        },
        {
            "role": "assistant",
            "content": "2"
        },
        {
            "role": "user",
            "content": "Double this number"
        }
    ]
)

print(response.choices[0].message.content)
```
This snippet outputs the model's response, which is `4`.
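In an application, the history grows turn by turn rather than being hardcoded. As a minimal sketch of this pattern (same endpoint and model as above; the loop structure here is illustrative, not a required API shape), each call resends the accumulated `messages` list:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key=os.getenv("SCW_SECRET_KEY")
)

# Start the history with the first user message.
messages = [{"role": "user", "content": "What is the solution to 1+1= ?"}]

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=messages
)

# Append the assistant's reply, then the next user turn,
# so the model sees the whole conversation on the next call.
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "Double this number"})

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=messages
)
print(response.choices[0].message.content)  # Expected output: 4
```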
- When you exceed the maximum context window, you should receive a `400 - BadRequestError` detailing the context length value you exceeded. In this case, reduce the size of the content you send to the API, for example by trimming the oldest messages, as in the sketch below.
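If your application can hit this limit, one option is to catch the error and shrink the history before retrying. A minimal sketch, assuming the `openai` Python client (v1+), which raises `openai.BadRequestError` on HTTP 400 responses; dropping the oldest turns is one illustrative strategy among others (the function name is hypothetical):

```python
import openai

def chat_with_trimming(client, model, messages, max_retries=3):
    """Retry a chat completion, dropping the oldest turns on a 400 error."""
    for _ in range(max_retries):
        try:
            response = client.chat.completions.create(model=model, messages=messages)
            return response.choices[0].message.content
        except openai.BadRequestError:
            # Assumes the 400 was caused by exceeding the context window;
            # in production, inspect the error message before trimming.
            if len(messages) <= 2:
                raise
            messages = messages[2:]  # Drop the oldest user/assistant pair.
    raise RuntimeError("Conversation does not fit the model's context window")
```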
## Best practices for optimizing model performance

### Input size management
