Commit de9b23e

fpagnybene2k1 and Benedikt Rollik authored
feat(genapi): update faq explaining how multi-turn conversation works… (#4820)
* feat(genapi): update faq explaining how multi-turn conversation works with llm

* Apply suggestions from code review

---------

Co-authored-by: Benedikt Rollik <[email protected]>
1 parent 3d7f5e4 commit de9b23e

File tree

1 file changed: +39 -0 lines changed


pages/generative-apis/troubleshooting/fixing-common-issues.mdx

Lines changed: 39 additions & 0 deletions
@@ -141,6 +141,45 @@ The embedding model you are using generates vector representations with a fixed
```
- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy, for instance, the `sentence-t5-xxl` model, which represents vectors with `768` dimensions.

## Previous messages are not taken into account by the model

### Causes
- Previous messages are not sent to the model
- The content sent exceeds the maximum context window for this model

### Solution
- LLMs are completely "stateless" and thus do not store previous messages or conversations. When building a chatbot application, for example, each time the user sends a new message, all preceding messages in the conversation need to be sent through the API payload. Example payload for a multi-turn conversation:
```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key=os.getenv("SCW_SECRET_KEY")
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    # Send the full conversation history with every request.
    messages=[
        {
            "role": "user",
            "content": "What is the solution to 1+1= ?"
        },
        {
            "role": "assistant",
            "content": "2"
        },
        {
            "role": "user",
            "content": "Double this number"
        }
    ]
)

print(response.choices[0].message.content)
```
This snippet outputs the model's response, which is `4`.
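In an application, the history grows turn by turn rather than being hardcoded. As a minimal sketch of this pattern (same endpoint and model as above; the loop structure here is illustrative, not a required API shape), each call resends the accumulated `messages` list:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key=os.getenv("SCW_SECRET_KEY")
)

# Start the history with the first user message.
messages = [{"role": "user", "content": "What is the solution to 1+1= ?"}]

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=messages
)

# Append the assistant's reply, then the next user turn,
# so the model sees the whole conversation on the next call.
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "Double this number"})

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=messages
)
print(response.choices[0].message.content)  # Expected output: 4
```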
- When you exceed the maximum context window, you should receive a `400 - BadRequestError` detailing the context length value you exceeded. In this case, reduce the size of the content you send to the API, for example by trimming the oldest messages, as in the sketch below.
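If your application can hit this limit, one option is to catch the error and shrink the history before retrying. A minimal sketch, assuming the `openai` Python client (v1+), which raises `openai.BadRequestError` on HTTP 400 responses; dropping the oldest turns is one illustrative strategy among others (the function name is hypothetical):

```python
import openai

def chat_with_trimming(client, model, messages, max_retries=3):
    """Retry a chat completion, dropping the oldest turns on a 400 error."""
    for _ in range(max_retries):
        try:
            response = client.chat.completions.create(model=model, messages=messages)
            return response.choices[0].message.content
        except openai.BadRequestError:
            # Assumes the 400 was caused by exceeding the context window;
            # in production, inspect the error message before trimming.
            if len(messages) <= 2:
                raise
            messages = messages[2:]  # Drop the oldest user/assistant pair.
    raise RuntimeError("Conversation does not fit the model's context window")
```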
## Best practices for optimizing model performance

### Input size management
