
Commit f78f03c

fixing typo
1 parent f3491d9 commit f78f03c


1 file changed: +1 −1 lines changed


articles/ai-foundry/openai/how-to/latency.md

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ At the time of the request, the requested generation size (`max_tokens` paramete
 In summary, reducing the number of tokens generated per request reduces the latency of each request.
 
 > [!NOTE]
-> `max_tokens` only changes the length of a response and in some cases might truncate it. The paramter doesn't change the quality of the response.
+> `max_tokens` only changes the length of a response and in some cases might truncate it. The parameter doesn't change the quality of the response.
 
 ### Streaming
 Setting `stream: true` in a request makes the service return tokens as soon as they're available, instead of waiting for the full sequence of tokens to be generated. It doesn't change the time to get all the tokens, but it reduces the time for first response. This approach provides a better user experience since end-users can read the response as it is generated.
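For context, here is a minimal sketch of the two latency levers the diff touches, capping `max_tokens` and enabling streaming, using the `openai` Python package's `AzureOpenAI` client. The endpoint and key environment variables, the API version, and the deployment name `gpt-4o-mini` are placeholder assumptions, not values from this commit.

```python
# Sketch: cap response length with `max_tokens` and stream tokens as they
# arrive. Endpoint, key, API version, and deployment name are assumptions.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# `max_tokens` only caps the length of the response (and may truncate it);
# it doesn't change response quality. `stream=True` returns tokens as soon
# as they're available, reducing the time to first response.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # your deployment name (placeholder)
    messages=[{"role": "user", "content": "Summarize latency best practices."}],
    max_tokens=100,
    stream=True,
)

for chunk in response:
    # Some streamed chunks can arrive with empty `choices`, so guard first.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

Streaming doesn't shorten total generation time; it only moves the first visible output earlier, which is why the loop prints each delta as it arrives.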
