articles/ai-services/translator/text-translation/preview/overview.md (4 additions, 3 deletions)
```diff
@@ -63,14 +63,15 @@ To force the request to be handled within a specific geography, use the desired
 Customers with a resource located in Switzerland North or Switzerland West can ensure that their Text API requests are served within Switzerland: create the Translator resource with the `Resource region` set to `Switzerland North` or `Switzerland West`, then use the resource's custom endpoint in your API requests.
 
-> [!NOTE]
-> When you deploy a large language model (LLM), The configuration options you select (global, data zone, or regional) determine where your data is processed.
+### LLM processing
+
+When you deploy a large language model (LLM), the configuration option you choose (global, data zone, or regional) determines where your data is processed, and so defines the geographic boundaries for how and where the model handles your information.
 
 #### Service limits
 
 | Operation | Maximum Number of Array Elements | Maximum Size of Array Element | Generative AI LLM: Maximum Number of Array Elements | Generative AI LLM: Maximum Size of Array Element |
 | --- | --- | --- | --- | --- |
-| Translate | 1,000 | 50,000 | 50 |105,000 |
+| Translate | 1,000 | 50,000 | 50 | 5,000 |
 
 When you use generative AI large language models, the compute capacity you allocate during model deployment affects translation latency.
```
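As context for the Switzerland paragraph above, here's a minimal sketch of what "use the resource's custom endpoint in your API requests" can look like. It assumes the GA v3.0 REST path and a hypothetical resource name (`my-swiss-translator`); the preview API this article documents may expect a different `api-version`.

```python
import requests

# Hypothetical custom endpoint for a resource created in Switzerland North;
# substitute your own resource name and key.
ENDPOINT = "https://my-swiss-translator.cognitiveservices.azure.com/translator/text/v3.0/translate"

params = {"api-version": "3.0", "from": "en", "to": "de"}
headers = {
    "Ocp-Apim-Subscription-Key": "<your-key>",
    "Ocp-Apim-Subscription-Region": "switzerlandnorth",
    "Content-Type": "application/json",
}

# The request body is a JSON array of text elements.
body = [{"Text": "Where is the nearest train station?"}]

response = requests.post(ENDPOINT, params=params, headers=headers, json=body)
response.raise_for_status()
print(response.json())
```

Because the call goes to the resource's custom endpoint rather than the global endpoint, the request is served within the Switzerland geography, per the paragraph above.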
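And a hedged illustration of the service-limits table: a small helper (hypothetical, not part of any service SDK) that batches input so a standard Translate request stays within the documented caps. As the table reads, the size cap (presumably characters) applies per array element; the generative AI LLM path would use 50 elements of up to 5,000 each.

```python
def batch_texts(texts, max_elements=1000, max_chars=50000):
    """Yield request-sized batches within the standard Translate limits.

    Pass max_elements=50, max_chars=5000 for the generative AI LLM path,
    per the service-limits table above.
    """
    for text in texts:
        if len(text) > max_chars:
            raise ValueError(f"element exceeds {max_chars} characters")
    for i in range(0, len(texts), max_elements):
        yield texts[i:i + max_elements]
```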