`articles/ai-services/speech-service/speech-services-quotas-and-limits.md` (+18 −1)

The limits in this table apply per Speech resource when you create a personal voice.
| REST API limit (not including speech synthesis) | Not available for F0 | 50 requests per 10 seconds |
| Max number of transactions per second (TPS) for speech synthesis | Not available for F0 | 200 transactions per second (TPS) (default value) |

#### Batch text to speech avatar

| Quota | Free (F0) | Standard (S0) |
|-----|-----|-----|
| REST API limit | Not available for F0 | 2 requests per 1 minute |
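Clients can stay under the batch avatar REST limit (2 requests per 1 minute) by throttling on their side. A minimal sliding-window throttle sketch follows; the class and method names are illustrative and not part of the Speech SDK:

```javascript
// Illustrative client-side throttle for a "2 requests per 1 minute" limit.
// Not part of the Speech SDK; names are made up for this sketch.
class SlidingWindowLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.timestamps = []; // send times still inside the current window
  }

  // Returns true if a request may be sent at time nowMs, and records it.
  tryAcquire(nowMs) {
    // Drop timestamps that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => nowMs - t < this.windowMs);
    if (this.timestamps.length >= this.maxRequests) {
      return false; // over the limit: caller should wait and retry
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

A caller would construct `new SlidingWindowLimiter(2, 60000)` and only issue a batch avatar request when `tryAcquire(Date.now())` returns true.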
#### Real-time text to speech avatar

| Quota | Free (F0) | Standard (S0) |
|-----|-----|-----|
| New connections per minute | Not available for F0 | 2 new connections per minute |

#### Audio Content Creation tool
Initiate the increase of the limit for concurrent requests for your resource, or...

- Any other required information.

1. On the **Review + create** tab, select **Create**.
1. Note the support request number in Azure portal notifications. You're contacted shortly about your request.

### Text to speech avatar: increase new connections limit

To increase the limit of new connections per minute for text to speech avatar, contact your sales representative to create a [ticket](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/overview) with the following information:
`articles/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples.md` (+2 −2)
## Data requirements

Doing some basic processing of your video data is helpful for model training efficiency, such as:

- Make sure that the character is in the middle of the screen, and that the size and position stay consistent during video processing. Each video processing parameter, such as brightness and contrast, remains the same and doesn't change. The output avatar's size, position, brightness, and contrast directly reflect those present in the training data; we don't apply any alterations during processing or model building.
- The start and end of the clip should be kept in state 0: the actors should close their mouths, smile, and look ahead. The video should be continuous, not abrupt.

**Avatar training video recording file format:** .mp4 or .mov.
`articles/ai-services/speech-service/text-to-speech-avatar/real-time-synthesis-avatar.md` (+13 −2)
Real-time avatar uses the WebRTC protocol to output the avatar video stream. You need to set up the connection with the avatar service through a WebRTC peer connection.

First, you need to create a WebRTC peer connection object. WebRTC is a P2P protocol, which relies on an ICE server for network relay. The Speech service provides a network relay function and exposes a REST API to issue the ICE server information. Therefore, we recommend that you fetch the ICE server information from the Speech service. You can also choose to use your own ICE server.
Here is a sample request to fetch ICE server information from the Speech service endpoint:

```HTTP
GET /cognitiveservices/avatar/relay/token/v1 HTTP/1.1
Host: westus2.tts.speech.microsoft.com
Ocp-Apim-Subscription-Key: YOUR_RESOURCE_KEY
```

The following code snippet shows how to create the WebRTC peer connection. The ICE server URL, ICE server username, and ICE server credential can all be fetched from the payload of the above HTTP request.
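As an illustration only (not the article's original snippet), a sketch of fetching the relay token and building the peer connection configuration might look like the following. The response field names (`Urls`, `Username`, `Password`) are assumptions, not confirmed by this article:

```javascript
// Illustrative sketch: build the relay-token request and map its JSON
// payload to an RTCPeerConnection configuration. The payload field names
// (Urls, Username, Password) are assumptions.
function relayTokenRequest(region, resourceKey) {
  return {
    url: `https://${region}.tts.speech.microsoft.com/cognitiveservices/avatar/relay/token/v1`,
    headers: { "Ocp-Apim-Subscription-Key": resourceKey },
  };
}

function toPeerConnectionConfig(payload) {
  return {
    iceServers: [
      {
        urls: payload.Urls,           // ICE server URL(s)
        username: payload.Username,   // ICE server username
        credential: payload.Password, // ICE server credential
      },
    ],
  };
}

// In a browser you would then do something like:
//   const req = relayTokenRequest("westus2", "YOUR_RESOURCE_KEY");
//   const payload = await (await fetch(req.url, { headers: req.headers })).json();
//   const peerConnection = new RTCPeerConnection(toPeerConnectionConfig(payload));
```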
The real-time API disconnects after 5 minutes of avatar idle state. Even if the avatar isn't idle and is functioning normally, the real-time API disconnects after a connection has lasted 10 minutes. To ensure continuous operation of the real-time avatar for more than 10 minutes, you can enable auto-reconnect. For how to set up auto-reconnect, refer to this [sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/browser/avatar/README.md) (search for "auto reconnect").
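A minimal client-side sketch of the reconnect decision follows; the timer values come from the limits above, and the helper name is illustrative rather than part of the SDK:

```javascript
// Illustrative reconnect check: the service drops any connection after
// 10 minutes (and idle connections after 5), so a client can reconnect
// proactively shortly before the hard 10-minute cap.
const MAX_CONNECTION_MS = 10 * 60 * 1000;
const RECONNECT_MARGIN_MS = 60 * 1000; // reconnect 1 minute early

function shouldReconnect(connectedAtMs, nowMs) {
  return nowMs - connectedAtMs >= MAX_CONNECTION_MS - RECONNECT_MARGIN_MS;
}

// A client could poll this periodically and tear down / re-establish
// the WebRTC session whenever it returns true.
```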
154
+
144
155
## Synthesize talking avatar video from text input
After the above steps, you should see the avatar video being played in the web browser. The avatar is active, with eye blinks and slight body movement, but it's not speaking yet. The avatar is waiting for text input to start speaking.