## Supported and unsupported SDK features for personal voice
+
+The following table outlines which SDK features are supported for Phoenix and Dragon models. For details on how to utilize these SDK features in your applications, refer to [Subscribe to synthesizer events](how-to-speech-synthesis.md#subscribe-to-synthesizer-events).
+
+|**SDK features**|**Description**|**Supported in Phoenix**|**Supported in Dragon**|
+|---|---|---|---|
+| Word boundary | Signals that a word boundary was received during synthesis, providing precise word timing during the speech synthesis process. | Yes | No |
+| Viseme events | Provides viseme (lips, jaw, and tongue movement) information during synthesis, allowing visual synchronization. | Yes | No |
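
The support table above could be applied in application code roughly as follows. This is a minimal sketch, not the SDK's own logic: the attribute names `synthesis_word_boundary` and `viseme_received` are assumed to match the event signals on the Speech SDK for Python synthesizer (check the linked article before relying on them), and `FakeSignal` / `FakeSynthesizer` are hypothetical stand-ins so the sketch runs without the SDK installed.

```python
from typing import Callable, List

# Support matrix copied from the table above (only the two rows shown there).
SUPPORTED_EVENTS = {
    "Phoenix": {"word_boundary": True, "viseme": True},
    "Dragon": {"word_boundary": False, "viseme": False},
}

# Assumption: event attribute names on the synthesizer object.
EVENT_ATTRIBUTES = {
    "word_boundary": "synthesis_word_boundary",
    "viseme": "viseme_received",
}


class FakeSignal:
    """Hypothetical stand-in for an SDK event signal exposing connect()."""

    def __init__(self) -> None:
        self.handlers = []

    def connect(self, handler: Callable) -> None:
        self.handlers.append(handler)


class FakeSynthesizer:
    """Hypothetical stand-in for a speech synthesizer object."""

    def __init__(self) -> None:
        self.synthesis_word_boundary = FakeSignal()
        self.viseme_received = FakeSignal()


def subscribe_supported_events(synthesizer, model: str, handler: Callable) -> List[str]:
    """Connect `handler` only to the events that `model` supports.

    Returns the feature names that were actually subscribed.
    """
    subscribed = []
    for feature, supported in SUPPORTED_EVENTS[model].items():
        if not supported:
            continue  # skip events the model type never raises
        signal = getattr(synthesizer, EVENT_ATTRIBUTES[feature])
        signal.connect(handler)
        subscribed.append(feature)
    return subscribed
```

With a Dragon model the function subscribes to nothing, so an application can fall back to behavior that doesn't depend on word timing or visemes.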
articles/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples.md
Lines changed: 23 additions & 1 deletion
@@ -113,7 +113,7 @@ Gesture video clips are optional, and customers who have the need to insert cert

 **Gesture tips:**
 - Each gesture clip should be within 10 seconds.
-- Gestures should start from status 0 and end with status 0; otherwise, the gestureclip can't be smoothly inserted into the avatar video.
+- Gestures should start from status 0 and end with status 0. It's essential that the character maintains the same position as in status 0, which is in the middle of the screen, throughout the gesture. Otherwise, the gesture clip can't be smoothly inserted into the avatar video.
 - The gesture clip only captures the body gestures; the actor doesn't have to speak while making gestures.
 - We recommend designing a list of gestures before recording; here are some examples of gesture video clips:
@@ -132,6 +132,28 @@ High-quality avatar models are built from high-quality video recordings, includi
 |---------|--------------|
 | - Ensure all video clips are taken in the same conditions.</br>- During the recording process, design the size and display area of the character you need so that the character can be displayed on the screen appropriately.</br> - Actor should be steady during the recording. </br> - Mind facial expressions, which should be suitable for the avatar's use case. For example, look positive and smile if the custom text to speech avatar is used as customer service. Look professional if the avatar is used for news reporting.</br> - Maintain eye gaze towards the camera, even when using a teleprompter.</br> - Return your body to status 0 when pausing speaking.</br> - Speak on a self-chosen topic; minor speech mistakes like a missed or mispronounced word are acceptable. If the actor misses a word or mispronounces something, just go back to status 0, pause for 3 seconds, and then continue speaking.</br> - Consciously pause between sentences and paragraphs. When pausing, go back to the status 0 and close your lips. </br> - The audio should be clear and loud enough; bad audio quality impacts training result.</br> - Keep the shooting environment quiet. | - Don't adjust the camera parameters, focal length, position, angle of view. Don't move the camera; keep the person's position, size, angle, consistent in the camera.</br> - Characters that are too small might lead to a loss of image quality during post-processing. Characters that are too large might cause the screen to overflow during gestures and movements.</br> - Don't make too long gestures or too much movement for one gesture; for example, actor's hands are always making gestures and forget to go back to status 0.</br> - The actor's movements and gestures must not block the face.</br> - Avoid small movements of the actor like licking lips, touching hair, talking sideways, constant head shaking during speech, and not closing up after speaking.</br> - Avoid background noise; staff should avoid walking and talking during video recording.</br> - Avoid recording other people's voices while the actor is speaking. |

+### How to prepare an interaction video clip
+
+Creating a high-quality interaction video clip is essential if you're building a real-time conversation with a custom avatar. The clip should follow a question-and-answer format, in which a photographer asks a question and the actor responds. Repeat the question-answer pair until the conversation is complete. If you're filming alone, imagine someone else asking the questions during the asking phase.
+
+Here are some tips for each phase:
+
+**Asking phase:**
+- Maintain status 0 and don't speak, but stay relaxed.
+- Even while in status 0, don't stay completely still. Act as if you're waiting.
+- Maintain a smile as if listening or waiting patiently.
+- Avoid nodding frequently.
+- Length: Each asking slot should last around 3–5 seconds.
+
+**Answering phase:**
+- Speak naturally with natural hand gestures from time to time.
+- Use natural and common gestures when speaking. Avoid meaningful gestures like pointing, applause, or thumbs up.
+- Begin gestures after starting to speak, and stop them before you finish.
+- Length: Each answering slot should last around 5 seconds.
+
+**Total video length:**
+- Aim for a total video length of 1–5 minutes.
+
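
The timing guidance above can be checked against a shot plan before recording. The following is a rough sketch under stated assumptions: the `Slot` type, the `check_plan` helper, and the 2-second tolerance used to interpret "around 5 seconds" are all hypothetical and not part of any Azure tooling.

```python
from dataclasses import dataclass
from typing import List

ASKING_RANGE_S = (3.0, 5.0)    # each asking slot: around 3-5 seconds
ANSWERING_TARGET_S = 5.0       # each answering slot: around 5 seconds
ANSWERING_TOLERANCE_S = 2.0    # assumption: how loosely "around" is read
TOTAL_RANGE_S = (60.0, 300.0)  # total video length: 1-5 minutes


@dataclass
class Slot:
    phase: str      # "asking" or "answering"
    seconds: float  # planned duration of the slot


def check_plan(slots: List[Slot]) -> List[str]:
    """Return a list of human-readable problems; empty means the plan fits."""
    problems = []
    for index, slot in enumerate(slots):
        # Slots are expected to alternate, starting with an asking slot.
        expected = "asking" if index % 2 == 0 else "answering"
        if slot.phase != expected:
            problems.append(f"slot {index}: expected {expected}, got {slot.phase}")
        if slot.phase == "asking":
            low, high = ASKING_RANGE_S
            if not low <= slot.seconds <= high:
                problems.append(f"slot {index}: asking slot of {slot.seconds}s is outside 3-5s")
        elif slot.phase == "answering":
            if abs(slot.seconds - ANSWERING_TARGET_S) > ANSWERING_TOLERANCE_S:
                problems.append(f"slot {index}: answering slot of {slot.seconds}s is not around 5s")
    total = sum(slot.seconds for slot in slots)
    if not TOTAL_RANGE_S[0] <= total <= TOTAL_RANGE_S[1]:
        problems.append(f"total length of {total:.0f}s is outside 1-5 minutes")
    return problems
```

For example, ten question-answer pairs of 4 and 5 seconds give a 90-second clip, which satisfies every check.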
 ## Data requirements

 Doing some basic processing of your video data is helpful for model training efficiency, such as:
articles/ai-services/translator/document-translation/faq.md
Lines changed: 11 additions & 5 deletions
@@ -22,25 +22,31 @@ ms.author: lajanuar

 If the language of the content in the source document is known, we recommend that you specify the source language in the request to get a better translation. If the document has content in multiple languages or the language is unknown, then don't specify the source language in the request. Document Translation automatically identifies language for each text segment and translates.
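
The recommendation above can be sketched as a request-body builder that includes the source language only when it's known. This is an illustration, not the service's client library: the field names (`inputs`, `sourceUrl`, `targetUrl`, `language`) are recalled from the batch translation REST request and should be verified against the current API reference, and the `build_batch_request` helper and container URLs are hypothetical.

```python
import json
from typing import Optional


def build_batch_request(
    source_url: str,
    target_url: str,
    target_language: str,
    source_language: Optional[str] = None,
) -> dict:
    """Build a batch request body, omitting the source language when unknown."""
    source = {"sourceUrl": source_url}
    if source_language:
        # Known language: specify it to get a better translation.
        source["language"] = source_language
    # Unknown or mixed languages: leave "language" out so the service detects it.
    return {
        "inputs": [
            {
                "source": source,
                "targets": [{"targetUrl": target_url, "language": target_language}],
            }
        ]
    }


if __name__ == "__main__":
    body = build_batch_request(
        "https://example.blob.core.windows.net/source",  # hypothetical container
        "https://example.blob.core.windows.net/target",  # hypothetical container
        target_language="fr",
    )
    print(json.dumps(body, indent=2))
```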

-#### To what extent are the layout, structure, and formatting maintained?
+#### To what extent are document layout, structure, formatting, and font style retained?

-When text is translated from the source to target language, the overall length of translated text can differ from source. The result could be reflow of text across pages. The same fonts aren't always available in both source and target language. In general, the same font style is applied in target language to retain formatting closer to source.
+* PDF documents generated from digital file formats (also known as "native" PDFs) provide optimal output.
+
+* Printed documents scanned into an electronic format (scanned PDF files) can result in loss of the original formatting, layout, and style.
+
+* The translation of text from one language to another can alter its length. This variation can impact the layout, causing the text to reflow or shift across different pages.
+
+* Various factors influence the preservation and retention of font style. For instance, some fonts aren't available in both the source and target languages. Typically, the same font style, or an optimally suited alternative, is applied to the target language to maintain formatting that most closely resembles the original source text.

 #### Will the text in an image within a document get translated?

-​No. The text in an image within a document isn't translated.
+No. The text in an image within a document isn't translated.

 #### Can Document Translation translate content from scanned documents?

 Yes. Document Translation translates content from _scanned PDF_ documents.

 #### Can encrypted or password-protected documents be translated?

-​No. The service can't translate encrypted or password-protected documents. If your scanned or text-embedded PDFs are password-locked, you must remove the lock before submission.
+No. The service can't translate encrypted or password-protected documents. If your scanned or text-embedded PDFs are password-locked, you must remove the lock before submission.

 #### If I'm using managed identities, do I also need a SAS token URL?

-​No. Don't include SAS token-appended URLs. Managed identities eliminate the need for you to include shared access signature tokens (SAS) with your HTTP requests.
+No. Don't include SAS token-appended URLs. Managed identities eliminate the need for you to include shared access signature tokens (SAS) with your HTTP requests.
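
When switching to managed identities, a simple pre-flight check can catch URLs that still carry a SAS token. The sketch below assumes that the presence of a `sig` (signature) query parameter is enough to identify a SAS token; `has_sas_token` is a hypothetical helper and the storage URLs are placeholders.

```python
from urllib.parse import parse_qs, urlsplit


def has_sas_token(url: str) -> bool:
    """Return True if the URL's query string carries a SAS signature (`sig`)."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return "sig" in query


def assert_no_sas(urls):
    """Raise ValueError listing any URLs that still include a SAS token."""
    offending = [url for url in urls if has_sas_token(url)]
    if offending:
        raise ValueError(f"Remove SAS tokens from: {offending}")
```

Running `assert_no_sas` over the source and target container URLs before submitting a request keeps leftover tokens out of requests that authenticate with a managed identity.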
articles/search/retrieval-augmented-generation-overview.md
Lines changed: 3 additions & 1 deletion
@@ -30,7 +30,7 @@ The decision about which information retrieval system to use is critical because
 Azure AI Search is a [proven solution for information retrieval](/azure/developer/python/get-started-app-chat-template?tabs=github-codespaces) in a RAG architecture. It provides indexing and query capabilities, with the infrastructure and security of the Azure cloud. Through code and other components, you can design a comprehensive RAG solution that includes all of the elements for generative AI over your proprietary content.
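
The retrieve-then-generate flow that a RAG architecture follows can be illustrated with a toy example. This sketch deliberately doesn't call Azure AI Search or any chat model: `retrieve` is a hypothetical in-memory keyword matcher standing in for a search index query, and `build_prompt` only assembles the grounding text that would be sent to a model.

```python
from typing import List


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by how many query words they share (toy retriever)."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in documents]
    # Stable sort keeps the original order among documents with equal scores.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query: str, passages: List[str]) -> str:
    """Combine the retrieved passages and the question into a grounded prompt."""
    sources = "\n".join(f"- {passage}" for passage in passages)
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "Hotels in Seattle offer free parking",
        "The cafeteria opens at nine",
        "Parking permits renew yearly",
    ]
    question = "free parking in seattle"
    print(build_prompt(question, retrieve(question, docs)))
```

In a real solution, the retriever would be replaced by a query against a search index, and the prompt would be passed to a chat model; the quality of the retrieved passages bounds the quality of the answer.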

 > [!NOTE]
-> New to copilot and RAG concepts? Watch [Vector search and state of the art retrieval for Generative AI apps](https://ignite.microsoft.com/sessions/18618ca9-0e4d-4f9d-9a28-0bc3ef5cf54e?source=sessions).
+> New to copilot and RAG concepts? Watch [Vector search and state of the art retrieval for Generative AI apps](https://www.youtube.com/watch?v=lSzc1MJktAo).

 ## Approaches for RAG with Azure AI Search
@@ -222,6 +222,8 @@ A RAG solution that includes Azure AI Search can leverage [built-in data chunkin

 + [Try this RAG quickstart](search-get-started-rag.md) for a demonstration of query integration with chat models over a search index.

++ [Tutorial: How to build a RAG solution in Azure AI Search](tutorial-rag-build-solution.md) for focused coverage on the features and pattern for RAG solutions that obtain grounding data from a search index.
+
 + Start with solution accelerators:

 + ["Chat with your data" solution accelerator](https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator) helps you create a custom RAG solution over your content.