For ONVIF TTS audio proposal, to support device with TTS function #694
Peggy0422 wants to merge 19 commits into development
1. Added the AddTTSAudioClip request and AddTTSAudioClip response for sending a text and its TTS configuration to the device (1621-1652) (2036-2041) (2418-2422) (2935-2943).
2. Added the complex type "TTS Audio" (1465-1485) for TTSConfiguration to support the TTS function. It includes the parameters Content, Language, and VoiceType.
3. Updated AudioClipCapabilities with TTSCapabilities (177-181), and added the complex type TTSCapabilities (201-220) to indicate that the device supports the TTS function and its corresponding configuration. The complex type TTSCapabilities includes MaxContentLength, TTSLanguage, and TTSVoiceType.
4. Added the simple types TTSLanguage (220-231) and TTSVoiceType (232-238).
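To make the shape of the proposed request concrete, here is a minimal Python sketch that assembles an AddTTSAudioClip body. It is only an illustration under assumptions: the nesting of Content, Language, and VoiceType under a TTSConfiguration element, the `tr2` namespace URI, and the sample voice type value are not confirmed by this PR's schema, and the audio clip configuration part of the request is omitted.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the tr2 prefix; verify against media.wsdl.
TR2_NS = "http://www.onvif.org/ver20/media/wsdl"


def build_add_tts_audio_clip(content: str, language: str, voice_type: str) -> ET.Element:
    """Build a hypothetical AddTTSAudioClip request element.

    The element layout is an assumption based on the PR description,
    not a copy of the proposed schema.
    """
    request = ET.Element(f"{{{TR2_NS}}}AddTTSAudioClip")
    config = ET.SubElement(request, f"{{{TR2_NS}}}TTSConfiguration")
    for name, value in (
        ("Content", content),
        ("Language", language),
        ("VoiceType", voice_type),
    ):
        ET.SubElement(config, f"{{{TR2_NS}}}{name}").text = value
    return request


if __name__ == "__main__":
    # "Female" is a hypothetical voice type value used only for illustration.
    element = build_add_tts_audio_clip("Door is open", "en-US", "Female")
    print(ET.tostring(element, encoding="unicode"))
```

On success, the device would answer with a Token of type tt:ReferenceToken that identifies the generated clip.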
1. Added detailed descriptions for the AddTTSAudioClip operation, explaining its purpose, parameters, and responses (2359-2416).
2. Updated the audio clip capabilities with TTSCapabilities (2698-2700).
update code line information for TTS function
correct some editorial errors
Updated the description of the AddTTSAudioClip operation to clarify the parameters and response. Updated the description of TTSCapabilities.
TTS audio clip pull request was first created as #668
Updated TTS configuration description and added TTSCapabilities entry.
OLD PR for reference
doc/Media2.xml
Outdated
</varlistentry>
</variablelist>
<para></para>
<para><emphasis role="bold">Note:</emphasis> Audio clip uploads to the device can fail in the following scenarios, and a specific HTTP error code should be returned to the client when an upload fails.</para>
This note does not seem applicable to TTSAudioClip.
Yes, it is not for TTS. I will delete it.
delete inappropriate note for OPTION AddTTSAudioClip
johado left a comment
Some small textual comments.
doc/Media2.xml
Outdated
<title>AddTTSAudioClip</title>
<para>This operation adds a text, audio clip configuration and TTS configuration to the device, for device converting the text to an audio clip based on the TTS configuration.
The response to the command includes a unique token for this converted audio clip.
If the device is unable to support language specified in the TTS configuration, the associated configuration will deleted from the device.</para>
Add "be": "will be deleted".
doc/Media2.xml
Outdated
<term>response</term>
<listitem>
<para role="param">Token - [tt:ReferenceToken]</para>
<para role="text">Unique token of the TTS audio clip to be uploaded.</para>
Change "to be uploaded" to "that was added"?
Thank you very much for your advice. We are considering the word "assign", which should be more precise.
doc/Media2.xml
Outdated
</varlistentry>
<varlistentry>
<term>TTSCapabilities</term>
<listitem><para>Indicates device supports TTS function and TTS configuration.See tr2: TTSCapabilities.</para></listitem>
Add space after .: "..configuration. See tr2:..."
wsdl/ver20/media/wsdl/media.wsdl
Outdated
</xs:element>
<xs:element name="Language" type="xs:string">
<xs:annotation>
<xs:documentation>Language for the TTS audio clip playback. See tr2: TTSLanguage. </xs:documentation>
Change to "See tr2:TTSLanguage and TTSCapabilities."?
Thank you for your suggestion. TTSLanguage is already an attribute within TTSCapabilities. If we want to point out that the language for TTS audio clip playback must be one of the languages supported by the device, we could revise the explanation to state this clearly, for example: "The language which is supported and used for TTS audio clip playback."
wsdl/ver20/media/wsdl/media.wsdl
Outdated
</xs:element>
<xs:element name="VoiceType" type="xs:string">
<xs:annotation>
<xs:documentation>The voice type for the TTS audio clip playback. See tr2: TTSVoiceType.</xs:documentation>
Change to "See tr2:TTSVoiceType and TTSCapabilities."?
I propose updating the explanation for TTSVoiceType in the same way as the commit for TTSLanguage.
wsdl/ver20/media/wsdl/media.wsdl
Outdated
<xs:sequence>
<xs:element name="Token" type="tt:ReferenceToken">
<xs:annotation>
<xs:documentation>Unique token of the TTS audio clip to be uploaded.</xs:documentation>
Change "to be uploaded" to something more relevant: converted, generated, ...?
Thank you very much for bringing it up. Yes, we are considering changing it to use the word "assign", which should be more precise.
wsdl/ver20/media/wsdl/media.wsdl
Outdated
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
<!--===============TTS Language================-->
<xs:simpleType name="TTSLanguage">
What is the reasoning behind the choice of languages in the list below?
Is there any standard for official language names that can be referred to?
TTSCapabilities and TTSAudio use open strings, so the enum should provide a good pattern.
Thank you so much for your comments! We truly appreciate your input and have been carefully considering how best to define these general concepts. Your mention of ISO international standards was particularly helpful and guided our further research. We also looked into RFC 5646 for language representation across countries. We would therefore like to use alpha-2 codes to represent languages and countries, as defined in ISO 639-1 (languages) and ISO 3166-1 (countries). For languages with regional variations, we plan to adopt the language-country format (e.g., en-US, zh-CN). Thank you again for your feedback.
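As a rough illustration of the format described in this reply, the following Python sketch checks that a string has the shape of a two-letter language code, optionally followed by a hyphen and a two-letter country code. The function name and the strict casing rule are assumptions made for the example. It checks shape only and does not verify that a code is actually assigned in ISO 639-1 or ISO 3166-1.

```python
import re

# Two lowercase letters (language), optionally "-" plus two uppercase letters (country).
# The casing rule is an assumption; RFC 5646 itself treats tags as case-insensitive.
LANGUAGE_TAG = re.compile(r"[a-z]{2}(?:-[A-Z]{2})?")


def is_valid_tts_language(tag: str) -> bool:
    """Return True if tag looks like 'en' or 'en-US' (shape check only)."""
    return LANGUAGE_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ("en-US", "zh-CN", "en", "english", "en_US"):
        print(candidate, is_valid_tts_language(candidate))
```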
doc/Media2.xml
Outdated
</itemizedlist>
</section>
</section>
<section xml:id="section_wvd_dzg_rye">
The id should be unique in XML, right? This seems to be a copy of the SetAudioClip section below.
Yes, thank you for the suggestion. I have revised it accordingly.
update description for TTSLanguage and TTSVoiceType
Update documentation for Token element for AddTTSAudioClip response
Updated TTSLanguage type to include ISO language and country codes with documentation.
wsdl/ver20/media/wsdl/media.wsdl
Outdated
See <a href="https://www.iso.org/obp/ui/">ISO Country Codes</a>.
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
Do we really need to make an explicit restriction here and not just define it as a string? If we go this way, whenever we need to add a language we need to update the WSDL file.
Thank you very much for your comment! Yes, this is an important issue we should consider.
Previously, we defined languages as strings and listed commonly used or potentially needed languages. However, this approach introduces a maintenance burden: as you pointed out, each new language would require updating the WSDL file. To address this, we now reference ISO-standard language codes directly via strings. Users may refer to the official ISO codes for specific needs, while the WSDL only defines the reference rules. The examples in TTSLanguage are provided for convenience. I hope this clarifies the approach. Thank you again for your comment!
Added note about enumeration values being illustrative in TTSLanguage.
Revise the description of language definition in TTSCapabilities and TTSAudio
To support audio products with a TTS function, several changes are needed:
Added TTSCapabilities (optional): indicates whether the device is capable of the TTS function, and describes its corresponding TTS configuration. To do this, the complex type "TTSCapabilities" is added to the existing complex type "AudioClipCapabilities".
Parameter:
Parameter:
Response:
media2.wsdl
media2.xml and documentation
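The TTSCapabilities entry described above suggests a simple client-side check before calling AddTTSAudioClip. The Python sketch below models the three advertised fields and reports why a request would not fit them. The class, field, and function names are hypothetical, the sample voice types are invented for illustration, and treating MaxContentLength as a character count is an assumption, since the unit is not stated here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TTSCapabilities:
    """Hypothetical client-side mirror of the proposed TTSCapabilities type."""
    max_content_length: int          # assumed to be a character count
    languages: Tuple[str, ...]       # e.g. ("en-US", "zh-CN")
    voice_types: Tuple[str, ...]     # values are device specific


def check_tts_request(caps: TTSCapabilities, content: str,
                      language: str, voice_type: str) -> List[str]:
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if len(content) > caps.max_content_length:
        problems.append("content exceeds MaxContentLength")
    if language not in caps.languages:
        problems.append(f"language {language!r} not advertised by device")
    if voice_type not in caps.voice_types:
        problems.append(f"voice type {voice_type!r} not advertised by device")
    return problems


if __name__ == "__main__":
    caps = TTSCapabilities(20, ("en-US", "zh-CN"), ("Male", "Female"))
    print(check_tts_request(caps, "Door is open", "en-US", "Female"))
    print(check_tts_request(caps, "Door is open", "fr-FR", "Female"))
```

Because the language and voice type are open strings in the schema, a check like this relies on the values the device reports rather than on a fixed enumeration.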