Commit a610c53

feat: [google-cloud-texttospeech] Support promptable voices M4A AudioEncoding; doc fixes. (googleapis#14292)
- [ ] Regenerate this pull request now.

BEGIN_COMMIT_OVERRIDE
feat: Support promptable voices by specifying a model name and a prompt
feat: Add enum value M4A to enum AudioEncoding
docs: A comment for enum value `AUDIO_ENCODING_UNSPECIFIED` in enum `AudioEncoding` is changed
feat: [google-cloud-texttospeech] Support promptable voices by specifying a model name and a prompt
docs: A comment for method `StreamingSynthesize` in service `TextToSpeech` is changed
docs: A comment for enum value `OGG_OPUS` in enum `AudioEncoding` is changed
docs: A comment for enum value `PCM` in enum `AudioEncoding` is changed
docs: A comment for field `low_latency_journey_synthesis` in message `.google.cloud.texttospeech.v1beta1.AdvancedVoiceOptions` is changed
docs: A comment for enum value `PHONETIC_ENCODING_IPA` in enum `PhoneticEncoding` is changed
docs: A comment for enum value `PHONETIC_ENCODING_X_SAMPA` in enum `PhoneticEncoding` is changed
docs: A comment for field `phrase` in message `.google.cloud.texttospeech.v1beta1.CustomPronunciationParams` is changed
docs: A comment for field `pronunciations` in message `.google.cloud.texttospeech.v1beta1.CustomPronunciations` is changed
docs: A comment for message `MultiSpeakerMarkup` is changed
docs: A comment for field `custom_pronunciations` in message `.google.cloud.texttospeech.v1beta1.SynthesisInput` is changed
docs: A comment for field `voice_clone` in message `.google.cloud.texttospeech.v1beta1.VoiceSelectionParams` is changed
docs: A comment for field `speaking_rate` in message `.google.cloud.texttospeech.v1beta1.AudioConfig` is changed
docs: A comment for field `audio_encoding` in message `.google.cloud.texttospeech.v1beta1.StreamingAudioConfig` is changed
docs: A comment for field `text` in message `.google.cloud.texttospeech.v1beta1.StreamingSynthesisInput` is changed
END_COMMIT_OVERRIDE

feat: Add enum value M4A to enum AudioEncoding
docs: A comment for enum value `AUDIO_ENCODING_UNSPECIFIED` in enum `AudioEncoding` is changed

PiperOrigin-RevId: 799573824
Source-Link: googleapis/googleapis@a92cee3
Source-Link: https://github.com/googleapis/googleapis-gen/commit/d182858feeaba9d167f784aaf301f395d1d288b4
Copy-Tag: eyJwIjoicGFja2FnZXMvZ29vZ2xlLWNsb3VkLXRleHR0b3NwZWVjaC8uT3dsQm90LnlhbWwiLCJoIjoiZDE4Mjg1OGZlZWFiYTlkMTY3Zjc4NGFhZjMwMWYzOTVkMWQyODhiNCJ9

BEGIN_NESTED_COMMIT
feat: [google-cloud-texttospeech] Support promptable voices by specifying a model name and a prompt
feat: Add enum value M4A to enum AudioEncoding
docs: A comment for method `StreamingSynthesize` in service `TextToSpeech` is changed
docs: A comment for enum value `AUDIO_ENCODING_UNSPECIFIED` in enum `AudioEncoding` is changed
docs: A comment for enum value `OGG_OPUS` in enum `AudioEncoding` is changed
docs: A comment for enum value `PCM` in enum `AudioEncoding` is changed
docs: A comment for field `low_latency_journey_synthesis` in message `.google.cloud.texttospeech.v1beta1.AdvancedVoiceOptions` is changed
docs: A comment for enum value `PHONETIC_ENCODING_IPA` in enum `PhoneticEncoding` is changed
docs: A comment for enum value `PHONETIC_ENCODING_X_SAMPA` in enum `PhoneticEncoding` is changed
docs: A comment for field `phrase` in message `.google.cloud.texttospeech.v1beta1.CustomPronunciationParams` is changed
docs: A comment for field `pronunciations` in message `.google.cloud.texttospeech.v1beta1.CustomPronunciations` is changed
docs: A comment for message `MultiSpeakerMarkup` is changed
docs: A comment for field `custom_pronunciations` in message `.google.cloud.texttospeech.v1beta1.SynthesisInput` is changed
docs: A comment for field `voice_clone` in message `.google.cloud.texttospeech.v1beta1.VoiceSelectionParams` is changed
docs: A comment for field `speaking_rate` in message `.google.cloud.texttospeech.v1beta1.AudioConfig` is changed
docs: A comment for field `audio_encoding` in message `.google.cloud.texttospeech.v1beta1.StreamingAudioConfig` is changed
docs: A comment for field `text` in message `.google.cloud.texttospeech.v1beta1.StreamingSynthesisInput` is changed

PiperOrigin-RevId: 799242210
Source-Link: googleapis/googleapis@b738e78
Source-Link: https://github.com/googleapis/googleapis-gen/commit/45f4f522fd302bbbfe6fe2f0d08a26f020e90e19
Copy-Tag: eyJwIjoicGFja2FnZXMvZ29vZ2xlLWNsb3VkLXRleHR0b3NwZWVjaC8uT3dsQm90LnlhbWwiLCJoIjoiNDVmNGY1MjJmZDMwMmJiYmZlNmZlMmYwZDA4YTI2ZjAyMGU5MGUxOSJ9
END_NESTED_COMMIT

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Co-authored-by: Victor Chudnovsky <vchudnov@google.com>
1 parent cfaafcc commit a610c53

25 files changed, +186 -63 lines changed

packages/google-cloud-texttospeech/google/cloud/texttospeech/gapic_version.py

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-__version__ = "2.27.0" # {x-release-please-version}
+__version__ = "0.0.0" # {x-release-please-version}

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1/gapic_version.py

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-__version__ = "2.27.0" # {x-release-please-version}
+__version__ = "0.0.0" # {x-release-please-version}

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1/services/text_to_speech/async_client.py

Lines changed: 1 addition & 1 deletion
@@ -438,7 +438,7 @@ async def sample_synthesize_speech():
 voice.language_code = "language_code_value"
 
 audio_config = texttospeech_v1.AudioConfig()
-audio_config.audio_encoding = "PCM"
+audio_config.audio_encoding = "M4A"
 
 request = texttospeech_v1.SynthesizeSpeechRequest(
     input=input,

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1/services/text_to_speech/client.py

Lines changed: 1 addition & 1 deletion
@@ -856,7 +856,7 @@ def sample_synthesize_speech():
 voice.language_code = "language_code_value"
 
 audio_config = texttospeech_v1.AudioConfig()
-audio_config.audio_encoding = "PCM"
+audio_config.audio_encoding = "M4A"
 
 request = texttospeech_v1.SynthesizeSpeechRequest(
     input=input,
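
For readers who call the service over REST rather than through the generated client, the sample change above roughly corresponds to setting `audioEncoding` to `"M4A"` in the request body. The following is a minimal sketch under stated assumptions: `build_synthesize_body` and `decode_audio_content` are hypothetical helpers (not part of this commit), the lowerCamelCase keys follow the standard proto3 JSON mapping, and no network call is made; the response is a stand-in.

```python
import base64
import json


def build_synthesize_body(text, language_code, audio_encoding="M4A"):
    """Build a JSON-serializable synthesize request body (hypothetical helper)."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        # "M4A" is the enum value added by this commit (numeric value 8).
        "audioConfig": {"audioEncoding": audio_encoding},
    }


def decode_audio_content(response_json):
    """Decode the base64-encoded `audioContent` field into raw bytes."""
    return base64.b64decode(response_json["audioContent"])


body = build_synthesize_body("Hello, world", "en-US")
print(json.dumps(body, indent=2))

# Stand-in response: real audio bytes would come from the service.
fake_response = {
    "audioContent": base64.b64encode(b"\x00\x01fake-m4a").decode("ascii")
}
audio_bytes = decode_audio_content(fake_response)
```

Writing `audio_bytes` to a file with an `.m4a` extension would be the natural next step once a real response is available.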

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1/services/text_to_speech_long_audio_synthesize/async_client.py

Lines changed: 1 addition & 1 deletion
@@ -340,7 +340,7 @@ async def sample_synthesize_long_audio():
 input.text = "text_value"
 
 audio_config = texttospeech_v1.AudioConfig()
-audio_config.audio_encoding = "PCM"
+audio_config.audio_encoding = "M4A"
 
 voice = texttospeech_v1.VoiceSelectionParams()
 voice.language_code = "language_code_value"

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1/services/text_to_speech_long_audio_synthesize/client.py

Lines changed: 1 addition & 1 deletion
@@ -768,7 +768,7 @@ def sample_synthesize_long_audio():
 input.text = "text_value"
 
 audio_config = texttospeech_v1.AudioConfig()
-audio_config.audio_encoding = "PCM"
+audio_config.audio_encoding = "M4A"
 
 voice = texttospeech_v1.VoiceSelectionParams()
 voice.language_code = "language_code_value"

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1/types/cloud_tts.py

Lines changed: 23 additions & 1 deletion
@@ -81,7 +81,8 @@ class AudioEncoding(proto.Enum):
 
 Values:
     AUDIO_ENCODING_UNSPECIFIED (0):
-        Not specified. Will return result
+        Not specified. Only used by GenerateVoiceCloningKey.
+        Otherwise, will return result
         [google.rpc.Code.INVALID_ARGUMENT][google.rpc.Code.INVALID_ARGUMENT].
     LINEAR16 (1):
         Uncompressed 16-bit signed little-endian
@@ -109,6 +110,8 @@ class AudioEncoding(proto.Enum):
         samples (Linear PCM). Note that as opposed to
         LINEAR16, audio won't be wrapped in a WAV (or
         any other) header.
+    M4A (8):
+        M4A audio.
 """
 AUDIO_ENCODING_UNSPECIFIED = 0
 LINEAR16 = 1
@@ -117,6 +120,7 @@ class AudioEncoding(proto.Enum):
 MULAW = 5
 ALAW = 6
 PCM = 7
+M4A = 8
 
 
 class ListVoicesRequest(proto.Message):
@@ -520,6 +524,10 @@ class VoiceSelectionParams(proto.Message):
         [VoiceCloneParams.voice_clone_key] is set, the service
         chooses the voice clone matching the specified
         configuration.
+    model_name (str):
+        Optional. The name of the model. If set, the
+        service will choose the model matching the
+        specified configuration.
 """
 
 language_code: str = proto.Field(
@@ -545,6 +553,10 @@ class VoiceSelectionParams(proto.Message):
     number=5,
     message="VoiceCloneParams",
 )
+model_name: str = proto.Field(
+    proto.STRING,
+    number=6,
+)
 
 
 class AudioConfig(proto.Message):
@@ -802,6 +814,11 @@ class StreamingSynthesisInput(proto.Message):
         may not be used with any other voices.
 
         This field is a member of `oneof`_ ``input_source``.
+    prompt (str):
+        This is system instruction supported only for
+        controllable voice models.
+
+        This field is a member of `oneof`_ ``_prompt``.
 """
 
 text: str = proto.Field(
@@ -814,6 +831,11 @@ class StreamingSynthesisInput(proto.Message):
     number=5,
     oneof="input_source",
 )
+prompt: str = proto.Field(
+    proto.STRING,
+    number=6,
+    optional=True,
+)
 
 
 class StreamingSynthesizeRequest(proto.Message):
class StreamingSynthesizeRequest(proto.Message):

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1beta1/gapic_version.py

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-__version__ = "2.27.0" # {x-release-please-version}
+__version__ = "0.0.0" # {x-release-please-version}

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1beta1/services/text_to_speech/async_client.py

Lines changed: 2 additions & 2 deletions
@@ -438,7 +438,7 @@ async def sample_synthesize_speech():
 voice.language_code = "language_code_value"
 
 audio_config = texttospeech_v1beta1.AudioConfig()
-audio_config.audio_encoding = "PCM"
+audio_config.audio_encoding = "M4A"
 
 request = texttospeech_v1beta1.SynthesizeSpeechRequest(
     input=input,
@@ -547,7 +547,7 @@ def streaming_synthesize(
     metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
 ) -> Awaitable[AsyncIterable[cloud_tts.StreamingSynthesizeResponse]]:
     r"""Performs bidirectional streaming speech synthesis:
-    receive audio while sending text.
+    receives audio while sending text.
 
     .. code-block:: python
 

packages/google-cloud-texttospeech/google/cloud/texttospeech_v1beta1/services/text_to_speech/client.py

Lines changed: 2 additions & 2 deletions
@@ -856,7 +856,7 @@ def sample_synthesize_speech():
 voice.language_code = "language_code_value"
 
 audio_config = texttospeech_v1beta1.AudioConfig()
-audio_config.audio_encoding = "PCM"
+audio_config.audio_encoding = "M4A"
 
 request = texttospeech_v1beta1.SynthesizeSpeechRequest(
     input=input,
@@ -962,7 +962,7 @@ def streaming_synthesize(
     metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
 ) -> Iterable[cloud_tts.StreamingSynthesizeResponse]:
     r"""Performs bidirectional streaming speech synthesis:
-    receive audio while sending text.
+    receives audio while sending text.
 
     .. code-block:: python
 
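
The docstring's phrase "receives audio while sending text" describes a bidirectional stream. The sketch below illustrates that shape with plain Python generators and no network access. Everything in it is hypothetical: `request_stream` and `fake_streaming_synthesize` are stand-ins rather than library functions, and the messages are plain dicts rather than `StreamingSynthesizeRequest` objects. The config-first ordering reflects my understanding of the service documentation, not anything shown in this diff.

```python
from typing import Any, Dict, Iterable, Iterator


def request_stream(
    text_chunks: Iterable[str], language_code: str
) -> Iterator[Dict[str, Any]]:
    """Yield a config message first, then one input message per text chunk."""
    yield {"streaming_config": {"voice": {"language_code": language_code}}}
    for chunk in text_chunks:
        yield {"input": {"text": chunk}}


def fake_streaming_synthesize(
    requests: Iterable[Dict[str, Any]],
) -> Iterator[Dict[str, bytes]]:
    """Stand-in for the service: emit one fake audio chunk per input message."""
    config = None
    for request in requests:
        if config is None:
            if "streaming_config" not in request:
                raise ValueError("first message must carry streaming_config")
            config = request["streaming_config"]
            continue
        # Fake "audio": the UTF-8 bytes of the text that was just sent.
        yield {"audio_content": request["input"]["text"].encode("utf-8")}


# Generators are lazy, so each response is produced as soon as its
# request has been consumed, before later requests are generated.
responses = list(
    fake_streaming_synthesize(request_stream(["Hello, ", "world."], "en-US"))
)
for response in responses:
    print(response["audio_content"])
```

With the generated client, the same request iterator would be passed to `streaming_synthesize`, and the returned iterable would be consumed in a loop like the one above.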
