
Adding support for pronunciation_dict_id #56

Open
bpanahij wants to merge 14 commits into cartesia-ai:main from Tavus-Engineering:Adding-support-for-pronunciation_dict_id


@bpanahij bpanahij commented Nov 8, 2025

Summary

Adds support for the pronunciation_dict_id parameter to TTS generation requests, enabling users to apply custom pronunciation dictionaries to their text-to-speech generations.

Changes

  • Added pronunciation_dict_id parameter to GenerationRequestParams and GenerationRequest types
    • src/cartesia/tts/requests/generation_request.py:72-76
    • src/cartesia/tts/types/generation_request.py:77-79
  • Added test coverage for both SSE and WebSocket interfaces
    • tests/custom/test_client.py:434-455 - SSE test
    • tests/custom/test_client.py:610-641 - WebSocket async test

Implementation Details

The pronunciation_dict_id parameter is:

  • Optional (typing_extensions.NotRequired in TypedDict, typing.Optional[str] in Pydantic model)
  • Available in both SSE and WebSocket generation methods
  • Applied on a per-generation basis, allowing different pronunciation dictionaries for different requests
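
The optionality described above can be sketched as follows. This is a minimal illustration, not the SDK's actual definitions: the class names `GenerationRequestParamsSketch` and `GenerationRequestSketch`, the helper `to_payload`, the trimmed field set, and the ID `dict_123` are all hypothetical. A standard-library dataclass stands in for the Pydantic model, and dropping the key when it is unset is an assumption about serialization.

```python
from dataclasses import dataclass
from typing import Optional, TypedDict

try:  # NotRequired lives in typing from Python 3.11 onward
    from typing import NotRequired
except ImportError:  # older interpreters need typing_extensions
    from typing_extensions import NotRequired


class GenerationRequestParamsSketch(TypedDict):
    """Hypothetical, trimmed-down stand-in for the request TypedDict."""

    model_id: str
    transcript: str
    # Optional key: callers may leave it out entirely.
    pronunciation_dict_id: NotRequired[str]


@dataclass
class GenerationRequestSketch:
    """Dataclass stand-in for the Pydantic model (not the SDK's class)."""

    model_id: str
    transcript: str
    pronunciation_dict_id: Optional[str] = None


def to_payload(request: GenerationRequestSketch) -> dict:
    """Build a JSON-ready dict, omitting the dictionary ID when unset (assumed behavior)."""
    payload = {"model_id": request.model_id, "transcript": request.transcript}
    if request.pronunciation_dict_id is not None:
        payload["pronunciation_dict_id"] = request.pronunciation_dict_id
    return payload


# Each request carries its own dictionary ID, or none at all.
plain = to_payload(GenerationRequestSketch("example-model", "Hello"))
custom = to_payload(
    GenerationRequestSketch("example-model", "Hello", pronunciation_dict_id="dict_123")
)
```

Because the ID travels with each request rather than with the client, two generations issued back to back can use different dictionaries, or none.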

Test Plan

  • ✅ Added test_sse_pronunciation_dict() to verify SSE endpoint accepts the parameter
  • ✅ Added test_ws_pronunciation_dict() to verify WebSocket endpoint accepts the parameter
  • ✅ Both tests validate audio generation works correctly with the parameter
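
For readers without API access, the shape of such a check can be sketched offline. The tests in this PR exercise the real endpoints; the version below swaps in a hypothetical `FakeTTS` stub that records the outgoing payload and yields canned bytes, so it only illustrates the two assertions (the parameter is forwarded, and audio comes back). None of these names come from the SDK.

```python
from typing import Any, Dict, Iterator, List, Optional


class FakeTTS:
    """Hypothetical stand-in for a TTS client; records payloads instead of calling the API."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    def sse(
        self, *, transcript: str, pronunciation_dict_id: Optional[str] = None
    ) -> Iterator[bytes]:
        payload: Dict[str, Any] = {"transcript": transcript}
        if pronunciation_dict_id is not None:
            payload["pronunciation_dict_id"] = pronunciation_dict_id
        self.sent.append(payload)
        # Pretend the server streamed back two audio chunks.
        yield b"\x00\x01"
        yield b"\x02\x03"


def test_sse_pronunciation_dict_offline() -> None:
    tts = FakeTTS()
    # Consuming the generator triggers the (fake) request.
    chunks = list(tts.sse(transcript="Hello", pronunciation_dict_id="dict_123"))
    # The parameter reached the outgoing payload.
    assert tts.sent[0]["pronunciation_dict_id"] == "dict_123"
    # Audio bytes were produced.
    assert b"".join(chunks) == b"\x00\x01\x02\x03"
```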

This enables users of the official Cartesia Python client to leverage custom pronunciation dictionaries when generating speech.

type:
string
A pronunciation dict ID to use for the generation. This will be applied
to this TTS generation only.

This enables the use of custom pronunciation dicts when using the
official Cartesia Python client.
…nciation_dict_id

@noahlt noahlt mentioned this pull request Nov 13, 2025
noahlt added a commit that referenced this pull request Nov 13, 2025
Takes changes from #56 and adds support in the bytes method and the ws send wrapper methods.

---------

Co-authored-by: Brian Johnson <brian@pjohnson.info>