Commit c68c180

docs: update README.md to reflect new API features and usage examples
1 parent 1ea4829 commit c68c180

1 file changed: +153 -120 lines changed

README.md

Lines changed: 153 additions & 120 deletions
@@ -6,212 +6,245 @@
66
[![codecov](https://img.shields.io/codecov/c/github/fishaudio/fish-audio-python)](https://codecov.io/gh/fishaudio/fish-audio-python)
77
[![License](https://img.shields.io/github/license/fishaudio/fish-audio-python)](https://github.com/fishaudio/fish-audio-python/blob/main/LICENSE)
88

9-
The official Python library for the Fish Audio API.
9+
The official Python library for the Fish Audio API
1010

11-
## Notice: New API Available
11+
**Documentation:** [Python SDK Guide](https://docs.fish.audio/developer-guide/sdk-guide/python/) | [API Reference](https://docs.fish.audio/api-reference/sdk/python/)
1212

13-
The SDK now includes a modern `fishaudio` API with improved ergonomics, better type safety, and enhanced features.
14-
15-
For new projects, use the `fishaudio` module. For existing projects using the legacy API, see the [Legacy SDK section](#legacy-sdk) below
16-
17-
## API Documentation
18-
19-
For complete documentation and API reference, visit the [Python SDK Guide](https://docs.fish.audio/developer-guide/sdk-guide/python/) and [API Reference](https://docs.fish.audio/api-reference/sdk/python/).
13+
> **Note:** If you're using the legacy `fish_audio_sdk` API, see the [migration guide](https://docs.fish.audio/archive/python-sdk-legacy/migration-guide) to upgrade.
2014
2115
## Installation
2216

23-
This package is available on PyPI:
24-
2517
```bash
2618
pip install fish-audio-sdk
19+
20+
# With audio playback utilities
21+
pip install fish-audio-sdk[utils]
2722
```
2823

29-
You may install from source by running the following command in the repository root:
24+
## Authentication
25+
26+
Get your API key from [fish.audio/app/api-keys](https://fish.audio/app/api-keys):
3027

3128
```bash
32-
python -m pip install .
29+
export FISH_API_KEY=your_api_key_here
3330
```
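A minimal sketch of checking for the key before constructing a client, using only the standard library. The helper name `require_api_key` is hypothetical and not part of the SDK; only the `FISH_API_KEY` variable name comes from the README.

```python
import os


def require_api_key(env=None, var="FISH_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; it is not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it or pass api_key explicitly")
    return key
```

Failing early like this tends to give a clearer message than an authentication error surfacing on the first request.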
3431

35-
## Usage
36-
37-
The client will need to be configured with an API key, which you can obtain from [Fish Audio](https://fish.audio/app/api-keys).
32+
Or provide directly:
3833

3934
```python
4035
from fishaudio import FishAudio
4136

42-
client = FishAudio() # Automatically reads from the FISH_API_KEY environment variable
43-
44-
client = FishAudio(api_key="your-api-key") # Or provide the API key directly
37+
client = FishAudio(api_key="your_api_key")
4538
```
4639

47-
The SDK provides [text-to-speech](#text-to-speech), [voice cloning](#instant-voice-cloning), [speech recognition](#speech-recognition-asr), and [voice management](#voice-management) capabilities.
48-
49-
### Text-to-Speech
50-
51-
Convert text to natural-sounding speech with support for multiple voices, formats, and real-time streaming.
40+
## Quick Start
5241

53-
#### Basic
42+
**Synchronous:**
5443

5544
```python
5645
from fishaudio import FishAudio
57-
from fishaudio.utils import save, play
46+
from fishaudio.utils import play, save
5847

5948
client = FishAudio()
6049

61-
audio = client.tts.convert(text="Hello, world!") # Default voice and settings
62-
play(audio) # Play audio directly
50+
# Generate audio
51+
audio = client.tts.convert(text="Hello, world!")
6352

64-
audio = client.tts.convert(text="Welcome to Fish Audio SDK!")
65-
save(audio, "output.mp3") # You can also save to a file
53+
# Play or save
54+
play(audio)
55+
save(audio, "output.mp3")
56+
```
57+
58+
**Asynchronous:**
59+
60+
```python
61+
import asyncio
62+
from fishaudio import AsyncFishAudio
63+
from fishaudio.utils import play, save
64+
65+
async def main():
66+
    client = AsyncFishAudio()
67+
    audio = await client.tts.convert(text="Hello, world!")
68+
    play(audio)
69+
    save(audio, "output.mp3")
70+
71+
asyncio.run(main())
6672
```
6773

68-
#### With Reference Voice
74+
## Core Features
6975

70-
Use a reference voice ID to ensure consistent voice characteristics across generations:
76+
### Text-to-Speech
77+
78+
**With custom voice:**
7179

7280
```python
73-
# Use an existing voice by ID
81+
# Use a specific voice by ID
7482
audio = client.tts.convert(
75-
    text="This will sound like the reference voice!",
76-
    reference_id="802e3bc2b27e49c2995d23ef70e6ac89" # Energetic Male
83+
    text="Custom voice",
84+
    reference_id="802e3bc2b27e49c2995d23ef70e6ac89"
7785
)
7886
```
7987

80-
#### Instant Voice Cloning
81-
82-
Immediately clone a voice from a short audio sample:
88+
**With speed control:**
8389

8490
```python
85-
# Clone a voice from audio sample
86-
with open("reference.wav", "rb") as f:
87-
    audio = client.tts.convert(
88-
        text="This will sound like the reference voice!",
89-
        reference_audio=f.read(),
90-
        reference_text="Transcription of the reference audio"
91-
    )
91+
audio = client.tts.convert(
92+
    text="Speaking faster!",
93+
    speed=1.5 # 1.5x speed
94+
)
9295
```
9396

94-
#### Streaming Audio Chunks
97+
**Reusable configuration:**
98+
99+
```python
100+
from fishaudio.types import TTSConfig, Prosody
95101

96-
For processing audio chunks as they're generated:
102+
config = TTSConfig(
103+
    prosody=Prosody(speed=1.2, volume=-5),
104+
    reference_id="933563129e564b19a115bedd57b7406a",
105+
    format="wav",
106+
    latency="balanced"
107+
)
108+
109+
# Reuse across generations
110+
audio1 = client.tts.convert(text="First message", config=config)
111+
audio2 = client.tts.convert(text="Second message", config=config)
112+
```
113+
114+
**Chunk-by-chunk processing:**
97115

98116
```python
99-
# Stream and process audio chunks
100-
for chunk in client.tts.stream(text="Long text content..."):
101-
    # Process each chunk as it arrives
117+
# Stream and process chunks as they arrive
118+
for chunk in client.tts.stream(text="Long content..."):
102119
    send_to_websocket(chunk)
103120

104121
# Or collect all chunks
105122
audio = client.tts.stream(text="Hello!").collect()
106123
```
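As a sketch of what "process each chunk as it arrives" can look like, the helper below writes chunks to a file-like sink incrementally instead of holding the whole clip in memory. It assumes the stream yields `bytes`; `write_chunks` and `fake_stream` are hypothetical names, and `fake_stream` only stands in for a real audio stream.

```python
import io


def write_chunks(chunks, sink):
    """Write each bytes chunk to `sink` as it arrives; return total bytes written."""
    total = 0
    for chunk in chunks:
        if not chunk:  # skip empty chunks
            continue
        sink.write(chunk)
        total += len(chunk)
    return total


def fake_stream():
    # Hypothetical stand-in for an iterable of audio bytes.
    yield b"abc"
    yield b""
    yield b"de"


buffer = io.BytesIO()
written = write_chunks(fake_stream(), buffer)
```

In real use the sink could be a file opened with `open("output.mp3", "wb")`.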
107124

108-
#### Real-time WebSocket Streaming
125+
[Learn more](https://docs.fish.audio/developer-guide/sdk-guide/python/text-to-speech)
109126

110-
For low-latency bidirectional streaming where you send text chunks and receive audio in real-time:
127+
### Speech-to-Text
111128

112129
```python
113-
from fishaudio import FishAudio
114-
from fishaudio.utils import play
130+
# Transcribe audio
131+
with open("audio.wav", "rb") as f:
132+
    result = client.asr.transcribe(audio=f.read(), language="en")
115133

116-
client = FishAudio()
134+
print(result.text)
135+
136+
# Access timestamped segments
137+
for segment in result.segments:
138+
    print(f"[{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")
139+
```
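A small sketch of turning timestamped segments into `mm:ss` transcript lines. The `Segment` dataclass here is a hypothetical stand-in that only mirrors the attribute names used above (`start`, `end`, `text`), and it assumes the timestamps are in seconds; it is not the SDK's own type.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical stand-in with the same attribute names the README uses.
    start: float
    end: float
    text: str


def format_timestamp(seconds):
    """Format a duration in seconds as mm:ss.xx."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"


def to_lines(segments):
    """Render each segment as '[start - end] text'."""
    return [
        f"[{format_timestamp(s.start)} - {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    ]
```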
140+
141+
[Learn more](https://docs.fish.audio/developer-guide/sdk-guide/python/speech-to-text)
142+
143+
### Real-time Streaming
144+
145+
Stream dynamically generated text for conversational AI and live applications:
146+
147+
**Synchronous:**
117148

118-
# Stream text chunks and receive audio in real-time
149+
```python
119150
def text_chunks():
120151
    yield "Hello, "
121152
    yield "this is "
122-
    yield "streaming audio!"
153+
    yield "streaming!"
123154

124155
audio_stream = client.tts.stream_websocket(text_chunks(), latency="balanced")
125156
play(audio_stream)
126157
```
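In practice the text generator usually wraps some upstream source rather than hard-coded strings. The sketch below splits a block of text into sentence-sized pieces with a regular expression; `sentence_chunks` is a hypothetical helper, and the splitting rule (break after `.`, `!`, or `?`) is an assumption chosen for illustration.

```python
import re


def sentence_chunks(text):
    """Yield `text` in sentence-sized pieces, keeping trailing whitespace."""
    for match in re.finditer(r"[^.!?]+[.!?]*\s*", text):
        yield match.group(0)


pieces = list(sentence_chunks("Hello there. How are you? Fine"))
```

A generator like this could be passed wherever `text_chunks()` is used above.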
127158

128-
### Speech Recognition (ASR)
129-
130-
To transcribe audio to text:
159+
**Asynchronous:**
131160

132161
```python
133-
from fishaudio import FishAudio
134-
135-
client = FishAudio()
162+
async def text_chunks():
163+
    yield "Hello, "
164+
    yield "this is "
165+
    yield "streaming!"
136166

137-
# Transcribe audio to text
138-
with open("audio.wav", "rb") as f:
139-
    result = client.asr.transcribe(audio=f.read())
140-
print(result.text)
167+
audio_stream = await client.tts.stream_websocket(text_chunks(), latency="balanced")
168+
play(audio_stream)
141169
```
142170

143-
### Voice Management
171+
[Learn more](https://docs.fish.audio/developer-guide/sdk-guide/python/websocket)
144172

145-
Manage voice references and list available voices.
173+
### Voice Cloning
146174

147-
```python
148-
from fishaudio import FishAudio
175+
**Instant cloning:**
149176

150-
client = FishAudio()
151-
152-
# List available voices
153-
voices = client.voices.list(language="en", tags="male")
154-
155-
# Get a specific voice by ID
156-
voice = client.voices.get(voice_id="802e3bc2b27e49c2995d23ef70e6ac89")
177+
```python
178+
from fishaudio.types import ReferenceAudio
157179

158-
# Create a custom voice
159-
with open("voice_sample.wav", "rb") as f:
160-
    new_voice = client.voices.create(
161-
        title="My Custom Voice",
162-
        voices=[f.read()],
163-
        description="My cloned voice"
180+
# Clone voice on-the-fly
181+
with open("reference.wav", "rb") as f:
182+
    audio = client.tts.convert(
183+
        text="Cloned voice speaking",
184+
        references=[ReferenceAudio(
185+
            audio=f.read(),
186+
            text="Text spoken in reference"
187+
        )]
164188
    )
165189
```
166190

167-
### Async Usage
168-
169-
You can also use the SDK in asynchronous applications:
191+
**Persistent voice models:**
170192

171193
```python
172-
import asyncio
173-
from fishaudio import AsyncFishAudio
174-
175-
async def main():
176-
    client = AsyncFishAudio()
177-
178-
    audio = await client.tts.convert(text="Async text-to-speech!")
179-
    # Process audio...
194+
# Create voice model for reuse
195+
with open("voice_sample.wav", "rb") as f:
196+
    voice = client.voices.create(
197+
        title="My Voice",
198+
        voices=[f.read()],
199+
        description="Custom voice clone"
200+
    )
180201

181-
asyncio.run(main())
202+
# Use the created model
203+
audio = client.tts.convert(
204+
    text="Using my saved voice",
205+
    reference_id=voice.id
206+
)
182207
```
183208

184-
### Account
209+
[Learn more](https://docs.fish.audio/developer-guide/sdk-guide/python/voice-cloning)
185210

186-
Check your remaining API credits, usage, and account details:
211+
## Resource Clients
187212

188-
```python
189-
from fishaudio import FishAudio
213+
| Resource | Description | Key Methods |
214+
|----------|-------------|-------------|
215+
| `client.tts` | Text-to-speech | `convert()`, `stream()`, `stream_websocket()` |
216+
| `client.asr` | Speech recognition | `transcribe()` |
217+
| `client.voices` | Voice management | `list()`, `get()`, `create()`, `update()`, `delete()` |
218+
| `client.account` | Account info | `get_credits()`, `get_package()` |
190219

191-
client = FishAudio()
192-
credits = client.account.get_credits()
193-
print(f"Remaining credits: {credits.credit}")
194-
```
195-
196-
197-
### Optional Dependencies
220+
## Error Handling
198221

199-
For audio playback utilities to help with playing and saving audio files, install the `utils` extra:
222+
```python
223+
from fishaudio.exceptions import (
224+
    AuthenticationError,
225+
    RateLimitError,
226+
    ValidationError,
227+
    FishAudioError
228+
)
200229

201-
```bash
202-
pip install fish-audio-sdk[utils]
230+
try:
231+
    audio = client.tts.convert(text="Hello!")
232+
except AuthenticationError:
233+
    print("Invalid API key")
234+
except RateLimitError:
235+
    print("Rate limit exceeded")
236+
except ValidationError as e:
237+
    print(f"Invalid request: {e}")
238+
except FishAudioError as e:
239+
    print(f"API error: {e}")
203240
```
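Rate-limit errors are often worth retrying rather than just reporting. The following is a generic sketch of retrying with exponential backoff; `with_retries` and `TransientError` are hypothetical names and not part of the SDK, and the delay values are arbitrary assumptions.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error such as a rate limit."""


def with_retries(call, retryable=(TransientError,), attempts=3,
                 base_delay=0.5, sleep=time.sleep):
    """Run `call()`, retrying on `retryable` errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```

One way to apply it, assuming the exception types shown above, would be `with_retries(lambda: client.tts.convert(text="Hello!"), retryable=(RateLimitError,))`.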
204241

205-
## Legacy SDK
206-
207-
The legacy `fish_audio_sdk` module continues to be supported for existing projects:
242+
## Resources
208243

209-
```python
210-
from fish_audio_sdk import Session
211-
212-
session = Session("your_api_key")
213-
```
244+
- **Documentation:** [SDK Guide](https://docs.fish.audio/developer-guide/sdk-guide/python/) | [API Reference](https://docs.fish.audio/api-reference/sdk/python/)
245+
- **Package:** [PyPI](https://pypi.org/project/fish-audio-sdk/) | [GitHub](https://github.com/fishaudio/fish-audio-python)
246+
- **Legacy SDK:** [Documentation](https://docs.fish.audio/archive/python-sdk-legacy) | [Migration Guide](https://docs.fish.audio/archive/python-sdk-legacy/migration-guide)
214247

215-
For complete legacy SDK documentation, see the [Legacy API Documentation](https://docs.fish.audio/archive/python-sdk-legacy).
248+
## License
216249

217-
We recommend migrating to the new `fishaudio` module - see our [Migration Guide](https://docs.fish.audio/archive/python-sdk-legacy/migration-guide) for assistance.
250+
This project is licensed under the Apache-2.0 License - see the [LICENSE](LICENSE) file for details.
