Commit b6ea04e

Merge pull request #41 from himanshumahajan138/feature/add-text-to-speech
Fixes: #38 ; Adds Text-to-Speech Converter using gTTS
2 parents d49deeb + 6194813 commit b6ea04e

4 files changed: +183 -0 lines changed

Text To Speech/README.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# Script Name
**Text to Speech Converter using gTTS**

- This script converts text into speech using the `gTTS` (Google Text-to-Speech) library, which calls Google Translate's text-to-speech API, and saves the output as an `.mp3` audio file.
- It allows for customization of language, speech speed, accents, and other pre-processing and tokenizing options.
- Features:
  - Support for multiple languages using IETF language tags.
  - Localized accents via different Google Translate top-level domains (`tld`).
  - Option to slow down speech for easier comprehension.
  - Custom text pre-processing and tokenization options.
  - Timeout control for network requests.
  - Optional playback of the audio file after saving (commented out in the script by default).

# Description
This script provides a convenient interface for converting text into speech using the `gTTS` library. The text can be read in multiple languages, at normal or slow speed, and with various localized accents. The script also includes advanced options for pre-processing the input text and customizing how it is tokenized before it is sent to the Google Translate text-to-speech API.

### Key Features:
- **Multilingual Support**: Specify different languages using IETF language tags (`en`, `es`, etc.).
- **Accents**: Use top-level domains (`tld`), such as `com`, `co.uk`, etc., to localize the accent.
- **Custom Speed**: Option to slow down the speech for better understanding.
- **Pre-Processing**: Built-in support for text pre-processing (e.g., removing punctuation).
- **Timeout**: Set timeout limits for the API request.

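As a rough illustration of how the language, accent, and speed options above might be combined, the sketch below maps a few preset names to `lang`/`tld` pairs and builds the keyword arguments for `gTTS`. The preset names and the helpers `build_gtts_kwargs` and `speak` are assumptions made for this example and are not part of the script; the `tld` values shown are ones the gTTS documentation lists for localized English accents.

```python
# Hypothetical presets: the names are illustrative, not defined by gTTS or the script.
ACCENT_PRESETS = {
    "en-default": {"lang": "en", "tld": "com"},
    "en-uk": {"lang": "en", "tld": "co.uk"},
    "en-au": {"lang": "en", "tld": "com.au"},
    "en-in": {"lang": "en", "tld": "co.in"},
}


def build_gtts_kwargs(preset, slow=False):
    """Return keyword arguments for gTTS based on a preset name."""
    if preset not in ACCENT_PRESETS:
        raise KeyError(f"Unknown preset: {preset!r}")
    kwargs = dict(ACCENT_PRESETS[preset])  # copy so the preset table is not mutated
    kwargs["slow"] = slow
    return kwargs


def speak(text, preset="en-uk", slow=False, output_file="preset_output.mp3"):
    """Synthesize text with the chosen preset (needs gTTS and network access)."""
    from gtts import gTTS  # imported lazily so the helpers above work without gTTS

    tts = gTTS(text=text, **build_gtts_kwargs(preset, slow=slow))
    tts.save(output_file)
    return output_file


if __name__ == "__main__":
    # Only builds and prints the arguments; no network request is made here.
    print(build_gtts_kwargs("en-uk", slow=True))
```

Keeping the presets in a plain dictionary means a new accent can be added without touching the synthesis code.
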
# Prerequisites
The following library is required to run the script:
```bash
pip install gtts
```

The script also imports the standard-library `os` module, which needs no installation.

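To confirm that the prerequisite is in place before running the script, a check along the following lines could be used. This is a minimal sketch: the helper name `installed_version` is an assumption for illustration, and it relies only on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("gTTS")
    if found is None:
        print("gTTS is not installed; run: pip install gtts")
    else:
        print(f"gTTS {found} is installed")
```

The reported version can then be compared by eye with the one pinned in `requirements.txt`.
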
# Installation Instructions
1. **Clone the Repository**:
   Clone this repository to your local machine using:
   ```bash
   git clone <repository-url>
   ```

2. **Install Dependencies**:
   From the repository root, install the required packages:
   ```bash
   pip install -r "Text To Speech/requirements.txt"
   ```

3. **Run the Script**:
   The script saves its output under `Text To Speech/` relative to the current working directory, so run it from the repository root:
   ```bash
   python "Text To Speech/text_to_speech.py"
   ```

4. **Customize the Script**:
   You can modify the input text, language, speed, and other options directly in the script:
   ```python
   text_to_speech("Hello, welcome to the gTTS Python tutorial.", lang='en', slow=False)
   ```

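Among the "other options", custom pre-processing is the least obvious to set up. gTTS calls each function in `pre_processor_funcs` in order, and each function takes a string and returns a string. The sketch below shows two such functions; their names (`expand_doctor`, `collapse_whitespace`) and the helper `apply_pre_processors` are assumptions for illustration, and the helper only mimics the in-order application so the result can be inspected without a network request.

```python
import re


def expand_doctor(text):
    """Replace the abbreviation 'Dr.' with 'Doctor'."""
    return re.sub(r"\bDr\.", "Doctor", text)


def collapse_whitespace(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def apply_pre_processors(text, funcs):
    """Apply each function in order, the way a pre-processor list is consumed."""
    for func in funcs:
        text = func(text)
    return text


custom_pre_processors = [expand_doctor, collapse_whitespace]

if __name__ == "__main__":
    sample = "Dr.  Smith   is a great person."
    print(apply_pre_processors(sample, custom_pre_processors))
    # With the script, the same list could be passed as:
    # text_to_speech(sample, pre_processor_funcs=custom_pre_processors)
```

A list passed this way is used instead of the default pre-processors, so any default behavior that is still wanted has to be included in the list explicitly.
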
# Output
### Example output:
After running the script with the text `"Hello, welcome to the gTTS Python tutorial."`, the output file `output.mp3` is generated in the `Text To Speech/` folder and the script prints `Audio saved at Text To Speech/output.mp3`.


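One way to confirm that the audio file was actually written is to check for it after the run. The following is only a sketch: the helper `describe_output` is hypothetical, and the path assumes the script was run from the repository root.

```python
from pathlib import Path


def describe_output(path):
    """Return a short status string for the expected audio file."""
    file = Path(path)
    if not file.is_file():
        return f"{file} not found"
    return f"{file} exists ({file.stat().st_size} bytes)"


if __name__ == "__main__":
    print(describe_output("Text To Speech/output.mp3"))
```
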
# Author
**[Himanshu Mahajan](https://github.com/himanshumahajan138)**

Text To Speech/requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
gTTS==2.5.2

Text To Speech/runtime.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
python-3.10.7

Text To Speech/text_to_speech.py

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
from gtts import gTTS
import os  # used only by the optional playback code below


def text_to_speech(
    text,
    lang="en",
    tld="com",
    slow=False,
    lang_check=True,
    pre_processor_funcs=None,
    tokenizer_func=None,
    timeout=None,
    output_file="output.mp3",
):
    """
    Convert the provided text to speech and save it as an audio file.

    Args:
        text (string): The text to be read.
        lang (string, optional): The language (IETF language tag) to read the text in. Default is 'en'.
        tld (string, optional): Top-level domain for Google Translate host (e.g., 'com', 'co.uk').
            This affects accent localization. Default is 'com'.
        slow (bool, optional): If True, reads the text more slowly. Default is False.
        lang_check (bool, optional): If True, enforces valid language, raising a ValueError if unsupported. Default is True.
        pre_processor_funcs (list, optional): List of pre-processing functions to modify the text before tokenizing.
            Default is None, which leaves gTTS's built-in pre-processors in place.
        tokenizer_func (callable, optional): Function to tokenize the text.
            Default is None, which leaves gTTS's built-in tokenizer in place.
        timeout (float or tuple, optional): Seconds to wait for server response. Can be a float or a (connect, read) tuple.
            Default is None (wait indefinitely).
        output_file (string): File name for the output audio, saved under 'Text To Speech/' (default: 'output.mp3').

    Handled errors (printed, not re-raised):
        AssertionError: When text is None or empty.
        ValueError: When lang_check is True and lang is unsupported.
        RuntimeError: When lang_check is True and the language list cannot be loaded.
    """

    # Pass the pre-processors and tokenizer only when the caller supplied them,
    # so that gTTS falls back to its own built-in defaults otherwise.
    optional_args = {}
    if pre_processor_funcs is not None:
        optional_args["pre_processor_funcs"] = pre_processor_funcs
    if tokenizer_func is not None:
        optional_args["tokenizer_func"] = tokenizer_func

    # The output path is relative to the current working directory,
    # so the script is expected to be run from the repository root.
    output_path = f"Text To Speech/{output_file}"

    try:
        # Create the gTTS object with the provided arguments
        tts = gTTS(
            text=text,
            lang=lang,
            tld=tld,
            slow=slow,
            lang_check=lang_check,
            timeout=timeout,
            **optional_args,
        )

        # Save the audio file
        tts.save(output_path)
        print(f"Audio saved at {output_path}")

        # Optionally, play the audio file
        # if os.name == "nt":  # Windows
        #     os.system(f'start "" "{output_path}"')
        # else:  # Linux (on macOS, use "open" instead of "xdg-open")
        #     os.system(f'xdg-open "{output_path}"')

    except AssertionError as ae:
        print(f"Assertion Error: {ae}")
    except ValueError as ve:
        print(f"Value Error: {ve}")
    except RuntimeError as re:
        print(f"Runtime Error: {re}")


if __name__ == "__main__":
    # Example usage of the text_to_speech function with various arguments

    # Basic example (English, default options)
    text = "Hello, welcome to the gTTS Python tutorial."
    text_to_speech(text)

    # # Custom example (Spanish, slow speech, and custom file name)
    # text_to_speech(
    #     "Hola, bienvenido al tutorial de gTTS.",
    #     lang="es",
    #     slow=True,
    #     output_file="spanish_slow.mp3",
    # )

    # # Custom example with localized accent (UK English)
    # text_to_speech(
    #     "Hello! How are you today?",
    #     lang="en",
    #     tld="co.uk",
    #     output_file="british_accent.mp3",
    # )

    # # You can pass custom pre-processor functions to modify the text before it’s tokenized.
    # text_to_speech(
    #     "Dr. Smith is a great person.",
    #     pre_processor_funcs=[lambda x: x.replace(".", "")],
    #     output_file="custom_pre-processor.mp3",
    # )

    # # You can set a timeout to limit how long the request to Google Translate waits.
    # text_to_speech(
    #     "This will timeout after 5 seconds.",
    #     output_file="timeout.mp3",
    #     timeout=5.0,
    # )
