Commit 5b7c2b9

Merge pull request #6 from Goekdeniz-Guelmez/adding-three-to-five-speacker-formats
v1.1.0
2 parents 84bcea6 + eec8655 commit 5b7c2b9

9 files changed: +409 -117 lines changed

README.md

Lines changed: 20 additions & 6 deletions
@@ -2,7 +2,7 @@

 ![logo](logo.jpeg)

-A local AI-powered tool that converts PDF documents into engaging podcasts, using local LLMs and TTS models.
+A local AI-powered tool that converts PDF documents into engaging audio's such as podcasts, using local LLMs and TTS models.

 ## Features

@@ -68,7 +68,10 @@ You can use the default configuration or create a custom JSON config file with t

 ```json
 {
-"Co-Host-Speaker-Voice": "af_sky+af_bella",
+"Co-Host-Speaker-1-Voice": "af_sky+af_bella",
+"Co-Host-Speaker-2-Voice": "af_echo",
+"Co-Host-Speaker-3-Voice": "af_nova",
+"Co-Host-Speaker-4-Voice": "af_shimmer",
 "Host-Speaker-Voice": "af_alloy",

 "Small-Text-Model": {
@@ -224,10 +227,9 @@ python -m local_notebooklm.start --pdf PATH_TO_PDF [options]
 | `--length` | Content length (short, medium, long, very-long) | medium |
 | `--style` | Content style (normal, casual, formal, technical, academic, friendly, gen-z, funny) | normal |
 | `--preference` | Additional focus preferences or instructions | None |
+| `--language` | Language the audio should be in | english |
 | `--output-dir` | Directory to store output files | ./output |

-Local-NotebookLM currently does NOT support multible languages other then english, you can try working around it by adding a text in the preferences saying what language the audio should be, also be sure the TTS model supports your desired language.
-
 #### Format Types

 Local-NotebookLM now supports both single-speaker and two-speaker formats:
@@ -251,6 +253,10 @@ Local-NotebookLM now supports both single-speaker and two-speaker formats:
 - q-and-a
 - meeting

+**Multi-Speaker Formats:**
+- panel-discussion (3, 4, or 5 speakers)
+- debate (3, 4, or 5 speakers)
+
 #### Example Commands

 Basic usage:
@@ -270,7 +276,7 @@ python -m local_notebooklm.start --pdf documents/research_paper.pdf --preference

 Using custom config:
 ```bash
-python -m local_notebooklm.start --pdf documents/research_paper.pdf --config custom_config.json --output-dir ./my_podcast
+python -m local_notebooklm.start --pdf documents/research_paper.pdf --config custom_config.json --output-dir ./my_podcast --language german
 ```

 ### Programmatic API
@@ -287,7 +293,8 @@ success, result = podcast_processor(
 length="long",
 style="professional",
 preference="Focus on the key technical aspects",
-output_dir="./test_output"
+output_dir="./test_output",
+language="english"
 )

 if success:
@@ -441,6 +448,13 @@ flowchart TD
 class pdf,file1,file2,file3,fileAudio data
 ```

+## Multiple Language Support
+
+Local-NotebookLM now supports multiple languages. You can specify the language when using the programmatic API or through the command line.
+
+**Important Note:** When using a non-English language, ensure that both your selected LLM and TTS models support the desired language. Language support varies significantly between different models and providers. For optimal results, verify that your chosen models have strong capabilities in your target language before processing.
+
+
 ## Output Files

 The pipeline generates the following files:
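
Reviewer sketch (not part of the commit): the README config hunk replaces the single `Co-Host-Speaker-Voice` key with numbered keys `Co-Host-Speaker-1-Voice` through `Co-Host-Speaker-4-Voice`. One way a consumer might collect one voice per speaker is shown below. The helper name `voices_for_speakers`, the host-first ordering, and the limit of five speakers are assumptions for illustration; the diff does not show how the project actually reads these keys.

```python
from typing import Dict, List

# Key names are taken from the README config example in this commit.
HOST_KEY = "Host-Speaker-Voice"
CO_HOST_KEY_TEMPLATE = "Co-Host-Speaker-{n}-Voice"


def voices_for_speakers(config: Dict[str, object], num_speakers: int) -> List[str]:
    """Hypothetical helper: return one voice per speaker, host first."""
    if not 1 <= num_speakers <= 5:
        raise ValueError("num_speakers must be between 1 and 5")
    # Speaker 1 is assumed to be the host; co-hosts follow in numeric order.
    keys = [HOST_KEY] + [
        CO_HOST_KEY_TEMPLATE.format(n=n) for n in range(1, num_speakers)
    ]
    missing = [key for key in keys if key not in config]
    if missing:
        raise KeyError(f"config is missing voice keys: {missing}")
    return [str(config[key]) for key in keys]


# Values copied from the README example in the diff above.
example_config = {
    "Host-Speaker-Voice": "af_alloy",
    "Co-Host-Speaker-1-Voice": "af_sky+af_bella",
    "Co-Host-Speaker-2-Voice": "af_echo",
    "Co-Host-Speaker-3-Voice": "af_nova",
    "Co-Host-Speaker-4-Voice": "af_shimmer",
}

three_speaker_voices = voices_for_speakers(example_config, 3)
```

Failing loudly on a missing key is a deliberate choice in this sketch: a config written for the old single `Co-Host-Speaker-Voice` key would otherwise silently produce a speaker with no voice.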

local_notebooklm/processor.py

Lines changed: 4 additions & 2 deletions
@@ -14,7 +14,8 @@ def podcast_processor(
 style="normal",
 preference="nothing",
 output_dir="./output",
-skip_to=None
+skip_to: int = None,
+language: str = "english"
 ):
 # Load config
 if config_path:
@@ -116,7 +117,8 @@ def podcast_processor(
 input_file=transcript_file,
 output_dir=str(output_dirs["step3"]),
 format_type=format_type,
-system_prompt=system_prompts["step3"]
+system_prompt=system_prompts["step3"],
+language=language
 )
 else:
 print("Skipping Step 3, assuming files exist in output directory...")
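
Reviewer sketch (not part of the commit): the signature change above adds a `language` keyword with default `"english"`, and the README documents a matching `--language` flag. A minimal way to forward such a flag with `argparse` might look like the following. The function names `build_parser` and `processor_kwargs` are hypothetical, only three of the documented flags are modelled, and the project's real `start` module is not shown in this diff, so it may be wired differently.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only a few of the documented flags."""
    parser = argparse.ArgumentParser(prog="local_notebooklm.start")
    parser.add_argument("--pdf", required=True, help="Path to the input PDF")
    # Defaults mirror the README options table in this commit.
    parser.add_argument("--language", default="english",
                        help="Language the audio should be in")
    parser.add_argument("--output-dir", default="./output",
                        help="Directory to store output files")
    return parser


def processor_kwargs(argv: List[str]) -> Dict[str, str]:
    """Translate CLI arguments into keyword arguments named in the diff."""
    args = build_parser().parse_args(argv)
    # argparse exposes --output-dir as args.output_dir.
    return {"output_dir": args.output_dir, "language": args.language}


default_kwargs = processor_kwargs(["--pdf", "documents/research_paper.pdf"])
german_kwargs = processor_kwargs(
    ["--pdf", "documents/research_paper.pdf", "--language", "german"]
)
```

Keeping the CLI default identical to the function default means the two entry points behave the same when no language is given.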

local_notebooklm/steps/helpers.py

Lines changed: 31 additions & 4 deletions
@@ -13,24 +13,51 @@
 "lecture", "tutorial", "q-and-a",
 "news-report", "executive-brief", "meeting", "analysis"
 ]
+
+
 SingleSpeakerFormats = Literal[
 "summary", "narration", "storytelling", "explainer",
 "lecture", "tutorial", "news-report", "executive-brief", "analysis"
 ]
-TwoSpeakerFormats = Literal[
-"podcast", "interview", "panel-discussion",
-"debate", "q-and-a", "meeting"
-]
 SINGLE_SPEAKER_FORMATS = {
 "summary", "narration", "storytelling", "explainer",
 "lecture", "tutorial", "news-report", "executive-brief", "analysis"
 }

+TwoSpeakerFormats = Literal[
+"podcast", "interview", "panel-discussion",
+"debate", "q-and-a", "meeting"
+]
 TWO_SPEAKER_FORMATS = {
 "podcast", "interview", "panel-discussion",
 "debate", "q-and-a", "meeting"
 }
+
+ThreeSpeakerFormats = Literal[
+"three-people-podcast", "three-people-panel-discussion", "three-people-debate"
+]
+THREE_SPEAKER_FORMATS = {
+"three-people-podcast", "three-people-panel-discussion", "three-people-debate"
+}
+
+FourSpeakerFormats = Literal[
+"four-people-podcast", "four-people-panel-discussion", "four-people-debate"
+]
+FOUR_SPEAKER_FORMATS = {
+"four-people-podcast", "four-people-panel-discussion", "four-people-debate"
+}
+
+FiveSpeakerFormats = Literal[
+"five-people-podcast", "five-people-panel-discussion", "five-people-debate"
+]
+FIVE_SPEAKER_FORMATS = {
+"five-people-podcast", "five-people-panel-discussion", "five-people-debate"
+}
+
+
 LengthType = Literal["short", "medium", "long", "very-long"]
+
+
 StyleType = Literal["normal", "friendly", "professional", "academic", "casual", "technical", "gen-z", "funny"]

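
Reviewer sketch (not part of the commit): the sets added above partition format names by speaker count, so a lookup from format name to number of speakers could be written roughly as follows. The helper `speaker_count` and the `_FORMATS_BY_SPEAKER_COUNT` table are hypothetical names introduced for illustration; only the set contents are copied from the diff.

```python
# Set contents copied from local_notebooklm/steps/helpers.py in this commit.
SINGLE_SPEAKER_FORMATS = {
    "summary", "narration", "storytelling", "explainer",
    "lecture", "tutorial", "news-report", "executive-brief", "analysis",
}
TWO_SPEAKER_FORMATS = {
    "podcast", "interview", "panel-discussion",
    "debate", "q-and-a", "meeting",
}
THREE_SPEAKER_FORMATS = {
    "three-people-podcast", "three-people-panel-discussion", "three-people-debate",
}
FOUR_SPEAKER_FORMATS = {
    "four-people-podcast", "four-people-panel-discussion", "four-people-debate",
}
FIVE_SPEAKER_FORMATS = {
    "five-people-podcast", "five-people-panel-discussion", "five-people-debate",
}

# Hypothetical lookup table; not defined anywhere in the diff.
_FORMATS_BY_SPEAKER_COUNT = {
    1: SINGLE_SPEAKER_FORMATS,
    2: TWO_SPEAKER_FORMATS,
    3: THREE_SPEAKER_FORMATS,
    4: FOUR_SPEAKER_FORMATS,
    5: FIVE_SPEAKER_FORMATS,
}


def speaker_count(format_type: str) -> int:
    """Return how many speakers a format name implies, or raise ValueError."""
    for count, formats in _FORMATS_BY_SPEAKER_COUNT.items():
        if format_type in formats:
            return count
    raise ValueError(f"unknown format type: {format_type!r}")
```

Note that under these sets the plain names `panel-discussion` and `debate` resolve to two speakers, while the three-to-five-speaker variants use the `three-people-`, `four-people-`, and `five-people-` prefixes, which differs from the unprefixed names listed under "Multi-Speaker Formats" in the README hunk.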
