
Commit 26f10a6

Small notes on next steps (meta-llama#746)

Merge: 2 parents e159996 + 5248cb1

File tree: 5 files changed (+50, -7 lines)

5 files changed

+50
-7
lines changed

recipes/quickstart/NotebookLlama/README.md
Lines changed: 2 additions & 0 deletions

@@ -23,6 +23,8 @@ Note 1: In Step 1, we prompt the 1B model to not modify the text or summarize it
 
 Note 2: For Step 2, you can also use the `Llama-3.1-8B-Instruct` model; we recommend experimenting to see whether you notice any differences. The 70B model was used here because it gave slightly more creative podcast transcripts for the tested examples.
 
+Note 3: For Step 4, please try extending the approach with other models. These models were chosen based on a sample prompt and worked best; newer models might sound better. Please see [Notes](./TTS_Notes.md) for some of the sample tests.
+
 ### Detailed steps on running the notebook:
 
 Requirements: a GPU server or an API provider for using the 70B, 8B and 1B Llama models.

recipes/quickstart/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb
Lines changed: 10 additions & 0 deletions

@@ -2696,6 +2696,16 @@
    "print(processed_text[-1000:])"
   ]
  },
+ {
+  "cell_type": "markdown",
+  "id": "3d996ac5",
+  "metadata": {},
+  "source": [
+   "### Next Notebook: Transcript Writer\n",
+   "\n",
+   "Now that we have the pre-processed text ready, we can move on to converting it into a transcript in the next notebook."
+  ]
+ },
  {
   "cell_type": "code",
   "execution_count": null,
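The context line `print(processed_text[-1000:])` above is a quick sanity check on the tail of the cleaned PDF text. A minimal sketch of the same idiom, with a hypothetical stand-in string (the notebook builds `processed_text` from an actual PDF):

```python
# Hypothetical stand-in for the notebook's cleaned PDF text
processed_text = "CLEANED TEXT START\n" + ("Some cleaned PDF text. " * 200)

# Negative slicing previews the last 1000 characters,
# which is where truncation or cleanup bugs usually show up
tail = processed_text[-1000:]
print(tail)
print(len(tail))  # at most 1000
```

Because slicing never raises on out-of-range bounds, this works even when the text is shorter than 1000 characters.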

recipes/quickstart/NotebookLlama/Step-2-Transcript-Writer.ipynb
Lines changed: 10 additions & 0 deletions

@@ -302,6 +302,16 @@
    " pickle.dump(save_string_pkl, file)"
   ]
  },
+ {
+  "cell_type": "markdown",
+  "id": "dbae9411",
+  "metadata": {},
+  "source": [
+   "### Next Notebook: Transcript Re-writer\n",
+   "\n",
+   "We now have a working transcript, but we can try making it more dramatic and natural. In the next notebook, we will use the `Llama-3.1-8B-Instruct` model to do so."
+  ]
+ },
  {
   "cell_type": "code",
   "execution_count": null,
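The `pickle.dump(save_string_pkl, file)` call above is how this notebook hands the transcript to the next one. A minimal round-trip sketch, assuming the transcript is a plain string (the sample text and temp-file path here are hypothetical; the notebook writes to its own path):

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the transcript string the notebook builds
save_string_pkl = "SPEAKER 1: Welcome to the show!\nSPEAKER 2: Thanks for having me."

# Save the transcript the way the notebook does
path = os.path.join(tempfile.mkdtemp(), "data.pkl")
with open(path, "wb") as file:
    pickle.dump(save_string_pkl, file)

# The next notebook can load it back before re-writing it
with open(path, "rb") as file:
    transcript = pickle.load(file)

print(transcript == save_string_pkl)  # the round trip preserves the string
```

Opening the files in binary mode (`"wb"`/`"rb"`) matters: pickle writes bytes, not text.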

recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb
Lines changed: 10 additions & 0 deletions

@@ -253,6 +253,16 @@
    " pickle.dump(save_string_pkl, file)"
   ]
  },
+ {
+  "cell_type": "markdown",
+  "id": "2dccf336",
+  "metadata": {},
+  "source": [
+   "### Next Notebook: TTS Workflow\n",
+   "\n",
+   "With the transcript ready, we can generate the audio in the next notebook."
+  ]
+ },
  {
   "cell_type": "code",
   "execution_count": null,

recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb
Lines changed: 18 additions & 7 deletions

@@ -11,7 +11,9 @@
    "\n",
    "In this notebook, we will learn how to generate audio using both the `suno/bark` and `parler-tts/parler-tts-mini-v1` models first.\n",
    "\n",
-   "After that, we will use the output from Notebook 3 to generate our complete podcast"
+   "After that, we will use the output from Notebook 3 to generate our complete podcast.\n",
+   "\n",
+   "Note: Please feel free to extend this notebook with newer models. The above two were chosen after some tests using a sample prompt."
   ]
  },
  {
@@ -117,11 +119,7 @@
   "id": "50b62df5-5ea3-4913-832a-da59f7cf8de2",
   "metadata": {},
   "source": [
-   "Generally in life, you set your device to \"cuda\" and are happy. \n",
-   "\n",
-   "However, sometimes you want to compensate for things and set it to `cuda:7` to tell the system but even more-so the world that you have 8 GPUS.\n",
-   "\n",
-   "Jokes aside please set `device = \"cuda\"` below if you're using a single GPU node."
+   "Please set `device = \"cuda\"` below if you're using a single GPU node."
   ]
  },
  {
@@ -161,7 +159,7 @@
   ],
   "source": [
    "# Set up device\n",
-   "device = \"cuda:7\" if torch.cuda.is_available() else \"cpu\"\n",
+   "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
    "\n",
    "# Load model and tokenizer\n",
    "model = ParlerTTSForConditionalGeneration.from_pretrained(\"parler-tts/parler-tts-mini-v1\").to(device)\n",
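The one-line change above drops the hard-coded `cuda:7` GPU index so the notebook no longer assumes an 8-GPU node. A small pure-Python sketch of the selection logic (the helper name and structure are ours, not from the notebook, which inlines the ternary with `torch.cuda.is_available()`):

```python
def pick_device(cuda_available, gpu_index=None):
    """Mirror the notebook's device line: prefer CUDA when available,
    optionally pinning a specific GPU, otherwise fall back to CPU."""
    if not cuda_available:
        return "cpu"
    if gpu_index is None:
        return "cuda"           # let PyTorch use the current/default GPU
    return f"cuda:{gpu_index}"  # pin a specific GPU, e.g. "cuda:7"

# Equivalent of: device = "cuda" if torch.cuda.is_available() else "cpu"
device = pick_device(cuda_available=False)
print(device)  # "cpu" on machines without CUDA
```

Passing the resulting string to `.to(device)` works because PyTorch accepts device strings like `"cpu"`, `"cuda"`, and `"cuda:N"`.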
@@ -639,6 +637,19 @@
    " parameters=[\"-q:a\", \"0\"])"
   ]
  },
+ {
+  "cell_type": "markdown",
+  "id": "c7ce5836",
+  "metadata": {},
+  "source": [
+   "### Suggested Next Steps:\n",
+   "\n",
+   "- Experiment with the prompts: please feel free to experiment with the SYSTEM_PROMPT in the notebooks\n",
+   "- Extend the workflow beyond two speakers\n",
+   "- Test other TTS models\n",
+   "- Experiment with speech-enhancer models as a Step 5."
+  ]
+ },
  {
   "cell_type": "code",
   "execution_count": null,
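The first suggested next step in the cell added above is experimenting with the SYSTEM_PROMPT. A hedged sketch of one way to organize that experiment: keep prompt variants in a dict and build the chat-style message list Llama instruct models consume. Only the SYSTEM_PROMPT name comes from the notebooks; the variant texts and the helper are illustrative.

```python
# Illustrative prompt variants; the notebooks define their own SYSTEM_PROMPT
SYSTEM_PROMPT_VARIANTS = {
    "default": "You are a world-class podcast writer.",
    "dramatic": "You are a podcast writer who favors vivid, dramatic dialogue.",
}

def build_messages(system_prompt, user_text):
    """Assemble a chat-style message list as used by Llama instruct models."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

# Swap variants here and compare the resulting transcripts
messages = build_messages(SYSTEM_PROMPT_VARIANTS["dramatic"],
                          "Rewrite this transcript as a two-speaker podcast.")
print(messages[0]["role"])  # system
```

Keeping the variants in one place makes A/B comparisons of generated transcripts a one-line change.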
