Commit 572f44d

committed: Address comments, add the file
1 parent 9057ba4 commit 572f44d

File tree: 2 files changed, +12 -3 lines

recipes/quickstart/NotebookLlama/README.md

Lines changed: 12 additions & 3 deletions
@@ -12,17 +12,26 @@ It assumes zero knowledge of LLMs, prompting and audio models, everything is cov
 
 Here is a step-by-step thought (pun intended) for the task:
 
-- Step 1: Pre-process PDF: Use `Llama-3.2-1B` to pre-process the PDF and save it in a `.txt` file.
-- Step 2: Transcript Writer: Use `Llama-3.1-70B` model to write a podcast transcript from the text
-- Step 3: Dramatic Re-Writer: Use `Llama-3.1-8B` model to make the transcript more dramatic
+- Step 1: Pre-process PDF: Use `Llama-3.2-1B-Instruct` to pre-process the PDF and save it in a `.txt` file.
+- Step 2: Transcript Writer: Use `Llama-3.1-70B-Instruct` model to write a podcast transcript from the text
+- Step 3: Dramatic Re-Writer: Use `Llama-3.1-8B-Instruct` model to make the transcript more dramatic
 - Step 4: Text-To-Speech Workflow: Use `parler-tts/parler-tts-mini-v1` and `bark/suno` to generate a conversational podcast
 
+Note 1: In Step 1, we prompt the 1B model not to modify or summarize the text, but strictly to clean up extra characters or garbage characters that might get picked up due to the encoding of the PDF. Please see the prompt in Notebook 1 for more details.
+
+Note 2: For Step 2, you can also use the `Llama-3.1-8B-Instruct` model; we recommend experimenting to see if you notice any differences. The 70B model was used here because it gave slightly more creative podcast transcripts for the tested examples.
+
 ### Detailed steps on running the notebook:
 
 Requirements: GPU server or an API provider for using the 70B, 8B and 1B Llama models.
+For running the 70B model, you will need a GPU with around 140GB of aggregate memory to run inference in bfloat16 precision.
 
 Note: For our GPU Poor friends, you can also use the 8B and lower models for the entire pipeline. There is no strong recommendation; the pipeline below is what worked best in the first few tests. You should try and see what works best for you!
 
+- Before getting started, please make sure to log in using `huggingface-cli` and then launch your Jupyter notebook server, to make sure you are able to download the Llama models.
+
+You'll need your Hugging Face access token, which you can get at your Settings page [here](https://huggingface.co/settings/tokens). Then run `huggingface-cli login` and copy and paste your access token to complete the login, so that the scripts can download Hugging Face models if needed.
+
 - First, please install the requirements from [here]() by running inside the folder:
 
 ```
Binary file not shown.
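
Steps 1 through 3 in the diff above are each a single chat completion against a different instruct model. Below is a rough sketch of that flow, not the notebooks' own code: it assumes the Hugging Face `transformers` chat pipeline, and the prompts and the `extracted.txt` path are illustrative placeholders.

```python
# Sketch of Steps 1-3: three chat completions with the model IDs from the
# README. Prompts and the file path are placeholders, not the notebooks' own.
import torch
from transformers import pipeline

def chat(model_id: str, system_prompt: str, user_text: str) -> str:
    """Run one chat turn against an instruct model and return the reply text."""
    generator = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]
    output = generator(messages, max_new_tokens=2048)
    # With chat-format input, the pipeline returns the whole conversation;
    # the last message is the model's reply.
    return output[0]["generated_text"][-1]["content"]

raw_text = open("extracted.txt").read()  # placeholder path for the PDF text

# Step 1: cleanup only; the real prompt forbids rewriting or summarizing.
cleaned = chat("meta-llama/Llama-3.2-1B-Instruct",
               "Clean up stray or garbage characters. Do not modify or summarize the text.",
               raw_text)

# Step 2: draft a podcast transcript from the cleaned text.
transcript = chat("meta-llama/Llama-3.1-70B-Instruct",
                  "Write a two-speaker podcast transcript from the provided text.",
                  cleaned)

# Step 3: make the transcript more dramatic with the 8B model.
dramatic = chat("meta-llama/Llama-3.1-8B-Instruct",
                "Rewrite this podcast transcript to be more dramatic.",
                transcript)
```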

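Step 4 then hands the finished transcript to the TTS models. Here is a minimal sketch of the `parler-tts/parler-tts-mini-v1` half, following that model's published usage pattern; the speaker description, sample line, and output filename are placeholders, and the `bark/suno` pass has its own API that is not shown.

```python
# Sketch of the parler-tts half of Step 4, per the parler-tts usage docs.
# The description string, sample line, and output filename are placeholders.
import torch
import soundfile as sf
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = ParlerTTSForConditionalGeneration.from_pretrained(
    "parler-tts/parler-tts-mini-v1").to(device)
tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")

line = "Welcome to the show! Today we are walking through a paper."
description = "A friendly female speaker with a clear, animated delivery."

# Parler conditions on a text description of the voice plus the line to speak.
input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_ids = tokenizer(line, return_tensors="pt").input_ids.to(device)

audio = model.generate(input_ids=input_ids, prompt_input_ids=prompt_ids)
sf.write("speaker1.wav", audio.cpu().numpy().squeeze(), model.config.sampling_rate)
```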
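The 140GB figure added to the requirements is just the parameter count times the bfloat16 width: 70B parameters at 2 bytes each is about 140GB for the weights alone, before KV cache and activations. A one-line check:

```python
# Weights-only footprint of a 70B model in bfloat16 (2 bytes per parameter).
print(f"{70e9 * 2 / 1e9:.0f} GB")  # -> 140 GB
```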
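The Hugging Face login described in the diff can also be done from inside the notebook instead of the shell; `huggingface_hub`, which `transformers` already depends on for downloads, exposes the same token flow programmatically:

```python
# Programmatic equivalent of `huggingface-cli login`: prompts for the access
# token from https://huggingface.co/settings/tokens and caches it locally so
# the gated Llama checkpoints can be downloaded.
from huggingface_hub import login

login()
```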