
Commit 2ed980f: add more details and screenshots
Parent: 4bb53ff
4 files changed: +76 -16 lines
docs/guides/python/ai-podcast-part-1.mdx (76 additions, 16 deletions)
@@ -30,7 +30,7 @@ In this first part we'll be using the [suno/bark](https://huggingface.co/suno/bark)
 
 ## Prerequisites
 
-- [uv](https://docs.astral.sh/uv/#getting-started) - for simplified dependency management
+- [uv](https://docs.astral.sh/uv/#getting-started) - for Python dependency management
 - The [Nitric CLI](/get-started/installation)
 - _(optional)_ An [AWS](https://aws.amazon.com) account
 
@@ -64,6 +64,11 @@ As you may know, Nitric helps with both cloud resource creation and interaction.
 
 To achieve this let's create a new python module which defines the resources for this project. We'll create this as `common/resources.py` in our project.
 
+```bash
+mkdir common
+touch common/resources.py
+```
+
 ```python title: common/resources.py
 from nitric.resources import api, bucket, job
 # Our main API for submitting audio generation jobs
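The hunk above cuts off after the first lines of `common/resources.py`. Based on the names imported later in this same commit (`main_api`, `gen_audio_job`, `download_audio_model_job`, and the `clips` bucket mentioned in the testing section), the module plausibly continues along these lines. This is a minimal sketch: the exact resource names and arguments are assumptions, not the file's confirmed contents.

```python
from nitric.resources import api, bucket, job

# Our main API for submitting audio generation jobs
main_api = api("main")

# Batch jobs for generating audio and downloading the model
# (names assumed from the imports shown in services/api.py below)
gen_audio_job = job("gen-audio")
download_audio_model_job = job("download-audio-model")

# Bucket for the generated audio clips (the guide later reads results
# from a "clips" bucket in the local dashboard)
clips_bucket = bucket("clips")
```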
@@ -170,7 +175,7 @@ Nitric.run()
 
 Ok, now that we have our job defined we need a way to trigger it. We'll create an API that lets us submit text to be converted to audio, using the job we just defined.
 
-In the existing API endpoint in `services/api.py` overwrite with the following.
+In the existing `services/api.py` file, overwrite the contents with the following.
 
 ```python title:services/api.py
 from common.resources import main_api, gen_audio_job
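The diff truncates after the first line of `services/api.py`. As a rough sketch of the pattern the guide describes (an endpoint that forwards text to the batch job), something like the following would fit; the route shape matches the `curl` call later in the guide, but the handler body and the `allow("submit")` permission call are assumptions.

```python
from common.resources import main_api, gen_audio_job
from nitric.application import Nitric
from nitric.context import HttpContext

# Request permission to submit work to the job (assumed permission name)
submittable_gen_audio = gen_audio_job.allow("submit")

@main_api.post("/audio/:name")
async def submit_audio(ctx: HttpContext):
    name = ctx.req.params["name"]
    # The raw request body is assumed to be the text to convert to speech
    text = ctx.req.data.decode("utf-8")
    # Hand the work off to the batch job defined in batches/podcast.py
    await submittable_gen_audio.submit({"name": name, "text": text})
    ctx.res.body = "Submitted"

Nitric.run()
```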
@@ -235,15 +240,25 @@ Now that we have the basic structure of our project set up, we can test it locally
 nitric start
 ```
 
-Once its up and running we can test out our API by running:
+Once it's up and running we can test our API with any HTTP client:
 
 ```bash
 curl -X POST http://localhost:4001/audio/test -d "Okay this is cool, but let's wait and see what comes next"
 ```
 
-Or you can use the [nitric dashboard](http://localhost:49152/) to submit the same text.
+<Note>
+  If port 4001 is already in use on your machine the port will be different,
+  e.g. 4002. You can find the port in the terminal output when you start the
+  project.
+</Note>
+
+Alternatively, you can use the [nitric dashboard](http://localhost:49152/) to submit the same text.
 
-Or you can use your favorite API client to test it out.
+<img
+  src="/docs/images/guides/ai-podcast/part-1/dashboard.png"
+  style={{ maxWidth: 800, width: '100%' }}
+  alt="screen shot of the local development dashboard"
+/>
 
 <Note>
   If you're running without a GPU it can take some time for the audio content to
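If you'd rather script the test than use `curl` or the dashboard, the same request can be made from Python. A convenience sketch, assuming the default local port 4001 noted above:

```python
import requests

# POST the text to be converted to audio (adjust the port if `nitric start`
# reported a different one, e.g. 4002)
resp = requests.post(
    "http://localhost:4001/audio/test",
    data="Okay this is cool, but let's wait and see what comes next",
    timeout=30,
)
print(resp.status_code, resp.text)
```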
@@ -252,6 +267,12 @@ Or you can use your favorite API client to test it out.
 
 Watch the logs in the terminal where you started the project to see the progress of the audio generation. When it's complete you can access it from the `clips` bucket using the local Nitric Dashboard, e.g. http://localhost:49152/storage/.
 
+<img
+  src="/docs/images/guides/ai-podcast/part-1/dashboard-storage.png"
+  style={{ maxWidth: 800, width: '100%' }}
+  alt="screen shot of the local development dashboard"
+/>
+
 Once the generation is complete you should have something like this:
 
 <div class="mx-auto max-w-lg rounded-sm bg-black p-4 text-white shadow dark:bg-white dark:text-black">
@@ -302,7 +323,7 @@ model_dir = "./.model"
 async def do_download_audio_model(ctx: JobContext):
     model_id = ctx.req.data["model_id"]
 
-    print("Downloading models - this may take several minutes")
+    print("Downloading models - this may take several minutes without much feedback, please be patient")
     processor = AutoProcessor.from_pretrained(model_id)
     model = BarkModel.from_pretrained(model_id)
 
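For context on what the download job does after this hunk: `from_pretrained` fetches the model into the Hugging Face cache, and the usual next step is to persist it to `model_dir` with `save_pretrained` so later runs can load it from disk. A minimal standalone sketch (any upload-to-bucket step the guide performs is omitted here as it isn't shown in the diff):

```python
from transformers import AutoProcessor, BarkModel

model_id = "suno/bark"
model_dir = "./.model"

# Download (or load from the Hugging Face cache) the processor and model
processor = AutoProcessor.from_pretrained(model_id)
model = BarkModel.from_pretrained(model_id)

# Persist both to the local model directory for reuse by later runs
processor.save_pretrained(model_dir)
model.save_pretrained(model_dir)
```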
@@ -403,6 +424,8 @@ async def do_generate_audio(ctx: JobContext):
     requests.put(upload_url, data=buffer.getvalue(), headers={"Content-Type": "audio/wav"}, timeout=600)
 
     print("Done!")
+
+Nitric.run()
 ```
 
 <Note>
@@ -414,7 +437,7 @@ async def do_generate_audio(ctx: JobContext):
 
 Then we can add an API endpoint to trigger the download job and update the API endpoint to allow selection of models and voice presets.
 
-```python
+```python title: services/api.py
 from common.resources import main_api, gen_audio_job, download_audio_model_job
 from nitric.application import Nitric
 from nitric.context import HttpContext
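Again the diff truncates the updated `services/api.py`. A plausible sketch of the new endpoint that triggers the download job follows; the route path and handler body are assumptions, though the `model_id` key matches what `do_download_audio_model` reads from `ctx.req.data` in the hunk above.

```python
from common.resources import main_api, download_audio_model_job
from nitric.application import Nitric
from nitric.context import HttpContext

# Assumed permission request, mirroring the submit pattern used earlier
downloadable_model_job = download_audio_model_job.allow("submit")

# Hypothetical route for kicking off the model download
@main_api.post("/download-model")
async def download_model(ctx: HttpContext):
    # The job reads "model_id" from its request data (see the hunk above)
    await downloadable_model_job.submit({"model_id": "suno/bark"})
    ctx.res.body = "Model download started"

Nitric.run()
```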
@@ -486,6 +509,10 @@ You should get a similar result to before. The main difference is that the mode
 
 So that the AI workload can use GPUs in the cloud we'll need to make sure it ships with drivers and libraries to support that. We can do this by specifying a custom Dockerfile for our batch service under `torch.dockerfile`.
 
+```bash
+touch torch.dockerfile
+```
+
 ```dockerfile title: torch.dockerfile
 # The python version must match the version in .python-version
 FROM ghcr.io/astral-sh/uv:python3.11-bookworm-slim AS builder
536563

537564
We'll also add a dockerignore file to try and keep the image size down.
538565

566+
```bash
567+
touch torch.dockerfile.dockerignore
568+
```
569+
539570
```gitignore title: torch.dockerfile.dockerignore
540571
.mypy_cache/
541572
.nitric/
@@ -621,27 +652,50 @@ batch-compute-env:
 ```
 
 <Note>
-  You will need to make sure your machine is configured to deploy to AWS. See
-  the [Nitric Pulumi AWS Provider documentation](/providers/aws) for more
-  information.
+  You will need to set up your machine to deploy to AWS. See the [Nitric Pulumi
+  AWS Provider documentation](/providers/aws) for more information.
 </Note>
 
+### Requesting a G instance quota increase
+
+Most AWS accounts **will not** have access to on-demand GPU instances (G
+instances). If you'd like to run models using a GPU you'll need to request a quota increase for G instances.
+
+If you prefer not to use a GPU you can set `gpus=0` in the `@gen_audio_job` decorator in `batches/podcast.py`.
+
 <Note>
-  Most AWS accounts **will not** have access to on demand GPU instances so
-  you'll probably need to update your AWS service quotas to allow GPU instances.
-  The model will also work on CPU so if you can't get access to GPUs you can
-  always increase the CPU count and memory to compensate.
+  **Important:** If the `gpus` value in `batches/podcast.py` exceeds the number
+  of available GPUs in your AWS account, the job will never start. If you want
+  to run without a GPU, make sure to set `gpus=0` in the `@gen_audio_job`
+  decorator. This is a quirk of how AWS Batch works.
 </Note>
 
-Once that's configured we can deploy our project to the cloud using:
+To request a quota increase for G instances in AWS, follow these steps:
+
+1. Go to the [AWS Service Quotas for EC2](https://console.aws.amazon.com/servicequotas/home/services/ec2/quotas) page.
+2. Search for **All G and VT Spot Instance Requests**
+3. Click **Request quota increase**
+4. Choose an appropriate value, e.g. 4, 8 or 16, depending on your needs
+
+<img
+  src="/docs/images/guides/ai-podcast/part-1/g-instance-quota-increase.png"
+  style={{ maxWidth: 500, width: '100%', border: '1px solid #e5e7eb' }}
+  alt="screen shot of requesting a G instance quota increase on AWS"
+/>
+
+Once you've requested the quota increase it may take time for AWS to approve it.
+
+### Deploy the project
+
+Once the above is complete, we can deploy the project to the cloud using:
 
 ```bash
 nitric up
 ```
 
 <Note>
   The initial deployment may take time due to the size of the python/Nvidia
-  driver and CUDA runtime dependencies. Be patient.
+  driver and CUDA runtime dependencies.
 </Note>
 
 Once the project is deployed you can try out some generation. Just like before, depending on the hardware you were running on locally, you may notice a speed up in generation time.
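As an alternative to the console steps in the hunk above, the same quota increase can be requested programmatically with boto3. This is a sketch under stated assumptions: the quota code below is a guess at "All G and VT Spot Instance Requests" and should be verified with `list_service_quotas` before use, and the region is arbitrary.

```python
import boto3

client = boto3.client("service-quotas", region_name="us-east-1")

# Request more G/VT instances; the QuotaCode here is an assumption -
# confirm it first via client.list_service_quotas(ServiceCode="ec2")
response = client.request_service_quota_increase(
    ServiceCode="ec2",
    QuotaCode="L-3819A6DF",  # assumed: "All G and VT Spot Instance Requests"
    DesiredValue=8.0,
)
print(response["RequestedQuota"]["Status"])
```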
@@ -653,4 +709,8 @@ Running on g5.xlarge from testing this project will cost ~$0.05/minute of audio
 
 </Note>
 
+You can see the status of your batch jobs in the [AWS Batch console](https://console.aws.amazon.com/batch/home) and the model and audio files in the [AWS S3 console](https://s3.console.aws.amazon.com/s3/home).
+
+## Next steps
+
 In part two of this guide we'll look at adding an LLM agent to our project to automatically generate scripts for our podcasts from small prompts.
The remaining three changed files are binary images (423 KB, 437 KB and 74 KB): the dashboard, bucket storage and G instance quota screenshots referenced in the guide above (contents not shown).
