
MusicGen Agent Composer

  • Generates evolving MusicGen prompts with an LLM agent, then renders audio segments and stitches them with crossfades.
  • Backed by FastAPI for generation/rendering, with a Streamlit client and an optional React + Vite UI under frontend/MusicGen.

Architecture

  • Agent: agent.py uses agno with Groq Llama to produce 5s segment prompts from a description.
  • Renderer: main.py loads Hugging Face MusicGen (transformers) and renders WAV from prompts.
  • API: api.py exposes endpoints to get instructions and render audio; caches the model.
  • Clients: app.py (Streamlit) talks to the API. React app lives in frontend/MusicGen.
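The renderer's crossfade stitching can be sketched roughly as follows. This is a minimal illustration of the technique, not the actual code in main.py; the function name, the default sample rate, and the `fade_ms` semantics (overlap length in milliseconds) are assumptions.

```python
import numpy as np

def crossfade_stitch(segments, sample_rate=32000, fade_ms=250):
    """Stitch 1-D float audio arrays with a linear crossfade.

    Illustrative sketch: the real renderer lives in main.py and may
    differ in names, defaults, and fade shape.
    """
    fade = int(sample_rate * fade_ms / 1000)
    ramp = np.linspace(0.0, 1.0, fade, dtype=np.float32)
    out = segments[0].astype(np.float32)
    for seg in segments[1:]:
        seg = seg.astype(np.float32)
        # Blend the tail of the running output with the head of the
        # next segment, then append the remainder of the segment.
        out[-fade:] = out[-fade:] * (1.0 - ramp) + seg[:fade] * ramp
        out = np.concatenate([out, seg[fade:]])
    return out
```

With a linear ramp, each stitched boundary overlaps by `fade` samples, so N segments of length L yield roughly `N*L - (N-1)*fade` samples total.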

Prerequisites

  • Python 3.10+
  • Internet access on first run to download the MusicGen model
  • Groq API key for the agent

Quick Start

  • Create and activate a virtual environment, then install Python deps:
    • Windows (PowerShell): python -m venv .venv; .\.venv\Scripts\Activate.ps1; pip install -r requirements.txt
    • macOS/Linux: python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
  • Set your Groq key in .env (an example .env file is included in the repo):
    • GROQ_API_KEY="<your_key>"
  • Start the API server: uvicorn api:app --reload --port 8080
  • Open the Streamlit client in another terminal: streamlit run app.py

Env Vars

  • GROQ_API_KEY (required for agent.py). The API and Streamlit load it via python-dotenv.
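Since a missing key only surfaces once the agent makes its first Groq call, it can help to fail fast at startup. A minimal sketch (the helper name is illustrative; the project itself loads .env via python-dotenv):

```python
import os

def require_groq_key():
    """Return GROQ_API_KEY or raise a clear error if it is unset.

    Illustrative helper -- not part of the project's actual code.
    """
    key = os.environ.get("GROQ_API_KEY")
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; add it to .env or export it"
        )
    return key
```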

API Endpoints

  • GET /health — Reports transformers/torch versions and CUDA availability.
  • POST /v1/instructions — Body: { "description": str } → plan + list of prompts.
  • POST /v1/render — Body: { "prompts": [str], "guidance_scale": float, "max_new_tokens": int, "fade_ms": int, "output_dir": str, "return_audio_b64": bool } → stitched audio; optional base64.
  • POST /v1/generate-and-render — One shot: description → final audio.
  • POST /v1/render-segment — Render a single prompt; saves segment_XX.wav.
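The endpoints above can be driven from any HTTP client. Below is a hedged stdlib-only sketch; the default values in `build_render_payload` and the `plan["prompts"]` response field are assumptions, not the API's documented defaults, so check api.py before relying on them.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:8080"  # default shown in the UI

def build_render_payload(prompts, guidance_scale=3.0, max_new_tokens=256,
                         fade_ms=250, output_dir="segments",
                         return_audio_b64=True):
    """Assemble a /v1/render body. Defaults here are illustrative."""
    return {
        "prompts": prompts,
        "guidance_scale": guidance_scale,
        "max_new_tokens": max_new_tokens,
        "fade_ms": fade_ms,
        "output_dir": output_dir,
        "return_audio_b64": return_audio_b64,
    }

def post(path, body):
    """POST a JSON body to the API and return the decoded JSON response."""
    req = urllib.request.Request(
        API_BASE + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def demo():
    """Example round trip (requires the API server to be running)."""
    plan = post("/v1/instructions",
                {"description": "calm lo-fi with vinyl crackle"})
    # Assumes the instructions response carries a "prompts" list.
    return post("/v1/render", build_render_payload(plan["prompts"]))
```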

Common Workflows

  • Generate only (agent): use POST /v1/instructions from the UI (app.py) or cURL.
  • Render timeline: send edited prompts to POST /v1/render to get a final WAV with crossfades.
  • Per‑segment tweak: call POST /v1/render-segment to overwrite segments/segment_XX.wav.
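When tweaking per segment, the output file name must line up with the segment index so the stitched timeline picks up the replacement. A small helper, assuming the `segment_XX.wav` pattern means a zero-padded two-digit index:

```python
def segment_filename(index):
    """Name a segment file to match the segment_XX.wav pattern.

    Zero-padded two-digit indexing is an assumption inferred from
    the XX placeholder; confirm against api.py.
    """
    return f"segment_{index:02d}.wav"
```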

React Frontend (Optional)

  • Location: frontend/MusicGen
  • Node 18+ recommended.
  • Install and run:
    • cd frontend/MusicGen
    • npm install
    • npm run dev
  • The UI has an input for API Base (default http://127.0.0.1:8080). Ensure the FastAPI server is running.

Files

  • agent.py — LLM agent to expand descriptions into segment prompts.
  • main.py — MusicGen load/generate utilities (transformers/torch).
  • api.py — FastAPI service exposing generation/render routes.
  • app.py — Streamlit client for the API.
  • requirements.txt — Python dependencies.
  • frontend/MusicGen — React + Vite frontend (optional).

Troubleshooting

  • Torch or transformers install issues:
    • Try CPU‑only first: pip install torch --index-url https://download.pytorch.org/whl/cpu
    • Ensure transformers and accelerate are installed; update with pip install -U transformers accelerate.
  • Model download errors:
    • Verify internet access; rerun the API so transformers can fetch facebook/musicgen-small.
  • Agent errors about Groq:
    • Confirm .env is loaded and GROQ_API_KEY is valid.

Notes

  • First runs will download model weights; this may take a while.
  • The DuckDuckGo tool import in agent.py is optional and safely skipped if unavailable.