MusicGen Agent Composer
- Generates evolving MusicGen prompts with an LLM agent, then renders audio segments and stitches them with crossfades.
- Backed by FastAPI for generation/rendering, with a Streamlit client and an optional React + Vite UI under `frontend/MusicGen`.
Architecture
- Agent: `agent.py` uses `agno` with Groq Llama to produce 5-second segment prompts from a description.
- Renderer: `main.py` loads Hugging Face MusicGen (`transformers`) and renders WAV audio from prompts.
- API: `api.py` exposes endpoints to get instructions and render audio; it caches the model.
- Clients: `app.py` (Streamlit) talks to the API. The React app lives in `frontend/MusicGen`.
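The crossfade stitching the renderer performs can be pictured with a minimal sketch. This is illustrative only, not the project's actual stitching code: it uses a simple linear crossfade over plain Python lists of float samples (MusicGen's default sample rate is 32 kHz).

```python
def crossfade_stitch(segments, sample_rate=32000, fade_ms=250):
    """Stitch audio segments (lists of float samples) with linear crossfades.

    Illustrative sketch of the approach described above; the real renderer
    works on model output arrays rather than Python lists.
    """
    fade = int(sample_rate * fade_ms / 1000)
    out = list(segments[0])
    for seg in segments[1:]:
        n = min(fade, len(out), len(seg))
        # Linearly fade the tail of `out` into the head of `seg`.
        for i in range(n):
            w = (i + 1) / n
            out[len(out) - n + i] = out[len(out) - n + i] * (1 - w) + seg[i] * w
        out.extend(seg[n:])
    return out
```

Each crossfade consumes `fade_ms` worth of samples, so the stitched result is slightly shorter than the sum of the segments.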
Prerequisites
- Python 3.10+
- Internet access on first run to download the MusicGen model
- Groq API key for the agent
Quick Start
- Create and activate a virtual environment, then install Python deps:
  - Windows (PowerShell): `python -m venv .venv; .\.venv\Scripts\Activate.ps1; pip install -r requirements.txt`
  - macOS/Linux: `python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt`
- Set your Groq key in `.env` (already present as an example): `GROQ_API_KEY="<your_key>"`
- Start the API server: `uvicorn api:app --reload --port 8080`
- Open the Streamlit client in another terminal: `streamlit run app.py`
Env Vars
- `GROQ_API_KEY` (required for `agent.py`). The API and Streamlit client load it via `python-dotenv`.
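A defensive loading pattern looks roughly like this; the helper name is illustrative and `python-dotenv` is treated as optional so the check also works with a plain exported variable:

```python
import os

# python-dotenv is optional here; a plain exported variable works too.
try:
    from dotenv import load_dotenv  # pip install python-dotenv
    load_dotenv()  # reads .env from the current directory
except ImportError:
    pass

def require_groq_key() -> str:
    """Return GROQ_API_KEY, failing early with a clear message if unset."""
    key = os.getenv("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or export it")
    return key
```

Failing early like this surfaces a missing key at startup instead of mid-generation.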
API Endpoints
- `GET /health` — versions of `transformers`/`torch` and CUDA status.
- `POST /v1/instructions` — body: `{ "description": str }` → plan + list of prompts.
- `POST /v1/render` — body: `{ "prompts": [str], "guidance_scale": float, "max_new_tokens": int, "fade_ms": int, "output_dir": str, "return_audio_b64": bool }` → stitched audio; optional base64.
- `POST /v1/generate-and-render` — one shot: description → final audio.
- `POST /v1/render-segment` — render a single prompt; saves `segment_XX.wav`.
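A minimal stdlib client for the instructions route might look like the sketch below. The request body matches the list above; the `prompts` field name in the response is an assumption — check the API's actual response model.

```python
import json
from urllib import request  # stdlib; the `requests` library works equally well

API_BASE = "http://127.0.0.1:8080"  # default port from the Quick Start

def build_instructions_body(description: str) -> bytes:
    """Serialize the POST /v1/instructions request body."""
    return json.dumps({"description": description}).encode("utf-8")

def get_prompts(description: str) -> list:
    """Call POST /v1/instructions and return the prompt list.

    The `prompts` response field name is an assumption for illustration.
    """
    req = request.Request(
        f"{API_BASE}/v1/instructions",
        data=build_instructions_body(description),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp).get("prompts", [])
```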
Common Workflows
- Generate only (agent): use `POST /v1/instructions` from the UI (`app.py`) or cURL.
- Render timeline: send edited prompts to `POST /v1/render` to get a final WAV with crossfades.
- Per-segment tweak: call `POST /v1/render-segment` to overwrite `segments/segment_XX.wav`.
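The render-timeline step above can be sketched as two helpers: one assembling the `POST /v1/render` body, one decoding the base64 WAV from the response. The default values and the decode helper are illustrative, not the API's own defaults.

```python
import base64

def build_render_body(prompts, guidance_scale=3.0, max_new_tokens=256,
                      fade_ms=250, output_dir="output", return_audio_b64=True):
    """Assemble the POST /v1/render body listed under API Endpoints.

    The default values here are placeholders for illustration.
    """
    return {
        "prompts": list(prompts),
        "guidance_scale": guidance_scale,
        "max_new_tokens": max_new_tokens,
        "fade_ms": fade_ms,
        "output_dir": output_dir,
        "return_audio_b64": return_audio_b64,
    }

def save_b64_wav(audio_b64: str, path: str = "final.wav") -> int:
    """Decode a base64-encoded WAV payload and write it to disk.

    Returns the number of bytes written.
    """
    wav = base64.b64decode(audio_b64)
    with open(path, "wb") as f:
        return f.write(wav)
```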
React Frontend (Optional)
- Location: `frontend/MusicGen`
- Node 18+ recommended.
- Install and run: `cd frontend/MusicGen && npm install && npm run dev`
- The UI has an input for the API base URL (default `http://127.0.0.1:8080`). Ensure the FastAPI server is running.
Files
- `agent.py` — LLM agent to expand descriptions into segment prompts.
- `main.py` — MusicGen load/generate utilities (`transformers`/`torch`).
- `api.py` — FastAPI service exposing generation/render routes.
- `app.py` — Streamlit client for the API.
- `requirements.txt` — Python dependencies.
- `frontend/MusicGen` — React + Vite frontend (optional).
Troubleshooting
- Torch or transformers install issues:
  - Try CPU-only first: `pip install torch --index-url https://download.pytorch.org/whl/cpu`
  - Ensure `transformers` and `accelerate` are installed; update with `pip install -U transformers accelerate`.
- Model download errors:
  - Verify internet access; rerun the API so `transformers` can fetch `facebook/musicgen-small`.
- Agent errors about Groq:
  - Confirm `.env` is loaded and `GROQ_API_KEY` is valid.
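When diagnosing install issues locally, a quick report of what is importable — roughly the information `GET /health` exposes — can be produced without assuming anything is installed:

```python
import importlib.util

def env_report() -> dict:
    """Report whether torch/transformers/accelerate are importable and,
    if torch is present, whether CUDA is available."""
    report = {}
    for name in ("torch", "transformers", "accelerate"):
        report[name] = importlib.util.find_spec(name) is not None
    if report["torch"]:
        import torch
        report["cuda"] = torch.cuda.is_available()
        report["torch_version"] = torch.__version__
    return report
```

Running this in the same virtual environment as the API quickly shows whether a failure is a missing package or a CUDA problem.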
Notes
- First runs will download model weights; this may take a while.
- The DuckDuckGo tool import in `agent.py` is optional and is safely skipped if unavailable.
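The optional-import behavior mentioned above follows the standard try/except pattern; the exact import path used by `agent.py` may differ from the one shown here.

```python
# Optional tool import: fall back to no tools if the package is missing.
try:
    from agno.tools.duckduckgo import DuckDuckGoTools  # path may differ
    tools = [DuckDuckGoTools()]
except ImportError:
    tools = []  # the agent still works, just without web search
```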