Explore song lyrics as interactive 3D vector spaces using OpenAI embeddings and TensorBoard visualization.
Transform your favorite songs into navigable semantic landscapes where similar lyrical themes naturally cluster together. Discover hidden patterns in musical storytelling through the power of AI embeddings.
- 🎭 Multi-Artist Support: Analyze lyrics from different artists and genres
- 🌌 3D Vector Visualization: Interactive exploration of semantic relationships
- 🎯 Smart Clustering: Similar themes automatically group together
- 📊 Cross-Song Analysis: Compare lyrical patterns across different songs
- 🔍 Search & Filter: Find specific lyrics in the vector space
- 🎨 Artist Comparison: Visualize how different artists approach similar themes
- Python 3.12+
- OpenAI API key
- uv (recommended) or pip
git clone https://github.com/Siddhant-K-code/song-vector-explorer.git
cd song-vector-explorer
uv sync
- Set your OpenAI API key:
export OPENAI_API_KEY="your-api-key-here"
- Add lyrics: Place your .txt files in the appropriate folders:
assets/lyrics/
├── the-weeknd/
│   ├── die-for-you.txt
│   ├── after-hours.txt
│   └── blinding-lights.txt
├── drake/
│   └── gods-plan.txt
└── shakira/
    └── hips-dont-lie.txt
- Generate embeddings:
uv run python src/embedding.py
- Create visualization:
uv run python src/regist.py
- Launch TensorBoard:
uv run tensorboard --logdir=logs
# or use the shortcut
uv run task serve
- Open your browser and navigate to http://localhost:6006
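Before spending API calls, it can help to confirm that the key is set and that the lyrics folder has the layout from the "Add lyrics" step. The following is a minimal sketch and not part of the repository: the helper names `find_lyric_files` and `preflight` are hypothetical, and it assumes one folder per artist with one .txt file per song.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_lyric_files(root: Path) -> dict[str, list[str]]:
    """Map each artist folder under `root` to its sorted song names (file stems)."""
    songs: dict[str, list[str]] = {}
    if not root.is_dir():
        return songs
    for artist_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        stems = sorted(f.stem for f in artist_dir.glob("*.txt"))
        if stems:  # artist folders without any .txt file are ignored
            songs[artist_dir.name] = stems
    return songs


def preflight(root: Path, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return human-readable problems; an empty list means the setup looks ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if not find_lyric_files(root):
        problems.append(f"no .txt lyrics found under {root}")
    return problems
```

Calling `preflight(Path("assets/lyrics"))` from the repository root returns an empty list when both the key and at least one lyrics file are present.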
- Lyrics Processing: Each line of lyrics is processed individually
- AI Embeddings: OpenAI's text-embedding-3-large model converts lyrics to high-dimensional vectors
- Vector Storage: Embeddings are saved as NumPy arrays for efficient processing
- Visualization: TensorBoard's Embedding Projector reduces the high-dimensional vectors to an interactive 3D view
- Interactive Exploration: Navigate, search, and analyze the semantic landscape
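The first three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of src/embedding.py: the function names, the one-request-per-song batching, and the .npy output path in the usage comment are all hypothetical. The OpenAI and NumPy imports are done lazily so the file-reading helpers can be tried without either package installed.

```python
from pathlib import Path
from typing import Callable, Sequence

MODEL = "text-embedding-3-large"  # the model named in this README


def read_lyric_lines(path: Path) -> list[str]:
    """Return the stripped, non-empty lines of one lyrics file."""
    text = path.read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def openai_embed(lines: Sequence[str]) -> list[list[float]]:
    """Embed lines in one request (needs OPENAI_API_KEY and network access)."""
    from openai import OpenAI  # lazy import: only needed for real API calls

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.embeddings.create(model=MODEL, input=list(lines))
    # Sort by index so vectors line up with the input order.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]


def embed_song(
    path: Path,
    embed: Callable[[Sequence[str]], list[list[float]]] = openai_embed,
) -> tuple[list[str], list[list[float]]]:
    """Embed each lyric line individually; returns (lines, vectors)."""
    lines = read_lyric_lines(path)
    if not lines:
        return [], []  # avoid sending an empty request
    vectors = embed(lines)
    if len(vectors) != len(lines):
        raise ValueError("expected one vector per lyric line")
    return lines, vectors


def save_vectors(vectors: Sequence[Sequence[float]], out_path: Path) -> None:
    """Store the vectors as a float32 NumPy array."""
    import numpy as np  # lazy import, see lead-in

    out_path.parent.mkdir(parents=True, exist_ok=True)
    np.save(out_path, np.asarray(vectors, dtype=np.float32))


# Example (hypothetical output path; requires the API key and numpy):
#   lines, vectors = embed_song(Path("assets/lyrics/drake/gods-plan.txt"))
#   save_vectors(vectors, Path("assets/vectors/drake/gods-plan.npy"))
```

Passing the embedding function as a parameter keeps the file handling testable offline: any callable that returns one vector per line can stand in for the API.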
song-vector-explorer/
├── src/
│ ├── embedding.py # Generate embeddings from lyrics
│ └── regist.py # Create TensorBoard visualization
├── assets/
│ ├── lyrics/ # Song lyrics organized by artist
│ ├── vectors/ # Generated embedding vectors
│ └── metadata/ # Song and artist metadata
├── logs/ # TensorBoard log files
├── pyproject.toml # Project dependencies
└── README.md # This file
- Love & Relationships: Romantic lyrics cluster together
- Success & Ambition: Achievement-focused lyrics form distinct groups
- Party & Celebration: Upbeat, energetic lyrics gather in their own space
- Introspection: Reflective, personal lyrics create contemplative clusters
- The Weeknd: Dark romance, nightlife, emotional complexity
- Drake: Success narratives, vulnerability, Toronto references
- Shakira: Dance, celebration, Latin cultural themes
- See how different genres approach similar themes
- Identify unique linguistic patterns per artist
- Discover unexpected connections between songs
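One way to look for such connections outside the projector is a nearest-neighbor search over the stored vectors. The sketch below uses cosine similarity in plain Python; the function names and the toy 2D vectors are illustrative assumptions, and real embeddings from the model would have far more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_lines(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    labels: Sequence[str],
    k: int = 3,
) -> list[tuple[float, str]]:
    """Return the k (similarity, label) pairs most similar to the query vector."""
    scored = [
        (cosine_similarity(query, vector), label)
        for vector, label in zip(vectors, labels)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: labels stand in for "artist / song / lyric line" strings.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_labels = ["line A", "line B", "line C"]
top_two = nearest_lines([1.0, 0.0], toy_vectors, toy_labels, k=2)
```

Here `top_two` lists "line A" first and "line B" second, since those vectors point in nearly the same direction as the query.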
- Create a new folder in assets/lyrics/
- Add .txt files with song lyrics
- Run the embedding and registration process
- Explore the updated visualization
Edit src/embedding.py to:
- Use different OpenAI models
- Adjust preprocessing steps
- Add custom metadata fields
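As an illustration of the last two points, the sketch below strips bracketed section tags such as "[Chorus]" before embedding and builds multi-column metadata for the projector. It is an assumption about how such a customization could look, not the repository's code: the names `preprocess_line`, `build_metadata_rows`, and `write_projector_logs`, and the artist/song/line columns, are hypothetical. The only third-party call is `SummaryWriter.add_embedding` from `torch.utils.tensorboard`, imported lazily.

```python
import re
from typing import Iterable, Sequence

SECTION_TAG = re.compile(r"\[[^\]]*\]")  # e.g. "[Chorus]" or "[Verse 1]"
METADATA_HEADER = ["artist", "song", "line"]  # hypothetical column names


def preprocess_line(line: str) -> str:
    """Drop bracketed section tags and collapse runs of whitespace."""
    without_tags = SECTION_TAG.sub(" ", line)
    return " ".join(without_tags.split())


def build_metadata_rows(
    records: Iterable[tuple[str, str, str]],
) -> list[list[str]]:
    """Turn (artist, song, raw_line) records into metadata rows.

    Lines that are empty after preprocessing are skipped, so embed only
    the lines that survive here to keep vectors and rows aligned.
    """
    rows = []
    for artist, song, raw_line in records:
        cleaned = preprocess_line(raw_line)
        if cleaned:
            rows.append([artist, song, cleaned])
    return rows


def write_projector_logs(
    vectors: Sequence[Sequence[float]],
    rows: list[list[str]],
    logdir: str = "logs",
) -> None:
    """Write vectors and their metadata where TensorBoard's projector finds them."""
    import numpy as np  # lazy imports: only needed when actually writing logs
    from torch.utils.tensorboard import SummaryWriter

    mat = np.asarray(vectors, dtype=np.float32)
    if mat.shape[0] != len(rows):
        raise ValueError("need exactly one metadata row per vector")
    writer = SummaryWriter(log_dir=logdir)
    writer.add_embedding(mat, metadata=rows, metadata_header=METADATA_HEADER)
    writer.close()
```

With several metadata columns, the projector can color or filter points by any of them, for example by artist.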
In TensorBoard's Projector:
- PCA: Linear dimensionality reduction
- t-SNE: Non-linear clustering visualization
- UMAP: Balanced approach for structure preservation
OPENAI_API_KEY=your-api-key-here
- numpy>=2.3.0 - Numerical computing
- openai>=1.86.0 - AI embeddings
- tensorboard>=2.19.0 - Visualization
- torch>=2.7.1 - Tensor operations
Expected Clustering Patterns:
🎵 The Weeknd Songs:
- Dark, moody themes cluster in one region
- Relationship complexity forms sub-clusters
- Nightlife references create distinct groupings
🎤 Drake Songs:
- Success themes cluster together
- Emotional vulnerability forms separate groups
- Toronto/Canadian references create unique clusters
💃 Shakira Songs:
- Dance and celebration themes cluster prominently
- Latin cultural references form distinct groups
- Confidence and empowerment create strong clusters
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ for music and AI enthusiasts