A from-scratch implementation of the Shazam audio fingerprinting algorithm in Go.
This project recreates the core logic behind Shazam, as described in the original 2003 paper "An Industrial-Strength Audio Search Algorithm" by Avery Li-Chun Wang. It features a Digital Signal Processing (DSP) pipeline, a custom PostgreSQL-based fingerprint database, and a React frontend for real-time audio recognition.
This system does not use machine learning or external fingerprinting libraries. Instead, it relies on pure signal processing and combinatorial hashing:
- Spectrogram Generation: Converts raw audio (WAV) into a time-frequency spectrogram using Fast Fourier Transform (FFT).
- Constellation Map: Identifies high-energy peaks (local maxima) in the spectrogram to create a sparse representation of the audio.
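- Combinatorial Hashing: Generates compact hashes by pairing each "anchor" peak with nearby "target" peaks, encoding both peak frequencies and the time delta between them. Because each hash depends only on the relative positions of two peaks, enough fingerprints survive background noise for a match, and a query clip can start anywhere in the song.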
- Matching & Time Coherency: Matches query fingerprints against the database and uses diagonal alignment (linearity search) to filter out false positives. If the time offsets of the matching hashes align, it's a match.
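The hashing and time-coherency steps above can be sketched in a few dozen lines of Go. This is a minimal illustration, not the code in this repository: the `Peak` and `Fingerprint` types, the 10/10/12-bit hash layout, and the `fanOut` and `maxDelta` values are all assumptions chosen for readability.

```go
package main

import "fmt"

// Peak is one constellation point (hypothetical type for this sketch).
type Peak struct {
	Frame int // time index in spectrogram frames
	Bin   int // frequency bin index
}

// Fingerprint couples a hash with the frame of its anchor peak.
type Fingerprint struct {
	Hash   uint32
	Anchor int
}

const (
	fanOut   = 5   // targets paired with each anchor (assumed value)
	maxDelta = 200 // largest anchor-to-target distance in frames (assumed value)
)

// packHash packs anchor bin (10 bits), target bin (10 bits) and
// time delta (12 bits) into a single 32-bit value.
func packHash(f1, f2, dt int) uint32 {
	return uint32(f1&0x3FF)<<22 | uint32(f2&0x3FF)<<12 | uint32(dt&0xFFF)
}

// fingerprints pairs every anchor with up to fanOut later peaks.
// peaks must be sorted by Frame.
func fingerprints(peaks []Peak) []Fingerprint {
	var out []Fingerprint
	for i, a := range peaks {
		paired := 0
		for j := i + 1; j < len(peaks) && paired < fanOut; j++ {
			dt := peaks[j].Frame - a.Frame
			if dt <= 0 {
				continue // same frame: no usable time delta
			}
			if dt > maxDelta {
				break // sorted input: all later peaks are too far away
			}
			out = append(out, Fingerprint{packHash(a.Bin, peaks[j].Bin, dt), a.Frame})
			paired++
		}
	}
	return out
}

// buildIndex maps each hash to the anchor frames where it occurs in a song.
func buildIndex(fps []Fingerprint) map[uint32][]int {
	idx := make(map[uint32][]int)
	for _, fp := range fps {
		idx[fp.Hash] = append(idx[fp.Hash], fp.Anchor)
	}
	return idx
}

// bestOffset histograms (song frame - query frame) over all hash hits and
// returns the most common offset with its vote count. Hits from a true
// match share one offset; random collisions scatter across many.
func bestOffset(query []Fingerprint, index map[uint32][]int) (offset, votes int) {
	hist := make(map[int]int)
	for _, q := range query {
		for _, songFrame := range index[q.Hash] {
			d := songFrame - q.Anchor
			hist[d]++
			if hist[d] > votes {
				offset, votes = d, hist[d]
			}
		}
	}
	return offset, votes
}

func main() {
	song := []Peak{{0, 10}, {3, 40}, {7, 25}, {12, 60}, {15, 33}, {21, 48}, {26, 12}, {30, 55}}
	// The query is a four-peak excerpt of the song that starts 7 frames in.
	query := []Peak{{0, 25}, {5, 60}, {8, 33}, {14, 48}}

	offset, votes := bestOffset(fingerprints(query), buildIndex(fingerprints(song)))
	fmt.Printf("offset=%d votes=%d\n", offset, votes) // prints "offset=7 votes=6"
}
```

In a real system the vote count would be compared against a threshold, or against the runner-up song, before declaring a match; the in-memory map here stands in for the PostgreSQL fingerprint table.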
- Go 1.25+: Core logic, high-performance DSP pipeline.
- Chi: Lightweight router for the REST API.
- PostgreSQL: Relational database for storing song metadata and millions of fingerprints.
- pgx: High-performance PostgreSQL driver.
- FFmpeg: Used for normalizing audio input (converting various formats to 44.1 kHz mono WAV).
- React 19: Modern UI library.
- Vite: Fast build tool.
- TailwindCSS v4: Styling.
- MediaRecorder API: Capturing microphone input in the browser.
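As an illustration of the FFmpeg normalization step listed above, the following sketch shells out to `ffmpeg` from Go to produce 44.1 kHz mono WAV. The function names are hypothetical and the `-y` overwrite flag is an assumption; the project's actual invocation may differ.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

// ffmpegArgs returns the arguments for converting any supported input
// into a 44.1 kHz mono WAV file.
func ffmpegArgs(in, out string) []string {
	return []string{
		"-y", // overwrite the output file if it exists (assumption)
		"-i", in,
		"-ac", "1", // one audio channel (mono)
		"-ar", "44100", // 44.1 kHz sample rate
		"-f", "wav", // force WAV output
		out,
	}
}

// normalize runs ffmpeg and includes its output in the error on failure.
func normalize(ctx context.Context, in, out string) error {
	cmd := exec.CommandContext(ctx, "ffmpeg", ffmpegArgs(in, out)...)
	if msg, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("ffmpeg failed: %w: %s", err, msg)
	}
	return nil
}

func main() {
	// Without arguments, only show the command that would be run.
	if len(os.Args) != 3 {
		fmt.Println("ffmpeg", strings.Join(ffmpegArgs("input.mp3", "output.wav"), " "))
		return
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()
	if err := normalize(ctx, os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Fixing the sample rate and channel count up front means every spectrogram frame covers the same time span and frequency range, which keeps stored and query fingerprints comparable.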
Ensure you have the following installed on your system:
- Go (v1.25 or later)
- Node.js & pnpm
- Docker & Docker Compose
- FFmpeg (must be available in your system path)
- goose (for database migrations): go install github.com/pressly/goose/v3/cmd/goose@latest
- sqlc (optional, for regenerating DB queries)
- Clone the repository:
  git clone https://github.com/Danztee/shazam-build.git
  cd shazam-build
- Environment Setup: Create a .env file in the root directory. You can copy the example:
  cp .env.example .env
  Ensure your .env contains the necessary variables (e.g., DATABASE_URL, PORT, SPOTIFY_CLIENT_ID, SPOTIFY_CLIENT_SECRET for ingestion).
- Start the Database: Use the Makefile to spin up the PostgreSQL container:
  make docker-run
- Run Migrations: Apply the database schema to your local Postgres instance:
  make migrate-up
You can start both the backend and frontend with a single command:
make run
- Backend API: http://localhost:8080
- Frontend: http://localhost:5173
The system needs a reference database of songs to recognize them. You can add songs via the API:
Endpoint: POST /api/v1/songs
Payload:
{
  "spotify_url": "https://open.spotify.com/track/..."
}
This will fetch the metadata, download the audio, process it through the DSP pipeline, and store the fingerprints.
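For scripting ingestion, a small Go client for this endpoint could look like the sketch below. It assumes the backend is running at http://localhost:8080 and that the payload needs no fields beyond `spotify_url`; the response format is not documented here, so the example only prints the HTTP status. The type and function names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// addSongRequest mirrors the documented JSON payload.
type addSongRequest struct {
	SpotifyURL string `json:"spotify_url"`
}

// newAddSongRequest builds the POST /api/v1/songs request.
func newAddSongRequest(baseURL, spotifyURL string) (*http.Request, error) {
	body, err := json.Marshal(addSongRequest{SpotifyURL: spotifyURL})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/songs", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Replace the elided track URL with a real Spotify track link.
	req, err := newAddSongRequest("http://localhost:8080", "https://open.spotify.com/track/...")
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}

	// Ingestion downloads and fingerprints audio, so allow a generous
	// timeout (the two-minute value is an assumption).
	client := &http.Client{Timeout: 2 * time.Minute}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed (is the backend running?):", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```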
- Open the frontend at http://localhost:5173.
- Grant microphone permissions.
- Click the Listen (or Shazam-like) button.
- Play a song in the background.
- After a few seconds of recording, the app will send the audio clip to the backend and display the matched result.
├── cmd/ # Entry points (API server)
├── frontend/ # React + Vite application
├── internal/
│ ├── audio/ # DSP logic (FFT, Spectrogram, etc.)
│ ├── database/ # Database connections and queries
│ ├── songs/ # Business logic for Song management & Matching
│ ├── spotify/ # Spotify API integration
│ └── ...
├── migrations/ # SQL migrations (Goose)
├── queries/ # SQLC queries
├── LINKEDIN_POST.md # Detailed technical write-up
└── Makefile # Command shortcuts
Distributed under the MIT License. See LICENSE for more information.