Double Entendre

Robust audio-based AI-generated lyrics detection via multi-view fusion.

Code to reproduce the experiments presented in "Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion" (Findings of ACL 2025) and "AI-Generated Song Detection via Lyrics Transcripts" (ISMIR 2025).

This repository provides tools for detecting AI-generated music by analyzing sung lyrics extracted via speech transcription. It supports multiple transcribers, feature extractors (text and audio), and evaluation scenarios including audio perturbations and out-of-distribution generalization.

Overview

[Figure: overview of the Double Entendre multi-view fusion method]

Supported scenarios:

  • Real vs. Fake (AI-generated lyrics + AI vocals)
  • Real vs. Half-fake (human lyrics + AI vocals)
  • Cross-platform generalization (Suno → Udio)
  • Robustness under audio attacks (reverb, pitch, EQ, noise, time stretch)
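To make the multi-view idea concrete, here is a minimal, illustrative sketch (not the actual pipeline in features/): it assumes precomputed text embeddings of the transcribed lyrics and pooled speech embeddings, fuses them by concatenation, and trains a simple scikit-learn classifier. The embedding dimensions, the random data, and the classifier choice are placeholders.

# Illustrative sketch of multi-view fusion (NOT the exact pipeline in features/).
# Assumes precomputed per-song text embeddings (from transcribed lyrics) and
# speech embeddings, fused by concatenation and fed to a linear classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_songs = 32
text_emb = rng.normal(size=(n_songs, 384))    # e.g. SBERT embeddings of lyrics transcripts
speech_emb = rng.normal(size=(n_songs, 768))  # e.g. pooled wav2vec 2.0 / XEUS features
labels = rng.integers(0, 2, size=n_songs)     # 0 = real, 1 = AI-generated (dummy labels)

fused = np.concatenate([text_emb, speech_emb], axis=1)  # fusion by concatenation
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(fused, labels)
print(clf.predict_proba(fused[:3])[:, 1])  # probability of "generated" for three songs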

Project Structure

.
├── transcription/       # Audio → text transcription
│   ├── run_whisper.py   # Whisper transcription
│   ├── run_meta.py      # Meta models (Seamless, MMS)
│   └── ...
├── features/            # Feature extraction & detection
│   ├── feature_extractor.py  # Base class
│   ├── sbert.py, llm2vec.py  # Text features
│   ├── xeus.py, w2v2.py      # Speech features
│   ├── run_features.py       # Main experiments
│   └── ...
├── pyproject.toml
└── README.md

See the READMEs in transcription/ and features/ for module-specific details.


Installation

git clone https://github.com/deezer/robust-AI-lyrics-detection.git
cd robust-AI-lyrics-detection
pip install .

Requirements: Python 3.9–3.11, CUDA-capable GPU recommended.

Optional Dependencies

pip install ".[meta]"       # Meta speech models (Seamless, MMS)
pip install ".[demucs]"     # Source separation (improves transcription)
pip install ".[xeus]"       # XEUS speech features (requires checkpoint below)
pip install ".[binoculars]" # Binoculars detector

XEUS Checkpoint

XEUS requires a manual checkpoint download:

wget https://huggingface.co/espnet/xeus/resolve/main/model/xeus_checkpoint.pth -P checkpoints/

Quick Start

1. Transcribe

cd transcription
python run_whisper.py --model large-v2 --file_path data/real/songs.json

2. Detect

cd features
python run_features.py

Data Format

Dataset JSON files should follow this structure:

{
  "train": [
    {
      "md5": "abc123...",
      "text": "Original lyrics text...",
      "language_str": "en",
      "genre": "pop",
      "artist_name": "Artist",
      "class": "real"
    }
  ],
  "test": [...]
}

For AI-generated songs, include mp3_url or local audio_path instead of md5.

Field          Description
md5            Audio file hash (for path lookup)
text           Ground-truth lyrics
language_str   ISO language code
class          real, generated, or half-fake
transcription  Added by the transcription scripts
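A hypothetical loader for this format is sketched below (illustrative only; the field names come from the table above, while the function name and example file path are made up for this snippet).

# Hypothetical helper for loading a dataset JSON in the format above (illustrative only).
import json
from pathlib import Path

REQUIRED_FIELDS = {"text", "language_str", "class"}

def load_split(path: str, split: str = "train") -> list[dict]:
    """Load one split and report entries missing required fields or an audio reference."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    songs = data[split]
    for i, song in enumerate(songs):
        missing = REQUIRED_FIELDS - song.keys()
        if missing:
            print(f"[{split}][{i}] missing fields: {sorted(missing)}")
        if not ({"md5", "mp3_url", "audio_path"} & song.keys()):
            print(f"[{split}][{i}] no audio reference (md5, mp3_url, or audio_path)")
    return songs

# Example usage:
# train_songs = load_split("data/real/songs.json", "train")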

Configuration

Environment variables for path configuration:

export AUDIO_BASE_PATH=/path/to/audio      # Audio file storage
export PROJECT_BASE_DIR=/path/to/project   # Project root
export ARTEFACTS_DIR=./artefacts           # Model outputs
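These variables could be resolved in Python roughly as follows (a minimal sketch; the defaults shown are placeholders, not necessarily the ones used by the code).

# Illustrative: resolving the path configuration from environment variables.
import os
from pathlib import Path

AUDIO_BASE_PATH = Path(os.environ.get("AUDIO_BASE_PATH", "/path/to/audio"))
PROJECT_BASE_DIR = Path(os.environ.get("PROJECT_BASE_DIR", Path.cwd()))
ARTEFACTS_DIR = Path(os.environ.get("ARTEFACTS_DIR", "./artefacts"))

ARTEFACTS_DIR.mkdir(parents=True, exist_ok=True)  # model outputs are written here
print(f"audio: {AUDIO_BASE_PATH}\nproject: {PROJECT_BASE_DIR}\nartefacts: {ARTEFACTS_DIR}")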

Citation

If you use this code, please cite:

Double Entendre (Findings of ACL 2025)

@inproceedings{frohmann2025double,
    author    = {Frohmann, Markus and Meseguer-Brocal, Gabriel and 
                 Epure, Elena V. and Labrak, Yanis},
    title     = {{Double Entendre}: Robust Audio-Based {AI}-Generated Lyrics 
                 Detection via Multi-View Fusion},
    booktitle = {Findings of the Association for Computational Linguistics: ACL 2025},
    year      = {2025},
    publisher = {Association for Computational Linguistics}
}

AI-Generated Song Detection via Lyrics Transcripts (ISMIR 2025)

@inproceedings{frohmann2025ai,
    author    = {Frohmann, Markus and Epure, Elena and 
                 Meseguer Brocal, Gabriel and Schedl, Markus and Hennequin, Romain},
    title     = {{AI}-Generated Song Detection via Lyrics Transcripts},
    booktitle = {Proceedings of the 26th International Society for 
                 Music Information Retrieval Conference (ISMIR)},
    year      = {2025},
    pages     = {121--130},
    address   = {Daejeon, South Korea},
    doi       = {10.5281/zenodo.17706345}
}

Related Work


License

MIT License. See LICENSE.

Built at Deezer Research, Paris.
