🐋 The Cetacean Scrolls 📜

A memory machine for the deep.

🌊 Project Overview

Cetacean Scrolls is an ongoing project to transcribe, analyze, and interpret whale songs using AI.
It is inspired by the idea that whales, especially humpbacks, may encode generational memory, environmental records, and possibly even culturally transmitted knowledge in their long, haunting vocalizations.

We aim to build a modular, extensible system that can:

🎧 Load long whale audio files
✂️ Chunk them into overlapping segments
🧠 Transcribe audio using OpenAI’s gpt-4o-transcribe
🔍 Analyze the transcripts for motifs, structure, and anomalies
🗒️ Log every entry as part of a growing, global Cetacean Memory Archive
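
The overlapping chunking step can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation: the function name `chunk_bounds` and its parameters are hypothetical, and it only computes window boundaries in milliseconds without touching any audio.

```python
from typing import Iterator, Tuple


def chunk_bounds(total_ms: int, chunk_ms: int, overlap_ms: int) -> Iterator[Tuple[int, int]]:
    """Yield (start_ms, end_ms) windows that cover [0, total_ms) with overlap.

    Hypothetical helper: consecutive windows share `overlap_ms` of audio,
    and the final window is shortened to end exactly at `total_ms`.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    if not 0 <= overlap_ms < chunk_ms:
        raise ValueError("overlap_ms must satisfy 0 <= overlap_ms < chunk_ms")

    step = chunk_ms - overlap_ms  # how far each new window advances
    start = 0
    while start < total_ms:
        end = min(start + chunk_ms, total_ms)
        yield start, end
        if end == total_ms:
            break  # the recording is fully covered
        start += step


if __name__ == "__main__":
    # One hour of audio, 5-minute chunks, 10 seconds of overlap (assumed values).
    for start, end in chunk_bounds(60 * 60 * 1000, 5 * 60 * 1000, 10 * 1000):
        print(start, end)
```

Since pydub's `AudioSegment` is sliced in milliseconds, each `(start, end)` pair could be applied as `audio[start:end]` on a loaded recording. The overlap keeps a motif that straddles a boundary intact in at least one chunk.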

🧠 Why This Matters

We often assume that language is a uniquely human tool, but whales may be among the few animals on Earth capable of passing memory between generations through sound.

📡 This project explores the idea of "interpreting possible non-human cultural memory."

By listening more deeply, we hope to:

Preserve endangered acoustic data before it disappears
Detect environmental disruptions and deep-sea anomalies
Help build AI models that decode complex, non-human communication systems

This isn't just bioacoustics—this is archiving the consciousness of the oceans.

⚙️ How It Works (Modules)

load_audio_chunks.py 🧩 Split long MP3s into overlapping 5-minute chunks
transcribe_chunks.py 🎙️ Transcribe each chunk using gpt-4o-transcribe
analyze_transcripts.py 🤖 Prompt GPT-4.1 to interpret motifs and anomalies
log_scroll_entry.py 📜 Save every entry to a time-stamped log
run_scroll_listener.py 🔁 Master pipeline runner
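
To show how these modules might fit together, here is a rough sketch of a pipeline runner that writes a time-stamped log. Everything in it is an assumption for illustration: the function names `append_scroll_entry` and `run_pipeline`, the JSON Lines log format, and the entry fields are hypothetical and are not taken from the repository. The transcription and analysis steps are passed in as plain callables so the sketch runs without any API access.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def append_scroll_entry(log_path: Path, chunk_id: str, transcript: str, analysis: str) -> Dict[str, str]:
    """Append one time-stamped entry to a JSON Lines file and return it.

    Hypothetical log format: one JSON object per line, UTC ISO 8601 timestamp.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "chunk_id": chunk_id,
        "transcript": transcript,
        "analysis": analysis,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def run_pipeline(
    chunk_ids: Iterable[str],
    transcribe: Callable[[str], str],
    analyze: Callable[[str], str],
    log_path: Path,
) -> List[Dict[str, str]]:
    """Run transcribe -> analyze -> log for each chunk, in order."""
    entries = []
    for chunk_id in chunk_ids:
        transcript = transcribe(chunk_id)  # e.g. a wrapper around the transcription model
        analysis = analyze(transcript)     # e.g. a wrapper around the analysis prompt
        entries.append(append_scroll_entry(log_path, chunk_id, transcript, analysis))
    return entries


if __name__ == "__main__":
    # Stand-in callables; a real run would call the models instead.
    fake_transcribe = lambda chunk_id: f"(transcript of {chunk_id})"
    fake_analyze = lambda text: f"(analysis of {len(text)} characters)"
    run_pipeline(["chunk_000", "chunk_001"], fake_transcribe, fake_analyze, Path("scroll_log.jsonl"))
```

Passing the model calls in as functions keeps the runner testable offline, and an append-only log means an interrupted run never loses earlier entries.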

🚀 Setup

# Recommended: create a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: .\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

🔧 Make sure ffmpeg is installed on your system and available on your PATH.
It’s required for the pydub audio slicing to work.
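
A quick way to confirm that ffmpeg is reachable before running the pipeline is to look it up on the PATH from Python. This is a small optional check, not part of the repository; the helper names `find_ffmpeg` and `require_ffmpeg` are made up for this example.

```python
import shutil
from typing import Optional


def find_ffmpeg(binary: str = "ffmpeg") -> Optional[str]:
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which(binary)


def require_ffmpeg() -> str:
    """Return the ffmpeg path, or raise a clear error if it is missing."""
    path = find_ffmpeg()
    if path is None:
        raise RuntimeError(
            "ffmpeg was not found on PATH; install it before slicing audio with pydub."
        )
    return path


if __name__ == "__main__":
    location = find_ffmpeg()
    print("ffmpeg found at:" if location else "ffmpeg not found", location or "")
```

Failing early with a clear message is friendlier than a decoding error halfway through a long recording.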

📊 Coming Soon: The Scrolls Dashboard
We are currently building a Flask-powered dashboard that will:

Visualize motif shifts over time 📈

Display “Scroll entries” in a sleek UI 🖥️

Let collaborators search, contribute, and annotate 🌐

💡 Until then: feel free to explore the scripts, run your own hydrophone audio, and help us decode the language of the deep.

👀 Stay Tuned
This project is actively evolving—check back for:

🌐 Real-time hydrophone support

🌊 Environmental context tagging

🧬 Motif clustering and tracking

🐦🐋🐘 Cross-species sonics (birds, elephants, more?)

🤝 Contributing
This repo will eventually open to collaboration on:

Finding/cleaning hydrophone recordings

Improving AI analysis models

Building long-term data backups

Expanding into new marine species and locations

For now, feel free to ⭐ star or watch the project, or send private feedback.

🧜‍♂️ Final Note
We're not just building software—
We're building the library of ocean memory
before it's lost forever.

🐋📜

— The Cetacean Scrolls Team
