Getting Started with AUDIOANALYSISX1

🚀 Quick Start (3 Steps)

1. Install Dependencies

pip install -r requirements.txt

2. Download Sample Audio Files

python download_samples.py

This creates test samples in the samples/ directory:

Human voices (clean recordings)
TTS samples (AI-generated speech)
Manipulated audio (pitch-shifted, time-stretched)

3. Analyze Audio

Option A: Web GUI (Recommended - Most User-Friendly)

python start_gui.py

Opens a web interface in your browser with:

🖱️ Drag-and-drop file upload
📊 Real-time visualizations
📥 Download reports
📁 Batch processing

Option B: Simple Command Line

# Analyze a single file
python analyze.py samples/tts/tts_smooth_prosody.wav

# Analyze all samples
python analyze.py --batch samples/

📖 Usage Examples

Simple Analysis

python analyze.py your_audio.wav

Output:

╔══════════════════════════════════════════════════════════════╗
║                   AUDIOANALYSISX1                            ║
║          Voice Manipulation & AI Detection System            ║
╚══════════════════════════════════════════════════════════════╝

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ANALYSIS RESULTS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Voice Type:            Female      F0: 185.7 Hz (Median)
Physical Characteristics: Male     F1: 235 Hz, F2: 473 Hz
Manipulation:          DETECTED    99% (Very High)
AI Voice:              DETECTED    AI-Generated (Type Unknown)

⚠ EVIDENCE DETECTED:
  [1] Pitch-Formant Incoherence Detected
  [2] Phase Decoherence / Transient Smearing Detected
  [3] Spectral Artifacts Detected
  [4] AI Voice Detected (80% confidence)

Batch Analysis

python analyze.py --batch samples/ --output results/

Analyzes all audio files in a directory and generates a summary table.

Interactive Mode (Full Features)

python tui.py interactive

Provides a menu-driven interface with:

Single file analysis
Batch processing
Sample generation
Full visualizations

🎯 What Does It Detect?

1. Voice Manipulation

✅ Pitch-shifting (male ↔ female voice conversion)
✅ Time-stretching (speed manipulation)
✅ Combined attacks (pitch + time)

2. AI-Generated Voices

✅ TTS Systems (Tacotron, FastSpeech, VITS)
✅ Voice Cloning (Real-Time VC, SV2TTS)
✅ Neural Vocoders (WaveNet, WaveGlow, HiFi-GAN)
✅ Deepfakes (multi-stage synthesis)

📊 Understanding Results

Report Structure

Every analysis generates:

JSON Report (*_report.json)
- Complete technical details
- Cryptographically signed
- Machine-readable
Markdown Report (*_report.md)
- Human-readable summary
- Executive summary
- Evidence details
Visualizations (4 PNG files)
- Overview dashboard
- Mel spectrogram
- Phase analysis
- Pitch-formant comparison

Key Fields

{
  "ALTERATION_DETECTED": true,          // Any manipulation found?
  "AI_VOICE_DETECTED": true,            // AI-generated voice?
  "AI_TYPE": "Neural Vocoder",          // Type of AI synthesis
  "CONFIDENCE": "95% (Very High)",      // Detection confidence

  "PRESENTED_AS": "Female",             // Apparent sex (from pitch)
  "PROBABLE_SEX": "Male",               // Probable sex (from formants)

  "EVIDENCE_VECTOR_1_PITCH": "...",     // Pitch manipulation evidence
  "EVIDENCE_VECTOR_2_TIME": "...",      // Time manipulation evidence
  "EVIDENCE_VECTOR_3_SPECTRAL": "...",  // Spectral artifacts
  "EVIDENCE_VECTOR_4_AI": "..."         // AI detection evidence
}
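
To show how these fields might be consumed by a script, here is a minimal sketch that turns a report into a one-line verdict. It assumes the fields sit at the top level of the JSON object exactly as listed above; the real report layout may differ, and summarize_report is a hypothetical helper, not part of the project.

```python
import json


def summarize_report(report: dict) -> str:
    """Build a one-line verdict from the key fields of a report dict."""
    parts = []
    if report.get("ALTERATION_DETECTED", False):
        parts.append("manipulation detected")
    if report.get("AI_VOICE_DETECTED", False):
        ai_type = report.get("AI_TYPE", "type unknown")
        parts.append(f"AI voice detected ({ai_type})")
    if not parts:
        parts.append("no manipulation or AI voice detected")

    # Flag a mismatch between the pitch-based and formant-based labels.
    presented = report.get("PRESENTED_AS")
    probable = report.get("PROBABLE_SEX")
    if presented and probable and presented != probable:
        parts.append(f"presented as {presented} but formants suggest {probable}")

    confidence = report.get("CONFIDENCE", "n/a")
    return "; ".join(parts) + f" [confidence: {confidence}]"


# Example data copied from the field list above (comments removed so it is valid JSON).
example = json.loads("""
{
  "ALTERATION_DETECTED": true,
  "AI_VOICE_DETECTED": true,
  "AI_TYPE": "Neural Vocoder",
  "CONFIDENCE": "95% (Very High)",
  "PRESENTED_AS": "Female",
  "PROBABLE_SEX": "Male"
}
""")
print(summarize_report(example))
```

To use it on a real report, load the *_report.json file with json.load and pass the resulting dict to the function.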

🛠️ Advanced Options

Disable Visualizations (Faster)

python analyze.py audio.wav --no-viz

Custom Output Directory

python analyze.py audio.wav --output ./investigation_001/

Batch with Pattern Matching

python analyze.py --batch ./evidence/ --pattern "*.mp3"
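
If you drive the analyzer from another Python script, the options above can be assembled programmatically. The sketch below only builds the command line; build_command is a hypothetical helper, it uses only the flags shown in this guide, and whether every combination of flags is accepted together is an assumption.

```python
import sys
from typing import List, Optional


def build_command(
    target: str,
    batch: bool = False,
    output: Optional[str] = None,
    pattern: Optional[str] = None,
    no_viz: bool = False,
) -> List[str]:
    """Assemble an analyze.py invocation from the flags shown in this guide."""
    cmd = [sys.executable, "analyze.py"]
    if batch:
        cmd += ["--batch", target]      # target is a directory
    else:
        cmd.append(target)              # target is a single audio file
    if output:
        cmd += ["--output", output]
    if pattern:
        cmd += ["--pattern", pattern]   # shown in this guide only with --batch
    if no_viz:
        cmd.append("--no-viz")
    return cmd


cmd = build_command("./evidence/", batch=True, pattern="*.mp3", output="results/")
print(" ".join(cmd[1:]))
# To actually run it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with patterns such as *.mp3.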

🔍 Detection Methods

Phase 1: F0 Analysis

Extracts the fundamental frequency (F0) to determine the presented pitch.
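
As a rough illustration of what F0 extraction involves, the sketch below estimates the pitch of a single frame by picking the strongest autocorrelation peak. This is a generic textbook technique chosen for the example, not necessarily what pipeline.py does; the function name, the 75-400 Hz search range, and the synthetic test tone are all assumptions.

```python
import math
from typing import Sequence


def estimate_f0(samples: Sequence[float], sample_rate: int,
                fmin: float = 75.0, fmax: float = 400.0) -> float:
    """Estimate F0 (Hz) of one frame from its autocorrelation peak."""
    min_lag = int(sample_rate / fmax)   # shortest period considered
    max_lag = int(sample_rate / fmin)   # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 0.1 s long, sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
print(f"{estimate_f0(tone, rate):.1f} Hz")
```

Real speech needs voicing detection and per-frame tracking on top of this, which is why a median over many frames (as in the sample output) is more robust than a single estimate.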

Phase 2: Formant Analysis

Extracts the vocal tract resonances (formants), which reflect the speaker's physical characteristics.

Phase 3: Manipulation Detection

Pitch-formant incoherence
Mel spectrogram artifacts
Phase decoherence/transient smearing
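
The first check, pitch-formant incoherence, can be pictured as a mismatch between the pitch-based label and the formant-based label. The sketch below is a deliberately simplified illustration: the 165 Hz boundary is an assumed round number for the example, both function names are hypothetical, and the formant-based label is taken as an input rather than computed.

```python
def presented_from_f0(f0_hz: float, boundary_hz: float = 165.0) -> str:
    """Label the presented voice type from median F0 (boundary is an assumption)."""
    return "Female" if f0_hz >= boundary_hz else "Male"


def is_pitch_formant_incoherent(f0_hz: float, probable_sex: str) -> bool:
    """True when the pitch-based label disagrees with the formant-based label."""
    return presented_from_f0(f0_hz) != probable_sex


# Values from the sample output in this guide: F0 185.7 Hz, formants suggest Male.
print(is_pitch_formant_incoherent(185.7, "Male"))  # True
```

A mismatch alone is only one indicator; the guide lists it alongside spectral and phase evidence.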

Phase 4: AI Voice Detection

Neural vocoder artifacts
Prosody unnaturalness
Breathing/pause patterns
Micro-timing perfection
Harmonic structure anomalies
Statistical feature anomalies

Phase 5: Report Synthesis

Consolidates all findings with cryptographic verification.
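
This guide does not say which cryptographic scheme the reports use. Purely to illustrate the general idea of a tamper-evident report, the sketch below signs a canonical JSON encoding with HMAC-SHA256 from the Python standard library. The function names, the key handling, and the choice of HMAC are all assumptions and should not be read as the project's actual mechanism.

```python
import hashlib
import hmac
import json


def sign_report(report: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the report."""
    # Sorted keys and fixed separators make the encoding deterministic.
    payload = json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_report(report: dict, key: bytes, signature: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign_report(report, key), signature)


key = b"demo-key-not-for-production"
report = {"ALTERATION_DETECTED": True, "CONFIDENCE": "95% (Very High)"}
tag = sign_report(report, key)

print(verify_report(report, key, tag))                                      # True
print(verify_report({**report, "ALTERATION_DETECTED": False}, key, tag))    # False
```

Any change to the report content changes the tag, which is what makes later edits detectable.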

💡 Tips

Best Practices

Use WAV files for best accuracy (lossless format)
Clips of 3-10 seconds are the optimal length
Clean audio works better than noisy recordings
Review visualizations for detailed analysis
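
The first two recommendations can be checked before running an analysis. The following sketch uses Python's standard wave module to confirm that a file is a readable PCM WAV and that its duration falls in the 3-10 second range suggested above. The helper names are hypothetical, and write_silence exists only to make the demo self-contained.

```python
import os
import tempfile
import wave


def write_silence(path: str, seconds: float, sample_rate: int = 16000) -> None:
    """Write a mono 16-bit silent WAV file (demo helper only)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_clip(path: str, min_seconds: float = 3.0, max_seconds: float = 10.0) -> str:
    """Report whether a clip follows the format and length recommendations."""
    try:
        duration = wav_duration_seconds(path)
    except (wave.Error, EOFError):
        return "not a readable PCM WAV file; consider converting it with ffmpeg"
    if duration < min_seconds:
        return f"too short ({duration:.1f} s)"
    if duration > max_seconds:
        return f"longer than recommended ({duration:.1f} s)"
    return "ok"


with tempfile.TemporaryDirectory() as tmp_dir:
    clip = os.path.join(tmp_dir, "demo.wav")
    write_silence(clip, seconds=5)
    print(check_clip(clip))
```

Clips outside the range can still be analyzed; the check only flags inputs that fall outside the recommended conditions.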

Interpreting Confidence

Confidence	Level	Interpretation
95-99%	Very High	Strong evidence, high certainty
85-94%	High	Multiple indicators detected
60-84%	Medium	Some indicators, review carefully
<60%	Low	Inconclusive, may need more analysis
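
When post-processing many reports, the table above can be applied in code. This is a small sketch with hypothetical helper names; it assumes the confidence string has the form shown in Key Fields, such as "95% (Very High)", and it treats any value of 95 or more as Very High.

```python
def parse_confidence(text: str) -> float:
    """Extract the number from a string such as '95% (Very High)'."""
    return float(text.split("%", 1)[0])


def confidence_band(percent: float) -> str:
    """Map a confidence percentage to the bands in the table above."""
    if percent >= 95:
        return "Very High"
    if percent >= 85:
        return "High"
    if percent >= 60:
        return "Medium"
    return "Low"


print(confidence_band(parse_confidence("95% (Very High)")))  # Very High
print(confidence_band(72))                                   # Medium
```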

📁 Project Structure

voice/
├── analyze.py              # ← Simple CLI interface
├── tui.py                  # ← Interactive TUI
├── download_samples.py     # ← Sample downloader
├── pipeline.py             # Core detection pipeline
├── samples/                # Test audio samples
│   ├── human/
│   ├── tts/
│   ├── voice_cloning/
│   ├── deepfake/
│   └── manipulated/
└── results/                # Analysis outputs

🆘 Troubleshooting

"No module named 'librosa'"

pip install -r requirements.txt

"File not found"

# Check file path
ls -la your_audio.wav

# Use absolute path
python analyze.py /full/path/to/audio.wav

"Parselmouth error"

# Convert to WAV format
ffmpeg -i input.mp3 output.wav

# Then analyze
python analyze.py output.wav

📚 More Information

Full Documentation: See README.md
Technical Details: See TECHNICAL.md
API Reference: See API.md
Deployment Guide: See DEPLOYMENT.md

🎓 Next Steps

✅ Run python download_samples.py to get test samples
✅ Try python analyze.py samples/tts/tts_smooth_prosody.wav
✅ Analyze your own audio files
✅ Explore interactive mode: python tui.py interactive
✅ Read full documentation for advanced features

🔬 You're ready to detect voice manipulation and AI-generated voices!
