Whisper Hotkey Testing Guide

Test Coverage

Current Status: 62.03% line coverage (163 passing, 46 ignored)

Run coverage report:

cargo llvm-cov                                    # Terminal summary
cargo llvm-cov --lcov --output-path coverage.lcov # LCOV format
cargo llvm-cov --html && open target/llvm-cov/html/index.html # HTML

Coverage by Module:

config.rs: 88.07% ✅
input/hotkey.rs: 72.31% ✅
audio/capture.rs: 70.28% ✅
transcription/download.rs: 60.87%
tray.rs: 58.97%
permissions.rs: 55.83%
input/cgevent.rs: 54.17%
telemetry.rs: 49.40%
transcription/engine.rs: 45.26%
main.rs: 0.00% (binary, not unit testable)

Expected Coverage Gaps:

Some modules have lower coverage due to external dependencies that cannot be mocked in unit tests:

transcription/engine.rs (327 lines, 180 uncovered): Requires actual Whisper model file (~75MB). Tests are #[ignore]
tray.rs (1,105 lines, 439 uncovered): Requires macOS main thread for Menu/TrayIcon creation. Tests are #[ignore]
main.rs (210 lines, 210 uncovered): Binary initialization and event loop, requires full system integration

These account for 829 uncovered lines (58.5% of total 1,416 gaps). Remaining gaps are testable. Current 62.03% coverage is realistic given integration-heavy codebase.

Phase Completion:

✅ Phase 1: Low-hanging fruit (47% → 60%)
✅ Phase 2: Mock infrastructure (60% → 75%)
✅ Phase 3: Tray menu logic (75% → 82%)
✅ Phase 4: Testability refactoring (58% → 62%)
- Extracted pure configuration logic from FFI calls
- tray.rs: Menu configuration now 100% testable
- engine.rs: Sampling strategy extracted and tested
- +17 new unit tests covering previously untestable code

Quick Test

After building, test the full pipeline:

Build: cargo build --release
Run: cargo run --release
Test hotkey: Press Ctrl+Option+Z (or configured hotkey)
Record: Hold for 5s, speak clearly
Verify: Text inserted at cursor

Expected latency: <2.5s total (5s recording + <2s transcription + <100ms insertion)

Unit Tests

# Fast unit tests (no hardware)
cargo test

# Hardware tests (requires mic + permissions)
cargo test -- --ignored

Key tests:

config.rs:237 - Config parsing with default optimization parameters
engine.rs:327 - Long 30s recordings
engine.rs:347 - Optimization param variations
engine.rs:374 - Noise handling

Performance Profiling

Audio Latency

Enable detailed logging:

RUST_LOG=whisper_hotkey=trace cargo run --release

Key metrics (check logs):

start_recording: latency_us <50μs target
stop_recording: total_ms <50ms target
convert_to_16khz_mono: total_us <20ms target
transcription: inference_ms <2000ms for 10s audio

CPU Profiling

# Install flamegraph
cargo install flamegraph

# Profile (run app, trigger hotkey, Ctrl+C)
sudo cargo flamegraph --release

# Open flamegraph.svg
open flamegraph.svg

Look for:

Whisper inference (should dominate)
Audio conversion overhead (should be minimal)
Unnecessary allocations in hot paths

Memory Profiling

macOS Instruments (recommended)

# Build release
cargo build --release

# Profile allocations
instruments -t Allocations target/release/whisper-hotkey

# Profile leaks
instruments -t Leaks target/release/whisper-hotkey

Trigger multiple recordings (10+) while profiling.

Expected memory:

Idle: ~1.5GB (model loaded)
During transcription: +50-100MB temporary
No leaks after recordings complete

Valgrind (Linux alternative)

valgrind --leak-check=full --show-leak-kinds=all target/release/whisper-hotkey

Manual Test Cases

Edge Cases

1. Silence (0-1s)

Press hotkey, don't speak, release
Expected: Empty or minimal text

2. Short audio (1-3s)

"Hello world"
Expected: Accurate transcription, <1.5s latency

3. Long recording (30s+)

Read paragraph continuously
Expected: Complete transcription, <5s latency, no crashes

4. Noise

Record with background music/TV
Expected: Partial transcription or gibberish (no crash)

5. Rapid hotkey cycles

Press/release 10 times quickly (0.1s each)
Expected: All cycles complete, no deadlocks

6. Concurrent recordings

Press hotkey during active transcription
Expected: Graceful handling (queue or reject)

App Compatibility

Test text insertion in:

✓ TextEdit
✓ VS Code
✓ Chrome (Gmail, Google Docs)
✓ Slack
✓ Terminal (may fail - secure input mode)
✓ Notes
✓ Messages

Check logs for insertion failures: ~/.whisper-hotkey/crash.log

Config Changes

Edit ~/.whisper-hotkey.toml:

Optimization tuning

[model]
threads = 8        # Try 2, 4, 8
beam_size = 1      # Try 1 (fast), 5 (balanced), 10 (accurate)

Restart app, test 10s recording:

threads=8, beam_size=1: Fastest (~1s)
threads=4, beam_size=5: Balanced (~2s)
threads=4, beam_size=10: Accurate (~3s)

Different models

[model]
name = "tiny"   # or base, small, medium, large
path = "~/.whisper-hotkey/models/ggml-tiny.bin"

Restart, verify auto-download + transcription accuracy.

Performance Targets

Metric	Target	Command
Audio start	<50μs	`RUST_LOG=trace` logs
Transcription (10s)	<2s	`RUST_LOG=info` logs
Text insertion	<100ms	Test manually
Idle CPU	<1%	Activity Monitor
Idle RAM	~1.5GB	Activity Monitor
No leaks	0	Instruments/Valgrind

Failing targets? Check:

Config optimization (threads, beam_size)
Model size (try smaller)
Flamegraph for bottlenecks

Debugging

App won't start

# Check config syntax
cat ~/.whisper-hotkey.toml

# Validate manually
cargo run 2>&1 | grep -i error

# Reset config
rm ~/.whisper-hotkey.toml && cargo run

No transcription output

# Check logs
tail -f ~/.whisper-hotkey/crash.log

# Verify model
ls -lh ~/.whisper-hotkey/models/

# Test directly
cargo test test_transcribe_short_audio -- --ignored --nocapture

Audio not recording

# Reset mic permissions
tccutil reset Microphone

# Restart app, grant permission

# Verify device
cargo test test_audio_capture_initialization -- --ignored --nocapture

Memory leak

# Profile with Instruments
instruments -t Leaks target/release/whisper-hotkey

# Check ringbuf not growing
# Run 100 recordings, verify RSS stable

CI/CD

GitHub Actions runs on PR/main:

cargo fmt --check
cargo clippy -- -D warnings
cargo test (unit only, no hardware)
cargo build --release

Hardware tests (require manual run):

cargo test -- --ignored

Reporting Issues

Include:

macOS version (sw_vers)
Chip (M1/M2/Intel)
Logs (~/.whisper-hotkey/crash.log)
Config (~/.whisper-hotkey.toml)
Steps to reproduce
Expected vs actual behavior

Example:

**macOS**: 14.0 (M1 Pro)
**Config**: threads=4, beam_size=5, model=small
**Issue**: 30s recording crashes with "mutex poisoned"
**Logs**: [attach crash.log]
**Reproduce**: Hold hotkey for 30s, speak continuously

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Whisper Hotkey Testing Guide

Test Coverage

Quick Test

Unit Tests

Performance Profiling

Audio Latency

CPU Profiling

Memory Profiling

macOS Instruments (recommended)

Valgrind (Linux alternative)

Manual Test Cases

Edge Cases

1. Silence (0-1s)

2. Short audio (1-3s)

3. Long recording (30s+)

4. Noise

5. Rapid hotkey cycles

6. Concurrent recordings

App Compatibility

Config Changes

Optimization tuning

Different models

Performance Targets

Debugging

App won't start

No transcription output

Audio not recording

Memory leak

CI/CD

Reporting Issues

FilesExpand file tree

TESTING.md

Latest commit

History

TESTING.md

File metadata and controls

Whisper Hotkey Testing Guide

Test Coverage

Quick Test

Unit Tests

Performance Profiling

Audio Latency

CPU Profiling

Memory Profiling

macOS Instruments (recommended)

Valgrind (Linux alternative)

Manual Test Cases

Edge Cases

1. Silence (0-1s)

2. Short audio (1-3s)

3. Long recording (30s+)

4. Noise

5. Rapid hotkey cycles

6. Concurrent recordings

App Compatibility

Config Changes

Optimization tuning

Different models

Performance Targets

Debugging

App won't start

No transcription output

Audio not recording

Memory leak

CI/CD

Reporting Issues