Complete reference for configuring MeetMemo.
Create a .env file in the project root:

```bash
cp example.env .env
```

| Variable | Description | Example |
|---|---|---|
| HF_TOKEN | Hugging Face API token for PyAnnote models | hf_abc... |
| LLM_API_URL | LLM endpoint base URL (no /v1/chat/completions suffix) | http://localhost:1234 |
| LLM_MODEL_NAME | Model identifier | qwen2.5-14b-instruct |
| DATABASE_URL | PostgreSQL connection string (auto-set in Docker) | postgresql://... |
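Since LLM_API_URL holds only the base URL, the request path is appended at call time. As a hedged illustration (a hypothetical helper, not MeetMemo's actual code), the endpoint might be built like this:

```python
def chat_completions_url(base_url: str) -> str:
    """Build the full chat-completions endpoint from a base URL.

    Hypothetical helper illustrating the LLM_API_URL convention:
    the variable stores only the base URL, never the full request path.
    """
    base = base_url.rstrip("/")
    # Avoid doubling the path when the base already ends in /v1
    # (e.g. Ollama's OpenAI-compatible endpoint).
    if base.endswith("/v1"):
        return base + "/chat/completions"
    return base + "/v1/chat/completions"
```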
| Variable | Description | Default |
|---|---|---|
| LLM_API_KEY | API key for LLM service | Empty (none) |
| POSTGRES_PASSWORD | PostgreSQL password | changeme |
| WHISPER_MODEL_NAME | Whisper model for transcription | turbo |
| COMPUTE_TYPE | Inference precision (float16/int8) | float16 |
| TIMEZONE_OFFSET | Timezone offset from UTC (hours) | +8 |
| NVIDIA_VISIBLE_DEVICES | GPU selection (`all`, `0`, `0,1`) | all |
| HTTP_PORT | External HTTP port for nginx | 80 |
| HTTPS_PORT | External HTTPS port for nginx | 443 |
MeetMemo uses faster-whisper with CTranslate2, giving transcription roughly 4x faster than openai-whisper while maintaining the same accuracy and support for 99+ languages.

Set the WHISPER_MODEL_NAME environment variable to choose a model:
| Model | VRAM | Speed | Accuracy | Use Case |
|---|---|---|---|---|
| tiny | ~1GB | Fastest | Basic | Quick drafts, testing |
| base | ~1GB | Fast | Good | General use |
| small | ~2GB | Moderate | Better | Most meetings |
| medium | ~5GB | Slow | High | Important recordings |
| large | ~10GB | Slowest | Highest | Critical accuracy needs |
| turbo | ~6GB | Fast | High | Default - Best balance |
| large-v3 | ~10GB | Very Slow | Highest+ | 10-20% better than large-v2 |
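One way to apply the table is to pick the largest model that fits your GPU's memory. A hypothetical helper (not part of MeetMemo) using the rough VRAM figures above could look like this:

```python
# Approximate VRAM requirements in GB, taken from the table above.
MODEL_VRAM_GB = {
    "tiny": 1, "base": 1, "small": 2, "medium": 5,
    "turbo": 6, "large": 10, "large-v3": 10,
}

def pick_whisper_model(available_vram_gb: float) -> str:
    """Pick a model for the available VRAM (illustrative sketch only)."""
    if available_vram_gb >= MODEL_VRAM_GB["turbo"]:
        return "turbo"  # the default: best speed/accuracy balance
    # Otherwise fall back to the largest model that fits.
    fitting = [m for m, gb in MODEL_VRAM_GB.items() if gb <= available_vram_gb]
    return max(fitting, key=MODEL_VRAM_GB.get) if fitting else "tiny"
```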
Control inference precision with the COMPUTE_TYPE environment variable:

| Compute Type | Memory Usage | Speed | Quality | Best For |
|---|---|---|---|---|
| float16 | Medium | Fast | High | GPU (default, recommended) |
| int8 | Low | Very Fast | Good | CPU or low VRAM |
| int8_float16 | Low-Medium | Fast | Good-High | Hybrid scenarios |
Example:

```bash
# In .env file
WHISPER_MODEL_NAME=turbo
COMPUTE_TYPE=float16
```

Configure file upload limits in backend/config.py:
```python
max_file_size: int = 100 * 1024 * 1024  # 100MB
allowed_audio_types: list[str] = [
    'audio/wav', 'audio/mpeg', 'audio/mp4',
    'audio/x-m4a', 'audio/webm', 'audio/flac', 'audio/ogg'
]
```

Auto-cleanup of old jobs and exports:
```python
cleanup_interval_hours: int = 1   # Check every hour
job_retention_hours: int = 12     # Keep jobs for 12 hours
export_retention_hours: int = 24  # Keep exports for 24 hours
```

All runtime data is stored in named Docker volumes:
| Volume | Purpose | Path in Container |
|---|---|---|
| meetmemo_audiofiles | Uploaded audio files | /app/audiofiles |
| meetmemo_transcripts | Transcription JSONs | /app/transcripts |
| meetmemo_summary | Summary files | /app/summary |
| meetmemo_exports | PDF/Markdown exports | /app/exports |
| meetmemo_logs | Application logs | /app/logs |
| meetmemo_whisper_cache | Legacy (unused) | /root/.cache/whisper |
| meetmemo_huggingface_cache | Whisper + PyAnnote models | /root/.cache/huggingface |
| meetmemo_torch_cache | PyTorch cache | /root/.cache/torch |
| meetmemo_postgres_data | Database | /var/lib/postgresql/data |
PostgreSQL connection pool settings in backend/config.py:
```python
db_pool_min_size: int = 5   # Minimum connections
db_pool_max_size: int = 20  # Maximum connections
```

Configure logging in backend/config.py:
```python
log_level: str = "INFO"                # DEBUG, INFO, WARNING, ERROR, CRITICAL
log_file: str = "logs/app.log"
log_max_bytes: int = 10 * 1024 * 1024  # 10MB per file
log_backup_count: int = 5              # Keep 5 rotated files
log_to_console: bool = True            # Also output to stdout
```

In .env:
```bash
NVIDIA_VISIBLE_DEVICES=0    # Use only GPU 0
NVIDIA_VISIBLE_DEVICES=0,1  # Use GPUs 0 and 1
NVIDIA_VISIBLE_DEVICES=all  # Use all GPUs (default)
```

To run without a GPU, remove the runtime: nvidia line from docker-compose.yml:
```yaml
meetmemo-backend:
  # runtime: nvidia  # Comment out this line
```

Note: CPU-only mode is significantly slower for transcription and diarization.
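As a hedged sketch of how a service could decide between GPU and CPU inference at startup, one simple heuristic is to check whether the NVIDIA driver tools are on PATH. This is an illustration only, not MeetMemo's actual detection logic; the function name is hypothetical:

```python
import shutil

def pick_device() -> tuple[str, str]:
    """Return a (device, compute_type) pair for Whisper-style loaders.

    Hypothetical heuristic: if nvidia-smi is on PATH, assume a usable
    GPU and prefer float16; otherwise fall back to CPU with int8,
    which is considerably faster than float16 on CPU.
    """
    if shutil.which("nvidia-smi") is not None:
        return "cuda", "float16"  # GPU path
    return "cpu", "int8"          # CPU fallback
```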
Set your local timezone offset:
```bash
TIMEZONE_OFFSET=+8  # GMT+8 (Asia/Singapore, Beijing)
TIMEZONE_OFFSET=-5  # GMT-5 (US Eastern)
TIMEZONE_OFFSET=0   # GMT (UTC)
```

This affects timestamps in generated files and logs.
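An offset string like "+8" or "-5" maps directly onto a fixed-offset timezone. A minimal sketch of that conversion (the helper name is hypothetical; MeetMemo's exact implementation may differ):

```python
from datetime import datetime, timedelta, timezone

def local_now(offset_str: str) -> datetime:
    """Current time in the configured fixed-offset timezone.

    offset_str follows the TIMEZONE_OFFSET format, e.g. "+8", "-5", "0".
    """
    tz = timezone(timedelta(hours=int(offset_str)))  # int() accepts "+8"/"-5"
    return datetime.now(tz)
```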
MeetMemo works with any OpenAI-compatible API:
```bash
# Local LM Studio
LLM_API_URL=http://localhost:1234
LLM_MODEL_NAME=qwen2.5-14b-instruct
LLM_API_KEY=

# Ollama with OpenAI compatibility
LLM_API_URL=http://localhost:11434/v1
LLM_MODEL_NAME=llama3
LLM_API_KEY=

# OpenAI
LLM_API_URL=https://api.openai.com/v1
LLM_MODEL_NAME=gpt-4
LLM_API_KEY=sk-...
```

Adjust the LLM timeout in backend/config.py:
```python
llm_timeout: float = 60.0  # 60 seconds
```

Configure external ports via the .env file:
```bash
HTTP_PORT=8080
HTTPS_PORT=8443
```

Then access the app via https://localhost:8443.
This is the recommended approach as it keeps your configuration in one place and doesn't require editing docker-compose.yml.
Alternatively, you can edit docker-compose.yml directly:
```yaml
nginx:
  ports:
    - "8080:80"   # HTTP (redirects to HTTPS)
    - "8443:443"  # HTTPS
```

Note: If both environment variables and manual configuration are present, environment variables take precedence.
SSL/TLS settings in nginx/nginx.conf:
```nginx
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
client_max_body_size 500M;  # Max upload size
```

Proxy timeouts for long-running transcriptions:
```nginx
proxy_connect_timeout 600s;
proxy_send_timeout 600s;
proxy_read_timeout 600s;
```