The FFmpeg-RTMP distributed transcoding system now supports comprehensive resource limits per job to prevent resource exhaustion and ensure system stability.
- Purpose: Prevent jobs from monopolizing CPU resources
- Implementation: Linux cgroups (v1 and v2 supported)
- Fallback: Process nice priority when cgroups unavailable
- Purpose: Prevent OOM conditions
- Implementation: Linux cgroups memory controller
- Behavior: Process killed if exceeding limit
- Purpose: Prevent disk full conditions
- Check: Before job starts
- Alert: Warning at 90% usage, reject at 95%
- Purpose: Kill runaway jobs
- Implementation: Context-based with fallback monitoring
- Cleanup: Process group termination (SIGTERM → SIGKILL)
- Purpose: Lower priority for transcoding jobs
- Implementation: Linux nice value (default: 10)
- Benefit: System remains responsive during heavy load
{
"scenario": "1080p-h264",
"parameters": {
"bitrate": "4M",
"duration": 300
},
"resource_limits": {
"max_cpu_percent": 200, // 200% = 2 cores
"max_memory_mb": 2048, // 2GB memory limit
"max_disk_mb": 5000, // 5GB temp space required
"timeout_sec": 600 // 10 minute timeout
}
}
If no resource_limits specified, defaults are:
{
MaxCPUPercent: numCPU * 100, // All available CPUs
MaxMemoryMB: 2048, // 2GB
MaxDiskMB: 5000, // 5GB
TimeoutSec: 3600, // 1 hour
}
- Type: Integer
- Unit: Percentage (100 = 1 core)
- Examples:
100= 1 CPU core200= 2 CPU cores400= 4 CPU cores
- Note: Requires cgroups (root/sudo), otherwise uses nice priority
- Type: Integer
- Unit: Megabytes
- Typical Values:
- 720p: 512-1024 MB
- 1080p: 1024-2048 MB
- 4K: 2048-4096 MB
- Note: Requires cgroups (root/sudo)
- Type: Integer
- Unit: Megabytes
- Purpose: Check available disk space before starting
- Note: Always enforced, no special permissions needed
- Type: Integer
- Unit: Seconds
- Purpose: Maximum job execution time
- Note: Always enforced via context timeout
1. Detect cgroup version (v1 or v2)
2. Check cgroup permissions
3. Initialize cgroup manager
4. Set up disk monitoring
1. Check disk space (reject if < 5% available)
2. Parse resource limits from job parameters
3. Generate input video if needed
4. Start transcoding process
5. Set process priority (nice value)
6. Create cgroup (if permissions available)
7. Add process to cgroup
8. Monitor process (timeout, memory)
9. Wait for completion
10. Cleanup cgroup
11. Return results
- Interval: Every 5 seconds
- Checks:
- Process exists
- Timeout exceeded
- Memory usage (logged if over limit)
- Action: Kill process group if timeout exceeded
- Path:
/sys/fs/cgroup/ffmpeg-job-{jobID}/ - Files:
cpu.max: CPU quota/periodmemory.max: Memory limit in bytescgroup.procs: Process list
- CPU Path:
/sys/fs/cgroup/cpu/ffmpeg-job-{jobID}/ - Memory Path:
/sys/fs/cgroup/memory/ffmpeg-job-{jobID}/ - Files:
- CPU:
cpu.cfs_quota_us,cpu.cfs_period_us - Memory:
memory.limit_in_bytes
- CPU:
With Root/Sudo (Full cgroup support):
sudo ./bin/agent -master https://localhost:8080 ...
Without Root (Fallback mode):
./bin/agent -master https://localhost:8080 ...
# Still enforces: disk limits, timeouts, nice priority
# Cannot enforce: CPU/memory cgroups
>>> RESOURCE CHECK PHASE <<<
Disk space: 39.9% used (142023 MB available)
Resource limits: CPU=200%, Memory=2048MB, Disk=5000MB, Timeout=600s
>>> TRANSCODING EXECUTION PHASE <<<
Process started with PID: 12345
Set process priority: nice=10 (lower than normal)
Detected cgroup v2 (root: /sys/fs/cgroup)
Set CPU limit: 200% (quota=200000, period=100000)
Set memory limit: 2048 MB
✓ Process added to cgroup: /sys/fs/cgroup/ffmpeg-job-abc123
WARNING: Cannot create cgroup (permission denied), running without cgroup limits
WARNING: Disk usage is high: 92.5% (available: 15000 MB)
WARNING: Process 12345 using 2500 MB (limit: 2048 MB)
insufficient disk space: 96.2% used
Process 12345 exceeded timeout (600 seconds), killing...
- Disk check: <1ms per job
- Cgroup creation: 1-5ms per job
- Process monitoring: Negligible (5s intervals)
- Nice priority: No measurable overhead
- System stability: Prevents resource exhaustion
- Predictable performance: Jobs stay within limits
- Multi-tenancy: Fair resource sharing
- Responsiveness: System remains usable under load
// Light encoding (720p, fast preset)
"max_cpu_percent": 100-200
// Medium encoding (1080p, medium preset)
"max_cpu_percent": 200-400
// Heavy encoding (4K, slow preset)
"max_cpu_percent": 400-800
// Conservative (safety margin)
"max_memory_mb": 2048
// Standard transcoding
"max_memory_mb": 1024
// Lightweight (720p only)
"max_memory_mb": 512
// Fast jobs (< 1 min duration)
"timeout_sec": 300 // 5 minutes
// Standard jobs (1-5 min duration)
"timeout_sec": 900 // 15 minutes
// Long jobs (> 5 min duration)
"timeout_sec": 3600 // 1 hour
// Always require sufficient space
"max_disk_mb": duration_sec * bitrate_mbps * 2 // 2x for safety
Problem: WARNING: Cannot create cgroup (permission denied)
Solutions:
- Run worker with sudo:
sudo ./bin/agent ... - Add user to cgroup-permitted group
- Accept fallback mode (disk/timeout/nice still work)
Problem: Jobs consistently timeout before completion
Solutions:
- Increase
timeout_secvalue - Check if CPU limit too low (causing slow encoding)
- Verify hardware capabilities match workload
Problem: WARNING: Disk usage is high
Solutions:
- Clean up /tmp regularly
- Increase disk space
- Reduce concurrent jobs
- Enable input cleanup (PERSIST_INPUTS=false)
Problem: Jobs killed by OOM killer
Solutions:
- Increase
max_memory_mb - Reduce concurrent jobs
- Use lower resolution/bitrate
- Check for memory leaks in FFmpeg
{
"scenario": "720p30-h264",
"parameters": {
"bitrate": "2M",
"output_mode": "rtmp"
},
"resource_limits": {
"max_cpu_percent": 200,
"max_memory_mb": 1024,
"timeout_sec": 3600
},
"queue": "live",
"priority": "high"
}
{
"scenario": "1080p30-h264",
"parameters": {
"bitrate": "4M",
"duration": 300,
"preset": "medium"
},
"resource_limits": {
"max_cpu_percent": 400,
"max_memory_mb": 2048,
"timeout_sec": 1800
},
"queue": "batch",
"priority": "low"
}
{
"scenario": "4k-h265",
"parameters": {
"bitrate": "20M",
"duration": 600,
"preset": "slow"
},
"resource_limits": {
"max_cpu_percent": 800,
"max_memory_mb": 4096,
"timeout_sec": 7200
},
"queue": "batch",
"priority": "medium"
}
- Linux kernel 3.10+ (cgroup v1)
- /tmp with 10GB+ free space
- 2GB+ RAM per worker
- Linux kernel 4.5+ (cgroup v2)
- /tmp with 50GB+ free space
- 8GB+ RAM per worker
- SSD storage for /tmp
- Root/sudo access for full cgroup support
- systemd for cgroup management
- Dedicated /tmp partition
- GPU resource limits (CUDA/OpenCL)
- Network bandwidth throttling
- I/O priority (ionice)
- Per-user resource quotas
- Resource reservation/scheduling
- Cgroup delegation (rootless)
Version: 1.0
Last Updated: 2026-01-05
Status: Production Ready