Flask server crashes during TitleDB loading on ARM (Raspberry Pi) #311

@amadib

Flask dev server killed during TitleDB memory load/unload cycle on ARM


Environment

  • Platform: Raspberry Pi 4 (ARM64, 4GB RAM)
  • OS: DietPi (Debian-based)
  • Docker: Running via Docker Compose
  • Image: a1ex4/ownfoil:latest (pulled 2026-03-23)
  • Memory limit: Tested at 128M, 256M, and 512M
  • Library: 45 Switch games (mixed NSP/NSZ)

Bug

The Flask development server dies every time the scheduled TitleDB refresh runs (~every 2 hours). The Python process
survives but the web server stops accepting connections.

Reproduction

  1. Run ownfoil on ARM with a game library
  2. Wait for the first scheduled update_db_and_scan job to complete
  3. After "Loading TitleDBs into memory..." the memory spikes to ~375MB
  4. After "TitleDBs unloaded." the web server is dead (port 8465 refuses connections)
  5. ps inside the container shows the Python process still running, but netstat/ss shows no listening sockets
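
Steps 4 and 5 can also be checked from outside the container with a small probe. This is a minimal sketch using only the standard library; the function name `port_accepts_connections` is hypothetical, and the host address is an assumption to adjust for your setup.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_accepts_connections("127.0.0.1", 8465)` before and after the "TitleDBs unloaded." log line should show the result flip from True to False if the behavior described here is reproduced.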

Log sequence (repeats every cycle)

  INFO  (scheduler) Starting job update_db_and_scan
  INFO  (app) Running update job...
  INFO  (titledb) Updating titledb...
  INFO  (titledb) Titledb already up to date
  INFO  (library) Scanning library path /games ...
  INFO  (scheduler) Completed job update_db_and_scan. Next run at ...
  INFO  (titles) Loading TitleDBs into memory...
    <--- memory spikes to ~375MB here --->
  INFO  (library) Identifying file (1/45): ...
    <--- library identification runs --->
  INFO  (library) Generating library done.
  INFO  (titles) Unloading TitleDBs from memory...
  INFO  (titles) TitleDBs unloaded.
    <--- Flask server is dead after this point. No error logged. --->
    <--- Port 8465 refuses connections. Process still alive. --->

What I've tried

  • valid_keys: false — same crash
  • Memory limits from 128M to 512M — same crash (not a Docker OOM, confirmed via dmesg)
  • Latest image — same crash
  • The server works fine before the first TitleDB load completes

Root cause (suspected)

The Flask development server runs in the main thread. TitleDB loading happens in a background thread and allocates
~375MB. My guess is that on the Pi, where memory is tight, this allocation spike takes down the Flask server thread
without a traceback. The scheduler thread survives, leaving a live process with no web server. I have not confirmed the
mechanism; the GIL itself works the same way on ARM as on x86, so it is unlikely to be the difference.
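
One way to test this hypothesis is to log which threads are still alive once the server stops responding. The following is a minimal sketch using the standard `threading` module. `snapshot_threads` is a hypothetical helper, not something ownfoil provides, and it would have to be called from a thread that survives (for example, at the end of the scheduled job).

```python
import threading


def snapshot_threads() -> list[tuple[str, bool, bool]]:
    """Return (name, is_alive, is_daemon) for the main thread and every live thread."""
    main = threading.main_thread()
    rows = [(main.name, main.is_alive(), main.daemon)]
    for t in threading.enumerate():
        # enumerate() only lists live threads; skip the main thread to avoid a duplicate row.
        if t is not main:
            rows.append((t.name, t.is_alive(), t.daemon))
    return rows
```

If the `MainThread` row reports `is_alive` as False while the scheduler thread is still listed, that would suggest the server thread exited and a non-daemon thread is keeping the process alive.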

Suggested fixes (from Claude 4.6 Opus; untested)

Option A — Subprocess isolation (cleanest):

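  import multiprocessing

  def load_titledb_worker():
      # Placeholder: load and process TitleDB here.
      pass

  if __name__ == "__main__":
      # The guard is required when the start method is "spawn" or "forkserver".
      p = multiprocessing.Process(target=load_titledb_worker)
      p.start()
      p.join()  # the child's memory is returned to the OS when it exits
      if p.exitcode != 0:
          print(f"TitleDB worker exited with code {p.exitcode}")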

Option B — Production WSGI server:
Replace Flask's dev server with gunicorn, whose master process restarts any worker process that dies:

  gunicorn -w 2 -b 0.0.0.0:8465 app:app

Option C — Lazy TitleDB queries:
Instead of loading the entire TitleDB into memory, query the SQLite database on demand. The data is already persisted
after the first download.
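
A minimal sketch of what on-demand lookups could look like, using the standard `sqlite3` module. The table name `titles`, its columns, the sample rows, and the helper `lookup_title` are all assumptions for illustration; the actual ownfoil schema is not shown in this report. An in-memory database stands in for the real file.

```python
import sqlite3
from typing import Optional


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `titles` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE titles (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO titles (id, name) VALUES (?, ?)",
        [
            ("0100000000000001", "Example Game A"),  # made-up sample rows
            ("0100000000000002", "Example Game B"),
        ],
    )
    conn.commit()
    return conn


def lookup_title(conn: sqlite3.Connection, title_id: str) -> Optional[str]:
    """Fetch one title name by primary key, or None if the id is unknown."""
    row = conn.execute(
        "SELECT name FROM titles WHERE id = ?", (title_id,)
    ).fetchone()
    return row[0] if row else None
```

Because each lookup reads a single row through the primary-key index, peak memory should no longer scale with the size of TitleDB.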

Current workaround

Disable the Docker healthcheck and rely on restart: unless-stopped. The server comes back in ~30 seconds after each
crash. This results in brief downtime every 2 hours.

  healthcheck:
    disable: true
  deploy:
    resources:
      limits:
        memory: 512M
