[BUG] EXO (All models) hang on WARMING UP, then 0% RAM usage on Python 3.13 systems #1303

@Stratamesh

Describe the bug

With any model and any number of nodes, EXO enters its "LOADING" step and then switches to "WARMING UP". On Ubuntu 25, which ships with and runs Python 3.13 natively, the Python process breaks during the "WARMING UP" stage and loses all RAM assigned to it. The Python task then becomes a zombie.

To Reproduce

Steps to reproduce the behavior:

  1. Launch EXO and load a model on Ubuntu 25 (Python 3.13).
  2. Wait for the WARMING UP stage.
  3. A FAILED message appears after the Python task has failed, possibly along with a "Python 3.13 has quit unexpectedly" pop-up.
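
To confirm which interpreter a node is actually running before reproducing, a small check like the one below may help. It is only a sketch: `interpreter_report` is a hypothetical helper, not part of EXO, and the two flags simply encode what this report observed (failure on Python 3.13, success on Python 3.11). Other versions are left unclassified because they were not tested here.

```python
import platform
import sys


def interpreter_report(version_info=sys.version_info):
    """Hypothetical helper: summarize the interpreter relative to this report.

    Only two data points come from the issue: 3.13 hangs, 3.11 works.
    Any other version is reported with both flags set to False.
    """
    major, minor = version_info[0], version_info[1]
    return {
        "python": f"{major}.{minor}",
        "implementation": platform.python_implementation(),
        "matches_failing_report": (major, minor) == (3, 13),
        "matches_working_report": (major, minor) == (3, 11),
    }


if __name__ == "__main__":
    # Run this with the same interpreter that launches EXO on each node.
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running it on every node makes it easier to tell whether a mixed cluster has some nodes on 3.11 and others on 3.13.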

Expected behavior

EXO should correctly load the model as it does with Python 3.11 on all connected nodes.

Actual behavior

EXO starts and begins to load a model (e.g. Llama3.1:8B) on one or more nodes. EXO successfully loads the model into RAM on the node(s). 5 s later, the Python task held by EXO quits due to an incompatible memory manager causing a crash at the C-extension level.
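
Since the crash appears to happen below the Python level, a traceback from the standard-library `faulthandler` module might help narrow down which extension is involved. The sketch below assumes you can run code in the process before the model loads; `enable_crash_tracebacks` and the `exo_crash.log` file name are hypothetical and not part of EXO.

```python
import faulthandler


def enable_crash_tracebacks(log_path="exo_crash.log"):
    """Hypothetical helper: dump Python tracebacks for all threads
    when the process receives a fatal signal such as SIGSEGV or SIGABRT.

    The returned file object must stay open for as long as the
    handler is enabled, because faulthandler writes to its descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    crash_log = enable_crash_tracebacks()
    print("faulthandler enabled:", faulthandler.is_enabled())
```

If editing the startup code is not practical, the same handler can be switched on without code changes by launching the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER=1` environment variable, in which case the traceback goes to stderr.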

Environment

  • macOS Version: N/A
  • EXO Version: Latest (Official)
  • Hardware:
    • Device 1: Dell PowerEdge (64GB) x2 Intel Xeon
    • Device 2: Dell PowerEdge (128GB) x2 AMD
    • Additional devices:
  • Interconnection:
    • 1GbE Ethernet between all devices.

Additional context

All packages are up to date, many models were attempted, and Python was reinstalled. I understand this is likely an already known issue, but I would like to share my concerns and experience to help improve the software.