Alexander Borzunov edited this page Aug 6, 2023 · 3 revisions

This page lists common errors and ways to address them.

  1. On WSL, I get this error: hivemind.dht.protocol.ValidationError: local time must be within 3 seconds of others. What should I do?

    Petals needs clocks on all nodes to be synchronized. Please set the date using an NTP server:

    sudo apt install ntpdate
    sudo ntpdate pool.ntp.org

  2. The server starts loading blocks and then prints: Killed. What should I do?

    This happens because Windows limits the RAM available to WSL by default, so the server runs out of memory and gets OOM-killed.

    To increase the memory limit, go to C:/Users/username (replace username with your Windows user name) and create a .wslconfig file with these contents:

    [wsl2]
    memory=12GB

    Then restart WSL (run sudo reboot in the WSL console) and it should work fine. If the new limit is not applied, run wsl --shutdown from a Windows terminal and open WSL again.

  3. I get this error: torch.cuda.OutOfMemoryError: CUDA out of memory. What should I do?

    If you use an Anaconda env, run this before starting the server:

    export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

    If you use Docker, add this argument after --rm in the Docker command:

    -e "PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128"
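
To check that the memory and allocator fixes above took effect, a minimal sketch along these lines may help. The function names mem_total_gib and alloc_conf_status are hypothetical helpers written for this page, not part of Petals; the sketch assumes a Linux or WSL shell with awk available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Petals) for checking the fixes above.

# Print the total RAM visible to this Linux/WSL instance in whole GiB, rounded down.
# Reads /proc/meminfo by default; another file can be passed as the first argument.
mem_total_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

# Report whether PYTORCH_CUDA_ALLOC_CONF is set in the current shell.
alloc_conf_status() {
    if [ -n "${PYTORCH_CUDA_ALLOC_CONF:-}" ]; then
        echo "PYTORCH_CUDA_ALLOC_CONF=${PYTORCH_CUDA_ALLOC_CONF}"
    else
        echo "PYTORCH_CUDA_ALLOC_CONF is not set"
    fi
}
```

After restarting WSL, mem_total_gib should print a value close to the limit set in .wslconfig (usually slightly below it, since the result is rounded down). Run alloc_conf_status in the same shell that will start the server to confirm the export is visible there.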