
NCCL doesn't gracefully exit #114

@frenzybiscuit

On the latest TabbyAPI, the server doesn't exit gracefully when NCCL is in use.

When running on 3 or 4 3090s and exiting TabbyAPI with Ctrl+C, one process remains running on the first card in the CUDA device list and has to be killed manually.
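
For anyone trying to reproduce this, here is a rough sketch of how the leftover process could be spotted after Ctrl+C. It assumes `nvidia-smi` is on the PATH and uses its `--query-compute-apps` CSV output. The helper names (`parse_compute_apps`, `filter_by_name`, `list_gpu_processes`) and the `"python"` name filter are made up for illustration and are not part of TabbyAPI.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows into (pid, name, mem) tuples."""
    rows = []
    for line in csv_text.splitlines():
        parts = line.split(",")
        if len(parts) < 3 or not parts[0].strip().isdigit():
            continue  # skip blank or unexpected lines
        pid = int(parts[0].strip())
        # The process name is everything between the first and last field.
        name = ",".join(parts[1:-1]).strip()
        mem = parts[-1].strip()
        rows.append((pid, name, mem))
    return rows


def filter_by_name(rows, fragment):
    """Keep only rows whose process name contains the given fragment."""
    return [row for row in rows if fragment in row[1]]


def list_gpu_processes():
    """Ask nvidia-smi which processes currently hold a compute context."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    # Assumption: the stuck worker shows up with "python" in its process name.
    for pid, name, mem in filter_by_name(list_gpu_processes(), "python"):
        print(f"still running: pid={pid} name={name} mem={mem}")
```

Running this right after exiting should print nothing on a clean shutdown. Any PID it does print is a candidate for the process that currently has to be killed by hand.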

This has been an ongoing problem that I've been content to just ignore, but I'd like to report it as a bug.

Hardware:

7950x;
128GB RAM;
4x3090;

TabbyAPI git commit: 8b6b793bfc4b848986d55340aed1f02e55ff9db8

Using EXL3 with tensor parallelism, with 24, 24, 24, 24 or 24, 24, 24.

FP16 cache.
