
Segfault during inference #1368

@maxlund

Description

Crashed Thread:        23

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000700
Exception Codes:       0x0000000000000001, 0x0000000000000700

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [49912]


Thread 23 Crashed:
0   AGXMetalG13X                  	       0x32128d734 -[AGXG13XFamilyCommandBuffer tryCoalescingPreviousComputeCommandEncoderWithConfig:nextEncoderClass:] + 180
1   AGXMetalG13X                  	       0x32128d618 -[AGXG13XFamilyCommandBuffer computeCommandEncoderWithConfig:] + 84
2   AGXMetalG13X                  	       0x32128d544 -[AGXG13XFamilyCommandBuffer computeCommandEncoderWithDispatchType:] + 136
3   libmlx.dylib                  	       0x32351dda8 mlx::core::metal::CommandEncoder::CommandEncoder(mlx::core::metal::DeviceStream&) + 140
4   libmlx.dylib                  	       0x3235203a8 mlx::core::metal::Device::get_command_encoder(int) + 284
5   libmlx.dylib                  	       0x323559640 mlx::core::RandomBits::eval_gpu(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&, mlx::core::array&) + 484
6   libmlx.dylib                  	       0x32355605c mlx::core::metal::eval(mlx::core::array&) + 192
7   libmlx.dylib                  	       0x322b0608c mlx::core::eval_impl(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>>, bool) + 4736
8   libmlx.dylib                  	       0x322b06c58 mlx::core::async_eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>>) + 112
9   core.cpython-310-darwin.so    	       0x320219d78 0x320180000 + 630136

Environment:

import numpy as np
import mlx.core as mx
import mlx_whisper
import platform

print(f"numpy: {np.__version__}")
print(f"mlx: {mx.__version__}")
print(f"mlx_whisper: {mlx_whisper.__version__}")
print(f"macOS {platform.mac_ver()}")

Prints:

numpy: 1.26.4
mlx: 0.24.1
mlx_whisper: 0.4.1
macOS ('15.0.1', ('', '', ''), 'arm64')

I tried to reproduce this by letting it run overnight, transcribing many hours of audio, with no luck. I have seen it happen a few times now, though.
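Since the crash is intermittent and the trace above shows it on a non-main thread (Thread 23, entering through `async_eval`), one option for catching it next time is a small stress harness with Python's `faulthandler` enabled, so a SIGSEGV also dumps the Python-level tracebacks of all threads. This is only a sketch under assumptions: `run_stress`, its parameters, and the log file name are hypothetical names made up for illustration, and there is no evidence yet that running from several threads is what triggers the fault.

```python
import faulthandler
import threading


def run_stress(workload, n_threads=4, iterations=100,
               log_path="crash_traceback.log"):
    """Call `workload` repeatedly from several threads.

    `workload` is a hypothetical zero-argument callable, e.g. a wrapper
    around one transcription call. If the process receives SIGSEGV,
    faulthandler writes the Python tracebacks of all threads to `log_path`.
    Returns the number of calls that completed.
    """
    log_file = open(log_path, "w")
    # The file must stay open while faulthandler is enabled.
    faulthandler.enable(file=log_file, all_threads=True)

    completed = []
    lock = threading.Lock()

    def worker():
        for _ in range(iterations):
            workload()
            with lock:
                completed.append(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    faulthandler.disable()
    log_file.close()
    return len(completed)


# Hypothetical usage (the audio path is a placeholder, not from this report):
# import mlx_whisper
# run_stress(lambda: mlx_whisper.transcribe("audio.wav"), n_threads=2)
```

If the segfault recurs under this harness, the log file should show which Python call each thread was in at the time, which would complement the native frames above.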

Thanks for all your great work!
