I am trying to deploy a custom model on tritonserver (23.08) with the onnxruntime_backend (onnxruntime version 1.15.1), but while doing so we are running into this error:
onnx runtime error 6: Non-zero status code returned while running Mul node. Name:'Mul_8702' Status Message: /workspace/onnxruntime/onnxruntime/core/framework/bfc_arena.cc:368 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 2830172160
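The failed allocation works out to roughly 2.6 GiB for a single intermediate buffer. As a rough sketch (the model path below is a placeholder), running ONNX shape inference and listing the largest statically-shaped intermediates can show which tensor in the graph gets that big:

```python
import onnx
from onnx import shape_inference

MODEL_PATH = "model_repository/new_model/1/model.onnx"  # placeholder path

inferred = shape_inference.infer_shapes(onnx.load(MODEL_PATH))

def num_elements(tensor_type):
    """Product of the static dims; returns None if any dim is dynamic."""
    total = 1
    for d in tensor_type.shape.dim:
        if not d.HasField("dim_value"):
            return None
        total *= d.dim_value
    return total

# Collect sizes for intermediate values and graph outputs, then report the largest.
sizes = []
for vi in list(inferred.graph.value_info) + list(inferred.graph.output):
    n = num_elements(vi.type.tensor_type)
    if n is not None:
        sizes.append((n, vi.name))

for n, name in sorted(sizes, reverse=True)[:10]:
    print(f"{name}: {n:,} elements")
```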
There are 7 other models hosted on the same server and those work fine (even under stress), but things break once this new model is added. Any idea why this might be happening? The server runs on a T4 GPU and these are its current settings:
| Option | Value |
| --- | --- |
| model_control_mode | MODE_NONE |
| strict_model_config | 0 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
| cache_enabled | 0 |
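To rule Triton out, this is roughly how the model could be exercised standalone with onnxruntime on the same T4 (the path, memory cap, and input shapes below are placeholders, not our actual values):

```python
import numpy as np
import onnxruntime as ort

MODEL_PATH = "model_repository/new_model/1/model.onnx"   # placeholder path

# Cap the CUDA EP arena so the run fails fast instead of growing until the T4 is full.
cuda_options = {
    "device_id": 0,
    "gpu_mem_limit": 8 * 1024 ** 3,               # placeholder cap, in bytes
    "arena_extend_strategy": "kSameAsRequested",  # grow only by what each request needs
}

sess = ort.InferenceSession(
    MODEL_PATH,
    providers=[("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"],
)

# Build dummy inputs; dynamic dims are set to 1 here and float32 is assumed,
# so adjust to the model's real input shapes and dtypes.
feeds = {}
for inp in sess.get_inputs():
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    feeds[inp.name] = np.random.rand(*shape).astype(np.float32)

outputs = sess.run(None, feeds)
print([o.shape for o in outputs])
```

If the same allocation failure shows up here, the model by itself needs more free memory than the T4 has left once the other seven models are loaded.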
Any help in understanding what might be causing this and how to fix it would be appreciated.
Thanks!