Description:
I am trying to run an ONNX model with Bfloat16 inputs in Triton, but the model fails to load with the following message:
"Invalid argument: load failed for model 'FluxEasyControlKVCacheApplyTokenDrop': version 1 is at UNAVAILABLE state: Internal: unsupported datatype TYPE_BF16 for input 'double_blocks_k_cache' for model 'FluxEasyControlKVCacheApplyTokenDrop';\n"
It's a bit frustrating because I am able to run the model outside of Triton with Bfloat16 inputs.
I am running the 25.02 Triton NGC container.