When I convert the model weights to half precision, either with model.half() or by passing dtype=torch.float16/torch.bfloat16, CPU inference becomes much slower than with float32. Is this expected? My understanding is that most x86 CPUs have no native float16 arithmetic, so ops get emulated or upcast, and bfloat16 is only fast with AVX512-BF16/AMX support, but I would like to confirm.