W4A8 (INT4 weights with FP8 activations) achieves good accuracy with strong acceleration. Could LLMCompressor implement this quantization scheme and make the resulting checkpoints loadable in vLLM? Ref: https://nvidia.github.io/TensorRT-Model-Optimizer/guides/_choosing_quant_methods.html#:~:text=Ampere%20and%20later.-,INT4%2DFP8%20AWQ%20(W4A8),-High
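
For context, here is a minimal sketch of what such a recipe might look like using LLMCompressor's existing `QuantizationModifier` with a custom `config_groups` definition from the compressed-tensors schema. This is an assumption, not a supported preset: the INT4-weight/FP8-activation combination below may not have matching vLLM kernels today, which is exactly what this request is asking about, and the model name and `group_size` are illustrative.

```python
# Hypothetical W4A8 recipe: INT4 group-quantized weights + dynamic per-token
# FP8 activations. The config_groups values are assumptions for illustration;
# no "W4A8" preset is implied to exist in LLMCompressor.
from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

recipe = QuantizationModifier(
    ignore=["lm_head"],  # keep the output head in higher precision
    config_groups={
        "group_0": {
            "targets": ["Linear"],
            "weights": {
                "num_bits": 4,
                "type": "int",        # INT4 weights
                "symmetric": True,
                "strategy": "group",
                "group_size": 128,    # illustrative group size
            },
            "input_activations": {
                "num_bits": 8,
                "type": "float",      # FP8 activations
                "symmetric": True,
                "dynamic": True,      # dynamic scales, so no calibration set needed
                "strategy": "token",
            },
        }
    },
)

# Weight scales come from the weights themselves and activation scales are
# dynamic, so oneshot can run without a calibration dataset here.
oneshot(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # example model
    recipe=recipe,
    output_dir="llama3-8b-w4a8",
)
```

Even if a recipe like this produces a valid compressed-tensors checkpoint, serving it would still depend on vLLM shipping a mixed INT4-weight/FP8-activation GEMM kernel for the scheme.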