It would be great if this could report performance metrics for each input token we provide:
First token latency
Other tokens latency
First infer latency
Other infers latency
Tokens/sec
Something like what we already have in llm_bench: https://github.com/openvinotoolkit/openvino.genai/tree/master/llm_bench/python
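
To make the request concrete, here is a rough sketch of how these numbers could be derived from a list of per-token timings. It is not taken from llm_bench; `LatencyMetrics` and `summarize_latencies` are hypothetical names, and the input is assumed to be one latency in milliseconds per generated token, in generation order. The same helper could be fed per-inference timings to get the first and other infer latencies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LatencyMetrics:
    """Hypothetical container for the requested metrics."""
    first_ms: float        # latency of the first token (or first infer)
    other_avg_ms: float    # mean latency of all remaining tokens (or infers)
    tokens_per_sec: float  # generated tokens divided by total elapsed time


def summarize_latencies(latencies_ms: List[float]) -> LatencyMetrics:
    """Summarize per-token latencies (assumed to be in milliseconds)."""
    if not latencies_ms:
        raise ValueError("latencies_ms must contain at least one value")

    first = latencies_ms[0]
    rest = latencies_ms[1:]
    # With a single token there are no "other" tokens, so report 0.0.
    other_avg = sum(rest) / len(rest) if rest else 0.0

    total_seconds = sum(latencies_ms) / 1000.0
    tokens_per_sec = len(latencies_ms) / total_seconds if total_seconds > 0 else 0.0

    return LatencyMetrics(first, other_avg, tokens_per_sec)


if __name__ == "__main__":
    # Example timings are made up for illustration only.
    metrics = summarize_latencies([100.0, 20.0, 30.0])
    print(f"First token latency:  {metrics.first_ms:.2f} ms")
    print(f"Other tokens latency: {metrics.other_avg_ms:.2f} ms")
    print(f"Throughput:           {metrics.tokens_per_sec:.2f} tokens/sec")
```

Splitting the first token out from the rest matters because the first token usually includes prompt processing, so averaging it with the others would hide both numbers.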