Performance unexplainability for tflite int8 and fp32 models #13

@arun-kumark

Dear all,
I am testing the performance/throughput of fp32 and quantized models on my platform. My configuration is as follows:

tflite-runtime==2.5.0.post1
tensorflow==1.14.0

FP32 on CPU

-INFO- Running prediction...
-INFO- Acquired 1 file(s) for model 'MobileNet v1.0'
-INFO- Task runtime: 0:00:28.796083
-INFO- Throughput: 35.8 fps
-INFO- Latency: 29.5 ms
-INFO- Target          Workload        H/W   Prec  Batch Conc. Metric       Score    Units
-INFO- -----------------------------------------------------------------------------------
-INFO- tensorflow_lite mobilenet       cpu   fp32      1     1 throughput    35.8      fps
-INFO- tensorflow_lite mobilenet       cpu   fp32      1     1 latency       29.5       ms
-INFO- Total runtime: 0:00:28.830364
-INFO- Done

INT8 on CPU

google@localhost:~/mlmark$ harness/mlmark.py -c config/tflite-cpu-mobilenet-int8-throughput.json 
-INFO- Running prediction...
-INFO- Acquired 1 file(s) for model 'MobileNet v1.0'
-INFO- Task runtime: 0:01:00.933346
-INFO- Throughput: 16.9 fps
-INFO- Latency: 65. ms
-INFO- Target          Workload        H/W   Prec  Batch Conc. Metric       Score    Units
-INFO- -----------------------------------------------------------------------------------
-INFO- tensorflow_lite mobilenet       cpu   int8      1     1 throughput    16.9      fps
-INFO- tensorflow_lite mobilenet       cpu   int8      1     1 latency       65.        ms
-INFO- Total runtime: 0:01:00.960828
-INFO- Done

Observations: The FP32 model is roughly twice as fast as the INT8 model on CPU, but the Google TensorFlow Lite benchmarks report the opposite:
https://www.tensorflow.org/lite/guide/hosted_models#quantized_models
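
As a quick check on the size of the gap, the ratio can be computed from the two throughput figures in the logs above. This is only arithmetic on the reported numbers; the variable names are mine.

```python
# Throughput figures copied from the two harness logs above.
fp32_fps = 35.8
int8_fps = 16.9

# How many times faster the FP32 run is than the INT8 run.
speed_ratio = fp32_fps / int8_fps
print(f"FP32 throughput is {speed_ratio:.2f}x the INT8 throughput")
```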

I also tried replacing the models with the ones from the hosted location above, but the harness gives similar results.
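
To rule out the harness itself, one option would be to time the interpreter directly. Below is a minimal sketch of such a timing loop, not the harness's own method: `benchmark`, `warmup`, and `runs` are hypothetical names, and the workload in the example is a stand-in rather than a real model.

```python
import statistics
import time


def benchmark(invoke, warmup=5, runs=50):
    """Time a zero-argument callable.

    Returns (mean latency in ms, throughput in calls per second).
    """
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        invoke()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append(time.perf_counter() - start)

    mean_s = statistics.mean(samples)
    # Guard against a timer too coarse to resolve a very fast call.
    fps = 1.0 / mean_s if mean_s > 0 else float("inf")
    return mean_s * 1000.0, fps


if __name__ == "__main__":
    # Stand-in workload; replace with the model's invoke call.
    latency_ms, fps = benchmark(lambda: sum(range(100_000)))
    print(f"latency: {latency_ms:.2f} ms, throughput: {fps:.1f} fps")
```

With tflite-runtime, the callable would presumably be `interpreter.invoke`, after creating `tflite_runtime.interpreter.Interpreter(model_path=...)`, calling `allocate_tensors()`, and filling the input tensor with `set_tensor()`. Running it once per model would show whether the FP32/INT8 gap also appears outside the harness.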

Could you let me know what is going wrong?

Thanks
Kind Regards
Arun
