I used the command
nlp-train transformer_glue \
--task_name mrpc \
--model_name_or_path bert-base-uncased \
--model_type quant_bert \
--learning_rate 2e-5 \
--output_dir /tmp/mrpc-8bit \
--evaluate_during_training \
--data_dir /path/to/MRPC \
--do_lower_case
to train the model, and then
nlp-inference transformer_glue \
--model_path /tmp/mrpc-8bit \
--task_name mrpc \
--model_type quant_bert \
--output_dir /tmp/mrpc-8bit \
--data_dir /path/to/MRPC \
--do_lower_case \
--overwrite_output_dir \
--load_quantized_model
to run inference, but I got the same performance as when running without the --load_quantized_model flag. How can I improve the inference performance?