Save and load models trained with QuantizationAwareTraining() #16357
Unanswered · w2ex asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Hi,
I started playing with the QuantizationAwareTraining() callback, which seems quite useful for deploying lightweight versions of our models on CPU. However, I don't fully understand how I am supposed to use it.
At the moment, I run my pl.Trainer with the ModelCheckpoint() callback to save the best checkpoints, together with QuantizationAwareTraining(). Once training is complete, I usually load the best checkpoint saved by ModelCheckpoint() in order to export it with ONNX.
When I do this, I get an error if I use model.load_from_checkpoint(checkpoint_path=...), because the checkpoint also saves the fake_quant layers:
Unexpected key(s) in state_dict: "input_layer.0.weight_fake_quant.fake_quant_enabled"...
So I use the strict=False flag to ignore these layers, export the non-quantized model to ONNX, and then quantize it with onnxruntime.quantization.quantize_static(...).
Now, that seems sub-optimal and not the proper way to do it: it looks like I lose all the histogram values saved in the checkpoint that could be useful for the conversion. The on_fit_end() hook seems to quantize the model when training stops, but I don't seem to benefit from that (is it because I interrupt the Trainer before all the epochs complete, so the hook is never called?).
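To make the mismatch concrete, here is a minimal sketch of why strict=False is needed. The two-layer model and the injected checkpoint key are hypothetical stand-ins, not the actual architecture from my project:

```python
import torch
import torch.nn as nn

# Hypothetical float model standing in for the LightningModule's network
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Simulate a QAT checkpoint: the float weights plus an extra
# fake_quant buffer like the one named in the error message
ckpt = dict(model.state_dict())
ckpt["0.weight_fake_quant.fake_quant_enabled"] = torch.tensor([1])

# strict=True would raise "Unexpected key(s) in state_dict: ...";
# strict=False loads the matching weights and reports the leftovers
result = model.load_state_dict(ckpt, strict=False)
print(result.unexpected_keys)  # ['0.weight_fake_quant.fake_quant_enabled']
```

This loads the float weights fine, but silently drops every fake_quant buffer, which is exactly the calibration information I would like to keep.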
If I wanted to save the full model, with the fake-quantization layers, so that I can run it on GPUs (and evaluate it on large datasets while getting an idea of the real performance of the quantized model), should I call _prepare_model() from QuantizationAwareTraining() on the model before loading the checkpoint?
I don't know if all of this is clear, but to make it simpler: when training a model with QuantizationAwareTraining(), how can I save and load the model with the fake_quant layers so that I can evaluate it on GPU?
Thank you very much :)