Enable 16-bit activations in Cadence Quantizer For fully_connected and linear #15010
Summary:

Context

We currently support only 8-bit activations for most operators. We would like to add generic op support for 16-bit activations, for the following ops: `quantized_linear` and `quantized_fully_connected`.
This Diff

Here, we add support for `quantized_linear` and `quantized_fully_connected`. We need to do the following:

- Update `quantized_fully_connected_out.cpp` and `quantized_linear_out.cpp` to handle 16-bit activations (the shared numerics are sketched below).
- Update `ref_implementations.py`, so tests can run with 16-bit activations to validate that the quantization is correct.
- Add a new quantizer (`CadenceWith16BitLinearActivationsQuantizer`) to check that this works, and create a unit test (a spec sketch and a test sketch follow the numerics sketch below).
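As a concrete illustration of the numerics that the out-variant kernels and `ref_implementations.py` have to agree on, here is a minimal Python sketch of a quantized linear with 16-bit activations, 8-bit weights, and int32 bias. The parameter names (`in_zero_point`, `out_multiplier`, `out_shift`, ...) follow the common fixed-point requantization convention and are assumptions, not the exact interface in this diff.

```python
import torch

def quantized_linear_ref(
    src: torch.Tensor,       # int16 activations, shape (..., in_features)
    weight: torch.Tensor,    # int8 weights, shape (out_features, in_features)
    bias: torch.Tensor,      # int32 bias, shape (out_features,)
    in_zero_point: int,
    weight_zero_point: int,
    out_multiplier: int,     # Q31 fixed-point requantization multiplier
    out_shift: int,
    out_zero_point: int,
) -> torch.Tensor:
    # Widen to int64 so the matmul accumulates without overflow.
    acc = (src.to(torch.int64) - in_zero_point) @ (
        weight.to(torch.int64) - weight_zero_point
    ).T + bias.to(torch.int64)
    # Requantize: scale by out_multiplier / 2**31 and 2**out_shift,
    # round, add the output zero point, then saturate to int16.
    # (Shift-sign conventions vary; positive out_shift is a left shift here.)
    scaled = torch.round(acc.double() * (out_multiplier / (1 << 31)) * 2.0**out_shift)
    info = torch.iinfo(torch.int16)
    return (scaled + out_zero_point).clamp(info.min, info.max).to(torch.int16)
```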
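On the quantizer side, the key change relative to the existing 8-bit configs is the activation `QuantizationSpec`. Below is a hedged sketch using the PT2E `QuantizationSpec` API; how `CadenceWith16BitLinearActivationsQuantizer` actually wires a spec like this into the Cadence quantizer classes may differ, and the observer choice here is illustrative only.

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

# int16 activation spec; weights can stay int8 as in the existing configs.
act_qspec = QuantizationSpec(
    dtype=torch.int16,
    quant_min=torch.iinfo(torch.int16).min,  # -32768
    quant_max=torch.iinfo(torch.int16).max,  # 32767
    qscheme=torch.per_tensor_affine,
    observer_or_fake_quant_ctr=MinMaxObserver,
)
```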
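Finally, a minimal sketch of the kind of unit test mentioned above, reusing the hypothetical `quantized_linear_ref` from the first sketch: quantize a float linear to 16-bit activations and 8-bit weights, run the reference, and check the dequantized output against the integer-exact expectation. All helper names are local to this sketch.

```python
import torch

def test_quantized_linear_16bit() -> None:
    torch.manual_seed(0)
    x, w, b = torch.randn(4, 8), torch.randn(6, 8), torch.randn(6)

    # Symmetric per-tensor scales: int16 activations, int8 weights.
    x_scale = x.abs().max().item() / 32767
    w_scale = w.abs().max().item() / 127
    x_q = torch.round(x / x_scale).to(torch.int16)
    w_q = torch.round(w / w_scale).to(torch.int8)
    b_q = torch.round(b / (x_scale * w_scale)).to(torch.int32)

    # Output scale with a little headroom so quantization noise in the
    # operands cannot push the result into int16 saturation.
    out_fp = x @ w.T + b
    out_scale = 1.05 * out_fp.abs().max().item() / 32767

    # Fold the three scales into a single Q31 requantization multiplier.
    ratio = x_scale * w_scale / out_scale
    out_q = quantized_linear_ref(
        x_q, w_q, b_q,
        in_zero_point=0, weight_zero_point=0,
        out_multiplier=round(ratio * (1 << 31)), out_shift=0,
        out_zero_point=0,
    )

    # The dequantized result should match the value implied by the
    # quantized operands to within one output quantization step.
    expected = (
        x_q.to(torch.float64) @ w_q.to(torch.float64).T + b_q.to(torch.float64)
    ) * (x_scale * w_scale)
    assert (out_q.to(torch.float64) * out_scale - expected).abs().max() <= out_scale
```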
Differential Revision: D84284794