Add migx bert squad quant example #441
Conversation
Added additional pieces with argparse to select which version of SQuAD we want to test, the batch size, the sequence length, and some other useful settings.
…runs. Seeing a stall on larger batch sizes. Adding flags for debugging.
- Allow for mixed precision via an fp16 flag
- Allow for variable batch size
- Allow for variable calibration data size
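The flags described above could be wired up with argparse roughly as follows. This is a hypothetical sketch, not the PR's actual code; the flag names and defaults (`--version`, `--batch_size`, `--seq_length`, `--fp16`, `--cal_size`) are assumptions for illustration.

```python
import argparse

def parse_args(argv=None):
    # Sketch of the command-line knobs mentioned in this PR; names are illustrative.
    parser = argparse.ArgumentParser(
        description="MIGraphX BERT SQuAD quantization example (sketch)")
    parser.add_argument("--version", choices=["1.1", "2.0"], default="1.1",
                        help="Which SQuAD version to test against")
    parser.add_argument("--batch_size", type=int, default=1,
                        help="Inference batch size (kept constant per compile)")
    parser.add_argument("--seq_length", type=int, default=384,
                        help="Maximum input sequence length")
    parser.add_argument("--fp16", action="store_true",
                        help="Enable mixed precision via fp16")
    parser.add_argument("--cal_size", type=int, default=100,
                        help="Number of samples used for calibration")
    return parser.parse_args(argv)

args = parse_args(["--fp16", "--batch_size", "8"])
```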
Running into an issue with shapes in the calibration tools if I break up this calibration read. It needs a large amount of memory to create the histogram.
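One way to break up the calibration read is a reader that hands samples out lazily instead of holding the whole set in memory. The class below is a plain-Python stand-in that mirrors the `get_next()` protocol of onnxruntime's `CalibrationDataReader`; the class name, the chunking helper, and the sample format are assumptions for illustration, not the PR's implementation.

```python
class ChunkedCalibrationReader:
    """Sketch: serve calibration samples one at a time, and optionally in
    fixed-size chunks, so the full dataset never has to sit in memory at once."""

    def __init__(self, samples, chunk_size):
        self._samples = samples        # list of {input_name: data} dicts
        self._chunk_size = chunk_size
        self._pos = 0

    def get_next(self):
        """Return the next calibration sample, or None when exhausted
        (same contract as onnxruntime.quantization.CalibrationDataReader)."""
        if self._pos >= len(self._samples):
            return None
        sample = self._samples[self._pos]
        self._pos += 1
        return sample

    def chunks(self):
        """Yield the samples in chunk_size groups, e.g. one histogram
        collection pass per chunk."""
        for start in range(0, len(self._samples), self._chunk_size):
            yield self._samples[start:start + self._chunk_size]
```

Note the shape issue mentioned above: if each chunk produces its own histogram range, the ranges still have to be merged consistently afterwards, which is where the per-chunk approach gets tricky.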
ping @cloudhan @PeixuanZuo @ytaous
cloudhan left a comment
There are some unresolved merge conflicts leaking into your commit.
quantization/image_classification/migraphx/resnet50/e2e_migraphx_resnet_example.py
Useful if we want to use another flavor of BERT for now. TODO: need to handle/fix some of the input/output arg maps vs. the input data vs. the model inputs/outputs.
Another knob to tune/play with in perf runs. Right now just leave this at its default.
…o handle features in each example. Our MIGraphX EP requires a recompile of the model if we constantly change the input dimensions or batch size of the parameters. Without this we actually cause a slowdown on the larger batch-size runs, as we tend to go above the feature index. A workaround is to ensure the batch size stays constant as we feed data into the model under test: repeat the same sample until we have enough data to fill a full batch, then collect inference timing and accuracy results.
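The workaround described above can be sketched as follows: keep every batch exactly `batch_size` long by repeating the last sample to fill the final partial batch, so the MIGraphX EP never sees a new input shape and never recompiles. The function name and the flat-list sample representation are illustrative, not the PR's actual code.

```python
def pad_to_full_batches(samples, batch_size):
    """Split samples into batches of exactly batch_size, repeating the
    last sample to fill the final partial batch. Keeping the shape
    constant avoids a model recompile in the MIGraphX EP."""
    batches = []
    for start in range(0, len(samples), batch_size):
        batch = list(samples[start:start + batch_size])
        while len(batch) < batch_size:
            batch.append(batch[-1])  # repeat last sample to hold shape constant
        batches.append(batch)
    return batches
```

When scoring accuracy, the predictions for the repeated filler samples would need to be dropped so they don't get counted twice.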
@cloudhan, please help run the example end-to-end to see whether it works.
@tianleiwu We don't work on AMD-related EPs anymore.
cloudhan no longer works on the EP.
That's news to me. When did that occur? Is this something we need to also bring up with our other devs?
Add an MIGraphX-based quantization test to the inference examples repo.