Conversation

@TedThemistokleous
Contributor

Add an MIGraphX-based quantization test to the inference examples repo

Ted Themistokleous and others added 15 commits January 22, 2024 22:43
Added additional pieces with argparse to select which version of SQuAD we want to test, the batch size, the sequence length, and some other useful options.
…runs

Seeing stall on larger batch sizes. Adding flags for debugging.
- Allow for mixed precision via an fp16 flag
- Allow for a variable batch size
- Allow for a variable calibration data size
Running into an issue with shapes in the calibration tools if I break up this calibration read. It needs a large amount of memory to create the histogram.
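A minimal sketch of the knobs the commits above describe: a SQuAD version selector, batch size, sequence length, an fp16 flag, and a cap on how many samples are fed to the histogram-based calibrator so it does not exhaust memory. The flag names and the reader class are assumptions for illustration, not the exact script in this PR.

```python
import argparse
from onnxruntime.quantization import CalibrationDataReader

parser = argparse.ArgumentParser(description="BERT SQuAD quantization example (illustrative)")
parser.add_argument("--version", choices=["1.1", "2.0"], default="1.1",
                    help="SQuAD dataset version to test against")
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--sequence_length", type=int, default=384)
parser.add_argument("--fp16", action="store_true",
                    help="run the unquantized portions in mixed precision")
parser.add_argument("--calibration_samples", type=int, default=100,
                    help="number of samples fed to the calibrator; histogram "
                         "calibration can need a large amount of memory")
args = parser.parse_args()


class BertCalibrationReader(CalibrationDataReader):
    """Feeds a bounded number of pre-tokenized SQuAD samples to the calibrator."""

    def __init__(self, features, limit):
        # `features` is assumed to be a list of dicts keyed by model input name.
        self.data = iter(features[:limit])

    def get_next(self):
        # Return None when exhausted, as the calibrator expects.
        return next(self.data, None)
```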
@TedThemistokleous
Contributor Author

ping @cloudhan @PeixuanZuo @ytaous


@cloudhan cloudhan left a comment

Some unresolved conflicts have leaked into your commit.

@cloudhan cloudhan requested a review from tianleiwu June 12, 2024 03:22
@cloudhan

This comment was marked as resolved.

@cloudhan cloudhan requested a review from faxu June 19, 2024 13:49
@tianleiwu tianleiwu requested a review from yufenglee June 24, 2024 18:12
TedThemistokleous and others added 4 commits July 3, 2024 18:11
Useful if we want to use another flavor of BERT for now.

TODO: need to handle/fix some of the input/output argument maps versus the input data and the model inputs/outputs.
Another knob to tune and play with in perf runs. For now, just leave this at the default.
…o handle features in each example

Our MIGraphX EP requires recompiling the model if we keep changing the input dimensions or the batch size. Without this, the larger batch-size runs actually slow down, since we tend to go above the feature index.

A workaround is to keep the batch size constant as we feed data into the model under test: repeat the same sample until we have enough data to fill a full batch, then collect inference timing and accuracy results.
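A minimal sketch of that workaround, assuming a plain list of samples and names chosen for illustration: every batch is padded to the same fixed size by repeating the last sample, so the MIGraphX EP never sees a new input shape and never has to recompile.

```python
def make_fixed_size_batches(samples, batch_size):
    """Yield lists of exactly `batch_size` samples, padding the tail by repetition."""
    batch = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Repeat the last sample so the final (partial) batch keeps the same shape.
        batch.extend([batch[-1]] * (batch_size - len(batch)))
        yield batch
```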
@cloudhan cloudhan removed their request for review July 5, 2024 00:58
@TedThemistokleous TedThemistokleous force-pushed the add_migx_bert_squad_quant_example branch from feec95d to a258158 on August 9, 2024 18:42
@tianleiwu

@cloudhan, please help run the example end-to-end to see whether it works.

@cloudhan

@tianleiwu We don't work on AMD related EPs anymore.

@tianleiwu tianleiwu dismissed cloudhan’s stale review October 24, 2024 17:31

cloudhan doesn't work on the EP now.

@tianleiwu tianleiwu enabled auto-merge (squash) October 24, 2024 17:31
@TedThemistokleous
Copy link
Contributor Author

> @tianleiwu We don't work on AMD related EPs anymore.

That's news to me. When did that occur? Is this something we need to also bring up with our other devs?

@tianleiwu tianleiwu merged commit daefee3 into microsoft:main Oct 25, 2024
4 checks passed