Conversation

@TedThemistokleous
Collaborator

Description

Motivation and Context

Updated 32 log statements in the compute function from LOGS_DEFAULT(VERBOSE)
to LOGS_DEFAULT(INFO) for better visibility of compute-related operations
during inference.
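For reference, the change is just a severity bump of this shape (the message text here is illustrative, not one of the actual statements):

```cpp
// Before: only shows up when verbose logging is enabled
LOGS_DEFAULT(VERBOSE) << "MIGraphX EP: binding input parameters for compute";

// After: visible at the default INFO severity during inference
LOGS_DEFAULT(INFO) << "MIGraphX EP: binding input parameters for compute";
```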
Use this call from the previously generated bits to encapsulate model compilation and the parameters needed to ensure an MIGraphX program is properly compiled and set up.
Cleans up the compute thread and keeps it clear which inputs and pieces will be run through the MIGraphX API.
Remove this from compute so we can encapsulate things in a reasonable way.
Make this a separate call that takes the input context, program, and parameter shape/name information, so we can populate the items needed, based on the MIGraphX program, to perform a run_async later.
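Roughly the shape of that helper, sketched against the MIGraphX C++ API the EP already uses; the name `PopulateProgramParameters` and the `input_buffers` map are illustrative, not the actual signature:

```cpp
#include <string>
#include <unordered_map>

#include <migraphx/migraphx.hpp>

// Hypothetical helper: bind caller-owned input buffers to the program's
// parameter names so the compute thread only has to call run_async later.
static migraphx::program_parameters PopulateProgramParameters(
    migraphx::program& prog,
    const std::unordered_map<std::string, void*>& input_buffers) {
  migraphx::program_parameters params;
  auto param_shapes = prog.get_parameter_shapes();
  for (auto&& name : param_shapes.names()) {
    auto it = input_buffers.find(name);
    if (it != input_buffers.end()) {
      // Wrap the existing buffer in an argument matching the expected shape.
      params.add(name, migraphx::argument(param_shapes[name], it->second));
    }
  }
  return params;
}
```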

Doing this as part of cleaning up the compute function to further optimize later
Capture this in a separate call so we get an idea of how input shapes and the modes are handled during compute.
Reuse this and remove a bunch of redundant, repeated code.
Store a compiled, or preloaded-from-disk, MIGraphX program into a map indexed by batch size. Use this as the program in the compute method if an incoming batch size matches one we intended to run.

If this lookup fails, fall back to the program preloaded from disk, and if that fails, compile the model in the compute thread.
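A sketch of that lookup-with-fallback flow; the cache member and the two helper declarations are placeholders for the EP's existing load/compile paths, not real names:

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

#include <migraphx/migraphx.hpp>

// Hypothetical cache: one compiled/preloaded program per batch size.
std::unordered_map<int64_t, migraphx::program> program_cache;

// Placeholders for the EP's existing disk-preload and in-thread compile paths.
std::optional<migraphx::program> TryLoadProgramFromDisk(int64_t batch);
migraphx::program CompileProgramInComputeThread(int64_t batch);

migraphx::program& GetProgramForBatch(int64_t batch) {
  // 1. Fast path: a program for this batch size is already cached.
  if (auto it = program_cache.find(batch); it != program_cache.end()) {
    return it->second;
  }
  // 2. Fall back to a program preloaded from disk, if one exists.
  if (auto loaded = TryLoadProgramFromDisk(batch)) {
    return program_cache.emplace(batch, std::move(*loaded)).first->second;
  }
  // 3. Last resort: compile the model inside the compute thread.
  return program_cache.emplace(batch, CompileProgramInComputeThread(batch)).first->second;
}
```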
@TedThemistokleous self-assigned this Jan 3, 2026
- Tighten the lock around run_async
- Remove the O(n) lookup with find and use an unordered_set instead
- Use optional to help tighten up the lock
Should improve runtime from O(N²) to O(N) for running through the outputs and checking them, as sketched below
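Illustrative shape of the change (the struct and function names are made up for the sketch): the lock only covers the shared-state snapshot, output membership checks go through an unordered_set, and an optional carries the result out of the critical section.

```cpp
#include <mutex>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

struct RunState {
  std::mutex mtx;
  std::unordered_set<std::string> graph_outputs;  // O(1) membership vs. a linear find
};

void RunOnce(RunState& state, const std::vector<std::string>& requested_outputs) {
  std::optional<std::vector<std::string>> outputs_to_bind;
  {
    // Hold the lock only long enough to snapshot what the async run needs.
    std::lock_guard<std::mutex> lock(state.mtx);
    std::vector<std::string> names;
    for (const auto& name : requested_outputs) {
      if (state.graph_outputs.count(name) != 0) {
        names.push_back(name);
      }
    }
    outputs_to_bind = std::move(names);
  }
  // run_async and output binding happen here, outside the critical section,
  // using *outputs_to_bind.
}
```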
Do this so that we can manage updates between inferences for things like dynamic batch or sequence length more effectively. Right now we coarsely recompile based on any mismatch and always check input shapes. In these cases, instead of checking all N inputs, we should check the symbolic dimensions for updates.
Move this out into a separate call to ensure we're tracking whether we get a dynamic batch size, as well as other symbolic dimensions in the model that we detect at compile time.
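A sketch of the per-inference check under those assumptions (types and names are hypothetical): record the symbolic dimensions once at compile, then only diff those between runs instead of comparing all N input shapes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Recorded once at compile time: which input carries a symbolic dim
// (batch, sequence length, ...) and which dimension index it occupies.
struct SymbolicDimInfo {
  std::string input_name;
  std::size_t dim_index;
};

// Returns true when any tracked symbolic dimension changed since the last run.
bool SymbolicDimsChanged(
    const std::vector<SymbolicDimInfo>& tracked,
    const std::unordered_map<std::string, std::vector<int64_t>>& current_shapes,
    std::unordered_map<std::string, std::vector<int64_t>>& last_shapes) {
  bool changed = false;
  for (const auto& dim : tracked) {
    const auto& cur = current_shapes.at(dim.input_name);
    auto& last = last_shapes[dim.input_name];
    if (last.size() != cur.size() || last[dim.dim_index] != cur[dim.dim_index]) {
      changed = true;
      last = cur;  // remember the new shape for the next inference
    }
  }
  return changed;
}
```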
This is used everywhere in the EP and should be encapsulated to ensure we're consistent across all our compiles and load attempts
…tions

Clean up the compile() call so it's clear we're just setting up an initial set of options for a model before we perform the first inference in compute. Use this to link program input names to initial shapes, regardless of whether they're dynamic or static.
Used to get initial info about the model before we pass things to the compute thread.
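Roughly what that initial linking could look like, sketched against the MIGraphX C++ API (the helper name is hypothetical):

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

#include <migraphx/migraphx.hpp>

// Hypothetical helper: capture each program input's name and its shape at
// compile time so compute can later diff against it, whether the original
// ONNX dims were static or dynamic.
std::unordered_map<std::string, std::vector<std::size_t>>
CollectInitialInputShapes(migraphx::program& prog) {
  std::unordered_map<std::string, std::vector<std::size_t>> initial_shapes;
  auto param_shapes = prog.get_parameter_shapes();
  for (auto&& name : param_shapes.names()) {
    initial_shapes[name] = param_shapes[name].lengths();
  }
  return initial_shapes;
}
```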
Make things const and don't look up the same map entry three times.
Further reduces API overhead and gets us closer to performance parity with the migraphx driver.
Don't need the overhead of a mutex, as in most cases there will be multiple instances of the run_async call.
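Shape of the lookup cleanup (names invented for the sketch): a single find() whose result is reused through a const reference, rather than hashing the same key several times.

```cpp
#include <string>
#include <unordered_map>

struct SessionState {  // stand-in for the EP's per-model state
  int placeholder;
};

void UseEntryOnce(const std::unordered_map<std::string, SessionState>& sessions,
                  const std::string& key) {
  const auto it = sessions.find(key);  // single lookup instead of count()/at()/at()
  if (it == sessions.end()) {
    return;
  }
  const SessionState& state = it->second;
  (void)state;  // the rest of the function works off this const reference
}
```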
…compute func

Ensure we can maintain each run more effectively and cleanly. Allows us to add to or change these if needed.
Refactor what's needed here so that we can pass in the updated input parameter shapes and update the compile options.