Is your feature request related to a problem? If so, please describe.
There are various parameters a user can tune when deploying a model on the runtime, e.g., TRANSFORMERS_CACHE, FLASH_ATTENTION (true/false), DEPLOYMENT_FRAMEWORK (tgis_native, hf_accelerate, etc.), NUM_GPU, and so on. Is there a place where the list of supported parameters and their usage is documented?
Describe your proposed solution
It would be nice to have a way to list all the supported parameters, with a description of their accepted values and usage.
Maybe a README file in the repo could address this.
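To illustrate the kind of listing I have in mind, here is a minimal sketch. It is not taken from the runtime's code: `ParamSpec`, `PARAMS`, and `render_table` are hypothetical names, and the registry only contains the parameters mentioned above, with accepted values filled in only where this issue states them. Descriptions are left as placeholders because the actual semantics are exactly what is undocumented.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ParamSpec:
    """One tunable parameter (hypothetical structure, for illustration only)."""
    name: str
    accepted: Optional[Tuple[str, ...]]  # None means "not enumerated in this issue"
    description: str


# Hypothetical registry: only the parameters named in this issue.
# Descriptions are placeholders, not statements about actual behavior.
PARAMS: Tuple[ParamSpec, ...] = (
    ParamSpec("TRANSFORMERS_CACHE", None, "TODO: document"),
    ParamSpec("FLASH_ATTENTION", ("true", "false"), "TODO: document"),
    # The issue lists these two values followed by "etc.", so this is incomplete.
    ParamSpec("DEPLOYMENT_FRAMEWORK", ("tgis_native", "hf_accelerate"), "TODO: document"),
    ParamSpec("NUM_GPU", None, "TODO: document"),
)


def render_table(
    params: Sequence[ParamSpec] = PARAMS,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Render the registry as a Markdown table, including the current env value."""
    lines = [
        "| Parameter | Accepted values | Current value | Description |",
        "| --- | --- | --- | --- |",
    ]
    for p in params:
        accepted = ", ".join(p.accepted) if p.accepted else "(not documented)"
        current = env.get(p.name, "(unset)")
        lines.append(f"| {p.name} | {accepted} | {current} | {p.description} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table())
```

A table like this could be generated into the README, or printed by the runtime at startup, so the documentation and the supported parameters stay in sync.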