Open
Description
Currently, jit_compile is set based on whether a base model is being used, because historically most base models were not compatible with the XLA capabilities of current-generation CPUs. This logic is no longer correct because, for example, the text embedding used in the generative model is compatible with XLA, as long as tokenization is decoupled from the model.
We need to keep the API's defaults backward compatible for now, but add an optional parameter, jit_compile, that does nothing when False. When True, it overrides the default logic that currently chooses jit_compile based on whether a base model / embedding model is used.
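A minimal sketch of how the proposed parameter could resolve, for discussion only. The function name resolve_jit_compile and the uses_base_model flag are hypothetical, not names from the existing code. It also assumes the legacy default disables JIT compilation when a base model is used, which is how the historical reasoning above reads.

```python
def resolve_jit_compile(uses_base_model: bool, jit_compile: bool = False) -> bool:
    """Decide the effective jit_compile value.

    Hypothetical helper: the names and the legacy rule are assumptions,
    not taken from the actual codebase.
    """
    if jit_compile:
        # Explicit opt-in: override the legacy logic and enable XLA.
        return True
    # jit_compile=False does nothing, so fall back to the legacy default.
    # Assumed legacy rule: base models were treated as XLA-incompatible.
    return not uses_base_model
```

With this shape, existing callers that never pass jit_compile get the same behavior as before, and callers whose base model or embedding is known to be XLA compatible can opt in with jit_compile=True.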