
Commit c1cab7a

Update main README for 0.9.0 release (#1660)
### Description

This PR updates the repo's main README about the new way to set inputs for non-LLMs.

### Motivation and Context

This will make users aware of an upcoming breaking change for the next release (0.9.0).
1 parent bfa9e11 commit c1cab7a

File tree: 1 file changed (+1, −7 lines)


README.md

Lines changed: 1 addition & 7 deletions
@@ -1,12 +1,6 @@
 # ONNX Runtime GenAI
 
-Note: between release candidate 0.7.0-rc2 and release 0.7.0 there is a breaking Python API change in `tokenizer.encode(prompt)`. Previously this method returned a Python list; it now returns a numpy array. When concatenating the tokens generated by two prompts to pass to `append_tokens`, e.g. a system prompt and a user prompt, you must use the following instead of `system_prompt + input_tokens`:
-
-```python
-system_tokens = tokenizer.encode(system_prompt)
-input_tokens = tokenizer.encode(prompt)
-generator.append_tokens(np.concatenate([system_tokens, input_tokens]))
-```
+Note: between the 0.8.3 release and 0.9.0, there is a breaking API change. Previously, inputs for non-LLMs were set with `params.SetInputs(inputs)`; now they are set with `generator.SetInputs(inputs)`. With this change, inputs for all models and modalities are set on the `generator` instead of the `generatorParams`: LLM inputs via `generator.append_tokens(tokens)` and non-LLM inputs via `generator.SetInputs(inputs)`.
 
 [![Latest version](https://img.shields.io/nuget/vpre/Microsoft.ML.OnnxRuntimeGenAI.Managed?label=latest)](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntimeGenAI.Managed/absoluteLatest)
 
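The rationale behind the removed 0.7.0 note can be checked with plain numpy: once token ids are numpy arrays rather than Python lists, `+` no longer concatenates them. A minimal sketch with hand-made token arrays standing in for `tokenizer.encode` output (not a real tokenizer):

```python
import numpy as np

# Pretend outputs of tokenizer.encode for a system prompt and a user prompt.
system_tokens = np.array([101, 7592], dtype=np.int64)
input_tokens = np.array([2054, 2003, 102], dtype=np.int64)

# With Python lists, + concatenated the two sequences. With numpy arrays,
# + attempts element-wise addition (and raises ValueError here because the
# shapes (2,) and (3,) do not broadcast), so concatenate explicitly:
combined = np.concatenate([system_tokens, input_tokens])
print(combined.tolist())  # [101, 7592, 2054, 2003, 102]
```

`np.concatenate` works the same whether the encoder returns lists or arrays, so it is the safe choice across both API versions.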

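The input-setting move described in the new 0.9.0 note can be illustrated with stand-in classes. These are hand-made stubs, not the real onnxruntime-genai API (method names here use Python-style `set_inputs`; the actual binding's names and signatures may differ):

```python
import numpy as np

class GeneratorParams:
    """Stub for the pre-0.9.0 params object (illustrative only)."""
    def __init__(self):
        self.inputs = None

class Generator:
    """Stub for the 0.9.0 generator object (illustrative only)."""
    def __init__(self, params):
        self.params = params
        self.inputs = None
        self.tokens = np.empty(0, dtype=np.int64)

    def set_inputs(self, inputs):
        # New location: non-LLM inputs are set on the generator itself,
        # not on the params object as before 0.9.0.
        self.inputs = inputs

    def append_tokens(self, tokens):
        # LLM inputs: token ids appended to the generator.
        self.tokens = np.concatenate(
            [self.tokens, np.asarray(tokens, dtype=np.int64)]
        )

params = GeneratorParams()
generator = Generator(params)

# 0.9.0 style: inputs for every modality go through the generator.
generator.set_inputs({"pixel_values": np.zeros((1, 3, 2, 2), dtype=np.float32)})
generator.append_tokens([101, 2054, 102])
```

The design consolidates all per-request state on the generator, so the params object only carries configuration that is fixed before generation begins.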
0 commit comments
