Conversation

**@natke** (Contributor) commented on Jan 2, 2026

Description

Update the API docs with the latest code changes

Copilot AI left a comment


Pull request overview

This PR updates the GenAI API documentation to reflect the latest code changes across multiple programming language bindings. The updates include new features, API refinements, and expanded configuration options for the ONNX Runtime GenAI library.

Key Changes:

- Added new configuration options for multi-modal models (image/video token IDs, vision processing parameters)
- Introduced batching engine support (dynamic and static batching configurations)
- Expanded API surface with new classes for Request/Engine-based generation patterns (see the sketch after this list)
- Enhanced tokenizer capabilities with new methods for token ID retrieval and option updates
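For orientation, the Request/Engine generation pattern referred to above typically looks something like the sketch below. This is an illustration only: apart from `Model` and `GeneratorParams`, the class and method names here (`Engine`, `Request`, `add_request`, `step`, `has_pending_requests`) are assumptions, not the confirmed binding surface; the updated `docs/genai/api/python.md` in this PR is the authoritative reference.

```python
# Hypothetical sketch of Request/Engine-based generation with dynamic batching.
# Engine/Request names and their methods are assumed for illustration; see the
# updated docs/genai/api/python.md for the actual API.
import onnxruntime_genai as og

model = og.Model("path/to/model")
engine = og.Engine(model)              # assumed: the engine owns the batching loop

params = og.GeneratorParams(model)
request = og.Request(params)           # assumed: one request per prompt/sequence
engine.add_request(request)            # assumed: enqueue for batched decoding

while engine.has_pending_requests():   # assumed polling-style loop
    engine.step()                      # assumed: runs one batched decode step
```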

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
|------|-------------|
| src/routes/onnx/+page.svelte | Removed deprecated model types link reference from UI |
| docs/genai/reference/config.md | Added engine/batching sections, multi-modal token IDs, session/run options, and hardware device filtering |
| docs/genai/api/python.md | Added Request/Engine classes, updated GeneratorParams, enhanced Config with model data methods, and tokenizer improvements |
| docs/genai/api/java.md | Restructured Model constructors, added close() methods, expanded Tokenizer/Generator APIs, and documented multi-modal processors |
| docs/genai/api/csharp.md | Added hardware device filtering to Config, updated Generator/Tokenizer APIs, documented chat client and multi-modal support |
| docs/genai/api/cpp.md | Added model data management, hardware device options, Request/Engine classes, and enhanced processor batch support |
| docs/genai/api/c.md | Comprehensive updates including Request/Engine API, enhanced tokenizer functions, multi-modal processing, and execution provider registration |


Path to the vision adapter file.

- **spatial_merge_size**: *(int, optional)*
Patch merge size used by some models (for example, Qwen2.5-VL). Defaults to 2.

Copilot AI Jan 5, 2026


The example reference "for example, Qwen2.5-VL" should use consistent spacing. The version number format "2.5" appears to use a period separator which is standard, but verify this matches the official model name capitalization.

Suggested change:

```diff
- Patch merge size used by some models (for example, Qwen2.5-VL). Defaults to 2.
+ Patch merge size used by some models (such as Qwen2.5-VL). Defaults to 2.
```

Name of the image features output tensor.

- **pipeline**: *(array, optional)*
Ordered list of sub-models for vision pipelines (for example, patch embedding, attention, merge).

Copilot AI Jan 5, 2026


The description states "for example, patch embedding, attention, merge" but should have consistent list formatting. Consider using "e.g., patch embedding, attention, and merge" for better readability.

Suggested change:

```diff
- Ordered list of sub-models for vision pipelines (for example, patch embedding, attention, merge).
+ Ordered list of sub-models for vision pipelines (e.g., patch embedding, attention, and merge).
```

- **outputs**: *(array of string)*
Graph output names.
- **run_on_cpu**: *(bool, optional)*
If true, forces CPU EP when multiple EPs are configured.

Copilot AI Jan 5, 2026


The description "If true, forces CPU EP when multiple EPs are configured" uses inconsistent abbreviations. Consider expanding "EP" to "execution provider" on first use or consistently using the full term for clarity, especially since this is user-facing documentation.

Suggested change:

```diff
- If true, forces CPU EP when multiple EPs are configured.
+ If true, forces use of the CPU execution provider when multiple execution providers are configured.
```

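Taken together, the three config.md excerpts above describe fields of the vision pipeline configuration. For orientation, they presumably combine into something like the fragment below, written here as a Python dict so the assumptions can be annotated inline; the exact key names, nesting, and defaults are not confirmed by this page, and `docs/genai/reference/config.md` in this PR is the authoritative reference.

```python
# Illustrative only: approximate shape of the vision-pipeline settings
# reviewed above. Key names, nesting, and the sub-model entry are
# assumptions, not the documented genai_config.json schema.
vision_config = {
    "adapter_filename": "vision_adapter.onnx",  # assumed key: path to the vision adapter file
    "spatial_merge_size": 2,                    # patch merge size (e.g., Qwen2.5-VL); defaults to 2
    "pipeline": [                               # ordered list of sub-models
        {
            "filename": "patch_embedding.onnx", # hypothetical sub-model entry
            "outputs": ["image_features"],      # graph output names
            "run_on_cpu": False,                # if True, force the CPU execution provider
        },
    ],
}
```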

```diff
- `apply_chat_template(template_str: str, messages: str, tools: str = None, add_generation_prompt: bool = False) -> str`
+ `apply_chat_template(messages: str, template_str: str | None = None, tools: str | None = None, add_generation_prompt: bool = True) -> str`
```

Copilot AI Jan 5, 2026


The parameter order in the function signature has changed from the old version. The old signature had "template_str" as the first parameter, but the new signature has "messages" first and "template_str" as optional second parameter. Ensure this change is intentional and matches the actual API implementation.


```python
chat = tokenizer.apply_chat_template("{user}: {message}", messages="Hi!", add_generation_prompt=True)
chat = tokenizer.apply_chat_template(messages="Hi!", template_str="{user}: {message}", add_generation_prompt=True)
```

Copilot AI Jan 5, 2026


The example usage still shows the old parameter order with "template_str" first: tokenizer.apply_chat_template(messages="Hi!", template_str="{user}: {message}", add_generation_prompt=True). However, if the new signature has "messages" as first parameter and "template_str" as optional second parameter, verify this example is consistent with the signature shown on line 299.


```python
result = processor(prompt="Describe this image", images=onnxruntime_genai.Images.open("image.png"))
batched = processor(["Describe first", "Describe second"], images=onnxruntime_genai.Images.open("image1.png", "image2.png"))
```

Copilot AI Jan 5, 2026


The example shows batched = processor(["Describe first", "Describe second"], images=onnxruntime_genai.Images.open("image1.png", "image2.png")) but Images.open() is documented as a static load method in other parts of the docs. Verify that Images.open() supports multiple file paths as arguments, or if it should be Images.open(["image1.png", "image2.png"]) with a list.

Suggested change:

```diff
- batched = processor(["Describe first", "Describe second"], images=onnxruntime_genai.Images.open("image1.png", "image2.png"))
+ batched = processor(["Describe first", "Describe second"], images=onnxruntime_genai.Images.open(["image1.png", "image2.png"]))
```


### OgaShutdown

Cleanly shutdown the genai library and its ONNX Runtime usage on process exit.

Copilot AI Jan 5, 2026


The description states "Cleanly shutdown the genai library" but should be "Cleans up and shuts down the GenAI library" or "Cleanly shuts down the GenAI library" for proper grammar. "shutdown" as a verb should be two words "shut down".

Suggested change:

```diff
- Cleanly shutdown the genai library and its ONNX Runtime usage on process exit.
+ Cleanly shuts down the GenAI library and its ONNX Runtime usage on process exit.
```
