Explicit EP Download & Per-EP Progress Reporting #568
…o baijumeswani/explicit-ep-download
|
Pull request overview
This PR makes execution-provider (EP) discovery/download explicit (instead of implicitly happening during catalog access) and adds per-EP progress reporting across the SDKs. It also removes the “6 hour TTL” catalog refresh behavior in favor of always refreshing model lists (delegating caching decisions to the native core), and updates docs/samples accordingly.
Changes:
- Add typed EP discovery + download/register APIs across Rust/JS/Python/C# (including progress callbacks/streaming).
- Remove implicit catalog refresh TTL logic (and Rust cache invalidator wiring) so catalogs refresh on each query/update.
- Update READMEs, generated docs, and samples to use explicit EP management.
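To make the catalog change concrete, here is a minimal Python sketch contrasting a TTL-guarded refresh with the always-refresh behavior. The class names, the `fetch_models` parameter, and the injected clock are hypothetical and exist only for illustration; the SDKs' actual catalog code is not shown in this review.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Optional

TTL = timedelta(hours=6)  # the "6 hour TTL" mentioned in the overview


class TtlCatalog:
    """Old behavior (sketch): skip the refresh while the cached list is fresh."""

    def __init__(self, fetch_models: Callable[[], List[str]],
                 now: Callable[[], datetime] = datetime.now) -> None:
        self._fetch = fetch_models
        self._now = now
        self._models: List[str] = []
        self._last_fetch: Optional[datetime] = None

    def list_models(self) -> List[str]:
        stale = self._last_fetch is None or self._now() - self._last_fetch > TTL
        if stale:
            self._models = self._fetch()
            self._last_fetch = self._now()
        return self._models


class AlwaysRefreshCatalog:
    """New behavior (sketch): every query asks the core, which owns caching."""

    def __init__(self, fetch_models: Callable[[], List[str]]) -> None:
        self._fetch = fetch_models

    def list_models(self) -> List[str]:
        return self._fetch()
```

With the TTL guard, two back-to-back queries trigger one fetch; without it, each query triggers its own fetch and any caching is left to whatever sits behind `fetch_models`.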
Reviewed changes
Copilot reviewed 54 out of 54 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| www/src/routes/models/service.ts | Removes DirectML EP from the GPU EP list used for model discovery calls. |
| sdk/rust/src/types.rs | Adds EpInfo and EpDownloadResult Rust types with PascalCase serde mapping. |
| sdk/rust/src/model_variant.rs | Removes catalog cache invalidation signaling from variant operations. |
| sdk/rust/src/lib.rs | Re-exports EpInfo / EpDownloadResult from the crate root. |
| sdk/rust/src/foundry_local_manager.rs | Adds Rust EP discovery/download APIs, including streaming per-EP progress callback parsing. |
| sdk/rust/src/catalog.rs | Removes TTL + invalidation logic; update_models() now always refreshes behind a gate. |
| sdk/rust/README.md | Documents explicit EP management + progress callbacks for Rust. |
| sdk/python/test/test_foundry_local_manager.py | Adds unit tests for discover_eps() and download_and_register_eps() parsing/behavior. |
| sdk/python/src/foundry_local_manager.py | Introduces Python discover_eps() and download_and_register_eps() (optional streaming progress callback). |
| sdk/python/src/ep_types.py | Adds Python dataclasses EpInfo and EpDownloadResult. |
| sdk/python/src/catalog.py | Removes 6-hour refresh guard; updates docstring to “refreshes on each query call”. |
| sdk/python/src/__init__.py | Re-exports EpInfo / EpDownloadResult in the Python package API surface. |
| sdk/python/README.md | Updates backend matrix + documents explicit EP management and progress callback usage. |
| sdk/python/examples/chat_completion.py | Updates example to explicitly discover/download EPs before listing/using models. |
| sdk/js/test/foundryLocalManager.test.ts | Adds tests ensuring correct command/params wiring for EP downloads. |
| sdk/js/src/types.ts | Adds EpInfo / EpDownloadResult TypeScript interfaces. |
| sdk/js/src/foundryLocalManager.ts | Adds JS EP discovery, blocking download+register, and streaming progress download APIs. |
| sdk/js/src/catalog.ts | Removes the 6-hour TTL guard so model list is refreshed every query. |
| sdk/js/README.md | Documents explicit EP management and progress callback usage in JS. |
| sdk/js/examples/chat-completion.ts | Updates example to discover/download EPs and uses a concrete model alias. |
| sdk/js/docs/README.md | Updates generated TypeDoc output for new EP types. |
| sdk/js/docs/classes/FoundryLocalManager.md | Updates generated TypeDoc output for new EP methods. |
| sdk/cs/src/FoundryLocalManager.cs | Adds DiscoverEps() and DownloadAndRegisterEpsAsync APIs (with and without progress callback). |
| sdk/cs/src/EpInfo.cs | Adds C# EpInfo and EpDownloadResult record types with JSON property mapping. |
| sdk/cs/src/Detail/JsonSerializationContext.cs | Registers EP types for source-generated JSON serialization (AOT-friendly). |
| sdk/cs/src/Catalog.cs | Removes 6-hour TTL caching from catalog refresh behavior. |
| sdk/cs/README.md | Documents the new explicit EP management APIs and progress callback overload. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.runtime.md | Generated doc updates (record-related members appearing). |
| sdk/cs/docs/api/microsoft.ai.foundry.local.prompttemplate.md | Generated doc updates (record-related members appearing). |
| sdk/cs/docs/api/microsoft.ai.foundry.local.parameter.md | Generated doc updates (record-related members appearing). |
| sdk/cs/docs/api/microsoft.ai.foundry.local.openaichatclient.md | Generated doc updates reflecting additional overload documentation. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.openaiaudioclient.md | Generated doc updates adding live transcription session API entry. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.openai.responseformatextended.md | New generated docs for ResponseFormatExtended. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.openai.liveaudiotranscriptionsession.md | New generated docs for live transcription session type. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.openai.liveaudiotranscriptionresponse.md | New generated docs for live transcription response type. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.modelsettings.md | Generated doc updates (record-related members appearing). |
| sdk/cs/docs/api/microsoft.ai.foundry.local.modelinfo.md | Generated doc updates (new documented properties/record members). |
| sdk/cs/docs/api/microsoft.ai.foundry.local.foundrylocalmanager.md | Generated docs updated for new EP APIs and updated catalog remarks. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.foundrylocalexception.md | Generated docs updated (serialization event section appears). |
| sdk/cs/docs/api/microsoft.ai.foundry.local.epinfo.md | New generated docs for EpInfo. |
| sdk/cs/docs/api/microsoft.ai.foundry.local.epdownloadresult.md | New generated docs for EpDownloadResult. |
| sdk/cs/docs/api/index.md | Generated docs index updated to include EP + OpenAI live transcription types. |
| samples/js/native-chat-completions/app.js | Sample updated to discover/download EPs with per-EP progress before using models. |
| samples/cs/GettingStarted/src/ToolCallingFoundryLocalWebServer/Program.cs | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| samples/cs/GettingStarted/src/ToolCallingFoundryLocalSdk/Program.cs | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| samples/cs/GettingStarted/src/ModelManagementExample/Program.cs | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| samples/cs/GettingStarted/src/HelloFoundryLocalSdk/Program.cs | Sample updated to show EP discovery and per-EP progress display while downloading/registering EPs. |
| samples/cs/GettingStarted/src/FoundryLocalWebServer/Program.cs | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| samples/cs/GettingStarted/src/AudioTranscriptionExample/Program.cs | Sample updated to use explicit EP download/register; sets audio language setting. |
Comments suppressed due to low confidence (2)
sdk/js/examples/chat-completion.ts:58
- Hard-coding a specific model alias in an example makes it fail for users who don’t have that exact model available. Prefer keeping the placeholder (e.g. MODEL_ALIAS) or selecting from models[0] / prompting the user, and avoid the redundant if (!modelToLoad) check since Catalog.getModel() throws when not found.
samples/js/native-chat-completions/app.js:40
- This sample hard-codes modelAlias = 'qwen2.5-0.5b', which can make the sample fail in fresh environments where that model isn’t present. Consider either leaving a placeholder, selecting an available model from the catalog output, or making the alias configurable via an env var/CLI arg.
// Get the model object
const modelAlias = 'qwen2.5-0.5b'; // Using an available model from the list above
const model = await manager.catalog.getModel(modelAlias);
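One way to act on this suggestion, sketched in Python for brevity: read the alias from an environment variable and fall back to the first alias the catalog reports. The `MODEL_ALIAS` variable name and the `pick_model_alias` helper are assumptions made for this example and are not part of the sample under review.

```python
import os
from typing import Mapping, Optional, Sequence


def pick_model_alias(available: Sequence[str],
                     env: Optional[Mapping[str, str]] = None) -> str:
    """Return the alias to load: env override first, else the first available."""
    env = os.environ if env is None else env
    requested = env.get("MODEL_ALIAS")  # hypothetical variable name
    if requested:
        if requested not in available:
            raise ValueError(
                f"Model alias {requested!r} is not in the catalog: {list(available)}"
            )
        return requested
    if not available:
        raise ValueError("The catalog returned no models")
    return available[0]
```

Failing early with the list of available aliases gives users in a fresh environment something actionable, instead of an error raised later when the model is loaded.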
17db553 to 8ed6595
…o baijumeswani/explicit-ep-download
- C#: Add DownloadAndRegisterEpsAsync overload with Action<string, double> progressCallback
  - Parse wire format with CultureInfo.InvariantCulture for locale safety
  - Use ExecuteCommandWithCallbackAsync for streaming path
- JS: Add downloadAndRegisterEpsWithProgress(names?, progressCallback?) returning Promise<void>
  - Uses executeCommandStreaming for real-time progress
- Python: Add progress_callback parameter to download_and_register_eps
  - Uses execute_command_with_callback for streaming path
- Rust: Add download_and_register_eps_with_progress with typed FnMut(&str, f64) callback
  - Parses wire format inside SDK, exposes typed callback to consumers
- C# sample: Replace spinner with per-EP progress display (discover -> aligned columns -> \r updates)
- JS sample: Add EP discovery and per-EP progress display block
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
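The commit message does not spell out the wire format, so the sketch below assumes a hypothetical `name|percent` line per update, purely for illustration. It shows the general idea of parsing inside the SDK and handing consumers a typed `(name, percent)` callback. Python's `float()` does not consult the process locale, which is the same property the C# change gets from `CultureInfo.InvariantCulture`.

```python
from typing import Callable, Optional, Tuple

ProgressCallback = Callable[[str, float], None]


def parse_progress_line(line: str) -> Optional[Tuple[str, float]]:
    """Parse one 'name|percent' update (assumed format); None if malformed."""
    name, sep, raw_percent = line.strip().rpartition("|")
    if not sep or not name:
        return None
    try:
        percent = float(raw_percent)  # locale-independent: always expects '.'
    except ValueError:
        return None
    return name, percent


def make_raw_handler(on_progress: ProgressCallback) -> Callable[[str], None]:
    """Wrap a typed callback so it can consume raw wire-format lines."""
    def handle(line: str) -> None:
        parsed = parse_progress_line(line)
        if parsed is not None:
            on_progress(*parsed)
    return handle
```

Keeping the parsing in one place means consumers never see the raw string, so the wire format can change without breaking their callbacks.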
- Add per-EP progress callback examples to all 4 SDK READMEs (C#, JS, Python, Rust)
- Regenerate JS TypeDoc API docs (new downloadAndRegisterEpsWithProgress method)
- Regenerate C# xmldoc2md API docs (new DownloadAndRegisterEpsAsync overload)
- Fix ambiguous cref in C# XML doc comment for DownloadAndRegisterEpsAsync
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Point all SDK and sample version references to our published packages with EP progress support on ORT-Nightly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- JS: Make progressCallback required (not optional) in downloadAndRegisterEpsWithProgress
- C#: Guard null/empty result.Data in DiscoverEpsImpl before deserializing
- Python: Use Callable[[str, float], None] instead of built-in callable
- Rust README: Fix closure to use move + Arc<Mutex> for Send + 'static
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f1ad266 to b92986e
…ess-v2
# Conflicts:
# samples/cs/Directory.Packages.props
# samples/cs/audio-transcription-example/Program.cs
# samples/cs/foundry-local-web-server/Program.cs
# samples/cs/native-chat-completions/Program.cs
# samples/cs/tool-calling-foundry-local-sdk/Program.cs
# samples/js/native-chat-completions/app.js
- Remove duplicate downloadAndRegisterEps() from main merge (superseded by typed version)
- Regenerate JS TypeDoc docs (progressCallback now required)
- Regenerate C# xmldoc2md docs for net9.0 target
- Fix missing param tag for ct in ICatalog.GetLatestVersionAsync
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 47 out of 47 changed files in this pull request and generated 4 comments.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- JS tests: make async, await downloadAndRegisterEps, stub executeCommandStreaming instead of executeCommand
- Fix 'avaialable' typo in ICatalog.cs and generated docs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 47 out of 47 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
sdk/cs/docs/api/index.md:33
- sdk/cs/docs/api/index.md no longer links to ModelVariant, but microsoft.ai.foundry.local.modelvariant.md is still present in the docs set. This makes the generated API index incomplete and harder to navigate; consider regenerating docs or re-adding the missing ModelVariant entry to the index.
[LogLevel](./microsoft.ai.foundry.local.loglevel.md)
[Model](./microsoft.ai.foundry.local.model.md)
[ModelInfo](./microsoft.ai.foundry.local.modelinfo.md)
[ModelSettings](./microsoft.ai.foundry.local.modelsettings.md)
[OpenAIAudioClient](./microsoft.ai.foundry.local.openaiaudioclient.md)
[OpenAIChatClient](./microsoft.ai.foundry.local.openaichatclient.md)
[Parameter](./microsoft.ai.foundry.local.parameter.md)
…ess-v2
# Conflicts:
# sdk/cs/src/ICatalog.cs
- C#: Add 4 overloads for DownloadAndRegisterEpsAsync so callers never need to pass null (no-args, names-only, callback-only, all)
- JS: Add TypeScript overload signatures so downloadAndRegisterEps can be called with just a callback (no undefined first arg)
- Rust: Split into download_and_register_eps (no callback) and download_and_register_eps_with_progress (with callback) to avoid requiring None::<fn(&str, f64)> type annotations
- Python __init__.py: Revert EP type exports (not user-constructible)
- Update READMEs and samples to use cleaner overload call patterns
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The nightly versions from the merge are not yet published to public feeds, causing pipeline failures. Reset to the working versions on main.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dAsync
The sample still referenced the old API that was replaced by DownloadAndRegisterEpsAsync in the explicit EP download changes. Also fix Rust formatting (.await on separate line).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Makes execution provider (EP) management explicit across all SDKs and adds real-time per-EP download progress
reporting. Previously, EP downloads happened implicitly during catalog access with no granular progress visibility.
Now callers explicitly discover, download, and monitor EPs with typed APIs and streaming progress callbacks.
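As a rough illustration of the discover, download, and monitor flow, the sketch below uses the Python method names from this PR (`discover_eps`, `download_and_register_eps`, `progress_callback`) against a stand-in `FakeManager`. The `EpInfo` fields, the simulated progress values, the `ExampleEp*` names, and the plain list return value are all assumptions made so the example runs without the native core; the real SDK returns a typed `EpDownloadResult`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class EpInfo:  # field names are assumptions for this sketch
    name: str
    is_registered: bool


class FakeManager:
    """Stand-in for the SDK manager so the sketch runs without the native core."""

    def __init__(self, ep_names: List[str]) -> None:
        self._eps = [EpInfo(n, False) for n in ep_names]

    def discover_eps(self) -> List[EpInfo]:
        return list(self._eps)

    def download_and_register_eps(
        self,
        names: Optional[List[str]] = None,
        progress_callback: Optional[Callable[[str, float], None]] = None,
    ) -> List[str]:
        done = []
        for ep in self._eps:
            if names is not None and ep.name not in names:
                continue
            for percent in (0.0, 50.0, 100.0):  # simulated progress events
                if progress_callback is not None:
                    progress_callback(ep.name, percent)
            ep.is_registered = True
            done.append(ep.name)
        return done


def format_progress(name: str, percent: float, width: int) -> str:
    """Render one aligned status line for an in-place (carriage-return) update."""
    return f"{name:<{width}} {percent:5.1f}%"


manager = FakeManager(["ExampleEpA", "ExampleEpB"])
eps = manager.discover_eps()                      # 1. discover
width = max(len(ep.name) for ep in eps)
lines: List[str] = []
registered = manager.download_and_register_eps(   # 2. download + register
    progress_callback=lambda n, p: lines.append(format_progress(n, p, width))
)
print("\n".join(lines))                           # 3. monitor
```

Computing the column width from the discovered names before downloading is what keeps the per-EP lines aligned, which is why discovery comes first in this sketch.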
What's included
Explicit EP discovery and download (all SDKs)
EPs, returns typed EpDownloadResult
Per-EP progress callbacks (all SDKs)
ExecuteCommandWithCallbackAsync; parses wire format with CultureInfo.InvariantCulture for locale safety
the SDK
Live Audio Transcription (C#)
Other improvements
Testing
Breaking changes
downloadAndRegisterEps / download_and_register_eps before accessing hardware-accelerated models.