Explicit EP Download & Per-EP Progress Reporting #568

Open
bmehta001 wants to merge 36 commits into main from bhamehta/per-ep-progress-v2

bmehta001 commented Mar 31, 2026

Makes execution provider (EP) management explicit across all SDKs and adds real-time per-EP download progress
reporting. Previously, EP downloads happened implicitly during catalog access with no granular progress visibility.
Now callers explicitly discover, download, and monitor EPs with typed APIs and streaming progress callbacks.

What's included

Explicit EP discovery and download (all SDKs)

  • DiscoverEps() / discoverEps() / discover_eps() — returns typed EpInfo with name and registration status
  • DownloadAndRegisterEpsAsync() / downloadAndRegisterEps() / download_and_register_eps() — downloads and registers
    EPs, returns typed EpDownloadResult
  • Catalog access no longer blocks on EP downloads
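
As a rough illustration of the typed discovery result described above, the following Python sketch parses a discovery payload into EpInfo values. Only the EpInfo type name and the two pieces of information it carries (name and registration status) come from this PR. The JSON keys (Name, IsRegistered), the is_registered field name, the sample EP names, and the parse_ep_infos helper are assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EpInfo:
    """Illustrative stand-in for the typed discovery result."""
    name: str
    is_registered: bool  # assumed field name for "registration status"


def parse_ep_infos(payload: str) -> List[EpInfo]:
    """Parse a JSON array of EP entries; an empty payload yields an empty list."""
    if not payload:
        return []
    # "Name" and "IsRegistered" are hypothetical PascalCase keys.
    return [
        EpInfo(name=item["Name"], is_registered=bool(item["IsRegistered"]))
        for item in json.loads(payload)
    ]


sample = '[{"Name": "ExampleEpA", "IsRegistered": false}, {"Name": "ExampleEpB", "IsRegistered": true}]'
eps = parse_ep_infos(sample)
```

Guarding the empty payload before deserializing keeps callers from handling a parse error when nothing was returned.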

Per-EP progress callbacks (all SDKs)

  • C#: DownloadAndRegisterEpsAsync(names, Action<string, double> progressCallback, ct) — uses
    ExecuteCommandWithCallbackAsync; parses wire format with CultureInfo.InvariantCulture for locale safety
  • JS: downloadAndRegisterEpsWithProgress(names?, progressCallback?) — uses executeCommandStreaming
  • Python: download_and_register_eps(names, progress_callback) — uses execute_command_with_callback
  • Rust: download_and_register_eps_with_progress(names, FnMut(&str, f64)) — parses "name|percent" wire format inside
    the SDK
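
The "name|percent" wire format mentioned above can be turned into a typed (name, percent) callback with very little code. The sketch below is a minimal Python version under stated assumptions: parse_progress_chunk and make_chunk_handler are hypothetical helper names, the EP name in the example is illustrative, and silently skipping malformed chunks is a choice made here rather than documented SDK behavior. Python's float() always expects "." as the decimal separator, so the parse does not depend on the host locale.

```python
from typing import Callable, Optional, Tuple

ProgressCallback = Callable[[str, float], None]


def parse_progress_chunk(chunk: str) -> Optional[Tuple[str, float]]:
    """Split one 'name|percent' chunk into (ep_name, percent); None if malformed."""
    # Split on the last '|' so the percent is always the final field.
    name, sep, percent_text = chunk.strip().rpartition("|")
    if not sep or not name:
        return None
    try:
        percent = float(percent_text)  # locale-independent parse
    except ValueError:
        return None
    return name, percent


def make_chunk_handler(on_progress: ProgressCallback) -> Callable[[str], None]:
    """Wrap a typed callback so it can consume raw streamed chunks."""
    def handle(chunk: str) -> None:
        parsed = parse_progress_chunk(chunk)
        if parsed is not None:
            on_progress(*parsed)
    return handle


seen = []
handler = make_chunk_handler(lambda name, percent: seen.append((name, percent)))
handler("ExampleEp|42.5")
handler("not a progress line")  # ignored
```

Keeping the parsing inside a wrapper like this means consumers only ever see the typed callback, which matches the approach described for the Rust SDK.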

Live Audio Transcription (C#)

  • New LiveAudioTranscriptionSession with real-time streaming over WebSocket
  • Supports start/stop/send audio chunks with configurable output types
  • Unit tests with mocked CoreInterop

Other improvements

  • Typed EpInfo / EpDownloadResult in dedicated type files across all SDKs
  • EP unit tests for JS and Python
  • Removed implicit 6-hour catalog TTL caching (delegated to native core)
  • New CoreInterop methods for callback-based command execution (C#)
  • AOT-compatible JSON serialization context for EP types (C#)

Testing

  • New unit tests for EP discovery/download in JS and Python

Breaking changes

  • Catalog no longer implicitly triggers EP downloads — callers must explicitly call DownloadAndRegisterEpsAsync /
    downloadAndRegisterEps / download_and_register_eps before accessing hardware-accelerated models.
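
For callers migrating across this breaking change, the new call order might look like the Python sketch below. It runs against StubManager, a hypothetical offline stand-in written for this example; the real manager's constructor, return values, and EP names are not shown in this PR description. Only the method names discover_eps and download_and_register_eps(names, progress_callback) are taken from the text above; ensure_eps_registered is an illustrative helper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class EpInfo:
    name: str
    is_registered: bool  # assumed field name


class StubManager:
    """Hypothetical stand-in for the SDK manager so the sketch runs offline."""

    def __init__(self) -> None:
        self._eps: Dict[str, bool] = {"ExampleEpA": False, "ExampleEpB": True}

    def discover_eps(self) -> List[EpInfo]:
        return [EpInfo(name, registered) for name, registered in self._eps.items()]

    def download_and_register_eps(
        self,
        names: Optional[Sequence[str]] = None,
        progress_callback: Optional[Callable[[str, float], None]] = None,
    ) -> None:
        targets = list(self._eps) if names is None else list(names)
        for name in targets:
            for percent in (0.0, 50.0, 100.0):  # simulated progress
                if progress_callback is not None:
                    progress_callback(name, percent)
            self._eps[name] = True


def ensure_eps_registered(manager, on_progress) -> List[str]:
    """Download only the EPs that discovery reports as not yet registered."""
    pending = [ep.name for ep in manager.discover_eps() if not ep.is_registered]
    if pending:
        manager.download_and_register_eps(pending, on_progress)
    return pending


manager = StubManager()
events = []
downloaded = ensure_eps_registered(manager, lambda n, p: events.append((n, p)))
# Only after this step would the caller go on to use hardware-accelerated models.
```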

Copilot AI review requested due to automatic review settings March 31, 2026 17:07
vercel bot commented Mar 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| foundry-local | Ready | Preview, Comment | Apr 1, 2026 9:47am |


Copilot AI left a comment

Pull request overview

This PR makes execution-provider (EP) discovery/download explicit (instead of implicitly happening during catalog access) and adds per-EP progress reporting across the SDKs. It also removes the “6 hour TTL” catalog refresh behavior in favor of always refreshing model lists (delegating caching decisions to the native core), and updates docs/samples accordingly.

Changes:

  • Add typed EP discovery + download/register APIs across Rust/JS/Python/C# (including progress callbacks/streaming).
  • Remove implicit catalog refresh TTL logic (and Rust cache invalidator wiring) so catalogs refresh on each query/update.
  • Update READMEs, generated docs, and samples to use explicit EP management.

Reviewed changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `www/src/routes/models/service.ts` | Removes DirectML EP from the GPU EP list used for model discovery calls. |
| `sdk/rust/src/types.rs` | Adds EpInfo and EpDownloadResult Rust types with PascalCase serde mapping. |
| `sdk/rust/src/model_variant.rs` | Removes catalog cache invalidation signaling from variant operations. |
| `sdk/rust/src/lib.rs` | Re-exports EpInfo / EpDownloadResult from the crate root. |
| `sdk/rust/src/foundry_local_manager.rs` | Adds Rust EP discovery/download APIs, including streaming per-EP progress callback parsing. |
| `sdk/rust/src/catalog.rs` | Removes TTL + invalidation logic; update_models() now always refreshes behind a gate. |
| `sdk/rust/README.md` | Documents explicit EP management + progress callbacks for Rust. |
| `sdk/python/test/test_foundry_local_manager.py` | Adds unit tests for discover_eps() and download_and_register_eps() parsing/behavior. |
| `sdk/python/src/foundry_local_manager.py` | Introduces Python discover_eps() and download_and_register_eps() (optional streaming progress callback). |
| `sdk/python/src/ep_types.py` | Adds Python dataclasses EpInfo and EpDownloadResult. |
| `sdk/python/src/catalog.py` | Removes 6-hour refresh guard; updates docstring to “refreshes on each query call”. |
| `sdk/python/src/__init__.py` | Re-exports EpInfo / EpDownloadResult in the Python package API surface. |
| `sdk/python/README.md` | Updates backend matrix + documents explicit EP management and progress callback usage. |
| `sdk/python/examples/chat_completion.py` | Updates example to explicitly discover/download EPs before listing/using models. |
| `sdk/js/test/foundryLocalManager.test.ts` | Adds tests ensuring correct command/params wiring for EP downloads. |
| `sdk/js/src/types.ts` | Adds EpInfo / EpDownloadResult TypeScript interfaces. |
| `sdk/js/src/foundryLocalManager.ts` | Adds JS EP discovery, blocking download+register, and streaming progress download APIs. |
| `sdk/js/src/catalog.ts` | Removes the 6-hour TTL guard so model list is refreshed every query. |
| `sdk/js/README.md` | Documents explicit EP management and progress callback usage in JS. |
| `sdk/js/examples/chat-completion.ts` | Updates example to discover/download EPs and uses a concrete model alias. |
| `sdk/js/docs/README.md` | Updates generated TypeDoc output for new EP types. |
| `sdk/js/docs/classes/FoundryLocalManager.md` | Updates generated TypeDoc output for new EP methods. |
| `sdk/cs/src/FoundryLocalManager.cs` | Adds DiscoverEps() and DownloadAndRegisterEpsAsync APIs (with and without progress callback). |
| `sdk/cs/src/EpInfo.cs` | Adds C# EpInfo and EpDownloadResult record types with JSON property mapping. |
| `sdk/cs/src/Detail/JsonSerializationContext.cs` | Registers EP types for source-generated JSON serialization (AOT-friendly). |
| `sdk/cs/src/Catalog.cs` | Removes 6-hour TTL caching from catalog refresh behavior. |
| `sdk/cs/README.md` | Documents the new explicit EP management APIs and progress callback overload. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.runtime.md` | Generated doc updates (record-related members appearing). |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.prompttemplate.md` | Generated doc updates (record-related members appearing). |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.parameter.md` | Generated doc updates (record-related members appearing). |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.openaichatclient.md` | Generated doc updates reflecting additional overload documentation. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.openaiaudioclient.md` | Generated doc updates adding live transcription session API entry. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.openai.responseformatextended.md` | New generated docs for ResponseFormatExtended. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.openai.liveaudiotranscriptionsession.md` | New generated docs for live transcription session type. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.openai.liveaudiotranscriptionresponse.md` | New generated docs for live transcription response type. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.modelsettings.md` | Generated doc updates (record-related members appearing). |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.modelinfo.md` | Generated doc updates (new documented properties/record members). |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.foundrylocalmanager.md` | Generated docs updated for new EP APIs and updated catalog remarks. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.foundrylocalexception.md` | Generated docs updated (serialization event section appears). |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.epinfo.md` | New generated docs for EpInfo. |
| `sdk/cs/docs/api/microsoft.ai.foundry.local.epdownloadresult.md` | New generated docs for EpDownloadResult. |
| `sdk/cs/docs/api/index.md` | Generated docs index updated to include EP + OpenAI live transcription types. |
| `samples/js/native-chat-completions/app.js` | Sample updated to discover/download EPs with per-EP progress before using models. |
| `samples/cs/GettingStarted/src/ToolCallingFoundryLocalWebServer/Program.cs` | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| `samples/cs/GettingStarted/src/ToolCallingFoundryLocalSdk/Program.cs` | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| `samples/cs/GettingStarted/src/ModelManagementExample/Program.cs` | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| `samples/cs/GettingStarted/src/HelloFoundryLocalSdk/Program.cs` | Sample updated to show EP discovery and per-EP progress display while downloading/registering EPs. |
| `samples/cs/GettingStarted/src/FoundryLocalWebServer/Program.cs` | Sample updated to use DownloadAndRegisterEpsAsync() instead of implicit/old EP step. |
| `samples/cs/GettingStarted/src/AudioTranscriptionExample/Program.cs` | Sample updated to use explicit EP download/register; sets audio language setting. |

Comments suppressed due to low confidence (2)

sdk/js/examples/chat-completion.ts:58

  • Hard-coding a specific model alias in an example makes it fail for users who don’t have that exact model available. Prefer keeping the placeholder (e.g. MODEL_ALIAS) or selecting from models[0] / prompting the user, and avoid the redundant if (!modelToLoad) check since Catalog.getModel() throws when not found.

samples/js/native-chat-completions/app.js:40

  • This sample hard-codes modelAlias = 'qwen2.5-0.5b', which can make the sample fail in fresh environments where that model isn’t present. Consider either leaving a placeholder, selecting an available model from the catalog output, or making the alias configurable via an env var/CLI arg.
// Get the model object
const modelAlias = 'qwen2.5-0.5b'; // Using an available model from the list above
const model = await manager.catalog.getModel(modelAlias);
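
One way to act on the suggestion above is to resolve the alias at run time instead of hard-coding it. The sketch below is written in Python to match the other sketches in this thread, although the sample under review is JavaScript; the same shape carries over. The pick_model_alias helper and the MODEL_ALIAS environment variable name are assumptions for illustration, and the list of available aliases is passed in rather than read from a real catalog.

```python
import os
from typing import Mapping, Sequence


def pick_model_alias(
    available: Sequence[str],
    env: Mapping[str, str] = os.environ,
    env_var: str = "MODEL_ALIAS",  # hypothetical variable name
) -> str:
    """Prefer an alias from the environment, else fall back to the first available model."""
    requested = env.get(env_var)
    if requested:
        if requested not in available:
            raise ValueError(f"model alias {requested!r} is not in the catalog")
        return requested
    if not available:
        raise ValueError("catalog returned no models")
    return available[0]


# Example with made-up aliases and an explicit environment mapping.
chosen = pick_model_alias(["alias-a", "alias-b"], env={"MODEL_ALIAS": "alias-b"})
```

Failing early with a clear message when the requested alias is missing avoids a less obvious error later at model load time.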



baijumeswani and others added 7 commits March 31, 2026 11:07
- C#: Add DownloadAndRegisterEpsAsync overload with Action<string, double> progressCallback
  - Parse wire format with CultureInfo.InvariantCulture for locale safety
  - Use ExecuteCommandWithCallbackAsync for streaming path
- JS: Add downloadAndRegisterEpsWithProgress(names?, progressCallback?) returning Promise<void>
  - Uses executeCommandStreaming for real-time progress
- Python: Add progress_callback parameter to download_and_register_eps
  - Uses execute_command_with_callback for streaming path
- Rust: Add download_and_register_eps_with_progress with typed FnMut(&str, f64) callback
  - Parses wire format inside SDK, exposes typed callback to consumers
- C# sample: Replace spinner with per-EP progress display (discover -> aligned columns -> \r updates)
- JS sample: Add EP discovery and per-EP progress display block

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
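
The per-EP progress display mentioned in the commit above (aligned columns, rewritten in place with \r) can be sketched in a few lines of Python. This is an illustrative formatter, not the code from the C# or JS samples: column_width and format_progress_line are hypothetical names, and clamping the percent to the 0 to 100 range is a choice made here.

```python
from typing import Iterable


def column_width(names: Iterable[str]) -> int:
    """Width of the name column: the longest EP name, or 0 if there are none."""
    return max((len(name) for name in names), default=0)


def format_progress_line(name: str, percent: float, name_width: int) -> str:
    """Build one progress line; the leading '\\r' rewrites the current terminal line."""
    clamped = max(0.0, min(100.0, percent))
    return f"\r  {name:<{name_width}}  {clamped:5.1f}%"


width = column_width(["ExampleEpA", "Ep"])
line = format_progress_line("Ep", 50, width)
```

A caller would typically write each line with print(line, end="", flush=True) and emit a newline once an EP reaches 100%, so the next EP starts on its own row.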
- Add per-EP progress callback examples to all 4 SDK READMEs (C#, JS, Python, Rust)
- Regenerate JS TypeDoc API docs (new downloadAndRegisterEpsWithProgress method)
- Regenerate C# xmldoc2md API docs (new DownloadAndRegisterEpsAsync overload)
- Fix ambiguous cref in C# XML doc comment for DownloadAndRegisterEpsAsync

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Point all SDK and sample version references to our published packages
with EP progress support on ORT-Nightly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- JS: Make progressCallback required (not optional) in downloadAndRegisterEpsWithProgress
- C#: Guard null/empty result.Data in DiscoverEpsImpl before deserializing
- Python: Use Callable[[str, float], None] instead of built-in callable
- Rust README: Fix closure to use move + Arc<Mutex> for Send + 'static

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ess-v2

# Conflicts:
#	samples/cs/Directory.Packages.props
#	samples/cs/audio-transcription-example/Program.cs
#	samples/cs/foundry-local-web-server/Program.cs
#	samples/cs/native-chat-completions/Program.cs
#	samples/cs/tool-calling-foundry-local-sdk/Program.cs
#	samples/js/native-chat-completions/app.js
- Remove duplicate downloadAndRegisterEps() from main merge (superseded by typed version)
- Regenerate JS TypeDoc docs (progressCallback now required)
- Regenerate C# xmldoc2md docs for net9.0 target
- Fix missing param tag for ct in ICatalog.GetLatestVersionAsync

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 47 out of 47 changed files in this pull request and generated 4 comments.



Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- JS tests: make async, await downloadAndRegisterEps, stub
  executeCommandStreaming instead of executeCommand
- Fix 'avaialable' typo in ICatalog.cs and generated docs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 47 out of 47 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

sdk/cs/docs/api/index.md:33

  • sdk/cs/docs/api/index.md no longer links to ModelVariant, but microsoft.ai.foundry.local.modelvariant.md is still present in the docs set. This makes the generated API index incomplete/harder to navigate; consider regenerating docs or re-adding the missing ModelVariant entry to the index.
[LogLevel](./microsoft.ai.foundry.local.loglevel.md)

[Model](./microsoft.ai.foundry.local.model.md)

[ModelInfo](./microsoft.ai.foundry.local.modelinfo.md)

[ModelSettings](./microsoft.ai.foundry.local.modelsettings.md)

[OpenAIAudioClient](./microsoft.ai.foundry.local.openaiaudioclient.md)

[OpenAIChatClient](./microsoft.ai.foundry.local.openaichatclient.md)

[Parameter](./microsoft.ai.foundry.local.parameter.md)


…ess-v2

# Conflicts:
#	sdk/cs/src/ICatalog.cs
- C#: Add 4 overloads for DownloadAndRegisterEpsAsync so callers
  never need to pass null (no-args, names-only, callback-only, all)
- JS: Add TypeScript overload signatures so downloadAndRegisterEps
  can be called with just a callback (no undefined first arg)
- Rust: Split into download_and_register_eps (no callback) and
  download_and_register_eps_with_progress (with callback) to avoid
  requiring None::<fn(&str, f64)> type annotations
- Python __init__.py: Revert EP type exports (not user-constructible)
- Update READMEs and samples to use cleaner overload call patterns

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
baijumeswani previously approved these changes Apr 1, 2026
The nightly versions from the merge are not yet published to public
feeds, causing pipeline failures. Reset to the working versions on main.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dAsync

The sample still referenced the old API that was replaced by
DownloadAndRegisterEpsAsync in the explicit EP download changes.
Also fix Rust formatting (.await on separate line).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

3 participants