Description
Background
We are trying to run LLaMA on an NPU backend that requires static shapes. Because of this constraint, the prefill and decode phases have different input/output specs (e.g., [B, T] vs [B, 1], different KV-cache I/O), so we need two separate graphs. However, we want to share a single set of weights between them to avoid duplicating constant buffers and to reduce memory footprint and load time.
Proposed Representation
The Circle schema supports multiple SubGraphs under Model.subgraphs, where each subgraph has its own tensors, inputs, and outputs, while constant data lives in the global Model.buffers.
I believe we can model:
- SubGraph 0: prefill graph
- SubGraph 1: decode graph
and share weights by having weight tensors in both subgraphs reference the same Model.buffers[buffer_index].
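To make the layout concrete, here is a sketch using plain Python dicts that mimic the flatc-style JSON form of the schema; the tensor names, shapes, and buffer payload are invented for illustration, not taken from a real model:

```python
# Sketch of a Circle-style model layout (flatc JSON form) with two
# subgraphs whose weight tensors reference the same Model.buffers entry.
# Names, shapes, and the placeholder payload are hypothetical.
SHARED_WEIGHT_BUFFER = 1  # index into model["buffers"]

model = {
    # Buffer 0 is conventionally the empty sentinel buffer for
    # non-constant tensors (activations, inputs, outputs).
    "buffers": [
        {},                   # 0: empty sentinel
        {"data": [0] * 16},   # 1: shared weights (placeholder bytes)
    ],
    "subgraphs": [
        {   # SubGraph 0: prefill - consumes a full [B, T] token block
            "name": "prefill",
            "tensors": [
                {"name": "tokens", "shape": [1, 512], "buffer": 0},
                {"name": "wq", "shape": [4096, 4096],
                 "buffer": SHARED_WEIGHT_BUFFER},
            ],
            "inputs": [0],
        },
        {   # SubGraph 1: decode - consumes a single [B, 1] token
            "name": "decode",
            "tensors": [
                {"name": "token", "shape": [1, 1], "buffer": 0},
                {"name": "wq", "shape": [4096, 4096],
                 "buffer": SHARED_WEIGHT_BUFFER},
            ],
            "inputs": [0],
        },
    ],
}

# Both subgraphs' weight tensors resolve to the same constant data,
# even though their activation shapes differ:
prefill_wq = model["subgraphs"][0]["tensors"][1]
decode_wq = model["subgraphs"][1]["tensors"][1]
assert prefill_wq["buffer"] == decode_wq["buffer"] == SHARED_WEIGHT_BUFFER
```

The key point of the sketch is that `Tensor.buffer` is just an index into the global `Model.buffers` array, so nothing in the schema itself prevents two subgraphs from pointing at the same entry.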
Optionally, use Model.signature_defs to expose two entry points:
- signature "prefill" → subgraph_index = 0
- signature "decode" → subgraph_index = 1
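A sketch of what those entries might look like, again in flatc-JSON-style dicts; the field names follow the TFLite-style `SignatureDef` table that Circle inherits, while the exported tensor names and indices are made up:

```python
# Hypothetical signature_defs entries mapping named entry points to
# subgraph indices. Tensor names/indices are invented for illustration.
signature_defs = [
    {
        "signature_key": "prefill",
        "subgraph_index": 0,
        # Each entry pairs an exported name with a tensor index
        # inside the target subgraph.
        "inputs": [{"name": "tokens", "tensor_index": 0}],
        "outputs": [{"name": "logits", "tensor_index": 3}],
    },
    {
        "signature_key": "decode",
        "subgraph_index": 1,
        "inputs": [{"name": "token", "tensor_index": 0}],
        "outputs": [{"name": "logits", "tensor_index": 3}],
    },
]

# A runtime can then dispatch by signature name rather than by
# hard-coded subgraph index:
lookup = {s["signature_key"]: s["subgraph_index"] for s in signature_defs}
assert lookup == {"prefill": 0, "decode": 1}
```

Dispatching by signature key keeps the caller independent of subgraph ordering, which matters if a converter later inserts or reorders subgraphs.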
Quantization metadata also needs care: quantization parameters live on Tensor.quantization, not on the buffer, so two subgraphs could technically attach different qparams to the same shared buffer. Those parameters should either be kept identical across subgraphs or diverge only intentionally.
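One way to guard against accidental divergence is a small consistency check over the model. This is a sketch, assuming the same flatc-JSON-style dict layout as above; the qparams values are invented:

```python
from collections import defaultdict

def check_shared_buffer_qparams(model):
    """Group constant tensors by buffer index and return the buffers
    whose referencing tensors disagree on quantization parameters."""
    by_buffer = defaultdict(list)
    for sg in model["subgraphs"]:
        for t in sg["tensors"]:
            if t.get("buffer", 0) != 0:  # buffer 0 is the empty sentinel
                by_buffer[t["buffer"]].append(t.get("quantization"))
    return {
        buf: qs
        for buf, qs in by_buffer.items()
        if len(qs) > 1 and any(q != qs[0] for q in qs[1:])
    }

# Example: two subgraphs reference buffer 1 with *different* scales,
# which the check flags as a conflict.
model = {
    "subgraphs": [
        {"tensors": [{"name": "wq", "buffer": 1,
                      "quantization": {"scale": [0.02], "zero_point": [0]}}]},
        {"tensors": [{"name": "wq", "buffer": 1,
                      "quantization": {"scale": [0.03], "zero_point": [0]}}]},
    ],
}
conflicts = check_shared_buffer_qparams(model)
assert 1 in conflicts
```

A converter or exporter could run a check like this before serialization to ensure that sharing a buffer never silently changes the numerics between prefill and decode.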
Expected Behavior
- Prefill and decode graphs can differ in I/O and intermediate tensor shapes.
- Weights should be stored once and reused across both subgraphs.
We need clarification on whether buffer sharing across subgraphs is actually valid in the schema and, if so, whether the runtime honors it, i.e., maps the shared buffer once instead of duplicating it per subgraph.