RFC: Unified Graph-Backed Workspace Model for Invoke #8988

@joshistoast

Summary

This RFC proposes a major UI and architectural direction for Invoke.

  • Top-level tabs become a primary area of work (like files in Photoshop)
  • Each top-level tab owns its own graph/workflow context
  • Existing "pages" like Generate, Canvas and Workflow should be rethought as:
    • tab-local panels
    • center-area views
    • contextual tools over the same tab state
  • The application moves towards a unified workspace shell rather than separate pages with partially shared behaviors
  • To support this properly, the application will have a proper dockview implementation where panels can be moved into various regions.
    • Panels will need to be responsive to fit within width/height constraints of vertical/horizontal regions.
    • Add a bottom status panel with a dockview region.

Problem Statement

Invoke has multiple major surfaces that all map to workflows/graphs in different ways, which currently leads to unpredictable UI behavior when switching pages to perform new tasks.

Examples:

  • Some state is shared or influences generations between the pages in ways that aren't obvious
  • Generate, Canvas and Workflow are treated as distinct pages/modes, even though they all ultimately produce or operate on graphs.
  • The workflow page is separated from the rest of the pages, even though workflows are the underlying execution model
  • The current dockview implementation does not serve its originally intended purpose, so the layout is currently static
  • Discussions around tabbed canvases and tabbed workflows are already happening as separate concerns, even though both appear to want the same thing: a top-level tab with its own graph-backed working context

This creates UX issues:

  • Hidden or surprising user workflow influence
  • Unclear sources of truth
  • Too much conceptual separation between pages and surfaces that are actually related
  • Difficulty scaling the application without adding new pages/modes
  • Reduced flexibility for future UI composition and features

Proposal

Core Idea

Invoke should adopt a unified workspace model in which each top-level tab represents a graph-backed workspace, similar to how Photoshop gives each working document its own tab. Panels and center regions operate on that tab's state. The graph becomes the underlying model of both the UI and the backend, but most users will continue to interact with it through hard-coded panels rather than through direct graph edits.

In Practice

A top-level tab should own:

  • prompt/generation state
  • parameters state
  • canvas/layers state
  • a graph

The broader UI state shared across tabs:

  • generation queue
  • dockview state
  • UI settings
  • gallery
  • models
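
The ownership split above could be sketched as types. This is only an illustration: every name here (`WorkspaceTab`, `SharedState`, `AppState`, `createTab`, `setPrompt`) and every field shape is an assumption, not Invoke's actual state model.

```typescript
// Hypothetical names throughout: a sketch of the ownership split, not Invoke's real state shape.
type Graph = {
  nodes: { id: string; type: string }[];
  edges: { from: string; to: string }[];
};

// Owned by exactly one top-level tab.
interface WorkspaceTab {
  id: string;
  prompt: string;
  parameters: { steps: number; seed: number };
  canvasLayers: string[];
  graph: Graph;
}

// Shared by every tab.
interface SharedState {
  queue: string[]; // ids of queued generations
  dockLayout: string; // serialized layout, format left open
  uiSettings: { theme: string };
  gallery: string[]; // image ids
  models: string[]; // model ids
}

interface AppState {
  tabs: Record<string, WorkspaceTab>;
  shared: SharedState;
}

// Creates an empty tab with placeholder defaults.
function createTab(id: string): WorkspaceTab {
  return {
    id,
    prompt: "",
    parameters: { steps: 30, seed: 0 },
    canvasLayers: [],
    graph: { nodes: [], edges: [] },
  };
}

// Returns a new AppState in which only the targeted tab has changed.
// An unknown tab id leaves the state untouched.
function setPrompt(state: AppState, tabId: string, prompt: string): AppState {
  const tab = state.tabs[tabId];
  if (!tab) return state;
  return { ...state, tabs: { ...state.tabs, [tabId]: { ...tab, prompt } } };
}
```

With this shape, editing the prompt in one tab cannot touch another tab's prompt, and the shared slice is passed through unchanged.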

Before any user dockview preferences are applied, the default UI layout will be:

  • Top-level tabs = workspaces
  • Left rail = graph-authoring panels/controls
  • Center area = views into current graph/tab state
  • Right rail = contextual utilities and inspectors

Center Views

The center region will be able to switch between several views of the tab's state and outputs:

  • Viewer
  • Canvas
  • Workflow

This means Canvas is no longer a separate page/mode. It becomes another view over the tab's state.

Panels over Pages

Instead of separate top-level pages that inconsistently overlap, panels become the primary way to edit the tab state.

Examples:

Left-side panels:

  • Parameters
  • Upscaler
  • Linear UI for custom graphs

Right-side panels:

  • Queue
  • Gallery
  • Layers
  • Model Manager

Some panels are always available; others will need to be contextual.

Example:

  • Layers may only be fully interactive when within a canvas view
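
One way to express this kind of contextual rule is a small lookup from panel to the center views in which it is fully interactive. The panel ids, the `INTERACTIVE_IN` table, and both function names below are hypothetical, and the rule values are assumptions chosen only to mirror the Layers example.

```typescript
type CenterView = "viewer" | "canvas" | "workflow";
type PanelId = "parameters" | "queue" | "gallery" | "layers";

const ALL_VIEWS: readonly CenterView[] = ["viewer", "canvas", "workflow"];

// Views in which each panel is fully interactive (assumed values for illustration).
const INTERACTIVE_IN: Record<PanelId, readonly CenterView[]> = {
  parameters: ALL_VIEWS,
  queue: ALL_VIEWS,
  gallery: ALL_VIEWS,
  layers: ["canvas"], // contextual: only meaningful over the canvas view
};

// True when the panel should accept input in the given center view.
function isPanelInteractive(panel: PanelId, view: CenterView): boolean {
  return INTERACTIVE_IN[panel].some((v) => v === view);
}

// Lists every panel that is interactive in the given center view.
function interactivePanels(view: CenterView): PanelId[] {
  const ids = Object.keys(INTERACTIVE_IN) as PanelId[];
  return ids.filter((id) => isPanelInteractive(id, view));
}
```

Keeping the rule as data rather than scattering view checks through each panel would make it easier to see, in one place, which panels react to a view switch.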

Why top-level tabs should own their own graph

This is the key architectural shift. Tying each tab to a graph lets Invoke solve several problems at once.

1. It matches the backend model

Invoke already executes graphs. A tab that owns a graph therefore gives users a mental model that matches what actually runs, and it should (hopefully) not be too difficult to implement.

2. Creates a clean unit of work

A tab becomes a real, consistent working context, not just a unit of navigation.

This allows:

  • true parallel exploration
  • tab-specific histories
  • tab-specific canvas states
  • tab-specific workflows/graph state
  • easier branching and explorations of ideas

3. Removes non-obvious cross-page influence

A significant point of UX friction present within the UI today is that different pages influence one another in ways that are not obvious.

If a tab owns its own graph context, then any image, mask, layer, ControlNet or other control input affecting generation belongs to that tab and can be made easily visible there.

4. Unifies currently separate roadmap items

Tabbed canvases should not share the same generation parameters and inputs; if they did, users would spend more time than necessary switching settings back and forth. Including canvas state in each tab's workflow context solves this, and gives tabbed canvases and tabbed workflows the same foundation.

5. Scales better than adding new pages

As Invoke grows and adopts new features and technologies, this model will be easier to extend than a per-page one.


Source of truth: structured vs workflow panels

To keep the app approachable and predictable while preserving power-user potential, this RFC proposes a distinction between structured panels and workflow-dependent authored tabs (linear UI).

Structured panels

Examples:

  • Parameters
  • Canvas
  • Upscale

Behavior:

  • hard-coded panels are the source of truth
  • directly edit top-level tab graph state
  • workflow/graph is read-only

This keeps the basic tools simple, reliable and beginner-friendly.

Linear UI Panel

Examples:

  • imported or custom workflows
  • graph-first workflow with optional linear UI inputs

Behavior:

  • graph is source of truth
  • optional linear UI elements can be bound to workflow inputs
  • workflow/graph can be directly edited
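
The two behaviors can be captured by a single flag on the tab that says who owns the graph. The sketch below is a minimal illustration; `Tab`, `sourceOfTruth`, `canEditGraph`, and `addNode` are hypothetical names, and the graph is reduced to a list of node ids.

```typescript
interface Graph {
  nodeIds: string[];
}

interface Tab {
  // "panels": structured tab, graph is derived and read-only.
  // "graph": workflow / linear UI tab, graph is edited directly.
  sourceOfTruth: "panels" | "graph";
  graph: Graph;
}

// Direct graph edits are only allowed when the graph itself is the source of truth.
function canEditGraph(tab: Tab): boolean {
  return tab.sourceOfTruth === "graph";
}

// Adds a node if the tab allows direct edits; otherwise returns the tab unchanged.
function addNode(tab: Tab, nodeId: string): Tab {
  if (!canEditGraph(tab)) return tab;
  return { ...tab, graph: { nodeIds: tab.graph.nodeIds.concat([nodeId]) } };
}
```

Routing every direct edit through one guard like this keeps the read-only rule for structured tabs in a single place instead of in each editor control.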

Why the split matters

Keeping an arbitrarily edited graph working across hard-coded panels, which expect certain behaviors and inputs to be defined, is not an effort worth taking on at this time.

Keeping structured/hard-coded panels as the source of truth for the tab graph avoids this trap entirely, while the graph stays clearly exposed so it can be edited and experimented with later in a linear UI panel.

Power users will still have a first-class path:

  • inspect the generated graph
  • fork/duplicate to a workflow panel
  • continue editing there
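
The fork step could look roughly like the following. All type and function names are hypothetical, and carrying a `linearUi` configuration along with the graph is an assumption made for the sketch, so that the fork starts out with the same inputs exposed.

```typescript
interface LinearField {
  label: string;
  nodeId: string;
  input: string;
}

interface Graph {
  nodes: { id: string; type: string; inputs: Record<string, string | number> }[];
}

interface StructuredTab {
  kind: "structured";
  id: string;
  graph: Graph; // generated from panels, read-only here
  linearUi: LinearField[];
}

interface WorkflowTab {
  kind: "workflow";
  id: string;
  graph: Graph; // freely editable
  linearUi: LinearField[];
  forkedFrom: string;
}

// Copies the generated graph into a new editable workflow tab.
// The source tab is left untouched, so its panels remain the source of truth.
function forkToWorkflow(source: StructuredTab, newId: string): WorkflowTab {
  // A JSON round-trip is enough for a deep copy of plain data like this.
  const graph = JSON.parse(JSON.stringify(source.graph)) as Graph;
  const linearUi = source.linearUi.map((field) => ({ ...field }));
  return { kind: "workflow", id: newId, graph, linearUi, forkedFrom: source.id };
}
```

Because the fork is a deep copy, later edits in the workflow tab cannot flow back into the structured tab it came from.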

How panels influence the tab graph

Structured/hard-coded panels should not edit individual nodes or edges directly.

Instead:

  • panels edit tab graph state as a whole
  • new graph state is what is queued
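
A minimal sketch of that flow: panels patch a plain state object, and the graph is recomputed from the whole state right before queuing. The node types (`prompt`, `denoise`, `mask`) and all function names are made up for illustration and do not correspond to Invoke's real invocation nodes.

```typescript
interface StructuredState {
  prompt: string;
  steps: number;
  maskLayerId: string | null;
}

interface GraphNode {
  id: string;
  type: string;
  inputs: Record<string, string | number>;
}

interface Graph {
  nodes: GraphNode[];
  edges: { from: string; to: string }[];
}

// Panels never touch nodes or edges; they only patch the structured state.
function applyPanelEdit(state: StructuredState, patch: Partial<StructuredState>): StructuredState {
  return { ...state, ...patch };
}

// Pure function: the same state always yields the same graph.
function compileGraph(state: StructuredState): Graph {
  const nodes: GraphNode[] = [
    { id: "prompt", type: "prompt", inputs: { text: state.prompt } },
    { id: "denoise", type: "denoise", inputs: { steps: state.steps } },
  ];
  const edges = [{ from: "prompt", to: "denoise" }];
  if (state.maskLayerId !== null) {
    nodes.push({ id: "mask", type: "mask", inputs: { layerId: state.maskLayerId } });
    edges.push({ from: "mask", to: "denoise" });
  }
  return { nodes, edges };
}

// The freshly compiled graph is what gets appended to the queue.
function enqueue(queue: Graph[], state: StructuredState): Graph[] {
  return queue.concat([compileGraph(state)]);
}
```

Since the graph is always derived, the workflow view of a structured tab can simply render `compileGraph(state)` and never drift from what will actually run.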

Workflow visibility

To make workflows more understandable and accessible in all panel contexts, a workflow view will sit beside the viewer and launchpad.


Why this implies moving away from separate pages

If Generate, Canvas, Workflow, Upscale, etc. all influence or reflect the same graph-backed tab state, then keeping them fully separate becomes increasingly artificial.

A more coherent model:

  • one application shell
  • tab-local state
  • center-area view switching
  • contextual panels
  • graph-based execution

This does not mean everything must be visible at once; all panels will be collapsible.

"One page" should mean:

  • one unified workspace architecture
  • one consistent tab model
  • fewer conceptual jumps

It should not mean:

  • one giant always-open mega-screen

Contextual panels, view switches, and layout rules will be what make the unified shell usable and not overwhelming.


Addressing dockview

Dockview currently controls the tab and panel UI states within Invoke. It was originally adopted so that tabs and panels could be moved around freely, reorganized, pulled out into separate windows, and floated above others. Unfortunately, due to technical complications at the time, the UI uses only its basic functions to keep a static layout.

If this RFC is taken seriously, we need to address dockview and implement it properly for the concept of "one page, many panels" to work, with these regions:

  • Left panel region
  • Right panel region
  • Center region
  • Bottom status bar region (similar to code editors)
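
Independently of how dockview itself stores layouts, the region model can be pictured as plain data. The sketch below deliberately does not use the dockview API; `Region`, `Layout`, `DEFAULT_LAYOUT`, `movePanel`, and the panel ids are all hypothetical.

```typescript
type Region = "left" | "right" | "center" | "bottom";
type Layout = Record<Region, string[]>;

const REGIONS: Region[] = ["left", "right", "center", "bottom"];

// Assumed default placement, loosely following the default layout described in this RFC.
const DEFAULT_LAYOUT: Layout = {
  left: ["parameters"],
  right: ["queue", "gallery", "layers"],
  center: ["viewer", "canvas", "workflow"],
  bottom: ["status"],
};

// Moves a panel to the end of the target region and returns a new layout.
// Unknown panels leave the layout unchanged.
function movePanel(layout: Layout, panel: string, target: Region): Layout {
  const present = REGIONS.some((r) => layout[r].indexOf(panel) !== -1);
  if (!present) return layout;
  const next: Layout = { left: [], right: [], center: [], bottom: [] };
  REGIONS.forEach((r) => {
    next[r] = layout[r].filter((p) => p !== panel);
  });
  next[target] = next[target].concat([panel]);
  return next;
}
```

A data-only layout like this is also easy to persist, which matters for the open question below about global versus per-tab layout persistence.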

This is not a side concern but pivotal infrastructure for the entire layout.

Impact on users

Beginners

Benefits:

  • simpler mental model
  • less surprising shared state
  • easier understanding of "what will run"
  • hard-coded/structured tabs remain safe and guided
  • no need to understand graphs

Potential concerns:

  • a unified application could feel dense if not carefully designed
  • panel/view/tab terminology needs to be very clear

Recommendation:

  • keep default layouts simple and reminiscent of current Invoke defaults
  • expose workflow as inspectable and non-intimidating
  • avoid overloading the workspace

Creatives

Benefits:

  • can safely work on multiple images at once with different contexts
  • less mode switching
  • closer behaviors to familiar pro tools

Potential concerns:

  • canvas tooling needs to remain fast and focused
  • layers must continue to feel natural and not like a separate "bolted on" panel

Recommendation:

  • treat layers as a contextual but important panel

Power users

Benefits:

  • first-class graph model
  • easily inspect generated workflows with fewer steps
  • custom workflow tabs are now seamlessly possible
  • custom linear UI becomes more accessible

Potential concerns:

  • read-only workflow preview from structured panels may feel restrictive

Recommendation:

  • expose a "fork/duplicate" button on workflows
  • store a linear UI configuration in panel graphs, so a duplicated workflow feels similar to the panel it was duplicated from

UX principles

This proposal should follow a few clear principles:

1. One tab, one workflow context

No hidden state leakage between tabs.

2. Graph influence should be easily visible

If canvas, parameters, layers, etc. influence the generated graph, the UI should communicate it: subtly enough not to annoy power users, yet informative enough for everyone else.

3. Structured first, graph second

Casual users and creatives generally treat panels as the primary interface. The graph is visible and learnable, but interacting with it is not required.

4. Power should be available

Linear UI panels should be as first-class as the rest.

5. One page doesn't mean all panels all the time

Use smart defaults, collapsed states, contextuality and view-based behavior to keep the interface clean and approachable.


Example user flows

Structured Panel

  1. User opens a new tab
  2. Default center view is viewer
  3. Left panel shows parameters
  4. User switches to canvas view
  5. Layers panel becomes open/available
  6. User creates a new layer and mask
  7. User switches view to viewer and clicks "invoke"
  8. The workspace tab's compiled graph reflects the prompt + canvas layers + mask
  9. User switches to workflow view to inspect the generated graph
  10. If they want to customize nodes directly, they choose:
    • Edit workflow (opens a linear UI panel and makes the workflow editable)

Workflow panel

  1. User opens/imports a custom workflow
  2. Center view is workflow by default
  3. Customizable linear UI section within panel
  4. User edits nodes directly
  5. Tab does not pretend to be a structured generate/canvas tab

Open questions

  • Should layout persistence be configurable to be global or per-tab?
  • Should top-level tab state be exportable?
  • Should the center viewing area become a dockable region? If so, which panels should be allowed there?
  • Should workflows/custom nodes have panels available to the UI?
  • Should the left/right/center/bottom regions allow a panel to be hidden completely and added again later (e.g. "remove from dock")?
  • How flexible should we make the dockview? Should we allow panels to be extracted into new windows? Should we restrict certain panels to certain places?
  • Should the layers panel auto-open when the canvas view is opened?
  • Should we take this time in this major frontend rewrite to also implement a new web canvas, split off from the @invoke-ai/ui-library, or use a new UI library (Ark UI has been discussed)?

Request for feedback

I'd especially like feedback on:

  • What concerns this raises for beginners, creatives and power users
  • What contributors think the right path is for the docking/panels experience

TL;DR

Invoke should move toward a unified graph-backed workspace model, where each top-level tab owns its workflow context, existing pages become contextual panels, and center views display tab-local state.
