
fix 5 critical pipeline and render performance bottlenecks#1012

Open
Mitesh-Chaudhari wants to merge 2 commits into lichtblick-suite:develop from Mitesh-Chaudhari:bugfix/critical-fixes-for-performance-improvements

Conversation

@Mitesh-Chaudhari Mitesh-Chaudhari commented Mar 25, 2026

User-Facing Changes

This PR addresses five critical performance bottlenecks identified through structured analysis under high-volume MCAP workloads. Users will experience:

  • Faster Topic Changes: Adding a new topic no longer wipes the entire in-memory cache. Seek performance is maintained after subscription changes.

  • Smoother 3D Rendering: Redundant React re-renders on every pipeline tick are eliminated. WebGL frames are now scheduled via requestAnimationFrame.

  • Faster User Script Preloading: Preloaded blocks are processed in parallel across Worker RPCs, significantly reducing wait times for large files.

  • Reduced UI Jank: Panel log writes no longer force every PanelContext consumer to re-render.

Description

C-1 — CachingIterableSource: selective block eviction on topic change
File: packages/suite-base/src/players/IterablePlayer/CachingIterableSource.ts

  • Problem
    Every topic-subscription change — including adding a single topic — triggered a full cache wipe:
    // Before
    this.#cache.length = 0;   // discard up to 600 MB
    this.#totalSizeBytes = 0;

A cache built over several minutes of forward playback was silently destroyed whenever any panel added a subscription, forcing a full re-read from disk.

  • Root cause
    CacheBlock stored no record of which topics it was read with, making it impossible to know which blocks were still valid for the new subscription set.

  • Fix
    Added a topics: TopicSelection field to CacheBlock, stamped at creation time. On a topic change a three-case strategy is applied:

  1. Empty subscription: wipe everything (unchanged)

  2. Topic additions: evict only blocks whose block.topics set is missing at least one newly added topic

  3. Topic removals only: keep the entire cache; the downstream iterator skips unsubscribed topics

Also added #highWaterMark: Time tracking (advances as blocks are consumed) as a secondary safety bound for future eviction work.

Test updated: CachingIterableSource.test.ts — "should clear the cache when topics change" updated to assert selective preservation.
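
The three-case strategy can be sketched as a pure function over a simplified block model. The `CacheBlock` shape, `TopicSelection` as a `Set<string>`, and `evictOnTopicChange` below are illustrative stand-ins for the demo, not the suite's actual types:

```typescript
// Minimal sketch of selective block eviction on topic change.
// TopicSelection is modeled as Set<string>; the real type differs.
type TopicSelection = Set<string>;

interface CacheBlock {
  topics: TopicSelection; // topics the block was read with, stamped at creation
  sizeBytes: number;
}

function evictOnTopicChange(
  cache: CacheBlock[],
  newTopics: TopicSelection,
): CacheBlock[] {
  // Case 1: empty subscription — wipe everything (unchanged behaviour).
  if (newTopics.size === 0) {
    return [];
  }
  // Cases 2 and 3: keep a block only if it already covers every new topic.
  // Pure removals keep the whole cache, because every block's stamped topic
  // set is a superset of the shrunken subscription; additions evict only
  // blocks missing a newly added topic.
  return cache.filter((block) =>
    [...newTopics].every((topic) => block.topics.has(topic)),
  );
}
```

Stamping the topic set at block-creation time is what makes this check cheap: validity is a pure set comparison, with no re-inspection of the block's messages.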

C-2 — UserScriptPlayer: parallel block processing with Promise.all
File: packages/suite-base/src/players/UserScriptPlayer/index.ts

  • Problem
    #getMessages() was previously fixed to use Promise.all for real-time message delivery. However #getBlocks() — the code path for every preloaded block in a heavy MCAP load — still processed messages sequentially:
    // Before — O(blocks × messages × scripts) serial Worker round-trips
    for (const block of blocks) {
      for (const message of blockMessages) {
        const output = await scriptRegistration.processBlockMessage(message, globalVariables);
      }
    }

For a 911 MB MCAP file with 705,000 messages, this created hundreds of thousands of sequential Worker round-trips and was the dominant bottleneck for large file preloads.

  • Fix
    Applied the same Promise.all batching to #getBlocks:
    // After — blocks processed concurrently; messages within each block dispatched in parallel
    const outputBlocks = await Promise.all(
      blocks.map(async (block) => {
        const tasks = eligibleMessages.map((msg) =>
          scriptRegistration.processBlockMessage(msg, globalVariables),
        );
        const results = await Promise.all(tasks);
        // ...
      }),
    );

Block ordering is preserved because Promise.all on the outer blocks.map maintains index correspondence.
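
The ordering claim can be demonstrated in isolation: `Promise.all` resolves to results in input-index order even when the underlying work completes out of order. The `delay` helper and `processBlocksConcurrently` below are invented for this demo:

```typescript
// Demonstrates the ordering property C-2 relies on: Promise.all
// resolves in input-index order even when work completes out of order.
const delay = async (ms: number) =>
  new Promise<void>((res) => setTimeout(res, ms));

async function processBlocksConcurrently(blocks: number[]): Promise<string[]> {
  return await Promise.all(
    blocks.map(async (block, i) => {
      // Later blocks deliberately finish earlier here,
      // yet the results stay index-aligned with the input.
      await delay((blocks.length - i) * 10);
      return `block-${block}`;
    }),
  );
}
```

Because the `.map` index fixes each result's slot up front, no explicit re-sorting of block outputs is needed after the concurrent dispatch.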

C-3 — ThreeDeeRender: requestAnimationFrame-based render scheduling
File: packages/suite-base/src/panels/ThreeDeeRender/ThreeDeeRender.tsx

  • Problem
    requestRender was implemented as setRenderToken(t + 1) — a React state setter. Calling it from inside useEffect callbacks scheduled an extra React commit cycle per call. With 7 call sites, every scene update produced two commits instead of one.
    // Before — setState inside effects = extra commit per render request
    const [renderToken, setRenderToken] = useState(0);
    const requestRender = useCallback(() => setRenderToken((t) => t + 1), []);
  • Fix
    // After — native RAF, zero React state, multiple requests coalesce into one frame
    const rafIdRef = useRef<number | undefined>(undefined);
    const requestRender = useCallback(() => {
      if (rafIdRef.current == undefined) {
        rafIdRef.current = requestAnimationFrame(() => {
          rafIdRef.current = undefined;
          rendererRef.current?.animationFrame();
          renderDoneRef.current?.();   // pipeline signal — no setState (see C-4)
          renderDoneRef.current = undefined;
        });
      }
    }, []);

C-4 — ThreeDeeRender: guard remaining unconditional setState calls in onRender
File: packages/suite-base/src/panels/ThreeDeeRender/ThreeDeeRender.tsx

  • Problem
    Even after guards were added for colorScheme, topics, parameters, sharedPanelState, and timezone, two setState calls in onRender remained unconditional on every pipeline tick:
    // Before — guaranteed re-render every tick regardless of other guards
    setRenderDone(() => done);              // every tick
    setCurrentFrameMessages(currentFrame);  // every tick — always a new array ref
  • Fix
    renderDone is moved to a useRef and called inside the RAF callback (see C-3). currentFrame is guarded with Object.is against a previous-value ref:

    // After
    renderDoneRef.current = done; // ref write — no React commit

    if (!Object.is(renderState.currentFrame, prevCurrentFrameRef.current)) {
      prevCurrentFrameRef.current = renderState.currentFrame;
      setCurrentFrameMessages(renderState.currentFrame);  // only when frame ref changes
    }
  Measured React commits per pipeline tick:

  1. Idle scene: Before: 2, After: 0

  2. New frame data: Before: 2, After: 1

C-5 — Panel: stabilise PanelContext value; remove logs.length from memo deps
Files: Panel.tsx, PanelToolbarControls.tsx, types.ts

  • Problem
    The useMemo wrapping panelContextValue had logs.length in its dependency array:
    // Before — logs.length changing on every log write invalidates the memo
    const panelContextValue = useMemo(() => ({ ..., logCount: logs.length }), [
      ...,
      logs.length,  // ← changes on every panel.logError() call during playback
    ]);

Every panel log write during playback produced a new context object, forcing all PanelContext consumers (toolbar buttons, child components) to reconcile unnecessarily.

  • Fix
    // After — logCountRef updated via mutation; stable getLogCount callback exposed in context
    const logCountRef = useRef(logs.length);
    logCountRef.current = logs.length;                        // mutation — no render
    const getLogCount = useCallback(() => logCountRef.current, []);
    const panelContextValue = useMemo(
      () => ({ ..., getLogCount }),
      [...deps /* logs.length removed */, getLogCount],
    );

PanelToolbarControls reads getLogCount() on render — the badge value is always current without creating a reactive dependency.

Type change: PanelContextType.logCount?: number is replaced by PanelContextType.getLogCount?: () => number. Any custom panel that reads logCount directly from PanelContext must be updated to call getLogCount?.() ?? 0 instead.
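
A minimal sketch of that consumer-side migration, with the context shape reduced to the one relevant field (`PanelContextLike` and `readLogCount` are names invented for the demo):

```typescript
// Illustrative consumer migration after the C-5 type change.
interface PanelContextLike {
  getLogCount?: () => number;
}

// Replaces a direct read of the removed `logCount` field.
function readLogCount(ctx: PanelContextLike): number {
  return ctx.getLogCount?.() ?? 0;
}
```

The optional-chaining call with a `?? 0` fallback keeps custom panels working against older context providers that expose neither field.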

@aneuwald-ctw
Could you please review this PR?

Checklist

  • The web version was tested and it is running ok
  • The desktop version was tested and it is running ok
  • This change is covered by unit tests
  • Files constants.ts, types.ts and *.style.ts have been checked and relevant code snippets have been relocated

@Mitesh-Chaudhari
Author

Hi @aneuwald-ctw and @lichtblick-suite maintainers, I noticed the CI checks are still pending for these updates. Could you please check if they require manual approval to run? Thanks!
