refactor(api,robot-server): Upgrade anyio 3.7.1 -> 4.9.0 #19071

SyntaxColoring · 2025-07-29T19:25:01Z

Overview

This upgrades our dependency on anyio in all Python code that runs on robots.

The latest release, 4.9.0, has a memory usage fix that may be important to us: anyio.to_thread.run_sync(), a helper that we use a lot, was needlessly retaining references to certain objects, preventing them from being garbage-collected.

This alone doesn't seem to fix our memory usage problems, but it's probably a necessary component of the fix.
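
For illustration only, here is a minimal sketch of how a lingering reference keeps an object alive. It does not reproduce anyio's internals: `Payload`, `_retained`, and `survives_release` are hypothetical names, and the list stands in for whatever state a helper might hold on to.

```python
import gc
import weakref


class Payload:
    """Hypothetical object that we would like to be freed after use."""


# Hypothetical stand-in for state that a helper accidentally keeps around.
_retained: list[Payload] = []


def survives_release(retain: bool) -> bool:
    """Return True if the object is still alive after the caller drops it."""
    obj = Payload()
    probe = weakref.ref(obj)  # Observes the object without keeping it alive.
    if retain:
        _retained.append(obj)  # The "needless" extra reference.
    del obj  # The caller lets go of its own reference.
    gc.collect()
    return probe() is not None


leaked = survives_release(retain=True)
freed = not survives_release(retain=False)
```

With the extra reference in place the object outlives its caller; without it, the object is collected as soon as the caller lets go.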

Test Plan and Hands on Testing

Just CI.

Risk assessment

Medium.

There is one breaking change that is relevant to us, relating to how exceptions are raised out of task groups. I think this affects at least this call site:

opentrons/robot-server/robot_server/protocols/router.py

Lines 395 to 404 in e493012

    
    try:
        source = await protocol_reader.save(
            files=buffered_files,
            directory=protocol_directory / protocol_id,
            content_hash=content_hash,
        )
    except ProtocolFilesInvalidError as e:
        raise ProtocolFilesInvalid(detail=str(e)).as_error(
            status.HTTP_422_UNPROCESSABLE_ENTITY
        ) from e

If it matters, we'll address it in a separate PR.

sfoster1

We should do this even if it doesn't fix the issues all on its own - it is a good idea.

sfoster1 · 2025-07-30T20:03:01Z

Opentrons/buildroot#251

Resolve conflicts in: * api/Pipfile.lock * robot-server/Pipfile.lock * system-server/Pipfile * system-server/Pipfile.lock

codecov · 2025-07-31T16:58:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 24.89%. Comparing base (eddea9e) to head (4cf6789).
⚠️ Report is 3 commits behind head on edge.

Additional details and impacted files

@@           Coverage Diff           @@
##             edge   #19071   +/-   ##
=======================================
  Coverage   24.89%   24.89%           
=======================================
  Files        3371     3371           
  Lines      296350   296350           
  Branches    31444    31444           
=======================================
  Hits        73773    73773           
  Misses     222553   222553           
  Partials       24       24

| Flag | Coverage Δ |
|---|---|
| protocol-designer | `18.91% <ø> (ø)` |
| step-generation | `5.36% <ø> (ø)` |

Uhh will this fix snapshot tests?

@SyntaxColoring

…col runs (#19108)

Closes [RQA-3917](https://opentrons.atlassian.net/browse/RQA-3917)

# Overview

Although protocol engine is eventually dereferenced during the protocol run lifecycle, there exists a circular reference between command history and additional objects that isn't well captured by the gc lib or memray call stack analysis. Fully clearing the run's `CommandHistory` before dereferencing the run orchestrator eliminates a memory leak.

## Test Plan and Hands on Testing

- See ticket for the wonderful script made by @SyntaxColoring, which was modified to run a step-intensive protocol 25 times in a simulated environment.

The following outputs were generated with memray, passing no additional flags when generating the bin file. To convert the bin file to an HTML file, the following was run: `memray flamegraph --split-threads --temporal /path/to/file`.

### Twenty-Five Simulated Protocol Runs (with PR only)

<img width="1451" height="411" alt="Screenshot 2025-08-01 at 4 20 37 PM" src="https://github.com/user-attachments/assets/577ea481-8106-43e9-9ff8-8f41e3dbcae3" />

Note the total memory usage (which may vary by robot). Compare with `edge` output, below. We expect to see some increase in total heap usage up to a point as various caching occurs. After run 17, there is no more apparent memory increase.

### Twenty-Five Simulated Protocol Runs (`edge` prior to any recent memory fixes, without PR)

<img width="1475" height="380" alt="Screenshot 2025-08-01 at 10 48 33 PM" src="https://github.com/user-attachments/assets/95c8b516-011e-4f8c-a890-505ad57bd1e2" />

Note that after the 25th run, total `opentrons-robot-server` heap allocation is substantially greater than the above case.

### Twenty-Five Simulated Protocol Runs (with PR), No LRU Caching, #19107 Cherry-Pick Included

<img width="1464" height="358" alt="Screenshot 2025-08-01 at 4 28 56 PM" src="https://github.com/user-attachments/assets/df9ad20c-ddfd-48c4-8b15-f3f350d68a5a" />

Effectively no increase in memory utilization after initialization and the completion of the second protocol run.

### Two Real Protocol Runs (with PR), No LRU Caching Included

<img width="1459" height="370" alt="Screenshot 2025-08-01 at 4 30 29 PM" src="https://github.com/user-attachments/assets/da3f2fa8-294e-4639-a551-423e86ed375f" />

The various spikes during the run are because of camera captures via HTTP.

### Six Real Protocol Runs (with PR, #19107, #19110, #19109, #19071)

<img width="1458" height="361" alt="Screenshot 2025-08-01 at 11 03 41 PM" src="https://github.com/user-attachments/assets/5bd18f3f-4a78-4540-8502-b5ec0a5b2e9f" />

Run between 10-40 minutes. The end of run memory for run 2 is 504MB, which is equivalent to the end of run 6 memory. The memray HTML analysis file is too large to attach directly on github, but it's included in the ticket.

## Changelog

- Fixed command history accumulating in memory across protocol runs.

## Risk assessment

low - we are clearing state exactly before we dereference the run orchestrator, at which point we don't expect this state to be available, anyway.

[RQA-3917]: https://opentrons.atlassian.net/browse/RQA-3917?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Max Marrone <[email protected]>
Co-authored-by: Seth Foster <[email protected]>

SyntaxColoring added 3 commits July 29, 2025 15:10

WIP

3237972

Locks

6877717

system-server fixup

2c28484

sfoster1 approved these changes Jul 30, 2025

Merge branch 'edge' into anyio_upgrades

e493012

Resolve conflicts in: * api/Pipfile.lock * robot-server/Pipfile.lock * system-server/Pipfile * system-server/Pipfile.lock

Update dependency in analyses-snapshot-testing too.

bbd989e

SyntaxColoring marked this pull request as ready for review July 31, 2025 16:58

SyntaxColoring mentioned this pull request Jul 31, 2025

package/python-anyio: bump to 4.9.0 Opentrons/buildroot#251

Merged

Merge branch 'edge' into anyio_upgrades

4cf6789

Uhh will this fix snapshot tests?

SyntaxColoring added the gen-analyses-snapshot-pr Generate a healing PR if the analyses snapshot test fails label Jul 31, 2025

github-actions bot mentioned this pull request Jul 31, 2025

fix(analyses-snapshot-testing): heal anyio_upgrades snapshots #19100

Merged

fix(analyses-snapshot-testing): heal anyio_upgrades snapshots (#19100)

f7e9aae

SyntaxColoring removed the gen-analyses-snapshot-pr Generate a healing PR if the analyses snapshot test fails label Jul 31, 2025

SyntaxColoring merged commit d1663b7 into edge Aug 1, 2025
50 checks passed

SyntaxColoring deleted the anyio_upgrades branch August 1, 2025 12:49

mjhuff mentioned this pull request Aug 2, 2025

fix(robot-server, api): fix command history accumulation across protocol runs #19108

Merged
