* try to have only 1 instance of the trunk pairwise embeddings at any time
* avoid summing more than 2 terms at a time
* update Tensors in-place by using `[:]` or `+=`
* remove large Tensors asap
* avoid duplicating `n_res ** 3`-sized b-Tensor
* update in-place
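The in-place patterns listed above can be sketched as follows. This is an illustration of the idea, not the actual Boltz code; NumPy is used as a stand-in for PyTorch tensors, where the semantics of `+=` and `[:]` are the same.

```python
import numpy as np

# Small stand-in sizes; the real trunk pair tensors are O(n_res**2) and multi-GB.
n_res, dim = 8, 4

z = np.zeros((n_res, n_res, dim), dtype=np.float32)
a = np.ones((n_res, n_res, dim), dtype=np.float32)

buf = z  # keep a handle to verify no reallocation happens

# Out-of-place (avoided): `z = z + a` allocates a fresh tensor, so two
# n_res**2-sized arrays coexist until the assignment completes.

# In-place (preferred): writes into z's existing storage.
z += a           # accumulate without a new allocation
z[:] = z * 0.5   # '[:]' assigns into the same buffer

assert z is buf  # same storage throughout

# Drop large temporaries as soon as they are no longer needed.
del a
```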
The transition module handles large tensors. Computing multiple operations simultaneously in a one-liner with such tensors requires a lot of memory; a more sequential process reduces that load.
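A minimal sketch of what "more sequential" means here (hypothetical function and weight names, NumPy standing in for PyTorch): the fused one-liner keeps the matmul result, its activated copy, and the final product alive at once, while the sequential version reuses a single intermediate buffer.

```python
import numpy as np

def transition_oneliner(z, w_in, w_out):
    # Fused expression: the matmul result, the activated copy, and the
    # final product are all live at the same time.
    return np.maximum(z @ w_in, 0.0) @ w_out

def transition_sequential(z, w_in, w_out):
    h = z @ w_in               # single intermediate
    np.maximum(h, 0.0, out=h)  # activation applied in-place into h
    out = h @ w_out
    del h                      # release the intermediate promptly
    return out

rng = np.random.default_rng(0)
z = rng.standard_normal((16, 16, 8)).astype(np.float32)
w_in = rng.standard_normal((8, 32)).astype(np.float32)
w_out = rng.standard_normal((32, 8)).astype(np.float32)

out1 = transition_oneliner(z, w_in, w_out)
out2 = transition_sequential(z, w_in, w_out)
```

Both versions compute the same result; only the peak number of simultaneously live intermediates differs.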
A number of changes intended to avoid short-lived memory allocations for intermediate variables.
More complicated one-liners can imply that several like-sized Tensors are stored until the computation completes, even though a more sequential approach could reduce that number. For multi-GB tensors in the trunk module, containing $\mathcal{O}(N_{residue}^2)$ elements, this can prove a crucial optimization.
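For a rough sense of scale (illustrative numbers, not taken from the PR), here is the back-of-the-envelope size of one such pairwise tensor. The residue count and channel width below are assumptions for illustration:

```python
n_res = 4_000     # residue count for a large complex (assumed)
channels = 128    # pair-embedding width (assumed)
bytes_per_el = 4  # float32

gb = n_res**2 * channels * bytes_per_el / 1e9
print(f"one pairwise tensor: {gb:.1f} GB")  # ~8.2 GB
```

Every extra simultaneously live copy adds that much again to peak memory, which is why avoiding even one duplicate matters on a 40GB card.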
Predicting the structure of 9b9j on a 40GB A100 GPU fails without these changes, but succeeds with them.
Typical changes:
* `z = z + a` to `z += a` (or `z[:] = z + a`) to write directly to already allocated memory

Clearly this approach has some drawbacks, reducing speed (probably) and the versatility of methods (which now modify the outer scope), but it might be an acceptable trade-off.
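The "modify the outer scope" drawback can be made concrete with a small sketch (hypothetical helper names, NumPy standing in for PyTorch tensors):

```python
import numpy as np

def add_pure(z, a):
    # Returns a fresh tensor; the caller's z is untouched.
    return z + a

def add_inplace(z, a):
    # Mutates the caller's tensor; saves an allocation but couples the
    # function to its caller's state. (PyTorch marks such methods with
    # a trailing underscore, e.g. `Tensor.add_`.)
    z += a
    return z

z = np.zeros(3)
out = add_inplace(z, np.ones(3))
assert out is z  # same storage: the outer-scope tensor was modified
```

A caller holding a reference to `z` (e.g. for a residual connection) would silently see the mutated values, which is the versatility cost mentioned above.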
NOTE: While I think changes like this will in general optimize memory movements, as e.g. argued here and as seen from the actual memory-consumption improvement of Boltz, it could be that not all proposed changes have an actual effect.