[BUG] NeMo upgrade supporting hyena flash decode requires more memory for generation

### BioNeMo Framework Version

103d76f12ab2600e3d3b6f1789669ca2e94fc042

### Bug Description

We used to be able to run the `test_evo2.py` generation tasks on an L4. Peak memory usage is now 32G for those tasks and 50G for the 7b model, which we could also run before. The main change is the move to the new API that supports flash decode at inference time.
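To compare peak memory before and after the NeMo bump, something like the following could wrap the generation call. This is a minimal sketch, not code from BioNeMo or NeMo: the `peak_gpu_memory` helper and the `report` dict are hypothetical names, and it only relies on the standard `torch.cuda` peak-memory counters. It falls back to `None` when PyTorch or a GPU is unavailable.

```python
from contextlib import contextmanager

try:
    import torch
except ImportError:  # allow the helper to be imported on machines without PyTorch
    torch = None


@contextmanager
def peak_gpu_memory(report):
    """Store the peak CUDA memory (GiB) allocated inside the with-block in report['peak_gib'].

    `report` is a plain dict supplied by the caller (hypothetical convention).
    The value is None when PyTorch or a CUDA device is not available.
    """
    cuda_ok = torch is not None and torch.cuda.is_available()
    if cuda_ok:
        torch.cuda.synchronize()
        torch.cuda.reset_peak_memory_stats()  # start counting from the current allocation
    try:
        yield report
    finally:
        if cuda_ok:
            torch.cuda.synchronize()
            report["peak_gib"] = torch.cuda.max_memory_allocated() / 2**30
        else:
            report["peak_gib"] = None


# Usage: wrap whatever call runs generation (placeholder body here).
report = {}
with peak_gpu_memory(report):
    pass  # e.g. call the generation step exercised by test_evo2.py
print("peak GiB:", report["peak_gib"])
```

Note that `max_memory_allocated` counts memory held by PyTorch tensors, so the number reported by `nvidia-smi` can be higher because of the caching allocator and CUDA context.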

Changes to skip tests that now fail: https://github.com/NVIDIA/bionemo-framework/pull/1000
NeMo diff that relates to the issue: https://github.com/NVIDIA/NeMo/compare/164d12b7155c082c050c2c5249480bf95f0865e7...b97e42b3dd1c9bcdf37c81c63220744af474c9c0 (see changes to all files that mention `hyena` in the path). 

### Steps to Reproduce

1. `test_evo2.py` passed before https://github.com/NVIDIA/bionemo-framework/pull/1000 and the NeMo version bump to top of tree.
2. Bump NeMo to main; the memory increase causes `test_evo2.py` to fail on L4.
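Until the regression is resolved, a memory-based guard is one way to decide which generation tests to skip on smaller GPUs. The sketch below is an illustration only and is not what the linked PR does: `REPORTED_PEAK_GIB`, its keys, `total_gpu_memory_gib`, and `should_skip` are hypothetical names, and the 32 and 50 figures are the peaks quoted in this issue, treated here as GiB (an assumption about the unit).

```python
try:
    import torch
except ImportError:  # keep the helper importable without PyTorch
    torch = None

# Peak usage quoted in this issue; keys are hypothetical labels, unit assumed to be GiB.
REPORTED_PEAK_GIB = {"test_evo2_generation": 32, "7b": 50}


def total_gpu_memory_gib(device=0):
    """Return total memory of the given CUDA device in GiB, or 0.0 if none is usable."""
    if torch is None or not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(device).total_memory / 2**30


def should_skip(label, available_gib=None):
    """True when the available GPU memory is below the reported peak for `label`."""
    if available_gib is None:
        available_gib = total_gpu_memory_gib()
    return available_gib < REPORTED_PEAK_GIB[label]


# A 24 GiB card, such as an L4, falls below both reported peaks.
print(should_skip("test_evo2_generation", available_gib=24))
print(should_skip("7b", available_gib=24))
```

The result could feed a `pytest.mark.skipif` condition, so the tests run again automatically on GPUs with enough memory instead of being skipped unconditionally.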

### Error Messages and Logs

```shell

```

### Docker Image

_No response_

### System Information

Environment Details:
- OS: [e.g., Ubuntu 20.04]
- CPU: [e.g., Intel i9-12900K]
- RAM: [e.g., 64GB]

GPU Details:
- GPU Model: [e.g., NVIDIA RTX 4090]
- GPU Memory: [e.g., 24GB]
- CUDA Version: [e.g., 12.1]
- CUDA Driver: [e.g., 525.85.05]
- cuDNN Version: [e.g., 8.9.0]


### Additional Context

_No response_


[BUG] NeMo upgrade supporting hyena flash decode requires more memory for generation #1013
