
Thanks for sharing this excellent implementation of ring attention.
Here are my test results on 2× A100 GPUs (connected via NVLink). Judging from the results, the memory usage of ring attention (`ring_flash_attn_qkvpacked_func`) seems to be very large, which is not what I expected. Are there any possible causes?