Auto determine how much of the model to load into RAM #9

@chelsea0x3b

Description

Use cases:

  1. You can fit the whole model into GPU RAM
  2. You can fit part of the model into GPU RAM
  3. You need to keep all the model weights on disk

In all these cases, we should be able to detect how much GPU RAM is available and use that to determine the maximum amount of the model to keep on the GPU. More advanced use cases, such as sharing the GPU with other applications, may need manual control over the memory budget, but that can be added later.
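The detect-then-allocate idea could be sketched as a small planning function: given the free GPU RAM and the per-layer weight sizes, greedily offload leading layers until the budget (minus a safety reserve) runs out. All names here (`plan_gpu_offload`, the reserve size) are hypothetical, not part of any existing API:

```python
def plan_gpu_offload(free_vram_bytes, layer_sizes_bytes, reserve_bytes=512 * 1024**2):
    """Return how many leading layers fit in GPU RAM, keeping a reserve free.

    Hypothetical sketch: layer_sizes_bytes lists the size of each weight
    layer; layers that do not fit stay on disk or in system RAM (cases 2
    and 3 above). reserve_bytes leaves headroom for activations, etc.
    """
    budget = free_vram_bytes - reserve_bytes
    n_offload = 0
    for size in layer_sizes_bytes:
        if size > budget:
            break  # this layer (and the rest) stays off the GPU
        budget -= size
        n_offload += 1
    return n_offload

# Example: 8 GiB free, ten 1 GiB layers, 512 MiB reserve -> 7 layers fit
layers = [1024**3] * 10
print(plan_gpu_offload(8 * 1024**3, layers))  # prints 7
```

Querying `free_vram_bytes` itself would be backend-specific (CUDA, Metal, etc.); the planner above stays the same regardless of how the number is obtained.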
