
[RFC] FLA Config Cache #704

@zhiyuan1i


Proposal

Create a new class that patches Triton's autotune so that kernel autotune configurations can be read from files on local or remote disks.
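A minimal sketch of the file lookup such a class could perform, not the actual design. The class name `ConfigCache`, the one-JSON-file-per-kernel layout, and the comma-joined key format are all assumptions made for illustration. The sketch does not touch Triton; a patched autotuner would presumably call `lookup` before benchmarking and turn a hit into a `triton.Config`.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class ConfigCache:
    """Hypothetical store of pre-tuned kernel configs (not FLA's actual API)."""

    def __init__(self, root: str) -> None:
        # `root` may be a local directory or a mounted remote filesystem path.
        self.root = Path(root)

    @staticmethod
    def make_key(key_values: tuple) -> str:
        # Assumed format: autotune key values joined by commas, e.g. "64,128".
        return ",".join(str(v) for v in key_values)

    def lookup(self, kernel_name: str, key_values: tuple) -> Optional[dict]:
        """Return the stored config dict, or None if nothing is cached."""
        path = self.root / f"{kernel_name}.json"
        if not path.is_file():
            return None
        with path.open() as f:
            table = json.load(f)
        return table.get(self.make_key(key_values))


# Demo with a hand-written config file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    example = {"64,128": {"BLOCK_SIZE": 64, "num_warps": 4, "num_stages": 2}}
    (Path(tmp) / "demo_kernel.json").write_text(json.dumps(example))

    cache = ConfigCache(tmp)
    hit = cache.lookup("demo_kernel", (64, 128))    # stored entry
    miss = cache.lookup("demo_kernel", (32, 32))    # unknown key
    absent = cache.lookup("other_kernel", (64, 128))  # no file for this kernel
```

On a miss the caller could fall back to normal autotuning, so a missing or incomplete config file would not change existing behavior.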

Rationale

This can help with:

  1. potential backward-pass (bwd) precision issues on H20
  2. compilation failures with certain configurations in some Triton versions
  3. cross-task precision alignment
  4. kernel launch speed

At present, we do not plan to support automated configuration generation or cross-shape speed optimization. In other words, users will need to tune and generate configurations manually, and fla will provide a script for doing this kind of tuning.
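As a rough illustration of what a manual tuning script might do, the sketch below picks the fastest of a few user-supplied candidate configurations and writes it to a JSON file. Everything here is hypothetical: the function name `tune_and_save`, the file layout (one JSON file per kernel, keyed by comma-joined shape values), and the stand-in cost function, which replaces a real GPU benchmark so the example runs anywhere.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def tune_and_save(
    kernel_name: str,
    key_values: tuple,
    candidates: list,
    measure_ms: Callable[[dict], float],
    out_dir: str,
) -> dict:
    """Pick the candidate with the lowest measured time and persist it."""
    best = min(candidates, key=measure_ms)

    # Merge into any existing file so other shapes are preserved.
    path = Path(out_dir) / f"{kernel_name}.json"
    table = json.loads(path.read_text()) if path.is_file() else {}
    table[",".join(str(v) for v in key_values)] = best
    path.write_text(json.dumps(table, indent=2, sort_keys=True))
    return best


def placeholder_cost(cfg: dict) -> float:
    # Stand-in for a real benchmark; NOT a measurement. It simply prefers
    # BLOCK_SIZE == 64 so that the demo is deterministic.
    return float(abs(cfg["BLOCK_SIZE"] - 64))


candidates = [
    {"BLOCK_SIZE": 32, "num_warps": 2},
    {"BLOCK_SIZE": 64, "num_warps": 4},
    {"BLOCK_SIZE": 128, "num_warps": 8},
]

with tempfile.TemporaryDirectory() as tmp:
    best = tune_and_save("demo_kernel", (2048, 128), candidates, placeholder_cost, tmp)
    saved = json.loads((Path(tmp) / "demo_kernel.json").read_text())
```

In an actual script, `measure_ms` would time the compiled kernel on the target GPU, and the resulting file would be tied to a specific shape, device, and Triton version.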


Labels: enhancement
