Request: lowest, largest and median clks of hits in Instruction timing view #109

@DadSchoorse

Instruction timing currently only has the "Clks normalized by wavefronts", "Clks normalized by hit count" and "Total clks" options. The second is the most useful of them when trying to view ALU latency. However, there are often latency outliers. I assume one reason is that the CU issued instructions from other waves between two neighboring VALU instructions, but there could be other reasons too.

Currently, the only way to avoid this issue, at least partially, is to look only at the hits from "slowest in selection" or "fastest in selection". These will still have outliers, but they are much more apparent and not hidden in the average of thousands of hits. The obvious disadvantage is that these two waves might not cover all execution paths in the program, so in branch-heavy workloads a lot of the shader will just be greyed out.

The lowest and median clks of the hits would exclude large outliers by definition and make "Wavefront latency: selection total" more useful for displaying ALU latency and suboptimal instruction scheduling.
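
To illustrate the point with made-up numbers (not taken from a real capture), here is a rough sketch of how the proposed statistics compare to the existing per-hit average. The function name `summarize_hits` and the sample data are hypothetical and are not part of the profiler.

```python
from statistics import mean, median


def summarize_hits(clks_per_hit):
    """Summarize the per-hit clock counts recorded for one instruction."""
    return {
        "mean": mean(clks_per_hit),      # comparable to "Clks normalized by hit count"
        "lowest": min(clks_per_hit),     # requested: lowest clks of the hits
        "median": median(clks_per_hit),  # requested: median clks of the hits
        "largest": max(clks_per_hit),    # requested: largest clks of the hits
    }


# Hypothetical data: nine hits at 4 clks and one outlier at 400 clks,
# e.g. because other waves were issued in between.
hits = [4] * 9 + [400]
stats = summarize_hits(hits)

for name, value in stats.items():
    print(f"{name}: {value}")
```

With this sample, the single outlier pulls the average well above the typical latency, while the lowest and median values stay at the typical value and the largest value shows the outlier on its own.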

Examples of the issues I'm currently having:

Image

This is using the selection total option. All of the VALU instructions are displayed as having huge latency (in ALU terms); if these latencies were always incurred, performance would be terrible.

Image

This is fastest in selection. Clearly the compiler did a decent scheduling job here; a lot of instructions have the minimum possible latency.

Image

This is slowest in selection. There are clear outliers, but overall this is still more informative than the first view. It is not what I would want to use, but it is still better than the first option if the second had no coverage of this part of the program.
