-
Notifications
You must be signed in to change notification settings - Fork 69
A tracking utility for gathering the compile and/or runtime time, size, profiling and other statistics #4777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
41015d0
to
1216480
Compare
7843958
to
9752167
Compare
9752167
to
d2a0de4
Compare
}, | ||
py::call_guard<py::gil_scoped_release>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't allow calling the callback function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to make conditional? For example, still use it if pyCb=std::nullopt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it's released in the beginning of the lambda and acquired on each callback call.
I would also add tests for this utility so that the code does not become outdated unexpectedly. |
@AndreyPavlenko Will it be possible to distinguish between configurations for the single script like our microbenchmarks? Like if I call kernel with different input parameters, I would probably want to get separate compile time for each input size. |
There will be separate reports for each compilation. |
Can you show how to distinguish between them? I think currently I only see a folder with kernel name and inside a lot of files with similar names, like |
Currently it has the same name as the kernel name and it's difficult to distinguish. A similar issue is discussed here - #4800 (comment) .
|
d2a0de4
to
55be9d9
Compare
Now constexprs are added to the kernel names and the grid is added to the kernel runs. |
55be9d9
to
7b81c66
Compare
f16b622
to
c465ae3
Compare
c465ae3
to
b15f57b
Compare
@@ -13,6 +13,7 @@ | |||
import os | |||
import subprocess | |||
from pathlib import Path | |||
from .track import track |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be aligned with current style and move it to top of this file.
from .track import track | |
from triton.backends.intel.track import track |
@@ -68,6 +68,7 @@ env: | |||
VERIFY: ${{ (github.event_name == 'pull_request' || github.event_name == 'schedule' || inputs.verify) && '1' || '0' }} | |||
TAG: ${{ inputs.tag || (github.event_name == 'pull_request' && format('pr-{0}', github.event.number)) || (github.event_name == 'schedule' && 'ci') || 'test' }} | |||
N_RUNS: ${{ inputs.n_runs || '1' }} | |||
TRITON_TRACK_DUMP: "$PWD/reports/track" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make it optional depending on input from user. It can cause overhead, which can generally be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also enable this profiling for some test in intel
folder at least.
|
||
|
||
def _tr_env(name: str, default: str = "", type: Any = str) -> Any: | ||
return type(os.environ.get(name, default).strip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This returns a type, not a value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It returns the value of the specified type - str, int, etc.
# To enable the tracking, set the environment variable ``TRITON_TRACK_DUMP`` | ||
# to either ``1``, ``true``, ``yes``, ``on``, ``y`` or a path to a directory | ||
# where the tracking reports will be dumped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need all these possible values for TRITON_TRACK_DUMP
?
I would leave only path to a directory and undefined cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can also print dumps into console. So many values to be consistent with other bool env vars, that support all these values.
else: | ||
import pathlib | ||
_TR_DUMP = lambda tr, dir=pathlib.Path(_TR_DUMP): tr.dump(dir) | ||
if _TR_DUMP is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's reduce number of code under top level if else branches.
We can define several classes and choose appropriate one in the end of this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will allow us to move almost all python imports at the top of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But, in this case, it will do useless calculations and imports. I can split it into multiple files and import the required, depending on the env vars.
return time, time | ||
return 0., {k: Track._to_value(v) if isinstance(v, dict) else v for k, v in values.items()} | ||
|
||
if _tr_env_on("TRITON_TRACK_SORT", True): # Sort results by the total time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undocumented
|
||
if _tr_env_on("TRITON_TRACK_SORT", True): # Sort results by the total time | ||
|
||
def _to_value(values: Dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to define duplicate function and redefine it?
We could check TRITON_TRACK_SORT
inside of _to_value
function.
@staticmethod | ||
def on_exit(self): | ||
self.pr.disable() | ||
st = pstats.Stats(self.pr, stream=TrackAndProfile._DEVNULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably not necessary unless st.print_results
is called?
st = pstats.Stats(self.pr, stream=TrackAndProfile._DEVNULL) | |
st = pstats.Stats(self.pr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's required, because when st.get_print_list() is called, it prints messages to the stream.
|
||
return decorator(funcOrName) if callable(funcOrName) else decorator | ||
|
||
# This ugly hook is used to decorate the upstream functions and avoid circular imports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do circular imports appear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we decorate the functions in the triton.runtime.jit module from backend, but the backend is called by that module.
To enable the tracking, set the environment variable
TRITON_TRACK_DUMP
to either1
,true
,yes
,on
,y
or a path to a directory where the tracking reports will be dumped.To add the profiling statistics to the reports, set the
TRITON_TRACK_PROFILE
environment variable.To track the kernel launches, set the
TRITON_TRACK_RUN
environment variable.Link #4716