[Example] One shot all reduce #245
stack-info: PR: #245, branch: joydddd/stack/12
@helion.jit(
    config=helion.Config(
Are we able to autotune this yet?
No. Unfortunately.
What are the blockers?
We'll need support for collaborative autotuning across multiple torchrun-initiated processes.
I have event-based benchmarking infra ready in #393 (autotuner/benchmarker), which reports timing results on process 0.
We need to:
- Make sure all processes benchmark the same configs in the same order. (Is there any randomization in the autotuning process?)
- Use the event-based benchmarker inside the autotuner when running in a torchrun env. (easy)
- Communicate results from process 0 to all processes, OR have process 0 make the decision and communicate the optimal config to all processes. (Through caching?)
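To make the three steps above concrete, here is a minimal, framework-free sketch of how they might fit together. It is not Helion's autotuner or the benchmarker from #393. Every name in it (`candidate_order`, `pick_best`, `autotune_on_rank`, and the `block_size` / `num_warps` keys) is hypothetical. The `benchmark` and `broadcast` callables are placeholders: in a real torchrun job, something like `torch.distributed.broadcast_object_list` with `src=0` could play the role of `broadcast`.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence

Config = Dict[str, int]  # hypothetical stand-in for a kernel config


def candidate_order(configs: Sequence[Config], seed: int) -> List[Config]:
    """Return the configs in a pseudo-random order that depends only on `seed`.

    If every process uses the same seed, every process visits the same
    configs in the same order, even if the search itself is randomized.
    """
    ordered = list(configs)
    random.Random(seed).shuffle(ordered)  # private RNG, global state untouched
    return ordered


def pick_best(configs: Sequence[Config], timings_ms: Sequence[float]) -> Config:
    """Pick the config with the lowest measured time (first one wins ties)."""
    best_index = min(range(len(configs)), key=lambda i: timings_ms[i])
    return configs[best_index]


def autotune_on_rank(
    rank: int,
    configs: Sequence[Config],
    seed: int,
    benchmark: Callable[[Config], float],
    broadcast: Callable[[Optional[Config]], Config],
) -> Config:
    """Run one process's share of a collaborative autotuning pass.

    `benchmark` and `broadcast` are assumptions of this sketch:
    - `benchmark(cfg)` launches the kernel on this rank and returns a time;
      every rank must call it because the kernel is a collective.
    - `broadcast(value)` returns rank 0's value on every rank.
    """
    order = candidate_order(configs, seed)
    timings = [benchmark(cfg) for cfg in order]
    # Only rank 0's timings are trusted; the other ranks contribute None.
    best = pick_best(order, timings) if rank == 0 else None
    return broadcast(best)
```

Having process 0 make the decision and share only the winning config keeps the message small and guarantees that all ranks end up compiling the same kernel, which matters for a collective such as all-reduce.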
Stacked PRs:
[Example] One shot all reduce