Skip to content

Conversation

@victor-eds
Copy link
Contributor

@victor-eds victor-eds commented Nov 20, 2024

Use llvm.func intel_reqd_sub_group_size to express sub-group size instead of triton_gen attributes that are later translated.

Replace triton_gen.max_work_group_size value type with dense array.

…ND-ranges

Use `llvm.func` `reqd_work_group_size` and `intel_reqd_sub_group_size` to
express ND-range dimensions instead of `triton_gen` attributes that are
later translated.

Signed-off-by: victor-eds <[email protected]>
@victor-eds victor-eds requested review from a team, chengjunlu and whitneywhtsang November 20, 2024 11:47
@victor-eds victor-eds self-assigned this Nov 20, 2024
@chengjunlu
Copy link
Contributor

The changes LGTM. But I am not sure about the reason of the old code. Let @whitneywhtsang to approve.

@whitneywhtsang
Copy link
Contributor

There are some test_reduce failures, maybe due to changing from max_work_group_size to reqd_work_group_size?

@victor-eds
Copy link
Contributor Author

There are some test_reduce failures, maybe due to changing from max_work_group_size to reqd_work_group_size?

Interesting. I'll look into that tomorrow.

@whitneywhtsang
Copy link
Contributor

There are some test_reduce failures, maybe due to changing from max_work_group_size to reqd_work_group_size?

Interesting. I'll look into that tomorrow.

Confirmed it is due to changing from max_work_group_size to reqd_work_group_size. I modified main branch to use reqd_work_group_size, and I can observe the test_reduce failures.

@victor-eds
Copy link
Contributor Author

Confirmed it is due to changing from max_work_group_size to reqd_work_group_size. I modified main branch to use reqd_work_group_size, and I can observe the test_reduce failures.

Interesting. I'll take a look.

@victor-eds
Copy link
Contributor Author

Interesting. I'll take a look.

@whitneywhtsang @etiotto Apparently not setting anything at all fixes crashes. I'd go with this for now, get this merged to get going and open an investigation ticket to tackle ASAP. The reqd_work_group_size attribute may be helpful for codegen and improve performance after all and I'd say we want to use that.

My guess is we're modifying the number of warps or warp size at some point during this lowering process and this mismatch leads to crashes.

Does this course of action sound good?

@victor-eds
Copy link
Contributor Author

I'll restore back max_work_group_size and file a ticket to use reqd_work_group_size when we find out what's the matter here

@victor-eds
Copy link
Contributor Author

I will keep the dense array specification for max_work_group_size as this has no impact, I had already done it and it will make the change to llvm.func's reqd_work_group_size easier.

@victor-eds
Copy link
Contributor Author

@etiotto are we OK with this now that we're keeping the max_work_group_size code?

@etiotto
Copy link
Contributor

etiotto commented Nov 26, 2024

@etiotto are we OK with this now that we're keeping the max_work_group_size code?

yes

@etiotto etiotto merged commit 553b997 into intel:main Nov 26, 2024
5 checks passed
@victor-eds victor-eds deleted the use-llvm-func-attrs branch November 26, 2024 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants