Add support for bias in optimized op_linear.cpp. #11210
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11210. Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit c6d1d2a with merge base 6875c8e. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
This pull request was exported from Phabricator. Differential Revision: D75491158
Summary: Diff uses `op_add_sub_impl` to add bias after the optimized gemm call. Reviewed By: zonglinpeng. Differential Revision: D75491158
Force-pushed 6739c91 to 7205931.
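For context, the approach at this point is: run the optimized gemm to produce the n x m result, then add the bias in a second element-wise pass (the role `op_add_sub_impl` plays here). A minimal sketch of that post-gemm pass, using plain buffers rather than the real ExecuTorch tensor types (all names below are illustrative, not the actual API):

```cpp
#include <cstddef>

// Sketch: after gemm has written the n x m result into `out`, add the
// length-m bias to every row. This mirrors "add bias after the optimized
// gemm call"; add_bias_rows is an illustrative helper, not ExecuTorch API.
template <typename scalar_t>
void add_bias_rows(
    scalar_t* out, const scalar_t* bias, std::size_t n, std::size_t m) {
  for (std::size_t row = 0; row < n; ++row) {
    for (std::size_t col = 0; col < m; ++col) {
      out[row * m + col] += bias[col];
    }
  }
}
```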
Review thread on shim_et/xplat/executorch/kernels/optimized/op_registration_util.bzl (outdated, resolved).
Force-pushed 7205931 to c7022c9, then c7022c9 to 306ee28, with the same summary.
Force-pushed 306ee28 to d625780; the summary now also notes "Pull Request resolved: pytorch#11210".
Further force-pushes with the same summary: d625780 to 2b84698, 2b84698 to 70c406c, 70c406c to ab44a56, and 1eacff3 to 7f7ff17.
Summary: Diff initializes the output tensor before calling gemm with beta=1 when bias is non-nullopt. Reviewed By: larryliu0820, zonglinpeng. Differential Revision: D75491158
Force-pushed 7f7ff17 to 0b33f00.
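The revised approach leans on the standard gemm contract out = alpha * (x @ wᵀ) + beta * out: pre-fill the output with the bias, then call gemm with beta = 1 so the bias is accumulated in the same pass. A self-contained sketch under that assumption, with a naive reference gemm standing in for the optimized kernel (all names are illustrative):

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <vector>

// Illustrative reference gemm with the usual contract
// out = alpha * (x @ w^T) + beta * out, for row-major x (n x k), w (m x k).
void gemm_ref(const float* x, const float* w, float* out,
              std::size_t n, std::size_t m, std::size_t k,
              float alpha, float beta) {
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < m; ++j) {
      float acc = 0.f;
      for (std::size_t p = 0; p < k; ++p) {
        acc += x[i * k + p] * w[j * k + p];
      }
      out[i * m + j] = alpha * acc + beta * out[i * m + j];
    }
  }
}

// Sketch of the PR's approach: when bias is present, pre-fill every output
// row with the length-m bias, then call gemm with beta = 1 so the bias is
// accumulated for free; otherwise call gemm with beta = 0 as before.
void linear_with_bias(const float* x, const float* w,
                      const std::optional<std::vector<float>>& bias,
                      float* out, std::size_t n, std::size_t m, std::size_t k) {
  if (bias.has_value()) {
    for (std::size_t i = 0; i < n; ++i) {
      std::memcpy(out + i * m, bias->data(), m * sizeof(float));
    }
    gemm_ref(x, w, out, n, m, k, /*alpha=*/1.f, /*beta=*/1.f);
  } else {
    gemm_ref(x, w, out, n, m, k, /*alpha=*/1.f, /*beta=*/0.f);
  }
}
```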
Force-pushed 0b33f00 to 9205582 with the same summary, now also noting "Pull Request resolved: pytorch#11210".
kimishpatel left a comment:
I want to block this for now in favor of landing the other fix for bias. mm.out -> linear.out is the right change, though.
kimishpatel left a comment:
OK, so this diff does incorporate the "idea" from the other diff, so it seems OK to me, but I would like the test case to be updated.
Force-pushed 9205582 to b14f114 with the same summary.
Further force-pushes with the same summary: b14f114 to b38dfa3, b38dfa3 to 499fb97, 499fb97 to 7f303be, and 7f303be to 49b3b4a.
Review thread on the bias-copy loop in op_linear.cpp:

```cpp
// Output is a n x m x scalar_t, while bias is m x scalar_t.
const size_t row_size = static_cast<size_t>(m) * sizeof(scalar_t);
for (const auto col : c10::irange(n)) {
  std::memcpy(
```
To handle 2d bias, you need to fix this. Bias pointer is not advancing
hmm, it's more complicated than that. bias can be 1xm, nx1, or nxm.
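One way to make that copy shape-aware for the three layouts just mentioned is to pick the bias stride from its shape. A standalone sketch (plain pointers instead of the actual tensor accessors; `bias_rows`/`bias_cols` are assumed here to describe the broadcastable bias layout):

```cpp
#include <cstddef>
#include <cstring>

// Sketch: copy bias into the n x m row-major output buffer before the gemm
// call, handling the three broadcastable bias shapes: 1 x m, n x 1, n x m.
// All names are illustrative, not the ExecuTorch implementation.
template <typename scalar_t>
void init_output_with_bias(
    scalar_t* out, const scalar_t* bias,
    std::size_t n, std::size_t m,
    std::size_t bias_rows, std::size_t bias_cols) {
  for (std::size_t row = 0; row < n; ++row) {
    // Advance the bias pointer per row only when bias actually has n rows.
    const scalar_t* bias_row =
        bias + (bias_rows == 1 ? 0 : row * bias_cols);
    if (bias_cols == m) {
      // 1 x m or n x m: a full row of bias can be copied at once.
      std::memcpy(out + row * m, bias_row, m * sizeof(scalar_t));
    } else {
      // n x 1 (or 1 x 1): broadcast the single per-row value across the row.
      for (std::size_t col = 0; col < m; ++col) {
        out[row * m + col] = bias_row[0];
      }
    }
  }
}
```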
kimishpatel left a comment:
I am accepting this, but do note that you want either some random values or "not-all-outputs-are-same" values. The test as-is is quite weak, IMO.
Summary: Diff initializes the output tensor before calling gemm with beta=1 when bias is non-nullopt. Reviewed By: larryliu0820, zonglinpeng, kimishpatel. Differential Revision: D75491158
Force-pushed 49b3b4a to c6d1d2a.
Summary: Diff uses `op_add_sub_impl` to add bias after optimized gemm call. Differential Revision: D75491158