Optimize copy tensor with padding #32461
Conversation
Tensor-layout-related properties are calculated once, and the cached values are reused during the per-element offset calculation. This brings a ~200x improvement in the wait time between two queries for the PhiSlica model: a user now has to wait only 0.36 sec (instead of 74 sec!) between two queries. These numbers are from LNL. JIRA: https://jira.devtools.intel.com/browse/CVS-174810
```cpp
    for (size_t i = 0; i < p_sizes.size(); i++) {
        padded_sizes[i] = p_sizes[i];
    }
}
```
BTW, do we need this function inside layout.cpp? It seems we could calculate it just in the target copy function.
It uses member variables of the layout class, for example data_padding. Even the get_tensor() call has a dependency on the layout object.
```cpp
}

private:
    static void get_axes_map(cldnn::format& fmt, int64_t* axes_map, size_t& map_size);
```
#32371 (comment)
Also, as I mentioned in the previous comment, please move this function to format.hpp
We can move get_linear_offset_params() to layout.hpp. May I know the reason, or the advantage you see, in keeping it in the hpp?