susbhere
Contributor

Tensor layout-related properties are now calculated once, and the cached values are reused during per-element offset calculation. This brings a ~200x improvement in wait time between two queries for the PhiSlica model: a user has to wait only 0.36 sec (instead of 74 sec) between two queries. These numbers are from LNL.

JIRA: https://jira.devtools.intel.com/browse/CVS-174810
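
The caching idea described above can be illustrated with a minimal sketch. It is not the code from this PR: `OffsetParams`, `make_offset_params`, and `linear_offset` are hypothetical names, and the sketch assumes a plain row-major layout (innermost axis last) with lower and upper padding per axis. The layout-dependent values are computed once, and the per-element function only does multiply-adds on them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the cldnn API.
constexpr std::size_t kMaxRank = 8;

struct OffsetParams {
    std::size_t rank = 0;
    std::array<std::size_t, kMaxRank> lower_pad{};  // padding before the data on each axis
    std::array<std::size_t, kMaxRank> pitch{};      // element stride of each axis in the padded buffer
};

// Computed once per tensor: everything that depends only on the layout.
// Assumes a row-major layout, so the last axis has pitch 1.
inline OffsetParams make_offset_params(const std::vector<std::size_t>& sizes,
                                       const std::vector<std::size_t>& lower_pad,
                                       const std::vector<std::size_t>& upper_pad) {
    assert(sizes.size() <= kMaxRank);
    assert(sizes.size() == lower_pad.size() && sizes.size() == upper_pad.size());
    OffsetParams p;
    p.rank = sizes.size();
    std::size_t pitch = 1;
    for (std::size_t i = p.rank; i-- > 0;) {
        p.lower_pad[i] = lower_pad[i];
        p.pitch[i] = pitch;
        pitch *= lower_pad[i] + sizes[i] + upper_pad[i];  // padded size of axis i
    }
    return p;
}

// Called per element: uses only the cached values, no layout queries.
inline std::size_t linear_offset(const OffsetParams& p, const std::size_t* idx) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < p.rank; ++i) {
        offset += (idx[i] + p.lower_pad[i]) * p.pitch[i];
    }
    return offset;
}
```

In a copy loop, `make_offset_params` would be called once before the loop and `linear_offset` once per element, so the cost of deriving padded sizes and pitches is no longer paid for every element.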

@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Oct 17, 2025
@susbhere susbhere force-pushed the optimize_padded_copy branch 2 times, most recently from 820f1d4 to 31ed86d Compare October 17, 2025 11:40
@susbhere susbhere marked this pull request as ready for review October 17, 2025 11:43
@susbhere susbhere requested review from a team as code owners October 17, 2025 11:43
@susbhere
Contributor Author

build_jenkins

@susbhere susbhere force-pushed the optimize_padded_copy branch 5 times, most recently from c2698b0 to 98db555 Compare October 17, 2025 15:00
@susbhere susbhere force-pushed the optimize_padded_copy branch from 98db555 to f3da61f Compare October 17, 2025 15:25
    for (size_t i = 0; i < p_sizes.size(); i++) {
        padded_sizes[i] = p_sizes[i];
    }
}
Contributor

BTW, do we need this function inside layout.cpp?
It seems that we can calculate it just in the target copy function.

Contributor Author

It uses member variables of the layout class, for example data_padding. Even the get_tensor() call depends on the layout object.

}

private:
static void get_axes_map(cldnn::format& fmt, int64_t* axes_map, size_t& map_size);
Contributor

#32371 (comment)
Also, as I mentioned in the previous comment, please move this function to format.hpp.

Contributor Author

We can move get_linear_offset_params() to layout.hpp. May I know what reason or advantage you see in keeping it in the hpp file?
