
Commit 242fc55

fix: extend memory benchmark with 4+ gb buffers
4+ gb test benchmarks added
Limited benchmarks cases added for pre-si
Event based kernel execution time added
General purpose timestamp calculation functions are added to framework

Signed-off-by: Shametska, Raman <[email protected]>
1 parent b5cc46a commit 242fc55


5 files changed, +158 -48 lines changed


TESTS.md

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ FullRemoteAccessMemory|Uses stream memory in a fashion described by 'type' to me
 FullRemoteAccessMemoryXeCoresDistributed|Uses stream memory in a fashion described by 'type' to measure bandwidth of full remote memory accesswhen hwthreads are distributed between XeCores.|<ul><li>--blockAccess Block access (1) or scatter access (0) (0 or 1)</li><li>--elementSize Size of the single element to read in bytes (1, 2, 4, 8)</li><li>--size Size of the memory to stream. Must be a power of 2</li><li>--type Memory streaming type (Read or Write or Scale or Triad)</li><li>--useEvents Perform GPU-side measurements using events (0 or 1)</li><li>--workItems Number of work items equal to SIMD size * used hwthreads</li></ul>|:x:|:heavy_check_mark:|
 MapBuffer|allocates an OpenCL buffer and measures map bandwidth. Mapping operation means memory transfer from GPU to CPU or a no-op, depending on map flags.|<ul><li>--compressed Select if the buffer is to be compressed. Will be skipped, if device does not support compression (0 or 1)</li><li>--contents Contents of the buffer (Zeros or Random)</li><li>--mapFlags OpenCL map flags passed during memory mapping (Read or Write or WriteInvalidate)</li><li>--size Size of the buffer</li><li>--useEvents Perform GPU-side measurements using events (0 or 1)</li></ul>|:x:|:heavy_check_mark:|
 QueueInOrderMemcpy|measures time on CPU spent for multiple in order memcpy.|<ul><li>--IsCopyOnly If true, Copy Engine is selected. If false, Compute Engine is selected (0 or 1)</li><li>--count Number of memcpy operations</li><li>--destinationPlacement Placement of the destination buffer (Device or Host or Shared or non-USM-mapped or non-USMmisaligned or non-USM4KBAligned or non-USM2MBAligned or non-USMmisaligned-imported or non-USM4KBAligned-imported or non-USM2MBAligned-imported)</li><li>--size Size of memory allocation</li><li>--sourcePlacement Placement of the source buffer (Device or Host or Shared or non-USM-mapped or non-USMmisaligned or non-USM4KBAligned or non-USM2MBAligned or non-USMmisaligned-imported or non-USM4KBAligned-imported or non-USM2MBAligned-imported)</li></ul>|:heavy_check_mark:|:x:|
-RandomAccessMemory|Measures device-memory random access bandwidth for different allocation sizes, alignments and access modes.The benchmark uses 10 million accesses to memory.|<ul><li>--accessMode Access mode to be used('Read', 'Write', 'ReadWrite')</li><li>--alignment Alignment request for the allocated memory</li><li>--allocationSize Size of device memory to be allocated.(Maximum supported is 16GB)</li><li>--randomAccessRange Percentage of allocation size to be used for random access</li></ul>|:heavy_check_mark:|:x:|
+RandomAccessMemory|Measures device-memory random access bandwidth for different allocation sizes, alignments and access modes.The benchmark uses 10 million accesses to memory.|<ul><li>--accessMode Access mode to be used('Read', 'Write', 'ReadWrite')</li><li>--alignment Alignment request for the allocated memory</li><li>--allocationSize Size of device memory to be allocated.(Maximum supported is 16GB)</li><li>--randomAccessRange Percentage of allocation size to be used for random access</li><li>--useEvents Perform GPU-side measurements using events (0 or 1)</li></ul>|:heavy_check_mark:|:x:|
 ReadBuffer|allocates an OpenCL buffer and measures read bandwidth. Read operation means transfer from GPU to CPU.|<ul><li>--compressed Select if the buffer is to be compressed. Will be skipped, if device does not support compression (0 or 1)</li><li>--contents Contents of the buffer (Zeros or Random)</li><li>--reuse How hostptr allocation can be reused due to previous operations (Aligned4KB or Misaligned or Usm or Map)</li><li>--size Size of the buffer</li><li>--useEvents Perform GPU-side measurements using events (0 or 1)</li></ul>|:x:|:heavy_check_mark:|
 ReadBufferMisaligned|allocates an OpenCL buffer and measures read bandwidth. Read operation means transfer from GPU to CPU. Destination pointer passed by the application will be misaligned by the specified amount of bytes.|<ul><li>--misalignment Number of bytes by which misaligned the destination pointer will be misaligned</li><li>--size Size of the buffer</li><li>--useEvents Perform GPU-side measurements using events (0 or 1)</li></ul>|:x:|:heavy_check_mark:|
 ReadBufferRect|allocates an OpenCL buffer and measures rectangle read bandwidth. Rectangle read operation means transfer from GPU to CPU.|<ul><li>--compressed Select if the buffer is to be compressed. Will be skipped, if device does not support compression (0 or 1)</li><li>--origin Origin of the rectangle</li><li>--rPitch Row pitch of the rectangle</li><li>--region Size of the rectangle</li><li>--sPitch Silice pitch of the rectangle</li><li>--size Size of the buffer</li></ul>|:x:|:heavy_check_mark:|

source/benchmarks/memory_benchmark/definitions/random_access.h

Lines changed: 6 additions & 2 deletions
@@ -1,27 +1,31 @@
 /*
- * Copyright (C) 2023 Intel Corporation
+ * Copyright (C) 2023-2025 Intel Corporation
  *
  * SPDX-License-Identifier: MIT
  *
  */
 
 #pragma once
 
+#include "framework/argument/basic_argument.h"
 #include "framework/argument/compression_argument.h"
 #include "framework/argument/enum/buffer_contents_argument.h"
 #include "framework/test_case/test_case.h"
+#include "framework/utility/common_help_message.h"
 
 struct RandomAccessArguments : TestCaseArgumentContainer {
     PositiveIntegerArgument allocationSize;
     PositiveIntegerArgument alignment;
     StringArgument accessMode;
     PositiveIntegerArgument randomAccessRange;
+    BooleanArgument useEvents;
 
     RandomAccessArguments()
         : allocationSize(*this, "allocationSize", "Size of device memory to be allocated.(Maximum supported is 16GB)"),
           alignment(*this, "alignment", "Alignment request for the allocated memory"),
           accessMode(*this, "accessMode", "Access mode to be used('Read', 'Write', 'ReadWrite')"),
-          randomAccessRange(*this, "randomAccessRange", "Percentage of allocation size to be used for random access") {}
+          randomAccessRange(*this, "randomAccessRange", "Percentage of allocation size to be used for random access"),
+          useEvents(*this, "useEvents", CommonHelpMessage::useEvents()) {}
 };
 
 struct RandomAccess : TestCase<RandomAccessArguments> {
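
The commit message mentions event-based kernel execution time and general-purpose timestamp calculation functions, but those framework functions are not part of the hunks shown here. As a rough illustration only, the sketch below shows one common way a pair of raw GPU timestamp ticks can be turned into elapsed nanoseconds, including a single counter wraparound. The function name `elapsedNanoseconds` and its parameters (`validBits`, `nsPerTick`) are hypothetical and are not taken from this repository.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, NOT the framework's implementation.
// startTicks / endTicks: raw counter values captured at kernel start and end.
// validBits: number of meaningful low bits in the counter (assumed 0..64).
// nsPerTick: timer resolution in nanoseconds per tick (assumed device-reported).
constexpr std::uint64_t elapsedNanoseconds(std::uint64_t startTicks,
                                           std::uint64_t endTicks,
                                           unsigned validBits,
                                           std::uint64_t nsPerTick) {
    const std::uint64_t mask = (validBits >= 64)
                                   ? ~std::uint64_t{0}
                                   : ((std::uint64_t{1} << validBits) - 1);
    // Unsigned subtraction followed by masking tolerates one counter wraparound.
    const std::uint64_t deltaTicks = (endTicks - startTicks) & mask;
    return deltaTicks * nsPerTick;
}

// Compile-time checks of the two interesting cases.
static_assert(elapsedNanoseconds(100, 350, 32, 2) == 500, "no wraparound");
static_assert(elapsedNanoseconds(0xFFFFFFF0u, 0x10u, 32, 1) == 32, "one wraparound");
```

Masking to the valid bit width keeps the result correct when the end tick is numerically smaller than the start tick, which is the case a plain subtraction would get wrong.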

source/benchmarks/memory_benchmark/gtest/random_access.cpp

Lines changed: 16 additions & 3 deletions
@@ -1,12 +1,13 @@
 /*
- * Copyright (C) 2023 Intel Corporation
+ * Copyright (C) 2023-2025 Intel Corporation
  *
  * SPDX-License-Identifier: MIT
  *
  */
 
 #include "definitions/random_access.h"
 
+#include "framework/enum/measurement_type.h"
 #include "framework/test_case/register_test_case.h"
 #include "framework/utility/common_gtest_args.h"
 #include "framework/utility/memory_constants.h"
@@ -15,7 +16,7 @@
 
 [[maybe_unused]] static const inline RegisterTestCase<RandomAccess> registerTestCase{};
 
-class RandomAccessTest : public ::testing::TestWithParam<std::tuple<size_t, size_t, std::string, size_t>> {
+class RandomAccessTest : public ::testing::TestWithParam<std::tuple<size_t, size_t, std::string, size_t, size_t>> {
 };
 
 TEST_P(RandomAccessTest, Test) {
TEST_P(RandomAccessTest, Test) {
@@ -25,6 +26,7 @@ TEST_P(RandomAccessTest, Test) {
     args.alignment = std::get<1>(GetParam());
     args.accessMode = std::get<2>(GetParam());
     args.randomAccessRange = std::get<3>(GetParam());
+    args.useEvents = std::get<4>(GetParam());
 
     RandomAccess test;
     test.run(args);
@@ -38,4 +40,15 @@ INSTANTIATE_TEST_SUITE_P(
         ::testing::Values(256 * megaByte, 1 * gigaByte, 8 * gigaByte, 16 * gigaByte),
         ::testing::Values(64 * kiloByte, 1 * gigaByte),
         ::testing::Values("Read", "Write", "ReadWrite"),
-        ::testing::Values(100)));
+        ::testing::Values(100),
+        ::testing::Values(true, false)));
+
+INSTANTIATE_TEST_SUITE_P(
+    RandomAccessTestLIMITED,
+    RandomAccessTest,
+    ::testing::Combine(
+        ::testing::Values(3 * gigaByte, 7.5 * gigaByte),
+        ::testing::Values(64 * kiloByte),
+        ::testing::Values("ReadWrite"),
+        ::testing::Values(100),
+        ::testing::Values(true)));
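
To make the effect of the two instantiations concrete: `::testing::Combine` yields the Cartesian product of its value lists, so the case count is the product of the list sizes. The sketch below works through that arithmetic and through the `7.5 * gigaByte` size, which is a `double` expression that ends up in a `size_t` parameter. It assumes `gigaByte` is 1024^3 bytes (`memory_constants.h` is not shown in this diff); `combinationCount` and the constant names are hypothetical helpers for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

// Assumption: gigaByte is 1024^3 bytes; the real constant is defined elsewhere.
constexpr std::uint64_t gigaByte = 1024ull * 1024ull * 1024ull;

// Number of cases a Cartesian product generates: the product of the
// sizes of the individual value lists.
constexpr std::size_t combinationCount(std::initializer_list<std::size_t> listSizes) {
    std::size_t total = 1;
    for (std::size_t n : listSizes) {
        total *= n;
    }
    return total;
}

// Main suite after the change: 4 sizes, 2 alignments, 3 modes, 1 range, 2 useEvents values.
constexpr std::size_t fullSuiteCases = combinationCount({4, 2, 3, 1, 2});
// LIMITED suite: 2 sizes, 1 alignment, 1 mode, 1 range, 1 useEvents value.
constexpr std::size_t limitedSuiteCases = combinationCount({2, 1, 1, 1, 1});

// 7.5 * gigaByte is computed in double and converts exactly to an integer byte count.
constexpr std::uint64_t limitedLargeSizeBytes = static_cast<std::uint64_t>(7.5 * gigaByte);

static_assert(fullSuiteCases == 48, "the added useEvents axis doubles the previous 24 cases");
static_assert(limitedSuiteCases == 2, "one case per allocation size");
static_assert(limitedLargeSizeBytes == 8053063680ull, "7.5 GiB in bytes");
```

Under these assumptions the main suite grows from 24 to 48 cases, while the LIMITED suite adds only two cases (3 GB and 7.5 GB), both with `useEvents` enabled.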
