Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions sycl/source/detail/config.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,34 @@ const std::array<std::pair<std::string, backend>, 8> &getSyclBeMap() {
{"*", backend::all}}};
return SyclBeMap;
}
namespace {

unsigned int parseLevel(const char *ValStr) {
unsigned int intVal = 0;

if (ValStr) {
try {
intVal = std::stoul(ValStr);
} catch (...) {
// If the value is not null and not a number, it is considered
// to enable disk cache tracing. This is the legacy behavior.
intVal = 1;
}
}

// Legacy behavior.
if (intVal > 7)
intVal = 1;

return intVal;
}

} // namespace

void SYCLConfigTrace::reset() { Level = parseLevel(BaseT::getRawValue()); }

unsigned int SYCLConfigTrace::Level =
parseLevel(SYCLConfigTrace::BaseT::getRawValue());

} // namespace detail
} // namespace _V1
Expand Down
47 changes: 7 additions & 40 deletions sycl/source/detail/config.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -709,52 +709,19 @@ template <> class SYCLConfig<SYCL_JIT_AMDGCN_PTX_TARGET_FEATURES> {
// tracing of the corresponding caches. If the input value is not null and
// not a valid number, the disk cache tracing will be enabled (depreciated
// behavior). The default value is 0 and no tracing is enabled.
template <> class SYCLConfig<SYCL_CACHE_TRACE> {
class SYCLConfigTrace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, the main point of optimization is that you removed from the hot path the check if traces are enabled because you changed the reset logic. But why did you replace the template specialization with a new SYCLConfigTrace class? Why not apply this optimization to the SYCLConfig template class, so that it is enabled for all configs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen only this one having impact on flamegraphs. You are righ - it is worth to look at other ones and try to unify approach. Changing pattern for others may be a next step. I just wanted to save time now and focus on hunting next perf bottlenecks.

SYCLConfig template class was already specialized for traces. It does not use common code from SYCLConfig. Maybe it is possible to unify this code also for all other config values. Maybe it is not - for some reason we had this specialization. I did not analyzed the rest of variables deeply.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like you suggest accumulating the tech debt in favor of performance.
So, since we already achieved the initial perf targets the pressure from our customers is decreased, so we should look for clean solution instead of intermediate steps.

In that particular case, what was the reason to get rid of template specialization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like you suggest accumulating the tech debt in favor of performance.

I do not introduce tech debt. I do not optimize all the code. Just the most important one. I don't do premature optimization.

In that particular case, what was the reason to get rid of template specialization?

Because this particular template specialization is useless. The simpler code, the better. A class without a template is simpler than templated class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this particular template specialization is useless. The simpler code, the better. A class without a template is simpler than templated class.

This is exactly my concern: this PR introduces a discrepancy. So far, we have a unified design (maybe not the best) with template specializations for a particular config. Now we have a dedicated class (perhaps it is a simpler approach and more readable) for a particular config.

I do not introduce tech debt. I do not optimize all the code. Just the most important one. I don't do premature optimization.

I agree that we should not do premature optimizations. But in that particular case I just suggest that we need to unify the design for all config types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change all specialization to concrete classes. I like simplifying the code. Will that be OK for now?

using BaseT = SYCLConfigBase<SYCL_CACHE_TRACE>;
enum TraceBitmask { DiskCache = 1, InMemCache = 2, KernelCompiler = 4 };

public:
static unsigned int get() { return getCachedValue(); }
static void reset() { (void)getCachedValue(true); }
static bool isTraceDiskCache() {
return getCachedValue() & TraceBitmask::DiskCache;
}
static bool isTraceInMemCache() {
return getCachedValue() & TraceBitmask::InMemCache;
}
static bool isTraceKernelCompiler() {
return getCachedValue() & TraceBitmask::KernelCompiler;
}
static unsigned int get() { return Level; }
static void reset();
static bool isTraceDiskCache() { return Level & DiskCache; }
static bool isTraceInMemCache() { return Level & InMemCache; }
static bool isTraceKernelCompiler() { return Level & KernelCompiler; }

private:
static unsigned int getCachedValue(bool ResetCache = false) {
const auto Parser = []() {
const char *ValStr = BaseT::getRawValue();
int intVal = 0;

if (ValStr) {
try {
intVal = std::stoi(ValStr);
} catch (...) {
// If the value is not null and not a number, it is considered
// to enable disk cache tracing. This is the legacy behavior.
intVal = 1;
}
}

// Legacy behavior.
if (intVal > 7)
intVal = 1;

return intVal;
};

static unsigned int Level = Parser();
if (ResetCache)
Level = Parser();

return Level;
}
static unsigned int Level;
};

// SYCL_IN_MEM_CACHE_EVICTION_THRESHOLD accepts an integer that specifies
Expand Down
12 changes: 12 additions & 0 deletions sycl/source/detail/kernel_program_cache.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,18 @@
namespace sycl {
inline namespace _V1 {
namespace detail {

void KernelProgramCache::traceKernelImpl(const char *Msg,
KernelNameStrRefT KernelName,
bool IsFastKernelCache) {
std::string Identifier =
"[IsFastCache: " + std::to_string(IsFastKernelCache) +
"][Key:{Name = " + KernelName.data() + "}]: ";

std::cerr << "[In-Memory Cache][Thread Id:" << std::this_thread::get_id()
<< "][Kernel Cache]" << Identifier << Msg << std::endl;
}

adapter_impl &KernelProgramCache::getAdapter() {
return MParentContext.getAdapter();
}
Expand Down
24 changes: 9 additions & 15 deletions sycl/source/detail/kernel_program_cache.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -331,7 +331,7 @@ class KernelProgramCache {
template <typename MsgType>
static inline void traceProgram(const MsgType &Msg,
const ProgramCacheKeyT &CacheKey) {
if (!SYCLConfig<SYCL_CACHE_TRACE>::isTraceInMemCache())
if (!SYCLConfigTrace::isTraceInMemCache())
return;

int ImageId = CacheKey.first.second;
Expand Down Expand Up @@ -361,21 +361,15 @@ class KernelProgramCache {
<< "][Program Cache]" << Identifier << Msg << std::endl;
}

static void traceKernelImpl(const char *Msg, KernelNameStrRefT KernelName,
bool IsFastKernelCache);

// Sends message to std:cerr stream when SYCL_CACHE_TRACE environemnt is
// set.
template <typename MsgType>
static inline void traceKernel(const MsgType &Msg,
KernelNameStrRefT KernelName,
bool IsFastKernelCache = false) {
if (!SYCLConfig<SYCL_CACHE_TRACE>::isTraceInMemCache())
return;

std::string Identifier =
"[IsFastCache: " + std::to_string(IsFastKernelCache) +
"][Key:{Name = " + KernelName.data() + "}]: ";

std::cerr << "[In-Memory Cache][Thread Id:" << std::this_thread::get_id()
<< "][Kernel Cache]" << Identifier << Msg << std::endl;
static void traceKernel(const char *Msg, KernelNameStrRefT KernelName,
bool isFastKernelCache = false) {
if (__builtin_expect(SYCLConfigTrace::isTraceInMemCache(), false))
traceKernelImpl(Msg, KernelName, isFastKernelCache);
}

Locked<ProgramCache> acquireCachedPrograms() {
Expand Down Expand Up @@ -513,7 +507,7 @@ class KernelProgramCache {
auto LockedCacheKP = acquireKernelsPerProgramCache();
// List kernels that are to be removed from the cache, if tracing is
// enabled.
if (SYCLConfig<SYCL_CACHE_TRACE>::isTraceInMemCache()) {
if (__builtin_expect(SYCLConfigTrace::isTraceInMemCache(), false)) {
for (const auto &Kernel : LockedCacheKP.get()[NativePrg])
traceKernel("Kernel evicted.", Kernel.first);
}
Expand Down
8 changes: 2 additions & 6 deletions sycl/source/detail/persistent_device_code_cache.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -222,19 +222,15 @@ class PersistentDeviceCodeCache {

/* Sends message to std:cerr stream when SYCL_CACHE_TRACE environemnt is set*/
static void trace(const std::string &msg, const std::string &path = "") {
static const bool traceEnabled =
SYCLConfig<SYCL_CACHE_TRACE>::isTraceDiskCache();
if (traceEnabled) {
if (__builtin_expect(SYCLConfigTrace::isTraceDiskCache(), false)) {
auto outputPath = path;
std::replace(outputPath.begin(), outputPath.end(), '\\', '/');
std::cerr << "[Persistent Cache]: " << msg << outputPath << std::endl;
}
}
static void trace_KernelCompiler(const std::string &msg,
const std::string &path = "") {
static const bool traceEnabled =
SYCLConfig<SYCL_CACHE_TRACE>::isTraceKernelCompiler();
if (traceEnabled) {
if (__builtin_expect(SYCLConfigTrace::isTraceKernelCompiler(), false)) {
auto outputPath = path;
std::replace(outputPath.begin(), outputPath.end(), '\\', '/');
std::cerr << "[kernel_compiler Persistent Cache]: " << msg << outputPath
Expand Down
69 changes: 33 additions & 36 deletions sycl/unittests/config/ConfigTests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 1");
} catch (sycl::exception &e) {
EXPECT_EQ(
std::string(
Expand All @@ -46,8 +46,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 2");
} catch (sycl::exception &e) {
EXPECT_EQ(
std::string(
Expand All @@ -64,8 +64,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 3");
} catch (sycl::exception &e) {
EXPECT_EQ(
std::string(
Expand All @@ -82,8 +82,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 4");
} catch (sycl::exception &e) {
EXPECT_EQ(
std::string(
Expand All @@ -103,8 +103,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 5");
} catch (sycl::exception &e) {
EXPECT_TRUE(std::regex_match(
e.what(),
Expand All @@ -121,8 +121,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 6");
} catch (sycl::exception &e) {
EXPECT_TRUE(std::regex_match(
e.what(), std::regex("Variable name is more than ([\\d]+) or less "
Expand All @@ -142,8 +142,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 7");
} catch (sycl::exception &e) {
EXPECT_TRUE(std::regex_match(
e.what(), std::regex("The value contains more than ([\\d]+) characters "
Expand All @@ -159,8 +159,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 8");
} catch (sycl::exception &e) {
EXPECT_TRUE(std::regex_match(
e.what(), std::regex("The value contains more than ([\\d]+) characters "
Expand All @@ -176,8 +176,8 @@ TEST(ConfigTests, CheckConfigProcessing) {
File.close();
}
try {
sycl::detail::SYCLConfig<sycl::detail::SYCL_DEVICE_ALLOWLIST>::get();
throw std::logic_error("sycl::exception didn't throw");
sycl::detail::readConfig(true);
throw std::logic_error("sycl::exception didn't throw 9");
} catch (sycl::exception &e) {
EXPECT_TRUE(std::regex_match(
e.what(), std::regex("The value contains more than ([\\d]+) characters "
Expand Down Expand Up @@ -249,20 +249,17 @@ TEST(ConfigTests, CheckSyclCacheTraceTest) {
// Lambda to test parsing of SYCL_CACHE_TRACE
auto TestConfig = [](int expectedValue, int expectedDiskCache,
int expectedInMemCache, int expectedKernelCompiler) {
EXPECT_EQ(static_cast<unsigned int>(expectedValue),
SYCLConfig<SYCL_CACHE_TRACE>::get());
EXPECT_EQ(static_cast<unsigned int>(expectedValue), SYCLConfigTrace::get());

EXPECT_EQ(
expectedDiskCache,
static_cast<int>(
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::isTraceDiskCache()));
static_cast<int>(sycl::detail::SYCLConfigTrace::isTraceDiskCache()));
EXPECT_EQ(
expectedInMemCache,
static_cast<int>(
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::isTraceInMemCache()));
static_cast<int>(sycl::detail::SYCLConfigTrace::isTraceInMemCache()));
EXPECT_EQ(expectedKernelCompiler,
static_cast<int>(sycl::detail::SYCLConfig<
SYCL_CACHE_TRACE>::isTraceKernelCompiler()));
static_cast<int>(
sycl::detail::SYCLConfigTrace::isTraceKernelCompiler()));
};

// Lambda to set SYCL_CACHE_TRACE
Expand All @@ -279,40 +276,40 @@ TEST(ConfigTests, CheckSyclCacheTraceTest) {
TestConfig(0, 0, 0, 0);

SetSyclCacheTraceEnv("1");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(1, 1, 0, 0);

SetSyclCacheTraceEnv("2");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(2, 0, 1, 0);

SetSyclCacheTraceEnv("3");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(3, 1, 1, 0);

SetSyclCacheTraceEnv("4");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(4, 0, 0, 1);

SetSyclCacheTraceEnv("5");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(5, 1, 0, 1);

SetSyclCacheTraceEnv("6");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(6, 0, 1, 1);

SetSyclCacheTraceEnv("7");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(7, 1, 1, 1);

SetSyclCacheTraceEnv("8");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(1, 1, 0, 0);

// Set random non-null value. It should default to 1.
SetSyclCacheTraceEnv("random");
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(1, 1, 0, 0);

// When SYCL_CACHE_TRACE is not set, it should default to 0.
Expand All @@ -321,7 +318,7 @@ TEST(ConfigTests, CheckSyclCacheTraceTest) {
#else
unsetenv("SYCL_CACHE_TRACE");
#endif
sycl::detail::SYCLConfig<SYCL_CACHE_TRACE>::reset();
sycl::detail::SYCLConfigTrace::reset();
TestConfig(0, 0, 0, 0);
}

Expand Down
Loading