uditagarwal97
Owner

Reverts #22

jsji and others added 30 commits August 20, 2024 06:19
…l#15142)

Requesting ownership for libdevice nativecpu files on behalf of
@intel/dpcpp-nativecpu-pi-reviewers. Thank you
This commit reenables a selection of tests for ze_debug as they no
longer seem to be failing for this option.

Signed-off-by: Larsen, Steffen <[email protected]>
  CONFLICT (content): Merge conflict in clang/test/Driver/linker-wrapper.c
  CONFLICT (content): Merge conflict in .github/workflows/pr-code-format.yml
  CONFLICT (content): Merge conflict in llvm/include/llvm/InitializePasses.h
  CONFLICT (content): Merge conflict in llvm/include/llvm/LinkAllPasses.h
  CONFLICT (content): Merge conflict in clang/include/clang/Sema/Sema.h
  CONFLICT (content): Merge conflict in clang/include/clang/Sema/SemaSYCL.h
  CONFLICT (content): Merge conflict in clang/lib/Sema/SemaSYCL.cpp
By storing the env variables in the config helper we eliminate the need
for extra function parameters while also making them accessible throughout
the codebase.
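
The pattern described in this commit message could look roughly like the sketch below. It is only an illustration: `ConfigHelper`, `readEnvOr`, `exampleOptionIsSet`, and the variable name `EXAMPLE_TOOL_OPTION` are hypothetical names, since the commit message does not name the helper or the variables it stores.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of an environment variable,
// or Default when the variable is unset.
inline std::string readEnvOr(const char *Name, const std::string &Default) {
  assert(Name != nullptr && "environment variable name must not be null");
  const char *Value = std::getenv(Name);
  return Value ? std::string(Value) : Default;
}

// Hypothetical config helper. EXAMPLE_TOOL_OPTION is a made-up name used
// only for illustration.
class ConfigHelper {
public:
  // The environment is read once, when the helper is first used.
  static const ConfigHelper &instance() {
    static const ConfigHelper Helper;
    return Helper;
  }

  const std::string &exampleOption() const { return ExampleOption; }

private:
  ConfigHelper() : ExampleOption(readEnvOr("EXAMPLE_TOOL_OPTION", "")) {}

  std::string ExampleOption;
};

// Callers query the helper directly instead of receiving the value
// through an extra function parameter.
inline bool exampleOptionIsSet() {
  return !ConfigHelper::instance().exampleOption().empty();
}
```

Any function can call `ConfigHelper::instance()` directly, so the value no longer has to be threaded through intermediate signatures.
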
… operations (intel#15140)

Multiple declarations were missing for shuffle and broadcast operations,
in particular work group broadcast ones. This adds them.
intel#15099)

Currently, if there is only a single device in the context, the kernel
compiler passes the IP version of that device via the -device option to
ocloc when compiling an OpenCL program to SPIR-V, so that ocloc enables
all extensions supported by that device. The problem is that
ocloc -spv_only doesn't produce a SPIR-V file when multiple devices are
provided via the -device option. That's why, in this case, the common
extensions supported by all devices are enabled manually. To do that, use
an ocloc query to get the common supported features for the list of
devices, then process the result and enable the features
via ocloc -internal_options -cl-ext=+feature1,...
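
The last step, turning a list of feature names into a `-cl-ext=` string, might be sketched as follows. Both helpers are hypothetical and are not taken from the patch. In particular, `splitFeatures` assumes the query result is a whitespace-separated list of names, which the commit message does not state, and every entry after the first is assumed to carry its own `+` prefix.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper. ASSUMPTION: the query result is a
// whitespace-separated list of feature names.
inline std::vector<std::string> splitFeatures(const std::string &QueryOutput) {
  std::istringstream Stream(QueryOutput);
  std::vector<std::string> Features;
  std::string Name;
  while (Stream >> Name)
    Features.push_back(Name);
  return Features;
}

// Hypothetical helper: builds "-cl-ext=+a,+b" from a list of feature
// names. Returns an empty string when there is nothing to enable.
inline std::string buildClExtOption(const std::vector<std::string> &Features) {
  if (Features.empty())
    return "";

  std::string Option = "-cl-ext=";
  bool First = true;
  for (const std::string &Feature : Features) {
    assert(!Feature.empty() && "feature names must be non-empty");
    if (!First)
      Option += ',';
    Option += '+';
    Option += Feature;
    First = false;
  }
  return Option;
}
```

In this sketch, the string returned by `buildClExtOption(splitFeatures(Output))` is what would be handed to ocloc after `-internal_options`.
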
As we are working towards upstreaming the SYCL support in the compiler,
a few areas need to be cleaned up for readability, with improved logic to
reduce some repetitious output. This does not impact behavior in general.
Restores missed context analysis for host pointer update.
Regression is caused by
intel@3dc75a7

---------

Signed-off-by: Tikhomirova, Kseniya <[email protected]>
I don't know if this changes the intention of this test, but the test
has an out-of-bounds buffer access. `Buf` is originally a buffer of
size `1`, but on line 186 the range is set to 10, which leads to an
out-of-bounds buffer access in `FunctorMulti F`.
…4839)

We added some code to prevent "LoopIdiomRecognize" from converting loops
to memset, so that it wouldn't conflict with host ASan. We no longer need
this fix.
This commit reenables a selection of tests that have previously been
disabled, most for different reasons.

Additionally, the subbuffer test still seems to fail on Gen12 OpenCL, so
this commit adds a comment with a link to the GH issue.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
Wrap the tests in `REQUIRES: asserts` in order to make sure that the
compiler can generate the messages.

Also make sure we don't run the e2e tests on Windows, as JIT is not
supported there.
Running the check-sycl-dumps target sets the SYCL_LIB_DUMPS_ONLY param
for lit, which should only enable the ABI dump tests. However, since
1fde656 we've added a non-`lit.formats.ShTest()` test format, which
doesn't respect config.suffixes, so we need to account for this.
…ntel#15160)

1) It keeps all `convert`-related code in one place
2) It allows that extra functionality to be left out if this particular
   method isn't used
3) Potentially, it moves all instances where `half`/`bfloat16` need to
   be "complete" in `vector.hpp`. Can't really verify that because
   `generic_type_traits.hpp` includes definitions for both at the moment
hvdijk and others added 25 commits September 18, 2024 12:48
Previously, OffloadWrapper::ConstructJob would call llc to convert LLVM
IR into machine code. With this change, it calls clang instead.

This should be mostly NFC for x86: there may be some small changes in
exactly which passes run, but they are intended to be roughly the same.
It also enables future changes that pass along more clang options.

For other targets, this is a bugfix. Specifically, for RISC-V, llc and
clang disagree on which subtarget features to enable by default,
resulting in linker errors due to subtarget feature mismatches between
the llc-compiled wrapper and other clang-compiled object files.
…tel#15307)

Do not internalize kernels when supporting dynamic linking. Kernels must
be visible so that host code can find them.

---------

Signed-off-by: Lu, John <[email protected]>
…ntel#15408)

Improving maintainability:
- Remove dependency of Matrix tests on LIT's params:
`gpu-intel-pvc=True`, `matrix=1`, `matrix-tf32=1`, `matrix-fp16=1`.
- Remove `matrix`, `matrix-xmx8`, `matrix-tf32`, `matrix-fp16` as they
are not used
- Utilize auto-detection of architecture, aspects and runtime query
instead.
- Add handling of `igc-dev` parameter, in case it is passed in the
command line.
- Minor cleanup of previous "typos" in `REQUIRES` and `XFAIL`
directives
…option to clang-linker-wrapper (intel#15374)

Previously, clang-linker-wrapper used the -sycl-module-split-mode option
to decide whether or not to use the sycl-post-link tool. This patch adds
options for explicitly specifying usage of the tool and the library.
…V path (intel#15384)

From SYCL 2020 specification:

> The sycl::atomic_ref class also has a template parameter AddressSpace,
> which allows the application to make an assertion about the address
> space of the object of type T that it references. The default value
> for this parameter is access::address_space::generic_space, which
> indicates that the object could be in either the global or local
> address spaces. If the application knows the address space, it can set
> this template parameter to either access::address_space::global_space
> or access::address_space::local_space as an assertion to the
> implementation. Specifying the address space via this template
> parameter may allow the implementation to perform certain
> optimizations. Specifying an address space that does not match the
> object’s actual address space results in undefined behavior.

We use `ext::oneapi::experimental::static_address_cast` to do that. It's
not implemented for CUDA/HIP yet, so that path continues to use
`sycl::address_space_cast`, which performs run-time checks:

> An implementation must return nullptr if the run-time value of pointer
> is not compatible with Space, and must issue a compile-time diagnostic
> if the deduced address space for pointer is not compatible with Space.
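
The difference between the two casts can be illustrated with a toy model in plain C++. This is not the SYCL API: `GenericPtr`, `checkedCast`, and `staticCast` are hypothetical stand-ins. They only mimic the contract quoted above, where the run-time-checked cast returns nullptr on a mismatch and the static cast trusts the caller's assertion.

```cpp
#include <cassert>

// Toy model only: none of these names are part of SYCL.
enum class AddressSpace { Global, Local };

// Stand-in for a generic pointer whose real address space is only
// known at run time.
struct GenericPtr {
  int *Ptr;
  AddressSpace Actual;
};

// Models the run-time-checked cast: yields nullptr on a mismatch.
template <AddressSpace Space> int *checkedCast(GenericPtr P) {
  return P.Actual == Space ? P.Ptr : nullptr;
}

// Models the static cast: the caller asserts that the pointer really is
// in Space, so no check is needed in release builds. In the real API a
// mismatch is undefined behavior; this toy only catches it in debug
// builds through the assert.
template <AddressSpace Space> int *staticCast(GenericPtr P) {
  assert(P.Actual == Space && "caller promised a matching address space");
  return P.Ptr;
}
```

In this model the checked variant pays for a comparison on every call, while the static variant skips it once the caller guarantees the address space. That is the kind of optimization the quoted specification text allows for.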