-
Notifications
You must be signed in to change notification settings - Fork 794
[SYCL][E2E] use any-device-is in requires lines #15816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
| //===----------------------------------------------------------------------===// | ||
|
|
||
| // REQUIRES: opencl-aot, ocloc, cpu, gpu, accelerator | ||
| // REQUIRES: opencl-aot, ocloc, any-device-is-cpu, any-device-is-gpu, any-device-is-accelerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand how it is a correct fix. If , is treated as && in REQUIRES directive, then we previously had: cpu && gpu && acc which is never true. Now we have any-device-is-cpu && any-device-is-gpu && any-device-is-accelerator.
If the expression is computed per single device, then even with the latter there will only be true for just one of any-device-is, meaning the overall expression evaluates to false anyways.
Should we just drop those device requirements from the test and instead use UNSUPPORTED to filter out environments where the test can't be run (like CUDA)?
Considering that there are no device types other than CPU, GPU and ACC, do we ever need to require them all? My understanding is that it is safe to assume that check-sycl-e2e is always launched with an environment with at least one device. And if it is not the case yet, we should maybe introduce a check like this to lit.cfg.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old features are per-device, new features are per-llvm-lit invocation. They were created to target this scenario exactly.
| @@ -1,4 +1,4 @@ | |||
| // REQUIRES: linux, (gpu && level_zero), cpu | |||
| // REQUIRES: linux, (gpu && level_zero), any-device-is-cpu | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // REQUIRES: linux, (gpu && level_zero), any-device-is-cpu | |
| // REQUIRES: linux | |
| // UNSUPPORTED: (opencl && gpu), acc |
I general I have a very same question about how switching to any-device-is-cpu helps here
| //===---------------------------------------------------------------------===// | ||
|
|
||
| // REQUIRES: cpu, gpu, accelerator | ||
| // REQUIRES: any-device-is-cpu, any-device-is-gpu, any-device-is-accelerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, at this point I've probably realized how it works. any-device-is-* is computed not for a current device that we are running on, but for the whole environment we are in.
Therefore, in an environment where we have all three types of devices, all three any-device -is* will be evaluated to true.
Even though I appreciate the change, I don't think that it should be accepted without update to the documentation. Otherwise, someone could introduce a new incorrect test literally tomorrow.
Moreover, I would maybe think about alternative solutions here. Like, my first though is: why gpu doesn't just work as any-device-is-gpu? And the answer here is that sometimes we want a test to only be run on GPU and not just because GPU is present on a system.
Would it be better to rename any-device-is-cpu to something like cpu-device-is-present? To clarify that we don't require a test to be launched on CPU, but we require a presence of a device on the system. Or is it only me who got confused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use C expression which is intuitive to everyone?
linux && cpu # linux and cpu env
linux && gpu # linux gpu env
linux && (gpu || cpu) # linux, cpu env or gpu env
linux && (cpu && gpu) # linux, cpu & gpu must exist at the same time
linux && !acc # linux, cpu env or gpu env or cpu&gpu env
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to get rid of "," expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though I appreciate the change, I don't think that it should be accepted without update to the documentation. Otherwise, someone could introduce a new incorrect test literally tomorrow.
100% agree with the documentation change given how easy it is to make this error. Not to mention that it doesn't just affect REQUIRES, but also any lit line that doesn't have a %{run}. Any statement like %if <device/backend> never evaluates to true unless there is a %{run} in the line. For example:
https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/interop/interop_all_backends.cpp#L2.
Why not just use C expression which is intuitive to everyone?
linux && cpu # linux and cpu env linux && gpu # linux gpu env linux && (gpu || cpu) # linux, cpu env or gpu env linux && (cpu && gpu) # linux, cpu & gpu must exist at the same time linux && !acc # linux, cpu env or gpu env or cpu&gpu env
This would still have the issue that this PR is trying to fix, how we currently handle requires linux && (cpu && gpu) would never be met. However if we got the conditions to work as described, i feel like this would be quite intuitive.
Moreover, I would maybe think about alternative solutions here. Like, my first though is: why
gpudoesn't just work asany-device-is-gpu? And the answer here is that sometimes we want a test to only be run on GPU and not just because GPU is present on a system.
Honestly, I like this solution, we could structure the tests slightly differently in the two cases you mention. If the intention is to run because a GPU is present in the system we would likely also add another device to the requires, and include logic (either through device selectors, or env variables) to make sure we are running on the device we intend. In the case we want to specifically only run on a GPU we would use %{run} which would take care of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would still have the issue that this PR is trying to fix, how we currently handle requires linux && (cpu && gpu) would never be met.
I see. Some of my tests require both cpu and gpu device available, but it's fine to ignore these tests if we don't have this environment currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, as a relatively cheap trade-off, we could consider renaming
any-device-is-cputocpu-is-present, for example.
Should we also consider renaming of cpu to runs-on-cpu? Then everything will be self-documenting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of cpu-is-present, but I don't like the runs-on-cpu. IMO, it's even more ambiguous than before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the true intent of "REQUIRES: cpu,gpu,accelerator" is tested on any of "cpu, gpu, acc", why not just rewrite it to "REQUIRES: cpu || gpu || accelerator"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AllanZyne, the problem is confusion between "it required to be run on GPU" vs "it requires a GPU to be present on the system, but can be launched on other device".
Do we really need "it requires a GPU to be present on the system, but can be launched on other device" this case?
If so, I think it's more intuitive to add a new label, such as
REQUIRE: cpu
MACHINE: cpu && gpu // or, "cpu, gpu"
It means this test needs to run on a machine which has both cpu and gpu, but this test will only run on cpu (I don't understand this need, it seems "REQUIRE: cpu && gpu" is more reasonable).
When you write REQUIRES: cpu, gpu you probably mean that the test requires both to be present on the system, but can be launched on either of them.
REQUIRE: cpu || gpu
MACHINE: cpu && gpu // or, "cpu, gpu"
I also not quite understand this requirement, though. I think "REQUIRE: cpu || gpu" is more flexible.
And by default,
REQUIRE: cpu || gpu
It means this test can be tested on cpu or gpu, it doesn't require the machine have both cpu and gpu. If the machine only have cpu, it's fine to just test cpu. If the machine have cpu and gpu, it will test on cpu and gpu separately (run twice).
REQUIRE: cpu && gpu
It means, it needs a machine have both cpu and gpu, and this test needs these devices both available (run once).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean "REQUIRE" is used to control "ONEAPI_DEVICE_SELECTOR",
REQUIRE: cpu || gpu
export ONEAPI_DEVICE_SELECTOR=opencl:cpu
export ONEAPI_DEVICE_SELECTOR=level_zero:gpu
REQUIRE: cpu && gpu // or, "cpu, gpu"
export ONEAPI_DEVICE_SELECTOR=level_zero:gpu,opencl:cpu
| // available. | ||
| // | ||
| // REQUIRES: cpu,gpu,accelerator | ||
| // REQUIRES: any-device-is-cpu, any-device-is-gpu, any-device-is-accelerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be an E2E test at all, it should be moved to unit-tests where we can emulate all environments and check that we query the right device, or just directly check that we parse the env variable correctly, similar to what I did in #15774
| @@ -1,4 +1,4 @@ | |||
| // REQUIRES: cpu, gpu, accelerator, opencl | |||
| // REQUIRES: any-device-is-cpu, any-device-is-gpu, any-device-is-accelerator, any-device-is-opencl | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, this should not be an E2E test. This comment can be applied to some other tests as well, but I will repeat it on those which are the first candidates for the rewrite.
|
|
||
| // FIXME: This set is never satisfied all at once in our infrastructure. | ||
| // REQUIRES: opencl, level-zero, cpu, gpu, opencl-aot, ocloc | ||
| // REQUIRES: any-device-is-opencl, any-device-is-level_zero, opencl-aot, ocloc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mostly sure that this can be turned into a unit-test as well
| @@ -1,7 +1,7 @@ | |||
| // REQUIRES: cuda, opencl, gpu, cpu | |||
| // REQUIRES: any-device-is-cuda, any-device-is-opencl, any-device-is-gpu, any-device-is-cpu | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need more unit-tests! :)
Because the aspects for require lines are handled device per device, tests that require multiple devices are never ran.
For example the condition
REQUIRES: cpu, gpuis never met because no device is both a cpu and gpu, instead we want to useREQUIRES: any-device-is-cpu, any-device-is-gpuin this case.