[SYCL][CUDA] Select only NVPTX64 device binaries #1223
Fixes #1194 in my test case.
Add the binary target identifier "nvptx64" for NVIDIA PTX devices. Signed-off-by: Andrea Bocci <[email protected]>
Search through the available binary images and select the first one for the PI_DEVICE_BINARY_TARGET_NVPTX64 ("nvptx64") target. Return PI_INVALID_BINARY if no "nvptx64" image is available. Signed-off-by: Andrea Bocci <[email protected]>
We use the LIT infrastructure to check end-to-end behavior; the tests are located here: https://github.com/intel/llvm/blob/sycl/sycl/test/. An alternative approach is to use the Google Test framework to validate specific parts of the SYCL runtime instead of building a full application. Looking at your changes, it seems that the CUDA-agnostic part might already be covered by @sergey-semenov in 9095749, so only the CUDA plugin changes need to be validated. It seems to me that the Codeplay team mostly uses the Google Test framework for this: https://github.com/intel/llvm/tree/sycl/sycl/unittests/pi/cuda. @romanovvlad, @Ruyk, does that make sense to you?
I'm trying to run the existing tests, but I must be missing something.
However, most of the tests fail due to
Do I need to set up the …
After setting up the environment, I still get
Most of the failures look like
and running them by hand I get
Am I missing something in my test setup?
No, libsycl.so should be automatically added to LD_LIBRARY_PATH by lit.cfg.py.
OK, will do. In the meantime I think I've managed to prepare a test for #1194 that shows how it is fixed by these changes:
With these changes there is no mention of this test in the output of
Ah, I found the issue: sycl/test/CMakeLists.txt uses
instead of
so the tests will fail if a …
I've opened #1227 for that.
A few minor comments. Thanks for the patch.
2a787290e47 breaks
…rgets Add a LIT test to check that both backends (PI_OPENCL, PI_CUDA) work irrespective of the order of the -fsycl-targets=... arguments. Signed-off-by: Andrea Bocci <[email protected]>
@fwyzard, thanks for working on this!
Change the behaviour of the program manager to always ask the native runtime to choose a device image, even if only one is available; this should prevent selecting an image that is not compatible with the current device.
Fix the behaviour of the CUDA plugin to search through the available binary images and select the first one compatible with a PTX target, or return PI_INVALID_BINARY if there are no compatible images.
Define the PI_DEVICE_BINARY_TARGET_NVPTX64 target identifier as "nvptx64" for NVIDIA PTX devices.
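The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `DeviceBinary`, `Result`, `selectBinary`, and `kTargetNVPTX64` are hypothetical stand-ins for the real PI types and constants, and only the "nvptx64" string and the first-match / invalid-binary behaviour are taken from the description.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the PI definitions; the real ones live in pi.h.
// The string value mirrors PI_DEVICE_BINARY_TARGET_NVPTX64 as described above.
constexpr const char *kTargetNVPTX64 = "nvptx64";

// Hypothetical result codes standing in for PI_SUCCESS / PI_INVALID_BINARY.
enum class Result { Success, InvalidBinary };

// Hypothetical, reduced device image descriptor: only the target string.
struct DeviceBinary {
  const char *DeviceTargetSpec;
};

// Scan the images in order and report the index of the first one whose
// target is "nvptx64". If none matches, report InvalidBinary and leave
// *selected untouched.
Result selectBinary(const DeviceBinary *binaries, std::size_t count,
                    std::size_t *selected) {
  for (std::size_t i = 0; i < count; ++i) {
    const char *target = binaries[i].DeviceTargetSpec;
    if (target != nullptr && std::strcmp(target, kTargetNVPTX64) == 0) {
      *selected = i;
      return Result::Success;
    }
  }
  return Result::InvalidBinary;
}
```

Because the caller is expected to run this check even when only one image is available, a lone image built for a different target is rejected rather than loaded on the CUDA device.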