-
Notifications
You must be signed in to change notification settings - Fork 768
[SYCL][RTC] Use program manager #16316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Co-authored-by: Lukas Sommer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good apart from one minor nit.
Co-authored-by: Lukas Sommer <[email protected]>
No, I'll drop it! |
Signed-off-by: Julian Oppermann <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool!
Signed-off-by: Julian Oppermann <[email protected]>
A bit of additional context for the last commit which marks the constructed kernel bundle as interop: llvm/sycl/source/detail/kernel_bundle_impl.hpp Lines 387 to 392 in 5a729ba
I'm doing this to get into the second branch here (there are more occurrences of this pattern in the codebase): llvm/sycl/source/detail/scheduler/commands.cpp Lines 2731 to 2765 in 5a729ba
We can't take the first branch because we can't look up the correct kernel_id from the kernel name without adding the RTC-specific prefix. At a first glance, it doesn't seem feasible to also use the prefix in context of kernel objects, handler, commands, etc., because, IIUC, names there are tied to the actual function names, and hence shouldn't be modified. (The overall idea in this PR is to use unique names wrt. the PM's bookkeeping, but don't change the compiled programs.)
Note that the kernel bundle constructed before this PR was also marked as interop (by choice of the parent constructor), hence my commit just restores the situation as before. I'm not a 100% sure I understand all implications of the bundle being interop or not, so please let me know if I'm doing this wrong 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach looks good to me.
Adds limited support for device globals in runtime-compiled SYCL code. The application interacts with the globals via three new methods on `kernel_bundle`: ```c++ bool ext_oneapi_has_device_global(const std::string &name); void *ext_oneapi_get_device_global_address(const std::string &name, const device &dev); // return a USM pointer suitable for queue::memcpy etc. size_t ext_oneapi_get_device_global_size(const std::string &name); ``` This PR uses the same trick as #16316, i.e. prepending a kernel-bundle-specific prefix to the names of device globals to make them distinguishable for the program manager. Limitations: - Device globals inside a namespace are unsupported due to insufficient name mangling. - Device globals with the `device_image_scope` property cannot be read/written from the host, because the runtime currently cannot expose USM pointers for them. A workaround is using explicit kernels to read/write the global's value into a USM buffer. --------- Signed-off-by: Julian Oppermann <[email protected]>
The idea is to assign a per-compilation prefix (e.g.,
rtc_42$
) to all offload entries in thesycl_device_binaries
datastructure before feeding them intoProgramManager::addImages
, resulting in uniquekernel_id
s as far as the PM is concerned. When querying the PM for the device images to construct the kernel bundle, I look for kernel IDs starting with the current prefix, which should reliably return only the device images corresponding to current compilation request.Note that the actual kernel names don't change, i.e.
__sycl_kernel_foo
keeps that name even though the PM might know it asrtc_42$__sycl_kernel_foo
. The prefix is stored inside the bundle, and prepended to the requested kernel name in theext_onapi_[has|get]_kernel(string)
methods.Kernel objects are also obtained via the program manager. Compared to creating the UR kernel from the selected device image's UR program directly, this approach ensures eliminated arguments are handled correctly. Hence, I was able to drop the previously mandatory
-fno-sycl-dead-args-optimization
from the pipeline.The compilation pipeline and extended kernel bundle now support multiple device images. To test this, I added support for the
-fsycl-device-code-split=
option, and apply it to one of the compilations in the E2E test.The persistent cache is circumvented for now for the
sycl_jit
language (lack of suitable on-disk format), but should be brought back in the future.