Skip to content

[SYCL] Fix multi-device support for persistent cache #16056

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 14, 2024

Conversation

againull
Copy link
Contributor

@againull againull commented Nov 12, 2024

@againull againull force-pushed the persistent_multi_device branch 2 times, most recently from 36699f1 to 721af83 Compare November 12, 2024 17:35
@againull againull force-pushed the persistent_multi_device branch from 721af83 to dae4c02 Compare November 12, 2024 17:51
@againull againull force-pushed the persistent_multi_device branch from dae4c02 to a55bf0a Compare November 12, 2024 17:54
@@ -116,7 +116,7 @@ if(SYCL_UR_USE_FETCH_CONTENT)
CACHE PATH "Path to external '${name}' adapter source dir" FORCE)
endfunction()

set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")
set(UNIFIED_RUNTIME_REPO "https://github.com//againull/unified-runtime")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need to set the UNIFIED_RUNTIME_TAG as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is set in a separate file sycl/cmake/modules/UnifiedRuntimeTag.cmake now: https://github.com/intel/llvm/pull/16056/files#diff-0c7fff539c30bbbd3e5b33eb55120e2035bead3f75119e145b167ac1a9cf8c5f

@againull againull force-pushed the persistent_multi_device branch from 4bd2b2a to fbba36d Compare November 13, 2024 06:31
@againull againull force-pushed the persistent_multi_device branch 2 times, most recently from c2fbec1 to 782a184 Compare November 13, 2024 15:31
# Currently urProgramGetInfo will return UR_INVALID_PROGRAM is program is
# compiled only for a subset of associated devices, i.e. not all devices
# have level zero module and binaries. This PR fixes this behaviour.
set(UNIFIED_RUNTIME_TAG 3e1a0ea44842b10cfdc0e1e98af2eee8fcd8e937)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already a newer tag, merged in: #16085 (comment). Please remove this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed the change.

@againull againull merged commit 6456fe8 into intel:sycl Nov 14, 2024
13 checks passed
againull added a commit that referenced this pull request Dec 19, 2024
[PR#16056](#16056) fixed multi-device
support for persistent cache but also changed format of the files in
persistent cache - we used to write number of binaries before binary
size and binary data but stopped doing that because we always have a
single binary (for a single device) per file. Because of that new
runtime stopped working with persistent cache created by older runtimes.

This PR restores the old format of files in persistent cache.

For that we need to write number of binaries to the file (before binary
size and binary data) though it is always equal to 1 in current
implementation. Even in the old implementation we could only put a
single binary to the persistent cache in all scenarios, multi-device
case wasn't supported, didn't have an API to create a program from
multiple binaries. So we can assert that number of binaries is always 1
when reading from the cache.
KornevNikita pushed a commit that referenced this pull request Feb 25, 2025
[PR#16056](#16056) fixed multi-device
support for persistent cache but also changed format of the files in
persistent cache - we used to write number of binaries before binary
size and binary data but stopped doing that because we always have a
single binary (for a single device) per file. Because of that new
runtime stopped working with persistent cache created by older runtimes.

This PR restores the old format of files in persistent cache.

For that we need to write number of binaries to the file (before binary
size and binary data) though it is always equal to 1 in current
implementation. Even in the old implementation we could only put a
single binary to the persistent cache in all scenarios, multi-device
case wasn't supported, didn't have an API to create a program from
multiple binaries. So we can assert that number of binaries is always 1
when reading from the cache.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants