Add prefetch for HIP USM allocations #10430

ldrumm · 2023-07-17T17:27:26Z

This change is necessary to work around a delightful bug in either the HIP runtime or the HIP spec.

It's discussed at length in github.com//issues/7252, but for the purposes of this patch it suffices to say that a call to `hipMemPrefetchAsync` is *required* for correctness in the face of global atomic operations on (*at least*) shared USM allocations.

The architecture of this change is slightly strange at first sight in that we redundantly track allocation information in several places. The context now keeps track of all USM mappings. We require a mapping of pointers to the allocated size, but these allocations aren't pinned to any particular queue or HIP stream.
`hipMemPrefetchAsync`, however, requires the associated HIP stream object and the size of the allocation. The stream comes hot off the queue *only* just before a kernel is launched, so we need to defer the prefetch until we have that information.

Finally, the kernel itself keeps track of pointer arguments in a more accessible way so we can determine which of the kernel's pointer arguments do, in fact, point to USM allocations.
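The three pieces described above (context-level allocation map, kernel-level pointer argument list, prefetch deferred until the stream is known) can be sketched roughly as follows. This is a minimal illustration and not the code in this patch: `Context`, `Kernel`, `StreamHandle`, `PrefetchFn` and `prefetchKernelArgs` are hypothetical names, and the callback stands in for the real `hipMemPrefetchAsync` call so the sketch builds without a HIP toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical stand-in for a HIP stream handle (the real type is hipStream_t).
using StreamHandle = int;

// Hypothetical context: records every USM allocation as base pointer -> size.
// Allocations are not tied to any queue or stream at this level.
struct Context {
  std::map<const void *, std::size_t> UsmAllocs;

  void registerAlloc(const void *Ptr, std::size_t Size) { UsmAllocs[Ptr] = Size; }
  void unregisterAlloc(const void *Ptr) { UsmAllocs.erase(Ptr); }

  // Size of the allocation whose base is Ptr, or 0 if Ptr is not tracked.
  std::size_t sizeOf(const void *Ptr) const {
    auto It = UsmAllocs.find(Ptr);
    return It == UsmAllocs.end() ? 0 : It->second;
  }
};

// Hypothetical kernel: remembers which arguments were set as pointers.
struct Kernel {
  std::vector<const void *> PtrArgs;
  void setArgPointer(const void *Ptr) { PtrArgs.push_back(Ptr); }
};

// Stand-in for the prefetch call; a real adapter would call
// hipMemPrefetchAsync here, which also needs a device ID.
using PrefetchFn =
    std::function<void(const void *, std::size_t, StreamHandle)>;

// Runs at launch time, the first point where the stream is known.
// Prefetches each pointer argument that is a tracked USM allocation and
// returns how many prefetches were issued.
std::size_t prefetchKernelArgs(const Context &Ctx, const Kernel &K,
                               StreamHandle Stream,
                               const PrefetchFn &Prefetch) {
  assert(Prefetch && "prefetch callback must be set");
  std::size_t Issued = 0;
  for (const void *Ptr : K.PtrArgs) {
    if (std::size_t Size = Ctx.sizeOf(Ptr)) {
      Prefetch(Ptr, Size, Stream);
      ++Issued;
    }
  }
  return Issued;
}
```

In this sketch the lookup only matches base pointers, so an argument pointing into the middle of an allocation would be skipped; whether the patch handles interior pointers is not stated in this description.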

omarahmed1111

Looks good, just minor stylistic stuff.

sycl/plugins/unified_runtime/ur/adapters/hip/context.hpp

sycl/plugins/unified_runtime/ur/adapters/hip/enqueue.cpp

sycl/plugins/unified_runtime/ur/adapters/hip/kernel.hpp

ldrumm · 2023-08-14T08:50:43Z

@intel/llvm-reviewers-runtime ping

aelovikov-intel · 2023-08-14T14:49:59Z

> so we can determine which of the kernel's pointer arguments do, in fact, point to USM allocations.

What if I pass that pointer via memory and not through kernel argument/capture?

ldrumm · 2023-08-14T15:28:22Z

> > so we can determine which of the kernel's pointer arguments do, in fact, point to USM allocations.
>
> What if I pass that pointer via memory and not through kernel argument/capture?

Then it won't work.

As I see it, there are basically two options for how to solve this problem:

1. Do what this patch does, which is best effort. We could potentially track every argument type with compiler support, but I don't think the UR design supports this level of knowledge, and it's a significant amount of work.
2. Prefetch _all_ USM allocations before _every_ kernel launch.

For now, I think (1) is acceptable as most things already work. The atomics test cases linked in issue #7252 are a notable exception, which motivated this patch.
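To make the trade-off between the two options concrete, here is a small sketch of how each would choose what to prefetch. It is illustrative only: `AllocMap`, `selectFromKernelArgs` and `selectAll` are hypothetical names, not functions from this patch or from UR.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical registry of USM allocations: base pointer -> size in bytes.
using AllocMap = std::map<const void *, std::size_t>;

// Option 1 (best effort): select only the allocations that are passed
// directly as kernel pointer arguments.
std::set<const void *>
selectFromKernelArgs(const AllocMap &Allocs,
                     const std::vector<const void *> &PtrArgs) {
  std::set<const void *> Selected;
  for (const void *Ptr : PtrArgs)
    if (Allocs.count(Ptr))
      Selected.insert(Ptr);
  return Selected;
}

// Option 2 (exhaustive): select every tracked allocation, however the
// kernel ends up reaching it.
std::set<const void *> selectAll(const AllocMap &Allocs) {
  std::set<const void *> Selected;
  for (const auto &Entry : Allocs)
    Selected.insert(Entry.first);
  return Selected;
}
```

If a kernel receives allocation `H` as an argument and `H` stores a pointer to a second allocation `C`, option 1 selects only `H`, which is the case raised in the question above. Option 2 selects both, at the cost of a prefetch per tracked allocation on every launch.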

npmiller · 2023-08-21T16:23:23Z

friendly ping @jandres742 @aelovikov-intel @intel/llvm-reviewers-runtime

ldrumm · 2023-08-24T12:06:22Z

@aelovikov-intel is this good enough for merge, please?

ldrumm requested a review from a team as a code owner July 17, 2023 17:27

ldrumm requested a review from omarahmed1111 July 18, 2023 16:38

omarahmed1111 requested changes Jul 19, 2023

ldrumm force-pushed the gh-7252 branch from b3c8240 to 31eb868 July 19, 2023 12:52

omarahmed1111 approved these changes Jul 19, 2023

ldrumm force-pushed the gh-7252 branch from 31eb868 to 89fb386 July 19, 2023 14:15

jandres742 reviewed Jul 19, 2023

sycl/plugins/unified_runtime/ur/adapters/hip/context.hpp

jandres742 reviewed Jul 19, 2023

sycl/plugins/unified_runtime/ur/adapters/hip/enqueue.cpp

ldrumm force-pushed the gh-7252 branch from 89fb386 to dae2de9 August 3, 2023 09:33

ldrumm requested a review from a team as a code owner August 3, 2023 09:33

ldrumm requested review from sergey-semenov and jandres742 August 3, 2023 09:33

ldrumm force-pushed the gh-7252 branch from dae2de9 to 6a06d4b August 3, 2023 10:25

jchlanda reviewed Aug 4, 2023

sycl/plugins/unified_runtime/ur/adapters/hip/context.hpp (outdated)

sycl/plugins/unified_runtime/ur/adapters/hip/enqueue.cpp

sycl/plugins/unified_runtime/ur/adapters/hip/kernel.hpp

ldrumm force-pushed the gh-7252 branch 2 times, most recently from c0bd637 to 3187744 August 4, 2023 15:37

ldrumm force-pushed the gh-7252 branch from 3187744 to e9a86c8 August 4, 2023 17:05

ldrumm force-pushed the gh-7252 branch from e9a86c8 to 96cc59a August 7, 2023 07:40

jchlanda approved these changes Aug 7, 2023

jandres742 approved these changes Aug 21, 2023

aelovikov-intel merged commit a6b8fa6 into intel:sycl Aug 24, 2023

ldrumm mentioned this pull request Aug 24, 2023

[SYCL][HIP] sycl::atomic_ref::fetch_sub does not work with USM #7252

Closed

hdelan mentioned this pull request Sep 19, 2023

[SYCL][HIP] Keep track of only shared USM allocations and prefetch only for those #11218

Closed

This was referenced Oct 9, 2023

[SYCL][HIP] Add AMDGPU reflect pass to choose between safe and unsafe AMDGPU atomics #11467

Merged

[HIP] Revert add prefetch for USM hip allocations a6b8fa66b537753415d24076f… oneapi-src/unified-runtime#936

Merged

This was referenced Oct 16, 2023

[AMDGPU] Add an option to disable unsafe uses of atomic xor pasaulais/llvm-project#1

Draft

[AMDGPU] Add an option to disable unsafe uses of atomic xor llvm/llvm-project#69229

Open

ldrumm deleted the gh-7252 branch November 15, 2023 11:27
