Skip to content

[SYCL]: basic support of contexts with multiple devices in Level-Zero #2440

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 16, 2020

Conversation

smaslov-intel
Copy link
Contributor

PI context in Level-Zero can now be created with multiple PI devices in it.
Compile/link/build support is still limited to single-device, though, and will be addressed in subsequent PR.

Signed-off-by: Sergey V Maslov [email protected]

@smaslov-intel smaslov-intel force-pushed the ngpu branch 2 times, most recently from 14620f5 to 96773fe Compare September 8, 2020 17:31
@@ -3100,7 +3129,11 @@ pi_result piSamplerCreate(pi_context Context,
assert(Context);
assert(RetSampler);

ze_device_handle_t ZeDevice = Context->Device->ZeDevice;
// Have the "0" device in context to own the sampler. Rely on Level-Zero
// drivers to perform migration as necessary for sharing it across multiple
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smaslov-intel : What do you mean by rely on Level-Zero for migration across devices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jandres742 , I mean that the image/sampler created on one device with zeImageCreate/zeSamplerCreate are made available for access in other devices of this SYCL context without any explicit transfers in the Level-Zero plugin. Note that for buffers I use zeMemAllocShared but image/sampler don't have shared counterparts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L0 does not implicitly migrate between devices, nor for images nor buffers. From here, https://spec.oneapi.com/level-zero/latest/core/PROG.html#memory:

Shared allocations: Migratable to: Another Device = Optional

Additionally. Access from one device to another device's buffers and images are not supported at all moments. Proper way of confirming such access exist is using zeDeviceCanAccessPeer() https://spec.oneapi.com/level-zero/latest/core/api.html?highlight=canaccess#zedevicecanaccesspeer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jandres742 !

L0 does not implicitly migrate between devices, nor for images nor buffers.

Not even between sub-devices of the same root device?

Shared allocations: Migratable to: Another Device = Optional

So what Optional really means? Is this an option of what?

not supported at all moments

At what circumstances can we rely on it?

Proper way of confirming such access exist is using zeDeviceCanAccessPeer()

But we need a way to guarantee such sharing between devices in the same SYCL context.
Is there a way to force the migration that is optional?
Should we just copy explicitly if no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jandres742 , ping

I've added TODO comments for now and would like to proceed with this PR, please approve if you are OK with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not even between sub-devices of the same root device?

There's no need for migrations between sub-devices, as they are part of the same device and each sub-device can access another sub-device's memory. Issue is when we are are referring about migrations between two separate physical devices.

So what Optional really means? Is this an option of what?

That means that it depends on HW and SW (L0 driver, and kernel driver) support. Not always would be present for all platforms.

But we need a way to guarantee such sharing between devices in the same SYCL context.
Is there a way to force the migration that is optional?
Should we just copy explicitly if no?

Exactly. The only way of enforcing migrations between devices is to implement a SW fallback with bounce buffering in the host.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

@againull againull self-requested a review September 10, 2020 06:39
againull
againull previously approved these changes Sep 10, 2020
Copy link
Contributor

@againull againull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smaslov-intel could you please resolve conflicts?

@againull againull merged commit 129ee44 into intel:sycl Sep 16, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Sep 17, 2020
* upstream/sycl: (405 commits)
  [SYCL] Implement new env var SYCL_DEVICE_FILTER (intel#2239)
  [Driver][SYCL] Make /MD the default for -fsycl (intel#2478)
  [SYCL]: basic support of contexts with multiple devices in Level-Zero (intel#2440)
  [SYCL] Fix LIT regression after 9dd18ca (intel#2481)
  [SYCL][L0] Kernel Destroy in piKernelRelease (intel#2475)
  [SYCL] Emit an aliased function only if it is used (intel#2430)
  [Driver][SYCL] Add defaultlib directive for sycl lib (intel#2464)
  [Driver][SYCL] Improve situations where .exe is added for AOT tools (intel#2467)
  [SYCL][L0]: Check Queue refcnt prior to using members in event wait/release (intel#2471)
  [SYCL] Unroll several loops in __init method accessor class (intel#2449)
  [SYCL][Doc] Add link to use pinned memory spec (intel#2463)
  [SYCL] Link SYCL device libraries by default. (intel#2400)
  Revert "[SYCL] XFAIL test blcoking pulldown"
  Avoid usage of deprecated "VectorType::getNumElements" (intel#737)
  Fix nullptr dereference (intel#741)
  Do not translate arbitrary precision operations without corresponding extensions (intel#714)
  Add Constrained Floating-Point Intrinsics support
  [SYCL] Take into account auxiliary cmake options for Level Zero loader
  [InstCombine] improve fold of pointer differences
  [InstCombine] add ptr difference tests; NFC
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants