[SYCL][Doc] Add initial draft of sycl load_store proposal #5655

jasonsewall-intel · 2022-02-24T14:43:39Z

This adds support for a group of load and store functions that accept property lists. The intention is to allow users to pass behavior-modifying hints; we include a set of nontemporal hints here to be used with these functions.

Signed-off-by: Jason Sewall [email protected]

Signed-off-by: Jason Sewall <[email protected]>

jasonsewall-intel · 2022-03-02T20:34:40Z

What needs to happen next?

bader · 2022-03-03T09:27:51Z

What needs to happen next?

@intel/dpcpp-specification-reviewers is expected to review this proposal.

gmlueck

One global comment, and then some specific comments below. We recently adopted a template for these extension specifications. Please reformat to use that, and also see the instructions in README-process.

gmlueck · 2022-03-04T14:48:01Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_load_store.asciidoc

+This extension is written against the SYCL 2020 specification, Revision 4 and
+the following extensions:
+
+* link:https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_properties.asciidoc[sycl_ext_oneapi_properties]


This link is no longer correct because the extension is now implemented. The spec now lives in the "experimental" directory.

gmlueck · 2022-03-04T14:51:30Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_load_store.asciidoc

+} // namespace sycl
+```
+
+1:: Load and return the object of type `T` at `addr` with the hints in property list `p`. `p` cannot vary across work-items, but `addr` is expected to. Each work-item recieves a copy of the loaded object.


I think it does not make sense to say:

p cannot vary across work-items

because these are not cooperative group functions. Is this just a typo?

Can you give me example of how it could vary across work-items? If it were runtime only, I guess?

The current wording make is seem like the following would not be legal:

if (i == 0) { sycl::ext::oneapi::experimental::properties prop1{sycl::ext::oneapi::experimental::temporality_hint_nontemporal}; sycl::ext::oneapi::experimental::load(ptr1, prop1); } else { sycl::ext::oneapi::experimental::properties prop2{sycl::ext::oneapi::experimental::temporality_hint_temporal}; sycl::ext::oneapi::experimental::load(ptr2, prop2); }

However, since these are not cooperative group functions, it seems like that should be legal.

Maybe I just don't understand SYCL. Are you saying that those two loads are the same call, and that we are effectively calling load with different arguments?

No, I'm saying that the language "cannot vary across work-items" is non-sensical unless you are talking about a function that must be called in convergent code. Since this is not a cooperative group function, that phrase just doesn't make any sense.

gmlueck · 2022-03-04T14:54:50Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_load_store.asciidoc

+
+=== Joint (cooperative) group granularity
+
+The following functions apply to the passed `Group g`; the group cooperates to perform the operation to uniform arguments. These functions follow the restrictions and behaviors described in Sec. 4.17.3: Group functions.


This is just a wording suggestion, but it might be clearer to say that these are "cooperative group functions":

These are cooperative group functions, so they have all the restrictions and behaviors described in Section 4.17.3 "Group functions" of the core SYCL specification.

gmlueck · 2022-03-04T15:10:43Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_load_store.asciidoc

+} // namespace sycl
+```
+
+1:: Load and return the object of type `T` at `addr` with the hints in property list `p`. Each argument must be the same for each work-item in `g`, and a different object is returned for each work-item, unless the `Group` is a `sub_group`, in which case a `sycl::ext::oneapi::experimental::uniform<T>` is returned (see 1b-1c.)


The phrase "a different object is returned for each work-item" seems a little confusing to me. Since all work-items load from the same address, they will all get the same value. I agree that each work-item gets a unique object of type T. However, even in the sub-group case, each work-item gets a unique object of type uniform<T>.

Maybe something like this:

1:: The addr and p arguments must be the same for each work-item in the group. Each work-item loads the object of type T at that address, using the hints in property list p.

1b:: Special case of 1 with sub_group. Each work-item loads the object of type T at that address, and returns that value wrapped in a sycl::ext::oneapi::experimental::uniform<T> object.

gmlueck · 2022-03-04T15:58:15Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_load_store.asciidoc

+} // namespace ext
+} // namespace oneapi
+} // namespace sycl
+```


To me, it seems better to expose the temporal/nontemporal choice as an enum. There should also be predefined variables for each of these properties, which will make their use much less verbose:

namespace sycl::oneapi::ext::experimental { enum class temporality_hint_enum : /*unspecified*/ { nontemporal, temporal }; struct temporality_hint_key { template <temporality_hint_enum Hint> using value_t = property_value<temporality_hint_key, Hint>; }; inline constexpr temporality_hint_key::value_t<temporality_hint_enum::nontemporal> temporality_hint_nontemporal; inline constexpr temporality_hint_key::value_t<temporality_hint_enum::temporal> temporality_hint_temporal; // Etc. for other properties. } // namespace

It would also reduce verbosity if we name the property temporality_key instead of temporality_hint_key.

I had convenience variables in my POC that didn't make it to my proposal; I've fixed that. I don't understand the value of the enum vs. independent classes, to be honest.

Also address review comments

gmlueck · 2022-03-08T19:04:26Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_load_store.asciidoc

+temporality controls at the granularity of memory-transacting instructions. This
+extension provides a groundwork for future extensions that expose pointer- and
+accessor-level semantics. A future extension may provide more
+architecture-specific hints and coarser controls for applying hints.


(adding a global comment here only so it can be a threaded conversation)

After reviewing #5755, I'm wondering if we can use annotated_ptr to convey the properties, and then change this extension to use annotated_ptr. If we did this:

The work-item granularity functions (load() and store()) would be unnecessary because we would just use annotated_ptr:

annotated_ptr aptr(ptr, properties{temporality_hint<nontemporal>}); /*...*/ = *aptr; *aptr = 42;

The joint group functions would use either annotated_ptr or a raw pointer. The annotated_ptr usage would be like this:

// In convergent code /*...*/ = joint_load(g, aptr);

The group block functions would also use either annotated_ptr or a raw pointer. The annotated_ptr usage would be like this:

// In convergent code group_block_store(g, aptr, val);

Oh, man. I guess you don't remember that annotated_ptr work came out of the original proposal for this, and we chose to split this proposal into a 'basic' load/store version and a version that supports annotated_ptr.

There are a few reasons for this:

was that load/store was deemed simpler and easier to propose, and that annotated_ptr could build off of it.

was that having a fine-grained control (load/store) is complementary to a coarse-grained control. load/store can override properties in an annotated_ptr, for example.

Does that make sense? I think it absolutely makes sense to have this work with annotated_ptr, but I don't think we should drop any of the load/store functions.

@gmlueck , did you see this?

Thanks for the ping. At this point we already have a proposed extension for annotated_ptr, so reason (1) is no longer a strong argument. Do we think that point (2) (fine grained control) is enough justification to maintaining two ways to do non-temporal memory accesses?

Regardless, we should decide now how the non-temporary properties interact with annotated_ptr.

If we decided to use annotated_ptr instead of the load/store extension to do non-temporal accesses, I think we would just move the properties to that extension, and add overloads for annotated_ptr as I outline above.

If we keep the non-temporal support in the load/store extension, how would the interaction with annotated_ptr work?

I do think (2) warranted. I see these as complimentary features. annotated_ptr + nontemporal properties allows for a productive, coarse-grained way of marking some semantically important piece of memory, while load/store allow for fine-grained expressions.

In my original proposal, I suggested that load/store would override any annotated properties, and I stand by that. It's a 'most specific' type of behavior that I think users can easily reason about.

I should add that this was renamed "load_store" from "nontemporal" at the suggestion of @Pennycook; we are considering how to add marray/vec behavior as well.

keryell · 2022-03-11T02:55:07Z

Interesting.
Would it be possible to have a code sample showing how to use it?
Otherwise, non_temporal with a _ looks nicer to read.

jasonsewall-intel requested a review from a team as a code owner February 24, 2022 14:43

[SYCL] Add initial draft of sycl load_store proposal

07f2fb6

Signed-off-by: Jason Sewall <[email protected]>

bader changed the title ~~[SYCL] Add initial draft of sycl load_store proposal~~ [SYCL][Doc] Add initial draft of sycl load_store proposal Mar 1, 2022

gmlueck reviewed Mar 4, 2022

View reviewed changes

Jason Sewall added 2 commits March 8, 2022 09:40

[SYCL][Doc] Update load_store proposal to new template

2a5b39e

Also address review comments

[SYCL][Doc] Fix typo in load_store proposal

ed05bf3

gmlueck reviewed Mar 8, 2022

View reviewed changes

github-actions bot added the Stale label Sep 8, 2022

github-actions bot closed this Oct 9, 2022


		=== Joint (cooperative) group granularity

		The following functions apply to the passed `Group g`; the group cooperates to perform the operation to uniform arguments. These functions follow the restrictions and behaviors described in Sec. 4.17.3: Group functions.

[SYCL][Doc] Add initial draft of sycl load_store proposal #5655

[SYCL][Doc] Add initial draft of sycl load_store proposal #5655

Uh oh!

Conversation

jasonsewall-intel commented Feb 24, 2022

Uh oh!

jasonsewall-intel commented Mar 2, 2022

Uh oh!

bader commented Mar 3, 2022

Uh oh!

gmlueck left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

keryell commented Mar 11, 2022

Uh oh!

Uh oh!