Skip to content

Commit 7fb56cf

Browse files
[SYCL] Align some extensions with SYCL 2020 (#4432)
This patch 1. aligns these extensions with SYCL 2020 [section #6 in the spec]: - Enqueue barrier [SYCL_EXT_INTEL_ENQUEUE_BARRIER] - Level Zero backend [SYCL_EXT_ONEAPI_BACKEND_LEVEL_ZERO] - Local memory [SYCL_EXT_ONEAPI_LOCAL_MEMORY] - mem_channel property [SYCL_EXT_INTEL_MEM_CHANNEL_PROPERTY] - USM address spaces [SYCL_EXT_INTEL_USM_ADDRESS_SPACES] 2. deprecates these extensions: - sycl::detail::bit_cast [SYCL_INTEL_bitcast] 3. changes the location of these extensions: - sycl::ext::intel::online_compiler moves to sycl::ext::intel::experimental. sycl::ext::intel::online_compiler is deprecated.
1 parent 24ca9bf commit 7fb56cf

28 files changed

+806
-677
lines changed

sycl/doc/CompilerAndRuntimeDesign.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -915,8 +915,8 @@ space attributes in SYCL mode:
915915
| Address space attribute | SYCL address_space enumeration |
916916
|-------------------------|--------------------------------|
917917
| `__attribute__((opencl_global))` | global_space, constant_space |
918-
| `__attribute__((opencl_global_host))` | global_host_space |
919-
| `__attribute__((opencl_global_device))` | global_device_space |
918+
| `__attribute__((opencl_global_host))` | ext_intel_global_host_space |
919+
| `__attribute__((opencl_global_device))` | ext_intel_global_device_space |
920920
| `__attribute__((opencl_local))` | local_space |
921921
| `__attribute__((opencl_private))` | private_space |
922922
| `__attribute__((opencl_constant))` | N/A

sycl/doc/extensions/EnqueueBarrier/enqueue_barrier.asciidoc

Lines changed: 56 additions & 46 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
= SYCL_INTEL_enqueue_barrier
1+
= SYCL_EXT_ONEAPI_ENQUEUE_BARRIER
22
:source-highlighter: coderay
33
:coderay-linenums-mode: table
44

@@ -25,11 +25,6 @@ NOTE: This document is better viewed when rendered as html with asciidoctor. Gi
2525

2626
This document presents a series of changes proposed for a future version of the SYCL Specification. The goal of this proposal is to provide non-blocking APIs that provide synchronization on SYCL command queue for programmers.
2727

28-
29-
== Name Strings
30-
31-
+SYCL_INTEL_enqueue_barrier+
32-
3328
== Notice
3429

3530
Copyright (c) 2019-2020 Intel Corporation. All rights reserved.
@@ -45,19 +40,35 @@ Because the interfaces defined by this specification are not final and are subje
4540
== Version
4641

4742
Built On: {docdate} +
48-
Revision: 1
43+
Revision: 2
4944

5045
== Contact
5146
Please open an issue in the https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions/[extensions repository]
5247

48+
== Feature Test Macro
49+
50+
This extension provides a feature-test macro as described in the core SYCL
51+
specification section 6.3.3 "Feature test macros". Therefore, an
52+
implementation supporting this extension must predefine the macro
53+
`SYCL_EXT_ONEAPI_ENQUEUE_BARRIER` to one of the values defined in the table below.
54+
Applications can test for the existence of this macro to determine if the
55+
implementation supports this feature, or applications can test the macro's
56+
value to determine which of the extension's APIs the implementation supports.
57+
58+
[%header,cols="1,5"]
59+
|===
60+
|Value |Description
61+
|1 |Initial extension version. Base features are supported.
62+
|===
63+
5364
== Dependencies
5465

55-
This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-6.
66+
This extension is written against the SYCL 2020 specification, revision 3.
5667

5768
== Overview
5869

59-
SYCL 1.2.1 defines a graph-based task execution model, based on kernels or explicit memory operations submitted to out-of-order queues. Dependencies between these kernels are represented by
60-
accessors that form data dependence edges in the execution graph. The USM extension <<usmlink,[1]>> doesn't have accessors, so instead solves
70+
SYCL 2020 defines a graph-based task execution model, based on kernels or explicit memory operations submitted to out-of-order queues. Dependencies between these kernels are represented by
71+
accessors that form data dependence edges in the execution graph. Unified Shared Memory (USM) doesn't have accessors, so instead solves
6172
this by defining `handler::depends_on` methods to specify event-based control dependencies between command groups.
6273

6374
There are situations where defining dependencies based on events is more explicit than desired or required by an application. For instance, the user may know that a given task depends on all previously submitted tasks. Instead of explicitly adding all the required depends_on calls, the user could express this intent via a single call, making the program more concise and explicit.
@@ -75,9 +86,9 @@ two new members to the `queue` class:
7586
[grid="rows"]
7687
[options="header"]
7788
|========================================
78-
|*handler::barrier*|*queue::submit_barrier*
79-
|`void barrier()` | `event submit_barrier()`
80-
|`void barrier( const vector_class<event> &waitList )` | `event submit_barrier( const vector_class<event> &waitList )`
89+
|*handler::ext_oneapi_barrier*|*queue::ext_oneapi_submit_barrier*
90+
|`void ext_oneapi_barrier()` | `event ext_oneapi_submit_barrier()`
91+
|`void ext_oneapi_barrier( const vector_class<event> &waitList )` | `event ext_oneapi_submit_barrier( const vector_class<event> &waitList )`
8192
|========================================
8293

8394
The first variant of the barrier takes no parameters, and waits for all previously submitted commands to the queue to enter the `info::event_command_status::complete` state before any command later submitted to the same queue is allowed to execute. A second variant of the barrier accepts a list of events, with the behavior that no commands submitted to the same queue after barrier submission may execute until all events in the `waitList` have entered the `info::event_command_status::complete` state. Both variants are non-blocking from the host program perspective, in that they do not wait for the barrier conditions to have been met before returning.
@@ -93,7 +104,7 @@ Some forms of the new barrier methods return an `event`, which can be used to pe
93104

94105
CG4 doesn't execute until all previous command groups submitted to the same queue (CG1, CG2, CG3) have entered the completed state.
95106

96-
==== 1. Using `handler::barrier()`:
107+
==== 1. Using `handler::ext_oneapi_barrier()`:
97108

98109
[source,c++,NoName,linenums]
99110
----
@@ -109,7 +120,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
109120
});
110121
111122
Queue.submit([&](cl::sycl::handler& cgh) {
112-
cgh.barrier();
123+
cgh.ext_oneapi_barrier();
113124
});
114125
115126
Queue.submit([&](cl::sycl::handler& cgh) {
@@ -118,7 +129,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
118129
...
119130
----
120131

121-
==== 2. Using `queue::submit_barrier()`:
132+
==== 2. Using `queue::ext_oneapi_submit_barrier()`:
122133

123134
[source,c++,NoName,linenums]
124135
----
@@ -133,7 +144,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
133144
// CG3
134145
});
135146
136-
Queue.submit_barrier();
147+
Queue.ext_oneapi_submit_barrier();
137148
138149
Queue.submit([&](cl::sycl::handler& cgh) {
139150
// CG4
@@ -146,7 +157,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
146157

147158
CG3 requires CG1 (in Queue1) and CG2 (in Queue2) to have completed before it (CG3) begins execution.
148159

149-
==== 1. Using `handler::barrier()`:
160+
==== 1. Using `handler::ext_oneapi_barrier()`:
150161

151162
[source,c++,NoName,linenums]
152163
----
@@ -160,7 +171,7 @@ auto event_barrier2 = Queue2.submit([&](cl::sycl::handler& cgh) {
160171
});
161172
162173
Queue3.submit([&](cl::sycl::handler& cgh) {
163-
cgh.barrier( vector_class<event>{event_barrier1, event_barrier2} );
174+
cgh.ext_oneapi_barrier( vector_class<event>{event_barrier1, event_barrier2} );
164175
});
165176
166177
Queue3.submit([&](cl::sycl::handler& cgh) {
@@ -169,7 +180,7 @@ Queue3.submit([&](cl::sycl::handler& cgh) {
169180
...
170181
----
171182

172-
==== 2. Using `queue::submit_barrier()`:
183+
==== 2. Using `queue::ext_oneapi_submit_barrier()`:
173184

174185
[source,c++,NoName,linenums]
175186
----
@@ -182,7 +193,7 @@ auto event_barrier2 = Queue2.submit([&](cl::sycl::handler& cgh) {
182193
// CG2
183194
});
184195
185-
Queue3.submit_barrier( vector_class<event>{event_barrier1, event_barrier2} );
196+
Queue3.ext_oneapi_submit_barrier( vector_class<event>{event_barrier1, event_barrier2} );
186197
187198
Queue3.submit([&](cl::sycl::handler& cgh) {
188199
// CG3
@@ -211,44 +222,45 @@ void wait();
211222
template <typename T>
212223
event submit(T cgf, const queue &secondaryQueue);
213224
214-
event submit_barrier();
225+
event ext_oneapi_submit_barrier();
215226
216-
event submit_barrier( const vector_class<event> &waitList );
227+
event ext_oneapi_submit_barrier( const vector_class<event> &waitList );
217228
218229
void wait();
219230
...
220231
----
221-
=== Add rows to Table 4.22
232+
=== Add rows to Table 28
222233

223234
[cols="70,300"]
224235
[grid="rows"]
225236
[options="header"]
226237
|========================================
227238
|*Member functions*|*Description*
228-
|`event submit_barrier()` | Same effect as submitting a `handler::barrier()` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (implicitly from all previously submitted commands to the same queue) have entered the `info::event_command_status::complete` state.
229-
|`event submit_barrier( const vector_class<event> &waitList )` | Same effect as submitting a `handler:barrier( const vector_class<event> &waitList )` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (explicitly from `waitList`) have entered the `info::event_command_status::complete` state.
239+
|`event ext_oneapi_submit_barrier()` | Same effect as submitting a `handler::ext_oneapi_barrier()` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (implicitly from all previously submitted commands to the same queue) have entered the `info::event_command_status::complete` state.
240+
|`event ext_oneapi_submit_barrier( const vector_class<event> &waitList )` | Same effect as submitting a `handler:ext_oneapi_barrier( const vector_class<event> &waitList )` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (explicitly from `waitList`) have entered the `info::event_command_status::complete` state.
230241
|========================================
231242

232243

233-
=== Modify Section 4.8.2
244+
=== Modify Section 4.9.3
234245

235246
==== Change first sentence from:
236-
A command group scope in SYCL, as it is defined in Section 3.4.1, consists of a single kernel or explicit memory
237-
operation (handler methods such as copy, update_host, fill), together with its requirements.
247+
The member functions and objects defined in this scope will define the requirements for the kernel execution or
248+
explicit memory operation, and will be used by the SYCL runtime to evaluate if the operation is ready for execution.
238249

239250
==== To:
240251

241-
A command group scope in SYCL, as it is defined in Section 3.4.1, consists of a single kernel, explicit memory
242-
operation (handler methods such as copy, update_host, fill) or barrier, together with its requirements.
252+
The member functions and objects defined in this scope will define the requirements for the kernel execution,
253+
explicit memory operation or barrier, and will be used by the SYCL runtime to evaluate if the operation is ready for execution.
254+
243255

244-
=== Modify part of Section 4.8.3
256+
=== Modify part of Section 4.9.4
245257

246258
*Change from:*
247259
[source,c++,NoName,linenums]
248260
----
249261
...
250-
template<typename T, int dim, access::mode mode, access::target tgt>
251-
void fill(accessor<T, dim, mode, tgt> dest, const T& src);
262+
template <typename T>
263+
void fill(void *ptr, const T &pattern, size_t count);
252264
253265
};
254266
...
@@ -258,39 +270,36 @@ void fill(accessor<T, dim, mode, tgt> dest, const T& src);
258270
[source,c++,NoName,linenums]
259271
----
260272
...
261-
template<typename T, int dim, access::mode mode, access::target tgt>
262-
void fill(accessor<T, dim, mode, tgt> dest, const T& src);
273+
template <typename T>
274+
void fill(void *ptr, const T &pattern, size_t count);
263275
264-
void barrier();
276+
void ext_oneapi_barrier();
265277
266-
void barrier( const vector_class<event> &waitList );
278+
void ext_oneapi_barrier( const vector_class<event> &waitList );
267279
268280
};
269281
...
270282
----
271283

272-
=== Add a new section between Section 4.8.6 and 4.8.7
284+
=== Add a new section between Section 4.9.4 and 4.9.5
273285

274-
4.8.X SYCL functions for enqueued synchronization barriers
286+
4.9.X SYCL functions for enqueued synchronization barriers
275287

276288
Barriers may be submitted to a queue, with the effect that they prevent later operations submitted to the same queue from executing until the barrier wait conditions have been satisfied. The wait conditions can be explicitly described by `waitList` or implicitly from all previously submitted commands to the same queue. There are no constraints on the context from which queues may participate in the `waitList`. Enqueued barriers do not block host program execution, but instead form additional dependence edges with the execution task graph.
277289

278290
Barriers can be created by two members of the `handler` class that force synchronization on the SYCL command queue. The first variant of the `handler` barrier (`handler::barrier()`) takes no parameters, and waits for all previously submitted commands to the queue to enter the `info::event_command_status::complete` state before any command later submitted to the same queue is allowed to execute. The second variant of the `handler` barrier (`handler::barrier( const vector_class<event> &waitList )`) accepts a list of events, with the behavior that no commands submitted to the same queue after barrier submission may execute until all events in the waitList have entered the `info::event_command_status::complete` state.
279291

280-
=== Add a new table in the new section between 4.8.6 and 4.8.7: Member functions of the handler class.
292+
=== Add a new table in the new section between 4.9.4 and 4.9.5: Member functions of the handler class.
281293

282294
[cols="70,300"]
283295
[grid="rows"]
284296
[options="header"]
285297
|========================================
286298
|*Member functions*|*Description*
287-
|`void barrier()` | Prevents any commands submitted afterward to this queue from executing until all commands previously submitted to this queue have entered the `info::event_command_status::complete` state.
288-
|`void barrier( const vector_class<event> &waitList` ) | Prevents any commands submitted afterward to this queue from executing until all events in `waitList` have entered the `info::event_command_status::complete` state. If `waitList` is empty, then the barrier has no effect.
299+
|`void ext_oneapi_barrier()` | Prevents any commands submitted afterward to this queue from executing until all commands previously submitted to this queue have entered the `info::event_command_status::complete` state.
300+
|`void ext_oneapi_barrier( const vector_class<event> &waitList` ) | Prevents any commands submitted afterward to this queue from executing until all events in `waitList` have entered the `info::event_command_status::complete` state. If `waitList` is empty, then the barrier has no effect.
289301
|========================================
290302

291-
== References
292-
1. [[usmlink]]https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/USM.adoc
293-
294303
== Issues
295304

296305
None.
@@ -303,6 +312,7 @@ None.
303312
|========================================
304313
|Rev|Date|Author|Changes
305314
|1|2020-02-26|Ye Ting|*Initial public release*
315+
|2|2021-08-30|Dmitry Vodopyanov|*Updated according to SYCL 2020 reqs for extensions*
306316
|========================================
307317

308318
//************************************************************************

sycl/doc/extensions/LevelZeroBackend/LevelZeroBackend.md

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,6 @@ The currently supported targets are all Intel GPUs starting with Gen9.
99

1010
NOTE: This specification is a draft. While describing the currently implemented behaviors it is known to be not complete nor exhaustive.
1111
We shall continue to add more information, e.g. explain general mapping of SYCL programming model to Level-Zero API.
12-
It will also be gradually changing to a SYCL-2020 conforming implementation.
1312

1413
## 2. Prerequisites
1514

@@ -23,7 +22,7 @@ The Level-Zero backend is added to the cl::sycl::backend enumeration:
2322
``` C++
2423
enum class backend {
2524
// ...
26-
level_zero,
25+
ext_oneapi_level_zero,
2726
// ...
2827
};
2928
```
@@ -55,7 +54,7 @@ and they must be included in the order shown:
5554
5655
``` C++
5756
#include "level_zero/ze_api.h"
58-
#include "sycl/backend/level_zero.hpp"
57+
#include "sycl/ext/oneapi/backend/level_zero.hpp"
5958
```
6059
### 4.1 Mapping of SYCL objects to Level-Zero handles
6160

@@ -71,7 +70,7 @@ These SYCL objects encapsulate the corresponding Level-Zero handles:
7170

7271
### 4.2 Obtaining of native Level-Zero handles from SYCL objects
7372
74-
The ```get_native<cl::sycl::backend::level_zero>()``` member function is how a raw native Level-Zero handle can be obtained
73+
The ```get_native<cl::sycl::backend::ext_oneapi_level_zero>()``` member function is how a raw native Level-Zero handle can be obtained
7574
for a specific SYCL object. It is currently supported for SYCL ```platform```, ```device```, ```context```, ```queue```, ```event```
7675
and ```program``` classes. There is also a free-function defined in ```cl::sycl``` namespace that can be used instead of the member function:
7776
``` C++
@@ -81,7 +80,7 @@ auto get_native(const SyclObjectT &Obj) ->
8180
```
8281
### 4.3 Construct a SYCL object from a Level-Zero handle
8382
84-
The following free functions defined in the ```cl::sycl::level_zero``` namespace allow an application to create
83+
The following free functions defined in the ```cl::sycl::ext::oneapi::level_zero``` namespace allow an application to create
8584
a SYCL object that encapsulates a corresponding Level-Zero object:
8685
8786
| Level-Zero interoperability function |Description|
@@ -103,11 +102,15 @@ some interoperability API supports overriding this behavior and keep the ownersh
103102
Use this enumeration for explicit specification of the ownership:
104103
``` C++
105104
namespace sycl {
105+
namespace ext {
106+
namespace oneapi {
106107
namespace level_zero {
107108
108109
enum class ownership { transfer, keep };
109110
110111
} // namespace level_zero
112+
} // namespace oneapi
113+
} // namespace ext
111114
} // namespace sycl
112115
```
113116
@@ -193,3 +196,4 @@ struct free_memory {
193196
|3|2021-04-13|James Brodman|Free Memory Query
194197
|4|2021-07-06|Rehana Begam|Introduced explicit ownership for queue
195198
|5|2021-07-25|Sergey Maslov|Introduced SYCL interop for events
199+
|6|2021-08-30|Dmitry Vodopyanov|Updated according to SYCL 2020 reqs for extensions

sycl/doc/extensions/MemChannel/SYCL_INTEL_mem_channel_property.asciidoc renamed to sycl/doc/extensions/MemChannel/MemChannel.asciidoc

Lines changed: 22 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
= SYCL_INTEL_mem_channel_property
1+
= SYCL_EXT_INTEL_MEM_CHANNEL_PROPERTY
22

33
== Introduction
44
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.
@@ -23,14 +23,30 @@ Because the interfaces defined by this specification are not final and are subje
2323
== Version
2424

2525
Built On: {docdate} +
26-
Revision: 1
26+
Revision: 2
2727

2828
== Dependencies
2929

30-
This extension is written against the SYCL 2020 provisional specification, Revision 1.
30+
This extension is written against the SYCL 2020 specification, Revision 3.
3131

3232
The use of this extension requires a target that supports cl_intel_mem_channel_property or equivalent if OpenCL is used as the underlying device runtime.
3333

34+
== Feature Test Macro
35+
36+
This extension provides a feature-test macro as described in the core SYCL
37+
specification section 6.3.3 "Feature test macros". Therefore, an
38+
implementation supporting this extension must predefine the macro
39+
`SYCL_EXT_INTEL_MEM_CHANNEL_PROPERTY` to one of the values defined in the table below.
40+
Applications can test for the existence of this macro to determine if the
41+
implementation supports this feature, or applications can test the macro's
42+
value to determine which of the extension's APIs the implementation supports.
43+
44+
[%header,cols="1,5"]
45+
|===
46+
|Value |Description
47+
|1 |Initial extension version. Base features are supported.
48+
|===
49+
3450
== Overview
3551

3652
On some targets manual assignment of buffers to memory regions can improve memory bandwidth. This extension adds a buffer property to indicate in which memory channel a particular buffer should be allocated. This information is an optimization hint to the runtime and thus it is legal to ignore.
@@ -59,7 +75,7 @@ Add a new constructor to Table 4.34: Constructors of the buffer property classes
5975
|===
6076
--
6177

62-
Add a new member function to Table 4.35: Member functions of the buffer property classes as follows:
78+
Add a new member function to Table 42: Member functions of the buffer property classes as follows:
6379

6480
--
6581
[options="header"]
@@ -87,7 +103,7 @@ enum class aspect {
87103
} // namespace sycl
88104
```
89105

90-
Add an entry for the new aspect to Table 4.20: Device aspects defined by the core SYCL specification:
106+
Add an entry for the new aspect to Table 26: Device aspects defined by the core SYCL specification:
91107

92108
--
93109
[options="header"]
@@ -107,4 +123,5 @@ Add an entry for the new aspect to Table 4.20: Device aspects defined by the cor
107123
|========================================
108124
|Rev|Date|Author|Changes
109125
|1|2020-10-26|Joe Garvey|*Initial public draft*
126+
|2|2021-08-30|Dmitry Vodopyanov|*Updated according to some SYCL 2020 reqs for extensions*
110127
|========================================

0 commit comments

Comments
 (0)