Skip to content

Commit 74b54b4

Browse files
gmlueckPennycook
andauthored
[SYCL][Doc] Add spec to get device image backend content (#14811)
Add a proposed specification for an extension that returns the backend content of a device image. --------- Co-authored-by: John Pennycook <[email protected]>
1 parent d7b2605 commit 74b54b4

File tree

2 files changed

+266
-2
lines changed

2 files changed

+266
-2
lines changed
Lines changed: 255 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,255 @@
1+
= sycl_ext_oneapi_device_image_backend_content
2+
3+
:source-highlighter: coderay
4+
:coderay-linenums-mode: table
5+
6+
// This section needs to be after the document title.
7+
:doctype: book
8+
:toc2:
9+
:toc: left
10+
:encoding: utf-8
11+
:lang: en
12+
:dpcpp: pass:[DPC++]
13+
:endnote: &#8212;{nbsp}end{nbsp}note
14+
15+
// Set the default source code type in this document to C++,
16+
// for syntax highlighting purposes. This is needed because
17+
// docbook uses c++ and html5 uses cpp.
18+
:language: {basebackend@docbook:c++:cpp}
19+
20+
21+
== Notice
22+
23+
[%hardbreaks]
24+
Copyright (C) 2024 Intel Corporation. All rights reserved.
25+
26+
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
27+
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
28+
permission by Khronos.
29+
30+
31+
== Contact
32+
33+
To report problems with this extension, please open a new issue at:
34+
35+
https://github.com/intel/llvm/issues
36+
37+
38+
== Dependencies
39+
40+
This extension is written against the SYCL 2020 revision 9 specification.
41+
All references below to the "core SYCL specification" or to section numbers in
42+
the SYCL specification refer to that revision.
43+
44+
45+
== Status
46+
47+
This is a proposed extension specification, intended to gather community
48+
feedback.
49+
Interfaces defined in this specification may not be implemented yet or may be
50+
in a preliminary state.
51+
The specification itself may also change in incompatible ways before it is
52+
finalized.
53+
*Shipping software products should not rely on APIs defined in this
54+
specification.*
55+
56+
57+
== Overview
58+
59+
This extension adds a mechanism to obtain the raw backend content of the device
60+
images that are in a kernel bundle.
61+
The format of this content is implementation-defined, so applications that make
62+
use of this extension are not expected to be portable to other implementations
63+
of SYCL.
64+
65+
66+
== Specification
67+
68+
=== Feature test macro
69+
70+
This extension provides a feature-test macro as described in the core SYCL
71+
specification.
72+
An implementation supporting this extension must predefine the macro
73+
`SYCL_EXT_ONEAPI_DEVICE_IMAGE_BACKEND_CONTENT` to one of the values defined in
74+
the table below.
75+
Applications can test for the existence of this macro to determine if the
76+
implementation supports this feature, or applications can test the macro's
77+
value to determine which of the extension's features the implementation
78+
supports.
79+
80+
[%header,cols="1,5"]
81+
|===
82+
|Value
83+
|Description
84+
85+
|1
86+
|The APIs of this experimental extension are not versioned, so the
87+
feature-test macro always has this value.
88+
|===
89+
90+
=== New member functions in the `device_image` class
91+
92+
This extension adds the following member functions to the `device_image` class.
93+
94+
[source,c++]
95+
----
96+
namespace sycl {
97+
98+
template <bundle_state State>
99+
class device_image {
100+
public:
101+
backend ext_oneapi_get_backend() const noexcept;
102+
std::vector<std::byte> ext_oneapi_get_backend_content() const;
103+
104+
#if defined(__cpp_lib_span)
105+
std::span<std::byte> ext_oneapi_get_backend_content_view() const;
106+
#endif
107+
108+
/*...*/
109+
};
110+
111+
} // namespace sycl
112+
----
113+
114+
'''
115+
116+
[frame=all,grid=none,separator="@"]
117+
!====
118+
a@
119+
[source,c++]
120+
----
121+
backend ext_oneapi_get_backend() const noexcept;
122+
----
123+
!====
124+
125+
_Returns:_ The backend that is associated with this device image.
126+
This is always the same as the backend of the kernel bundle that contains this
127+
device image.
128+
129+
'''
130+
131+
[frame=all,grid=none,separator="@"]
132+
!====
133+
a@
134+
[source,c++]
135+
----
136+
std::vector<std::byte> ext_oneapi_get_backend_content() const;
137+
----
138+
!====
139+
140+
_Constraints:_ Available only when `State` is `bundle_state::executable`.
141+
142+
_Returns:_ A copy of the raw backend content for this device image.
143+
The format of this data is implementation-defined.
144+
See below for a description of the formats used by {dpcpp}.
145+
146+
'''
147+
148+
[frame=all,grid=none,separator="@"]
149+
!====
150+
a@
151+
[source,c++]
152+
----
153+
std::span<std::byte> ext_oneapi_get_content_backend_view() const;
154+
----
155+
!====
156+
157+
Available only when the compiler defines the `__cpp_lib_span` feature-test macro
158+
(which is defined in {cpp}20 and higher).
159+
160+
_Constraints:_ Available only when `State` is `bundle_state::executable`.
161+
162+
_Returns:_ A view of the raw backend content for this device image.
163+
The data behind this view has the same lifetime as the `device_image` object.
164+
The format of this data is implementation-defined.
165+
See below for a description of the formats used by {dpcpp}.
166+
167+
'''
168+
169+
170+
== Device image format for {dpcpp}
171+
172+
This section is non-normative and applies only to the {dpcpp} implementation.
173+
The format of the data returned by
174+
`device_image::ext_oneapi_get_backend_content` and
175+
`device_image::ext_oneapi_get_backend_content_view` depends on the backend of the
176+
kernel bundle that contains the device image.
177+
178+
=== Format on Level Zero
179+
180+
The device image's backend content is native ISA for the device, which can be
181+
passed to `zeModuleCreate` as `ZE_MODULE_FORMAT_NATIVE` format.
182+
183+
:ref1: ../proposed/sycl_ext_oneapi_free_function_kernels.asciidoc#level-zero-and-opencl-compatibility
184+
185+
[_Note:_ The interface to kernels in the device image backend content is not
186+
defined in the general case, which means there is no portable way to invoke
187+
kernels from a Level Zero module that is created from the raw device image
188+
content.
189+
However, see link:{ref1}[here] for a limited case where this portability is
190+
guaranteed.
191+
_{endnote}_]
192+
193+
=== Format on OpenCL
194+
195+
The device image's backend content is executable binary device code representing
196+
one or more kernels, which can be passed to `clCreateProgramWithBinary`.
197+
198+
[_Note:_ The interface to kernels in the device image backend content is not
199+
defined in the general case, which means there is no portable way to invoke
200+
kernels from a OpenCL `cl_program` object that is created from the raw device
201+
image content.
202+
However, see link:{ref1}[here] for a limited case where this portability is
203+
guaranteed.
204+
_{endnote}_]
205+
206+
=== Format on CUDA
207+
208+
The device image's backend content is a CUBIN module representing one or more
209+
kernels.
210+
211+
212+
== Example
213+
214+
:ref2: ../proposed/sycl_ext_oneapi_free_function_kernels.asciidoc
215+
216+
A kernel bundle can contain multiple device images with different
217+
representations of the same kernel for different devices.
218+
This example shows how to get the device image's backend content for a
219+
particular kernel for a particular device.
220+
Note that this example also uses the kernel syntax described in link:{ref2}[
221+
sycl_ext_oneapi_free_function_kernels], but it is not necessary to define
222+
kernels in that syntax when using this extension.
223+
224+
[source,c++]
225+
----
226+
#include <sycl/sycl.hpp>
227+
namespace syclext = sycl::ext::oneapi;
228+
namespace syclexp = sycl::ext::oneapi::experimental;
229+
230+
SYCL_EXT_ONEAPI_FUNCTION_PROPERTY((syclexp::nd_range_kernel<1>))
231+
void iota(float start, float *ptr) {
232+
size_t id = syclext::this_work_item::get_nd_item().get_global_linear_id();
233+
ptr[id] = start + static_cast<float>(id);
234+
}
235+
236+
void main() {
237+
sycl::device d;
238+
sycl::queue q{d};
239+
sycl::context ctxt = q.get_context();
240+
241+
// Get a kernel bundle that contains the kernel "iota".
242+
sycl::kernel_id iota = syclexp::get_kernel_id<iota>();
243+
auto exe_bndl =
244+
sycl::get_kernel_bundle<sycl::bundle_state::executable>(ctxt, {iota});
245+
246+
std::vector<std::byte> bytes;
247+
for (auto& img: bundle) {
248+
// Search for the device image that contains "iota" for this device.
249+
if (img.has_kernel(iota, dev)) {
250+
bytes = img.ext_oneapi_get_backend_content();
251+
break;
252+
}
253+
}
254+
}
255+
----

sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -877,15 +877,14 @@ int main() {
877877
```
878878

879879

880+
[[level-zero-and-opencl-compatibility]]
880881
== {dpcpp} guaranteed compatibility with Level Zero and OpenCL backends
881882

882883
The contents of this section are non-normative and apply only to the {dpcpp}
883884
implementation.
884885
Kernels written using the free function kernel syntax can be submitted to a
885886
device by using the Level Zero or OpenCL backends, without going through the
886887
SYCL host runtime APIs.
887-
This works only when the kernel is AOT compiled to native device code using the
888-
`-fsycl-targets` compiler option.
889888

890889
The interface to the kernel in the native device code module is only guaranteed
891890
when the kernel adheres to the following restrictions:
@@ -899,6 +898,16 @@ when the kernel adheres to the following restrictions:
899898
* The translation unit containing the kernel is compiled with the
900899
`-fno-sycl-dead-args-optimization` option.
901900

901+
In order to invoke a kernel using Level Zero or OpenCL, the application must
902+
first obtain the raw backend content of the device image that contains the
903+
kernel.
904+
One way to do this is by using
905+
link:../proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc[
906+
sycl_ext_oneapi_device_image_backend_content].
907+
It is also possible to compile the application in AOT mode via the
908+
`-fsycl-targets` compiler option and then extract the device image's backend
909+
content from the executable file.
910+
902911
Both Level Zero and OpenCL identify a kernel via a _name_ string.
903912
(See `zeKernelCreate` and `clCreateKernel` in their respective specifications.)
904913
When a kernel is defined according to the restrictions above, the _name_ is

0 commit comments

Comments
 (0)