diff --git a/sycl/doc/extensions/KernelProperties/KernelProperties.asciidoc b/sycl/doc/extensions/KernelProperties/KernelProperties.asciidoc new file mode 100644 index 0000000000000..6c2949a38f865 --- /dev/null +++ b/sycl/doc/extensions/KernelProperties/KernelProperties.asciidoc @@ -0,0 +1,488 @@ += SYCL_EXT_ONEAPI_KERNEL_PROPERTIES +:source-highlighter: coderay +:coderay-linenums-mode: table + +// This section needs to be after the document title. +:doctype: book +:toc2: +:toc: left +:encoding: utf-8 +:lang: en + +:blank: pass:[ +] + +// Set the default source code type in this document to C++, +// for syntax highlighting purposes. This is needed because +// docbook uses c++ and html5 uses cpp. +:language: {basebackend@docbook:c++:cpp} + +== Introduction +IMPORTANT: This specification is a draft. + +NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are +trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. +used by permission by Khronos. + +NOTE: This document is better viewed when rendered as html with asciidoctor. +GitHub does not render image icons. + +This extension introduces a replacement for the kernel attributes defined in +Section 5.8.1 of the SYCL 2020 specification, in the form of a `property_list` +accepting properties with compile-time constant values. + +== Notice + +Copyright (c) 2021 Intel Corporation. All rights reserved. + +== Status + +Working Draft + +This is a preview extension specification, intended to provide early access to +a feature for review and community feedback. When the feature matures, this +specification may be released as a formal extension. + +Because the interfaces defined by this specification are not final and are +subject to change they are not intended to be used by shipping software +products. + +== Version + +Built On: {docdate} + +Revision: 1 + +== Contributors + +Jessica Davies, Intel + +Joe Garvey, Intel + +Greg Lueck, Intel + +John Pennycook, Intel + +Roland Schulz, Intel + +== Dependencies + +This extension is written against the SYCL 2020 specification, Revision 3 and +the following extensions: + +- https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/PropertyList/SYCL_EXT_ONEAPI_property_list.asciidoc[SYCL_EXT_ONEAPI_PROPERTY_LIST] + +== Feature Test Macro + +This extension provides a feature-test macro as described in the core SYCL +specification section 6.3.3 "Feature test macros". Therefore, an +implementation supporting this extension must predefine the macro +`SYCL_EXT_ONEAPI_KERNEL_PROPERTIES` to one of the values defined in the table +below. Applications can test for the existence of this macro to determine if +the implementation supports this feature, or applications can test the macro's +value to determine which of the extension's APIs the implementation supports. + +[%header,cols="1,5"] +|=== +|Value |Description +|1 |Initial extension version. Base features are supported. +|=== + +== Overview + +SYCL 2020 allows for attributes to be applied to the function-type of kernel +function declarations. These attributes are often used to pass information +to a backend device compiler. + +There are several known disadvantages to relying on attributes for such +information, including: + +- Host {cpp} compilers are free to ignore unrecognized attributes, implying + that attributes should not be employed to convey information that cannot be + ignored. Many of the attributes in SYCL 2020 convey information that cannot + be ignored (e.g. a kernel may only execute correctly with a specific + sub-group size, or on devices which have specific aspects). + +- Library-only implementations of SYCL cannot reason about attributes at all. + +- SMCP implementations of SYCL must ensure that the host and device compiler + both understand and agree upon the meaning of each attribute. + +- It is complicated (although not impossible) to generate multiple variants of + a kernel with different combinations of attributes. + +- There is no way to inspect the properties associated with a kernel at + compile-time (e.g. via type traits or similar); any such inspection must be + performed at run-time and *after* compiling the kernel. + +This extension proposes a replacement for these kernel attributes, in the form +of a `property_list` accepting properties with compile-time constant +values, to address several of these issues. + +== Kernel Properties + +The kernel properties below correspond to kernel attributes defined in +Section 5.8.1 of the SYCL 2020 specification. Note that deprecated attributes +(such as `vec_type_hint`) are not included. + +```c++ +namespace sycl { +namespace ext { +namespace oneapi { + +// Corresponds to reqd_work_group_size +struct work_group_size { + template + using value_t = property_value...>; +}; // work_group_size + +// Corresponds to work_group_size_hint +struct work_group_size_hint { + template + using value_t = property_value...>; +}; // work_group_size_hint + +// Corresponds to reqd_sub_group_size +struct sub_group_size { + template + using value_t = property_value>; +}; // sub_group_size + +// Corresponds to device_has +struct device_has { + template + using value_t = property_value...>; +}; // device_has + +template +struct property_value...> { + constexpr size_t operator[](int dim); +}; + +template +struct property_value...> { + constexpr size_t operator[](int dim); +}; + +template +struct property_value...> { + static constexpr std::array value; +}; + +template +inline constexpr work_group_size::value_t work_group_size_v; + +template +inline constexpr work_group_size_hint::value_t work_group_size_hint_v; + +template +inline constexpr sub_group_size::value_t sub_group_size_v; + +template +inline constexpr device_has::value_t device_has_v; + +} // namespace oneapi +} // namespace ext +} // namespace sycl +``` + +|=== +|Property|Description + +|`work_group_size` +|The `work_group_size` property adds the requirement that the kernel must be + launched with the specified work-group size. The number of template arguments + in the `Dims` parameter pack must match the dimensionality of the work-group + used to invoke the kernel. The order of the template arguments matches the + constructor of the `range` class. An implementation may throw an exception + for certain combinations of property values, devices and launch configurations, + as described for the `reqd_work_group_size` attribute in Table 180 of the + SYCL 2020 specification. + +|`work_group_size_hint` +|The `work_group_size_hint` property hints to the compiler that the kernel is + likely to be launched with the specified work-group size. The number of + template arguments in the `Dims` parameter pack must match the dimensionality + of the work-group used to invoke the kernel. The order of the template + arguments matches the constructor of the `range` class. + +|`sub_group_size` +|The `sub_group_size` property adds the requirement that the kernel must be + compiled and executed with the specified sub-group size. An implementation may + throw an exception for certain combinations of property values, devices and + launch configurations, as described for the `reqd_sub_group_size` attribute + in Table 180 of the SYCL 2020 specification. + +|`device_has` +|The `device_has` property adds the requirement that the kernel must be + launched on a device that has all of the aspects listed in the `Aspects` + parameter pack. An implementation may throw an exception or issue a + diagnostic for certain combinations of aspects, devices and kernel functions, + as described for the `device_has` attribute in Table 180 of the SYCL 2020 + specification. + +|=== + +SYCL implementations may introduce additional kernel properties. If any +combinations of kernel attributes are invalid, this must be clearly documented +as part of the new kernel property definition. + +== Adding a `property_list` to a Kernel Launch + +To enable properties to be associated with kernels, this extension adds +new overloads to each of the variants of `single_task`, `parallel_for` and +`parallel_for_work_group` defined in the `queue` and `handler` classes. These +new overloads accept a `sycl::ext::oneapi::property_list` argument. For +variants accepting a parameter pack, the `sycl::ext::oneapi::property_list` +argument is inserted immediately prior to the parameter pack; for variants not +accepting a parameter pack, the `sycl::ext::oneapi::property_list` argument is +inserted immediately prior to the kernel function. + +The overloads introduced by this extension are listed below: +```c++ +namespace sycl { +class queue { + public: + template + event single_task(PropertyList properties, const KernelType &kernelFunc); + + template + event single_task(event depEvent, PropertyList properties, + const KernelType &kernelFunc); + + template + event single_task(const std::vector &depEvents, + PropertyList properties, + const KernelType &kernelFunc); + + template + event parallel_for(range numWorkItems, + Rest&&... rest); + + template + event parallel_for(range numWorkItems, event depEvent, + PropertyList properties, + Rest&&... rest); + + template + event parallel_for(range numWorkItems, + const std::vector &depEvents, + PropertyList properties, + Rest&&... rest); + + template + event parallel_for(nd_range executionRange, + PropertyList properties, + Rest&&... rest); + + template + event parallel_for(nd_range executionRange, + event depEvent, + PropertyList properties, + Rest&&... rest); + + template + event parallel_for(nd_range executionRange, + const std::vector &depEvents, + PropertyList properties, + Rest&&... rest); +} +} + +namespace sycl { +class handler { + public: + template + void single_task(PropertyList properties, const KernelType &kernelFunc); + + template + void parallel_for(range numWorkItems, + PropertyList properties, + Rest&&... rest); + + template + void parallel_for(nd_range executionRange, + PropertyList properties, + Rest&&... rest); + + template + void parallel_for_work_group(range numWorkGroups, + PropertyList properties, + const WorkgroupFunctionType &kernelFunc); + + template + void parallel_for_work_group(range numWorkGroups, + range workGroupSize, + PropertyList properties, + const WorkgroupFunctionType &kernelFunc); +} +} +``` + +Passing properties as an argument in this way allows properties to be +associated with a kernel function without modifying its type. This enables +the same kernel function (e.g. a lambda) to be submitted multiple times with +different properties, or for libraries building on SYCL to add properties +(e.g. for performance reasons) to user-provided kernel functions. + +All the properties defined in this extension have compile-time values. However, +an implementation may support additional properties which could have run-time +values. When this occurs, the `properties` parameter may be a `property_list` +containing a mix of both run-time and compile-time values, and a SYCL +implementation should respect both run-time and compile-time information when +determining the correct way to launch a kernel. However, only compile-time +information can modify the compilation of the kernel function itself. + +A simple example of using this extension to set a required work-group size +and required sub-group size is given below: + +```c++ +sycl::ext::oneapi::property_list properties{sycl::ext::oneapi::work_group_size_v<8, 8>, + sycl::ext::oneapi::sub_group_size_v<8>}; +q.parallel_for(range<2>{16, 16}, properties, [=](id<2> i) { + a[i] = b[i] + c[i]; +}).wait(); +``` + +== Encoding Properties into a Kernel + +In other situations it may be useful to encode a kernel's properties directly +into its type, to ensure that a kernel cannot be launched without a property +that it depends upon for correctness. + +To enable this use-case, this extension adds a mechanism for implementations to +extract a property list from a kernel functor, if a kernel functor declares +a `property_list` member variable named `properties`. Note that this member +variable must be `static constexpr`, and kernel functors can therefore only +encode properties with compile-time values. + +NOTE: The attribute mechanism in SYCL 2020 allows for different kernel +attributes to be applied to different call operators within the same +functor. The `property_list` member variable applies to all call operators in +the functor. + +The example below shows how the kernel from the previous section could be +rewritten to leverage an embedded property list: + +```c++ +struct KernelFunctor { + + KernelFunctor(sycl::accessor a, + sycl::accessor b, + sycl::accessor c) : a(a), b(b), c(c) + {} + + void operator()(id<2> i) const { + a[i] = b[i] + c[i]; + } + + static constexpr auto properties = + sycl::ext::oneapi::property_list{sycl::ext::oneapi::work_group_size_v<8, 8>, + sycl::ext::oneapi::sub_group_size_v<8>}; + + sycl::accessor a; + sycl::accessor b; + sycl::accessor c; + +}; + +... + +q.parallel_for(range<2>{16, 16}, KernelFunctor(a, b, c)).wait(); +``` + +If a kernel functor with a `property_list` member variable is enqueued for +execution using an invocation function with a `property_list` argument, +the kernel is launched as-if the member variable and argument were combined. If +the combined list contains any invalid combinations of properties, then this is +an error: invalid combinations that can be detected at compile-time should be +reported via a diagnostic; invalid combinations that can only be detected at +run-time should result in an implementation throwing an `exception` with the +`errc::invalid` error code. + +== Querying Properties in a Compiled Kernel + +Any properties encoded into a kernel type via a property list are reflected +in the results of a call to `kernel::get_info` with the +`info::kernel::attributes` information descriptor, as if the corresponding +attribute from the SYCL 2020 specification had been applied to the kernel +function. + +== Device Functions + +The SYCL 2020 `sycl::device_has` attribute can be applied to the declaration +of a non-kernel device function, to assert that the device function uses a +specific set of optional features. This extension provides a mechanism exposing +similar behavior, allowing for kernel properties to be associated with +a function via the `SYCL_EXT_ONEAPI_PROPERTY` macro. Each instance of the +`SYCL_EXT_ONEAPI_PROPERTY` macro accepts one argument, corresponding to a +single property value. + +NOTE: Due to limitations of the C preprocessor, property value expressions +containing commas (e.g. due to template arguments) must be enclosed in +parentheses to avoid being interpreted as multiple arguments. + +The example below shows a function that uses two optional features, +corresponding to the `fp16` and `atomic64` aspects. + +```c++ +SYCL_EXT_ONEAPI_PROPERTY((sycl::device_has_v)) +void foo(); +``` + +The table below describes the effects of associating each kernel property +with a non-kernel device function via the `SYCL_EXT_ONEAPI_PROPERTY` macro. + +|=== +|Property|Description + +|`device_has` +|The `device_has` property asserts that the device function uses optional + features corresponding to the aspects listed in the `Aspects` parameter pack. + The effects of this property are identical to those described for the + `device_has` attribute in Table 181 of the SYCL 2020 specification. + +|=== + +The `SYCL_EXT_ONEAPI_PROPERTY` macro can be used alongside the +`SYCL_EXTERNAL` macro, and the macros may be specified in any order. +Whenever `SYCL_EXTERNAL` is used, there are two relevant translation units: the +translation unit that _defines_ the function and the translation unit that +_calls_ the function. If a given `SYCL_EXTERNAL` function _F_ is defined in +one translation unit with a set of properties _P_, then all other translation +units that declare that same function _F_ must list the same set of properties +_P_ via the `SYCL_EXT_ONEAPI_PROPERTY` macro. Programs which fail to do this +are ill-formed, but no diagnostic is required. + +== Issues + +. How should we handle kernels supporting more than one set of device aspects? ++ +-- +*UNRESOLVED*: A compiler can evaluate complex Boolean expressions in an +attribute, but this is non-trivial to emulate using only the {cpp} type system. +A simple alternative may be to introduce an additional level of indirection via +new properties, for example `device_has_all_of` and `device_has_any_of`: +`device_has_all_of, +device_has_any_of>`. +-- + +. How should the `property_list` member variable behave with inheritance? ++ +-- +*UNRESOLVED*: The specification currently allows for a class to inspect the +`property_list` member variable from its base class(es) and construct a new +`property_list` member variable that applies to all call operators. Associating +different properties with different call operators via inheritance has the +potential to be confusing and would increase implementation complexity. +-- + +//. asd +//+ +//-- +//*RESOLUTION*: Not resolved. +//-- + +== Revision History + +[cols="5,15,15,70"] +[grid="rows"] +[options="header"] +|======================================== +|Rev|Date|Author|Changes +|1|2021-08-06|John Pennycook|*Initial public working draft* +|======================================== diff --git a/sycl/doc/extensions/README.md b/sycl/doc/extensions/README.md index 4697aeb8bb224..624de3a98d21a 100755 --- a/sycl/doc/extensions/README.md +++ b/sycl/doc/extensions/README.md @@ -47,6 +47,7 @@ DPC++ extensions status: | [SYCL_EXT_ONEAPI_DEVICE_GLOBAL](DeviceGlobal/SYCL_INTEL_device_global.asciidoc) | Proposal | | | [SYCL_INTEL_bf16_conversion](Bf16Conversion/SYCL_INTEL_bf16_conversion.asciidoc) | Partially supported (Level Zero: GPU) | Currently available only on Xe HP GPU. ext_intel_bf16_conversion aspect is not supported. | | [Property List](PropertyList/SYCL_EXT_ONEAPI_property_list.asciidoc) | Proposal | | +| [KernelProperties](KernelProperties/KernelProperties.asciidoc) | Proposal | | Legend: