diff --git a/README.md b/README.md index 7338c0e..6e57db0 100644 --- a/README.md +++ b/README.md @@ -54,3 +54,4 @@ Each proposal in the table below will be tagged with one of the following states | CP013 | [Supporting Heterogeneous & Distributed Computing Through Affinity](affinity/index.md) | ISO C++ SG1, SG14 | 15 November 2017 | 12 August 2018 | _Work in Progress_ | | CP014 | [Shared Virtual Memory](svm/index.md) | SYCL 2.2 | 22 January 2018 | 22 January 2018 | _Work in Progress_ | | CP015 | [Specialization Constant](spec-constant/index.md) | SYCL 1.2.1 extension / SYCL 2.2 | 24 April 2018 | 24 April 2018 | _Work in Progress_ | +| CP016 | [Sub Groups](spec-constant/index.md) | SYCL 1.2.1 extension | 14 September 2018 | 14 September 2018 | _Work in Progress_ | diff --git a/sub-groups/index.md b/sub-groups/index.md new file mode 100644 index 0000000..22da9da --- /dev/null +++ b/sub-groups/index.md @@ -0,0 +1,36 @@ +# Basic sub-group extension + +| Proposal ID | CP016 | +|-------------|--------| +| Name | Basic sub group extension | +| Date of Creation | 14 September 2018 | +| Target | SYCL 1.2.1 | +| Current Status | _Work In Progress_ | +| Reply-to | Ruyman Reyes | +| Original author | Ruyman Reyes | +| Contributors | Ruyman Reyes , Gordon Brown , Victor Lomuller | + +## Overview + +This vendor extension aims to define an interface to expose sub-group functionality, +as defined in the SYCL 2.2 provisional and the OpenCL 2.2 provisional, +in SYCL 1.2.1. + +The extension is only targeting OpenCL devices that expose +`cl_codeplay_basic_subgroups` vendor extension. + + +## References + +[1] SYCL 1.2.1 specification +https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf + +[2] SYCL 2.2 provisional specification (revision date 2016/02/15) +https://www.khronos.org/registry/SYCL/specs/sycl-2.2.pdf + +[3] OpenCL 2.2 API specification +https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_API.pdf + +[4] OpenCL C++ 1.0 specification +https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_Cxx.pdf + diff --git a/sub-groups/sycl-1.2.1/index.md b/sub-groups/sycl-1.2.1/index.md new file mode 100644 index 0000000..5c89922 --- /dev/null +++ b/sub-groups/sycl-1.2.1/index.md @@ -0,0 +1,157 @@ +# Basic Sub group support + +This proposal aims to define an interface for using OpenCL 2.2 sub groups in +SYCL the provisional SYCL 1.2.1 specification, relying on the underlying +OpenCL implementation supporting the extension `cl_codeplay_basic_subgroups`. + +The extension exposes to programmers the ability to identify sub-groups +on a work-group, count the number of sub-groups available and perform +a broadcast from one work-item on a sub-group to the rest. + +Details of the execution and memory model changes can be found in the +documentation for the Codeplay's OpenCL vendor extension `cl_codeplay_basic_subgroups` +once available. + +## Execution model + +When this vendor extension is available, the execution model of SYCL 1.2.1 +is extended to also include support for sub-groups of threads inside of a +work-group. +Overall, these sub-groups work following the description of the OpenCL 2.2 +sub-groups, with some restrictions: + +* The number of sub-groups available for each work-group is determined +at compile-time and remains the same during the execution of the SYCL application. +* The number of threads per sub-group is known at compile-time, and remains the +same during execution of the SYCL application. +* Only those functions defined in this proposal are available. +In particular, there is no sub-group pipe communication. + +## Memory model + +Sub-groups can access global and local memory, but, given there is no +memory-scope to the atomic or barriers operations in SYCL 1.2.1, there is no +possibility to specify an equivalent of sub-group memory scope. + +## Namespace `basic_sub_group` + +All new functionality is exposed under the `basic_sub_group` namespace +in the `codeplay` vendor extension namespace. +When the vendor extension `basic_sub_group` is available, the macro +`SYCL_CODEPLAY_BASIC_SUB_GROUP` is defined in the header. + +### Class `sub_group` + +The extension adds a new class template `sub_group` that identifies the +sub group range and the current sub group id. +It also for providing sub group barriers. + +```cpp +namespace cl { +namespace sycl { +namespace codeplay { + +template +class sub_group { + public: + + constexpr range get_sub_group_range() const; + + constexpr size_t get_sub_group_range(int dimension) const; + + constexpr size_t get_sub_group_linear_range() const; + + id get_sub_group_id() const; + + size_t get_sub_group_id(int dimension) const; + + size_t get_sub_group_linear_id() const; + + void barrier(access::fence_space accessSpace = access::fence_space::global_and_local) const; + + /* T is permitted to be int, unsigned int, long, unsigned long, + float, half, double */ + template + T broadcast(size_t subGroupId, T value); + + /* Predicate must be a callable type which returns bool */ + template + bool all_of(Predicate predicate) const; + + /* Predicate must be a callable type which returns bool */ + template + bool any_of(Predicate predicate) const; +}; + +} // namespace codeplay +} // namespace sycl +} // namespace cl +``` + +## Free functions + +```cpp +namespace cl { +namespace sycl { +namespace codeplay { + +template +T broadcast(sub_group subGroup, size_t subGroupId, T value); + +template +bool all_of(sub_group subGroup, Predicate predicate); + +template +bool any_of(sub_group subGroup, Predicate predicate); + +template +void barrier(sub_group subGroup, access::fence_space accessSpace + = access::fence_space::global_and_local) const; + +} // namespace codeplay +} // namespace sycl +} // namespace cl +``` + +## Extensions to the nd\_item class + +Extensions to the `nd_item` interface will be exposed via the a derived `nd_item` class template in the `codeplay` vendor extension namespace. + +New member function `get_sub_group` for identifying the current sub group and gaining access to sub group operations. + +```cpp +namespace cl { +namespace sycl { +namespace codeplay { + +template +class nd_item : public ::cl::sycl::nd_item { +public: + + sub_group get_sub_group() const; + +}; + +} // namespace codeplay +} // namespace sycl +} // namespace cl +``` + +## Example + +Below is trivial example showing how you would use `sub_group` to broadcast a value from one work-item within a sub-group to all other work-items in the sub-group. + +```cpp +using namespace cl::sycl; + +template +void my_subgroup_load(sub_group subG, global_ptr myArray) { + + float4 f; + if (subG.get_id() == 0) { + f.load(myArray); + } + barrier(subG, access::fence_space::global_and_local); + float4 res = broadcast(subG, 0, f); +} +```