@@ -48,7 +48,7 @@ products.

== Version

Revision: 5

== Introduction

@@ -103,7 +103,7 @@ If the device doesn't have the aspect, objects of `bfloat16` class must not be
used in the device code.

**NOTE**: The `ext_oneapi_bfloat16` aspect is not yet supported. The
`bfloat16` class is currently supported only on Xe HP GPUs and on Nvidia GPUs with compute capability 8.0 (SM80) or higher.

== New `bfloat16` class

@@ -316,9 +316,83 @@ int main (int argc, char *argv[]) {
}
----

== New bfloat16 math functions

Many applications require dedicated functions that take parameters of type `bfloat16`. This extension adds `bfloat16` support to the `fma`, `fmin`, `fmax` and `fabs` SYCL floating-point math functions. These functions can be used as element-wise operations on matrices, supplementing the `bfloat16` support in the sycl_ext_oneapi_matrix extension.

The descriptions of the `fma`, `fmin`, `fmax` and `fabs` SYCL floating-point math functions can be found in the SYCL 2020 specification: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_math_functions.

The following functions are only available when `T` is `bfloat16` or `sycl::marray<bfloat16, {N}>`, where `{N}` is any positive value of type `size_t`.

=== fma

```c++
namespace sycl::ext::oneapi::experimental {

template <typename T>
T fma(T a, T b, T c);
} // namespace sycl::ext::oneapi::experimental
```

==== Description

Returns the correctly rounded floating-point representation of the sum of `c` with the infinitely precise product of `a` and `b`.
Rounding of intermediate products shall not occur. The mantissa LSB rounds to the nearest even. Subnormal numbers are supported.
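
The round-to-nearest-even behaviour mentioned above can be illustrated on the host with a minimal sketch in standard C++. It is not the extension's implementation. It assumes only that a `bfloat16` value occupies the upper 16 bits of an IEEE 754 binary32 value, and the helper names `float_to_bf16_bits` and `bf16_bits_to_float` are hypothetical.

```c++
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the extension): round a binary32 value to
// a 16-bit bfloat16 bit pattern using round-to-nearest, ties-to-even.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  // NaN input: set a high mantissa bit so that dropping the low 16 bits
  // cannot turn the NaN into an infinity.
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  // Adding 0x7FFF plus the current LSB of the kept mantissa rounds to
  // nearest and resolves exact ties toward an even LSB.
  const std::uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7FFFu + lsb;
  return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a bfloat16 bit pattern back to binary32.
// This direction is exact because it only appends zero mantissa bits.
inline float bf16_bits_to_float(std::uint16_t h) {
  const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}
```

For example, `1.00390625f` (1 + 2^-8) lies exactly halfway between two `bfloat16` values and rounds down to the even pattern `0x3F80`, while `1.01171875f` (1 + 2^-7 + 2^-8) rounds up to the even pattern `0x3F82`.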

=== fmax

```c++
namespace sycl::ext::oneapi::experimental {
template <typename T>
T fmax(T x, T y);
} // namespace sycl::ext::oneapi::experimental
```

==== Description

Returns `y` if `x < y`, otherwise it returns `x`. If one argument is a NaN, `fmax()` returns the other argument. If both arguments are NaNs, `fmax()` returns a NaN.

=== fmin

```c++
namespace sycl::ext::oneapi::experimental {
template <typename T>
T fmin(T x, T y);
} // namespace sycl::ext::oneapi::experimental
```

==== Description

Returns `y` if `y < x`, otherwise it returns `x`. If one argument is a NaN, `fmin()` returns the other argument. If both arguments are NaNs, `fmin()` returns a NaN.
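
The comparison and NaN rules described for the maximum and minimum functions can be written out as a small host-side sketch in standard C++. The names `ref_isnan`, `ref_fmax`, `ref_fmin` and `nan_handling_matches_description` are hypothetical, and `float` stands in for `bfloat16` so that the sketch compiles without a SYCL implementation. It is meant only to make the stated semantics concrete, not to show how any backend implements them.

```c++
#include <limits>

// Hypothetical reference helpers, not the extension's implementation.
// A NaN is the only value that compares unequal to itself.
template <typename T>
constexpr bool ref_isnan(T v) { return v != v; }

// Maximum: a NaN argument yields the other argument; two NaNs yield a NaN.
template <typename T>
constexpr T ref_fmax(T x, T y) {
  if (ref_isnan(x)) return y;
  if (ref_isnan(y)) return x;
  return (x < y) ? y : x;
}

// Minimum: same NaN rules, with the comparison reversed.
template <typename T>
constexpr T ref_fmin(T x, T y) {
  if (ref_isnan(x)) return y;
  if (ref_isnan(y)) return x;
  return (y < x) ? y : x;
}

// Small self-check of the NaN rules, using float as a stand-in type.
inline bool nan_handling_matches_description() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  return ref_fmax(nan, 1.0f) == 1.0f && ref_fmax(1.0f, nan) == 1.0f &&
         ref_fmin(nan, 1.0f) == 1.0f && ref_fmin(1.0f, nan) == 1.0f &&
         ref_isnan(ref_fmax(nan, nan)) && ref_isnan(ref_fmin(nan, nan));
}
```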

=== fabs

```c++
namespace sycl::ext::oneapi::experimental {
template <typename T>
T fabs(T x);
} // namespace sycl::ext::oneapi::experimental
```

==== Description

Returns the absolute value of `x`.
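
At the bit level, the absolute value of a `bfloat16` amounts to clearing the sign bit. This follows from `bfloat16` sharing the sign and exponent layout of the upper 16 bits of IEEE 754 binary32. The sketch below shows this on a raw 16-bit pattern in standard C++. The helper name `bf16_abs_bits` is hypothetical, and the extension does not state how backends implement `fabs`.

```c++
#include <cstdint>

// Hypothetical helper (not part of the extension): absolute value computed
// on the raw 16-bit bfloat16 pattern by clearing the sign bit (bit 15).
constexpr std::uint16_t bf16_abs_bits(std::uint16_t bits) {
  return static_cast<std::uint16_t>(bits & 0x7FFFu);
}
```

For instance, the pattern `0xC000` (-2.0) maps to `0x4000` (2.0), and `0x8000` (-0.0) maps to `0x0000` (+0.0).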

== Issues

1. The CUDA backend does not have a use case that would necessitate support of the `vec` class in `bfloat16` math functions, and `marray` would always be preferred over `vec` if `vec` support were to be added in the CUDA backend. For portability reasons, support for the `vec` class can be easily added if other backends require it.

2. We should decide on a roadmap to extend support of `bfloat16` to other SYCL 2020 math functions.

== Revision History

@@ -333,4 +407,5 @@ None.
Apply code review suggestions
|3|2021-08-18|Alexey Sotkin |Remove `uint16_t` constructor
|4|2022-03-07|Aidan Belton and Jack Kirk |Switch from Intel vendor specific to oneapi
|5|2022-04-05|Jack Kirk |Add section for `bfloat16` math builtins
|========================================