Skip to content

[SYCL] Add has_known_identity/known_identity #2528

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 3, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions sycl/doc/extensions/Reduction/Reduction.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,31 @@ unspecified reduction(span<T, Extent> var, const T& identity, BinaryOperation co

The exact behavior of a reduction is specific to an implementation; the only interface exposed to the user is the set of functions above, which construct an unspecified `reduction` object encapsulating the reduction variable, an optional operator identity and the reduction operator. For user-defined binary operations, an implementation should issue a compile-time warning if an identity is not specified and this is known to negatively impact performance (e.g. as a result of the implementation choosing a different reduction algorithm). For standard binary operations (e.g. `std::plus`) on arithmetic types, the implementation must determine the correct identity automatically in order to avoid performance penalties.

If an implementation can identify the identity value for a given combination of accumulator type `AccumulatorT` and function object type `BinaryOperation`, the value is defined as a member of the `known_identity` trait class:
```c++
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we should telling that known_identity is a struct on the spec level. It may be implemented as the using to something. I would also try to align the API with integral_constant. I understand that this is not possible for 100% but traits with values usually have some run-time API as well. The comment is applicable to other traits

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the traits in C++ are defined as a struct, aren't they? Is there a good reason that known_identity should be different?

I'm not sure what you're imagining when aligning with integral_constant. In the general case, I don't think we can guarantee that the identity value is something that can be passed as a template parameter (e.g. it may be a floating-point type).

static constexpr AccumulatorT value;
};

// Available if C++17
template <typename BinaryOperation, typename AccumulatorT>
inline constexpr AccumulatorT known_identity_v = known_identity<BinaryOperation, AccumulatorT>::value;
```

Whether `known_identity<BinaryOperation, AccumulatorT>::value` exists can be tested using the `has_known_identity` trait class:

```c++
template <typename BinaryOperation, typename AccumulatorT>
struct has_known_identity {
static constexpr bool value;
};

// Available if C++17
template <typename BinaryOperation, typename AccumulatorT>
inline constexpr bool has_known_identity_v = has_known_identity<BinaryOperation, AccumulatorT>::value;
```

The dimensionality of the `accessor` passed to the `reduction` function specifies the dimensionality of the reduction variable: a 0-dimensional `accessor` represents a scalar reduction, and any other dimensionality represents an array reduction. Specifying an array reduction of size N is functionally equivalent to specifying N independent scalar reductions. The access mode of the accessor determines whether the reduction variable's original value is included in the reduction (i.e. for `access::mode::read_write` it is included, and for `access::mode::discard_write` it is not). Multiple reductions aliasing the same output results in undefined behavior.

`T` must be trivially copyable, permitting an implementation to (optionally) use atomic operations to implement the reduction. This restriction is aligned with `std::atomic<T>` and `std::atomic_ref<T>`.
Expand Down
136 changes: 101 additions & 35 deletions sycl/include/CL/sycl/ONEAPI/reduction.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,85 @@ using IsKnownIdentityOp =
IsMinimumIdentityOp<T, BinaryOperation>::value ||
IsMaximumIdentityOp<T, BinaryOperation>::value>;

template <typename BinaryOperation, typename AccumulatorT>
struct has_known_identity_impl
: std::integral_constant<
bool, IsKnownIdentityOp<AccumulatorT, BinaryOperation>::value> {};

template <typename BinaryOperation, typename AccumulatorT, typename = void>
struct known_identity_impl {};

/// Returns zero as identity for ADD, OR, XOR operations.
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity_impl<BinaryOperation, AccumulatorT,
typename std::enable_if<IsZeroIdentityOp<
AccumulatorT, BinaryOperation>::value>::type> {
static constexpr AccumulatorT value = 0;
};

template <typename BinaryOperation>
struct known_identity_impl<BinaryOperation, half,
typename std::enable_if<IsZeroIdentityOp<
half, BinaryOperation>::value>::type> {
static constexpr half value =
#ifdef __SYCL_DEVICE_ONLY__
0;
#else
cl::sycl::detail::host_half_impl::half(static_cast<uint16_t>(0));
#endif
};

/// Returns one as identify for MULTIPLY operations.
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity_impl<BinaryOperation, AccumulatorT,
typename std::enable_if<IsOneIdentityOp<
AccumulatorT, BinaryOperation>::value>::type> {
static constexpr AccumulatorT value = 1;
};

template <typename BinaryOperation>
struct known_identity_impl<BinaryOperation, half,
typename std::enable_if<IsOneIdentityOp<
half, BinaryOperation>::value>::type> {
static constexpr half value =
#ifdef __SYCL_DEVICE_ONLY__
1;
#else
cl::sycl::detail::host_half_impl::half(static_cast<uint16_t>(0x3C00));
#endif
};

/// Returns bit image consisting of all ones as identity for AND operations.
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity_impl<BinaryOperation, AccumulatorT,
typename std::enable_if<IsOnesIdentityOp<
AccumulatorT, BinaryOperation>::value>::type> {
static constexpr AccumulatorT value = ~static_cast<AccumulatorT>(0);
};

/// Returns maximal possible value as identity for MIN operations.
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity_impl<BinaryOperation, AccumulatorT,
typename std::enable_if<IsMinimumIdentityOp<
AccumulatorT, BinaryOperation>::value>::type> {
Comment on lines +206 to +207
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, definitely not a request to change.
This could be a little bit shorter (but same number of lines though):

Suggested change
typename std::enable_if<IsMinimumIdentityOp<
AccumulatorT, BinaryOperation>::value>::type> {
enable_if_t<IsMinimumIdentityOp<
AccumulatorT, BinaryOperation>::value>> {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this during development and couldn't get it to work. I chalked it up to me not understanding enough about how our implementation of enable_if_t differs from std::enable_if. Can you compile locally if you make this change? Maybe I'm doing something wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked our enable_if_t implementation. It is the same as in C++ standard. So, there should not be any problem. If you need help we may have a call or IM

static constexpr AccumulatorT value =
std::numeric_limits<AccumulatorT>::has_infinity
? std::numeric_limits<AccumulatorT>::infinity()
: (std::numeric_limits<AccumulatorT>::max)();
};

/// Returns minimal possible value as identity for MAX operations.
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity_impl<BinaryOperation, AccumulatorT,
typename std::enable_if<IsMaximumIdentityOp<
AccumulatorT, BinaryOperation>::value>::type> {
static constexpr AccumulatorT value =
std::numeric_limits<AccumulatorT>::has_infinity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like AccumulatorT is the the fundamental C++ type only (because it's used in std::numeric_limits).

Two questions basing on that:

  • What about custom types?
  • If only fundamental types should be supported, can we introduce the integral_constant semantics for known_identity API as I suggested for has_known_identity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a short-term fix. Our reduction implementation currently only detects identity values automatically for fundamental C++ types (and half), so that's all we've decided to cover with the traits for now.

Eventually we want developers to be able to declare the identity value for their own types and function object types. The current thinking is that for combinations of fundamental types and known function object types (e.g. std::plus<>) the traits will continue to work as implemented here, but for anything else they'll check the function object for something like an identity member.

There are still some open questions about exactly what this should look like for transparent functors and what additional checks are needed (e.g. if we can determine an identity for T1 and T1 is convertible to T2, is it safe to assume that the identity for T1 can be used?). These are important questions, but I don't think the first implementation of the traits should be blocked by them.

? static_cast<AccumulatorT>(
-std::numeric_limits<AccumulatorT>::infinity())
: std::numeric_limits<AccumulatorT>::lowest();
};

/// Class that is used to represent objects that are passed to user's lambda
/// functions and representing users' reduction variable.
/// The generic version of the class represents those reductions of those
Expand Down Expand Up @@ -193,43 +272,10 @@ class reducer<T, BinaryOperation,
MValue = BOp(MValue, Partial);
}

/// Returns zero as identity for ADD, OR, XOR operations.
template <typename _T = T, class _BinaryOperation = BinaryOperation>
static enable_if_t<IsZeroIdentityOp<_T, _BinaryOperation>::value, _T>
getIdentity() {
return 0;
}

/// Returns one as identify for MULTIPLY operations.
template <typename _T = T, class _BinaryOperation = BinaryOperation>
static enable_if_t<IsOneIdentityOp<_T, _BinaryOperation>::value, _T>
getIdentity() {
return 1;
}

/// Returns bit image consisting of all ones as identity for AND operations.
template <typename _T = T, class _BinaryOperation = BinaryOperation>
static enable_if_t<IsOnesIdentityOp<_T, _BinaryOperation>::value, _T>
static enable_if_t<has_known_identity_impl<_BinaryOperation, _T>::value, _T>
getIdentity() {
return ~static_cast<_T>(0);
}

/// Returns maximal possible value as identity for MIN operations.
template <typename _T = T, class _BinaryOperation = BinaryOperation>
static enable_if_t<IsMinimumIdentityOp<_T, _BinaryOperation>::value, _T>
getIdentity() {
return std::numeric_limits<_T>::has_infinity
? std::numeric_limits<_T>::infinity()
: (std::numeric_limits<_T>::max)();
}

/// Returns minimal possible value as identity for MAX operations.
template <typename _T = T, class _BinaryOperation = BinaryOperation>
static enable_if_t<IsMaximumIdentityOp<_T, _BinaryOperation>::value, _T>
getIdentity() {
return std::numeric_limits<_T>::has_infinity
? static_cast<_T>(-std::numeric_limits<_T>::infinity())
: std::numeric_limits<_T>::lowest();
return known_identity_impl<_BinaryOperation, _T>::value;
}

template <typename _T = T>
Expand Down Expand Up @@ -1062,6 +1108,26 @@ reduction(T *VarPtr, BinaryOperation) {
access::mode::read_write>(VarPtr);
}

template <typename BinaryOperation, typename AccumulatorT>
struct has_known_identity : detail::has_known_identity_impl<
typename std::decay<BinaryOperation>::type,
typename std::decay<AccumulatorT>::type> {};
#if __cplusplus >= 201703L
template <typename BinaryOperation, typename AccumulatorT>
inline constexpr bool has_known_identity_v =
has_known_identity<BinaryOperation, AccumulatorT>::value;
#endif

template <typename BinaryOperation, typename AccumulatorT>
struct known_identity
: detail::known_identity_impl<typename std::decay<BinaryOperation>::type,
typename std::decay<AccumulatorT>::type> {};
#if __cplusplus >= 201703L
template <typename BinaryOperation, typename AccumulatorT>
inline constexpr AccumulatorT known_identity_v =
known_identity<BinaryOperation, AccumulatorT>::value;
#endif

} // namespace ONEAPI
} // namespace sycl
} // __SYCL_INLINE_NAMESPACE(cl)
43 changes: 35 additions & 8 deletions sycl/include/CL/sycl/half_type.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,11 @@
#else
#define __SYCL_CONSTEXPR_ON_DEVICE
#endif
#if __cplusplus >= 201402L
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there the sycl-include-gnu11.cpp test which checks it compiles with C++11? I thought we only support compiling as >=C++14.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the RT library is C++14, but the headers are C++11.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bader , @romanovvlad - Can't we assume C++14 or newer in SYCL header files?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a simple "yes/no" answer, but I'm aware of a couple of factors restricting usage of C++ features:

  • SYCL specification: SYCL-1.2.1 relies on C++11 and SYCL-2020 relies on C++17 features. It seems to me that DPC++ is not going to support "strict SYCL-1.2.1" mode where C++14 (or later) features are not supported.
  • Runtime library ABI compatibility (see the note in our contribution guide). It's hard to say if using C++14 features in SYCL headers breaks runtime ABI, but it possible and should be prevented by regression tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have C++ version documented at:
https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#c-standard
but it doesn't answer whether C++11 mode will be functional or broken.

I think it would be good to know and catch the case when we really want to break C++11 compatibility and have some discussion on that. Maybe that's what sycl-include-gnu11.cpp test is doing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test only checks that the header can be included. But not that any template in the header can be instantiated in C++11 mode. The current version of this PR passes all tests . But trying to instantiate known_identity for half in C++11 will give a compiler error (not tested).

It shouldn't have any ABI issues to use any C++14 features in the header. The problematic release for ABI issues is C++11 (e.g. string). There weren't any such things in C++14.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++14 is required now. The doc changes have been merged 10 days ago: #3053

#define _CPP14_CONSTEXPR constexpr
#else
#define _CPP14_CONSTEXPR
#endif

__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
Expand All @@ -35,8 +40,8 @@ namespace host_half_impl {
class __SYCL_EXPORT half {
public:
half() = default;
half(const half &) = default;
half(half &&) = default;
constexpr half(const half &) = default;
constexpr half(half &&) = default;

half(const float &rhs);

Expand Down Expand Up @@ -74,11 +79,20 @@ class __SYCL_EXPORT half {
return ret;
}

// Operator neg
_CPP14_CONSTEXPR half &operator-() {
Buf ^= 0x8000;
return *this;
}

// Operator float
operator float() const;

template <typename Key> friend struct std::hash;

// Initialize underlying data
constexpr explicit half(uint16_t x) : Buf(x) {}

private:
uint16_t Buf;
};
Expand Down Expand Up @@ -136,8 +150,8 @@ class half;
class half {
public:
half() = default;
half(const half &) = default;
half(half &&) = default;
constexpr half(const half &) = default;
constexpr half(half &&) = default;

__SYCL_CONSTEXPR_ON_DEVICE half(const float &rhs) : Data(rhs) {}

Expand All @@ -146,8 +160,8 @@ class half {
#ifndef __SYCL_DEVICE_ONLY__
// Since StorageT and BIsRepresentationT are different on host, these two
// helpers are required for 'vec' class
half(const detail::host_half_impl::half &rhs) : Data(rhs) {};
operator detail::host_half_impl::half() const { return Data; }
constexpr half(const detail::host_half_impl::half &rhs) : Data(rhs){};
constexpr operator detail::host_half_impl::half() const { return Data; }
#endif // __SYCL_DEVICE_ONLY__

// Operator +=, -=, *=, /=
Expand Down Expand Up @@ -193,7 +207,14 @@ class half {
operator--();
return ret;
}

_CPP14_CONSTEXPR half &operator-() {
Data = -Data;
return *this;
}
_CPP14_CONSTEXPR half operator-() const {
half r = *this;
return -r;
}
// Operator float
operator float() const { return static_cast<float>(Data); }

Expand Down Expand Up @@ -280,8 +301,13 @@ template <> struct numeric_limits<cl::sycl::half> {
return 0.5f;
}

static __SYCL_CONSTEXPR_ON_DEVICE const cl::sycl::half infinity() noexcept {
static constexpr const cl::sycl::half infinity() noexcept {
#ifdef __SYCL_DEVICE_ONLY__
return __builtin_huge_valf();
#else
return cl::sycl::detail::host_half_impl::half(
static_cast<uint16_t>(0x7C00));
#endif
}

static __SYCL_CONSTEXPR_ON_DEVICE const cl::sycl::half quiet_NaN() noexcept {
Expand Down Expand Up @@ -313,3 +339,4 @@ inline std::istream &operator>>(std::istream &I, cl::sycl::half &rhs) {
}

#undef __SYCL_CONSTEXPR_ON_DEVICE
#undef _CPP14_CONSTEXPR
2 changes: 2 additions & 0 deletions sycl/test/regression/constexpr-fp16-numeric-limits.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@ int main() {
constexpr cl::sycl::half L5 =
std::numeric_limits<cl::sycl::half>::round_error();
constexpr cl::sycl::half L6 = std::numeric_limits<cl::sycl::half>::infinity();
constexpr cl::sycl::half L6n =
-std::numeric_limits<cl::sycl::half>::infinity();
constexpr cl::sycl::half L7 =
std::numeric_limits<cl::sycl::half>::quiet_NaN();
constexpr cl::sycl::half L8 =
Expand Down