TgMath #588

devshgraphicsprogramming · 2023-10-30T09:40:48Z

Description

Testing

TODO list:

include/nbl/builtin/hlsl/bit.hlsl

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2023-10-30T09:56:49Z

include/nbl/builtin/hlsl/cmath.hlsl

+template<class T>
+T erf(T x)
+{
+    // Bürmann series approximation  
+    // https://www.desmos.com/calculator/myf9ylguh1
+    // https://en.wikipedia.org/wiki/Error_function#Numerical_approximations
+    T E = INTRINSIC_NAMESPACE(exp)(-x*x);
+    T P = T(0.886226925453);
+    T re = INTRINSIC_NAMESPACE(sqrt)(T(1)-E)*(T(1)+T(.155)*E/P*(T(1)-T(.275)*E));
+    return INTRINSIC_NAMESPACE(copysign)(re, x);
+}
+
+template<class T>
+T erfc(T x)
+{
+    return T(1) - erf(x);
+}
+
+
+template<class T>
+T tgamma(T x)
+{
+    // TODO:
+    // Investigate this approximation since margin of error seems to be high
+    // https://www.desmos.com/calculator/ivfbrvxha8
+    // https://en.wikipedia.org/wiki/Lanczos_approximation
+    T pi = T(3.14159265358979323846);
+    T sqrt2pi = T(2.50662827463);
+    T p[] = { 
+        0.99999999999980993,
+        676.5203681218851,
+        -1259.1392167224028,
+        771.32342877765313,
+        -176.61502916214059,
+        12.507343278686905,
+        -0.13857109526572012,
+        9.9843695780195716e-6,
+        1.5056327351493116e-7
+    };
+
+    T c = T(1);
+    if (x < T(0.5))
+    {
+        c = pi / (sin(pi * x));
+        x = T(1)-x;
+    }
+
+    T q = p[0];
+    for(uint32_t i = 1; i < sizeof(p)/sizeof(p[0]); ++i)
+    {
+        q += p[i] / (x + i - 1);
+    }
+
+    T t = x + T(6.5);
+    return c * sqrt2pi * INTRINSIC_NAMESPACE(pow)(t, (x-T(.5)))*INTRINSIC_NAMESPACE(exp)(-t)*q;
+}
+
+template<class T>
+T lgamma(T x)
+{
+    return INTRINSIC_NAMESPACE(log)(INTRINSIC_NAMESPACE(tgamma)(x));
+}


where did these implementations come from btw ?

https://github.com/Devsh-Graphics-Programming/Nabla/pull/588/files#r1554730780

include/nbl/builtin/hlsl/cpp_compat.hlsl

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2023-10-30T10:01:04Z

include/nbl/builtin/hlsl/cmath.hlsl

+#define NBL_ALIAS_FUNCTION_WITH_OUTPUT_PARAM(fn, out_type) \
+template<class T>  \
+T fn(T x, NBL_REF_ARG(out_type) y) { \
+    NBL_LANG_SELECT(out_type, T) out_; \
+    T ret = INTRINSIC_NAMESPACE(fn)(x, NBL_ADDRESS_OF(out_)); \
+    y = out_type(out_); \
+    return ret; \
+}


can you show me and example of what this is for?

IMHO this macro is too ugly for the amount of function aliases it provides.

I'd rather have frexp and modf written out by hand

Also NBL_ADDRESS_OF and LAND_SELECT seems to be super weird, the purpose seems to be to silently use out_type and perform no conversions under HLSL and T under C++

HLSL can do conversions during copy-in and copy-out, so one could just always use T throughout and rely on this mechanic whenever HLSL want to use a different type

Neither macro should exist really.

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2023-10-30T10:13:06Z

include/nbl/builtin/hlsl/cmath.hlsl

+#define NBL_ALIAS_BINARY_FUNCTION2(name,impl) template<class T> T name(T x, T y) { return INTRINSIC_NAMESPACE(impl)(x, y); }
+#define NBL_ALIAS_BINARY_FUNCTION(fn) NBL_ALIAS_BINARY_FUNCTION2(fn,fn)
+#define NBL_ALIAS_UNARY_FUNCTION2(name,impl)  template<class T> T name(T x) { return INTRINSIC_NAMESPACE(impl)(x); }
+#define NBL_ALIAS_UNARY_FUNCTION(fn)  NBL_ALIAS_UNARY_FUNCTION2(fn,fn)


life is never that easy, since we're bridging the gap between HLSL and STL and having a unified namespace for lang/target agnostic code

C++

In HLSL all those functions must be able to work on vector, so C++ needs a per-component "forwarder" to do the function on each component of a vector when is_vector is true.

Also C++ has templated functions in tgmath and I'd prefer you to use that rather than rely on overload resolution to hit the right one.

HLSL

Check if the intrinsics don't have templated versions (cause apparently ByteAddressBuffer does and they're not in the documentation XD).

Something to think about

How will this all with complex

Does tgmath need to include complex ? Should we forward declare the specializations?

Remember that complex will be templated not only on float_t but fvec_t and e.g. complex<float32_t3> is not a builtin type so will need to be forwarded component by component even in HLSL.

After OpenEXR upgrade: Use IlmMath's versions of all these functions if it has them

HLSL doesn't have all these functions for half types (only GLSL on AMD used to).

For now its okay to not support any of this for HLSL and C++.

devshgraphicsprogramming · 2023-11-08T07:26:10Z

Can steal from https://www.boost.org/doc/libs/1_83_0/libs/core/doc/html/core/cmath.html

devshgraphicsprogramming · 2024-04-06T23:33:24Z

include/nbl/builtin/hlsl/cmath.hlsl

 // Copyright (C) 2022 - DevSH Graphics Programming Sp. z O.O.
 // This file is part of the "Nabla Engine".
 // For conditions of distribution and use, see copyright notice in nabla.h
-#ifndef _NBL_BUILTIN_HLSL_CMATH_INCLUDED_
-#define _NBL_BUILTIN_HLSL_CMATH_INCLUDED_
+#ifndef _NBL_BUILTIN_HLSL_TGMATH_INCLUDED_
+#define _NBL_BUILTIN_HLSL_TGMATH_INCLUDED_


need to rename the file as well

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2024-04-06T23:35:25Z

include/nbl/builtin/hlsl/cmath.hlsl

    const int32_t shift = (impl::num_base<T>::float_digits-1);
    const uint_type mask = ~(((uint_type(1) << shift) - 1) | (uint_type(1)<<(sizeof(T)*8-1)));
-    int32_t bits = (bit_cast<uint_type, T>(x) & mask) >> shift;
+    int32_t bits = (std::bit_cast<uint_type, T>(x) & mask) >> shift; //nbl::hlsl::bit_cast sucks as it pulls the ambiguous one from intrinsics.h


explain

Update: I think this is what INTRINSIC_NAMESPACE was for,, to choose between std and not

I want all this to be tested and compile as both C++ and HLSL, you obviously haven't tested

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2024-04-06T23:49:36Z

include/nbl/builtin/hlsl/cmath.hlsl

+    #define NBL_ADDRESS_OF(x) &x
+    #define NBL_LANG_SELECT(x, y) x
+    #define INTRINSIC_NAMESPACE(x) std::x
+#else
+    #define INTRINSIC_NAMESPACE(x) x


INTRINSIC_NAMESPACE(blah) shouldn't be a macro, just a regular define INTRINSIC_NAMESPACE

also the HLSL version should actually fully write out the namespace, i.e. nbl::hlsl::glsl or nbl::hlsl::spirv

devshgraphicsprogramming · 2024-04-06T23:51:43Z

include/nbl/builtin/hlsl/cmath.hlsl

+// Trigonometric functions
+NBL_ALIAS_UNARY_FUNCTION(cos)
+NBL_ALIAS_UNARY_FUNCTION(sin)
+NBL_ALIAS_UNARY_FUNCTION(tan)
+NBL_ALIAS_UNARY_FUNCTION(acos)
+NBL_ALIAS_UNARY_FUNCTION(asin)
+NBL_ALIAS_UNARY_FUNCTION(atan)
+NBL_ALIAS_BINARY_FUNCTION(atan2)
+
+// Hyperbolic functions
+NBL_ALIAS_UNARY_FUNCTION(cosh)
+NBL_ALIAS_UNARY_FUNCTION(sinh)
+NBL_ALIAS_UNARY_FUNCTION(tanh)


comment out the stuff thats not currently available with out spirv & glsl headers

devshgraphicsprogramming · 2024-04-06T23:52:07Z

include/nbl/builtin/hlsl/cmath.hlsl

+// Exponential and logarithmic functions
+NBL_ALIAS_UNARY_FUNCTION(exp)
+NBL_ALIAS_FUNCTION_WITH_OUTPUT_PARAM(frexp, int32_t)
+NBL_ALIAS_BINARY_FUNCTION(ldexp)
+NBL_ALIAS_UNARY_FUNCTION(log)
+NBL_ALIAS_UNARY_FUNCTION(log10)
+NBL_ALIAS_FUNCTION_WITH_OUTPUT_PARAM(modf, T)
+NBL_ALIAS_UNARY_FUNCTION(exp2)
+NBL_ALIAS_UNARY_FUNCTION(log2)
+NBL_ALIAS_UNARY_FUNCTION2(logb,log)


https://github.com/Devsh-Graphics-Programming/Nabla/pull/588/files#r1554730455

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2024-04-06T23:52:50Z

include/nbl/builtin/hlsl/cmath.hlsl

+template<class T>
+T acosh(T x)
+{
+    return INTRINSIC_NAMESPACE(log)(x + INTRINSIC_NAMESPACE(sqrt)(x*x - T(1)));
+}
+
+template<class T>
+T asinh(T x)
+{
+    return INTRINSIC_NAMESPACE(log)(x + INTRINSIC_NAMESPACE(sqrt)(x*x + T(1)));
+}
+
+template<class T>
+T atanh(T x)
+{
+    return T(0.5) * INTRINSIC_NAMESPACE(log)((T(1)+x)/(T(1)-x));
+}


remove this and replace with a TODO comment to forward directly

devshgraphicsprogramming · 2024-04-06T23:53:39Z

include/nbl/builtin/hlsl/cmath.hlsl

+// Power functions
+NBL_ALIAS_BINARY_FUNCTION(pow)
+NBL_ALIAS_UNARY_FUNCTION(sqrt)
+NBL_ALIAS_UNARY_FUNCTION(cbrt)


cbrt surely isn't available in SPIR-V, comment out, TODO

include/nbl/builtin/hlsl/cmath.hlsl

devshgraphicsprogramming · 2024-04-06T23:55:19Z

include/nbl/builtin/hlsl/cmath.hlsl

+NBL_ALIAS_BINARY_FUNCTION(fmod)
+NBL_ALIAS_UNARY_FUNCTION(trunc)
+
+// TODO:
+// Below are rounding mode dependent investigate how we handle it
+NBL_ALIAS_UNARY_FUNCTION(round)
+NBL_ALIAS_UNARY_FUNCTION(rint)
+NBL_ALIAS_UNARY_FUNCTION2(nearbyint,round)
+
+template<class T>
+T remquo(T num, T denom, NBL_REF_ARG(int32_t) quot)
+{
+    quot = int32_t(INTRINSIC_NAMESPACE(round)(num / denom));
+    return num - quot * denom;
+}
+
+template<class T>
+T remainder(T num, T denom)
+{
+    int32_t q;
+    return remquo(num, denom, q);
+}
+
+// Other functions
+NBL_ALIAS_UNARY_FUNCTION(abs)
+NBL_ALIAS_UNARY_FUNCTION2(fabs, abs)
+template<class T>
+T fma(T a, T b, T c)
+{
+    return INTRINSIC_NAMESPACE(fma)(a,b,c);
+}
+
+// Minimum, maximum, difference functions
+
+NBL_ALIAS_BINARY_FUNCTION2(fmax, max)
+NBL_ALIAS_BINARY_FUNCTION2(fmin, min)
+template<class T>
+T fdim(T x, T y)
+{
+    return INTRINSIC_NAMESPACE(max)(T(0),x-y);
+}


comment out whatever doesn't work in HLSL

devshgraphicsprogramming · 2024-12-19T08:28:33Z

include/nbl/builtin/hlsl/tgmath.hlsl

+template<class T>
+T acosh(T x)
+{
+    return INTRINSIC_NAMESPACE(log)(x + INTRINSIC_NAMESPACE(sqrt)(x*x - T(1)));
+}
+
+// TODO numerically better impl of asinh
+//template<class T>
+//T asinh(T x)
+//{
+//    return INTRINSIC_NAMESPACE(log)(x + INTRINSIC_NAMESPACE(sqrt)(x*x + T(1)));
+//}
+
+template<class T>
+T atanh(T x)
+{
+    return T(0.5) * INTRINSIC_NAMESPACE(log)((T(1)+x)/(T(1)-x));
+}


@keptsecret don't grab these implementations, should have been done in terms of log1p

See https://git.musl-libc.org/cgit/musl/tree/src/math/acosh.c

devshgraphicsprogramming · 2024-12-19T08:29:19Z

include/nbl/builtin/hlsl/tgmath.hlsl

+template<class T> 
+T log1p(T x) 
+{ 
+    // https://www.boost.org/doc/libs/1_83_0/boost/math/special_functions/log1p.hpp
+    /*if (x < T(-1))
+         "log1p(x) requires x > -1, but got x = %1%.
+        */
+    if (x == T(-1))
+    {
+        using uint_type = typename nbl::hlsl::numeric_limits<T>::uint_type;
+        return  -nbl::hlsl::numeric_limits<T>::infinity;
+    }
+    T u = T(1) + x;
+    if (u == T(1))
+        return x;
+    else
+        return INTRINSIC_NAMESPACE(log)(u) * (x / (u - T(1)));
+}
+
+template<class T>
+int32_t ilogb(T x) 
+{ 
+    using uint_type = typename nbl::hlsl::numeric_limits<T>::uint_type;
+    const int32_t shift = (impl::num_base<T>::float_digits-1);
+    const uint_type mask = ~(((uint_type(1) << shift) - 1) | (uint_type(1)<<(sizeof(T)*8-1)));
+    int32_t bits = (INTRINSIC_NAMESPACE(bit_cast)<uint_type, T>(x) & mask) >> shift; 
+    return bits + impl::num_base<T>::min_exponent - 2;
+}


these need to get checked against musl and boost again

devshgraphicsprogramming · 2024-12-19T08:29:58Z

include/nbl/builtin/hlsl/tgmath.hlsl

+template<class T>
+T copysign(T x, T sign_)
+{
+    if ((x < 0 && sign_ > 0) || (x > 0 && sign_ < 0))
+        return -x;
+    return x;
+}
+
+// Generate quiet NaN (function)
+template<class T>
+T nan()
+{
+    using uint_type = typename nbl::hlsl::numeric_limits<T>::uint_type;
+    return INTRINSIC_NAMESPACE(bit_cast)<uint_type, T>(nbl::hlsl::numeric_limits<T>::quiet_NaN);
+}


@keptsecret @Przemog1's ieee stuff does it better

here: ieee754.hlsl

@devshgraphicsprogramming maybe @keptsecret should merge this pr?

probably yes

devshgraphicsprogramming · 2024-12-19T08:30:26Z

include/nbl/builtin/hlsl/tgmath.hlsl

+#define NBL_ALIAS_FUNCTION_WITH_OUTPUT_PARAM(fn, out_type) \
+template<class T>  \
+T fn(T x, NBL_REF_ARG(out_type) y) { \
+    NBL_LANG_SELECT(out_type, T) out_; \
+    T ret = INTRINSIC_NAMESPACE(fn)(x, NBL_ADDRESS_OF(out_)); \
+    y = out_type(out_); \
+    return ret; \
+}


this has backfired for us spectacularly, we need to actually write them out by hand

devshgraphicsprogramming · 2024-12-19T08:31:53Z

@keptsecret this isn't really salavageable, mayber log1p and friends but nothing else

atkurtul added 6 commits October 27, 2023 00:49

cmath wip

a564f79

correct error in erf approximation

94760e2

tgamma approximation

cd5b805

add static_assert message

c832440

add -Wno-gnu-static-float-init flag to dxc

b96ee7a

cmath almost done

b6ab46b

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/bit.hlsl Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/cmath.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/cmath.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/cmath.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/cpp_compat.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/cmath.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

include/nbl/builtin/hlsl/cmath.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming commented Oct 30, 2023

View reviewed changes

AnastaZIuk force-pushed the master branch 3 times, most recently from a07e79f to ffbd843 Compare January 18, 2024 21:24

Hazardu added 6 commits March 23, 2024 17:30

Merge branch 'master' into cmath

802b6b2

Update examples_tests

5bb0123

updated cpp compat and cmath

8ff68e4

resolved some comments

cd5091b

Update examples_tests

e98d0ef

typo fix

90387bd

devshgraphicsprogramming commented Apr 6, 2024

View reviewed changes