
PrefixSum and PostfixSum not working #28

@xaphier

Even in a very simple compute shader (DX11), AmdDxExtShaderIntrinsics_WavePrefixSum and AmdDxExtShaderIntrinsics_WavePostfixSum produce wrong results. I tried different hardware (RX 480, WX7100 and WX9100), and all of them gave completely wrong results. I also tried different driver versions, and I always check that the AGS_DX11_EXTENSION_INTRINSIC_WAVE_REDUCE extension is supported.
Below is a simple shader that either uses AmdDxExtShaderIntrinsics_WavePrefixSum (which produces bogus values) or, with the USE_WAVE_PREFIX_SUM define commented out, uses AmdDxExtShaderIntrinsics_SwizzleU and AmdDxExtShaderIntrinsics_ReadlaneU to build the prefix sum manually (which produces the correct values).

#include "ags_shader_intrinsics_dx11.hlsl"

RWBuffer<uint> dst : register(u0);

#define USE_WAVE_PREFIX_SUM

#define MAKE_MASK(XOR, OR, AND) (((XOR) << 10) | ((OR) << 5) | (AND))

[numthreads(8, 8, 1)]
void main(uint3 groupId : SV_GroupID,
          uint3 dispatchThreadId : SV_DispatchThreadID,
          uint3 groupThreadId : SV_GroupThreadID,
          uint groupIndex : SV_GroupIndex)
{
    // Offset of this 8x8 thread group in the output buffer.
    uint groupIdx = groupId.x * 8 * 8;
    // Per-thread input value: the flattened thread index within the group.
    uint v0 = groupIndex;

#ifndef USE_WAVE_PREFIX_SUM
    // Manual scan: each step fetches the running total of the lower half of a
    // span (2, 4, 8, 16, then 32 lanes) and adds it to the upper half.
    uint sum = v0;
    uint value = 0;
    value = AmdDxExtShaderIntrinsics_SwizzleU(sum, AmdDxExtShaderIntrinsicsSwizzle_SwapX1);
    sum += (groupIndex & 1) == 0 ? 0 : value;
    value = AmdDxExtShaderIntrinsics_SwizzleU(sum, MAKE_MASK(0x00, 0x01, 0x1C));
    sum += (groupIndex & 2) == 0 ? 0 : value;
    value = AmdDxExtShaderIntrinsics_SwizzleU(sum, MAKE_MASK(0x00, 0x03, 0x18));
    sum += (groupIndex & 4) == 0 ? 0 : value;
    value = AmdDxExtShaderIntrinsics_SwizzleU(sum, MAKE_MASK(0x00, 0x07, 0x10));
    sum += (groupIndex & 8) == 0 ? 0 : value;
    value = AmdDxExtShaderIntrinsics_SwizzleU(sum, MAKE_MASK(0x00, 0x0F, 0x00));
    sum += (groupIndex & 16) == 0 ? 0 : value;
    // Swizzles stay within 32 lanes, so lane 31's total is carried into
    // lanes 32..63 with an explicit lane read.
    value = AmdDxExtShaderIntrinsics_ReadlaneU(sum, 31);
    sum += (groupIndex & 32) == 0 ? 0 : value;
#else
    // Intrinsic path: this is the call that returns the wrong values.
    uint sum = AmdDxExtShaderIntrinsics_WavePrefixSum(v0);
#endif

    dst[groupIdx + groupIndex] = sum;
}
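
To make "correct values" concrete, here is a rough CPU-side model of the manual path, written in Python so it can be run without a GPU. It is only a sketch under several assumptions that are not stated in the shader itself: one 8x8 thread group maps to exactly one 64-lane wave, the lane ID equals groupIndex, and the swizzle mask is decoded as in the GCN bit-mask mode (source lane = ((lane & AND) | OR) ^ XOR within each group of 32 lanes). The helper names (swizzle, readlane, manual_scan, inclusive_scan, exclusive_scan) are made up for this illustration and are not part of AGS.

```python
WAVE_SIZE = 64  # assumption: one 8x8 thread group == one 64-lane wave


def make_mask(xor, or_, and_):
    """Same bit layout as the shader's MAKE_MASK macro."""
    return (xor << 10) | (or_ << 5) | and_


# Assumed value of AmdDxExtShaderIntrinsicsSwizzle_SwapX1 (swap neighbouring lanes).
SWAP_X1 = make_mask(0x01, 0x00, 0x1F)


def swizzle(values, mask):
    """Model of a bit-mask swizzle: every lane reads from a lane in its own 32-lane half."""
    and_m = mask & 0x1F
    or_m = (mask >> 5) & 0x1F
    xor_m = (mask >> 10) & 0x1F
    out = []
    for lane in range(len(values)):
        base = lane & ~31  # start of this lane's 32-lane half
        src = (((lane & 31) & and_m) | or_m) ^ xor_m
        out.append(values[base + src])
    return out


def readlane(values, lane):
    """Model of a lane read: every lane receives the value held by one fixed lane."""
    return [values[lane]] * len(values)


def manual_scan(v):
    """Mirror of the #ifndef USE_WAVE_PREFIX_SUM branch of the shader."""
    steps = [
        (1, lambda s: swizzle(s, SWAP_X1)),
        (2, lambda s: swizzle(s, make_mask(0x00, 0x01, 0x1C))),
        (4, lambda s: swizzle(s, make_mask(0x00, 0x03, 0x18))),
        (8, lambda s: swizzle(s, make_mask(0x00, 0x07, 0x10))),
        (16, lambda s: swizzle(s, make_mask(0x00, 0x0F, 0x00))),
        (32, lambda s: readlane(s, 31)),
    ]
    total = list(v)
    for bit, fetch in steps:
        fetched = fetch(total)
        # Only lanes with this bit set in their lane ID add the fetched value.
        total = [t + (f if lane & bit else 0)
                 for lane, (t, f) in enumerate(zip(total, fetched))]
    return total


def inclusive_scan(v):
    """out[i] = v[0] + ... + v[i]"""
    out, running = [], 0
    for x in v:
        running += x
        out.append(running)
    return out


def exclusive_scan(v):
    """out[i] = v[0] + ... + v[i-1], with out[0] = 0"""
    out, running = [], 0
    for x in v:
        out.append(running)
        running += x
    return out


if __name__ == "__main__":
    v0 = list(range(WAVE_SIZE))  # the shader feeds groupIndex into the scan
    print(manual_scan(v0) == inclusive_scan(v0))
    print(inclusive_scan(v0)[:8])
    print(exclusive_scan(v0)[:8])
```

Under these assumptions the manual path behaves like an inclusive scan, so with v0 = groupIndex lane n should hold n(n+1)/2. An exclusive scan would give n(n-1)/2 instead. If WavePrefixSum is defined as an exclusive scan and WavePostfixSum as an inclusive one (my reading of the AGS header, worth double-checking), the intrinsic and manual paths would differ by exactly v0 even on a working driver. That offset alone would not account for completely bogus values, but it helps when comparing buffer readbacks against the expected numbers.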
