Adds tests for the new Morton Code class #187
base: master
add_subdirectory(70_FLIPFluids)
add_subdirectory(71_RayTracingPipeline)
add_subdirectory(73_Mortons EXCLUDE_FROM_ALL)
Actually, can you make it example 14 or 15? The low numbers are where I keep the basic HLSL/C++ examples.
[numthreads(256, 1, 1)]
[shader("compute")]
void main(uint3 invocationID : SV_DispatchThreadID)
{
    if (invocationID.x == 0)
        fillTestValues(inputTestValues[0], outputTestValues[0]);
Just make a 1,1,1 workgroup and always call fillTestValues(inputTestValues[gl_GlobalInvocationID.x], outputTestValues[gl_GlobalInvocationID.x]).
// Disabled: current glm implementation is wrong
//verifyTestValue("subBorrowResult", expectedTestValues.subBorrow.result, testValues.subBorrow.result, testType);
//verifyTestValue("subBorrowBorrow", expectedTestValues.subBorrow.borrow, testValues.subBorrow.borrow, testType);
Then use an alternative implementation of subBorrow instead of GLM's.
Assigned to @Przemog1 in the next PR.
Actually, glm merged my PR and I pushed a fix to our branch, so it should be usable now.
Yeah, so it's just a matter of updating glm.
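For reference, a GLM-independent check of the subBorrow expectations could look roughly like the sketch below. It assumes 32-bit unsigned scalars and the GLSL `usubBorrow` contract (difference modulo 2^32, borrow set to 1 when the subtrahend is larger). `SubBorrowResult` and `subBorrowReference` are hypothetical names, not part of the PR; only the `result` and `borrow` field names come from the test code above.

```cpp
#include <cstdint>

// Hypothetical result type; the field names mirror subBorrow.result / subBorrow.borrow in the test
struct SubBorrowResult
{
    uint32_t result;
    uint32_t borrow;
};

// Reference implementation assuming the GLSL usubBorrow contract:
// result = (x - y) mod 2^32, borrow = 1 if y > x, otherwise 0
inline SubBorrowResult subBorrowReference(uint32_t x, uint32_t y)
{
    SubBorrowResult r;
    r.result = x - y;             // unsigned subtraction wraps modulo 2^32
    r.borrow = (x < y) ? 1u : 0u; // a borrow happens only when the subtraction underflows
    return r;
}
```

A vector overload would apply the same function per component.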
#include <nabla.h>
#include "app_resources/testCommon.hlsl"
#include "ITester.h"
Why are we reusing things from a different example?
If you want ITester to be general, please move it to an appropriate directory such as common/include/nbl/examples/testing.
Assigned to @Przemog1 in the next PR.
#ifndef _NBL_EXAMPLES_TESTS_22_CPP_COMPAT_I_TESTER_INCLUDED_
#define _NBL_EXAMPLES_TESTS_22_CPP_COMPAT_I_TESTER_INCLUDED_

#include <nabla.h>
#include "app_resources/common.hlsl"
#include "nbl/application_templates/MonoDeviceApplication.hpp"

using namespace nbl;

class ITester
{
Please unify this with ex22; I don't want this much duplicate code.
Assigned to @Przemog1 in the next PR.
template<typename InputStruct, typename OutputStruct>
OutputStruct dispatch(const InputStruct& input)
{
    // Update input buffer
    if (!m_inputBufferAllocation.memory->map({ 0ull,m_inputBufferAllocation.memory->getAllocationSize() }, video::IDeviceMemoryAllocation::EMCAF_READ))
        logFail("Failed to map the Device Memory!\n");

    const video::ILogicalDevice::MappedMemoryRange memoryRange(m_inputBufferAllocation.memory.get(), 0ull, m_inputBufferAllocation.memory->getAllocationSize());
    if (!m_inputBufferAllocation.memory->getMemoryPropertyFlags().hasFlags(video::IDeviceMemoryAllocation::EMPF_HOST_COHERENT_BIT))
        m_device->invalidateMappedMemoryRanges(1, &memoryRange);

    std::memcpy(static_cast<InputStruct*>(m_inputBufferAllocation.memory->getMappedPointer()), &input, sizeof(InputStruct));

    m_inputBufferAllocation.memory->unmap();

    // record command buffer
    m_cmdbuf->reset(video::IGPUCommandBuffer::RESET_FLAGS::NONE);
    m_cmdbuf->begin(video::IGPUCommandBuffer::USAGE::NONE);
    m_cmdbuf->beginDebugMarker("test", core::vector4df_SIMD(0, 1, 0, 1));
    m_cmdbuf->bindComputePipeline(m_pipeline.get());
    m_cmdbuf->bindDescriptorSets(nbl::asset::EPBP_COMPUTE, m_pplnLayout.get(), 0, 1, &m_ds.get());
    m_cmdbuf->dispatch(1, 1, 1);
    m_cmdbuf->endDebugMarker();
    m_cmdbuf->end();

    video::IQueue::SSubmitInfo submitInfos[1] = {};
    const video::IQueue::SSubmitInfo::SCommandBufferInfo cmdbufs[] = { {.cmdbuf = m_cmdbuf.get()} };
    submitInfos[0].commandBuffers = cmdbufs;
    const video::IQueue::SSubmitInfo::SSemaphoreInfo signals[] = { {.semaphore = m_semaphore.get(), .value = ++m_semaphoreCounter, .stageMask = asset::PIPELINE_STAGE_FLAGS::COMPUTE_SHADER_BIT} };
    submitInfos[0].signalSemaphores = signals;

    m_api->startCapture();
    m_queue->submit(submitInfos);
    m_api->endCapture();

    m_device->waitIdle();
    OutputStruct output;
    std::memcpy(&output, static_cast<OutputStruct*>(m_outputBufferAllocation.memory->getMappedPointer()), sizeof(OutputStruct));
    m_device->waitIdle();

    return output;
}
Why are we dispatching once per test? We could dispatch all tests in parallel (one invocation per test iteration)!
Assigned to @Przemog1 in the next PR.
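To illustrate the batching idea, here is a CPU-only sketch of "one invocation per test iteration". It is not the Nabla API: `Input`, `Output`, `runOneTest`, `workgroupCount` and `dispatchBatch` are hypothetical stand-ins, and the nested loops merely emulate what a single GPU dispatch would cover. The point is the indexing: all inputs go into one buffer, every invocation handles the element at its global index, and the last workgroup needs a bounds check when the test count is not a multiple of the workgroup size.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the example's input/output structs
struct Input  { uint32_t a; uint32_t b; };
struct Output { uint32_t sum; };

// Stand-in for the per-invocation shader body (the role fillTestValues plays on the GPU)
inline void runOneTest(const Input& in, Output& out) { out.sum = in.a + in.b; }

// Number of workgroups needed so that every test iteration gets its own invocation
inline uint32_t workgroupCount(uint32_t testCount, uint32_t workgroupSize)
{
    return (testCount + workgroupSize - 1u) / workgroupSize;
}

// CPU emulation of a single dispatch that covers the whole batch of tests
inline std::vector<Output> dispatchBatch(const std::vector<Input>& inputs, uint32_t workgroupSize)
{
    std::vector<Output> outputs(inputs.size());
    const uint32_t groups = workgroupCount(static_cast<uint32_t>(inputs.size()), workgroupSize);
    for (uint32_t group = 0; group < groups; ++group)
    {
        for (uint32_t local = 0; local < workgroupSize; ++local)
        {
            // Global invocation index, as a shader would derive it
            const std::size_t invocation = std::size_t(group) * workgroupSize + local;
            if (invocation >= inputs.size())
                continue; // partial last workgroup: nothing to do for this invocation
            runOneTest(inputs[invocation], outputs[invocation]);
        }
    }
    return outputs;
}
```

With a 1,1,1 workgroup, as suggested earlier in the review, the workgroup count equals the test count and the bounds check never triggers.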
/*
void fillSecondTestValues(NBL_CONST_REF_ARG(InputTestValues) input)
{
    uint64_t2 Vec2A = { input.coordX, input.coordY };
    uint64_t2 Vec2B = { input.coordZ, input.coordW };
    uint64_t3 Vec3A = { input.coordX, input.coordY, input.coordZ };
    uint64_t3 Vec3B = { input.coordY, input.coordZ, input.coordW };
    uint64_t4 Vec4A = { input.coordX, input.coordY, input.coordZ, input.coordW };
    uint64_t4 Vec4B = { input.coordY, input.coordZ, input.coordW, input.coordX };
    int64_t2 Vec2ASigned = int64_t2(Vec2A);
    int64_t2 Vec2BSigned = int64_t2(Vec2B);
    int64_t3 Vec3ASigned = int64_t3(Vec3A);
    int64_t3 Vec3BSigned = int64_t3(Vec3B);
    int64_t4 Vec4ASigned = int64_t4(Vec4A);
    int64_t4 Vec4BSigned = int64_t4(Vec4B);
    morton::code<false, fullBits_4, 4, emulated_uint64_t> morton_emulated_4A = morton::code<false, fullBits_4, 4, emulated_uint64_t>::create(Vec4A);
    morton::code<true, fullBits_2, 2, emulated_uint64_t> morton_emulated_2_signed = morton::code<true, fullBits_2, 2, emulated_uint64_t>::create(Vec2ASigned);
    morton::code<true, fullBits_3, 3, emulated_uint64_t> morton_emulated_3_signed = morton::code<true, fullBits_3, 3, emulated_uint64_t>::create(Vec3ASigned);
    morton::code<true, fullBits_4, 4, emulated_uint64_t> morton_emulated_4_signed = morton::code<true, fullBits_4, 4, emulated_uint64_t>::create(Vec4ASigned);
    output.mortonEqual_emulated_4 = uint32_t4(morton_emulated_4A.equal<false>(uint16_t4(Vec4B)));
    output.mortonUnsignedLess_emulated_4 = uint32_t4(morton_emulated_4A.lessThan<false>(uint16_t4(Vec4B)));
    mortonSignedLess_emulated_2 = uint32_t2(morton_emulated_2_signed.lessThan<false>(int32_t2(Vec2BSigned)));
    mortonSignedLess_emulated_3 = uint32_t3(morton_emulated_3_signed.lessThan<false>(int32_t3(Vec3BSigned)));
    mortonSignedLess_emulated_4 = uint32_t4(morton_emulated_4_signed.lessThan<false>(int16_t4(Vec4BSigned)));
    uint16_t castedShift = uint16_t(input.shift);
    arithmetic_right_shift_operator<morton::code<true, fullBits_2, 2, emulated_uint64_t> > rightShiftSignedEmulated2;
    mortonSignedRightShift_emulated_2 = rightShiftSignedEmulated2(morton_emulated_2_signed, castedShift);
    arithmetic_right_shift_operator<morton::code<true, fullBits_3, 3, emulated_uint64_t> > rightShiftSignedEmulated3;
    mortonSignedRightShift_emulated_3 = rightShiftSignedEmulated3(morton_emulated_3_signed, castedShift);
    arithmetic_right_shift_operator<morton::code<true, fullBits_4, 4, emulated_uint64_t> > rightShiftSignedEmulated4;
    mortonSignedRightShift_emulated_4 = rightShiftSignedEmulated4(morton_emulated_4_signed, castedShift);
}
*/
@Fletterio what's this commented-out block about?
I think there was some fucked up reason preventing me from running all tests in a single shader (likely some DXC bug), so I was probably in the middle of moving the commented-out code to a different shader.
OK, so the bug is no longer there and we can remove this commented-out block of code?
The bug is probably still there. I thought I had reported it, but it seems I didn't. For some reason, these tests make the shader fail to compile if you add them to testCommon.hlsl. I temporarily commented them out here; the idea was to have several different test shaders so that these could be tested separately. Specifically, they are the comparison-operator tests for a 4D Morton code with 16 bits per coordinate stored in an emulated uint64, and the arithmetic right shift operator tests for 2D, 3D and 4D Morton codes backed by an emulated uint64.
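For readers unfamiliar with the layout being discussed, the sketch below shows one way a 4D Morton code with 16 bits per coordinate can be packed into a plain 64-bit integer on the CPU, together with a per-coordinate less-than comparison. It is only an illustration, not the `morton::code` implementation: all function names are hypothetical, a native `uint64_t` stands in for the emulated one, and putting coordinate 0 in the lowest interleaved bit is a convention assumed here.

```cpp
#include <array>
#include <cstdint>

// Spread the 16 bits of v so that bit i ends up at bit 4*i
inline uint64_t spreadBits4(uint16_t v)
{
    uint64_t x = v;
    x = (x | (x << 24)) & 0x000000FF000000FFull;
    x = (x | (x << 12)) & 0x000F000F000F000Full;
    x = (x | (x << 6))  & 0x0303030303030303ull;
    x = (x | (x << 3))  & 0x1111111111111111ull;
    return x;
}

// Inverse of spreadBits4: gather every 4th bit back into a 16-bit value
inline uint16_t compactBits4(uint64_t x)
{
    x &= 0x1111111111111111ull;
    x = (x | (x >> 3))  & 0x0303030303030303ull;
    x = (x | (x >> 6))  & 0x000F000F000F000Full;
    x = (x | (x >> 12)) & 0x000000FF000000FFull;
    x = (x | (x >> 24)) & 0x000000000000FFFFull;
    return static_cast<uint16_t>(x);
}

// Interleave four 16-bit coordinates; coordinate i occupies bits i, i+4, i+8, ...
inline uint64_t mortonEncode4(const std::array<uint16_t, 4>& c)
{
    uint64_t code = 0;
    for (int i = 0; i < 4; ++i)
        code |= spreadBits4(c[i]) << i;
    return code;
}

inline std::array<uint16_t, 4> mortonDecode4(uint64_t code)
{
    std::array<uint16_t, 4> c{};
    for (int i = 0; i < 4; ++i)
        c[i] = compactBits4(code >> i);
    return c;
}

// Per-coordinate unsigned less-than without decoding: spreading bits preserves
// ordering, so the isolated lane can be compared against the spread right-hand side
inline std::array<bool, 4> mortonLessThan4(uint64_t code, const std::array<uint16_t, 4>& rhs)
{
    constexpr uint64_t kLaneMask = 0x1111111111111111ull;
    std::array<bool, 4> result{};
    for (int i = 0; i < 4; ++i)
        result[i] = ((code >> i) & kLaneMask) < spreadBits4(rhs[i]);
    return result;
}
```

A signed variant would additionally need to account for the sign bit of each coordinate before comparing; that is left out of this sketch.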
#ifndef __HLSL_VERSION

constexpr uint64_t smallBitsMask_2 = (uint64_t(1) << smallBits_2) - 1;
constexpr uint64_t mediumBitsMask_2 = (uint64_t(1) << mediumBits_2) - 1;
constexpr uint64_t fullBitsMask_2 = (uint64_t(1) << fullBits_2) - 1;

constexpr uint64_t smallBitsMask_3 = (uint64_t(1) << smallBits_3) - 1;
constexpr uint64_t mediumBitsMask_3 = (uint64_t(1) << mediumBits_3) - 1;
constexpr uint64_t fullBitsMask_3 = (uint64_t(1) << fullBits_3) - 1;

constexpr uint64_t smallBitsMask_4 = (uint64_t(1) << smallBits_4) - 1;
constexpr uint64_t mediumBitsMask_4 = (uint64_t(1) << mediumBits_4) - 1;
constexpr uint64_t fullBitsMask_4 = (uint64_t(1) << fullBits_4) - 1;

#endif
Are these variables used anywhere?
No clue what I had in mind here.
OK @kevyuu, delete them.
Checks that the arithmetic and comparisons work as expected, and checks that the code compiles for the GPU.
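As an illustration of what "arithmetic works as expected" can mean for Morton codes, the CPU-side sketch below adds two 2D codes directly in Morton space and compares the result against encoding the coordinate-wise sum. It is an assumption-laden example, not the class under test: the function names are hypothetical, the layout is 32 bits per coordinate in a native `uint64_t` with x in the even bits, and the masked-add trick shown is a well-known technique that may or may not match what `morton::code` does internally.

```cpp
#include <cstdint>

// Spread the 32 bits of v so that bit i ends up at bit 2*i
inline uint64_t spreadBits2(uint32_t v)
{
    uint64_t x = v;
    x = (x | (x << 16)) & 0x0000FFFF0000FFFFull;
    x = (x | (x << 8))  & 0x00FF00FF00FF00FFull;
    x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0Full;
    x = (x | (x << 2))  & 0x3333333333333333ull;
    x = (x | (x << 1))  & 0x5555555555555555ull;
    return x;
}

// x occupies the even bits, y the odd bits (a convention assumed for this sketch)
inline uint64_t mortonEncode2(uint32_t x, uint32_t y)
{
    return spreadBits2(x) | (spreadBits2(y) << 1);
}

// Component-wise addition without decoding: filling the other coordinate's bits
// with ones in the left operand lets carries ripple across the gaps
inline uint64_t mortonAdd2(uint64_t a, uint64_t b)
{
    constexpr uint64_t kXMask = 0x5555555555555555ull;
    constexpr uint64_t kYMask = 0xAAAAAAAAAAAAAAAAull;
    const uint64_t x = ((a | ~kXMask) + (b & kXMask)) & kXMask;
    const uint64_t y = ((a | ~kYMask) + (b & kYMask)) & kYMask;
    return x | y;
}
```

Each coordinate wraps modulo 2^32 independently, which is the behaviour a reference computed on plain `uint32_t` coordinates would also have.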