@PotatoPeeler3000
Contributor

No description provided.

* This class handles the kernel timings.
* With options to compute the min/max, average, and variance of the dataset
*/
private static class CudaKernelTimings
Contributor

Rename to CUDAKernelTimings

{
private final CUfunc_st kernelFunction = new CUfunc_st();
private final List<Pointer> parameters = new ArrayList<>();
private final String name;
Contributor

Can you make this the first field above kernelFunction?

private final CUevent_st end = new CUevent_st();

private int error;
private boolean enableKernelTimings = false;
Contributor

Can you move this below retainParameters so that the "flags" of the class are together?
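For readers following the thread, the statistics named in the class comment (min/max, average, and variance) can be illustrated with a minimal sketch. This is not the code from this PR: `TimingStatistics` and its method names are hypothetical, the samples are assumed to be elapsed times in milliseconds, and the variance shown is the population variance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the class under review: collects elapsed kernel
// times (assumed to be in milliseconds) and summarizes them.
final class TimingStatistics
{
   private final List<Double> samples = new ArrayList<>();

   void add(double elapsedMilliseconds)
   {
      samples.add(elapsedMilliseconds);
   }

   // Each summary returns NaN when no samples have been recorded.
   double getMin()
   {
      return samples.stream().mapToDouble(Double::doubleValue).min().orElse(Double.NaN);
   }

   double getMax()
   {
      return samples.stream().mapToDouble(Double::doubleValue).max().orElse(Double.NaN);
   }

   double getAverage()
   {
      return samples.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
   }

   // Population variance: the mean of the squared deviations from the average.
   double getVariance()
   {
      double average = getAverage();
      return samples.stream().mapToDouble(sample -> (sample - average) * (sample - average)).average().orElse(Double.NaN);
   }
}
```

Storing every sample keeps the sketch short; a long-running application might prefer a running (streaming) computation so that memory use stays constant.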

Comment on lines 125 to 133
if (PRINT_TIMING_FOR_KERNELS)
{
   updateKernel.enableKernelTimings(true);
   registerKernel.enableKernelTimings(true);
   croppingKernel.enableKernelTimings(true);
   planOffsetKernel.enableKernelTimings(true);
   emptyRegisterKernel.enableKernelTimings(true);
}

Contributor

Just pass PRINT_TIMING_FOR_KERNELS into each method instead of wrapping the calls in an if block,
e.g. updateKernel.enableKernelTimings(PRINT_TIMING_FOR_KERNELS);
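The suggestion above can be sketched as follows. `KernelStub`, `isKernelTimingEnabled`, and `KernelTimingSetup` are hypothetical stand-ins, not types from this PR; only the `enableKernelTimings(boolean)` setter and the `false` default are taken from the code under review.

```java
// Hypothetical stand-in for the kernel wrapper in this PR; only the flag is modeled.
final class KernelStub
{
   // Defaults to false, matching the field declaration shown in the review.
   private boolean enableKernelTimings = false;

   void enableKernelTimings(boolean enable)
   {
      enableKernelTimings = enable;
   }

   // Hypothetical accessor, added here only so the flag can be inspected.
   boolean isKernelTimingEnabled()
   {
      return enableKernelTimings;
   }
}

// Hypothetical setup helper showing the flag passed straight through.
final class KernelTimingSetup
{
   static void configure(boolean printTimingForKernels, KernelStub... kernels)
   {
      // No surrounding if block: each kernel simply receives the flag's value.
      for (KernelStub kernel : kernels)
      {
         kernel.enableKernelTimings(printTimingForKernels);
      }
   }
}
```

Passing the flag directly behaves the same as the guarded block as long as timings default to off and nothing else enables them beforehand, which is what the `false` initializer in this PR suggests.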


if (PRINT_TIMING_FOR_KERNELS)
{
   snappingKernel.enableKernelTimings(true);
Contributor

snappingKernel.enableKernelTimings(PRINT_TIMING_FOR_KERNELS);

@ds58 ds58 self-requested a review February 18, 2025 20:28
@PotatoPeeler3000 PotatoPeeler3000 merged commit 5174bae into develop Feb 18, 2025
64 of 66 checks passed
@PotatoPeeler3000 PotatoPeeler3000 deleted the feature/cuda-timings branch February 18, 2025 20:37

3 participants