
wangdongxing4

Each block is responsible for its own set of groups, so the receive phase should process those groups with a block-stride loop, in which the threads of one block cooperatively step through that block's groups. The `for` loop increment should therefore be `blockDim.x`, not `gridDim.x * expertsPerBlock`.
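
To illustrate the indexing argument, here is a minimal host-side sketch in plain C++ rather than the kernel itself. It assumes each thread starts at its `threadIdx.x` and advances by a fixed stride over the groups owned by one block. The names `visitCounts`, `eachVisitedOnce`, and `groupsInBlock` are hypothetical and do not come from the repository.

```cpp
#include <vector>

// Host-side model of one block's receive loop (hypothetical names).
// Returns how many times each of the block's groups is visited when
// every thread starts at its own index and advances by `stride`.
// `stride` must be greater than zero, otherwise the inner loop never ends.
std::vector<int> visitCounts(unsigned groupsInBlock,
                             unsigned blockDimX,
                             unsigned stride) {
    std::vector<int> visits(groupsInBlock, 0);
    for (unsigned tid = 0; tid < blockDimX; ++tid) {  // stands in for threadIdx.x
        for (unsigned g = tid; g < groupsInBlock; g += stride) {
            ++visits[g];
        }
    }
    return visits;
}

// True when every group is processed exactly once.
bool eachVisitedOnce(const std::vector<int>& visits) {
    for (int v : visits) {
        if (v != 1) {
            return false;
        }
    }
    return true;
}
```

With 10 groups and 4 threads, a stride of 4 (standing in for `blockDim.x`) covers every group exactly once. A stride of 6 (for example `gridDim.x = 2` and `expertsPerBlock = 3`, values chosen only for illustration) leaves groups 4 and 5 unvisited. In this model, any stride other than the thread count either skips groups or visits some of them twice.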

@abcdabcd987 requested a review from nandor on July 28, 2025, 17:46