Closed
Description
Motivation.
Currently, CI spends a lot of time on kernel unit tests, sometimes more than 3 hours, even though they only exercise kernels. We therefore aim to simplify some of these unit tests to make CI faster.
Proposed Change.
1. Reduce Test Cases
For example, we can reduce the number of shapes:
MNK_FACTORS = [
(1, 128, 128),
(1, 512, 512),
(1, 128, 7168),
(1, 1024, 7168),
(1, 4608, 128),
(1, 4608, 512),
(1, 4608, 7168),
(83, 128, 128),
(83, 512, 512),
(83, 1024, 7168),
(83, 4608, 512),
(83, 4608, 7168),
(128, 128, 128),
(128, 512, 512),
(128, 1024, 7168),
(128, 4608, 512),
(128, 4608, 7168),
(2048, 128, 128),
(2048, 1024, 7168),
(2048, 4608, 512),
(2048, 4608, 7168),
(8192, 128, 128),
(8192, 512, 512),
(8192, 128, 7168),
(8192, 1024, 7168),
(8192, 4608, 512),
(8192, 4608, 7168),
]
->
MNK_FACTORS = [
(1, 128, 128),
(1, 1024, 7168),
(1, 4608, 128),
(83, 512, 512),
(128, 128, 128),
(128, 1024, 7168),
(2048, 1024, 7168),
(2048, 4608, 512),
(8192, 512, 512),
(8192, 4608, 512),
(8192, 4608, 7168),
]
2. Refactor test type
wentao@gpu66:~/vllm-source/tests/kernels/moe$ ls
__init__.py test_modular_kernel_combinations.py
__pycache__ test_moe.py
modular_kernel_tools test_moe_align_block_size.py
parallel_utils.py test_moe_permute_unpermute.py
test_batched_moe.py test_mxfp4_moe.py
test_block_fp8.py test_nvfp4_moe.py
test_block_int8.py test_pplx_cutlass_moe.py
test_count_expert_num_tokens.py test_pplx_moe.py
test_cutlass_grouped_gemm.py test_rocm_aiter_topk.py
test_cutlass_moe.py test_silu_mul_fp8_quant_deep_gemm.py
test_deepep_deepgemm_moe.py test_triton_moe_ptpc_fp8.py
test_deepep_moe.py utils.py
test_deepgemm.py
Some communication kernel tests, such as the DeepEP and pplx ones, are time-consuming and should be moved to a separate test folder, to be triggered only when necessary.
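As a rough illustration of which files such a move would touch, the sketch below splits the test files from the listing above by filename. The keyword match on "deepep" and "pplx" and the helper name `split_tests` are assumptions made for this example, not an existing vLLM utility.

```python
# Test files copied from the `ls` output above (helpers and __init__.py omitted).
MOE_TEST_FILES = [
    "test_batched_moe.py",
    "test_block_fp8.py",
    "test_block_int8.py",
    "test_count_expert_num_tokens.py",
    "test_cutlass_grouped_gemm.py",
    "test_cutlass_moe.py",
    "test_deepep_deepgemm_moe.py",
    "test_deepep_moe.py",
    "test_deepgemm.py",
    "test_modular_kernel_combinations.py",
    "test_moe.py",
    "test_moe_align_block_size.py",
    "test_moe_permute_unpermute.py",
    "test_mxfp4_moe.py",
    "test_nvfp4_moe.py",
    "test_pplx_cutlass_moe.py",
    "test_pplx_moe.py",
    "test_rocm_aiter_topk.py",
    "test_silu_mul_fp8_quant_deep_gemm.py",
    "test_triton_moe_ptpc_fp8.py",
]

# Assumption: communication-kernel tests are identified by these substrings.
COMM_KEYWORDS = ("deepep", "pplx")


def split_tests(files, keywords=COMM_KEYWORDS):
    """Return (communication_tests, remaining_tests), preserving input order."""
    comm = [f for f in files if any(k in f for k in keywords)]
    rest = [f for f in files if f not in comm]
    return comm, rest


comm_tests, other_tests = split_tests(MOE_TEST_FILES)
print("candidates to move:", comm_tests)
```

How the moved tests get triggered (a separate CI step, a path filter, or a test marker) is left open here.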
3. Remove Test
Possibly remove some tests that add little value entirely.
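When trimming a shape list as in item 1, one possible sanity check is that every distinct M, N, and K value from the original list still appears at least once in the reduced list. The sketch below applies that check to the two `MNK_FACTORS` lists from item 1. The helper names `dim_values` and `missing_values` are hypothetical, and per-dimension coverage is only one possible criterion, not necessarily the one used to pick the reduced list.

```python
# Original and reduced shape lists, copied from item 1.
FULL = [
    (1, 128, 128), (1, 512, 512), (1, 128, 7168), (1, 1024, 7168),
    (1, 4608, 128), (1, 4608, 512), (1, 4608, 7168),
    (83, 128, 128), (83, 512, 512), (83, 1024, 7168),
    (83, 4608, 512), (83, 4608, 7168),
    (128, 128, 128), (128, 512, 512), (128, 1024, 7168),
    (128, 4608, 512), (128, 4608, 7168),
    (2048, 128, 128), (2048, 1024, 7168), (2048, 4608, 512),
    (2048, 4608, 7168),
    (8192, 128, 128), (8192, 512, 512), (8192, 128, 7168),
    (8192, 1024, 7168), (8192, 4608, 512), (8192, 4608, 7168),
]

REDUCED = [
    (1, 128, 128), (1, 1024, 7168), (1, 4608, 128),
    (83, 512, 512),
    (128, 128, 128), (128, 1024, 7168),
    (2048, 1024, 7168), (2048, 4608, 512),
    (8192, 512, 512), (8192, 4608, 512), (8192, 4608, 7168),
]


def dim_values(shapes, axis):
    """Distinct values along one axis (0 = M, 1 = N, 2 = K)."""
    return {shape[axis] for shape in shapes}


def missing_values(full, reduced):
    """Per-dimension values present in `full` but absent from `reduced`."""
    return {
        name: sorted(dim_values(full, axis) - dim_values(reduced, axis))
        for axis, name in enumerate("MNK")
    }


missing = missing_values(FULL, REDUCED)
print(f"{len(FULL)} -> {len(REDUCED)} shapes, missing values: {missing}")
```

For these two lists the check reports no missing values in any dimension, while the number of shapes drops from 27 to 11.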
Feedback Period.
No response
CC List.
Any Other Things.
No response