
Commit e16506d

Update on "Improve QAT nvfp4 numerics"
**Summary:** Similar to #2986, this commit improves the prepare vs convert SQNR of NVFP4 QAT from 12 to 36 with `use_per_tensor_scale`, and from 12 to inf without it. This is achieved by mimicking the PTQ flow more closely, in particular (in descending order of significance):

1. Simulate `f4_unpacked_to_f32` and `f32_to_f4_unpacked`, but in `torch.int32` instead of `torch.uint8`
2. Do not cast intermediate fake quantized values to the original dtype (e.g. bf16), which loses some fidelity relative to fp32
3. Fake round blockwise scales to float8

**Test Plan:**
```
python test/quantization/test_qat.py -k test_qat_nvfp4
python test/quantization/test_qat.py -k test_quantize_api_nvfp4
```

End-to-end tests TBD.

[ghstack-poisoned]
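For reference, here is a minimal, self-contained sketch of the fake-quantization behavior the three changes above aim for. It is not the torchao implementation: the real code emulates the `f32_to_f4_unpacked` / `f4_unpacked_to_f32` round trip at the bit level (in `torch.int32`), whereas this sketch approximates the fp4 round trip with a nearest-value lookup over the E2M1 grid. The function name `fake_quantize_nvfp4`, the `block_size` default, and the assumption that the input's element count is divisible by the block size are all illustrative.

```python
import torch

# The eight non-negative magnitudes representable in E2M1 (fp4).
_F4_E2M1_VALUES = torch.tensor(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=torch.float32
)


def fake_quantize_nvfp4(x: torch.Tensor, block_size: int = 16) -> torch.Tensor:
    """Illustrative sketch only; assumes x.numel() is divisible by block_size."""
    orig_dtype = x.dtype
    # (2) Keep all intermediate math in fp32; cast back only at the very end.
    x_fp32 = x.to(torch.float32).reshape(-1, block_size)

    # Blockwise scale so the max magnitude in each block maps to 6.0 (fp4 max).
    scale = x_fp32.abs().amax(dim=-1, keepdim=True) / 6.0
    # (3) Fake-round the blockwise scale to float8 e4m3 and back to fp32.
    scale = scale.to(torch.float8_e4m3fn).to(torch.float32)
    scale = torch.where(scale == 0, torch.ones_like(scale), scale)

    # (1) Emulate the f32 -> f4 -> f32 round trip by snapping each scaled value
    # to the nearest representable E2M1 magnitude (sign handled separately).
    scaled = torch.clamp(x_fp32 / scale, -6.0, 6.0)
    idx = torch.argmin((scaled.abs().unsqueeze(-1) - _F4_E2M1_VALUES).abs(), dim=-1)
    fq = torch.sign(scaled) * _F4_E2M1_VALUES[idx] * scale

    return fq.reshape(x.shape).to(orig_dtype)
```

The prepare vs convert SQNR cited above measures how closely this fake-quantize (prepare) path agrees with the real NVFP4 quantization applied at convert time.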
2 parents 8519147 + 6585a8c commit e16506d

File tree

1 file changed: +1 −0 lines changed


test/quantization/test_qat.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -2096,6 +2096,7 @@ def test_quantize_api_nvfp4(self, use_per_tensor_scale: bool):
             target_convert_sqnr=float("inf"),
         )
 
+    @unittest.skipIf(not is_sm_at_least_89(), "Need sm89+")
     @unittest.skipIf(not _CUDA_IS_AVAILABLE, "skipping when cuda is not available")
     @parametrize("use_per_tensor_scale", [True, False])
     def test_qat_nvfp4(self, use_per_tensor_scale: bool):
```
