Commit 52a0ca8
Improve QAT nvfp4 numerics
**Summary:** Similar to #2986,
this commit improves the prepare vs convert SQNR of NVFP4 QAT
from 12 to 36 with `use_per_tensor_scale`, and from 12 to inf without it.
This is achieved by mimicking the PTQ flow more closely.
The changes, in descending order of significance:
1. Simulate `f4_unpacked_to_f32` and `f32_to_f4_unpacked`,
but in `torch.int32` instead of `torch.uint8`
2. Do not cast intermediate fake quantized values back to the original
dtype (e.g. bf16), which loses some fidelity relative to fp32
3. Fake round blockwise scales to float8
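The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the torchao implementation: the helper names are hypothetical, the FP4 casts use a lookup over the E2M1 grid instead of the bit manipulation done by `f32_to_f4_unpacked` / `f4_unpacked_to_f32`, the float8 format is assumed to be E4M3 with saturation at 448, and the per-tensor scale is omitted.

```python
import math

# Magnitudes representable in FP4 E2M1, indexed by the 3-bit magnitude code.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
F4_MAX = 6.0
F8_E4M3_MAX = 448.0  # assumption: scales use float8 E4M3
BLOCK_SIZE = 16      # NVFP4 scales blocks of 16 values


def f32_to_f4_code(x):
    """Hypothetical stand-in for f32_to_f4_unpacked: float -> 4-bit code held in a plain int."""
    a = min(abs(x), F4_MAX)
    # Nearest grid point; ties go to the even code (round-to-nearest-even).
    mag = min(range(8), key=lambda i: (abs(E2M1_GRID[i] - a), i % 2))
    return (8 if x < 0 else 0) | mag


def f4_code_to_f32(code):
    """Hypothetical stand-in for f4_unpacked_to_f32: 4-bit code -> float."""
    value = E2M1_GRID[code & 7]
    return -value if code & 8 else value


def fake_round_e4m3(x):
    """Round x to the nearest float8 E4M3 value, but keep it as a Python float."""
    if x == 0.0:
        return 0.0
    a = min(abs(x), F8_E4M3_MAX)
    _, e = math.frexp(a)            # a = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, -6)            # clamp to the subnormal exponent
    step = 2.0 ** (exp - 3)         # 3 mantissa bits
    q = min(round(a / step) * step, F8_E4M3_MAX)
    return -q if x < 0 else q


def fake_quantize_block(block):
    # Step 3: the blockwise scale is fake rounded to float8.
    scale = fake_round_e4m3(max(abs(v) for v in block) / F4_MAX)
    if scale == 0.0:
        return [0.0] * len(block)
    # Steps 1 and 2: round-trip through the FP4 code, and keep the result
    # in full precision instead of casting back to a narrower dtype.
    return [f4_code_to_f32(f32_to_f4_code(v / scale)) * scale for v in block]


def fake_quantize(values):
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        out.extend(fake_quantize_block(values[start:start + BLOCK_SIZE]))
    return out
```

For example, `fake_quantize([0.0, 0.75, 1.5, -3.0])` uses a block scale of 0.5, so every input lands exactly on the FP4 grid and is returned unchanged.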
**Test Plan:**
```
python test/quantization/test_qat.py -k test_qat_nvfp4
python test/quantization/test_qat.py -k test_quantize_api_nvfp4
```
End-to-end tests TBD.
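For reference, the SQNR figures in the summary compare the output of the prepared (fake quantized) model against the converted (quantized) model. A minimal sketch of that metric, assuming the usual definition `20 * log10(||x|| / ||x - y||)` and using a hypothetical helper name `sqnr_db`:

```python
import math


def sqnr_db(reference, candidate):
    """Signal-to-quantization-noise ratio between two equal-length sequences."""
    signal = math.sqrt(sum(r * r for r in reference))
    noise = math.sqrt(sum((r - c) ** 2 for r, c in zip(reference, candidate)))
    if noise == 0.0:
        # Identical outputs: this is the "inf" case from the summary.
        return math.inf
    if signal == 0.0:
        return -math.inf
    return 20.0 * math.log10(signal / noise)
```

Under this definition, an SQNR of inf means the prepare and convert outputs match exactly, which is what the summary reports for the case without `use_per_tensor_scale`.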
ghstack-source-id: 47019f4
Pull Request resolved: #3050
1 parent e1d89e7 commit 52a0ca8
File tree
5 files changed, +49 -18 lines changed
- test/quantization
- torchao/prototype
  - mx_formats
  - qat