Commit 0ec5e20
[1/N] Refactored AutoQuantizeSearcher to _AutoQuantizeBaseSearcher & AutoQuantizeGradientSearcher; separated quant modules and score modules (NVIDIA#586)
## What does this PR do?
**Type of change:** Refactor; minor new feature
**Overview:**
1. Refactored `AutoQuantizeSearcher` into `_AutoQuantizeBaseSearcher` and
`AutoQuantizeGradientSearcher`. This prepares the architecture for additional
search methods.
2. Separated quantization modules from score modules, so that
auto-quantization can measure sensitivity at a parent layer (e.g., the MLP
output for MoE experts) rather than at individual ops.
3. Also see NVIDIA#592
and NVIDIA#588
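
The two changes above can be illustrated with a minimal, dependency-free sketch. Every name in it (`BaseSearcherSketch`, `GradientSearcherSketch`, `quant_to_score`, the toy scoring formula) is a hypothetical stand-in chosen for illustration, not the ModelOpt API or its actual scoring rule. It only shows the shape of the idea: a base class owns the grouping of quantization modules under the module whose output is scored, and a subclass supplies one scoring method.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections import defaultdict


class BaseSearcherSketch(ABC):
    """Hypothetical base searcher: owns grouping, defers scoring to subclasses."""

    def __init__(self, quant_to_score: dict[str, str]):
        # Maps each quantizable module name to the module whose output is scored.
        # Several quant modules (e.g. MoE experts) may share one score module.
        self.quant_to_score = quant_to_score

    def group_by_score_module(self) -> dict[str, list[str]]:
        """Collect the quant modules that are scored together."""
        groups: dict[str, list[str]] = defaultdict(list)
        for quant_name, score_name in self.quant_to_score.items():
            groups[score_name].append(quant_name)
        return dict(groups)

    @abstractmethod
    def score(self, score_module: str, fmt: str) -> float:
        """Return a sensitivity score for quantizing the group to `fmt`."""

    def estimate(self, formats: list[str]) -> dict[str, dict[str, float]]:
        """Score every group once per candidate format."""
        return {
            score_name: {fmt: self.score(score_name, fmt) for fmt in formats}
            for score_name in self.group_by_score_module()
        }


class GradientSearcherSketch(BaseSearcherSketch):
    """Hypothetical gradient-based scorer (toy formula, assumed for illustration)."""

    def __init__(self, quant_to_score, grads, output_deltas):
        super().__init__(quant_to_score)
        self.grads = grads                  # score module -> gradient values
        self.output_deltas = output_deltas  # (score module, fmt) -> output changes

    def score(self, score_module: str, fmt: str) -> float:
        # Toy proxy: sum of squared (gradient * output perturbation).
        deltas = self.output_deltas[(score_module, fmt)]
        return sum((g * d) ** 2 for g, d in zip(self.grads[score_module], deltas))


# Two MoE experts are scored at their parent MLP; q_proj is scored at itself.
searcher = GradientSearcherSketch(
    quant_to_score={
        "layer0.mlp.experts.0": "layer0.mlp",
        "layer0.mlp.experts.1": "layer0.mlp",
        "layer0.attn.q_proj": "layer0.attn.q_proj",
    },
    grads={"layer0.mlp": [1.0, 2.0], "layer0.attn.q_proj": [0.5]},
    output_deltas={
        ("layer0.mlp", "int4"): [0.1, 0.2],
        ("layer0.mlp", "fp8"): [0.01, 0.02],
        ("layer0.attn.q_proj", "int4"): [0.3],
        ("layer0.attn.q_proj", "fp8"): [0.03],
    },
)
groups = searcher.group_by_score_module()
scores = searcher.estimate(["int4", "fp8"])
```

In this sketch, another search method would be added as a second subclass that overrides only `score`, which is the kind of extension the base/gradient split is described as preparing for.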
## Testing
See the unit tests: `tests/unit/torch/quantization/test_autoquant.py` and
`tests/unit/torch/quantization/plugins/test_huggingface.py`.
## Before your PR is "*Ready for review*"
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: Yes
- **Did you add or update any necessary documentation?**: Yes
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Not Required
## Additional Information
## Summary by CodeRabbit
* **New Features**
  * Added support for score modules in quantization workflows.
  * Added optional naming for quantization recipes.
* **Bug Fixes**
  * Improved quantization grouping rules documentation with clearer
    configuration examples.
* **Refactor**
  * Renamed quantization module parameters for improved clarity.
  * Enhanced quantization search architecture for better scalability.
---------
Signed-off-by: realAsma <[email protected]>
Co-authored-by: Asma Kuriparambil Thekkumpate <[email protected]>
1 parent 3e725c3 commit 0ec5e20
File tree
15 files changed (+1170, -341 lines)
- examples
- llm_eval
- llm_ptq
- scripts
- modelopt/torch
- opt
- quantization
- plugins
- tests
- gpu/torch/export
- unit/torch/quantization
- plugins