Commit 5537624
support eval of float8_a1x128_w128x128
Summary:
Adds support for the new float8 scaling recipe in the official eval
scripts used to generate accuracy numbers in the README.
For now, this serves as a smoke test that the scaling works on a real
model, and it does. We can add official benchmark results after we
hook up slayton's cuBLAS binding on H100, which should make the user
experience of running evals a lot better.
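As a rough illustration of what the recipe name suggests, the sketch below computes one scale per tile of a matrix. It assumes that `a1x128` means one scale per 1x128 activation tile and `w128x128` means one scale per 128x128 weight tile, and that scales are derived from the tile's absolute maximum divided by the float8 e4m3 maximum (448). The helper `block_scales` and the toy inputs are hypothetical and are not torchao code.

```python
# Illustrative only: per-tile scale computation in plain Python.
# Assumption: the scale of a tile is amax(tile) / 448, where 448 is the
# largest finite float8 e4m3fn value. This is not the torchao implementation.

F8_E4M3_MAX = 448.0


def block_scales(matrix, block_rows, block_cols):
    """Return a grid of scales, one per (block_rows x block_cols) tile."""
    rows, cols = len(matrix), len(matrix[0])
    assert rows % block_rows == 0 and cols % block_cols == 0
    scales = []
    for r0 in range(0, rows, block_rows):
        row_of_scales = []
        for c0 in range(0, cols, block_cols):
            amax = max(
                abs(matrix[r][c])
                for r in range(r0, r0 + block_rows)
                for c in range(c0, c0 + block_cols)
            )
            # Guard against all-zero tiles so the scale stays positive.
            row_of_scales.append(max(amax, 1e-12) / F8_E4M3_MAX)
        scales.append(row_of_scales)
    return scales


# Toy shapes: a 2x256 "activation" and a 256x128 "weight".
activation = [[(r + 1) * 0.01 * ((c % 7) - 3) for c in range(256)] for r in range(2)]
weight = [[0.001 * ((r * c) % 11 - 5) for c in range(128)] for r in range(256)]

act_scales = block_scales(activation, 1, 128)   # one scale per 1x128 tile
w_scales = block_scales(weight, 128, 128)       # one scale per 128x128 tile
print(len(act_scales), len(act_scales[0]))      # grid of activation scales
print(len(w_scales), len(w_scales[0]))          # grid of weight scales
```

Under these assumptions, the activation gets a finer-grained scale grid (one per row per 128 columns) than the weight, which is the trade-off the recipe name appears to encode.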
Test Plan:
Smoke test on Llama-3.1-8B; accuracy looks good (winogrande acc 0.7427 in bf16 vs. 0.7419 in float8):
```
# download checkpoint
with-proxy python scripts/download.py --hf_token {token} --repo_id meta-llama/Meta-Llama-3.1-8B
# prepare checkpoint
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/meta-llama/Meta-Llama-3.1-8B
# run bf16 eval on a single task
with-proxy time python torchao/_models/llama/eval.py --checkpoint_path checkpoints/meta-llama/Meta-Llama-3.1-8B/model.pth --tasks 'winogrande'
...
winogrande: {'alias': 'winogrande', 'acc,none': 0.7426992896606156, 'acc_stderr,none': 0.012285989618865697}
# run float8 eval on the same task
with-proxy time python torchao/_models/llama/eval.py --checkpoint_path checkpoints/meta-llama/Meta-Llama-3.1-8B/model.pth --tasks 'winogrande' --quantization float8_a1x128_w128x128 --compile
...
winogrande: {'alias': 'winogrande', 'acc,none': 0.7419100236779794, 'acc_stderr,none': 0.012298278833972477}
```
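To make "accuracy looks good" concrete, the two winogrande results above can be compared against their reported standard errors. The sketch below uses only the numbers from the eval output; treating the two runs as independent when combining the standard errors is a simplifying assumption (both runs score the same examples), so the ratio is a rough guide rather than a formal test.

```python
import math

# Values copied from the eval output above.
bf16_acc, bf16_se = 0.7426992896606156, 0.012285989618865697
fp8_acc, fp8_se = 0.7419100236779794, 0.012298278833972477

# Absolute accuracy drop from bf16 to float8.
delta = bf16_acc - fp8_acc

# Assumption: treat the runs as independent and add variances.
combined_se = math.sqrt(bf16_se**2 + fp8_se**2)

# How many combined standard errors the drop amounts to.
ratio = delta / combined_se

print(f"accuracy drop: {delta:.6f}")
print(f"combined stderr: {combined_se:.6f}")
print(f"drop / stderr: {ratio:.3f}")
```

The drop is below 0.001 in absolute terms and well under one standard error, which is consistent with the float8 recipe not measurably hurting accuracy on this task.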
Reviewers:
Subscribers:
Tasks:
Tags:
ghstack-source-id: 11e939c
ghstack-comment-id: 3474380821
Pull-Request: #32691
parent a994c24, commit 5537624
File tree
3 files changed, +25 -4 lines changed
- scripts
- torchao
  - _models/llama
  - quantization