Commit 6732127
make --device fast the default (#515)
* make --device fast the default
* Update iOS.md (#517)
* Update iOS.md
* Update iOS.md
* Pip to pip3 (#504)
* remove macos-12 test
* pip to pip3
* break aoti CI jobs separately (#500)
* init
* fixes
* more fixes
* fixes
* fix
* fix
* bug fix
* add objcopy update
* suppress int8
* undefined variable
---------
Co-authored-by: Michael Gschwind <[email protected]>
* Support llama3 in chat in run.cpp (#486)
* refactor chat runner in preparation for llama3
* add sketch for llama3 prompt template and move to returning tokens
* fix tiktoken
* fixes to chat
* add default llama_ver
* Add tests for quantize json, add cuda device specification and precision to cuda.json (#519)
* remove code for no KV Cache path (#527)
* Update ADVANCED-USERS.md (#529)
Update Advanced Users description to reflect changes in the repo since the description was initially created.
* runner-aoti on cuda (#531)
* runner-aoti on cuda
* transfer results back to CPU
* transfer results back to CPU
* runner-aoti on cuda
* Update runner_build.md (#530)
Update description of runner and build process in runner_build.md
* clean up runner code a little (#532)
* clean up runner code a little
* update
* update
* pull out generate loop in chat
* updates
* edit docs
* typo
* move int8 linear class and function into qops.py (#534)
* add dtype tests for runner-aoti + runner-et (#539)
* add dtype tests for runner-aoti + runner-et
* typo
* Quantized embedding (#536)
* move int8 linear class and function into qops.py
* move Quantized Embedding to qops.py
* Move Linear int4 to qops (#537)
* move int8 linear class and function into qops.py
* move Quantized Embedding to qops.py
* move int4 linear to qops
* Revert "add dtype tests for runner-aoti + runner-et (#539)" (#548)
This reverts commit a7a24577a65be67ac9ae4dc05452f35d9c49e5d1.
* fix generate for llama3 (#538)
* fix generate for llama3
* switch more things to C
* remove C++ header
* add delegation visualization instructions (#551)
* Add dtype runner aoti (#552)
* add dtype tests for runner-aoti + runner-et
* typo
* add dtype test runner-aoti
* test sdpa with fp16 (#553)
* test sdpa with fp16
* kv cache fp32
* typo
* update (#560)
* Only support newest versions of lm-eval (#556)
Summary:
remove support for lm-eval 0.3 to reduce the options we have
Test Plan:
CI
Reviewers:
Subscribers:
Tasks:
Tags:
* split cpu eval CI by dtype (#554)
* split cpu eval CI by dtype
* fix
* differentiate names with checks
* keep one name the same as old
* fix
* Removing duplicate HF issue message from README (#559)
Co-authored-by: Michael Gschwind <[email protected]>
* doc updates (#567)
* Add VM-safe MPS check
---------
Co-authored-by: Anthony Shoumikhin <[email protected]>
Co-authored-by: metascroy <[email protected]>
Co-authored-by: Nikita Shulga <[email protected]>
Co-authored-by: lucylq <[email protected]>
Co-authored-by: Jerry Zhang <[email protected]>
Co-authored-by: Jack-Khuu <[email protected]>
1 parent 62d4041 commit 6732127
3 files changed, 18 additions and 4 deletions.

File 1:
@@ -156,13 +156,27 @@ 14 lines added at new lines 159-172; old line 165 replaced by new line 179
@@ -173,6 +187,6 @@ old line 176 replaced by new line 190
File 2:
@@ -12,7 +12,7 @@ old line 15 replaced by new line 15
File 3:
@@ -210,7 +210,7 @@ old line 213 replaced by new line 213