Commit ee41748
committed
[Executorch][llama] Enable quantized sdpa
Pull Request resolved: #9945
Enable leveraging quantized sdpa op when quantized kv cache is used. Instead of adding yet another arg, at the moment I have chosen to leverage quantize_kv_cache option.
ghstack-source-id: 276961554
Differential Revision: [D71833064](https://our.internmc.facebook.com/intern/diff/D71833064/)1 parent b34fb9c commit ee41748
File tree
12 files changed
+488
-51
lines changed- examples/models/llama
- source_transformation
- extension/llm/custom_ops
- third-party
12 files changed
+488
-51
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
274 | 274 | | |
275 | 275 | | |
276 | 276 | | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
63 | 63 | | |
64 | 64 | | |
65 | 65 | | |
| 66 | + | |
66 | 67 | | |
67 | 68 | | |
68 | 69 | | |
| |||
77 | 78 | | |
78 | 79 | | |
79 | 80 | | |
| 81 | + | |
80 | 82 | | |
81 | 83 | | |
82 | 84 | | |
| |||
1222 | 1224 | | |
1223 | 1225 | | |
1224 | 1226 | | |
| 1227 | + | |
1225 | 1228 | | |
1226 | 1229 | | |
1227 | 1230 | | |
1228 | 1231 | | |
1229 | 1232 | | |
| 1233 | + | |
| 1234 | + | |
1230 | 1235 | | |
1231 | 1236 | | |
1232 | 1237 | | |
| |||
Lines changed: 44 additions & 19 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
52 | 52 | | |
53 | 53 | | |
54 | 54 | | |
| 55 | + | |
| 56 | + | |
55 | 57 | | |
56 | 58 | | |
57 | 59 | | |
| |||
61 | 63 | | |
62 | 64 | | |
63 | 65 | | |
64 | | - | |
| 66 | + | |
65 | 67 | | |
66 | 68 | | |
67 | | - | |
| 69 | + | |
68 | 70 | | |
69 | 71 | | |
70 | 72 | | |
71 | | - | |
| 73 | + | |
72 | 74 | | |
73 | 75 | | |
74 | | - | |
| 76 | + | |
75 | 77 | | |
76 | 78 | | |
77 | 79 | | |
| |||
91 | 93 | | |
92 | 94 | | |
93 | 95 | | |
94 | | - | |
95 | | - | |
96 | | - | |
97 | | - | |
98 | | - | |
99 | | - | |
100 | | - | |
101 | | - | |
102 | | - | |
103 | | - | |
| 96 | + | |
104 | 97 | | |
105 | | - | |
106 | 98 | | |
107 | 99 | | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
108 | 105 | | |
109 | 106 | | |
110 | 107 | | |
| |||
125 | 122 | | |
126 | 123 | | |
127 | 124 | | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
128 | 128 | | |
129 | 129 | | |
130 | | - | |
131 | | - | |
| 130 | + | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
135 | 135 | | |
136 | 136 | | |
137 | 137 | | |
138 | 138 | | |
139 | | - | |
140 | | - | |
| 139 | + | |
| 140 | + | |
141 | 141 | | |
142 | 142 | | |
143 | 143 | | |
144 | 144 | | |
145 | 145 | | |
146 | 146 | | |
| 147 | + | |
| 148 | + | |
147 | 149 | | |
148 | 150 | | |
149 | 151 | | |
| |||
152 | 154 | | |
153 | 155 | | |
154 | 156 | | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
155 | 180 | | |
156 | 181 | | |
157 | 182 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
16 | | - | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
17 | 19 | | |
18 | 20 | | |
19 | 21 | | |
| |||
76 | 78 | | |
77 | 79 | | |
78 | 80 | | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
79 | 198 | | |
80 | 199 | | |
81 | 200 | | |
| |||
0 commit comments