Speculative decoding is described at https://arxiv.org/pdf/2302.01318
Replies: 1 comment
We had this implemented at some point (see #5625), but I think I decided it was not worth the complication (#10362). Greedy sampling is good enough and the logic is much simpler.
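
For readers unfamiliar with the distinction, here is a minimal sketch of what greedy verification of drafted tokens could look like. It is not the project's code: the function name `greedy_verify` and the `target_argmax` input are hypothetical, and the sketch assumes the target model has already been evaluated on the prefix plus all drafted tokens. The stochastic scheme in the linked paper instead accepts each drafted token with probability min(1, p/q) and resamples on rejection, which is where the extra complication comes from.

```python
from typing import List, Sequence


def greedy_verify(draft: Sequence[int], target_argmax: Sequence[int]) -> List[int]:
    """Return the tokens to append after one speculative step (greedy variant).

    draft:          tokens proposed by the small draft model.
    target_argmax:  hypothetical input with len(draft) + 1 entries, where
                    target_argmax[i] is the target model's most likely token
                    given the prefix followed by draft[:i].
    """
    if len(target_argmax) != len(draft) + 1:
        raise ValueError("target_argmax must have exactly len(draft) + 1 entries")

    accepted: List[int] = []
    for i, token in enumerate(draft):
        # Keep a drafted token only if the target model would have picked it too.
        if token != target_argmax[i]:
            break
        accepted.append(token)

    # Always append one token from the target model: the correction at the
    # first mismatch, or a bonus token if every drafted token was accepted.
    accepted.append(target_argmax[len(accepted)])
    return accepted
```

With this rule the output is identical to plain greedy decoding with the target model alone; the draft only changes how many tokens are produced per target evaluation.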