Implementation details of speculative decoding #13928

Answered by ggerganov
exhyy asked this question in Q&A
We had this implemented at some point (see #5625), but I think I decided it's not worth the complication (#10362). Greedy sampling is good enough and the logic is much simpler.
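For readers unfamiliar with the greedy variant mentioned here, below is a minimal sketch of greedy verification in speculative decoding. It is not the llama.cpp code: the function name `greedy_verify` and its arguments are hypothetical, and it assumes the target model's argmax token at each drafted position (plus one extra position) has already been computed in a single batched pass.

```python
from typing import List, Sequence


def greedy_verify(draft_tokens: Sequence[int],
                  target_argmax: Sequence[int]) -> List[int]:
    """Return the tokens emitted by one speculative step under greedy verification.

    draft_tokens:  tokens proposed by the draft model (hypothetical input).
    target_argmax: the target model's greedy (argmax) token at each drafted
                   position, plus one extra entry for the position that
                   follows the last drafted token.
    """
    if len(target_argmax) != len(draft_tokens) + 1:
        raise ValueError("target_argmax must have len(draft_tokens) + 1 entries")

    accepted: List[int] = []
    for i, tok in enumerate(draft_tokens):
        # Accept a drafted token only if the target would have picked it too.
        if tok != target_argmax[i]:
            break
        accepted.append(tok)

    # On a mismatch this is the target's correction; if every drafted token
    # matched, it is the extra token the target produced for free.
    accepted.append(target_argmax[len(accepted)])
    return accepted
```

With this rule the output is identical to what plain greedy decoding with the target model would produce, which is why no probability comparison or resampling step is needed.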

Answer selected by exhyy