- Use `llama_decode` instead of deprecated `llama_eval` in `Llama` class
- Implement batched inference support for `generate` and `create_completion` methods in `Llama` class
- Add support for streaming / infinite completion
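
For the batched inference item, here is a rough sketch in plain Python of what "batched" could mean at the token level. It is not the library's API: `Batch` and `pack_prompts` are hypothetical names used only for illustration, and the field layout is an assumption loosely modeled on the per-token position / sequence-id / logits-flag idea that a `llama_decode`-style call consumes. The actual structure should be checked against llama.cpp's header.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical flat batch: one entry per token across all sequences."""
    tokens: list[int] = field(default_factory=list)
    positions: list[int] = field(default_factory=list)
    seq_ids: list[int] = field(default_factory=list)
    want_logits: list[bool] = field(default_factory=list)


def pack_prompts(prompts: list[list[int]]) -> Batch:
    """Flatten several tokenized prompts into a single batch.

    Each token keeps its position within its own prompt and the id of the
    sequence it belongs to. Only the last token of each prompt is flagged
    for logits, since that is the position used to sample the next token.
    """
    batch = Batch()
    for seq_id, prompt in enumerate(prompts):
        for pos, token in enumerate(prompt):
            batch.tokens.append(token)
            batch.positions.append(pos)
            batch.seq_ids.append(seq_id)
            batch.want_logits.append(pos == len(prompt) - 1)
    return batch


# Two already-tokenized prompts (token ids are made up for the example).
batch = pack_prompts([[1, 2, 3], [4, 5]])
print(batch.seq_ids)
```

With this shape, one decode call could advance every sequence at once, and `generate` / `create_completion` would then sample one next token per sequence id instead of per call.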