Commit fee2cb5

Add batched Llama model definition using vLLM paged attention (mlc-ai#1134)
* Add batched Llama model with vllm paged attention
* update core.py
* doc
* minor
* add e2e test
* mv file
* clean
* Check if TVM has been built with USE_VLLM
* update BuildArgs docstring
1 parent ba67835 commit fee2cb5

4 files changed: +1347 -165 lines changed

