SDPA re-compile during each SGLang forward #1844

@airMeng

🐛 Describe the bug

python3 -m sglang.bench_one_batch --batch-size BS --input P_LEN --output GEN_LEN --model Llama-3.2-1B-Instruct --trust-remote-code --device xpu --attention-backend torch_native
Image

Versions

https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/14716569221

Labels: bug (Something isn't working)
