Your current environment
The output of `python collect_env.py`
How do I make a quantized model (W4A8 FP8)? I made one with llm-compressor, but it does not work in vLLM 0.10.2.
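For reference, here is a minimal sketch of the documented llm-compressor one-shot flow, using TinyLlama as a placeholder for the actual model. Note the `FP8_DYNAMIC` scheme shown here is W8A8 FP8, not W4A8; the exact recipe string for a W4A8 FP8 scheme depends on the llm-compressor version, so treat this only as the general shape of the workflow.

```python
from llmcompressor import oneshot  # older versions: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder model
SAVE_DIR = "TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic"

# Quantize all Linear layers, skipping the output head; FP8_DYNAMIC needs
# no calibration dataset. The result is a compressed-tensors checkpoint.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])
oneshot(model=MODEL_ID, recipe=recipe, output_dir=SAVE_DIR)
```

vLLM should then auto-detect the compressed-tensors config saved in the output directory, without an explicit `quantization` argument:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic")
out = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```

If loading fails, please include the exact error message and the generated `config.json` quantization section, since "does not work" could mean a load-time error, an unsupported scheme, or bad outputs.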
How would you like to use vllm
I want to run inference of a [specific model](put link here). I don't know how to integrate it with vllm.
Before submitting a new issue...