Your current environment
Model Input Dumps
No response
🐛 Describe the bug
After merging #9891, I tried in-flight quantization with minicpmv and encountered the following error:
[rank0]: File "/vllm/vllm/model_executor/model_loader/loader.py", line 1105, in _load_weights
[rank0]: model.load_weights(qweight_iterator)
[rank0]: File "/vllm/vllm/model_executor/models/minicpmv.py", line 634, in load_weights
[rank0]: param = params_dict[name]
[rank0]: KeyError: 'vpm.encoder.layers.0.mlp.fc1.weight'
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
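For context, here is a minimal sketch (simplified, not vLLM's actual implementation) of the lookup pattern the traceback points at: load_weights builds a dict of the module's registered parameters and indexes it by the name of each entry the weight iterator yields, so any name produced by the bitsandbytes quantized iterator that the MiniCPM-V module never registered raises exactly this KeyError.

# Minimal sketch (not vLLM's actual code) of the failing lookup pattern.
from typing import Iterable, Tuple
import torch
import torch.nn as nn

def load_weights_sketch(model: nn.Module,
                        weights: Iterable[Tuple[str, torch.Tensor]]) -> None:
    # Map of every parameter the vLLM module actually registered.
    params_dict = dict(model.named_parameters())
    for name, loaded_weight in weights:
        # With quantization="bitsandbytes", the iterator can yield names such as
        # 'vpm.encoder.layers.0.mlp.fc1.weight' that have no matching entry
        # here, which is exactly the KeyError in the traceback above.
        param = params_dict[name]  # raises KeyError on any unmatched name
        param.data.copy_(loaded_weight)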
Reproduce code
from vllm import LLM

MODEL_NAME = "openbmb/MiniCPM-V-2_6"
llm = LLM(
    model=MODEL_NAME,
    trust_remote_code=True,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.7,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)
It seems mllama has the same issue. cc @mgoin
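As a sanity check (a hypothetical diagnostic, not part of the repro; the shard filename below is assumed from the usual Hugging Face sharding pattern), the checkpoint keys can be listed with safetensors to confirm that vpm.encoder.layers.0.mlp.fc1.weight really exists in the checkpoint, i.e. the mismatch is in the model-side params_dict rather than in the files:

# Hypothetical diagnostic: inspect the raw checkpoint keys.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Shard name is an assumption based on the usual HF naming scheme.
shard = hf_hub_download("openbmb/MiniCPM-V-2_6",
                        "model-00001-of-00004.safetensors")
with safe_open(shard, framework="pt") as f:
    vpm_keys = [k for k in f.keys() if k.startswith("vpm.encoder.layers.0.")]
print(vpm_keys)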
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.