[Feature]: Pipeline Parallelism support for the Vision Language Models

### 🚀 The feature, motivation and pitch

If I am not mistaken, vLLM currently supports pipeline parallelism only for **language models**, not for **vision-language models**. Enabling it for an unsupported architecture raises:

NotImplementedError: Pipeline parallelism is only supported for the following  architectures: ['AquilaModel', 'AquilaForCausalLM', 'DeepseekV2ForCausalLM', 'InternLMForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LLaMAForCausalLM', 'MistralForCausalLM', 'Phi3ForCausalLM', 'GPT2LMHeadModel', 'MixtralForCausalLM', 'NemotronForCausalLM', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'QWenLMHeadModel'].
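To make the limitation concrete, here is a minimal sketch of the kind of check the error implies. It assumes the check is a plain membership test of the model's declared architectures against the list in the message; `PP_SUPPORTED_ARCHITECTURES` and `supports_pipeline_parallelism` are hypothetical names for illustration, not vLLM's actual code.

```python
# Hypothetical illustration, not vLLM's implementation.
# The list is copied from the NotImplementedError message above.
PP_SUPPORTED_ARCHITECTURES = [
    "AquilaModel", "AquilaForCausalLM", "DeepseekV2ForCausalLM",
    "InternLMForCausalLM", "JAISLMHeadModel", "LlamaForCausalLM",
    "LLaMAForCausalLM", "MistralForCausalLM", "Phi3ForCausalLM",
    "GPT2LMHeadModel", "MixtralForCausalLM", "NemotronForCausalLM",
    "Qwen2ForCausalLM", "Qwen2MoeForCausalLM", "QWenLMHeadModel",
]


def supports_pipeline_parallelism(architectures):
    """Return True if any declared architecture is in the supported list.

    `architectures` is assumed to be the list of names found under the
    "architectures" key of a Hugging Face style config.json.
    """
    return any(arch in PP_SUPPORTED_ARCHITECTURES for arch in architectures)


if __name__ == "__main__":
    # A text-only architecture from the list.
    print(supports_pipeline_parallelism(["LlamaForCausalLM"]))
    # A vision-language architecture, which is not in the list.
    print(supports_pipeline_parallelism(["LlavaForConditionalGeneration"]))
```

Every entry in the list is a text-only architecture, which is why a vision-language architecture such as `LlavaForConditionalGeneration` fails a check like this.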

This feature would greatly benefit teams and projects working with vision-language models, allowing them to scale out their workloads efficiently and maintain performance as model sizes continue to grow.

It would also be very helpful if someone could point me to other options for pipeline parallelism. Thanks in advance.

### Alternatives

_No response_

### Additional context

_No response_

[Feature]: Pipeline Parallelism support for the Vision Language Models #7684
