[New Model]: Add Cohere2 Model

### 🚀 The feature, motivation and pitch

Cohere recently released the [Command R7B](https://github.com/huggingface/transformers/tree/main/src/transformers/models/cohere2) model in Hugging Face Transformers, and I would like to contribute a vLLM implementation of it. @simon-mo

PR: https://github.com/vllm-project/vllm/pull/11358

The model also uses interleaved attention, like Gemma 2 and Mistral, so KV cache optimization is needed. I saw that this is also on the roadmap: https://github.com/vllm-project/vllm/issues/9464
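To illustrate why interleaved attention makes KV cache optimization worthwhile, here is a rough sketch, not the code in the PR. It assumes that local (sliding-window) layers only need to keep the most recent `window` tokens, while global layers keep the full sequence. The helper names, the "every `pattern`-th layer is global" rule, and all numeric values are hypothetical and not taken from the Cohere2 config.

```python
def layer_uses_sliding_window(layer_idx: int, pattern: int) -> bool:
    """Return True if this layer attends only to a local window.

    Assumption (hypothetical rule): every `pattern`-th layer is global,
    and all other layers use sliding-window attention.
    """
    return (layer_idx + 1) % pattern != 0


def kv_tokens_per_layer(
    seq_len: int, num_layers: int, pattern: int, window: int
) -> list[int]:
    """Number of token positions each layer must keep in its KV cache."""
    return [
        # A local layer never looks back further than `window` tokens.
        min(seq_len, window) if layer_uses_sliding_window(i, pattern) else seq_len
        for i in range(num_layers)
    ]


# Illustrative values only, not read from any real model config.
seq_len, num_layers = 32768, 8
tokens = kv_tokens_per_layer(seq_len, num_layers, pattern=4, window=4096)

naive = seq_len * num_layers  # every layer caches the full sequence
optimized = sum(tokens)       # local layers cache only their window
print(f"cached positions: naive={naive}, optimized={optimized}")
```

Under these assumed values, a cache that treats every layer as global stores roughly three times as many token positions as one that is aware of the layer pattern, which is the saving the roadmap item is after.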

### Alternatives

_No response_

### Additional context

I have integrated the model and tested that it works with all the benchmark scripts, and I would like to add a feature branch for review.

### Before submitting a new issue...

- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

[New Model]: Add Cohere2 Model #11357
