[Feature]: Set RoPE scaling parameters dynamically #4334

@LilianJim

🚀 The feature, motivation and pitch

As implemented in #555, RoPE scaling parameters can only be specified through the model's config.json, and I haven't found a way to set them dynamically from my code. Is there currently a way to do this?

Related to #910.
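
For illustration, something like the following is what I'd like to be able to write. The rope_scaling keyword argument here is hypothetical (it doesn't exist today, hence this request):

```python
from vllm import LLM

# Hypothetical: pass RoPE scaling at construction time
# instead of editing the model's config.json on disk.
llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    rope_scaling={"type": "dynamic", "factor": 2.0},  # not a real argument today
)
```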

Alternatives

Right now, short of providing a modified config.json (which is very inconvenient in my setup), I haven't found an alternative. I've tried monkey patching the vllm.transformers_utils.config.get_config function, to no avail (Ray uses it in a way I don't understand; presumably the workers run in separate processes and import the unpatched module).
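
For reference, this is roughly the patch I applied in the driver process (a sketch only; the scaling values are just examples, and with tensor parallelism the patch never seems to reach the Ray workers):

```python
import vllm.transformers_utils.config as cfg

_original_get_config = cfg.get_config

def get_config_with_rope_scaling(*args, **kwargs):
    # Wrap the original loader and inject RoPE scaling after the fact.
    config = _original_get_config(*args, **kwargs)
    config.rope_scaling = {"type": "dynamic", "factor": 2.0}  # example values
    return config

cfg.get_config = get_config_with_rope_scaling
```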

Additional context

For context (specific to my setup): I'm running a quantized Llama-3-70B (casperhansen's AWQ quant) across 2 GPUs, so vLLM uses Ray for tensor parallelism.
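
Concretely, the setup looks like this (repo id quoted from memory, so treat it as approximate):

```python
from vllm import LLM

# Two-GPU tensor-parallel run; vLLM spins up Ray workers for this,
# which is where my monkey patch stops working.
llm = LLM(
    model="casperhansen/llama-3-70b-instruct-awq",  # quantized Llama-3-70B
    quantization="awq",
    tensor_parallel_size=2,
)
```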
