Merged
3 changes: 2 additions & 1 deletion vllm/config/model.py
@@ -163,7 +163,7 @@ class ModelConfig:
     specified by the server file system. This is a security risk. Should only
     be enabled in trusted environments."""
     allowed_media_domains: list[str] | None = None
-    """If set, only media URLs that belong to this domain can be used for 
+    """If set, only media URLs that belong to this domain can be used for
     multi-modal inputs. """
     revision: str | None = None
     """The specific model version to use. It can be a branch name, a tag name,
@@ -345,6 +345,7 @@ def compute_hash(self) -> str:
         factors.append(self.rope_scaling)
         factors.append(self.rope_theta)
         factors.append(self.video_pruning_rate)
+        factors.append(self.enable_prompt_embeds)
Contributor review comment (severity: high):
Good catch adding enable_prompt_embeds to the compilation hash.

While reviewing this, I noticed that runner_type and convert_type also seem to affect the computation graph but are not currently included in the hash. These fields can determine which model implementation is used (e.g., for generation vs. pooling) or whether a model adapter is applied, both of which are significant changes to the graph.

To prevent potential cache collisions when switching between runners or converters for the same base model, it would be safer to include them in the hash factors. What do you think about adding them here?

Suggested change:
-        factors.append(self.enable_prompt_embeds)
+        factors.append(self.enable_prompt_embeds)
+        factors.append(self.runner_type)
+        factors.append(self.convert_type)
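The collision the reviewer describes can be sketched with a toy version of `compute_hash`. `ToyModelConfig` below is a hypothetical, heavily simplified stand-in (the real `ModelConfig` hashes many more fields); the point is only that any field affecting the compiled graph must be folded into the digest, or two configs that build different graphs will map to the same cache key.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ToyModelConfig:
    # Illustrative fields only; not vLLM's actual ModelConfig.
    rope_theta: float = 10000.0
    enable_prompt_embeds: bool = False
    runner_type: str = "generate"
    convert_type: str = "none"

    def compute_hash(self) -> str:
        factors: list = []
        factors.append(self.rope_theta)
        factors.append(self.enable_prompt_embeds)
        # Per the suggestion above: include runner/convert type so that
        # switching runners for the same base model changes the cache key.
        factors.append(self.runner_type)
        factors.append(self.convert_type)
        return hashlib.sha256(str(factors).encode()).hexdigest()
```

With `runner_type` in the factor list, a generation config and a pooling config for the same model hash differently; drop that line and the two would silently share a compiled-graph cache entry.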


         # hf_config can control how the model looks!
         try: