-
-
Notifications
You must be signed in to change notification settings - Fork 11.5k
[Perf] Cache vllm.env.__getattr__ result to avoid recomputation #26146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a performance optimization by caching the results of environment variable lookups using @functools.cache. While this is a good optimization, I've identified a critical issue where this caching can interfere with functions that modify environment variables at runtime, such as set_vllm_use_v1. This could lead to stale configuration values being used. Please see my detailed comment.
|
Unfortunately we do mutate env vars throughout setup in various situations. Do you think we could change this to cache once we get past startup? Certainly once we are serving we expect nothing to change |
Honestly would be nice to also log those in-place env updates. I've been feeling quite confused about those while dong this. |
@mgoin Sure thing. We could introduce new API to invalidate and re-warmup the cache, and kick it off right before startup finished. And I could think off a few places to invoke this:
Will update the PR soonish, and thanks for pointing out the on-fly environment changes before startup. |
@yeqcharlotte We might need to migrate all os.environ access via a new API, then we could wire up the logging and |
|
CC @mgoin @yeqcharlotte for reviews after introducing cache reloads after each process initialization. |
|
Trying to address the precommit errors in #26742 which doesn't seem to be related to this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. I'm a bit worried about unintended changes to behavior that are hard to predict or catch in CI, but I believe this is better to figure out quickly. Thank you.
Let me know when precommit is resolved and I will enable full CI
Yeah, totally! But I think it should be a legit assumption that environment variable SHOULD NOT change after service startup. (And we might need to fix forward if some logic doesn't follow this assumption). On the other hand, I bet there could be more use cases similar to my recent GC debug changes which incorrectly use the ENV_VAR instead of vllm.envs.ENV_VAR which backed by getattr cache behind the scene. I might create an issue to followup to migrate ENV_VAR -> vllm.envs.ENV_VAR.
Will nudge you again and after #26742 landed and I rebased this PR. Thanks in advance. |
|
@mgoin I found quite a lot failing tests and rethink about the actual usage.
So I changed the implementation to wrap envs.getattr with functools.cache after service initialization instead. So we don't need to worry about 1. at all, and we would only need to see if there's any ongoing usage which violates 2. instead. |
Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Jialin Ouyang <[email protected]>
|
I agree, we should only enforce 2 |
@mgoin At least all CI passed now. Please let me know if you have other concerns we should address before merging. |
|
LGTM, let's create the issue to move to migrate ENV_VAR -> vllm.envs.ENV_VAR to enforce/log any deviations in the future |
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: Jonah Bernard <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: bbartels <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>
youkaichao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jialin thanks for the great work! i think we should also call enable_envs_cache inside workers? right now it seems only the engine core / executor calls enable_envs_cache. then it won't work e.g. when we use spawn to create processes, or use ray to create remote processes.
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: 0xrushi <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: 0xrushi <[email protected]>
Thanks @youkaichao for the suggestions. We also ran into similar issue when we used external launcher (e.g. torchrun) to kick off processes (CC @22quinn ) I'm trying to further extend this in #27632. |
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>
…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>
Purpose
We found quite some os.env stack in the trace dump. But ideally, those environment results are NOT changed after process starts, so we should be caching the results to avoid recomputation.
Environment variables cache will be refreshed after the process initialization to allow environment variable overrides during server startups.
Test Plan & Test Result
_get_num_input_tokens took 11us without the PR and 5us with the caching.
Before

After

Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.