[Perf] Cache vllm.env.__getattr__ result to avoid recomputation #26146

Jialin · 2025-10-02T23:40:58Z

Purpose

We found quite a few os.environ lookup stacks in the trace dump. Ideally, these environment values do NOT change after the process starts, so we should cache the results to avoid recomputation.

The environment variable cache is refreshed after process initialization, so environment variable overrides made during server startup still take effect.

Test Plan & Test Result

_get_num_input_tokens took 11us without the PR and 5us with the caching.

Before / After (screenshots not preserved)
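The effect of caching an environment lookup can be approximated outside the server with a small micro-benchmark. This is only a sketch and not the measurement reported above: `DEMO_BATCH_SIZE`, `read_uncached`, and `read_cached` are hypothetical names, and absolute timings depend on the machine.

```python
import functools
import os
import timeit

# Hypothetical variable, used only for this illustration.
os.environ["DEMO_BATCH_SIZE"] = "256"


def read_uncached() -> int:
    # Reads and parses the environment variable on every call.
    return int(os.environ.get("DEMO_BATCH_SIZE", "0"))


@functools.cache
def read_cached() -> int:
    # Reads and parses once; later calls return the stored result.
    return int(os.environ.get("DEMO_BATCH_SIZE", "0"))


N = 100_000
uncached_s = timeit.timeit(read_uncached, number=N)
cached_s = timeit.timeit(read_cached, number=N)

print(f"uncached: {uncached_s / N * 1e6:.3f} us/call")
print(f"cached:   {cached_s / N * 1e6:.3f} us/call")
```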

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request introduces a performance optimization by caching the results of environment variable lookups using @functools.cache. While this is a good optimization, I've identified a critical issue where this caching can interfere with functions that modify environment variables at runtime, such as set_vllm_use_v1. This could lead to stale configuration values being used. Please see my detailed comment.
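The stale-value hazard described here can be reproduced with the standard library alone. The sketch below is illustrative only; `DEMO_USE_V1` and `get_flag` are hypothetical stand-ins and not vLLM names.

```python
import functools
import os
from typing import Optional


@functools.cache
def get_flag(name: str) -> Optional[str]:
    # The first call per name reads os.environ; later calls reuse the result.
    return os.environ.get(name)


os.environ["DEMO_USE_V1"] = "0"
before = get_flag("DEMO_USE_V1")  # populates the cache with "0"

os.environ["DEMO_USE_V1"] = "1"   # runtime mutation, e.g. during setup
after = get_flag("DEMO_USE_V1")   # served from the cache, so the update is missed

print(before, after, os.environ["DEMO_USE_V1"])  # prints: 0 0 1
```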

vllm/envs.py

mgoin · 2025-10-03T01:05:22Z

Unfortunately we do mutate env vars throughout setup in various situations. Do you think we could change this to cache once we get past startup? Certainly once we are serving we expect nothing to change

yeqcharlotte · 2025-10-03T06:26:26Z

> Unfortunately we do mutate env vars throughout setup in various situations. Do you think we could change this to cache once we get past startup? Certainly once we are serving we expect nothing to change

Honestly, it would be nice to also log those in-place env updates. I've been feeling quite confused about those while doing this.

Jialin · 2025-10-03T21:23:30Z

> Unfortunately we do mutate env vars throughout setup in various situations. Do you think we could change this to cache once we get past startup? Certainly once we are serving we expect nothing to change

@mgoin Sure thing. We could introduce a new API to invalidate and re-warm the cache, and kick it off right before startup finishes. I can think of a few places to invoke this:

EngineCoreProc.__init__:

vllm/vllm/v1/engine/core.py

Lines 531 to 537 in 0879736

    
    # Mark the startup heap as static so that it's ignored by GC.
    # Reduces pause times of oldest generation collections.
    gc.collect()
    gc.freeze()
    # If enable, attach GC debugger after static variable freeze.
    maybe_attach_gc_debug_callback()

WorkerProc.worker_main after the worker marked itself as READY:

vllm/vllm/v1/executor/multiproc_executor.py

Lines 573 to 579 in 0879736

    
    # Send READY once we know everything is loaded
    ready_writer.send({
        "status":
        WorkerProc.READY_STR,
        "handle":
        worker.worker_response_mq.export_handle(),
    })

Will update the PR soonish. Thanks for pointing out the on-the-fly environment changes before startup.

Jialin · 2025-10-03T21:25:01Z

> Honestly, it would be nice to also log those in-place env updates. I've been feeling quite confused about those while doing this.

@yeqcharlotte We might need to migrate all os.environ access to a new API; then we could wire up the logging and cache invalidation properly. However, there might not be a way to force everyone to use the new API :/
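A rough sketch of what such an API could look like, assuming a single setter that both logs the update and invalidates the cache. `get_env`, `set_env`, and `DEMO_MODE` are hypothetical names and do not exist in vLLM. The last step shows the enforcement gap: a direct write to os.environ bypasses both the log and the invalidation.

```python
import functools
import logging
import os
from typing import Optional

logger = logging.getLogger("demo_envs")


@functools.cache
def get_env(name: str) -> Optional[str]:
    # Cached read; valid only until the next set_env() call.
    return os.environ.get(name)


def set_env(name: str, value: str) -> None:
    # Hypothetical single entry point for updates: log, write, invalidate.
    old = os.environ.get(name)
    os.environ[name] = value
    logger.info("env update: %s %r -> %r", name, old, value)
    get_env.cache_clear()


set_env("DEMO_MODE", "eager")
via_api_1 = get_env("DEMO_MODE")      # "eager"
set_env("DEMO_MODE", "compiled")
via_api_2 = get_env("DEMO_MODE")      # "compiled": the cache was invalidated

os.environ["DEMO_MODE"] = "bypassed"  # direct write, not logged
after_bypass = get_env("DEMO_MODE")   # still "compiled": stale
```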

Jialin · 2025-10-13T21:01:41Z

CC @mgoin @yeqcharlotte for reviews after introducing cache reloads after each process initialization.

Jialin · 2025-10-13T22:55:24Z

Trying to address the pre-commit errors in #26742, which don't seem to be related to this PR.

mgoin

Nice work. I'm a bit worried about unintended changes to behavior that are hard to predict or catch in CI, but I believe this is better to figure out quickly. Thank you.

Let me know when precommit is resolved and I will enable full CI

Jialin · 2025-10-14T00:53:35Z

> I'm a bit worried about unintended changes to behavior that are hard to predict or catch in CI, but I believe this is better to figure out quickly.

Yeah, totally! But I think it is a legitimate assumption that environment variables SHOULD NOT change after service startup. (And we might need to fix forward if some logic doesn't follow this assumption.)

On the other hand, I bet there are more use cases similar to my recent GC debug changes, which incorrectly use ENV_VAR instead of vllm.envs.ENV_VAR, which is backed by the __getattr__ cache behind the scenes. I might create a follow-up issue to migrate ENV_VAR -> vllm.envs.ENV_VAR.
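One reading of the ENV_VAR versus vllm.envs.ENV_VAR distinction is that a name bound once (as `from module import NAME` does) is a snapshot, while attribute access on the module goes through the module-level `__getattr__` each time. The sketch below illustrates that general Python behavior with a hypothetical `demo_envs` module and `DEMO_GC_DEBUG` variable; it is not taken from the vLLM code.

```python
import os
import types


def _lookup(name: str):
    # Hypothetical lazy lookup, standing in for a module-level __getattr__.
    if name == "DEMO_GC_DEBUG":
        return os.environ.get("DEMO_GC_DEBUG", "0") == "1"
    raise AttributeError(name)


demo_envs = types.ModuleType("demo_envs")
demo_envs.__getattr__ = _lookup  # PEP 562: called when normal lookup fails

os.environ["DEMO_GC_DEBUG"] = "0"

# Same effect as `from demo_envs import DEMO_GC_DEBUG`: the value is read
# once and bound to a plain name, so later reads never reach __getattr__.
DEMO_GC_DEBUG = demo_envs.DEMO_GC_DEBUG

os.environ["DEMO_GC_DEBUG"] = "1"

snapshot = DEMO_GC_DEBUG        # False: frozen at binding time
live = demo_envs.DEMO_GC_DEBUG  # True: evaluated through __getattr__ again
```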

> Let me know when precommit is resolved and I will enable full CI

Will nudge you again after #26742 lands and I rebase this PR. Thanks in advance.

Jialin · 2025-10-14T06:25:42Z

@mgoin I found quite a lot of failing tests and rethought the actual usage.

1. Before service initialization, we should expect that environment variables can change at any moment.
2. After service initialization, all the variables should be locked.

So I changed the implementation to wrap envs.__getattr__ with functools.cache after service initialization instead. That way we don't need to worry about 1. at all, and we only need to check whether any existing usage violates 2.
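A minimal sketch of enabling the cache on the fly, assuming a module whose attributes are resolved through a module-level `__getattr__`. The name `enable_envs_cache` is borrowed from a later comment in this thread, but its body here is an assumption, and `demo_envs` and `DEMO_PORT` are hypothetical.

```python
import functools
import os
import types


def _lookup(name: str):
    # Uncached lookup: re-reads and re-parses the environment on every access.
    if name == "DEMO_PORT":
        return int(os.environ.get("DEMO_PORT", "8000"))
    raise AttributeError(name)


demo_envs = types.ModuleType("demo_envs")
demo_envs.__getattr__ = _lookup  # startup phase: no caching


def enable_envs_cache() -> None:
    # Assumed behavior: swap in a cached wrapper once startup is complete.
    demo_envs.__getattr__ = functools.cache(_lookup)


# During startup, mutations are observed immediately.
os.environ["DEMO_PORT"] = "8001"
startup_1 = demo_envs.DEMO_PORT  # 8001
os.environ["DEMO_PORT"] = "8002"
startup_2 = demo_envs.DEMO_PORT  # 8002

enable_envs_cache()

# After startup, the first read is cached and later mutations are ignored.
serving_1 = demo_envs.DEMO_PORT  # 8002
os.environ["DEMO_PORT"] = "8003"
serving_2 = demo_envs.DEMO_PORT  # 8002
```

With this shape, point 1 needs no special handling because nothing is cached before `enable_envs_cache()` runs, and any code that violates point 2 shows up as a value that no longer tracks os.environ.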

Signed-off-by: Jialin Ouyang <[email protected]>

mgoin · 2025-10-14T17:55:17Z

I agree, we should only enforce 2

Jialin · 2025-10-14T20:15:54Z

> Let me know when precommit is resolved and I will enable full CI

@mgoin At least all CI passed now. Please let me know if you have other concerns we should address before merging.

mgoin · 2025-10-14T21:03:18Z

LGTM, let's create the issue to migrate ENV_VAR -> vllm.envs.ENV_VAR to enforce/log any deviations in the future

Jialin · 2025-10-14T23:13:40Z

> LGTM, let's create the issue to migrate ENV_VAR -> vllm.envs.ENV_VAR to enforce/log any deviations in the future

Created issue #26854 which should be mostly addressed by #26810

youkaichao

@Jialin thanks for the great work! i think we should also call enable_envs_cache inside workers? right now it seems only the engine core / executor calls enable_envs_cache. then it won't work e.g. when we use spawn to create processes, or use ray to create remote processes.

Jialin · 2025-10-28T06:04:55Z

> @Jialin thanks for the great work! i think we should also call enable_envs_cache inside workers? right now it seems only the engine core / executor calls enable_envs_cache. then it won't work e.g. when we use spawn to create processes, or use ray to create remote processes.

Thanks @youkaichao for the suggestions. We also ran into a similar issue when we used an external launcher (e.g. torchrun) to kick off processes (CC @22quinn).

I'm trying to further extend this in #27632.

gemini-code-assist bot reviewed Oct 2, 2025

Jialin force-pushed the env branch from 1426fb7 to fb9ba55 Compare October 13, 2025 20:57

Jialin requested review from WoosukKwon, alexm-redhat, comaniac, njhill, robertgshaw2-redhat and ywang96 as code owners October 13, 2025 20:57

mergify bot added the v1 label Oct 13, 2025

Jialin mentioned this pull request Oct 13, 2025

[Easy] Fix env type check errors from VLLM_DEBUG_LOG_API_SERVER_RESPONSE #26742

Merged

5 tasks

mgoin approved these changes Oct 14, 2025

Jialin force-pushed the env branch from a0d0420 to 02757d7 Compare October 14, 2025 02:58

mgoin added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 14, 2025

Jialin changed the title ~~[Perf][Easy] Cache vllm.env.__getattr__ result to avoid recomputation~~ [Perf] Cache vllm.env.__getattr__ result to avoid recomputation Oct 14, 2025

Jialin mentioned this pull request Oct 14, 2025

[Core][Easy] Use envs.__getattr__ for all Unify to environment variable access #26810

Merged

5 tasks

Jialin added 7 commits October 14, 2025 10:09

Cache vllm.env.__getattr__ result to avoid recomputation

c721f7d

Signed-off-by: Jialin Ouyang <[email protected]>

Refresh environment variable cache after process initialization

831dd8d

Signed-off-by: Jialin Ouyang <[email protected]>

code comments

e9fe650

Signed-off-by: Jialin Ouyang <[email protected]>

Enable functools.cache on the fly instead

f686142

Signed-off-by: Jialin Ouyang <[email protected]>

Fix VLLM_GC_DEBUG in separate PR

2b4f339

Signed-off-by: Jialin Ouyang <[email protected]>

Code comments

5264699

Signed-off-by: Jialin Ouyang <[email protected]>

Code comment polish

8e583c4

Signed-off-by: Jialin Ouyang <[email protected]>

Jialin force-pushed the env branch from 2a3d1a8 to 8e583c4 Compare October 14, 2025 17:09

Jialin mentioned this pull request Oct 14, 2025

[flashinfer] [kernel] support for fp8 kv cache for trtllm prefill attention #24197

Merged

mgoin merged commit 380f175 into vllm-project:main Oct 14, 2025
46 checks passed

Jialin mentioned this pull request Oct 14, 2025

[Bug]: Use vllm.envs.ENV_VARIABLE instead of ENV_VARIABLE #26854

Open

1 task

Jialin deleted the env branch October 14, 2025 23:13

Jonahcb pushed a commit to Jonahcb/vllm that referenced this pull request Oct 15, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

d5facec

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: Jonah Bernard <[email protected]>

bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

113788e

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: bbartels <[email protected]>

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

fd7fc87

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>

youkaichao reviewed Oct 23, 2025

alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

f7f7e58

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

34eec2a

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

ba488a6

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>

0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

21fee26

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: 0xrushi <[email protected]>

0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

8387166

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]> Signed-off-by: 0xrushi <[email protected]>

Jialin mentioned this pull request Oct 28, 2025

[Core][Env Cache] Enable environment variable cache for Worker as well #27632

Closed

5 tasks

rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

edfa599

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>

Zhathw pushed a commit to Zhathw/vllm that referenced this pull request Nov 12, 2025

[Perf] Cache vllm.env.__getattr__ result to avoid recomputation (vllm…

5497e57

…-project#26146) Signed-off-by: Jialin Ouyang <[email protected]>
