Conversation

Collaborator

@jeejeelee jeejeelee commented Nov 12, 2025

Purpose

In this PR:

  • Clean up all code related to lora_extra_vocab_size and lora_vocab_padding_size (fixes [RFC]: Disallow extra vocab for LoRA #23474)
  • Train a LoRA for meta-llama/Llama-3.2-3B-Instruct to replace the previous Llama-2-7B LoRA model, so that adding LoRA to the lm_head and embedding layers is covered while reducing CI test pressure
  • Switch the related LoRA tests to Qwen/Qwen3-0.6B, which also reduces CI test pressure

The remaining related code will be cleaned up in follow-up work.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Jee Jee Li <[email protected]>
@jeejeelee jeejeelee marked this pull request as draft November 12, 2025 10:39
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request is part of an effort to remove the LoRA extra vocabulary feature. The changes to the model implementations in granite.py and teleflm.py are consistent with this goal, and tests related to the extra vocabulary feature have been correctly removed. However, I've found a critical issue in tests/lora/test_lora_manager.py where a change breaks a test, which will need to be addressed.

```diff
-    new_embeddings = load_file(
-        os.path.join(sql_lora_files, "new_embeddings.safetensors")
-    )
+    new_embeddings = load_file(os.path.join(sql_lora_files, ""))
```

critical

load_file(os.path.join(sql_lora_files, "")) will attempt to load from a directory path, which will raise an IsADirectoryError and cause the test to fail.

Since new embeddings are being removed, new_embeddings should likely be an empty dictionary.

Please note that changing this line to new_embeddings = {} will reveal another issue in this test: a KeyError will be raised on line 83. The test logic from line 77 onwards needs to be updated to reflect that lora.embeddings_tensor is now always None. The if/else block can be simplified to assert lora.embeddings_tensor is None.

Suggested change

```diff
-    new_embeddings = load_file(os.path.join(sql_lora_files, ""))
+    new_embeddings = {}
```
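Putting the two suggestions together, the corrected test would look roughly like the sketch below. This is only an illustration: `DEVICES`, `sql_lora_files`, and the loading of `adapter_model.safetensors` come from the quoted test, while the `LoRAModel` construction is elided and the iteration over per-module `lora` objects is an assumption about how the rest of the existing test is structured.

```python
import os

import pytest
from safetensors.torch import load_file


@pytest.mark.parametrize("device", DEVICES)
def test_from_lora_tensors(sql_lora_files, device):
    tensors = load_file(os.path.join(sql_lora_files, "adapter_model.safetensors"))
    # new_embeddings.safetensors was deleted along with the extra-vocab feature,
    # so there is nothing to load here anymore.

    lora_model = ...  # built from `tensors` exactly as in the existing test

    for lora in lora_model.loras.values():
        # With extra vocab removed, no LoRA layer carries an embeddings tensor,
        # so the old if/else collapses into a single assertion.
        assert lora.embeddings_tensor is None
```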


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 48 to 52

```diff
 @pytest.mark.parametrize("device", DEVICES)
 def test_from_lora_tensors(sql_lora_files, device):
     tensors = load_file(os.path.join(sql_lora_files, "adapter_model.safetensors"))
-    new_embeddings = load_file(
-        os.path.join(sql_lora_files, "new_embeddings.safetensors")
-    )
+    new_embeddings = load_file(os.path.join(sql_lora_files, ""))
```


P1: Avoid loading deleted new_embeddings file

The updated test still calls load_file(os.path.join(sql_lora_files, "")), which resolves to the LoRA directory itself. safetensors.torch.load_file only accepts paths to .safetensors files and will raise IsADirectoryError, so test_from_lora_tensors now crashes before exercising any behaviour. If extra vocab embeddings are no longer used, this load should be dropped or replaced with a stub so the test can run.


@mergify mergify bot added llama Related to Llama models tpu Related to Google TPUs labels Nov 13, 2025
@mergify mergify bot removed the tpu Related to Google TPUs label Nov 16, 2025
@jeejeelee jeejeelee marked this pull request as ready for review November 16, 2025 07:37
@jeejeelee jeejeelee added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 16, 2025

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

```python
self.hidden_size = model_config.get_hidden_size()
self.vocab_size = model_config.get_vocab_size()
if self.lora_config is not None:
    self.vocab_size += self.lora_config.lora_extra_vocab_size
```

P1: TPU runner still references removed LoRA extra vocab attr

The LoRAConfig dataclass no longer exposes lora_extra_vocab_size, but the TPU model runner continues to access self.lora_config.lora_extra_vocab_size when computing the model vocabulary. Any LoRA-enabled run on TPU will now raise an AttributeError during initialization, preventing TPU LoRA serving altogether. Consider removing this addition or guarding it behind a compatibility shim so TPU paths remain functional.
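If the first option in that comment is taken, the TPU runner snippet above would reduce to something like the following. This is a sketch only; the attribute and `model_config` accessor names are copied from the quoted code, not from the final diff.

```python
self.hidden_size = model_config.get_hidden_size()
# LoRAConfig no longer exposes lora_extra_vocab_size, so the vocabulary size
# comes straight from the model config with no LoRA-specific adjustment.
self.vocab_size = model_config.get_vocab_size()
```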


```python
def _update_base_metadata(
    self,
    mapping: "LoRAMapping",
    lora_index_to_id: list[int | None],
    max_loras: int,
    vocab_size: int,
    extra_vocab_size: int,
):
```

P1: Punica TPU metadata hook requires extra arg no longer passed

PunicaWrapperBase.update_metadata now calls _update_base_metadata with four positional parameters (mapping, lora_index_to_id, max_loras, vocab_size) and unconditionally sets extra_vocab_size to 0 internally. The TPU implementation still overrides _update_base_metadata with a five‑argument signature, so the call from the base class will raise TypeError: _update_base_metadata() missing 1 required positional argument whenever TPU LoRA metadata is updated. The override should drop the extra_vocab_size parameter or accept a default to keep the method compatible.
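The fix suggested in that comment would look roughly like this in the TPU override: drop `extra_vocab_size` so the signature matches the base-class call site. Sketch only; the method body is elided and the parameter types are taken from the quoted signature.

```python
def _update_base_metadata(
    self,
    mapping: "LoRAMapping",
    lora_index_to_id: list[int | None],
    max_loras: int,
    vocab_size: int,
) -> None:
    # extra_vocab_size was removed from PunicaWrapperBase.update_metadata's
    # call, so the TPU override must accept the same four arguments.
    ...
```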


@jeejeelee jeejeelee removed the ready ONLY add when PR is ready to merge/full CI is needed label Nov 16, 2025
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
@mergify mergify bot added v1 tpu Related to Google TPUs labels Nov 16, 2025
@jeejeelee jeejeelee added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 16, 2025
@jeejeelee jeejeelee removed the ready ONLY add when PR is ready to merge/full CI is needed label Nov 17, 2025
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
@jeejeelee jeejeelee force-pushed the remove-lora-extra-vocab-2nd branch from 7b97113 to 5913a48 Compare November 17, 2025 05:11
@jeejeelee jeejeelee added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 17, 2025
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
@jeejeelee jeejeelee changed the title from [LoRA][2/N]Remove LoRA extra vocab to [LoRA][2/2]Remove LoRA extra vocab Nov 17, 2025
@jeejeelee jeejeelee merged commit 9875be6 into vllm-project:main Nov 21, 2025
60 checks passed
@jeejeelee jeejeelee deleted the remove-lora-extra-vocab-2nd branch November 21, 2025 01:46
LuminolT pushed a commit to LuminolT/vllm that referenced this pull request Nov 21, 2025