New weight loader without np copy #52

zhuohan123 · 2023-04-30T10:32:04Z

Fix #48.

This PR makes the numpy copy used by the previous weight-loading path optional. Specifically, we implement a new hf_model_weights_iterator, which iterates over all the weights in a Hugging Face checkpoint. We then load each weight from the checkpoint into the model's state_dict.
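The flow described above could look roughly like the sketch below. Only the name hf_model_weights_iterator comes from this PR; its signature, the `load_shard` parameter, the `*.bin` shard layout, and the `load_weights` helper are assumptions made for illustration and are not the code that was merged.

```python
import glob
import os
from typing import Any, Callable, Dict, Iterator, Set, Tuple


def _torch_load(path: str) -> Dict[str, Any]:
    # Imported lazily so the iterator itself does not require torch.
    import torch
    return torch.load(path, map_location="cpu")


def hf_model_weights_iterator(
    model_dir: str,
    load_shard: Callable[[str], Dict[str, Any]] = _torch_load,
) -> Iterator[Tuple[str, Any]]:
    """Yield (name, tensor) pairs from every *.bin shard in model_dir.

    Hypothetical signature: assumes the checkpoint has already been
    downloaded and is stored as one or more *.bin shard files.
    """
    for shard_path in sorted(glob.glob(os.path.join(model_dir, "*.bin"))):
        shard = load_shard(shard_path)
        for name, tensor in shard.items():
            yield name, tensor
        # Drop the shard before loading the next one to limit peak memory.
        del shard


def load_weights(
    state_dict: Dict[str, Any],
    weights: Iterator[Tuple[str, Any]],
) -> Set[str]:
    """Copy each checkpoint weight into the matching state_dict entry.

    Hypothetical helper: entries only need an in-place copy_ method,
    which torch tensors provide. Returns the names that were loaded.
    """
    loaded = set()
    for name, tensor in weights:
        if name not in state_dict:
            continue  # skip checkpoint entries the model does not define
        state_dict[name].copy_(tensor)
        loaded.add(name)
    return loaded
```

With PyTorch installed, a call such as `load_weights(model.state_dict(), hf_model_weights_iterator(checkpoint_dir))` would copy the weights in place without an intermediate numpy array; `model` and `checkpoint_dir` are placeholders here.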

WoosukKwon

Awesome. Left a comment about minor refactoring.

cacheflow/models/gpt_neox.py

### What this PR does / why we need it?

Fix communicator patch so parallel could work. See vllm-project#52.

Signed-off-by: MengqingCao <[email protected]>

zhuohan123 added 5 commits April 30, 2023 10:31

New weight loader for OPT models

8e5f1f7

support llama

f0cb731

fix loading for gpt_neox

73a04ae

Disable tqdm for downloading.

f1f5d3a

Fix arguments & remove duplicated codes

dde2e87

zhuohan123 requested a review from WoosukKwon May 2, 2023 09:57

WoosukKwon approved these changes May 3, 2023

cacheflow/models/gpt_neox.py (outdated, resolved)

Fix review comments

00353c5

zhuohan123 merged commit 27f1410 into main May 3, 2023

zhuohan123 deleted the new-weight-loader branch May 24, 2023 04:40

shanshanpt mentioned this pull request Nov 17, 2023

Run long conetxt error : CUDA error: an illegal memory access was encountered #1700

Closed

junior-zsy mentioned this pull request Nov 20, 2023

Error with 32k Long Text in chatglm2-6b-32k Model #1725

Closed

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

New weight loader without np copy (vllm-project#52)

1f6d8d9

yuhuixu1993 mentioned this pull request Jun 2, 2024

[Bug]: loading squeezellm model #5190

Closed

ZHJ19970917 mentioned this pull request Jul 14, 2024

[Bug]: When using qwen-32b-chat-awq with multi-threaded access, errors occur after approximately several hundred visits.”vllm.engine.async_llm_engine.AsyncEngineDeadError: Background loop has errored already.“ #6421

Closed

dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this pull request Jul 22, 2024

fix __init__ files (vllm-project#52)

0370719

* fix __init__ files
* make yapf happy

JHLEE17 pushed a commit to JHLEE17/vllm that referenced this pull request Aug 1, 2024

Update requirements-hpu.txt (vllm-project#52)

d3e64dc

alixiaodi mentioned this pull request Aug 2, 2024

[Bug]: #7072

Closed

heheda12345 pushed a commit to heheda12345/vllm that referenced this pull request Sep 29, 2025

Merge pull request vllm-project#52 from vllm-model-0920/wentao-fix-test

7829646

[Bug] Fix Test for Blackwell

Bounty-hunter added a commit to Bounty-hunter/vllm that referenced this pull request Sep 30, 2025

bug fix for text request (vllm-project#52)

f5a7386

Co-authored-by: dengyunyang <[email protected]>
