[WIP] feat: support loading model weights and forward overlap. #441

Clement-Wang26 · 2025-11-26T08:56:29Z

details:

Model loading first merges the tensors uniformly in host DRAM, then manually allocates device memory (malloc) and copies the merged data into it (memcpy).
Weight loading can now overlap with the forward pass.
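
A minimal sketch of the merge, malloc, memcpy sequence described above, for illustration only. It assumes the merged tensors are packed back to back into one contiguous host buffer, which the description does not confirm. `HostTensor`, `Slice`, `MergedWeights`, `merge_in_dram`, and `upload` are hypothetical names, not code from this PR. `std::malloc` and `std::memcpy` stand in for the device-side allocation and copy (on Ascend these would presumably be `aclrtMalloc` and `aclrtMemcpy`), so the sketch compiles without the Ascend toolkit. Alignment and padding are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical host-side tensor: raw bytes only, no dtype or shape.
struct HostTensor {
  std::vector<std::uint8_t> bytes;
};

// Where one tensor lands inside the merged buffer.
struct Slice {
  std::size_t offset;
  std::size_t size;
};

struct MergedWeights {
  std::vector<std::uint8_t> host;  // staging buffer in DRAM
  std::vector<Slice> slices;       // one entry per input tensor, in order
};

// Step 1: pack all tensors back to back into one host buffer.
inline MergedWeights merge_in_dram(const std::vector<HostTensor>& tensors) {
  std::size_t total = 0;
  for (const auto& t : tensors) total += t.bytes.size();

  MergedWeights merged;
  merged.host.resize(total);
  merged.slices.reserve(tensors.size());

  std::size_t offset = 0;
  for (const auto& t : tensors) {
    if (!t.bytes.empty()) {
      std::memcpy(merged.host.data() + offset, t.bytes.data(), t.bytes.size());
    }
    merged.slices.push_back({offset, t.bytes.size()});
    offset += t.bytes.size();
  }
  assert(offset == total);
  return merged;
}

// Steps 2 and 3: one allocation and one copy for the whole buffer.
// std::malloc / std::memcpy are stand-ins for the device calls.
// The caller owns the returned pointer (released with std::free here).
inline void* upload(const MergedWeights& merged) {
  if (merged.host.empty()) return nullptr;
  void* device = std::malloc(merged.host.size());
  if (device == nullptr) return nullptr;
  std::memcpy(device, merged.host.data(), merged.host.size());
  return device;
}
```

With this layout, each layer's weights can be addressed as `base + slices[i].offset` after a single copy, instead of issuing one allocation and one copy per tensor.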

liujinguang0125 · 2025-11-26T14:41:53Z

xllm/models/lazy_layer_loader.h

+
+  c10_npu::NPUStream load_stream_;
+  std::unique_ptr<ThreadPool> threadpool_;
+  std::vector<aclrtEvent> events_;


aclrtEvent depends on the Ascend platform, so the event needs to be abstracted here. You can implement an abstract EventInterface class, a concrete NpuEvent, and an EventFactory in core/platform.
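
A rough sketch of what that abstraction could look like. Only the names `EventInterface`, `NpuEvent`, and `EventFactory` come from the comment above; the method set (`record`, `synchronize`, `query`), the `DeviceKind` enum, and `HostEvent` are assumptions made for illustration. `HostEvent` is a CPU stand-in so the sketch builds without the Ascend toolkit; a real `NpuEvent` would wrap an `aclrtEvent` and is deliberately left out.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <stdexcept>

// Hypothetical device selector; the real project may already have one.
enum class DeviceKind { kHost, kNpu };

// Platform-neutral event interface (method names are assumptions).
class EventInterface {
 public:
  virtual ~EventInterface() = default;
  virtual void record() = 0;       // mark the event as reached
  virtual void synchronize() = 0;  // block until the event is recorded
  virtual bool query() = 0;        // non-blocking completion check
};

// CPU stand-in used only to keep this sketch self-contained.
class HostEvent final : public EventInterface {
 public:
  void record() override {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_all();
  }

  void synchronize() override {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return done_; });
  }

  bool query() override {
    std::lock_guard<std::mutex> lock(mu_);
    return done_;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool done_ = false;
};

class EventFactory {
 public:
  static std::unique_ptr<EventInterface> create(DeviceKind kind) {
    switch (kind) {
      case DeviceKind::kHost:
        return std::make_unique<HostEvent>();
      case DeviceKind::kNpu:
        // An NpuEvent owning an aclrtEvent would be returned here;
        // it is omitted so the sketch has no Ascend dependency.
        throw std::runtime_error("NpuEvent is not built in this sketch");
    }
    throw std::invalid_argument("unknown DeviceKind");
  }
};
```

Under this design the member in lazy_layer_loader.h could become `std::vector<std::unique_ptr<EventInterface>> events_;`, keeping the Ascend-specific type inside core/platform.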

feat: support loading model weights and forward overlap.

8c0ada7

liujinguang0125 reviewed Nov 26, 2025

liujinguang0125 self-requested a review November 26, 2025 14:42
