[feature] support no master weights option for low level zero plugin #4816

KKZ20 · 2023-09-27T06:24:10Z

📌 Checklist before creating the PR

I have created an issue for this PR for traceability
The title follows the standard format: [doc/gemini/tensor/...]: A concise description
I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?


Add an argument that controls whether to store an fp32 master copy of the weights. Test results on the BERT model are as follows:
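To illustrate the trade-off behind such an option, here is a minimal sketch using only the Python standard library. It is not the plugin's code: the `master_weights` flag name, the `train` helper, and the step size are assumptions made for this example, and a Python float (double precision) stands in for the fp32 master copy. It shows that very small updates can be rounded away when they are applied directly to a half-precision weight, whereas a higher-precision master copy accumulates them.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def train(steps: int, update: float, master_weights: bool) -> float:
    """Apply `update` to a weight `steps` times and return the fp16 working weight.

    `master_weights` is a hypothetical flag for this sketch only.
    """
    working = to_fp16(1.0)  # half-precision weight used in forward/backward
    master = 1.0            # higher-precision copy (stand-in for fp32)
    for _ in range(steps):
        if master_weights:
            # Accumulate in high precision, then cast down to the working weight.
            master += update
            working = to_fp16(master)
        else:
            # Update the half-precision weight directly; no separate copy is kept.
            working = to_fp16(working + update)
    return working


with_master = train(steps=100, update=1e-4, master_weights=True)
without_master = train(steps=100, update=1e-4, master_weights=False)
print(with_master, without_master)
```

Near 1.0 the spacing between adjacent fp16 values is 2^-10, so an update of 1e-4 rounds back to the same value when no master copy is kept. Skipping the master copy saves the memory of the extra copy and the casts between the two, at the cost of this precision.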

💥 Checklist before requesting a review

I have linked my PR to an issue (instruction)
My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
I have performed a self-review of my code
I have added thorough tests.
I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

🌝 Yes, I do.
🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.


RelxOff

👍


github-actions · 2023-10-13T07:39:08Z

The code coverage for the changed files is 90%.

Name                                                  Stmts   Miss  Cover
-------------------------------------------------------------------------
colossalai/booster/plugin/hybrid_parallel_plugin.py     366     41    89%
colossalai/booster/plugin/low_level_zero_plugin.py      144     11    92%
colossalai/zero/low_level/low_level_optim.py            365     32    91%
-------------------------------------------------------------------------
TOTAL                                                   875     84    90%

…pcaitech#4816)

* [feature] support no master weights for low level zero plugin
* [feature] support no master weights for low level zero plugin, remove data copy when no master weights
* remove data copy and typecasting when no master weights
* not load weights to cpu when using no master weights
* fix grad: use fp16 grad when no master weights
* only do not update working param when no master weights
* fix: only do not update working param when no master weights
* fix: passing params in dict format in hybrid plugin
* fix: remove extra params (tp_process_group) in hybrid_parallel_plugin
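The commit messages above describe which work is skipped when no master weights are kept: the data copy, the typecasting, and the write-back to the working parameters. The control flow can be sketched as follows. This is a toy model under stated assumptions, not the code in `low_level_optim.py`: the class `ToyZeroStep`, its attributes, and the plain SGD update are all hypothetical, and Python floats stand in for fp32.

```python
from __future__ import annotations

import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


class ToyZeroStep:
    """Hypothetical optimizer step with and without a master copy of the weights."""

    def __init__(self, weights: list[float], master_weights: bool = True):
        self.working = [to_fp16(w) for w in weights]
        # Without master weights there is no second copy at all.
        self.master = list(weights) if master_weights else None
        self.copies = 0  # counts master -> working copy-backs

    def step(self, fp16_grads: list[float], lr: float) -> None:
        if self.master is not None:
            for i, g in enumerate(fp16_grads):
                # Cast the gradient up, update the master copy, copy back down.
                self.master[i] -= lr * float(g)
                self.working[i] = to_fp16(self.master[i])
                self.copies += 1
        else:
            for i, g in enumerate(fp16_grads):
                # Use the fp16 gradient on the working weight; no copy-back step.
                self.working[i] = to_fp16(self.working[i] - lr * g)


with_master = ToyZeroStep([0.5, -0.25], master_weights=True)
with_master.step([0.5, 0.25], lr=0.5)

without_master = ToyZeroStep([0.5, -0.25], master_weights=False)
without_master.step([0.5, 0.25], lr=0.5)

print(with_master.working, with_master.copies)
print(without_master.working, without_master.copies)
```

The values here are exactly representable in fp16, so both paths reach the same weights; the difference is only in how many copies and casts are performed.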

[feature] support no master weights for low level zero plugin

758d911

ver217 reviewed Sep 27, 2023

colossalai/zero/low_level/low_level_optim.py (outdated, resolved)

Zhongkai Zhao added 2 commits September 27, 2023 15:21

[feature] support no master weights for low level zero plugin, remove data copy when no master weights

7d0ecfb

remove data copy and typecasting when no master weights

c4cde46

RelxOff approved these changes Sep 27, 2023

Zhongkai Zhao and others added 4 commits September 27, 2023 17:56

not load weights to cpu when using no master weights

3c414ec

fix grad: use fp16 grad when no master weights

ecaadd7

fix code complexity

14b4be0

retry fix code complexity

ec341f2

ver217 reviewed Oct 11, 2023

colossalai/zero/low_level/low_level_optim.py (resolved)

do not update working param

53a5609

ver217 reviewed Oct 12, 2023

colossalai/zero/low_level/low_level_optim.py (outdated, resolved)

Zhongkai Zhao added 2 commits October 12, 2023 13:27

only do not update working param when no master weights

90b2426

fix: only do not update working param when no master weights

023c13e

ver217 approved these changes Oct 12, 2023

Fridge003 approved these changes Oct 12, 2023

Zhongkai Zhao and others added 4 commits October 12, 2023 17:16

fix: passing params in dict format in hybrid plugin

3c82abd

Merge branch 'main' into feature/no_master_weights_llz

35cacfc

fix: remove extra params (tp_process_group) in hybrid_parallel_plugin

0172e73

add a comment

7a26ae1

KKZ20 merged commit a0684e7 into hpcaitech:main Oct 13, 2023
