@emlin emlin commented Sep 2, 2025

Summary:
Since zch v.Next uses a very large virtual table size (2^50), the range of the default uniform init becomes very small, and when the weight dtype is half, the sampled values underflow to 0.
We have observed in the debug log that the initial weight values are all 0: https://fburl.com/mlhub/aea9mbzf
{F1981621246}
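
The underflow described above can be reproduced with a small standard-library sketch. It assumes the default uniform bound is `sqrt(1 / num_embeddings)`, which is not stated in this PR and should be checked against the actual config. The helper names (`to_half`, `default_uniform_bound`, `sample_half_weights`) are hypothetical and exist only for illustration. Under that assumption, a table with 2^50 rows gets a bound of 2^-25, which is below the smallest positive half-precision value (2^-24), so every sample rounds to 0.

```python
import math
import random
import struct


def to_half(x: float) -> float:
    """Round a Python float to IEEE 754 binary16 (half) and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def default_uniform_bound(num_embeddings: int) -> float:
    # Assumption (not confirmed by this PR): weights are drawn from
    # U(-b, b) with b = sqrt(1 / num_embeddings).
    return math.sqrt(1.0 / num_embeddings)


def sample_half_weights(num_embeddings: int, n: int = 1000, seed: int = 0) -> list:
    """Draw n init values for the given table size and round each to half."""
    rng = random.Random(seed)
    bound = default_uniform_bound(num_embeddings)
    return [to_half(rng.uniform(-bound, bound)) for _ in range(n)]


SMALLEST_POSITIVE_HALF = 2.0 ** -24  # smallest subnormal in binary16

# Virtual table size from the summary: 2^50 rows.
huge_bound = default_uniform_bound(2 ** 50)  # equals 2^-25
huge_weights = sample_half_weights(2 ** 50)
huge_all_zero = all(w == 0.0 for w in huge_weights)

# A conventionally sized table for comparison: 2^20 rows, bound 2^-10.
small_weights = sample_half_weights(2 ** 20)
small_all_zero = all(w == 0.0 for w in small_weights)

print("bound for 2^50 rows:", huge_bound)
print("smallest positive half:", SMALLEST_POSITIVE_HALF)
print("all zero (2^50 rows):", huge_all_zero)
print("all zero (2^20 rows):", small_all_zero)
```

Any fix therefore has to set an init range that does not depend on the virtual row count, or keep the init in a wider dtype. This sketch only demonstrates the failure mode and does not show how this PR addresses it.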

Differential Revision: D81296621

@meta-cla meta-cla bot added the CLA Signed label Sep 2, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D81296621

emlin added a commit to emlin/torchrec that referenced this pull request Sep 4, 2025
Reviewed By: kathyxuyy

Differential Revision: D81296621