Hi, thank you so much for releasing this wonderful code!
I noticed that in examples/pretrain_llama_7b.sh the dtype is set to fp32, which seems to make the activations fp32 as well. Isn't it more common to keep activations in bf16? I also noticed that param_dtype seems to always be set to fp32.
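To make the question concrete, here is a minimal JAX sketch of the setup I have in mind. It is not code from this repo: the `dense` helper, the shapes, and the parameter names are hypothetical and only illustrate keeping parameters in fp32 while computing activations in bf16.

```python
import jax
import jax.numpy as jnp


def dense(params, x, dtype):
    """Hypothetical dense layer: fp32 parameters, computation in `dtype`."""
    # Cast the stored fp32 weights and the input to the compute dtype.
    w = params["w"].astype(dtype)
    b = params["b"].astype(dtype)
    return jnp.dot(x.astype(dtype), w) + b


key = jax.random.PRNGKey(0)
# Parameters are created and stored in fp32 (the "param_dtype" role).
params = {
    "w": jax.random.normal(key, (4, 8), dtype=jnp.float32),
    "b": jnp.zeros((8,), dtype=jnp.float32),
}
x = jnp.ones((2, 4), dtype=jnp.float32)

y_fp32 = dense(params, x, jnp.float32)   # activations in fp32
y_bf16 = dense(params, x, jnp.bfloat16)  # activations in bf16

print(params["w"].dtype, y_fp32.dtype, y_bf16.dtype)
```

In this sketch the stored parameters stay fp32 in both calls; only the activation dtype changes, which is the combination I would have expected the example script to use.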
Could you please elaborate a bit on the choice of fp32 for both settings? Thank you very much!