
Conversation

@vince62s
Member

For benchmarking and better documentation of performance, I am adding back Apex.amp.

As a reminder, we have:
For adam:
- if fp32, it will use the native fp32 Adam training.
- if fp16, it will use the "amp" library included in PyTorch.
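For reference, the fp16 path for adam corresponds to PyTorch's native automatic mixed precision. Below is a minimal sketch of that pattern, assuming a toy model and random data rather than OpenNMT-py's actual training loop:

```python
# Minimal sketch of the native PyTorch AMP pattern (adam + fp16).
# The model and data are toy placeholders, not OpenNMT-py internals.
import torch

model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    x = torch.randn(8, 512, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # forward/loss computed in mixed precision
        loss = model(x).float().pow(2).mean()
    scaler.scale(loss).backward()        # backward on the scaled loss
    scaler.step(optimizer)               # unscale gradients, then optimizer step
    scaler.update()                      # adjust the loss scale for the next step
```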

For fusedadam:
- if fp16 or fp32 and "opt.apex_opt_level" is NOT set, it will use the old legacy FusedAdam algorithm from the original Apex implementation: https://github.com/NVIDIA/apex/blob/master/apex/contrib/optimizers/fp16_optimizer.py
- if "opt.apex_opt_level" is set to:
  - "O0": training will be done in fp32 regardless of the opt.model_dtype flag.
  - "O1" or "O2": training will follow these parameters: https://github.com/NVIDIA/apex/blob/master/docs/source/amp.rst
  - "O3": does not work with fusedadam, so do not use it.

vince62s merged commit 4cb2e0c into OpenNMT:v3.0 on Nov 23, 2022
vince62s added a commit that referenced this pull request on Nov 23, 2022
vince62s deleted the apexamp branch on November 23, 2022 at 07:19