Transformers v4.12.0 compatible #107
Conversation
feihugis left a comment
@JulianneKnott Thanks for updating the HF models! I did a pass and left some minor comments. Did you check whether all of the changed code is exercised during inference?
tests/optimizer/transformers/data/expected_prophetnet_output_no_cache.hypo
#https://github.com/huggingface/transformers.git \
export BASELINE_REPO=$CACHE_DIR/transformers_v4.12.0
git_clone_if_not_in_cache \
    https://github.com/JiushengChen/transformers.git \
Some context on this forked repo: it adds the "no_repeat_ngram_size" parameter; see JiushengChen/Transformers@db97043.
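For illustration only (not part of this PR): a minimal sketch of passing no_repeat_ngram_size to generate() in a recent transformers release; the checkpoint name below is just an example.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # Example seq2seq checkpoint; any generation model accepts the same argument.
    tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

    inputs = tokenizer(
        "fastseq speeds up sequence generation for transformer models.",
        return_tensors="pt",
    )
    summary_ids = model.generate(
        **inputs,
        num_beams=4,
        no_repeat_ngram_size=3,  # block repeated 3-grams during beam search
        max_length=60,
    )
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))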
To clarify, this is no longer needed since fastseq uses its own run_eval_hf.py for the baseline?
is_greedy_gen_mode = (num_beams == 1) and (num_beam_groups == 1) and do_sample is False
is_sample_gen_mode = (num_beams == 1) and (num_beam_groups == 1) and do_sample is True
is_beam_gen_mode = (num_beams > 1) and (num_beam_groups == 1) and do_sample is False
is_beam_sample_gen_mode = (num_beams > 1) and (num_beam_groups == 1) and do_sample is True
is_group_beam_gen_mode = (num_beams > 1) and (num_beam_groups > 1)
Are our changes compatible with all these generation modes?
The beam search updates are only applied for is_beam_gen_mode. The model-specific updates (i.e., attention) are applied in all cases.
Does it also work with is_group_beam_gen_mode?
No, is_group_beam_gen_mode does not use the updates. Should that be added?
If the optimizations can also work for group_beam_gen_mode, we can add it here. Otherwise, if it will take some time to support, we can do it in another PR later.
I think it will take some time to make it work for both, so it's best if group_beam_gen_mode goes in another PR.
Sounds good.
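For illustration, a sketch of the dispatch agreed above (assumed helper logic, not fastseq's actual code): the accelerated beam search only covers plain beam search, while the other modes, including group beam search for now, keep the stock transformers behavior.

    def choose_generation_path(num_beams: int, num_beam_groups: int, do_sample: bool) -> str:
        # Mirrors the mode flags from transformers' generate() shown above.
        is_beam_gen_mode = (num_beams > 1) and (num_beam_groups == 1) and not do_sample
        is_group_beam_gen_mode = (num_beams > 1) and (num_beam_groups > 1)
        if is_beam_gen_mode:
            return "fastseq-optimized beam search"
        if is_group_beam_gen_mode:
            return "stock group beam search (optimization deferred to a later PR)"
        return "stock transformers generation"

    print(choose_generation_path(num_beams=4, num_beam_groups=1, do_sample=False))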
feihugis left a comment
Thanks @JulianneKnott for the revisions! The PR looks good to me!
Updating fastseq for compatibility with Hugging Face Transformers v4.12.0.
Benchmarks (samples/sec):