Generate: consistently handle special tokens as tensors #29788


Closed

gante wants to merge 16 commits into huggingface:main from gante:special_tokens_as_tensors

Member

gante commented Mar 21, 2024

What does this PR do?

To enable torch.compile with generate, some special token-related operations have to be rewritten into torch operations. That requires special tokens to be tensors instead of integers or a list of integers. (See #29374 for a working prototype)

This PR reworks special token usage in generate to consistently treat them as tensors, as opposed to e.g. keeping track of eos_token_id in both integer and tensor form.

👉 Review suggestion: start by reading _prepare_special_tokens and how it fits in generate.
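For readers who want a concrete picture before opening the diff, here is a minimal sketch of the kind of normalization described above. It is not the code from this PR: `to_token_tensor` and `prepare_special_tokens_sketch` are hypothetical names, and the only behavior taken from the thread is that special tokens end up as tensors and that EOS is always a 1D tensor.

```python
from typing import List, Optional, Union

import torch

TokenLike = Optional[Union[int, List[int], torch.Tensor]]


def to_token_tensor(token: TokenLike, device: Optional[torch.device] = None) -> Optional[torch.Tensor]:
    """Hypothetical helper: turn an int / list of ints / tensor into a long tensor (None stays None)."""
    if token is None:
        return None
    if isinstance(token, torch.Tensor):
        return token.to(device=device, dtype=torch.long)
    return torch.tensor(token, dtype=torch.long, device=device)


def prepare_special_tokens_sketch(bos_token_id=None, eos_token_id=None, pad_token_id=None, device=None):
    """Hypothetical stand-in for a single place where all special tokens are preprocessed."""
    bos = to_token_tensor(bos_token_id, device)
    eos = to_token_tensor(eos_token_id, device)
    pad = to_token_tensor(pad_token_id, device)
    # Keep EOS 1D so that a single id and several ids go through the same code path.
    if eos is not None and eos.ndim == 0:
        eos = eos.unsqueeze(0)
    return bos, eos, pad


bos, eos, pad = prepare_special_tokens_sketch(bos_token_id=1, eos_token_id=2)
print(bos, eos, pad)
```

With this shape guarantee, downstream code never needs to branch on "is EOS an int or a list?".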

Requirements before merging this PR:

merge Generate: fix logits processors doctests #29718 [Fixes doctests]

Tests run locally:

logits processors doctests (pytest --doctest-modules src/transformers/generation/logits_process.py -vv), needs requirement to be merged first
generate doctests (pytest --doctest-modules src/transformers/generation/utils.py -vv)
generate integration tests (RUN_SLOW=1 py.test tests/generation/test_utils.py -vv)
cache integration tests (RUN_SLOW=1 py.test tests/test_cache_utils.py -vv) -- same failures as in main
llama slow tests (RUN_SLOW=1 py.test tests/models/llama/test_modeling_llama.py -vv)
whisper slow tests (RUN_SLOW=1 py.test tests/models/whisper/test_modeling_whisper.py -vv) -- same failures as in main

HuggingFaceDocBuilderDev commented Mar 21, 2024

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

gante changed the title ~~Generate: special tokens as tensors~~ Generate: consistently handle special tokens as tensors

gante force-pushed the special_tokens_as_tensors branch from a746766 to 7863c5d Compare

March 25, 2024 12:03

gante mentioned this pull request

Pass device in Logits Processor's init #29804

Merged

gante marked this pull request as ready for review

March 25, 2024 16:30

gante requested review from ArthurZucker and zucchini-nlp

March 25, 2024 16:30

gante commented


src/transformers/generation/utils.py Outdated

Member Author

gante Mar 25, 2024

The logic of this function now lives in _prepare_special_tokens, which preprocesses all special tokens.

gante commented


src/transformers/generation/utils.py Outdated

Member Author

gante Mar 25, 2024

ALL preprocessing logic for the special tokens now resides in this function 🧹

gante commented


src/transformers/generation/utils.py Outdated

Member Author

gante Mar 25, 2024

kwargs_has_attention_mask is an optional argument so we can use this function in tests, to prepare special tokens.
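A rough sketch of why a tri-state flag can be convenient here. Everything in it is an assumption for illustration: the function name `default_pad_token_sketch`, the warning text, and the fallback of using the first EOS id as the pad token are not taken from this PR's diff. The point is only that `None` can mean "caller did not say" (for example, a unit test), so that only an explicit `False` produces a warning.

```python
import warnings
from typing import Optional

import torch


def default_pad_token_sketch(
    eos_token_id: Optional[torch.Tensor],
    pad_token_id: Optional[torch.Tensor] = None,
    kwargs_has_attention_mask: Optional[bool] = None,
) -> Optional[torch.Tensor]:
    """Hypothetical: pick a pad token when none is set, warning only if we know no mask was passed."""
    if pad_token_id is None and eos_token_id is not None:
        # None = unknown (e.g. called from a test), so stay silent; False = generate saw no mask.
        if kwargs_has_attention_mask is not None and not kwargs_has_attention_mask:
            warnings.warn("No attention mask was passed and pad_token_id is unset (illustrative message).")
        # Assumed fallback for this sketch: reuse the first EOS id as the pad token.
        pad_token_id = eos_token_id[0]
    return pad_token_id


# Called the way a test might call it: no flag, no warning.
print(default_pad_token_sketch(torch.tensor([2, 3])))
```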

gante commented


src/transformers/generation/utils.py Outdated

Member Author

gante Mar 25, 2024

The decoding functions are backward compatible (for now), and we can still pass int/list(int) as special tokens.

The doctests in generate test this.
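To illustrate what "backward compatible" can look like at a function boundary, here is a small sketch with a hypothetical function, `contains_eos`, that is not part of transformers. It accepts the legacy int / list(int) forms as well as a tensor and converts once at the top, so the rest of the body only deals with tensors.

```python
from typing import List, Union

import torch


def contains_eos(input_ids: torch.Tensor, eos_token_id: Union[int, List[int], torch.Tensor]) -> bool:
    """Hypothetical example: True if any position in input_ids holds one of the EOS ids."""
    # Legacy inputs (int or list of ints) are converted once, at the entry point.
    if isinstance(eos_token_id, int):
        eos_token_id = [eos_token_id]
    if not isinstance(eos_token_id, torch.Tensor):
        eos_token_id = torch.tensor(eos_token_id, dtype=torch.long, device=input_ids.device)
    # From here on, only the tensor form is used.
    return bool(torch.isin(input_ids, eos_token_id).any())


ids = torch.tensor([[5, 6, 2]])
# All three call styles are accepted and agree with each other.
print(contains_eos(ids, 2), contains_eos(ids, [2, 9]), contains_eos(ids, torch.tensor([2])))
```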

gante commented


src/transformers/models/musicgen/modeling_musicgen.py Outdated

Member Author

gante Mar 25, 2024

Musicgen (and its melody variant) have their own custom generate, relying on this method.

I've intentionally not updated this custom generate, to pressure us into moving towards a single generate function.

gante commented


src/transformers/generation/utils.py Outdated

Member Author

gante Mar 25, 2024

The logic rewritten in functions like this is torch.compile(..., fullgraph=True) compatible 😉
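As an illustration of the style of rewrite meant here, the sketch below updates a per-sequence "unfinished" flag using tensor operations only, with no Python `in` checks or `.item()` calls that would make control flow depend on tensor values. The function name and shapes are assumptions for this example, not the code in the diff, and whether a given function actually compiles with `fullgraph=True` depends on the PyTorch version, so treat it as illustrative.

```python
import torch


def update_unfinished_sequences(
    input_ids: torch.Tensor,             # (batch, seq_len), long
    eos_token_id: torch.Tensor,          # (num_eos,), long, always 1D
    unfinished_sequences: torch.Tensor,  # (batch,), long, 1 = still generating
) -> torch.Tensor:
    """Hypothetical: mark a sequence as finished once its last token is any EOS id."""
    # Tensor-only membership test instead of `last_token in eos_list`.
    is_done = torch.isin(input_ids[:, -1], eos_token_id)
    # Multiply by 0 where the sequence just finished, by 1 otherwise.
    return unfinished_sequences * (~is_done).to(unfinished_sequences.dtype)


input_ids = torch.tensor([[4, 2], [4, 5], [4, 3]])
eos = torch.tensor([2, 3])
unfinished = torch.ones(3, dtype=torch.long)
print(update_unfinished_sequences(input_ids, eos, unfinished))
```

Because EOS is a 1D tensor, the same line handles one EOS id or several.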

zucchini-nlp approved these changes


Member

zucchini-nlp left a comment

Thanks for working on this 😄

src/transformers/generation/utils.py Outdated

src/transformers/generation/utils.py Outdated

src/transformers/generation/logits_process.py Outdated

gante and others added 15 commits

March 29, 2024 12:18


cd110d1 tmp commit
fe2c907 variable name
83c4730 fix _prepare_decoder_input_ids_for_generation
74461b6 eos is always 1D tensor
835a19d default attn mask with input embeds
1c3aaea clone tensor
1f73173 last test fixed?
f347ad3 logits processors with eos token as tensor input
c8bff34 special tokens as tensors in the decoding functions
1a45607 eos_token_id is optional in NoBadWordsLogitsProcessor
013e02d fix last test?
a01c798 plan B: update the attributes in generation_config
29dd408 forgot the most important part in the previous commit :D
2ff9f39 PR suggestions
dd5bf8c Update src/transformers/generation/utils.py
        Co-authored-by: Raushan Turganbay <[email protected]>

gante force-pushed the special_tokens_as_tensors branch from 7537207 to dd5bf8c Compare

March 29, 2024 13:06


48c2c45 stopping criteria with tensor input

Member Author

gante commented Mar 29, 2024

Let's merge #29956 first, so the diff here becomes much smaller (the EOS-as-stopping-criteria change made this diff more elaborate).

(Arthur: please don't review this one until that is merged. I'll ping you again.)

gante removed the request for review from ArthurZucker

March 29, 2024 17:47

Contributor

github-actions bot commented Apr 23, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot closed this

gante mentioned this pull request

Generate: consistently handle special tokens as tensors #30624

Merged

6 tasks

