Use past_key_values to speed up multi-token target LLM Attribution #1224

vivekmig · 2023-12-17T22:29:15Z

By default, generation in transformers uses past_key_values to cache the keys and values computed for earlier tokens, which speeds up the forward passes for subsequent tokens. This PR adds a flag and uses the corresponding helpers from the transformers generation utils so that LLM attribution with multi-token targets follows the same caching approach.
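To illustrate why caching keys and values helps, here is a minimal pure-Python sketch of a single attention head. It is a toy under stated assumptions, not the Captum or transformers implementation: the projection matrices, the token vectors, and the helper names (`forward_full`, `forward_cached`) are all hypothetical and chosen only for illustration. It shows that processing one new token against a cache gives the same output as recomputing the keys and values for the whole prefix.

```python
import math

# Toy 2x2 projection matrices (hypothetical values, for illustration only).
W_Q = [[0.5, -0.2], [0.1, 0.3]]
W_K = [[0.4, 0.1], [-0.3, 0.2]]
W_V = [[1.0, 0.5], [-0.5, 1.0]]


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def attend(query, keys, values):
    """Scaled dot-product attention of one query over all keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w / total * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def forward_full(xs):
    """No cache: re-project every token in the prefix, return the last position's output."""
    keys = [matvec(W_K, x) for x in xs]
    values = [matvec(W_V, x) for x in xs]
    return attend(matvec(W_Q, xs[-1]), keys, values)


def forward_cached(x_new, cache):
    """With cache: project only the new token and reuse the stored keys/values."""
    cache["keys"].append(matvec(W_K, x_new))
    cache["values"].append(matvec(W_V, x_new))
    return attend(matvec(W_Q, x_new), cache["keys"], cache["values"])


# Hypothetical token embeddings standing in for a prompt plus target tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 2.0]]

# Without caching, step t re-projects all t + 1 tokens of the prefix.
full_outputs = [forward_full(tokens[: t + 1]) for t in range(len(tokens))]

# With caching, each step projects exactly one new token.
cache = {"keys": [], "values": []}
cached_outputs = [forward_cached(x, cache) for x in tokens]

max_diff = max(
    abs(a - b)
    for full, cached in zip(full_outputs, cached_outputs)
    for a, b in zip(full, cached)
)
print("largest difference between full and cached outputs:", max_diff)
```

In this toy, the uncached path performs a number of key/value projections that grows quadratically with sequence length, while the cached path performs one per token. That is the saving the description refers to when scoring each token of a multi-token target in turn.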

facebook-github-bot · 2023-12-17T22:30:12Z

@vivekmig has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-12-19T16:47:03Z

@vivekmig merged this pull request in fefcb5b.

vivekmig added 2 commits December 16, 2023 23:45

Alt forward option

2341c29

Fixes

4e23784

facebook-github-bot added the cla signed label Dec 17, 2023

Merge branch 'master' of github.com:pytorch/captum into switch_genera…

799b716

…te_llm

facebook-github-bot closed this in fefcb5b Dec 19, 2023

facebook-github-bot added the Merged label Dec 19, 2023
