Closed
Description
Feature request
It would be great to have static cache support for Whisper so that generation can be sped up with torch.compile. Currently, generate() does not support cache_implementation="static" for Whisper.
Motivation
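A static cache combined with torch.compile can make generation much faster: the key/value cache tensors keep a fixed shape, so the compiled graph does not have to be recompiled as the generated sequence grows.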
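To illustrate why a fixed-shape cache suits torch.compile, here is a minimal sketch in plain PyTorch. It is not the transformers implementation: the class names `DynamicKVCache` and `StaticKVCache`, the tensor sizes, and the keys-only layout are assumptions made for illustration.

```python
import torch


class DynamicKVCache:
    """Hypothetical cache that grows by concatenation (shape changes every step)."""

    def __init__(self):
        self.keys = None

    def update(self, new_keys):
        # new_keys: (batch, heads, 1, head_dim)
        if self.keys is None:
            self.keys = new_keys
        else:
            self.keys = torch.cat([self.keys, new_keys], dim=2)
        return self.keys


class StaticKVCache:
    """Hypothetical cache preallocated to max_len (shape never changes)."""

    def __init__(self, batch, heads, max_len, head_dim):
        self.keys = torch.zeros(batch, heads, max_len, head_dim)

    def update(self, new_keys, position):
        # position: 1-D LongTensor holding the slot to overwrite in place
        self.keys.index_copy_(2, position, new_keys)
        return self.keys


# Illustrative sizes only: batch=1, heads=2, max_len=8, head_dim=4
dynamic = DynamicKVCache()
static = StaticKVCache(batch=1, heads=2, max_len=8, head_dim=4)

dynamic_shapes, static_shapes = [], []
for step in range(3):
    new_keys = torch.randn(1, 2, 1, 4)
    dynamic_shapes.append(tuple(dynamic.update(new_keys).shape))
    static_shapes.append(tuple(static.update(new_keys, torch.tensor([step])).shape))

print(dynamic_shapes)  # sequence dimension grows: 1, 2, 3
print(static_shapes)   # sequence dimension stays at max_len
```

The dynamic variant hands the model a differently shaped tensor at every decoding step, while the static variant always exposes the same shape and only its contents change, which is the property a compiled graph can rely on.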
Your contribution
Static cache is already supported for LLMs, where we see a significant speed-up.