
Conversation

@gante
Contributor

@gante gante commented Apr 9, 2025

What does this PR do?

WIP, see this comment


Uses deprecate_kwarg to rename max_batch_size to batch_size in all compilable caches. max_batch_size is a misleading argument name: it implies that batch sizes smaller than max_batch_size can also use the cache, which is not the case.
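For illustration, a minimal re-implementation of a deprecate_kwarg-style decorator (a sketch of the pattern only; the actual transformers helper lives in transformers.utils.deprecation and takes extra options such as a removal version):

```python
import functools
import warnings


def deprecate_kwarg(old_name, new_name):
    """Sketch: forward a deprecated keyword argument to its new name,
    emitting a FutureWarning when the old name is used."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                warnings.warn(
                    f"`{old_name}` is deprecated, use `{new_name}` instead.",
                    FutureWarning,
                )
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


class StaticCache:  # toy stand-in for a compilable cache class
    @deprecate_kwarg("max_batch_size", new_name="batch_size")
    def __init__(self, batch_size=None):
        self.batch_size = batch_size


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cache = StaticCache(max_batch_size=4)  # old name still accepted, but warns
```

Callers passing the old keyword keep working during the deprecation window while being nudged toward the new name.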


Note that this deprecation was started before, but we messed it up along the way:

  1. Deprecation process started: Cache: use batch_size instead of max_batch_size #32657
  2. Deprecation message got changed to the opposite of the goal: Offloaded cache: fix generate #34921
  3. User-contributed PR that respected the (modified) deprecation message: Remove deprecated batch_size parameter #37007

@github-actions github-actions bot marked this pull request as draft April 9, 2025 10:04
@github-actions
Contributor

github-actions bot commented Apr 9, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@gante gante marked this pull request as ready for review April 9, 2025 10:04
@gante gante requested review from zucchini-nlp and removed request for Rocketknight1 April 9, 2025 10:06
Member

@zucchini-nlp zucchini-nlp left a comment


Oh wow, I even forgot we were deprecating the other way. Thanks for digging into it!

Since we're talking about batch sizes, I remember this issue (#35444) where a user wanted to contribute an actual max_batch_size (especially for encoder-decoder model cases). Similar to the sequence length, unused batch entries would be all zeros. I think the feature is nice to have, but I can also see it breaking users who manipulate cache._key_cache directly. WDYT, is it worth supporting?
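A hypothetical sketch of that idea (class and method names are mine, not the transformers API): allocate the cache for max_batch_size, let smaller inputs write only the leading batch rows, and leave the unused rows as zeros, mirroring how unused sequence positions stay zero in a static cache:

```python
import numpy as np


class PaddedBatchCache:
    """Sketch: a static KV cache preallocated for max_batch_size that
    accepts inputs with any batch size up to that maximum."""

    def __init__(self, max_batch_size, num_heads, max_seq_len, head_dim):
        shape = (max_batch_size, num_heads, max_seq_len, head_dim)
        self.key_cache = np.zeros(shape, dtype=np.float32)
        self.value_cache = np.zeros(shape, dtype=np.float32)

    def update(self, key_states, value_states, cache_position):
        # key_states: (batch, num_heads, new_tokens, head_dim); batch may
        # be smaller than max_batch_size -- write only the leading rows.
        bsz = key_states.shape[0]
        self.key_cache[:bsz, :, cache_position, :] = key_states
        self.value_cache[:bsz, :, cache_position, :] = value_states
        # Return only the rows belonging to the current batch; trailing
        # rows remain zero-padded.
        return self.key_cache[:bsz], self.value_cache[:bsz]


cache = PaddedBatchCache(max_batch_size=4, num_heads=2, max_seq_len=8, head_dim=3)
k = np.ones((2, 2, 1, 3), dtype=np.float32)  # input batch of 2 < max of 4
keys, values = cache.update(k, k, cache_position=np.array([0]))
```

The maintenance concern above is visible even in this sketch: any user code that indexes the underlying buffers directly would now see padded zero rows it did not ask for.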

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante
Contributor Author

gante commented Apr 9, 2025

@zucchini-nlp I see, the export use case makes sense: build the cache with a batch size larger than the input batch size (export once, reuse with any batch size).

It's feasible, the questions are a) code complexity; b) throughput. I'm going to give it a go 🤞

@gante gante removed the request for review from ArthurZucker April 9, 2025 10:40
@zucchini-nlp
Member

Yeah, I am also concerned that it will add too much complexity. Thanks, it will be cool if it's doable with minimal maintenance cost :)

@gante
Contributor Author

gante commented Apr 9, 2025

It's actually quite clean and doesn't seem to have throughput disadvantages 👀 closing this PR in favor of expanding capabilities

