
Conversation

dependabot bot (Contributor) commented on behalf of github on Nov 25, 2025

Bumps transformers from 4.55.2 to 4.57.3.

Release notes

Sourced from transformers' releases.

Patch release v4.57.3

This patch fixes a hidden bug that occurred when loading models with local_files_only=True, as well as a typo related to the recent patch.

The main fix is: huggingface/transformers@b605555.

We are really sorry that this slipped through; our CI simply did not catch it.

As it affects a lot of users, we are going to yank the previous release.
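
For context, the bug surfaced on the offline loading path. Here is a minimal sketch of that path, using an illustrative checkpoint name and assuming the weights are already present in the local Hugging Face cache:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# local_files_only=True loads strictly from the local cache and makes no
# network calls; the checkpoint name below is illustrative.
model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(model_id, local_files_only=True)
```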

Patch Release v4.57.2

This patch most notably fixes an issue with some Mistral tokenizers (a usage sketch follows the commit list). It contains the following commits:

  • Add AutoTokenizer mapping for mistral3 and ministral (#42198)
  • Auto convert tekken.json (#42299)
  • fix tekken pattern matching (#42363)
  • Check model inputs - hidden states (#40994)
  • Remove invalid @staticmethod from module-level get_device_and_memory_breakdown (#41747)
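
As a rough illustration of what the AutoTokenizer mapping fix enables (the checkpoint name is an assumption for illustration, not taken from the release notes):

```python
from transformers import AutoTokenizer

# With the mistral3/ministral mapping in place, AutoTokenizer can resolve
# these checkpoints directly; tekken.json tokenizers are converted on load.
tok = AutoTokenizer.from_pretrained("mistralai/Ministral-8B-Instruct-2410")
print(tok("Hello, world!").input_ids)
```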

Patch release v4.57.1

This patch most notably fixes an issue with an optional dependency (optax), which resulted in parsing errors with poetry. It contains the following fixes:

v4.57.0: Qwen3-Next, Vault Gemma, Qwen3 VL, LongCat Flash, Flex OLMO, LFM2 VL, BLT, Qwen3 OMNI MoE, Parakeet, EdgeTAM, OLMO3

New model additions

Qwen3 Next

The Qwen3-Next series represents the Qwen team's next-generation foundation models, optimized for extreme context length and large-scale parameter efficiency. The series introduces a suite of architectural innovations designed to maximize performance while minimizing computational cost:

  • Hybrid Attention: Replaces standard attention with the combination of Gated DeltaNet and Gated Attention, enabling efficient context modeling.
  • High-Sparsity MoE: Achieves an extremely low activation ratio of 1:50 in MoE layers, drastically reducing FLOPs per token while preserving model capacity.
  • Multi-Token Prediction (MTP): Boosts pretraining performance and accelerates inference.
  • Other Optimizations: Includes techniques such as zero-centered and weight-decayed layernorm, Gated Attention, and other stabilizing enhancements for robust training.

Built on this architecture, the Qwen team trained and open-sourced Qwen3-Next-80B-A3B: 80B total parameters with only 3B active, achieving extreme sparsity and efficiency.

Despite its ultra-efficiency, it outperforms Qwen3-32B on downstream tasks while requiring less than 1/10 of the training cost. Moreover, it delivers over 10x higher inference throughput than Qwen3-32B when handling contexts longer than 32K tokens.

For more details, please see the Qwen3-Next blog post.
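
As a minimal usage sketch (the checkpoint id and generation settings are assumptions, not taken from the release notes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Qwen3-Next support landed in v4.57.0; the checkpoint id is illustrative.
model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" shards the model across available devices (needs accelerate).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Briefly explain hybrid attention."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```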

... (truncated)


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.55.2 to 4.57.3.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.55.2...v4.57.3)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 4.57.3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot bot added the dependencies and python labels on Nov 25, 2025
meta-cla bot added the cla signed label on Nov 25, 2025
meta-codesync bot commented on Dec 1, 2025

@huydhn has imported this pull request. If you are a Meta employee, you can view this in D88104417.

meta-codesync bot closed this in 49758ea on Dec 2, 2025
dependabot bot (Contributor, Author) commented on behalf of github on Dec 2, 2025

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. You can also ignore all major, minor, or patch releases for a dependency by adding an ignore condition with the desired update_types to your config file.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.

meta-codesync bot commented on Dec 2, 2025

@huydhn merged this pull request in 49758ea.

dependabot bot deleted the dependabot/pip/main/transformers-4.57.3 branch on December 2, 2025 at 00:40