Commit 6316a9e
Fix MoE for V5 (#42456)
* remove zero_like + scatter
* fix mixtral moe
* fix other moe models as well
* fix ci
* fix modular mixtral
* fix qwen2_moe + qwen3_next
* fix device mismatch for qwen3_vl_moe to pass tests
* fix modular mixtral
* fix other models
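The "remove zero_like + scatter" fix above refers, in general terms, to dispatching tokens to MoE experts by direct indexing instead of building and scattering a dense one-hot expert mask. A minimal sketch of that routing pattern, assuming a standard top-k softmax router (the function name `moe_forward` and shapes are illustrative, not the actual transformers code):

```python
import torch

def moe_forward(hidden, experts, router_logits, top_k=2):
    # hidden: (num_tokens, dim); router_logits: (num_tokens, num_experts)
    routing = torch.softmax(router_logits, dim=-1)
    weights, selected = torch.topk(routing, top_k, dim=-1)  # (num_tokens, top_k)
    weights = weights / weights.sum(dim=-1, keepdim=True)
    out = torch.zeros_like(hidden)
    # Instead of scattering a one-hot expert mask, find which tokens each
    # expert serves and gather them directly.
    for expert_idx, expert in enumerate(experts):
        token_idx, k_idx = torch.where(selected == expert_idx)
        if token_idx.numel() == 0:
            continue
        expert_out = expert(hidden[token_idx]) * weights[token_idx, k_idx, None]
        out.index_add_(0, token_idx, expert_out)
    return out
```

Skipping experts that receive no tokens also avoids launching empty kernels, which is where the per-model device-mismatch fixes in this list tend to come from.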
* rm slow tokenizers (#40936)
* fixes missed
* gemma test fix
* refactor
* rm legacy from llama
* added renaming
* add _model
* update legacy
* update legacy
* fix docstring
* always load blank, then set _tokenizer if we have it
* new toks
* update all berttokenizer based models
* apply feedback - delete bert duplicates
* more models --> fast only
* more convert_slow models
* fix common test refs
* updating fast only tokenizers
* openai and pegasus
* enable sentencepiecebackend
* more models
* code gen
* t5
* code gen tests
* speecht5
* mbart
* mbart50
* more models
* more models
* layoutlmv2
* update tests
* update tests
* update tests
* pretrainedtokenizer
* whisper
* whisper
* layoutxlm and storing backends
* refactor sentencepiecebackend and additional_special_tokens
* renaming tokenization_utils --> tokenization_python
* update tests
* bert test
* blenderbot
* clip
* codegen
* code_llama
* cohere
* deberta, deberta v2, funnel
* gpt2
* batch update tests
* pegasus qwen2 roberta
* more models
* layout tests
* some renaming
* fix references to utils_fast
* fix refs
* fix refs
* fix refs
* fix refs
* fix refs
* fix refs
* fix refs
* fix some tests
* regression
* fix refs
* fix refs
* missed the most crucial file in my last commit
* fix refs
* fix refs
* fix refs
* batch encode fix
* fix some tests
* BC for batch_decode bc too many refs
* more tests
* fix more tests
* fix for processors
* fixing more models
* deleted mbart50 by accident
* seamless m4t
* albert fix
* whisper
* layout3
* attempt to fix cached tokenizers on CI
* trying another fix on CI
* again try to work around CI
* bertweet
* tapas
* mbart50
* luke
* mluke
* markuplm
* markuplm
* fix some more auto tests
* some random model failures
* mistralcommontester
* more fixes
* ref fix
* siglip
* marian
* plbart
* update utils toks
* seamless m4t
* roc bert
* update byt5 test
* xlm
* esm
* roformer
* code llama
* biogpt
* m2m100
* dpr and flaubert
* xlm and speech to text
* tok backend pass object
* tokenizer object pass
* wav2vec2
* wav2vec2
* cpmant
* update utils tokenizers
* cpmant
* bartpho
* test apply chat template assistant mask
* apply chat template video
* apply chat template assistant mask
* test torch
* update from slow in base and fix donut processor errors
* auto to point to tokenizers backend, fix kosmos2
* some non model fixes for old slow models that no longer have their own tokenizer file as they are the same as bert
* missed file from last commit
* idefics2
* fixup
* fixup
* pretrained tokenizer fast test update
* stash
* bad merged
* cherry pick more stuff that did not merge well
* fix gptsw3
* nit warn for now
* update error raising
* just ran fixup
* bring back bert legacy
* fix
* nit
* fix 56 errors on blenderbotsmall?
* 18 for blenderbotsmall
* tok auto
* missed clip
* fix tests
* something missed
* token healing
* tok common tests update - nonmodel
* try to fix non-model test in test_tokenization_utils
* fix hub tests
* try to fix hub tests
* custom vocab related fixed
* bert jap
* BERT JAP
* rename bert legacy to bert legacy
* Wav2vec2
* fix in tok python to update total vocab size - fixes speech t5
* blender bot small
* forgot test file
* test failures
* marian
* gpt2 tiktoken
* big bird / marian
* udop
* forgot couple changes
* test_serve fix
* missing import
* a couple processors fixes
* style partly
* fix to fetch tests ci
* Revert branch back to commit f5bc69e state
* revert branch to styling
* update mistral after merge
* fixes for non model tests
* some processor test fixes
* more processor test fixes
* more processor fixes
* hub tests
* python tok utils
* fix hub test
* make style for now
* remove problematic fix copies
* python utils/check_copies.py --fix_and_overwrite
* more styling
* fixup
* silence docstring
* fix import?
* fix imports
* add the local test as well
* throw spm error
* llamas
* fix a couple tests
* broke ci
* broke ci
* broke ci
* broke ci
* add logs to debug gemma on ci
* gemma and llama
* gemma
* revert last commit
* gemma debug
* gemma debug
* gemma
* safely import spiece backend
* tok tests
* check none
* setup and qual
* ruff
* del dev files
* tok auto
* fill docstrings
* update auto
* blenderbot small nit
* add migration guide
* move mixtral patch to `TokenizersBackend`, move `TokenizerExtractor`
* rename MistralCommonTokenizer to MistralCommonBackend
* nit
* fix failures
* fixup
* remove one old test
* mark the slow one as slow
* very small fixes
* update auto mapping for missing ones
* fixup lorsd
* fixup doc and stuff
* should be the final fix
* processing update
* update
* FIX or brute AI fix the llava test
* style
* slow?
* fix is offline mode?
* fix mt5
* One tok utils (#42462)
* consolidate python and utils tokenization files, they are copies
* ruff and ref
* Format
* fix cohere
* ?
* up
* am I dumb?
* grumble
---------
Co-authored-by: Arthur <[email protected]>
* [loading/saving] Reverse all loading operations when saving (#42396)
* first shot
* default to reversing
* oupso
* oupsi 2
* oupsi 3
* fix renamed kwargs
* fix timm_wrapper
* remove fix_state_dict methods
* can do it all the time, with __init__ as well
* doc
* oupsi
* fix
* create helper
* fix annoying annotation issue
* small fix
* small fixes
* alright commit all that already
* oupsi
* the fix
* update quantizers
* this works
* the hardcoded regex got me hard....
* style
* the final one
* cleanup a bit
* better
* style
* oupsi readded it
* do it inside the ops instead - no need for full names anymore
* reverse quantizers and simplify signatures
* small thingy
* add no_grad decorator
* utils to rename keys
* oupssii again
* add test
* simplify nicely
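The loading/saving PR above is built on one idea: every key-rename applied when loading a checkpoint must have an inverse applied when saving, so a round-trip reproduces the original keys. A hypothetical pure-Python sketch of that pattern (`rename_keys` and `invert_mapping` are illustrative names, not the actual transformers helpers):

```python
def rename_keys(state_dict, mapping):
    """Apply {old_prefix: new_prefix} renames and return the renamed dict."""
    renamed = {}
    for key, value in state_dict.items():
        for old, new in mapping.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        renamed[key] = value
    return renamed

def invert_mapping(mapping):
    # Saving applies the reverse of every load-time rename.
    return {new: old for old, new in mapping.items()}

mapping = {"model.": "transformer."}
loaded = rename_keys({"model.layers.0.w": 1}, mapping)
saved = rename_keys(loaded, invert_mapping(mapping))
```

Recording the forward mapping once and deriving the inverse, rather than hardcoding save-time regexes, is what makes the round-trip safe to apply "all the time".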
* Fix T5 tests: use generation_config for generation parameters (#42419)
* pass the generation parameters to generate()
* fix use_task_specific_params to separate model.config and model.generation_config params
* fix style
* some fixes
* remove redundant check
* update expectation for llama_7b_bf16 on rocm
* Update tests/models/llama/test_modeling_llama.py
Co-authored-by: Rémi Ouazan <[email protected]>
---------
Co-authored-by: Rémi Ouazan <[email protected]>
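The T5 test fix above separates task-specific parameters into those that belong on `model.generation_config` (sampling/beam options) versus `model.config`. A hypothetical sketch of that split (the key set and `split_task_params` helper are illustrative assumptions, not the actual `use_task_specific_params` code):

```python
# Generation-time knobs that belong on the generation config, not the model config.
GENERATION_KEYS = {
    "max_length", "min_length", "num_beams", "length_penalty",
    "early_stopping", "do_sample", "temperature", "top_k", "top_p",
}

def split_task_params(task_params):
    """Split a task params dict into (model config params, generation params)."""
    gen = {k: v for k, v in task_params.items() if k in GENERATION_KEYS}
    model = {k: v for k, v in task_params.items() if k not in GENERATION_KEYS}
    return model, gen

model_params, gen_params = split_task_params(
    {"num_beams": 4, "max_length": 200, "prefix": "summarize: "}
)
```

The generation params can then be passed to `generate()` via a `GenerationConfig` instead of being read implicitly from `model.config`, which is the deprecated path the tests were tripping over.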
* linting
* more fix to pass the CI tests
* fix lfm2 moe
* fix docstring
* fix docstring
* fix qwen like model
* fix flex olmo
* revert lfm2 moe config
* make fixup
* fix docstring
* fix conversion mapping
* fix inference of gpt-oss
* add some fixes to gpt-oss (but still not good)
* fix modular
* we need errors I think
* fix config issue
* this was fixed
---------
Co-authored-by: Ita Zaporozhets <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: BADAOUI Abdennacer <[email protected]>
Co-authored-by: Rémi Ouazan <[email protected]>

1 parent eb399a9 · commit 6316a9e
File tree (30 files changed, +177 −168 lines):

- src/transformers/models: deepseek_v2, deepseek_v3, dots1, flex_olmo, glm4_moe, glm4v_moe, gpt_oss, hunyuan_v1_moe, jamba, lfm2_moe, minimax, mixtral, olmoe, phimoe, qwen2_moe, qwen3_moe, qwen3_next, qwen3_omni_moe, qwen3_vl_moe, qwen3_vl
- tests/models: gpt_oss, olmoe
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
175 | 175 | | |
176 | 176 | | |
177 | 177 | | |
| 178 | + | |
| 179 | + | |
178 | 180 | | |
179 | 181 | | |
180 | 182 | | |
| |||
Lines changed: 4 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
61 | 61 | | |
62 | 62 | | |
63 | 63 | | |
64 | | - | |
65 | 64 | | |
66 | | - | |
| 65 | + | |
67 | 66 | | |
68 | 67 | | |
69 | 68 | | |
70 | 69 | | |
71 | 70 | | |
72 | | - | |
| 71 | + | |
73 | 72 | | |
74 | | - | |
| 73 | + | |
75 | 74 | | |
76 | 75 | | |
77 | 76 | | |
78 | 77 | | |
79 | | - | |
| 78 | + | |
80 | 79 | | |
81 | 80 | | |
82 | 81 | | |
| |||
Lines changed: 4 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
169 | 169 | | |
170 | 170 | | |
171 | 171 | | |
172 | | - | |
173 | 172 | | |
174 | | - | |
| 173 | + | |
175 | 174 | | |
176 | 175 | | |
177 | 176 | | |
178 | 177 | | |
179 | 178 | | |
180 | | - | |
| 179 | + | |
181 | 180 | | |
182 | | - | |
| 181 | + | |
183 | 182 | | |
184 | 183 | | |
185 | 184 | | |
186 | 185 | | |
187 | | - | |
| 186 | + | |
188 | 187 | | |
189 | 188 | | |
190 | 189 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
327 | 327 | | |
328 | 328 | | |
329 | 329 | | |
330 | | - | |
331 | 330 | | |
332 | | - | |
| 331 | + | |
333 | 332 | | |
334 | 333 | | |
335 | 334 | | |
336 | 335 | | |
337 | 336 | | |
338 | | - | |
| 337 | + | |
339 | 338 | | |
340 | | - | |
| 339 | + | |
341 | 340 | | |
342 | 341 | | |
343 | 342 | | |
344 | 343 | | |
345 | | - | |
| 344 | + | |
346 | 345 | | |
347 | 346 | | |
348 | 347 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
313 | 313 | | |
314 | 314 | | |
315 | 315 | | |
316 | | - | |
317 | 316 | | |
318 | | - | |
| 317 | + | |
319 | 318 | | |
320 | 319 | | |
321 | 320 | | |
322 | 321 | | |
323 | 322 | | |
324 | | - | |
| 323 | + | |
325 | 324 | | |
326 | | - | |
| 325 | + | |
327 | 326 | | |
328 | 327 | | |
329 | 328 | | |
330 | 329 | | |
331 | | - | |
| 330 | + | |
332 | 331 | | |
333 | 332 | | |
334 | 333 | | |
| |||
351 | 350 | | |
352 | 351 | | |
353 | 352 | | |
354 | | - | |
355 | | - | |
| 353 | + | |
| 354 | + | |
356 | 355 | | |
357 | 356 | | |
358 | 357 | | |
| |||
364 | 363 | | |
365 | 364 | | |
366 | 365 | | |
367 | | - | |
| 366 | + | |
368 | 367 | | |
369 | 368 | | |
370 | 369 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
350 | 350 | | |
351 | 351 | | |
352 | 352 | | |
353 | | - | |
354 | 353 | | |
355 | | - | |
| 354 | + | |
356 | 355 | | |
357 | 356 | | |
358 | 357 | | |
359 | 358 | | |
360 | 359 | | |
361 | | - | |
| 360 | + | |
362 | 361 | | |
363 | | - | |
| 362 | + | |
364 | 363 | | |
365 | 364 | | |
366 | 365 | | |
367 | 366 | | |
368 | | - | |
| 367 | + | |
369 | 368 | | |
370 | 369 | | |
371 | 370 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
414 | 414 | | |
415 | 415 | | |
416 | 416 | | |
417 | | - | |
418 | 417 | | |
419 | | - | |
| 418 | + | |
420 | 419 | | |
421 | 420 | | |
422 | 421 | | |
423 | 422 | | |
424 | 423 | | |
425 | | - | |
| 424 | + | |
426 | 425 | | |
427 | | - | |
| 426 | + | |
428 | 427 | | |
429 | 428 | | |
430 | 429 | | |
431 | 430 | | |
432 | | - | |
| 431 | + | |
433 | 432 | | |
434 | 433 | | |
435 | 434 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
95 | 95 | | |
96 | 96 | | |
97 | 97 | | |
98 | | - | |
99 | 98 | | |
100 | 99 | | |
101 | 100 | | |
102 | 101 | | |
103 | | - | |
| 102 | + | |
104 | 103 | | |
105 | 104 | | |
106 | 105 | | |
| |||
110 | 109 | | |
111 | 110 | | |
112 | 111 | | |
113 | | - | |
| 112 | + | |
114 | 113 | | |
115 | 114 | | |
116 | | - | |
| 115 | + | |
117 | 116 | | |
118 | 117 | | |
119 | 118 | | |
| |||
122 | 121 | | |
123 | 122 | | |
124 | 123 | | |
125 | | - | |
| 124 | + | |
126 | 125 | | |
127 | 126 | | |
128 | 127 | | |
129 | | - | |
130 | | - | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
131 | 131 | | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
135 | 135 | | |
136 | 136 | | |
137 | 137 | | |
138 | | - | |
139 | | - | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
140 | 147 | | |
141 | 148 | | |
142 | 149 | | |
| |||
155 | 162 | | |
156 | 163 | | |
157 | 164 | | |
158 | | - | |
159 | | - | |
| 165 | + | |
| 166 | + | |
160 | 167 | | |
161 | 168 | | |
162 | 169 | | |
| |||
167 | 174 | | |
168 | 175 | | |
169 | 176 | | |
170 | | - | |
| 177 | + | |
171 | 178 | | |
172 | 179 | | |
173 | 180 | | |
| |||
0 commit comments