Commit 2d0e384
merge main (#23866)
* Debug example code for MegaForCausalLM (#23382)
* Debug example code for MegaForCausalLM
Set `ignore_mismatched_sizes=True` in the model loading code
* Fix up
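A minimal sketch of the loading pattern the fix above applies (the checkpoint name is illustrative):

```python
from transformers import AutoModelForCausalLM

# ignore_mismatched_sizes=True lets the example load even when a head's shape
# differs from the checkpoint; mismatched weights are freshly initialized.
model = AutoModelForCausalLM.from_pretrained(
    "mnaylor/mega-base-wikitext",  # illustrative checkpoint
    ignore_mismatched_sizes=True,
)
```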
* Remove erroneous `img` closing tag (#23646)
See #23625
* Fix tensor device while attention_mask is not None (#23538)
* Fix tensor device while attention_mask is not None
* Fix tensor device while attention_mask is not None
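An illustrative sketch of the device-alignment pattern behind fixes like this one (the helper name is hypothetical):

```python
import torch

def make_padding_column(attention_mask: torch.Tensor) -> torch.Tensor:
    # Allocate derived tensors on attention_mask's device; creating them on
    # the default (CPU) device raises a mismatch error when the mask is on GPU.
    return torch.ones(
        (attention_mask.shape[0], 1),
        dtype=attention_mask.dtype,
        device=attention_mask.device,
    )
```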
* Fix accelerate logger bug (#23650)
* fix logger bug
* Update tests/mixed_int8/test_mixed_int8.py
Co-authored-by: Zachary Mueller <[email protected]>
* import `PartialState`
---------
Co-authored-by: Zachary Mueller <[email protected]>
* Muellerzr fix deepspeed (#23657)
* Fix deepspeed recursion
* Better fix
* Bugfix: LLaMA layer norm incorrectly changes input type and consumes lots of memory (#23535)
* Fixed bug where LLaMA layer norm would change input type.
* make fix-copies
---------
Co-authored-by: younesbelkada <[email protected]>
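A sketch of the pattern the fix enforces, assuming a LLaMA-style RMS norm: compute the variance in fp32 for stability, then cast back to the input dtype so half-precision activations are not silently upcast downstream:

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)  # stable variance in fp32
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.eps)
        # Cast back so downstream layers keep the caller's (e.g. fp16) dtype.
        return self.weight * hidden_states.to(input_dtype)
```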
* Fix wav2vec2 is_batched check to include 2-D numpy arrays (#23223)
* Fix wav2vec2 is_batched check to include 2-D numpy arrays
* address comment
* Add tests
* oops
* oops
* Switch to np array
Co-authored-by: Sanchit Gandhi <[email protected]>
* Switch to np array
* condition merge
* Specify mono channel only in comment
* oops, add other comment too
* make style
* Switch list check from falsiness to empty
---------
Co-authored-by: Sanchit Gandhi <[email protected]>
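A rough sketch of the check after the change, assuming mono audio (a 2-D numpy array is read as a batch of mono examples):

```python
import numpy as np

def is_batched(raw_speech) -> bool:
    # A 2-D numpy array is (batch, samples) for mono audio, and a list/tuple
    # of sequences is also batched; a 1-D array is a single example.
    is_batched_numpy = isinstance(raw_speech, np.ndarray) and len(raw_speech.shape) > 1
    return is_batched_numpy or (
        isinstance(raw_speech, (list, tuple))
        and isinstance(raw_speech[0], (np.ndarray, tuple, list))
    )
```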
* changing the requirements to a CPU torch version that works (#23483)
* Fix SAM tests and use smaller checkpoints (#23656)
* Fix SAM tests and use smaller checkpoints
* Override test_model_from_pretrained to use sam-vit-base as well
* make fixup
* Update all no_trainer with skip_first_batches (#23664)
* Update workflow files (#23658)
* fix
* fix
---------
Co-authored-by: ydshieh <[email protected]>
* [image-to-text pipeline] Add conditional text support + GIT (#23362)
* First draft
* Remove print statements
* Add conditional generation
* Add more tests
* Remove scripts
* Remove BLIP specific links
* Add support for pix2struct
* Add fast test
* Address comment
* Fix style
* small fix to remove unused eos in processor when it's not used. (#23408)
* Bump requests from 2.27.1 to 2.31.0 in /examples/research_projects/decision_transformer (#23673)
Bump requests in /examples/research_projects/decision_transformer
Bumps [requests](https://github.com/psf/requests) from 2.27.1 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.27.1...v2.31.0)
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump requests from 2.22.0 to 2.31.0 in /examples/research_projects/visual_bert (#23670)
Bump requests in /examples/research_projects/visual_bert
Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.22.0...v2.31.0)
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump requests from 2.22.0 to 2.31.0 in /examples/research_projects/lxmert (#23668)
Bump requests in /examples/research_projects/lxmert
Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.22.0...v2.31.0)
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add PerSAM [bis] (#23659)
* Add PerSAM args
* Make attn_sim optional
* Rename to attention_similarity
* Add docstrings
* Improve docstrings
* Fix typo in a parameter name for open llama model (#23637)
* Update modeling_open_llama.py
Fix typo in `use_memorry_efficient_attention` parameter name
* Update configuration_open_llama.py
Fix typo in `use_memorry_efficient_attention` parameter name
* Update configuration_open_llama.py
Take care of backwards compatibility ensuring that the previous parameter name is taken into account if used
* Update configuration_open_llama.py
format to adjust the line length
* Update configuration_open_llama.py
proper code formatting using `make fixup`
* Update configuration_open_llama.py
pop the argument so it cannot be set again later down the line
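A condensed sketch of the backward-compatibility handling described above (the config class is heavily simplified):

```python
from transformers import PretrainedConfig

class OpenLlamaConfig(PretrainedConfig):
    def __init__(self, use_memory_efficient_attention: bool = True, **kwargs):
        # Accept the old misspelled kwarg for backward compatibility, and pop
        # it so it cannot be set again later down the line.
        use_memory_efficient_attention = kwargs.pop(
            "use_memorry_efficient_attention", use_memory_efficient_attention
        )
        self.use_memory_efficient_attention = use_memory_efficient_attention
        super().__init__(**kwargs)
```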
* Fix PyTorch SAM tests (#23682)
fix
Co-authored-by: ydshieh <[email protected]>
* Making `safetensors` a core dependency. (#23254)
* Making `safetensors` a core dependency.
To be merged later, I'm creating the PR so we can try it out.
* Update setup.py
* Remove duplicates.
* Even more redundant.
* 🌐 [i18n-KO] Translated `tasks/monocular_depth_estimation.mdx` to Korean (#23621)
docs: ko: `tasks/monocular_depth_estimation`
Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Gabriel Yang <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
* Fix a `BridgeTower` test (#23694)
fix
Co-authored-by: ydshieh <[email protected]>
* [`SAM`] Fixes pipeline and adds a dummy pipeline test (#23684)
* add a dummy pipeline test
* change test name
* TF version compatibility fixes (#23663)
* New TF version compatibility fixes
* Remove dummy print statement, move expand_1d
* Make a proper framework inference function
* Make a proper framework inference function
* ValueError -> TypeError
* [`Blip`] Fix blip doctest (#23698)
fix blip doctest
* is_batched fix for remaining 2-D numpy arrays (#23309)
* Fix is_batched code to allow 2-D numpy arrays for audio
* Tests
* Fix typo
* Incorporate comments from PR #23223
* Skip `TFCvtModelTest::test_keras_fit_mixed_precision` for now (#23699)
fix
Co-authored-by: ydshieh <[email protected]>
* fix: load_best_model_at_end error when load_in_8bit is True (#23443)
Ref: huggingface/peft#394
Loading a quantized checkpoint into a non-quantized `Linear8bitLt` is not supported.
Call `module.cuda()` before `module.load_state_dict()`.
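A hedged sketch of the load order the fix applies (the checkpoint path is illustrative, and the real change lives inside Trainer):

```python
import torch
from transformers import AutoModelForCausalLM

# A model originally loaded with load_in_8bit=True (requires bitsandbytes + GPU).
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m", load_in_8bit=True)

# Move the quantized modules to GPU *before* load_state_dict: loading a
# checkpoint into a CPU-resident Linear8bitLt is not supported.
model.cuda()
state_dict = torch.load("best_checkpoint/pytorch_model.bin", map_location="cuda")  # illustrative path
model.load_state_dict(state_dict)
```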
* Fix some docs on what layerdrop does (#23691)
* Fix some docs on what layerdrop does
* Update src/transformers/models/data2vec/configuration_data2vec_audio.py
Co-authored-by: Sylvain Gugger <[email protected]>
* Fix more docs
---------
Co-authored-by: Sylvain Gugger <[email protected]>
* add GPTJ/bloom/llama/opt into model list and enhance the jit support (#23291)
Signed-off-by: Wang, Yi A <[email protected]>
* 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479)
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All tests green for 8-bit and 4-bit layers.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All tests green for 8-bit and 4-bit layers.
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Fixing issues for PR #23479.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Reverted variable name change.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All tests green for 8-bit and 4-bit layers.
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Added missing tests.
* Fixup changes.
* Added fixup changes.
* Missed some variables to rename.
* revert trainer tests
* revert test trainer
* another revert
* fix tests and safety checkers
* protect import
* simplify a bit
* Update src/transformers/trainer.py
* few fixes
* add warning
* replace with `load_in_kbit = load_in_4bit or load_in_8bit`
* fix test
* fix tests
* this time fix tests
* safety checker
* add docs
* revert torch_dtype
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <[email protected]>
* multiple fixes
* update docs
* version checks and multiple fixes
* replace `is_loaded_in_kbit`
* replace `load_in_kbit`
* change methods names
* better checks
* oops
* oops
* address final comments
---------
Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
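For reference, a minimal 4-bit load using the flag this PR introduces (the checkpoint name is illustrative):

```python
from transformers import AutoModelForCausalLM

# load_in_4bit quantizes the base weights to 4-bit via bitsandbytes; a LoRA
# adapter can then be trained on top of the frozen quantized base.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # illustrative checkpoint
    load_in_4bit=True,
    device_map="auto",
)
```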
* Paged Optimizer + Lion Optimizer for Trainer (#23217)
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
---------
Co-authored-by: younesbelkada <[email protected]>
* Export to ONNX doc refocused on using optimum, added tflite (#23434)
* doc refocused on using optimum, tflite
* minor updates to fix checks
* Apply suggestions from code review
Co-authored-by: regisss <[email protected]>
* TFLite to separate page, added links
* Removed the onnx list builder
* make style
* Update docs/source/en/serialization.mdx
Co-authored-by: regisss <[email protected]>
---------
Co-authored-by: regisss <[email protected]>
* fix: use bool instead of uint8/byte in Deberta/DebertaV2/SEW-D to make it compatible with TensorRT (#23683)
* Use bool instead of uint8/byte in DebertaV2 to make it compatible with TensorRT
TensorRT cannot accept an ONNX graph with uint8/byte intermediate tensors. This PR uses bool tensors instead of uint8/byte tensors so that the exported ONNX file works with TensorRT.
* fix: use bool instead of uint8/byte in Deberta and SEW-D
---------
Co-authored-by: Yuxian Qiu <[email protected]>
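A small sketch of the dtype change: boolean masks drive `masked_fill` directly and export to ONNX without the uint8 intermediates TensorRT rejects:

```python
import torch

attention_mask = torch.tensor([[1, 1, 1, 0]])
scores = torch.randn(1, 4)

# Before: masks were materialized as uint8/byte tensors; after: plain bool.
mask = attention_mask.to(torch.bool)
masked_scores = scores.masked_fill(~mask, torch.finfo(scores.dtype).min)
```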
* fix gptj could not jit.trace in GPU (#23317)
Signed-off-by: Wang, Yi A <[email protected]>
* Better TF docstring types (#23477)
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Don't forget the imports
* Add the imports to tests too
* make fixup
* Refactor tests that depended on get_type_hints
* Better test refactor
* Fix an old hidden bug in the test_keras_fit input creation code
* Fix for the Deit tests
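A sketch of the new hint style, assuming `from __future__ import annotations` so the `|` unions stay as un-evaluated strings on Python < 3.10:

```python
from __future__ import annotations

import tensorflow as tf

def embed(
    input_ids: tf.Tensor | None = None,
    attention_mask: tf.Tensor | None = None,
) -> tf.Tensor | None:
    # Postponed evaluation means these annotations never run at runtime, so
    # `tf.Tensor | None` is safe even where `|` on types is not supported.
    return input_ids
```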
* Minor awesome-transformers.md fixes (#23453)
Minor docs fixes
* TF SAM memory reduction (#23732)
* Extremely small change to TF SAM dummies to reduce memory usage on build
* remove debug breakpoint
* Debug print statement to track array sizes
* More debug shape printing
* More debug shape printing
* Now remove the debug shape printing
* make fixup
* make fixup
* fix: delete duplicate sentences in `document_question_answering.mdx` (#23735)
fix: delete duplicate sentence
* fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation (#23724)
move text_prompt_ids trimming to top
* Overhaul TF serving signatures + dummy inputs (#23234)
* Let's try autodetecting serving sigs
* Don't clobber existing sigs
* Change shapes for multiplechoice models
* Make default dummy inputs smarter too
* Fix missing f-string
* Let's YOLO a serving output too
* Read __class__.__name__ properly
* Don't just pass naked lists in there and expect it to be okay
* Code cleanup
* Update default serving sig
* Clearer error messages
* Further updates to the default serving output
* make fixup
* Update the serving output a bit more
* Cleanups and renames, raise errors appropriately when we can't infer inputs
* More renames
* we're building in a functional context again, yolo
* import DUMMY_INPUTS from the right place
* import DUMMY_INPUTS from the right place
* Support cross-attention in the dummies
* Support cross-attention in the dummies
* Complete removal of dummy/serving overrides in BERT
* Complete removal of dummy/serving overrides in RoBERTa
* Obliterate lots and lots of serving sig and dummy overrides
* merge type hint changes
* Fix for token_type_ids with vocab_size 1
* Add missing property decorator
* Fix T5 and hopefully some models that take conv inputs
* More signature pruning
* Fix T5's signature
* Fix Wav2Vec2 signature
* Fix LongformerForMultipleChoice input signature
* Fix BLIP and LED
* Better default serving output error handling
* Fix BART dummies
* Fix dummies for cross-attention, esp encoder-decoder models
* Fix visionencoderdecoder signature
* Fix BLIP serving output
* Small tweak to BART dummies
* Cleanup the ugly parameter inspection line that I used in a few places
* committed a breakpoint again
* Move the text_dims check
* Remove blip_text serving_output
* Add decoder_input_ids to the default input sig
* Remove all the manual overrides for encoder-decoder model signatures
* Tweak longformer/led input sigs
* Tweak default serving output
* output.keys() -> output
* make fixup
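A rough sketch of the autodetected-signature idea (shapes and names here are assumptions, not the exact implementation):

```python
import tensorflow as tf

# A default serving signature built from common input names, replacing the
# hand-written serving overrides that most models used to carry.
input_signature = [{
    "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
    "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
}]

def make_serving_fn(model):
    @tf.function(input_signature=input_signature)
    def serving(inputs):
        return model(inputs)
    return serving
```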
* [Whisper] Reduce batch size in tests (#23736)
* Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725)
* fix and test get_imports for multiline try blocks, and excepts with specific errors
* fixup
* add some more tests
* add license
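A sketch of what the updated pattern has to handle; the regex below approximates the real one rather than quoting it:

```python
import re

source = '''
try:
    import flash_attn  # optional, guarded dependency
except ImportError:
    pass
import numpy as np
'''

# Strip try/except blocks (multi-line bodies, specific exception types) so
# guarded optional imports are not reported as hard requirements.
stripped = re.sub(r"\s*try\s*:\s*.*?\s*except\s*.*?:", "", source, flags=re.MULTILINE | re.DOTALL)
print(re.findall(r"^\s*import\s+(\S+)", stripped, flags=re.MULTILINE))  # ['numpy']
```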
* Fix sagemaker DP/MP (#23681)
* Check for use_sagemaker_dp
* Add a check for is_sagemaker_mp when setting _n_gpu again. Should be last broken thing
* Try explicit check?
* Quality
* Enable prompts on the Hub (#23662)
* Enable prompts on the Hub
* Update src/transformers/tools/prompts.py
Co-authored-by: amyeroberts <[email protected]>
* Address review comments
---------
Co-authored-by: amyeroberts <[email protected]>
* Remove the last few TF serving sigs (#23738)
Remove some more serving methods that (I think?) turned up while this PR was open
* Fix `pip install --upgrade accelerate` command in modeling_utils.py (#23747)
Fix command in modeling_utils.py
* Add LlamaIndex to awesome-transformers.md (#23484)
* Fix push_to_hub in Trainer when nothing needs pushing (#23751)
* Revamp test selection for the example tests (#23737)
* Revamp test selection for the example tests
* Rename old XLA test and fake modif in run_glue
* Fixes
* Fake Trainer modif
* Remove fake modifs
* [LongFormer] code nits, removed unused parameters (#23749)
* remove unused parameters
* remove unused parameters in config
* Fix is_ninja_available() (#23752)
* Fix is_ninja_available()
Search for ninja using subprocess instead of importlib.
* Fix style
* Fix doc
* Fix style
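The subprocess-based probe, sketched:

```python
import subprocess

def is_ninja_available() -> bool:
    # Probe the ninja binary directly: ninja can be on PATH without being an
    # importable Python package, which the old importlib check missed.
    try:
        subprocess.check_output("ninja --version".split())
    except Exception:
        return False
    return True
```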
* Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/lxmert (#23766)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.0.4 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](tornadoweb/tornado@v6.0.4...v6.3.2)
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/visual_bert (#23767)
Bump tornado in /examples/research_projects/visual_bert
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.0.4 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](tornadoweb/tornado@v6.0.4...v6.3.2)
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* [`Nllb-Moe`] Fix nllb moe accelerate issue (#23758)
fix nllb moe accelerate issue
* [OPT] Doc nit, using fast is fine (#23789)
small doc nit
* Fix RWKV backward on GPU (#23774)
* Update trainer.mdx class_weights example (#23787)
The class_weights tensor should follow the model's device.
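The pattern the example now follows, sketched with a hypothetical loss helper:

```python
import torch
from torch import nn

def weighted_loss(logits: torch.Tensor, labels: torch.Tensor, class_weights: torch.Tensor) -> torch.Tensor:
    # Keep the weight tensor on the logits' device; a CPU weight tensor paired
    # with GPU logits makes CrossEntropyLoss fail.
    loss_fct = nn.CrossEntropyLoss(weight=class_weights.to(logits.device))
    return loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
```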
* no_cuda does not take effect in non distributed environment (#23795)
Signed-off-by: Wang, Yi <[email protected]>
* Fix no such file or directory error (#23783)
* Fix no such file or directory error
* Address comment
* Fix formatting issue
* Log the right train_batch_size if using auto_find_batch_size and also log the adjusted value separately. (#23800)
* Log right bs
* Log
* Diff message
* Enable code-specific revision for code on the Hub (#23799)
* Enable code-specific revision for code on the Hub
* invalidate old revision
* [Time-Series] Autoformer model (#21891)
* ran `transformers-cli add-new-model-like`
* added `AutoformerLayernorm` and `AutoformerSeriesDecomposition`
* added `decomposition_layer` in `init` and `moving_avg` to config
* added `AutoformerAutoCorrelation` to encoder & decoder
* removed canonical self-attention `AutoformerAttention`
* added arguments in config and model tester. Init works! 😁
* WIP autoformer attention with autocorrelation
* fixed `attn_weights` size
* wip time_delay_agg_training
* fixing sizes and debug time_delay_agg_training
* aggregation in training works! 😁
* `top_k_delays` -> `top_k_delays_index` and added `contiguous()`
* wip time_delay_agg_inference
* finish time_delay_agg_inference 😎
* added resize to autocorrelation
* bug fix: added the length of the output signal to `irfft`
* `attention_mask = None` in the decoder
* fixed test: changed attention expected size, `test_attention_outputs` works!
* removed unnecessary code
* apply AutoformerLayernorm in final norm in enc & dec
* added series decomposition to the encoder
* added series decomp to decoder, with inputs
* added trend todos
* added autoformer to README
* added to index
* added autoformer.mdx
* remove scaling and init attention_mask in the decoder
* make style
* fix copies
* make fix-copies
* initial fix-copies
* fix from #22076
* make style
* fix class names
* added trend
* added d_model and projection layers
* added `trend_projection` source, and decomp layer init
* added trend & seasonal init for decoder input
* AutoformerModel cannot be copied as it has the decomp layer too
* encoder can be copied from time series transformer
* fixed generation and made distribution output more robust
* use context window to calculate decomposition
* use the context_window for decomposition
* use output_params helper
* clean up AutoformerAttention
* subsequences_length off by 1
* make fix copies
* fix test
* added init for nn.Conv1d
* fix IGNORE_NON_TESTED
* added model_doc
* fix ruff
* ignore tests
* remove dup
* fix SPECIAL_CASES_TO_ALLOW
* do not copy due to conv1d weight init
* remove unused imports
* added short summary
* added label_length and made the model non-autoregressive
* added params docs
* better doc for `factor`
* fix tests
* renamed `moving_avg` to `moving_average`
* renamed `factor` to `autocorrelation_factor`
* make style
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <[email protected]>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <[email protected]>
* fix configurations
* fix integration tests
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* fixing `lags_sequence` doc
* Revert "fixing `lags_sequence` doc"
This reverts commit 21e3491.
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Apply suggestions from code review
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* model layers now take the config
* added `layer_norm_eps` to the config
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* added `config.layer_norm_eps` to AutoformerLayernorm
* added `config.layer_norm_eps` to all layernorm layers
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* fix variable names
* added initial pretrained model
* added use_cache docstring
* doc strings for trend and use_cache
* fix order of args
* imports on one line
* fixed get_lagged_subsequences docs
* add docstring for create_network_inputs
* get rid of layer_norm_eps config
* add back layernorm
* update fixture location
* fix signature
* use AutoformerModelOutput dataclass
* fix pretrain config
* no need as default exists
* subclass ModelOutput
* remove layer_norm_eps config
* fix test_model_outputs_equivalence test
* test hidden_states_output
* make fix-copies
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* removed unused attr
* Update tests/models/autoformer/test_modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <[email protected]>
* use AutoFormerDecoderOutput
* fix formatting
* fix formatting
---------
Co-authored-by: Kashif Rasul <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
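To ground the decomposition entries above, a simplified sketch of moving-average series decomposition (padding subtleties and the learned trend projection are omitted):

```python
import torch
import torch.nn.functional as F

def series_decomposition(x: torch.Tensor, kernel_size: int = 25):
    # x: (batch, sequence_length, channels). A moving average extracts the
    # trend component; the residual is the seasonal component.
    trend = F.avg_pool1d(
        x.transpose(1, 2), kernel_size, stride=1,
        padding=kernel_size // 2, count_include_pad=False,
    ).transpose(1, 2)
    return x - trend, trend
```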
* add type hint in pipeline model argument (#23740)
* add type hint in pipeline model argument
* add PreTrainedModel and TFPreTrainedModel type hints
* make type hints string
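A sketch of the resulting annotation, with string hints keeping the TensorFlow class an optional import (`make_pipeline` is a hypothetical stand-in for the real factory):

```python
from typing import TYPE_CHECKING, Optional, Union

if TYPE_CHECKING:
    from transformers import PreTrainedModel, TFPreTrainedModel

def make_pipeline(
    task: str,
    model: Optional[Union[str, "PreTrainedModel", "TFPreTrainedModel"]] = None,
):
    # String annotations are only resolved by type checkers, so TensorFlow
    # does not need to be installed just to import this module.
    ...
```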
* TF SAM shape flexibility fixes (#23842)
SAM shape flexibility fixes for compilation
* fix Whisper tests on GPU (#23753)
* move input features to GPU
* skip these tests because of undefined behavior
* unskip tests
* 🌐 [i18n-KO] Translated `fast_tokenizers.mdx` to Korean (#22956)
* docs: ko: fast_tokenizer.mdx
content - translated
Co-Authored-By: Gabriel Yang <[email protected]>
Co-Authored-By: Nayeon Han <[email protected]>
Co-Authored-By: Hyeonseo Yun <[email protected]>
Co-Authored-By: Sohyun Sim <[email protected]>
Co-Authored-By: Jungnerd <[email protected]>
Co-Authored-By: Wonhyeong Seo <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Hyeonseo Yun <[email protected]>
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update _toctree.yml
---------
Co-authored-by: Gabriel Yang <[email protected]>
Co-authored-by: Nayeon Han <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>
* [i18n-KO] Translated video_classification.mdx to Korean (#23026)
* task/video_classification translated
Co-Authored-By: Hyeonseo Yun <[email protected]>
Co-Authored-By: Gabriel Yang <[email protected]>
Co-Authored-By: Sohyun Sim <[email protected]>
Co-Authored-By: Nayeon Han <[email protected]>
Co-Authored-By: Wonhyeong Seo <[email protected]>
Co-Authored-By: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Sohyun Sim <[email protected]>
* Apply suggestions from code review
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: Gabriel Yang <[email protected]>
* Update video_classification.mdx
* Update _toctree.yml
* Update _toctree.yml
* Update _toctree.yml
* Update _toctree.yml
---------
Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Gabriel Yang <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Nayeon Han <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>
* 🌐 [i18n-KO] Translated `troubleshooting.mdx` to Korean (#23166)
* docs: ko: troubleshooting.mdx
* revised: fix _toctree.yml #23112
* feat: nmt draft `troubleshooting.mdx`
* fix: manual edits `troubleshooting.mdx`
* revised: resolve suggestions troubleshooting.mdx
Co-authored-by: Sohyun Sim <[email protected]>
---------
Co-authored-by: Sohyun Sim <[email protected]>
* Adds a FlyteCallback (#23759)
* initial flyte callback
* lint
* logs should still be saved to Flyte even if pandas isn't installed (unlikely)
* cr - flyte team
* add docs for Flytecallback
* fix doc string - cr sgugger
* Apply suggestions from code review
cr - sgugger fix doc strings
Co-authored-by: Sylvain Gugger <[email protected]>
---------
Co-authored-by: Sylvain Gugger <[email protected]>
* Update collating_graphormer.py (#23862)
* [LlamaTokenizerFast] nit update `post_processor` on the fly (#23855)
* Update the processor when changing add_eos and add_bos
* fixup
* update
* add a test
* fix failing tests
* fixup
* #23388 Issue: Update RoBERTa configuration (#23863)
* [from_pretrained] improve the error message when `_no_split_modules` is not defined (#23861)
* Better warning
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <[email protected]>
* format line
---------
Co-authored-by: Sylvain Gugger <[email protected]>
---------
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Wang, Yi A <[email protected]>
Signed-off-by: Wang, Yi <[email protected]>
Co-authored-by: Tyler <[email protected]>
Co-authored-by: Joshua Lochner <[email protected]>
Co-authored-by: zspo <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Zachary Mueller <[email protected]>
Co-authored-by: Tim Dettmers <[email protected]>
Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: LWprogramming <[email protected]>
Co-authored-by: Sanchit Gandhi <[email protected]>
Co-authored-by: sshahrokhi <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>
Co-authored-by: ydshieh <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Nicolas Patry <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alex <[email protected]>
Co-authored-by: Nayeon Han <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Gabriel Yang <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: 小桐桐 <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: Maria Khalusova <[email protected]>
Co-authored-by: regisss <[email protected]>
Co-authored-by: uchuhimo <[email protected]>
Co-authored-by: Yuxian Qiu <[email protected]>
Co-authored-by: pagarsky <[email protected]>
Co-authored-by: Connor Henderson <[email protected]>
Co-authored-by: Daniel King <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Eric J. Wang <[email protected]>
Co-authored-by: Ravi Theja <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: 玩火 <[email protected]>
Co-authored-by: amitportnoy <[email protected]>
Co-authored-by: Ran Ran <[email protected]>
Co-authored-by: Eli Simhayev <[email protected]>
Co-authored-by: Kashif Rasul <[email protected]>
Co-authored-by: Samin Yasar <[email protected]>
Co-authored-by: Matthijs Hollemans <[email protected]>
Co-authored-by: Kihoon Son <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: peridotml <[email protected]>
Co-authored-by: Clémentine Fourrier <[email protected]>
Co-authored-by: Vijeth Moudgalya <[email protected]>
1 parent c9f3cff
Commit 2d0e384
File tree
320 files changed (+9944 / -8114 lines)
- .circleci
- .github/workflows
- docs/source
- en
- main_classes
- model_doc
- tasks
- ko
- tasks
- pt
- examples
- flax/vision
- pytorch
- image-classification
- image-pretraining
- language-modeling
- multiple-choice
- question-answering
- semantic-segmentation
- summarization
- text-classification
- text-generation
- token-classification
- translation
- research_projects
- bert-loses-patience/pabee
- decision_transformer
- lxmert
- visual_bert
- tensorflow/image-classification
- src/transformers
- generation
- models
- albert
- align
- audio_spectrogram_transformer
- autoformer
- auto
- bart
- bert
- blenderbot_small
- blenderbot
- blip_2
- blip
- bloom
- camembert
- clap
- clip
- convbert
- convnext
- ctrl
- cvt
- data2vec
- deberta_v2
- deberta
- deformable_detr
- deit
- deta
- distilbert
- dpr
- efficientnet
- electra
- encoder_decoder
- esm
- flaubert
- funnel
- gpt2
- gptj
- graphormer
- groupvit
- hubert
- informer
- layoutlmv3
- layoutlm
- led
- llama
- longformer
- lxmert
- marian
- mask2former
- mbart
- mctct
- mega
- mobilebert
- mobilevit
- mpnet
- nllb_moe
- open_llama
- openai
- opt
- pegasus_x
- pegasus
- rag
- realm
- regnet
- rembert
- resnet
- roberta_prelayernorm
- roberta
- roformer
- rwkv
- sam
- segformer
- sew_d
- sew
- speech_to_text
- speecht5
- swin
- t5
- tapas
- time_series_transformer
- transfo_xl
- tvlt
- unispeech_sat
- unispeech
- vision_encoder_decoder
- vision_text_dual_encoder
- vit_mae
- vit
- wav2vec2_conformer
- wav2vec2
- wavlm
- whisper
- xglm
- xlm_roberta
- xlm
- xlnet
- pipelines
- tools
- utils
- templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}
- tests
- bitsandbytes
- generation
- models
- albert
- audio_spectrogram_transformer
- autoformer
- auto
- bart
- bert
- blenderbot_small
- blenderbot
- blip
- bort
- bridgetower
- camembert
- clap
- clip
- convbert
- convnext
- ctrl
- cvt
- data2vec
- deberta_v2
- deberta
- deit
- distilbert
- dpr
- electra
- encoder_decoder
- esm
- flaubert
- funnel
- gpt2
- gptj
- groupvit
- hubert
- informer
- layoutlmv3
- layoutlm
- led
- llama
- longformer
- lxmert
- marian
- mbart
- mctct
- mobilebert
- mobilevit
- mpnet
- mt5
- openai
- opt
- pegasus
- rag
- regnet
- rembert
- resnet
- roberta_prelayernorm
- roberta
- roformer
- sam
- segformer
- speech_to_text
- speecht5
- swin
- t5
- tapas
- time_series_transformer
- transfo_xl
- tvlt
- vision_encoder_decoder
- vision_text_dual_encoder
- vit_mae
- vit
- wav2vec2
- whisper
- xglm
- xlm_roberta
- xlm
- xlnet
- pipelines
- repo_utils
- trainer
- utils
- utils