Commit 63f767e
Add SVD (#5895)
* begin model
* finish blocks
* add_embedding
* addition_time_embed_dim
* use TimestepEmbedding
* fix temporal res block
* fix time_pos_embed
* fix add_embedding
* add conversion script
* fix model
* up
* add new resnet blocks
* make forward work
* return sample in original shape
* fix temb shape in TemporalResnetBlock
* add spatio temporal transformers
* add vae blocks
* fix blocks
* update
* update
* fix shapes in Alphablender and add time activation in res block
* use new blocks
* style
* fix temb shape
* fix SpatioTemporalResBlock
* reuse TemporalBasicTransformerBlock
* fix TemporalBasicTransformerBlock
* use TransformerSpatioTemporalModel
* fix TransformerSpatioTemporalModel
* fix time_context dim
* clean up
* make temb optional
* add blocks
* rename model
* update conversion script
* remove UNetMidBlockSpatioTemporal
* add in init
* remove unused arg
* remove unused arg
* remove more unused args
* up
* up
* check for None
* update vae
* update up/mid blocks for decoder
* begin pipeline
* adapt scheduler
* add guidance scalings
* fix norm eps in temporal transformers
* add temporal autoencoder
* make pipeline run
* fix frame decoding
* decode in float32
* decode n frames at a time
* pass decoding_t to decode_latents
* fix decode_latents
* vae encode/decode in fp32
* fix dtype in TransformerSpatioTemporalModel
* type image_latents same as image_embeddings
* allow using different eps in temporal block for video decoder
* fix default values in vae
* pass num frames in decode
* switch spatial to temporal for mixing in VAE
* fix num frames during split decoding
* cast alpha to sample dtype
* fix attention in MidBlockTemporalDecoder
* fix typo
* fix guidance_scales dtype
* fix missing activation in TemporalDecoder
* skip_post_quant_conv
* add vae conversion
* style
* take guidance scale as input
* up
* allow passing PIL to export_video
* accept fps as arg
* add pipeline and vae in init
* remove hack
* use AutoencoderKLTemporalDecoder
* don't scale image latents
* add unet tests
* clean up unet
* clean TransformerSpatioTemporalModel
* add slow svd test
* clean up
* make temb optional in Decoder mid block
* fix norm eps in TransformerSpatioTemporalModel
* clean up temp decoder
* clean up
* clean up
* use c_noise values for timesteps
* use math for log
* update
* fix copies
* doc
* upcast vae
* update forward pass for gradient checkpointing
* make added_time_ids a tensor
* up
* fix upcasting
* remove post quant conv
* add _resize_with_antialiasing
* fix _compute_padding
* cleanup model
* more cleanup
* more cleanup
* more cleanup
* remove freeu
* remove attn slice
* small clean
* up
* up
* remove extra step kwargs
* remove eta
* remove dropout
* remove callback
* remove merge factor args
* clean
* clean up
* move to dedicated folder
* remove attention_head_dim
* docstr and small fix
* update unet doc strings
* rename decoding_t
* correct linting
* store c_skip and c_out
* cleanup
* clean TemporalResnetBlock
* more cleanup
* clean up vae
* clean up
* begin doc
* more cleanup
* up
* up
* doc
* Improve
* better naming
* better naming
* better naming
* better naming
* better naming
* better naming
* better naming
* better naming
* Apply suggestions from code review
* Default chunk size to None
* add example
* Better
* Apply suggestions from code review
* update doc
* Update src/diffusers/pipelines/stable_diffusion_video/pipeline_stable_diffusion_video.py
Co-authored-by: Patrick von Platen <[email protected]>
* style
* Get torch compile working
* up
* rename
* fix doc
* add chunking
* torch compile
* torch compile
* add modelling outputs
* torch compile
* Improve chunking
* Apply suggestions from code review
* Update docs/source/en/using-diffusers/svd.md
* Close diff tag
* remove slicing
* resnet docstr
* add docstr in resnet
* rename
* Apply suggestions from code review
* update tests
* Fix output type latents
* fix more
* fix more
* Update docs/source/en/using-diffusers/svd.md
* fix more
* add pipeline tests
* remove unused arg
* clean up
* make sure get_scaling receives tensors
* fix euler scheduler
* fix get_scalings
* simplify euler for now
* remove old test file
* use randn_tensor to create noise
* fix device for rand tensor
* increase expected_max_difference
* fix test_inference_batch_single_identical
* actually fix test_inference_batch_single_identical
* disable test_save_load_float16
* skip test_float16_inference
* skip test_inference_batch_single_identical
* fix test_xformers_attention_forwardGenerator_pass
* Apply suggestions from code review
* update StableVideoDiffusionPipelineSlowTests
* update image
* add diffusers example
* fix more
---------
Co-authored-by: Dhruv Nair <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: apolinário <[email protected]>
1 parent d1b2a1a commit 63f767e
File tree
38 files changed, +5287 -149 lines changed
- docs/source/en
  - using-diffusers
- scripts
- src/diffusers
  - models
  - pipelines
    - stable_diffusion
    - stable_video_diffusion
  - schedulers
  - utils
- tests
  - models
  - pipelines/stable_video_diffusion
  - schedulers