[Core] Introduce class variants for Transformer2DModel
#7647
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Is the plan here to eventually map the [...]? Also, how feasible is it to break it up into model-specific variants rather than input-specific variants? E.g., [...]
Yeah, that's the plan.
Feasible, but I am not sure we have enough such transformer-based pipelines yet. Most of them vary across very few things (such as the norm type and a cross-attention layer). I think there is a fair trade-off to be had when deciding which variant to use. If there are too many arguments that change, it's better to use a dedicated class (like we did for the private model). If not, rely on an existing variant that depends on the input type.
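To make that trade-off concrete, here is a minimal sketch (not taken from the PR itself; the small argument values are purely illustrative) showing how little the two dedicated classes differ in configuration, mostly the norm type and the text-conditioning path:

```python
from diffusers import DiTTransformer2DModel, PixArtTransformer2DModel

# Tiny, illustrative configs: the variants mostly diverge on the norm type
# and on whether a cross-attention / caption-projection path exists.
dit = DiTTransformer2DModel(
    num_attention_heads=2,
    attention_head_dim=8,
    in_channels=4,
    sample_size=8,
    norm_type="ada_norm_zero",   # class-conditional DiT-style modulation
    num_embeds_ada_norm=1000,    # number of class labels (illustrative value)
)

pixart = PixArtTransformer2DModel(
    num_attention_heads=2,
    attention_head_dim=8,
    in_channels=4,
    sample_size=8,
    norm_type="ada_norm_single",  # PixArt-style single adaLN
    cross_attention_dim=16,       # text-conditioning path that the DiT variant lacks
    caption_channels=32,          # projects text-encoder features (illustrative value)
)
```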
@DN6 done. I think I have addressed all your comments. LMK.
@DN6 resolved your comment on the location of [...]
DN6
left a comment
Nice work 👍🏽
LGTM. cc: @yiyixuxu in case you want to take a look too.
yiyixuxu
left a comment
nice!!
I left a comment. Let me know if it is a concern, and feel free to merge if it isn't or once you've addressed it.
```python
shift, scale = (
    self.scale_shift_table[None] + embedded_timestep[:, None].to(self.scale_shift_table.device)
).chunk(2, dim=1)
hidden_states = self.norm_out(hidden_states)
# Modulation
hidden_states = hidden_states * (1 + scale.to(hidden_states.device)) + shift.to(hidden_states.device)
```
Ohh, OK. So these tests should also fail on the current implementation? I don't think this refactor introduced any change that would cause them to fail, no?
```python
del module.proj_attn
```

```python
class LegacyModelMixin(ModelMixin):
```
beautiful!
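For readers skimming the diff, here is a rough, hedged sketch of the dispatch idea behind a legacy mixin like this. It is not the implementation in this PR; the class name, the remapping keys, and the dict contents below are assumptions on my part (the commit log only mentions a remapping dict and `_fetch_remapped_cls_from_config`):

```python
from diffusers import DiTTransformer2DModel, PixArtTransformer2DModel
from diffusers.configuration_utils import ConfigMixin
from diffusers.models.modeling_utils import ModelMixin

# Assumed mapping from the checkpoint config's norm type to a dedicated class.
_REMAP = {"ada_norm_zero": DiTTransformer2DModel, "ada_norm_single": PixArtTransformer2DModel}


class LegacyTransformerSketch(ModelMixin, ConfigMixin):
    """Hypothetical sketch: forward loading of legacy checkpoints to a remapped variant."""

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs):
        # Peek at the config first to decide which dedicated class the checkpoint needs.
        config = cls.load_config(pretrained_model_name_or_path, subfolder=kwargs.get("subfolder"))
        remapped_cls = _REMAP.get(config.get("norm_type"))
        if remapped_cls is None:
            # No dedicated variant known for this config: fall back to the normal path.
            return super().from_pretrained(pretrained_model_name_or_path, **kwargs)
        # The remapped class is a plain ModelMixin subclass, so this call does not recurse.
        return remapped_cls.from_pretrained(pretrained_model_name_or_path, **kwargs)
```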
```python
return {"hidden_states": hidden_states, "timestep": timesteps, "class_labels": class_label_ids}
```

```python
@property
def input_shape(self):
```
Are these properties used at all? Maybe we can leverage them so we don't have to specify them in test_output?
Can be in a separate PR if it makes sense.
Yeah good idea. Can look into a little "input_shape" refactor in a future PR :)
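A rough sketch of what that future refactor could look like (the mixin name, shapes, and helper below are hypothetical, not part of this PR):

```python
import torch


class TransformerTesterSketch:
    """Hypothetical test mixin: derive expected shapes from the input_shape property."""

    @property
    def input_shape(self):
        # (channels, height, width) of the dummy input; individual model tests override this.
        return (4, 8, 8)

    @property
    def output_shape(self):
        # Default: same as the input. Tests for models that predict extra channels
        # (e.g., learned sigma doubling the channel dim) would override this.
        return self.input_shape

    def check_output(self, model, inputs_dict):
        # Compare against the property instead of hard-coding shapes in test_output.
        with torch.no_grad():
            sample = model(**inputs_dict).sample
        assert tuple(sample.shape[1:]) == self.output_shape
```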
* init for patches
* finish patched model.
* continuous transformer
* vectorized transformer2d.
* style.
* inits.
* fix-copies.
* introduce DiTTransformer2DModel.
* fixes
* use REMAPPING as suggested by @DN6
* better logging.
* add pixart transformer model.
* inits.
* caption_channels.
* attention masking.
* fix use_additional_conditions.
* remove print.
* debug
* flatten
* fix: assertion for sigma
* handle remapping for modeling_utils
* add tests for dit transformer2d
* quality
* placeholder for pixart tests
* pixart tests
* add _no_split_modules
* add docs.
* check
* check
* check
* check
* fix tests
* fix tests
* move Transformer output to modeling_output
* move errors better and bring back use_additional_conditions attribute.
* add unnecessary things from DiT.
* clean up pixart
* fix remapping
* fix device_map things in pixart2d.
* replace Transformer2DModel with appropriate classes in dit, pixart tests
* empty
* legacy mixin classes.
* use a remapping dict for fetching class names.
* change to specifc model types in the pipeline implementations.
* move _fetch_remapped_cls_from_config to modeling_loading_utils.py
* fix dependency problems.
* add deprecation note.
```python
def test_pixart_512_without_resolution_binning(self):
    generator = torch.manual_seed(0)

    transformer = Transformer2DModel.from_pretrained(
```
We should have kept this test. Can we add it back and name it test_pixart_512_without_resolution_binning_legacy_class or something like this?
And make sure to have a similar slow test for DiT.
In the future, I think we should always keep the test with the legacy class name, no? That way we can make sure that everything still works fine through the old API.
cc @DN6
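For reference, a hedged sketch of what such a legacy-class slow test could look like; the checkpoint id, prompt, and step count are assumptions for illustration:

```python
import torch
from diffusers import PixArtAlphaPipeline, Transformer2DModel


def test_pixart_512_without_resolution_binning_legacy_class():
    # Load through the legacy class name to exercise the old API path end to end.
    transformer = Transformer2DModel.from_pretrained(
        "PixArt-alpha/PixArt-XL-2-512x512", subfolder="transformer", torch_dtype=torch.float16
    )
    pipe = PixArtAlphaPipeline.from_pretrained(
        "PixArt-alpha/PixArt-XL-2-512x512", transformer=transformer, torch_dtype=torch.float16
    ).to("cuda")

    generator = torch.manual_seed(0)
    image = pipe(
        "a small cactus with a happy face",
        generator=generator,
        num_inference_steps=2,
        use_resolution_binning=False,
        output_type="np",
    ).images[0]
    # The 512 checkpoint should produce a 512x512 RGB image without binning.
    assert image.shape == (512, 512, 3)
```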
Hello, I need to use a deprecated model (vq-diffusion) now. Due to version changes, Transformer2DModel has been mapped to two variants, but these two variants are slightly different from the original vq-diffusion model (specifically, different types of norms are used). Directly loading the pretrained model causes from_pretrained of the LegacyModelMixin class to fall into a recursive call loop until the buffer overflows. If DiTTransformer2DModel uses ada_norm, an error is raised: [NotImplementedError: Forward pass is not implemented when ...]
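For context, a minimal sketch of the loading path being described; the checkpoint id and subfolder are my assumptions about the reporter's setup:

```python
from diffusers import Transformer2DModel

# Loading a vq-diffusion transformer through the legacy class reportedly ends up in a
# recursive from_pretrained call, since this checkpoint's norm configuration matches
# neither DiTTransformer2DModel nor PixArtTransformer2DModel.
transformer = Transformer2DModel.from_pretrained(
    "microsoft/vq-diffusion-ithq", subfolder="transformer"
)
```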
What does this PR do?
Introduces two variants of Transformer2DModel:
* DiTTransformer2DModel
* PixArtTransformer2DModel

For the other instances where Transformer2DModel is used, they should later be turned into blocks, as they shouldn't be inheriting from ModelMixin (this has been discussed internally).

TODO:
(Will be tackled after I get an initial review)
Some comments are in-line.
LMK.
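For illustration, a short, hedged sketch of loading checkpoints directly through the two new classes (the checkpoint ids are assumptions; the class names are the ones introduced here):

```python
import torch
from diffusers import DiTTransformer2DModel, PixArtTransformer2DModel

# Class-conditional DiT checkpoint loaded with its dedicated class.
dit = DiTTransformer2DModel.from_pretrained(
    "facebook/DiT-XL-2-256", subfolder="transformer", torch_dtype=torch.float16
)

# PixArt-Alpha checkpoint loaded with its dedicated class.
pixart = PixArtTransformer2DModel.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS", subfolder="transformer", torch_dtype=torch.float16
)
```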