Skip to content

Conversation

williamberman
Copy link
Contributor

@williamberman williamberman commented Feb 14, 2023

The type is really a Union not an Enum

I think manually maintaining the separate karras_diffusion_scheduler_compatibles is the cleanest way to use the _compatibles list within the schedulers module. We can't import the KarrasDiffusionSchedulers type w/in the schedulers modules because that's a circular dependency (as creating the type requires importing all schedulers). I separately tried constructing the type using forward declarations but the forward declarations still require the type to be in scope it seems.

@williamberman williamberman force-pushed the karras_diffusion_schedulers_enum_fix branch 2 times, most recently from 133e015 to 0beb7bd Compare February 14, 2023 06:32
@williamberman williamberman force-pushed the karras_diffusion_schedulers_enum_fix branch from 0beb7bd to 5f7023a Compare February 14, 2023 06:44
Comment on lines -83 to -86

### KarrasDiffusionSchedulers

[[autodoc]] schedulers.scheduling_utils.KarrasDiffusionSchedulers
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc builder doesn't know how to autodoc Enums. I think it's ok to remove

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add it manually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's what I'm thinking : #2346 (comment)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@williamberman williamberman mentioned this pull request Feb 14, 2023
"KDPM2DiscreteScheduler",
"KDPM2AncestralDiscreteScheduler",
"DEISMultistepScheduler",
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the Union semantics, but not a big fan of duplicating these in both KarrasDiffusionSchedulers and here (despite the warning comments). We can maybe work around the cyclic references by using forward type declarations in the Union, but I don't think that would help here anyway. Not sure how to deal with this, honestly.

Copy link
Contributor Author

@williamberman williamberman Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can work around the cyclic references with forward declarations because the forward declaration must still evaluate to correct code once the module with the forward declaration is fully loaded. I think they only really work when the type being forward declared is already in the same module as the usage of the type.

The string literal should contain a valid Python expression (i.e., compile(lit, '', 'eval') should be a valid code object) and it should evaluate without errors once the module has been fully loaded. The local and global namespace in which it is evaluated should be the same namespaces in which default arguments to the same function would be evaluated.

https://peps.python.org/pep-0484/#forward-references:~:text=The%20string%20literal,would%20be%20evaluated.

tl;dr we'll basically have to have regardless

# in <utils module>
from <scheduler module> import <a scheduler>, <b scheduler>

# can be forward reference or not, but if is forward reference, must have <a scheduler> and <b scheduler> in its module by the time its module is fully loaded
TheSchedulersUnion = Union[<a scheduler>, <b scheduler>] 
# in scheduler module
from <utils module> import TheSchedulersUnion # <- circular reference

<do something with TheSchedulersUnion>

The only way the Unions + forward references could work (afaik) would be

from typing import Union

TheSchedulersUnion = Union['SchedulerA', 'SchedulerB']

class SchedulerA:
    prop: TheSchedulersUnion

class SchedulerB:
    prop: TheSchedulersUnion

But once they're in different modules, there's going to be circular imports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think duplicating might be the only way to use the correct union type. Also, I don't know if there's a good way to extract the values out of union. Union.__args__ returns the types it's constructed with but I had to find that by reading the source code of Union and I wouldn't want to use undocumented features of typing.py

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm not really in favor of this PR.

I agree that the docs currently don't nicely display all schedulers, but that's something we can add, change with Enum (we cannot really do this with the new type hint no?
Don't fully understand the advantage of this PR.

The goal with the previous enum was:

  • Have nice docs about the schedulers in the future. E.g. people can look up all the different kinds of schedulers that are compatible with each other.
  • Give people nice type hints for the schedulers.
  • Have the "source" information in a single place, adding a new scheduler just forces us to update it in one place

Do we improve on those aspects with this PR?

@patrickvonplaten
Copy link
Contributor

The main thing really here are the docs, so using Enum was a first step to centralize information to which we can guide the user if necessary. Currently the docs of the karras schedluers are very poor: https://huggingface.co/docs/diffusers/v0.12.0/en/api/schedulers/overview#diffusers.schedulers.KarrasDiffusionSchedulers but we could overwrite them by giving the enum nicer docstring.

I want to be able to give the user a link where they can see all the schedulers that are compatible with each other.

@williamberman
Copy link
Contributor Author

The main thing really here are the docs, so using Enum was a first step to centralize information to which we can guide the user if necessary. Currently the docs of the karras schedluers are very poor: https://huggingface.co/docs/diffusers/v0.12.0/en/api/schedulers/overview#diffusers.schedulers.KarrasDiffusionSchedulers but we could overwrite them by giving the enum nicer docstring.

I want to be able to give the user a link where they can see all the schedulers that are compatible with each other.

I would be in favor of manually maintaining the list of compatible schedulers in the docs and opening an issue on the doc-builder to support union types.

This would require manually maintaining the list of compatible schedulers in three places

  1. The union type
  2. The list of strings used in _compatibles
  3. The docs

I'm in favor of this because

  1. We'll have to move to union eventually anyway when we turn on type checking
$ mypy src/diffusers/pipelines/versatile_diffusion/pipeline_versatile_diffusion_text_to_image.py | grep KarrasDiffusionSchedulers
src/diffusers/pipelines/versatile_diffusion/pipeline_versatile_diffusion_text_to_image.py:451: error: "KarrasDiffusionSchedulers" has no attribute "set_timesteps"  [attr-defined]
src/diffusers/pipelines/versatile_diffusion/pipeline_versatile_diffusion_text_to_image.py:452: error: "KarrasDiffusionSchedulers" has no attribute "timesteps"  [attr-defined]
src/diffusers/pipelines/versatile_diffusion/pipeline_versatile_diffusion_text_to_image.py:474: error: "KarrasDiffusionSchedulers" has no attribute "scale_model_input"  [attr-defined]
src/diffusers/pipelines/versatile_diffusion/pipeline_versatile_diffusion_text_to_image.py:485: error: "KarrasDiffusionSchedulers" has no attribute "step"  [attr-defined]
  1. The list of strings solves the circular import issue and I don't want to manually futz with values on python's type objects (this comment has more details on this karras diffusion schedulers Enum->Union #2346 (comment))
  2. We don't add new schedulers that often and so manually updating in two additional places isn't a huge lift. We can also add comments in the locations reminding to keep them in sync

Thoughts?

@patrickvonplaten
Copy link
Contributor

patrickvonplaten commented Feb 15, 2023

A couple of thoughts here:

    1. I agree that using Union[...] is "more" correct, but why exactly is Enum so bad? We can nicely document an Enum class to display all its members
    1. I'm not sure we'll ever turn type checking on - we haven't done it for transformers ever. IMO mypy is too extreme and would slow down development too much
    1. Never of fan of forcing us to "remember" adding things in multiple places. This usually doesn't end up happening in my experience. We could however use some automated tooling
    1. I'm more concerned with documentation here

@williamberman
Copy link
Contributor Author

Ah ok I thought we had discussed doing a community sprint where we turn on type checking but I probably misunderstood. If we're not turning on typechecking, I agree that leaving it as an enum works. I'll just add a comment on a separate PR noting that it's really a union when used in the pipelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants