Skip to content

Conversation

@patrickvonplaten
Copy link
Contributor

@patrickvonplaten patrickvonplaten commented Nov 10, 2022

This PR introduces a new set_scheduler API to make it easier to switch schedulers in existing pipelines.

You can use it as follows:

from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.set_scheduler("euler_discrete")

@HuggingFaceDocBuilder
Copy link

HuggingFaceDocBuilder commented Nov 10, 2022

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten patrickvonplaten requested review from anton-l, patil-suraj and pcuenca and removed request for anton-l and pcuenca November 12, 2022 18:05
)

scheduler_config = current_scheduler.config
scheduler_init_dict, _ = scheduler_class.extract_init_dict(scheduler_config)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extracts the required config of the new scheduler from the old config @keturn

@keturn
Copy link
Contributor

keturn commented Nov 12, 2022

Thanks for following up on how to set a scheduler! Your earlier comment advising against setting the scheduler after pipeline initialization made got me worried, because that is a very common thing to want to do.

but I must say, there's a lot going on here, and I'm not sure what to make of most of it.

def set_scheduler(self, scheduler_type=Union[str, SchedulerType, Dict[str, str], Dict[str, SchedulerType]]):

I'm easily overwhelmed by these types of function signatures with the Union of many different types with different dimensionalities. Even when they have parameter documentation, as this one does, it's difficult to explain all those different modes at once under a single parameter.

I recently learned that Python 3 has single-dispatch generic functions now, as of Python 3.4 (or maybe 3.7 to get automatic use of type annotations). That might be worth considering if you do want to support such very different inputs, as then the documentation for the dict-based input is very clearly separated from, say, the enum input.

But after reading through this implementation and its tests a couple times, I think I'm starting to get the idea. You're addressing two use cases:

  1. A pipeline that uses a single Scheduler, as has been the common case for the most popular uses of StableDiffusionPipeline. There may be a variety of schedulers that are compatible with it, but it only uses one at a time.
  2. A pipeline that makes use of multiple schedulers (potentially of different types) in the same pipeline run. What would an example of this be? Different schedulers for training and inference? Or maybe a compound pipeline that uses multiple diffusion models, like text-to-image synthesis followed by a diffusion-based upscaler? Your tests could use a case to demonstrate this.

Alternate proposal:

  1. common case, a DiffusionPipeline with one scheduler: pipeline.set_scheduler(EulerDiscreteScheduler)
  2. multi-scheduler-pipeline case: pipeline.set_scheduler(upscaling=DDIMScheduler)

Key features:

  • class names are easy to statically check with standard Python tools, unlike strings.
  • classes already have names, we don't need to maintain an extra enum with them.
  • consistent type signature; we always pass the same kind of thing to it, no foo | dict[_, foo] union.
  • easily extensible. When I have a new Scheduler class, I can use it here, without having to add it to any str-to-constructor registry.

Other notes

scheduler compatibility

Changing scheduler from type {current_scheduler} to an incompatible scheduler type {scheduler_type}.

I admit it made a mess when I tried to use IPNDMScheduler as a drop-in replacement for PNDMScheduler -- I guess they are not compatible -- but overall I don't think the DiffusionPipeline base class is in a good position to give reliable advice about this. How does it know what schedulers my pipeline is compatible with? How does it make that decision when I hand it a new class unknown to this version of diffusers? Having every scheduler keep a hardcoded list of strings referring to every other compatible scheduler does not seem maintainable either.

assignment after pipeline initialization

We started talking about this based on the comment that we shouldn't reassign the pipeline's scheduler property after the pipeline's been initialized, but after all this, it comes down to:

scheduler = scheduler_class(**scheduler_init_dict)
setattr(self, component_name, scheduler)

--which seems to be doing just that?

Maybe we don't need a special setter method for this at all, and we could do with something more like:

pipeline.scheduler = EulerDiscreteScheduler.from_config(current_scheduler.config)

@patrickvonplaten
Copy link
Contributor Author

patrickvonplaten commented Nov 12, 2022

Thanks for the comments @keturn, they are really nice.

Regarding the "Alternate Proposal":

  • I like the idea of just using class names and removing the scheduler name to class name mapping!
    This should indeed help as this way the user doesn't have to learn anything new. The reason, I've put all the types (including Enums) is that enums are easily proposed by most IDEs. But overall, very much agree that it's better to just use class names.
    Your comment here makes a lot of sense -> will adapt to just using class names

  • Now regarding the API, I think it's also a very valid question whether:
    a)

pipeline.set_scheduler(EulerDiscreteScheduler)

or b)

pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)

is better for UX.

The reason why I advocate against setting the scheduler as shown in b) is because it'll currently lead to errors. The problem here is that not every scheduler needs all configurations from other schedulers. E.g.:

When I do:

pipeline.scheduler = EulerDiscreteScheduler.from_config("runwayml/stable-diffusion-v1-5", subfolder="scheduler")

then some parameters in the original config will be "destroyed", e.g. the parameter steps_offset is not needed by "EulerDiscreteScheduler" and therefore after having done the above:

pipeline.scheduler.config

does not include the steps_offset parameter meaning that:

pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

would led to errors. We could mitigate this problem by saving additional "hidden" parameters in the config.

So either way we'll need some more functionality. The question is just whether the API:

a)

pipeline.set_scheduler(EulerDiscreteScheduler)

or just:

b)

pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

is nicer.

Maybe b) is actually the nicer API in the end as it doesn't require users to learn a new method.

@patil-suraj @pcuenca @anton-l @keturn what do you think?

@patrickvonplaten
Copy link
Contributor Author

patrickvonplaten commented Nov 12, 2022

BTW, the second use case is not urgent now, but cascaded diffusion models like Google's Imagen have multiple upscaling diffusion pipelines which each need a different scheduler.

But this case could also be nicely covered by the API 1b) - just set multiple schedulers:

pipe.scheduler1 = ...
pipe.scheduler2 = ...

@patrickvonplaten
Copy link
Contributor Author

The more I think about this PR, the more I want to more or less start from scratch and instead use it to make the following changes: #1268 . Curious to hear any feedback.

@dblunk88
Copy link
Contributor

dblunk88 commented Nov 13, 2022

The more I think about this PR, the more I want to more or less start from scratch and instead use it to make the following changes: #1268 . Curious to hear any feedback.

As someone who uses diffusers heavily: I love this! Although if you do:
pipe.set_scheduler("euler_discrete")
I would also have a list somewhere that would report back supported schedulers.
maybe something like
print(pipe.scheduler.supported)

pipe.scheduler.set("euler_discrete")

@patrickvonplaten
Copy link
Contributor Author

Closing this one and going for the better "from_config" refactor !

@patrickvonplaten
Copy link
Contributor Author

Closing in favor of #1286

@sayakpaul sayakpaul deleted the nicer_scheduler_switch_in_out_function branch December 3, 2024 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants