Attend-and-excite pipeline doesn't work with a size different from the default #2471

@jorgemcgomes


Describe the bug

The title says it all. Any image size other than the default 512x512 (smaller or larger) results in an error.

I don't have time to track down or debug the issue further, but I thought I'd let you know about it.

Reproduction

Take the example at https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/attend_and_excite and pass any non-default width/height (a sketch follows the Colab link below).

https://colab.research.google.com/drive/1veuIMdf6Oi-9HteR7nPHWpRnDKLF1nls?usp=sharing
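
A minimal repro sketch, assuming the model id, prompt, and token indices from the docs example (any non-default size triggers the error):

```python
import torch
from diffusers import StableDiffusionAttendAndExcitePipeline

# Model id and prompt follow the docs example; token_indices points at "cat" and "frog".
pipe = StableDiffusionAttendAndExcitePipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = "a cat and a frog"
token_indices = [2, 5]
generator = torch.Generator("cuda").manual_seed(0)

# Works at the default 512x512; any other size (e.g. 768x768) raises the
# RuntimeError shown in the logs below.
images = pipe(
    prompt=prompt,
    token_indices=token_indices,
    guidance_scale=7.5,
    generator=generator,
    num_inference_steps=50,
    height=768,
    width=768,
).images
```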

Logs

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-d8854f991d1e> in <module>
----> 1 images = pipe(
      2     prompt=prompt,
      3     token_indices=token_indices,
      4     guidance_scale=7.5,
      5     generator=generator,

3 frames
/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     25         def decorate_context(*args, **kwargs):
     26             with self.clone():
---> 27                 return func(*args, **kwargs)
     28         return cast(F, decorate_context)
     29 

/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py in __call__(self, prompt, token_indices, height, width, num_inference_steps, guidance_scale, negative_prompt, num_images_per_prompt, eta, generator, latents, prompt_embeds, negative_prompt_embeds, output_type, return_dict, callback, callback_steps, cross_attention_kwargs, max_iter_to_alter, thresholds, scale_factor)
    874 
    875                         # Get max activation value for each subject token
--> 876                         max_attention_per_index = self._aggregate_and_get_max_attention_per_token(
    877                             indices=index,
    878                         )

/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py in _aggregate_and_get_max_attention_per_token(self, indices)
    563     ):
    564         """Aggregates the attention for each token and computes the max activation value for each token to alter."""
--> 565         attention_maps = self.attention_store.aggregate_attention(
    566             from_where=("up", "down", "mid"),
    567         )

/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py in aggregate_attention(self, from_where)
    100                 cross_maps = item.reshape(-1, self.attn_res, self.attn_res, item.shape[-1])
    101                 out.append(cross_maps)
--> 102         out = torch.cat(out, dim=0)
    103         out = out.sum(0) / out.shape[0]
    104         return out

RuntimeError: torch.cat(): expected a non-empty list of Tensors
```
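
For what it's worth, from a quick look at the traceback (I haven't debugged further): `aggregate_attention` appears to keep only the cross-attention maps whose spatial size matches a fixed `attn_res` (16, i.e. 512 / 32, judging by the reshape at line 100 of the pipeline file). At any other image size no stored map matches, so `out` stays empty and `torch.cat` raises. A sketch of that arithmetic, with the hard-coded 16 as an assumption:

```python
# Sketch of the size mismatch I suspect (attn_res = 16 is an assumption,
# matching the default 512x512 image downsampled by a factor of 32).
attn_res = 16

def has_matching_maps(height: int, width: int, downsample: int = 32) -> bool:
    # A cross-attention map at this UNet level has
    # (height // downsample) * (width // downsample) spatial positions.
    return (height // downsample) * (width // downsample) == attn_res**2

print(has_matching_maps(512, 512))  # True:  maps collected, torch.cat succeeds
print(has_matching_maps(768, 768))  # False: list stays empty, torch.cat raises
```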


System Info

diffusers 0.13.1
