Conversation

@pcuenca (Member) commented Feb 6, 2023

Found while testing #1791.

With this fix the pipeline no longer crashes, but it still doesn't work.

@HuggingFaceDocBuilderDev commented Feb 6, 2023

The documentation is not available anymore as the PR was closed or merged.

@@ -234,7 +234,7 @@ def prepare_attention_mask(self, attention_mask, target_length):
 # HACK: MPS: Does not support padding by greater than dimension of input tensor.
 # Instead, we can manually construct the padding tensor.
 padding_shape = (attention_mask.shape[0], attention_mask.shape[1], target_length)
-padding = torch.zeros(padding_shape, device=attention_mask.device)
+padding = torch.zeros(padding_shape).to(attention_mask)
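For context, here is a rough standalone sketch of what the hack around this line does, as I read it from the comment above. The helper name, the F.pad fallback, and the concatenation step are assumptions for illustration, not the exact pipeline code:

    import torch
    import torch.nn.functional as F

    def pad_attention_mask(attention_mask: torch.Tensor, target_length: int) -> torch.Tensor:
        # Hypothetical standalone version of the padding step:
        # append `target_length` zero entries along the last dimension of the mask.
        if attention_mask.device.type == "mps":
            # MPS cannot pad by more than the size of the padded dimension,
            # so build the padding tensor manually and concatenate it.
            padding_shape = (attention_mask.shape[0], attention_mask.shape[1], target_length)
            # .to(attention_mask) matches both device and dtype, so an fp16 mask gets fp16 padding.
            padding = torch.zeros(padding_shape).to(attention_mask)
            return torch.cat([attention_mask, padding], dim=2)
        return F.pad(attention_mask, (0, target_length), value=0.0)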
Contributor

Suggested change
-padding = torch.zeros(padding_shape).to(attention_mask)
+padding = torch.zeros(padding_shape).to(attention_mask.device)

no?

Member Author

I recently learned that doing it that way changes both the device and the dtype at once. But it may not be clear, so I'll change it.
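For reference, passing a tensor to Tensor.to() copies both its device and its dtype, while passing only a device keeps the default dtype. A minimal sketch (shapes and names are made up):

    import torch

    mask = torch.ones(2, 8, 64, dtype=torch.float16)   # stand-in for an fp16 attention mask
    padding_shape = (2, 8, 77)

    # .to(mask) takes both device and dtype from `mask`.
    padding = torch.zeros(padding_shape).to(mask)
    print(padding.dtype)              # torch.float16

    # .to(mask.device) only moves the tensor; it stays float32,
    # which is the mismatch the original line had with fp16 masks.
    padding_device_only = torch.zeros(padding_shape).to(mask.device)
    print(padding_device_only.dtype)  # torch.float32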

Contributor

Ah interesting! Good to keep in mind :-)

In general, IMO it's better to stick to our existing API usage, e.g. we always use f-strings, ... so even if it requires more code we should stick to what we have currently and only change it if we change the whole codebase, so that our code design stays consistent.

Also, for such cases we should be loosely aware of whether it's a very recent PyTorch feature or one that has already been around for ~2 years. If it's been around for ~2 years, then I'm happy to adopt it.

Member Author

It's been there for several versions, I verified :) But I do agree that it's nowhere in our codebase, so it's better to err on the explicit side.
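Going by that, the explicit version would look something like the line below, with padding_shape and attention_mask as in the diff above (a sketch of the direction, not necessarily the exact line from the follow-up commit):

    padding = torch.zeros(padding_shape, dtype=attention_mask.dtype, device=attention_mask.device)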

@patrickvonplaten (Contributor) left a comment

Thanks!

@patrickvonplaten patrickvonplaten merged commit e619db2 into main Feb 7, 2023
@williamberman (Contributor)

Ah, thanks @pcuenca !

@pcuenca pcuenca deleted the mps-hack-fix branch February 7, 2023 20:34
yiyixuxu pushed a commit to evinpinar/diffusers-attend-and-excite-pipeline that referenced this pull request Feb 16, 2023
* mps cross-attention hack: don't crash on fp16

* Make conversion explicit.
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* mps cross-attention hack: don't crash on fp16

* Make conversion explicit.
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
* mps cross-attention hack: don't crash on fp16

* Make conversion explicit.