
Bad Image Quality generated by Qwen Image #12175

@vagitablebirdcode

Description


Describe the bug

I installed the latest version of the diffusers library from the main branch and ran the official inference example, but the output quality was very poor, as shown in the following figure:

Image

However, the official Qwen-Image application produces results like the following:

Image

Reproduction

First, install the latest version of diffusers from the main branch:
pip install git+https://github.com/huggingface/diffusers

The reproduction code is as follows:

import torch.nn.functional as F

# Monkey-patch SDPA to drop the `enable_gqa` keyword before calling the
# original implementation (a compatibility workaround for PyTorch builds
# that do not accept this argument).
_orig_sdpa = F.scaled_dot_product_attention

def _sdpa_drop_enable_gqa(*args, **kwargs):
    kwargs.pop("enable_gqa", None)
    return _orig_sdpa(*args, **kwargs)

F.scaled_dot_product_attention = _sdpa_drop_enable_gqa
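As a side note, one way to check whether this patch is needed at all is to probe whether the running PyTorch build accepts `enable_gqa` (a minimal sketch; the name `supports_enable_gqa` is introduced here for illustration):

```python
import torch
import torch.nn.functional as F

# Probe whether this PyTorch build accepts the `enable_gqa` keyword
# (grouped-query attention support, added in PyTorch 2.5). The patch
# above should only be needed when this probe fails.
q = torch.randn(1, 2, 4, 8)   # 2 query heads
k = torch.randn(1, 1, 4, 8)   # 1 key/value head (GQA layout)
v = torch.randn(1, 1, 4, 8)
try:
    F.scaled_dot_product_attention(q, k, v, enable_gqa=True)
    supports_enable_gqa = True
except TypeError:
    supports_enable_gqa = False
print(supports_enable_gqa)
```

If this prints `True`, the patch silently disables grouped-query attention handling, which could itself affect output quality.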


from diffusers import DiffusionPipeline
import torch

model_name = "Qwen/Qwen-Image"
torch_dtype = torch.bfloat16
device = "cuda"
pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
pipe = pipe.to(device)
# Chinese prompt, roughly: a young woman in a pale pink crossed-collar ruqun
# sits with her back to the camera, writing the four characters "通義千問"
# with a brush on white xuan paper in an elegant, antique-styled interior.
prompt = "一位身着淡雅水粉色交领襦裙的年轻女子背对镜头而坐,俯身专注地手持毛笔在素白宣纸上书写“通義千問”四个遒劲汉字。古色古香的室内陈设典雅考究,案头错落摆放着青瓷茶盏与鎏金香炉,一缕熏香轻盈升腾;柔和光线洒落肩头,勾勒出她衣裙的柔美质感与专注神情,仿佛凝固了一段宁静温润的旧时光。"

image = pipe(
    prompt=prompt,
    width=1024,
    height=1024,
    num_inference_steps=30,
    true_cfg_scale=4.0,
    guidance_scale=1.0,
    generator=torch.Generator(device="cuda").manual_seed(42)
).images[0]
image.save("res.jpg")

Logs

System Info

- 🤗 Diffusers version: 0.35.0.dev0
- Platform: Linux-5.15.0-58-generic-x86_64-with-glibc2.35
- Running on Google Colab?: No
- Python version: 3.10.10
- PyTorch version (GPU?): 2.6.0 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.34.4
- Transformers version: 4.55.2
- Accelerate version: 1.10.0
- PEFT version: not installed
- Bitsandbytes version: not installed
- Safetensors version: 0.5.3
- xFormers version: 0.0.22
- Accelerator: NA
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

Who can help?

No response


Labels: bug (Something isn't working)
