Replies: 9 comments
-
What are your system specs? CPU / system RAM / GPU + VRAM?
-
CPU: Snapdragon 865; RAM: 10602 MB; GPU: Adreno 650; VRAM: N/A
-
Can you supply a sample prompt you're attempting to use? Here's an example prompt I've run on Vulkan using the standard WAN2.1 T2V model and VAE, but with the Q5_K_M text encoder to save RAM. The process used ~500 MiB of system RAM while running, but I'll warn you that the VAE stage used ~5.4 GiB of VRAM at peak, so limited memory could still pose a problem. Resulting vid: wan2.1_t2v_1.3B_fp16_2025-10-20_test02.mp4
-
I mean, I don't think you should expect amazing quality out of the 1.3B model.
-
@stduhpf You're very right; you shouldn't expect miracles. But you should at least be able to get viable clips from it, which is what my particular example clip did: it followed the prompt and produced what was intended.
-
My problem is that I don't want to wait so long just to get an unfinished video; my frame rate is too low. You did 85 frames (I forget how many steps), but we need FastWan support: it only requires 3 steps, so we wouldn't need to wait so long.
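To see why a few-step distilled model matters here, a rough back-of-the-envelope calculation helps: diffusion sampling time scales roughly linearly with the step count, so dropping from a typical default of 20 steps to 3 is a large cut in wait time. The per-step time below is a made-up placeholder, not a measurement from any device in this thread; substitute a number you've timed yourself. (Frame count also matters, since all frames are denoised together each step, but it multiplies both cases equally.)

```python
# Back-of-the-envelope estimate of step count vs. total wait time.
# SECONDS_PER_STEP is a hypothetical placeholder, not a measured value.
SECONDS_PER_STEP = 30.0  # assumed per-step time on a phone GPU

def total_minutes(steps: int, sec_per_step: float = SECONDS_PER_STEP) -> float:
    """Total sampling time, assuming cost scales linearly with steps."""
    return steps * sec_per_step / 60.0

baseline = total_minutes(20)  # a typical default step count
fast = total_minutes(3)       # distilled few-step sampling
print(f"20 steps: {baseline:.1f} min, 3 steps: {fast:.1f} min "
      f"(~{baseline / fast:.1f}x faster)")
```

With these assumptions, the 3-step run is about 20/3 ≈ 6.7x faster, independent of the (unknown) per-step time.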
-
How were you able to use the WAN 2.1 1.3B fp16 model with umt5-xxl Q5_0? I have 10 GB RAM with 8 GB free. I tried WAN 2.1 1.3B Q8 and umt5 Q5, but it hit OOM and Termux crashed.
-
I mentioned previously that I don't have an ARM-based/Adreno GPU system to test with, so I'm using desktop hardware with more RAM and a GPU with dedicated VRAM. There's no real way for me to give you an apples-to-apples comparison between the Linux system I have and the Android system you're using.

When stable-diffusion.cpp runs, it loads/buffers the model, VAE, and text encoder tensors into RAM before processing begins in earnest, meaning that even if you were to use the smallest quants available, such as these: ...you'd still be eating up 3.62 GiB of your available 8 GiB of RAM before the compute stages start, and those stages consume RAM/VRAM on top of that.

I've tested those small quants on my system using both the Vulkan and ROCm backends, and I can't get them to produce anything besides blurry shapes. It should be noted that WAN was coded to use CUDA on NVidia hardware, and with its poor performance in Vulkan and ROCm, it definitely shows. I have no idea whether the small quants function properly on NVidia, since I have no hardware to test with.

I don't want to be discouraging, but ultimately you may be tilting at windmills trying to get WAN to generate anything usable with the limited resources you have. Even the full fp16 version of WAN2.1 1.3B doesn't run all that well under Vulkan on my system (ROCm is consistently better), and I still need the fp16 version to get anything usable. I'm afraid there's no information I can give you that might help.
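The load-everything-first behavior described above suggests a quick sanity check before trying a model combination: sum the on-disk sizes of the GGUF files and compare against free RAM, leaving headroom for the compute stages. The sketch below is my own illustration, not part of the thread; the file sizes and the overhead figure are placeholders, not measurements of any particular quant, and real usage depends on resolution, frame count, and backend.

```python
# Rough memory-budget check: do the buffered weights (diffusion model,
# VAE, text encoder) plus an assumed compute overhead fit in free RAM?
GIB = 1024 ** 3

def fits_in_ram(weight_bytes: dict, free_ram_bytes: int,
                compute_overhead_bytes: int = 3 * GIB) -> bool:
    """Return True if all weights plus an assumed compute overhead fit.

    The 3 GiB overhead default is a guess for headroom during the
    compute/VAE stages, not a measured figure.
    """
    total = sum(weight_bytes.values()) + compute_overhead_bytes
    return total <= free_ram_bytes

# Hypothetical example file sizes (check your actual downloads):
weights = {
    "wan2.1_1.3B_Q8_0.gguf": int(1.5 * GIB),
    "umt5-xxl_Q5_0.gguf":    int(4.0 * GIB),
    "wan_vae_fp16.safetensors": int(0.3 * GIB),
}
print(fits_in_ram(weights, free_ram_bytes=8 * GIB))  # likely too tight
```

With these placeholder numbers the check comes out negative for an 8 GiB budget, which is consistent with the OOM crash reported above; a smaller text-encoder quant is the biggest single lever.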
-
Well, I managed to get WAN Q5_K_S with umt5-xxl-encoder-Q4_K_S.gguf (and WAN fp8), but I can't manage to get results; it only worked once. I tried using the Self-Forcing, CausVid, and CFG-distilled LoRAs, but I still can't get good results.
-
What are the best settings for WAN 2.1? I can't get animations to look visually good; it's just mush and artifacts. I even tried CausVid, but that didn't work either. I don't know how to use WAN 2.1, and it's the only one that works with my limited RAM.