Name and Version
llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 1849 (fb22dd0)
built with cc (Debian 12.2.0-14+deb12u1) 12.2.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
/usr/local/bin/llama-server --jinja -fa -c 16384 -ngl 999 -v --log-timestamps --host 192.168.1.68 -m /mnt/external2TB01/LLM/quantized/Qwen3-30B-A3B-Q3_K_L.gguf
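Note: the command above does not pass --port, so the server should be listening on llama-server's default port 8080. As a quick sanity check (not part of the original report, just a generic probe of the server's standard health route), the instance can be verified with:

curl http://192.168.1.68:8080/health

A {"status": "ok"} reply indicates the model has loaded and the server is accepting requests.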
Problem description & steps to reproduce
The download feature in the llama-server web UI saves only the prompt; the model's response is never included in the downloaded file.
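To help isolate the bug to the web UI's download/export path, here is a hedged reproduction sketch (assuming the default port 8080 and the OpenAI-compatible chat endpoint that llama-server exposes): request a completion directly from the API and confirm the server produces the response text, then compare with what the UI's download button writes out.

curl http://192.168.1.68:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello"}]}'

If the assistant message appears in this JSON response but the file saved by the UI's download feature contains only the user prompt, the problem is confined to the front-end export logic rather than to generation.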
First Bad Commit
Relevant log output