Misc. bug: Server crash with use of lora on CPU #12587

Closed
amakropoulos opened this issue Mar 26, 2025 · 0 comments · Fixed by #12593

Name and Version

version: 4960 (fd7855f)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server -m qwen2-0_5b-instruct-q4_k_m.gguf -c 8192 -b 512 -np 1 --lora Qwen2-0.5B-Instruct-ru-lora.gguf
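
(Here -c 8192 sets the context size, -b 512 the logical batch size, -np 1 a single parallel slot, and --lora loads the LoRA adapter on top of the base model.)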

Problem description & steps to reproduce

The server crashes with a segmentation fault when using a LoRA adapter on the latest master, running on CPU (AVX2).

Model: qwen2-0_5b-instruct-q4_k_m.gguf
LoRA: Qwen2-0.5B-Instruct-ru-lora.gguf

I tracked the issue down to commit 3d82dbc; reverting that commit fixes the problem.
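
The crash site in the stack trace below dereferences tensor_traits unconditionally (auto OK = tensor_traits->repack(tensor, data, size);). A plausible reading is that tensor->extra was never populated for the LoRA tensors placed in the CPU_AARCH64 buffer, so the pointer is null. The following is a minimal, self-contained C++ sketch of that failure mode with a defensive guard; all type and function names are illustrative stand-ins, not ggml's actual internals, and this is not necessarily the fix that landed in #12593.

    #include <cstddef>
    #include <cstdio>
    #include <cstring>

    // Illustrative stand-in for the per-tensor traits object that the
    // CPU_AARCH64 buffer stores in tensor->extra.
    struct tensor_traits {
        int repack(const void * data, std::size_t size) {
            (void) data; (void) size; // pretend to repack into an AArch64 layout
            return 0;
        }
    };

    struct tensor {
        void        * extra;      // traits pointer; may be null
        unsigned char data[1024]; // backing storage
    };

    // Guarded version of the set_tensor path: fall back to a plain copy
    // when no traits were assigned, instead of dereferencing a null pointer.
    static void set_tensor(tensor * t, const void * data,
                           std::size_t offset, std::size_t size) {
        auto * traits = static_cast<tensor_traits *>(t->extra);
        if (traits == nullptr) {
            std::memcpy(t->data + offset, data, size);
            return;
        }
        traits->repack(data, size); // the unguarded call is where the SIGSEGV hits
    }

    int main() {
        unsigned char src[16] = {};
        tensor lora_weight = {}; // extra == nullptr, like the crashing LoRA tensor
        set_tensor(&lora_weight, src, 0, sizeof(src)); // safe with the guard
        std::printf("ok\n");
        return 0;
    }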

First Bad Commit

3d82dbc

Relevant log output

Stacktrace:

llama_adapter_lora_init_impl: loading lora adapter from '/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf' ...
llama_adapter_lora_init_impl: CPU_Mapped LoRA buffer size =    14.67 MiB
llama_adapter_lora_init_impl: CPU_AARCH64 LoRA buffer size =     2.11 MiB

Thread 1 "llama-server" received signal SIGSEGV, Segmentation fault.
0x00007ffff7ecc32a in ggml_backend_cpu_aarch64_buffer_set_tensor (buffer=0x5555581abd90, tensor=0x5555564b2700, data=0x5555564d0150, offset=0, size=155648) at /home/benuix/codes/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:5632
5632	    auto OK            = tensor_traits->repack(tensor, data, size);
(gdb) where
#0  0x00007ffff7ecc32a in ggml_backend_cpu_aarch64_buffer_set_tensor (buffer=0x5555581abd90, 
    tensor=0x5555564b2700, data=0x5555564d0150, offset=0, size=155648)
    at /home/benuix/codes/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:5632
#1  0x00007ffff772ac03 in ggml_backend_tensor_set (tensor=0x5555564b2700, data=0x5555564d0150, 
    offset=0, size=155648) at /home/benuix/codes/llama.cpp/ggml/src/ggml-backend.cpp:268
#2  0x00007ffff7b6dcc7 in operator() (__closure=0x7fffffff94b0, orig=0x555555b53a80, dev=0x5555564b2700)
    at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:316
#3  0x00007ffff7b6efc0 in llama_adapter_lora_init_impl (model=..., 
    path_lora=0x555555adf860 "/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf", 
    adapter=...) at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:321
#4  0x00007ffff7b6f619 in llama_adapter_lora_init (model=0x555555b13930, 
    path_lora=0x555555adf860 "/home/benuix/.config/LLMUnity/models/Qwen2-0.5B-Instruct-ru-lora.gguf")
    at /home/benuix/codes/llama.cpp/src/llama-adapter.cpp:333
#5  0x000055555582ab0a in common_init_from_params (params=...)
    at /home/benuix/codes/llama.cpp/common/common.cpp:993
#6  0x0000555555645333 in server_context::load_model (this=0x7fffffffc370, params=...)
    at /home/benuix/codes/llama.cpp/examples/server/server.cpp:1849
#7  0x000055555560127d in main (argc=11, argv=0x7fffffffdb28)
    at /home/benuix/codes/llama.cpp/examples/server/server.cpp:4488