
Misc. bug: llama-embedding asserts: GGML_ASSERT(params.n_batch >= params.n_ctx); #12860

@yurivict

Description


Name and Version

$ llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 2060 (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 49152 | int dot: 0 | matrix cores: KHR_coopmat
register_backend: registered backend Vulkan (1 devices)
register_device: registered device Vulkan0 (NVIDIA GeForce RTX 2060)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (CPU)
version: 0 (unknown)
built with FreeBSD clang version 19.1.7 (https://github.com/llvm/llvm-project.git llvmorg-19.1.7-0-gcd708029e0b2) for x86_64-unknown-freebsd14.2

Operating systems

BSD

Which llama.cpp modules do you know to be affected?

No response

Command line

llama-embedding -m llama-2-7b-chat.Q4_K_M.gguf -ngl 22 -c 4096

Problem description & steps to reproduce

Location of the assertion (examples/embedding/embedding.cpp, per the backtrace below):

135│     // max batch size
136│     const uint64_t n_batch = params.n_batch;
137├───> GGML_ASSERT(params.n_batch >= params.n_ctx);

Values:

(gdb) p params.n_ctx
$2 = 4096
(gdb) p n_batch
$3 = 2048
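The cause appears to be a parameter mismatch rather than memory corruption: -c 4096 raises n_ctx to 4096, but n_batch stays at its default of 2048, which trips the assertion. llama-embedding requires n_batch >= n_ctx, presumably because each input prompt has to be decoded in a single batch. Assuming that is the intent, raising the batch size to match the context via the -b/--batch-size flag should avoid the abort (a sketch of a workaround, not verified on this FreeBSD/Vulkan setup):

llama-embedding -m llama-2-7b-chat.Q4_K_M.gguf -ngl 22 -c 4096 -b 4096

That said, aborting through GGML_ASSERT on a user-supplied flag combination is unfriendly; reporting the constraint (batch size must be at least the context size for embeddings) as a normal error message would be preferable.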

First Bad Commit

n/a

Relevant log output

...
system_info: n_threads = 4 (n_threads_batch = 4) / 8 | CPU : SSE3 = 1 | SSSE3 = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | 
/usr/ports/misc/llama-cpp/work/llama.cpp-b5097/examples/embedding/embedding.cpp:137: GGML_ASSERT(params.n_batch >= params.n_ctx) failed
[New LWP 769694 of process 69927]
[New LWP 769695 of process 69927]
[New LWP 769696 of process 69927]
[New LWP 769697 of process 69927]
[New LWP 769698 of process 69927]
[New LWP 769699 of process 69927]
[New LWP 769705 of process 69927]
[New LWP 769706 of process 69927]
[New LWP 769707 of process 69927]
_wait4 () at _wait4.S:4
4       PSEUDO(wait4)
#0  _wait4 () at _wait4.S:4
4       PSEUDO(wait4)
#1  0x000000004dec3a9c in __thr_wait4 (pid=81876, status=0x4cc8a88c, options=0, rusage=0x0) at /disk-samsung/freebsd-src/lib/libthr/thread/thr_syscalls.c:578
578             ret = __sys_wait4(pid, status, options, rusage);
#2  0x000000005060bdc1 in ggml_print_backtrace () at /usr/ports/misc/llama-cpp/work/llama.cpp-b5097/ggml/src/ggml.c:156
156             waitpid(pid, &wstatus, 0);
#3  0x000000005060bc52 in ggml_abort (file=0x23102f "/usr/ports/misc/llama-cpp/work/llama.cpp-b5097/examples/embedding/embedding.cpp", line=137, fmt=0x22c62a "GGML_ASSERT(%s) failed") at /usr/ports/misc/llama-cpp/work/llama.cpp-b5097/ggml/src/ggml.c:183
183         ggml_print_backtrace();
#4  0x000000000033a528 in main (argc=7, argv=0x4cc8c128) at /usr/ports/misc/llama-cpp/work/llama.cpp-b5097/examples/embedding/embedding.cpp:137
137         GGML_ASSERT(params.n_batch >= params.n_ctx);
[Inferior 1 (process 69927) detached]
Abort trap
