GGML GPT-2 consistently dies at around 825 tokens with: ggml_new_object: not enough space in the context's memory pool #480

@rmc135

Description

Firstly, thanks to GG and contributors for a great library/utility.

When generating with gpt-2, ggml bombs out at around 824 or 825 tokens, printing an error and then dumping core.

I would expect a failure (hopefully not a fatal error and core dump) once the total token count reaches the context size, but a limit of 824 or 825 tokens seems an odd number?

The same error is referenced in the llama.cpp repo, but possibly for a different reason: ggml-org/llama.cpp#2404

REPRODUCE:

Clean build, CPU only, Ubuntu 22: git pull && rm -Rf build && mkdir build && cd build && cmake .. && make

with ggml-model-f16.bin (gpt2-xl), e.g. bin/gpt-2 -m ~/gpt-2/models/1558M/ggml-model-f16.bin -n ...
-n 823: ok (run completes without error)
-n 824: ggml_new_object: not enough space in the context's memory pool (needed 268457104, available 268435456)
-n 825: ggml_new_object: not enough space in the context's memory pool (needed 268457104, available 268435456)

with ggml-model-f32.bin (gpt2-xl):
-n 823: ok
-n 824: ok
-n 825: ggml_new_object: not enough space in the context's memory pool (needed 268457104, available 268435456)

Note: I had to repeat some runs several times as ggml will stop prematurely if an <|endoftext|> token is generated. Getting to 823+ tokens can take a few tries.
