I am currently working on memory improvements for training and testing bigger models than before.
Training big models requires raising the maximum node and parameter counts.
At some point this caused stack overflows, because the graph struct becomes very large.

To solve this, I allocate the graphs on the heap by using the data buffer of a new tensor, and then use only the build_expand functions instead of the regular build functions.
This required a new ggml_build_backward_expand function, which takes the backward graph as a pointer parameter, just like the forward graph. ggml_build_backward now just calls this function.

GGML_API void ggml_build_backward_expand(struct ggml_context * ctx, struct ggml_cgraph * gf, struct ggml_cgraph * gb, bool keep);

Example of allocating a new graph:

// Round the element count up so the I32 buffer covers sizeof(struct ggml_cgraph) bytes.
const size_t elem_size = ggml_type_size(GGML_TYPE_I32);
struct ggml_tensor * gfbuf = ggml_new_tensor_1d(ctx0, GGML_TYPE_I32, (sizeof(struct ggml_cgraph) + elem_size - 1) / elem_size);
memset(gfbuf->data, 0, ggml_nbytes(gfbuf)); // start from an empty graph
struct ggml_cgraph * gf = (struct ggml_cgraph *) gfbuf->data;
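The element count passed to ggml_new_tensor_1d above is a rounded-up division: the number of I32 elements needed to cover the struct size in bytes. As a rough sketch, the same arithmetic can be pulled into a small helper. The name elems_for_bytes is hypothetical and not part of ggml:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of ggml): number of elem_size-byte
 * elements needed to hold nbytes bytes, rounded up. */
static size_t elems_for_bytes(size_t nbytes, size_t elem_size) {
    assert(elem_size > 0);
    return (nbytes + elem_size - 1) / elem_size;
}
```

With 4-byte elements, 8 bytes need 2 elements and 9 bytes need 3, so the buffer is never smaller than the struct it has to hold.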

This seems to be enough to solve the stack overflows in training.

In the llama.cpp inference code there are only a few places where a cgraph is allocated on the stack.
Replacing these occurrences with heap-allocated graphs and then just using ggml_build_forward_expand should be straightforward.

A function to directly allocate a ggml object with enough bytes from the context would be nice!
Creating a new tensor and casting its tensor->data pointer looks a bit hackish.

Maybe something like this:

GGML_API void * ggml_alloc(struct ggml_context * ctx, size_t nbytes);
...
struct ggml_cgraph * gf = ggml_alloc(ctx0, sizeof(struct ggml_cgraph));

I prefer allocating the memory from the context over just 'malloc', so that the machinery for freeing the context and all its related memory can be reused.
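To illustrate why allocating from the context is attractive, here is a minimal sketch of the general idea using a toy bump allocator. Everything in it (toy_context, toy_init, toy_alloc, toy_free, TOY_ALIGN) is hypothetical and only stands in for a ggml-style context; it is not how ggml is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a context: one buffer, handed out in order. */
struct toy_context {
    uint8_t * base;  /* start of the buffer */
    size_t    size;  /* total bytes in the buffer */
    size_t    used;  /* bytes handed out so far */
};

#define TOY_ALIGN 16 /* allocation offsets are multiples of this */

static struct toy_context * toy_init(size_t size) {
    struct toy_context * ctx = malloc(sizeof(*ctx));
    if (!ctx) return NULL;
    ctx->base = malloc(size);
    if (!ctx->base) { free(ctx); return NULL; }
    ctx->size = size;
    ctx->used = 0;
    return ctx;
}

/* Returns zeroed memory from the context, or NULL if it does not fit. */
static void * toy_alloc(struct toy_context * ctx, size_t nbytes) {
    assert(ctx != NULL);
    /* Round the current offset up to the next multiple of TOY_ALIGN. */
    size_t start = (ctx->used + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);
    if (start > ctx->size || nbytes > ctx->size - start) return NULL;
    ctx->used = start + nbytes;
    memset(ctx->base + start, 0, nbytes);
    return ctx->base + start;
}

/* Frees the context and, with it, every object allocated from it. */
static void toy_free(struct toy_context * ctx) {
    if (!ctx) return;
    free(ctx->base);
    free(ctx);
}
```

With something like this, a graph struct obtained from toy_alloc needs no separate free call: releasing the context releases the graph too, which is the reuse of the freeing machinery described above.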

ggml : improve API to allow allocating compute graphs on the heap #299
