Description
One of the biggest problems with ggml currently is that the user needs to manually pre-calculate the necessary sizes for all the ggml_context objects that they create. This is a consequence of the goal to have as few memory allocations as possible at runtime. However, it has resulted in an unpleasant user experience and needs to be improved.
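To make the pain point concrete, here is a minimal, self-contained sketch of the pattern described above. It does not use the ggml API: `arena`, `matrix_bytes`, `arena_new_matrix`, and the 64-byte overhead constant are hypothetical stand-ins for a context with a fixed memory budget. The point is only that the caller has to add up every object's size, including bookkeeping and alignment padding, before creating anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a context: a fixed-size arena that never grows. */
typedef struct {
    uint8_t *base;
    size_t   size;
    size_t   used;
} arena;

/* Assumed per-object bookkeeping cost and data alignment (illustrative values). */
#define ARENA_OBJ_OVERHEAD 64
#define ARENA_ALIGN        16

static size_t align_up(size_t n) {
    return (n + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
}

/* Bytes that one float matrix consumes in the arena, overhead included.
   The caller must use this to pre-calculate the total budget. */
static size_t matrix_bytes(size_t rows, size_t cols) {
    return ARENA_OBJ_OVERHEAD + align_up(rows * cols * sizeof(float));
}

/* The only heap allocation: one block of exactly `size` bytes. */
static int arena_init(arena *a, size_t size) {
    a->base = (uint8_t *)malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Carve a matrix out of the arena; returns NULL if the budget was too small. */
static float *arena_new_matrix(arena *a, size_t rows, size_t cols) {
    size_t need = matrix_bytes(rows, cols);
    if (need > a->size - a->used) {
        return NULL;
    }
    float *data = (float *)(a->base + a->used + ARENA_OBJ_OVERHEAD);
    a->used += need;
    return data;
}

static void arena_free(arena *a) {
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

In this sketch, a caller who wants two 4x8 matrices must initialize the arena with `2 * matrix_bytes(4, 8)` bytes up front. Forgetting a single temporary, or changing a shape without updating the sum, makes a later `arena_new_matrix` call fail at runtime.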
Additionally, the "scratch buffer" mechanism is very difficult to use and needs to be overhauled as well.
This will be quite a big change to the core library and there are many different ways to approach it, so for now I will keep the description of the issue short. Tagging @slaren, as he has shared some nice ideas on this topic. We can discuss them further here and decide on a good strategy for implementing them.