Total Size Of The App #3512
Replies: 3 comments
-
I'm assuming you mean how much memory it uses while you're actually running the model? That depends on factors like the quantization type and the context size you set. As a rule of thumb, it will require memory roughly equal to the model file size, plus extra for the context. This also depends on the type of model: LLaMAv1 models, for example, use much more memory for the context than LLaMAv2 models, and other models like StarCoder, Baichuan, etc. may vary as well. You can expect a Q4_K-quantized 7B LLaMA model to need around 4GB of RAM just to load, and then perhaps another 1-2GB depending on the context size. This is a very inexact ballpark figure, just to give you an idea of the general range.
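To make the context-memory part concrete, here's a rough back-of-the-envelope sketch (my own helper, not an official llama.cpp formula) of the f16 KV-cache size, which is the main context-dependent allocation:

```python
def kv_cache_bytes(n_layers: int, n_embd: int, n_ctx: int, bytes_per_elt: int = 2) -> int:
    """Approximate KV-cache size: K and V caches, each holding
    n_layers * n_ctx * n_embd elements (f16 = 2 bytes per element)."""
    return 2 * n_layers * n_ctx * n_embd * bytes_per_elt

# LLaMA-2 7B: 32 layers, 4096 embedding dim (model params from the public config)
for n_ctx in (1024, 2048, 4096):
    gb = kv_cache_bytes(32, 4096, n_ctx) / 1e9
    print(f"n_ctx={n_ctx}: ~{gb:.2f} GB")
```

At a full 4096-token context this works out to roughly 2GB on top of the model weights, which lines up with the 1-2GB ballpark above; quantized KV caches would shrink this further.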
-
I was able to run StarCoder 1B on my iPhone 13; details here: #3284. 7B models need to be heavily quantized (Q2_K) to fit in 4GB of RAM, and need GPU offloading to run at decent speeds. Expect these models to take 1-3GB of storage space.
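The 1-3GB storage figure follows from the bits-per-weight of each quantization format. Here's a rough sketch (the bits-per-weight values are my approximations for llama.cpp k-quants, not exact numbers):

```python
# Approximate effective bits per weight for some llama.cpp quant formats.
# These are ballpark figures; actual file sizes vary slightly by model.
BITS_PER_WEIGHT = {
    "Q2_K": 2.5625,
    "Q4_K_M": 4.5,
    "Q8_0": 8.5,
}

def model_file_size_gb(n_params_billion: float, quant: str) -> float:
    """Estimate the on-disk size of a quantized model in GB."""
    bpw = BITS_PER_WEIGHT[quant]
    return n_params_billion * 1e9 * bpw / 8 / 1e9

print(f"7B @ Q2_K   ≈ {model_file_size_gb(7, 'Q2_K'):.1f} GB")
print(f"7B @ Q4_K_M ≈ {model_file_size_gb(7, 'Q4_K_M'):.1f} GB")
```

So a 7B model at Q2_K lands around 2.2GB on disk, within the 1-3GB range mentioned above, while Q4_K_M is closer to 4GB.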
-
In my experience, an iPad Air (M1, 8GB RAM) can run 13B models (Q2_K to Q4_0), but only with a smaller context size. I'd guess it should also work on an iPhone 15 Pro.
-
I am thinking about building an iOS app that will use a LLaMA model. What will the size of the model be when it runs on the phone?