January 2025 Update #1036
Conversation
…eeds more changes for other backends
Thanks for your hard work. Any chance this can be updated with the latest binaries? Mainstream got the QRWKV Hybrid, Phi-3.5-MoE, Deepseek V3, and C4AI Command-R 7B models, plus INT8 and BF16 implementations.
The binaries here are from just one week ago, so hopefully most of those things are already covered. Currently we need to fix the issues loading the binaries on Linux and macOS; once that's figured out, the next update should be easier 🤞
Backend library split compatibility for January 2025 binary update
Wip december update fixes v2
…er defined AVX level
Wip december update fixes v3
Unit tests passed on Windows CUDA & Linux CUDA. The test application is running fine on:
Works on my machine! (osx-arm64)
Tested all of …
Updated llama.cpp binaries to 0827b2c1da299805288abbd556d869318f2b121e.
This introduces new binaries. Previously there were two for each platform, e.g. ggml.dll and llama.dll for CUDA. Now there are more: ggml.dll, ggml-base.dll, ggml-cpu.dll, ggml-cuda.dll, llama.dll. Currently these are handled in the same way as the old system: each platform has its own set of completely independent binaries. In the future this should be modified to dynamically load backends, as sketched below. See more details here and here.
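For illustration, here is a minimal sketch of what dynamic backend loading could look like in .NET using System.Runtime.InteropServices.NativeLibrary. The BackendLoader class, the load order, and the CUDA-to-CPU fallback are assumptions made up for this example; they are not LLamaSharp's actual implementation.

```csharp
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical loader, not LLamaSharp's real API: loads the split ggml
// binaries in dependency order before llama.dll, preferring the CUDA
// backend and falling back to the CPU backend.
static class BackendLoader
{
    public static void LoadAll(string runtimeDir)
    {
        // ggml-base.dll provides the core symbols the backend libraries depend on,
        // so it is loaded first (assumed dependency order for this sketch).
        NativeLibrary.Load(Path.Combine(runtimeDir, "ggml-base.dll"));

        // Try the CUDA backend; fall back to the CPU backend if loading fails
        // (e.g. no CUDA driver is present on this machine).
        if (!NativeLibrary.TryLoad(Path.Combine(runtimeDir, "ggml-cuda.dll"), out _))
        {
            NativeLibrary.Load(Path.Combine(runtimeDir, "ggml-cpu.dll"));
        }

        // ggml.dll and llama.dll sit on top of whichever backend is resident.
        NativeLibrary.Load(Path.Combine(runtimeDir, "ggml.dll"));
        NativeLibrary.Load(Path.Combine(runtimeDir, "llama.dll"));
    }
}
```

The appeal of this approach is that a single set of shared libraries could serve every platform, with the best available backend selected at runtime instead of shipping a completely independent binary set per platform.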
Testing: