Temp #18

apicalshark · 2024-11-11T05:27:22Z

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

* llama.swift : exclude ggml-metal-embed.metal
* swift : exclude build/

* ggml : add ggml_flash_attn_ext_get_prec
* metal : use F16 precision in FA kernels ggml-ci
* metal : minor clean-up
* metal : compile-guard bf16 FA kernels ggml-ci
* build : remove obsolete compile flag [no ci]
* metal : prevent int overflows [no ci]
* cuda : disable BF16 FA ggml-ci
* metal : fix BF16 requirement for FA kernels ggml-ci
* make : clean-up [no ci]

* metal : opt-in compile flag for BF16 ggml-ci
* ci : use BF16 ggml-ci
* swift : switch back to v12
* metal : has_float -> use_float ggml-ci
* metal : fix BF16 check in MSL ggml-ci

…-org#10156)

This change upstreams llamafile's CPU matrix multiplication kernels for ppc64le, using MMA builtins for the FP32 datatype. It yields a consistent 90% improvement in input processing time and a 20% to 80% improvement in output processing time across various batch sizes.

The patch was tested with the Meta-Llama-3-8B, Mistral-7B, and Llama-2-7B-chat-hf models on an IBM POWER10 machine.

Signed-off-by: Amrita H S <[email protected]>

…ator when ‘ne’ is small (ggml-org#10213)

* metal : reorder write loop
* metal : int -> short, style ggml-ci

…l-org#10226)

* Add back samplers to server
* Added tooltips with basic information
* Fixed stretching of input fields.
* use component for settings input, move help msg to tooltips

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/807e9154dcb16384b1b765ebe9cd2bba2ac287fd?narHash=sha256-l253w0XMT8nWHGXuXqyiIC/bMvh1VRszGXgdpQlfhvU%3D' (2024-10-29)
  → 'github:NixOS/nixpkgs/4aa36568d413aca0ea84a1684d2d46f55dbabad7?narHash=sha256-Zwl8YgTVJTEum%2BL%2B0zVAWvXAGbWAuXHax3KzuejaDyo%3D' (2024-11-05)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

jhen0409 and others added 15 commits November 8, 2024 11:34

swift : exclude ggml-metal-embed.metal (ggml-org#10211)

d05b312

* llama.swift : exclude ggml-metal-embed.metal
* swift : exclude build/

metal : improve clarity (minor) (ggml-org#10171)

695ad75

metal : opt-in compile flag for BF16 (ggml-org#10218)

ec450d3

* metal : opt-in compile flag for BF16 ggml-ci
* ci : use BF16 ggml-ci
* swift : switch back to v12
* metal : has_float -> use_float ggml-ci
* metal : fix BF16 check in MSL ggml-ci

scripts : fix pattern and get n_tokens in one go (ggml-org#10221)

8fc393f

ggml: fix zero division in ‘dne’ calculation in CUDA COUNT_EQUAL operator when ‘ne’ is small (ggml-org#10213)

5b359bb

metal : hide debug messages from normal log

46323fa

llama : fix Qwen model type strings

f018acb

metal : fix F32 accumulation in FA vec kernel (ggml-org#10232)

bb38cdd

metal : fix build and some more comments (ggml-org#10229)

39a334a

metal : reorder write loop in mul mat kernel + style (ggml-org#10231)

6423c65

* metal : reorder write loop
* metal : int -> short, style ggml-ci

vulkan: Fix newly added tests for permuted mul_mat and 1D im2col (ggml-org#10226)

160687b

github-actions bot added the examples, server, devops, testing, ggml, Nvidia GPU, and Vulkan labels Nov 11, 2024

Merge branch 'master' into temp

89329f7

apicalshark merged commit 35e9ed4 into master Nov 11, 2024
6 of 8 checks passed

apicalshark deleted the temp branch November 11, 2024 05:34
