Implement SparseK Attention mechanism — new GGML operator with CPU backend (GPU planned next) #16817
Open
yael-works wants to merge 47 commits into ggml-org:master from yael-works:feature/sparsek-attn-sycl
+495 −12
Changes from 7 commits
Commits
efd9ad4
chore: ignore local backup files
8db1307
feat(SparseK): integrate dynamic mask build into llama-graph
68ab48c
remove accidental .gitignore
ce761f8
Without unnecessary spaces
GittyBurstein 9d07172
restore .gitignore from upstream/master
af711f8
SparseK: apply review feedback (use ggml_scale_bias, single flash_att…
3933069
SparseK: apply review feedback (use ggml_scale_bias, single flash_att…
0c2dd04
fix(SparseK): use ggml_scale_bias directly on scores
GittyBurstein c6a5db4
restore SparseK kv-cache implementation (recovered from local file)
yael-works a6784f0
SparseK: update graph build — replace src/llama-graph.{h,cpp}
f9bd873
sparsek: finalize mask reshape and validation fixes
yael-works de64151
sparsek: replace ggml_scale_bias with standard ops for portability
yael-works 08e359d
sparsek: align base mask 4D shape and add topk==0 guard for robustness
yael-works 49a8a81
SparseK: clean dynamic mask path, remove legacy reshapes, avoid kv-ca…
yael-works ea21d8f
SparseK: finalize graph pipeline cleanup, remove deprecated path and …
161e7cd
SparseK: integrate dynamic attention mask, GGUF metadata, and model l…
yael-works b9a960f
SparseK: less nodes in the graph
b7315fc
Restore head_count block and remove incorrect SparseK metadata (per C…
yael-works 35180a1
SparseK: fix duplicate get_key<bool> instantiations
2fd25a8
SparseK: don't alter KQ mask when disabled
5c3c65c
SparseK: do not alter KV mask when disabled
5798c33
Add SparseK KQ mask unit test
yael-works 48ccccd
Clean SparseK KQ mask test and fix warnings
yael-works a365437
Align SparseK KV mask env gating with unit test
yael-works db3e875
Sparse-K: integrate graph changes and HF->GGUF metadata fixes
194f6a3
Merge branch 'feature/sparsek-attn-sycl' of https://github.com/yael-w…
60c75e7
SparseK: fix meta-buffer expansion and resolve CI failure
88ac1d9
SparseK: silence unused parameters in unit tests for CI
e6b0b10
SparseK: update reference test for kq_mask
46e192f
SparseK: silence release warnings in unit test helpers
a9d2015
SparseK: fix release warnings in unit test (assert helpers + finite_c…
060ee50
tests: integrate SparseK KQ mask test
yael-works 729973b
Merge branch 'master' into feature/sparsek-attn-sycl
yael-works 205fded
Fix duplicate get_key<bool> instantiation
6e36508
Remove tests/test-sparsek_kq_mask.cpp to match remote branch (resolve…
ed9ed7e
SparseK: Fix KQ mask test shapes to match ggml_get_rows 3D semantics
212d47f
SparseK: cleanup meta context and rely on graph_max_nodes headroom
087ecf3
SparseK: fix test-backend-ops overrides + update mask graph implement…
5c2849d
Remove test-backend-ops.cpp from PR
3687665
SparseK: fix graph node budget and stable mask construction
4045566
Fix flake8 E302 in convert_hf_to_gguf
f7b79ce
fix errors
d3b6c26
Fix unused variable 'picked' in SparseK mask builder
18adb6f
without spaces
57b907e
try to chek the SPARSE
3f1005b
mark SparseK tests as NOT_SUPPORTED on Vulkan
04d6c83
Merge branch 'master' into feature/sparsek-attn-sycl
yael-works