[Kernel] Adding basic Triton JitCache for triton_attn #16606
Open: bringlein wants to merge 49 commits into vllm-project:main from bringlein:ngl_jit_cache_pr
Commits
b3a01dc copying jit_cache, adapting jit cache for 2d kernel (bringlein)
3490bfc some cleanup (bringlein)
4cc5407 formatting, typos... (bringlein)
59755e2 ruff.... (bringlein)
f114090 adding assume const to jit cache (bringlein)
c43006e experimenting with static launch grid again (bringlein)
9da4df6 recovering good performance (bringlein)
d7fc0af going back to static launch grid (bringlein)
bf64b6d formatting... (bringlein)
f3fb7e9 make type checking of key arguments more helpful (bringlein)
dc3b28c applying jit cache for prefix prefill (bringlein)
e717040 fmt & ruff (bringlein)
fe2f6a5 ci (bringlein)
14cca7e remove changed requirements by mistake/pre-hook? (bringlein)
d37ef48 fmt... (bringlein)
5e4bb2f removing jit cache from prefix prefill again (bringlein)
c711433 cleanup (bringlein)
f8c6610 address review comments (bringlein)
f8b5001 fix type hints (bringlein)
ef3d6a3 add transparency as fallback mode (bringlein)
edf8633 CI whacamole (bringlein)
10df1df CI whacamole... (bringlein)
cf1cea9 Merge branch 'main' into ngl_jit_cache_pr (bringlein)
f6852ed adding triton 3.3 support (bringlein)
b93de23 Merge branch 'main' into ngl_jit_cache_pr (bringlein)
72d9858 fixing triton 3.3 support (1/x); add support for unified kernel (bringlein)
eeaab8d fixing triton 3.3 support (2/2) (bringlein)
9ffc6e4 cleanup and add env var (bringlein)
1c65d75 adding assume_const (bringlein)
43b500b make argument passing (slightly) faster (bringlein)
43aed8c Merge branch 'main' into ngl_jit_cache_pr (moving envs content) (bringlein)
e50534a fixing env var merge conflict (bringlein)
450770c adding attention metadata specific for triton_backend (bringlein)
f7705c0 fixing env file again (bringlein)
3a5c63e Revert "adding attention metadata specific for triton_backend" (bringlein)
e2ef23e more elegant fix on dependency of flash attention (bringlein)
8f5735b thrid way to un-break triton backend (bringlein)
ccd22c9 CI... (bringlein)
a94e99b making jitcache safe to use with autotuner (cyang49)
af094a3 CI whacamole... (bringlein)
c1b21d5 fixup spelling in a few spots (tlrmchlsmth)
be9d7d4 Merge branch 'main' into ngl_jit_cache_pr (tdoublep)
791b8b2 Added support for specialization. (tdoublep)
f4a436a Merge branch 'main' into ngl_jit_cache_pr (bringlein)
f72a768 minor cleanup; remove copy of launch grid (bringlein)
02a6ea4 improve docstring (bringlein)
cd987c2 Merge branch 'main' into ngl_jit_cache_pr (bringlein)
d52af9b ruff.... (bringlein)
e1cf444 fixing merge error (bringlein)