@maleksan85 maleksan85 commented Aug 7, 2024

Add memory cleanup after every shape tryout, and add a parameter that reduces the number of cache invalidation buffers.

Example command:
CACHE_INVALIDATE_BUFFERS=11 python3 vllm/gradlib/gradlib/gemm_tuner.py --input_file /root/workspace/gradlib/gemms_1_2048_512.csv --tuned_file /root/workspace/gradlib/gemms_tuned_1_2048_512.csv --indtype f16 --outdtype f16
This works well on Navi32 for Llama 2 7B.
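
For readers who want a picture of the two changes, here is a minimal sketch of the general pattern, not the code in this PR. The helper names (`cache_invalidate_buffers`, `release_memory`, `tune_shapes`) and the fallback value `HYPOTHETICAL_DEFAULT_BUFFERS` are assumptions made for illustration; only the `CACHE_INVALIDATE_BUFFERS` environment variable comes from the command above. The PyTorch step is optional and is skipped if `torch` is not installed.

```python
import gc
import os

# Hypothetical fallback used only in this sketch; the tuner's real default
# is not stated in this PR.
HYPOTHETICAL_DEFAULT_BUFFERS = 32


def cache_invalidate_buffers() -> int:
    """Read the buffer count from the CACHE_INVALIDATE_BUFFERS env variable."""
    value = int(os.environ.get("CACHE_INVALIDATE_BUFFERS",
                               HYPOTHETICAL_DEFAULT_BUFFERS))
    if value < 1:
        raise ValueError("CACHE_INVALIDATE_BUFFERS must be at least 1")
    return value


def release_memory() -> None:
    """Drop unreferenced Python objects, then return cached GPU memory."""
    gc.collect()
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return
    if torch.cuda.is_available():
        # Releases unused blocks held by PyTorch's caching allocator.
        torch.cuda.empty_cache()


def tune_shapes(shapes, tune_one):
    """Call tune_one(shape, n_buffers) per shape, cleaning up after each."""
    results = {}
    for shape in shapes:
        try:
            results[shape] = tune_one(shape, cache_invalidate_buffers())
        finally:
            # Runs even if tuning a shape fails, so memory held by one shape
            # does not accumulate while the remaining shapes are tried.
            release_memory()
    return results
```

Lowering the buffer count reduces how much device memory is held for cache invalidation during each tryout, which presumably matters most on GPUs with less memory; cleaning up in a `finally` block keeps a failed shape from leaving its allocations behind.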

@maleksan85 maleksan85 self-assigned this Aug 7, 2024
@maleksan85 maleksan85 requested a review from gshtras August 7, 2024 23:29
@maleksan85 maleksan85 merged commit 30f12f0 into main Aug 8, 2024
AdrianAbeyta pushed a commit that referenced this pull request Aug 12, 2024
* add memory clean up after every shape and parameter to reduce cache invalidation buffers

* small typo

* syntax change

---------

Co-authored-by: maleksan85 <[email protected]>
@maleksan85 maleksan85 deleted the gradlib_oom_improvement branch August 16, 2024 00:10