bert : various improvements #3

Merged
iamlemec merged 6 commits into iamlemec:master from the gg/mac-improvements branch on Feb 3, 2024

Conversation

ggerganov

I followed the README instructions on a macOS device and found some minor issues.

The V tensor does not need to be transposed explicitly: the transpose can be folded into the earlier ggml_permute() call. This should improve performance slightly, since it saves an extra copy.
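To illustrate why folding the transpose into the permute saves a copy, here is a minimal, self-contained sketch using a toy strided view. It does not use the ggml API: `View3`, `permute`, `make_contiguous`, and the assumed V layout of (head_dim, n_heads, n_tokens) are all hypothetical names and choices made for this example. The point is only that a permutation changes strides without moving data, so two reorderings can be composed before a single materializing copy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy strided 3-D view (NOT the ggml API): ne[i] is the extent of dim i,
// nb[i] its stride in elements. Dim 0 varies fastest when contiguous.
struct View3 {
    const float *data;
    std::array<std::size_t, 3> ne;
    std::array<std::size_t, 3> nb;
};

View3 contiguous_view(const float *data, std::size_t ne0, std::size_t ne1, std::size_t ne2) {
    return View3{data, {ne0, ne1, ne2}, {1, ne0, ne0 * ne1}};
}

// Reorder dims without touching the data: new dim i is old dim order[i].
View3 permute(const View3 &v, const std::array<int, 3> &order) {
    View3 r = v;
    for (int i = 0; i < 3; i++) {
        assert(order[i] >= 0 && order[i] < 3);
        r.ne[i] = v.ne[order[i]];
        r.nb[i] = v.nb[order[i]];
    }
    return r;
}

// Materialize a view into contiguous memory; this is the expensive copy.
std::vector<float> make_contiguous(const View3 &v) {
    std::vector<float> out;
    out.reserve(v.ne[0] * v.ne[1] * v.ne[2]);
    for (std::size_t i2 = 0; i2 < v.ne[2]; i2++)
        for (std::size_t i1 = 0; i1 < v.ne[1]; i1++)
            for (std::size_t i0 = 0; i0 < v.ne[0]; i0++)
                out.push_back(v.data[i0 * v.nb[0] + i1 * v.nb[1] + i2 * v.nb[2]]);
    return out;
}

// Two copies: (d, h, t) -> (d, t, h), copy, then transpose to (t, d, h), copy.
std::vector<float> v_two_step(const View3 &v) {
    View3 p = permute(v, {0, 2, 1});
    std::vector<float> tmp = make_contiguous(p);
    View3 t = contiguous_view(tmp.data(), p.ne[0], p.ne[1], p.ne[2]);
    return make_contiguous(permute(t, {1, 0, 2}));
}

// One copy: (d, h, t) -> (t, d, h) directly, transpose folded into the permute.
std::vector<float> v_one_step(const View3 &v) {
    return make_contiguous(permute(v, {2, 0, 1}));
}
```

Both functions produce the same contiguous buffer, but `v_one_step` touches the data once instead of twice.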

Regarding the ggml_soft_max_ext() change that I mentioned in the llama.cpp issue: it is not going to work, because it assumes the mask can be broadcast across the batches, which is not the case here. So the current implementation is fine.
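A rough sketch of why a broadcast mask can give wrong results: the code below is a plain C++ masked softmax, not ggml_soft_max_ext() itself, and it assumes (this is not stated above) that the mask encodes per-sequence padding, so each batch entry needs its own mask row. `masked_softmax` and the `mask_stride` parameter are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Row-wise softmax with an additive mask (0 = attend, -infinity = ignore).
// Assumes every row keeps at least one position unmasked.
//   mask_stride == n : one mask row per batch entry.
//   mask_stride == 0 : a single mask row is broadcast to every batch entry.
std::vector<float> masked_softmax(const std::vector<float> &scores,
                                  const std::vector<float> &mask,
                                  std::size_t mask_stride,
                                  std::size_t n_batch, std::size_t n) {
    std::vector<float> out(scores.size());
    for (std::size_t b = 0; b < n_batch; b++) {
        const float *s = scores.data() + b * n;
        const float *m = mask.data() + b * mask_stride;
        float *o = out.data() + b * n;

        // Subtract the row maximum for numerical stability.
        float row_max = -std::numeric_limits<float>::infinity();
        for (std::size_t i = 0; i < n; i++) row_max = std::max(row_max, s[i] + m[i]);

        float sum = 0.0f;
        for (std::size_t i = 0; i < n; i++) {
            o[i] = std::exp(s[i] + m[i] - row_max);  // exp(-inf) == 0 for masked slots
            sum += o[i];
        }
        for (std::size_t i = 0; i < n; i++) o[i] /= sum;
    }
    return out;
}
```

With two sequences of different lengths in one batch, calling this with `mask_stride == n` gives the padded positions of the shorter sequence zero weight, while `mask_stride == 0` reuses the first sequence's mask and lets padding receive attention.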

ggerganov force-pushed the gg/mac-improvements branch from 8b9845a to 8fbd461 on February 3, 2024 09:42
iamlemec (Owner) commented Feb 3, 2024

Awesome, thanks for the fixes!

iamlemec merged commit bad2726 into iamlemec:master on Feb 3, 2024
2 participants