
Fix conversion of some BERT embedding models #6937


Merged: 1 commit merged into ggml-org:master on Apr 29, 2024

Conversation

@christianazinn (Contributor) commented on Apr 26, 2024:

BERT-based embedding models require the use of convert-hf-to-gguf.py to be converted from safetensors/PyTorch format to GGUF. The BertModel class was missing the logic that resolves unsupported datatypes, so some models such as acge_text_embedding failed with `TypeError: Got unsupported ScalarType BFloat16` when running the line `data = data_torch.squeeze().numpy()`.
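For context, the failure comes from NumPy having no bfloat16 dtype, so any bfloat16 tensor hits this error as soon as `.numpy()` is called. A minimal standalone reproduction (a sketch assuming only a recent PyTorch install, outside the converter itself):

```python
import torch

# NumPy has no bfloat16 dtype, so converting a bfloat16 tensor fails with:
#   TypeError: Got unsupported ScalarType BFloat16
t = torch.zeros(4, dtype=torch.bfloat16)
data = t.squeeze().numpy()
```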

This is rectified by converting unsupported datatypes to f32. The same handling exists in every other model class, so it was probably just missed for BERT models. With this change, conversion and subsequent quantization work as expected, as seen in this GGUF quantization of acge_text_embedding created with the fix.
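A minimal sketch of the kind of change involved, mirroring the datatype handling already present in the other model classes (the exact placement inside convert-hf-to-gguf.py may differ from this sketch):

```python
# in BertModel's tensor-writing loop, before data = data_torch.squeeze().numpy():
# convert any unsupported data types (e.g. bfloat16) to float32
if data_torch.dtype not in (torch.float16, torch.float32):
    data_torch = data_torch.to(torch.float32)
```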

It's also literally just two lines of code, so unless converting these tensors in BERT models is specifically undesirable, I hope this is an easy fix to merge.

@christianazinn (Contributor, Author) commented:

Fixed the whitespace issues; would appreciate a review.

@iamlemec (Collaborator) commented:

Makes sense to me. I can confirm that conversion of acge_text_embedding works and the model outputs accurate results. Also checked that bge-base-en-v1.5 and nomic-embed-text-v1.5 give identical converted GGUFs between this PR and master.

@ggerganov merged commit 3055a41 into ggml-org:master on Apr 29, 2024
22 checks passed
nopperl pushed a commit to nopperl/llama.cpp that referenced this pull request May 5, 2024