Add support for GritLM #5959
Conversation
llama.h
Outdated
// Set whether to use causal attention or not
// If set to true, the model will only attend to the past tokens
LLAMA_API void llama_set_embeddings(struct llama_context * ctx, bool embeddings);
This should be improved - the comment is about causal attention, but the function is called llama_set_embeddings. It's not clear enough.

I think we need to introduce bool causal_attn in llama_cparams. It will be initialized by default with llama_hparams.causal_attn, but it would be possible to override it via llama_set_causal_attn(struct llama_context * ctx, bool causal_attn);

The if in llama_set_inputs() should be checking cparams.causal_attn. It would make more sense, because the if is about whether we build a causal mask or not. But the proposed change if (!cparams.embeddings) { is confusing.
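A minimal sketch of the proposed scheme, using hypothetical simplified structs rather than the real llama.cpp definitions, just to show the default-then-override flow:

```cpp
#include <cassert>

// Hypothetical, simplified structs -- not the actual llama.cpp definitions.
struct llama_hparams { bool causal_attn; };
struct llama_cparams { bool causal_attn; };
struct llama_context {
    llama_hparams hparams;
    llama_cparams cparams;
};

// On context creation, cparams.causal_attn is seeded from the model's hparams...
static void llama_context_init(struct llama_context * ctx) {
    ctx->cparams.causal_attn = ctx->hparams.causal_attn;
}

// ...and can later be overridden at runtime, e.g. to disable the causal
// mask while computing embeddings and re-enable it for generation.
static void llama_set_causal_attn(struct llama_context * ctx, bool causal_attn) {
    ctx->cparams.causal_attn = causal_attn;
}
```

With this split, llama_set_inputs() can branch on cparams.causal_attn when deciding whether to build a causal mask, instead of overloading the embeddings flag.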
Agreed, that is confusing. Pushing proposed changes now.
* add gritlm example
* gritlm results match
* tabs to spaces
* comment out debug printing
* rebase to new embed
* gritlm embeddings are back babeee
* add to gitignore
* allow to toggle embedding mode
* Clean-up GritLM sample code.
* Fix types.
* Flush stdout and output ending newline if streaming.
* mostly style fixes; correct KQ_mask comment
* add causal_attn flag to llama_cparams
* gritml : minor
* llama : minor

Co-authored-by: Douglas Hanley <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
Closes #5783
This PR adds support for GritLM (thanks to @iamlemec). I also added sample code to reproduce their example from here. The sample demonstrates how to produce both embeddings and generated text with the same model.
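The embedding half of a demo like this typically scores documents by cosine similarity between a query embedding and each document embedding. A self-contained sketch of that scoring step (an illustrative helper, not code from the sample itself):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity between two embedding vectors of equal length.
// Illustrative helper, not code from the GritLM sample itself.
static double cosine_sim(const std::vector<double> & a, const std::vector<double> & b) {
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot    += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}
```

Identical vectors score 1.0 and orthogonal vectors score 0.0, so higher scores mean a closer query/document match.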
I uploaded some model files here.