
Commit 67c5f4f

fix perplexity after c-api refactor by providing a large enough token buffer
1 parent d5850c5 commit 67c5f4f

File tree

1 file changed

+6
-1
lines changed


main.cpp

Lines changed: 6 additions & 1 deletion
@@ -85,7 +85,12 @@ void perplexity(llama_context * ctx, const gpt_params & params) {
     // Download: https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip?ref=salesforce-research
     // Run `./main --perplexity -m models/7B/ggml-model-q4_0.bin -f wiki.test.raw`
     // Output: `perplexity: 13.5106 [114/114]`
-    auto tokens = ::llama_tokenize(ctx, params.prompt.c_str(), true);
+    std::vector<llama_token> tokens(params.prompt.size()); // initialize to prompt number of chars, since n_tokens <= n_prompt_chars
+    {
+        const auto res = llama_tokenize(ctx, params.prompt.c_str(), tokens.data(), tokens.size(), true);
+        assert(res >= 0);
+        tokens.resize(res);
+    }
 
     int count = 0;
     double nll = 0.0;
