
Commit ca268b2

drasticactions authored and akawrykow committed
llama : use Unicode Escape Sequence to replace encoded characters (ggml-org#2814)
Special characters written literally in source files can break compilation on computers with different region and language settings. Using Unicode escape sequences should allow the code to compile on all setups without needing to change your computer's settings or switch regions.
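To illustrate why the escaped form is safer, here is a minimal sketch (not part of the commit) that dumps the bytes a compiler stores for the two escape sequences. The helper name `to_hex` is hypothetical. The expected bytes assume the compiler's execution character set is UTF-8, which is the default for GCC and Clang; MSVC may need `/utf-8` to produce the same result.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns the bytes of s as space-separated uppercase hex.
std::string to_hex(const std::string & s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        if (!out.empty()) {
            out += ' ';
        }
        std::snprintf(buf, sizeof(buf), "%02X", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}

// Assumption: UTF-8 execution character set.
// U+0120 (the space marker) then encodes as the two bytes C4 A0,
// and U+010A (the newline marker) encodes as C4 8A.
// Because the source file now contains only ASCII, the compiler no longer
// has to guess the file's encoding to decode these two characters.
std::string space_marker_bytes()   { return to_hex("\u0120"); }
std::string newline_marker_bytes() { return to_hex("\u010A"); }
```

Under the stated assumption, `space_marker_bytes()` yields `C4 A0` and `newline_marker_bytes()` yields `C4 8A`. The escape only removes the dependency on the source file's encoding; the bytes stored in the binary still depend on the execution character set.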
1 parent af0d589 commit ca268b2

1 file changed: +4 -4 lines changed

llama.cpp

Lines changed: 4 additions & 4 deletions
@@ -955,10 +955,10 @@ struct llama_vocab {
     id linefeed_id = 13;

     int find_bpe_rank(std::string token_left, std::string token_right) const {
-        replace_all(token_left, " ", "Ġ");
-        replace_all(token_left, "\n", "Ċ");
-        replace_all(token_right, " ", "Ġ");
-        replace_all(token_right, "\n", "Ċ");
+        replace_all(token_left, " ", "\u0120");
+        replace_all(token_left, "\n", "\u010A");
+        replace_all(token_right, " ", "\u0120");
+        replace_all(token_right, "\n", "\u010A");

         auto it = bpe_ranks.find(std::make_pair(token_left, token_right));
         if (it == bpe_ranks.end()) {
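
For readers who want to try the changed lines in isolation, the following is a self-contained sketch of the lookup, not the actual llama.cpp code. The `replace_all` body, the `vocab_sketch` struct, the `std::map` type chosen for `bpe_ranks`, and the `-1` return for a missing pair are all assumptions made for illustration; the diff above is truncated before it shows what the real function returns.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the replace_all helper called in the diff:
// replaces every occurrence of `search` in `s` with `replace`, in place.
static void replace_all(std::string & s, const std::string & search, const std::string & replace) {
    if (search.empty()) {
        return; // avoid an infinite loop on an empty pattern
    }
    std::string::size_type pos = 0;
    while ((pos = s.find(search, pos)) != std::string::npos) {
        s.replace(pos, search.length(), replace);
        pos += replace.length(); // skip past the inserted text
    }
}

// Hypothetical, reduced version of the vocabulary struct.
struct vocab_sketch {
    // Assumption: merge ranks keyed by (left token, right token).
    std::map<std::pair<std::string, std::string>, int> bpe_ranks;

    int find_bpe_rank(std::string token_left, std::string token_right) const {
        // Same four calls as the "+" lines of the diff.
        replace_all(token_left,  " ",  "\u0120");
        replace_all(token_left,  "\n", "\u010A");
        replace_all(token_right, " ",  "\u0120");
        replace_all(token_right, "\n", "\u010A");

        auto it = bpe_ranks.find(std::make_pair(token_left, token_right));
        if (it == bpe_ranks.end()) {
            return -1; // assumption: sentinel for "pair not found"
        }
        return it->second;
    }
};
```

With a UTF-8 execution character set, a rank stored under the key (`"\xC4\xA0t"`, `"he"`) would be found by `find_bpe_rank(" t", "he")`, since the leading space is rewritten to U+0120 before the lookup.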
