Commit f5fe98d

docs : add grammar docs (#2701)
* docs : add grammar docs
* tweaks to grammar guide
* rework GBNF example to be a commented grammar
1 parent 777f42b commit f5fe98d

3 files changed: 107 additions, 0 deletions


README.md

Lines changed: 12 additions & 0 deletions
@@ -39,6 +39,7 @@ Last revision compatible with the old format: [dadbed9](https://github.com/ggerg
<li><a href="#memorydisk-requirements">Memory/Disk Requirements</a></li>
<li><a href="#quantization">Quantization</a></li>
<li><a href="#interactive-mode">Interactive mode</a></li>
<li><a href="#constrained-output-with-grammars">Constrained output with grammars</a></li>
<li><a href="#instruction-mode-with-alpaca">Instruction mode with Alpaca</a></li>
<li><a href="#using-openllama">Using OpenLLaMA</a></li>
<li><a href="#using-gpt4all">Using GPT4All</a></li>
@@ -604,6 +605,16 @@ PROMPT_TEMPLATE=./prompts/chat-with-bob.txt PROMPT_CACHE_FILE=bob.prompt.bin \
CHAT_SAVE_DIR=./chat/bob ./examples/chat-persistent.sh
```

### Constrained output with grammars

`llama.cpp` supports grammars to constrain model output. For example, you can force the model to output JSON only:

```bash
./main -m ./models/13B/ggml-model-q4_0.gguf -n 256 --grammar-file grammars/json.gbnf -p 'Request: schedule a call at 8pm; Command:'
```

The `grammars/` folder contains a handful of sample grammars. To write your own, check out the [GBNF Guide](./grammars/README.md).

### Instruction mode with Alpaca

1. First, download the `ggml` Alpaca model into the `./models` folder
@@ -885,3 +896,4 @@ docker run --gpus all -v /path/to/models:/models local/llama.cpp:light-cuda -m /
- [BLIS](./docs/BLIS.md)
- [Performance troubleshooting](./docs/token_generation_performance_tips.md)
- [GGML tips & tricks](https://github.com/ggerganov/llama.cpp/wiki/GGML-Tips-&-Tricks)
- [GBNF grammars](./grammars/README.md)

examples/main/README.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -288,6 +288,10 @@ These options help improve the performance and memory usage of the LLaMA models.
288288

289289
- `--prompt-cache FNAME`: Specify a file to cache the model state after the initial prompt. This can significantly speed up the startup time when you're using longer prompts. The file is created during the first run and is reused and updated in subsequent runs. **Note**: Restoring a cached prompt does not imply restoring the exact state of the session at the point it was saved. So even when specifying a specific seed, you are not guaranteed to get the same sequence of tokens as the original generation.
290290

291+
### Grammars
292+
293+
- `--grammar GRAMMAR`, `--grammar-file FILE`: Specify a grammar (defined inline or in a file) to constrain model output to a specific format. For example, you could force the model to output JSON or to speak only in emojis. See the [GBNF guide](../../grammars/README.md) for details on the syntax.
294+
291295
### Quantization
292296

293297
For information about 4-bit quantization, which can significantly improve performance and reduce memory usage, please refer to llama.cpp's primary [README](../../README.md#prepare-data--run).

grammars/README.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
# GBNF Guide

GBNF (GGML BNF) is a format for defining [formal grammars](https://en.wikipedia.org/wiki/Formal_grammar) to constrain model outputs in `llama.cpp`. For example, you can use it to force the model to generate valid JSON, or speak only in emojis. GBNF grammars are supported in various ways in `examples/main` and `examples/server`.

## Background

[Backus-Naur Form (BNF)](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form) is a notation for describing the syntax of formal languages like programming languages, file formats, and protocols. GBNF is an extension of BNF that primarily adds a few modern regex-like features.

## Basics

In GBNF, we define *production rules* that specify how a *non-terminal* (rule name) can be replaced with sequences of *terminals* (characters, specifically Unicode [code points](https://en.wikipedia.org/wiki/Code_point)) and other non-terminals. The basic format of a production rule is `nonterminal ::= sequence...`.
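
As a minimal sketch of this format, the hypothetical grammar below defines three production rules. The rule names `greeting` and `name` are invented for illustration and do not come from the `grammars/` folder. `root` expands to two non-terminals separated by a literal space, and each of those expands to terminals:

```
# hypothetical example: produces output such as "hello bob" or "hi alice"
root ::= greeting " " name
greeting ::= "hello" | "hi"
name ::= [a-z]+
```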

## Example

Before going deeper, let's look at some of the features demonstrated in `grammars/chess.gbnf`, a small chess notation grammar:
```
# `root` specifies the pattern for the overall output
root ::= (
    # it must start with the characters "1. " followed by a sequence
    # of characters that match the `move` rule, followed by a space, followed
    # by another move, and then a newline
    "1. " move " " move "\n"

    # it's followed by one or more subsequent moves, numbered with one or two digits
    ([1-9] [0-9]? ". " move " " move "\n")+
)

# `move` is an abstract representation, which can be a pawn, nonpawn, or castle.
# The `[+#]?` denotes the possibility of checking or mate signs after moves
move ::= (pawn | nonpawn | castle) [+#]?

pawn ::= ...
nonpawn ::= ...
castle ::= ...
```

## Non-Terminals and Terminals

Non-terminal symbols (rule names) stand for a pattern of terminals and other non-terminals. Rule names must be lowercase words, with dashes separating multiple words, like `move`, `castle`, or `check-mate`.

Terminals are actual characters ([code points](https://en.wikipedia.org/wiki/Code_point)). They can be specified as a sequence like `"1"` or `"O-O"` or as ranges like `[1-9]` or `[NBKQR]`.

## Characters and character ranges

Terminals support the full range of Unicode. Unicode characters can be specified directly in the grammar, for example `hiragana ::= [ぁ-ゟ]`, or with escapes: 8-bit (`\xXX`), 16-bit (`\uXXXX`) or 32-bit (`\UXXXXXXXX`).

Character ranges can be negated with `^`:
```
single-line ::= [^\n]+ "\n"
```
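
To make the escape forms concrete, here is a small sketch with hypothetical rule names. It writes the same hiragana range directly and with 16-bit and 32-bit escapes, assuming U+3041 (`ぁ`) and U+309F (`ゟ`) as the range endpoints, and uses an 8-bit escape for a tab character:

```
# three spellings of the same range (endpoints assumed to be U+3041 and U+309F)
hiragana-direct ::= [ぁ-ゟ]
hiragana-short-escape ::= [\u3041-\u309F]
hiragana-long-escape ::= [\U00003041-\U0000309F]

# 8-bit escape for a tab (U+0009), then anything except a tab or newline
tab-indented ::= "\x09" [^\t\n]+ "\n"
```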

## Sequences and Alternatives

The order of symbols in a sequence matters. For example, in `"1. " move " " move "\n"`, the `"1. "` must come before the first `move`, etc.

Alternatives, denoted by `|`, give different sequences that are acceptable. For example, in `move ::= pawn | nonpawn | castle`, `move` can be a `pawn` move, a `nonpawn` move, or a `castle`.

Parentheses `()` can be used to group sequences, which allows for embedding alternatives in a larger rule or applying repetition and optional symbols (below) to a sequence.
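
The sketch below, using hypothetical rules that are not part of `grammars/`, illustrates how parentheses change what `|` applies to. Without parentheses, `|` separates entire sequences; with them, only the grouped part alternates:

```
# without parentheses: matches "yes please" or "no"
reply-loose ::= "yes" " please" | "no"

# with parentheses: matches "yes please" or "no please"
reply-grouped ::= ("yes" | "no") " please"
```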

## Repetition and Optional Symbols

- `*` after a symbol or sequence means that it can be repeated zero or more times.
- `+` denotes that the symbol or sequence should appear one or more times.
- `?` makes the preceding symbol or sequence optional.
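
As an illustration of the three operators, consider the following sketch. The rule names are hypothetical and chosen only for this example:

```
# zero or more digits: matches "", "7", "2023"
maybe-digits ::= [0-9]*

# one or more digits: matches "7", "2023", but not ""
digits ::= [0-9]+

# optional minus sign before the digits: matches "42", "-42"
integer ::= "-"? [0-9]+

# repetition applied to a parenthesized sequence: matches "1", "1,2", "1,2,3"
number-list ::= [0-9]+ ("," [0-9]+)*
```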

## Comments and newlines

Comments can be specified with `#`:
```
# defines whitespace: one or more spaces, tabs, or newlines
ws ::= [ \t\n]+
```

Newlines are allowed between rules and between symbols or sequences nested inside parentheses. Additionally, a newline after an alternate marker `|` will continue the current rule, even outside of parentheses.
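
A short sketch of both cases, with hypothetical rule names: the first rule continues across lines because each newline follows a `|`, and the second spreads its symbols over several lines inside parentheses:

```
# a newline after each `|` continues the rule
color ::= "red" |
    "green" |
    "blue"

# newlines between symbols are allowed inside parentheses
pair ::= (
    color
    ", "
    color
)
```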

## The root rule

In a full grammar, the `root` rule always defines the starting point of the grammar. In other words, it specifies what the entire output must match.

```
# a grammar for lists
root ::= ("- " item)+
item ::= [^\n]+ "\n"
```

## Next steps

This guide provides a brief overview. Check out the GBNF files in this directory (`grammars/`) for examples of full grammars. You can try them out with:
```
./main -m <model> --grammar-file grammars/some-grammar.gbnf -p 'Some prompt'
```
