LL1Checker is a tool for verifying if a grammar is LL(1) and for validating whether an input string belongs to the language generated by the grammar. The grammar is read from a file.
- Project Status
- What's New in v5.0
- Run the Program
- Dependencies
- Considerations
- Structure of grammar.txt
- Want to Contribute?
- Compilation
- Documentation
This project is complete, with all essential functionalities implemented and tested with various grammars. It is now in maintenance mode for potential enhancements or bug fixes. If you encounter any issues or unexpected interpretations, please open an issue and include relevant details. Contributions for further improvement are welcome!
- The lexer, parser, and symbol table now work with numeric token identifiers, improving table lookups and reducing memory usage for large grammars.
- Boost.Spirit powers the dynamic lexer and the parser now mirrors its diagnostics, so both lexing and parsing errors display multi-line windows with a caret pointing at the failing token.
- Parse sessions can export a full parse tree to Graphviz-compatible
.dotfiles via--export-tree, making it easy to visualize derivations. - Grammar validation is stricter: duplicate terminal or non-terminal identifiers abort loading with a descriptive error, and parse tree exports now fail fast if the output file cannot be created.
- Command-line handling exits immediately on invalid options, preventing partial runs when arguments are malformed.
- The build system gained an opt-in static linking toggle (
LL1CHECKER_FORCE_STATIC_RUNTIME) and amake statichelper target; CMake will also fetchcxxoptsautomatically when it is not installed.
You can run the program as follows:
./ll1 <GRAMMAR_FILENAME> [TEXT_FILENAME] [OPTIONS]
<GRAMMAR_FILENAME>: Path to the file containing the grammar.[TEXT_FILENAME](optional): If provided, the program will validate whether the input string belongs to the language defined by the grammar.
-h, --help: Show help message.-v, --verbose: Enable verbose mode, displaying the LL(1) table and input content.--format <FORMAT>: Specify the table format (oldornew).- If set,
verbosemode is enabled automatically. - The default format is
"new".
- If set,
--text <STRING>: Directly parse the provided text instead of reading a file.--export-tree <FILE>: Write the parse tree to a Graphviz.dotfile when parsing succeeds.
./ll1 grammar.txt
- Verifies if the provided grammar is LL(1).
If the grammar provided is not LL1, an error will be displayed alongside its table:
./ll1 grammar.txt input.txt
- Verifies if the grammar is LL(1).
- Parses the
input.txtfile according to the grammar.
./ll1 grammar.txt --text "aa"
- Parses
"aa"according to the grammar without creating a file.
./ll1 grammar.txt input.txt -v
- Displays the entire LL(1) table.
- Prints the contents of
input.txtbefore parsing.
./ll1 grammar.txt -v --format old
- Use the old table format when the new format cannot be displayed correctly due to screen size.
- When a lexer error occurs, the program pinpoints the location:
- When a parse error occurs, the program highlights the offending token:
- Export the parse tree to a Graphviz
.dotfile with--export-tree. - Run
./ll1 grammar.txt input.txt --export-tree tree.dotto generatetree.dot. - Convert the output with
dot -Tpng tree.dot -o output.pngto obtain an image.
./ll1 examples/grammar_3.txt examples/input_3.txt --export-tree tree.dot
dot -Tpng tree.dot -o output.png
Error Handling:
If <GRAMMAR_FILENAME> or <TEXT_FILENAME> do not exist or cannot be opened, the program prints an error and exits.
- C++20 capable compiler and CMake β₯ 3.16.
- Boost headers (tested with Boost 1.78+); required for the lexer built on Boost.Spirit.
cxxoptsfor command-line parsing. CMake will reuse an existing installation or automatically fetch v3.1.1 during configuration.
- Avoid using the reserved symbols
<<EPSILON>>and<<EOF>>. - For terminal symbols, note that order matters. If two regexes have common elements, place the more specific one first, as in the example:
terminal WH "while";
terminal WORD [a-zA-Z][a-zA-Z]*;
The grammar file has two sections separated by ;: symbol definition and grammar definition.
The terminal symbols follow the structure terminal <IDENTIFIER> <REGEX>; (just like declaring a variable). The <IDENTIFIER> should adhere to the regex pattern [a-zA-Z_\'][a-zA-Z_\'0-9]*.
An example of the first section would be:
terminal a a;
start with S;
;
You should write the last line to designate S as the axiom. The grammar is augmented internally with a new non-terminal whose production is S' -> S <<EOF>>.
The grammar follows this structure:
S -> A$;
A -> aaA;
A ->;
;
The line A->; represents an empty production.
So, our grammar.txt would be:
terminal a a;
start with S;
;
S -> A$;
A -> aaA;
A ->;
;
This grammar generates the following language: L(G) = {aa, aaaa, aaaaaa, ...}, that is, a language with an even number of 'a'.
Place the string you want to validate in input.txt.
The repository follows a simple layout keeping sources and headers separated:
LL1Checker/
βββ CMakeLists.txt
βββ Makefile # Helper Makefile to speed up compilation or formatting
βββ src/ # Source files and CMake logic
βββ include/ # Header files
βββ app/ # Main file
βββ examples/ # Example grammars and inputs
βββ docs/ # Generated documentation
To get started, you'll need the following:
- A working C++20 toolchain plus the dependencies listed above.
- Familiarity with CMake; configuration will fetch
cxxoptsautomatically if it is not already available on your system.
Feel free to reach out if you have any questions or suggestions! π
To build the project using CMake:
cmake -S . -B build
cmake --build buildAlternatively, you can simply run:
makeThis will configure and compile the project using the default Debug build type. To compile in Release mode:
make BUILD_TYPE=ReleaseTo produce a static Release build (when your toolchain supports it):
make static
# or make BUILD_TYPE=Release STATIC=1 buildThis passes -DLL1CHECKER_FORCE_STATIC_RUNTIME=ON to CMake and links the C++ runtime statically where available.
The ll1 executable will be located at build/app/ll1.
The complete API documentation is available here: LL1Checker Documentation





