agile-learning-institute/stage0_runbook_grader

stage0_runbook_grader

Grade the Grader runbook

Grade Runbook

This runbook reads a configuration.yaml file, loads the data files it references from the input folder, and grades a model's responses to grading prompts. Each prompt asks the model to produce a grade when supplied with a given value and an expected value. Grades are written to the output folder as yyyy-mm-ddThh:mm:ss-grades.json.
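The flow described above can be pictured with a minimal sketch. This is not the runbook's actual implementation: `grade_all`, `grades_filename`, and the `grade_fn` callable (a stand-in for the model being graded) are hypothetical names, the `in_range` check assumes that the `min` and `max` key columns bound an acceptable grade, and the timestamp layout is assumed from the yyyy-mm-ddThh:mm:ss pattern used for the output file.

```python
from datetime import datetime
from typing import Callable


def grades_filename(now: datetime) -> str:
    """Build an output name in the assumed yyyy-mm-ddThh:mm:ss-grades.json form."""
    return now.strftime("%Y-%m-%dT%H:%M:%S") + "-grades.json"


def grade_all(keys: list[dict], grade_fn: Callable[[str, str], float]) -> list[dict]:
    """Ask grade_fn for a grade per key row and check it against min/max.

    Each key row is assumed to carry given, expected, min and max values.
    grade_fn is a hypothetical stand-in for the call to the model.
    """
    results = []
    for key in keys:
        grade = grade_fn(key["given"], key["expected"])
        results.append({
            "given": key["given"],
            "expected": key["expected"],
            "grade": grade,
            # Assumption: a grade is acceptable when it falls inside [min, max].
            "in_range": float(key["min"]) <= grade <= float(key["max"]),
        })
    return results
```

For example, passing `lambda given, expected: 1.0 if given == expected else 0.0` as `grade_fn` exercises the range check without calling a model at all.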

Expected Directory Structure

When you run the utility, you specify the folders to mount at /config, /input, and /output.

/📁 config
└── 📝 grade_config.yaml                # Grade Configuration
/📁 input
└── 📁 grader                           # Simple LLM message list with grading prompts and keys
    ├── ✏️ grader1.csv                  # grade prompts
    └── ✏️ grader_key.csv               # grading keys (given, expected, min, max)
/📁 output
└── 📀 yyyy-mm-ddThh:mm:ss-grades.json  # Grades from running the evaluation
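Before mounting the folders, it can help to confirm that a project matches this layout. The sketch below is only an illustration: `missing_paths` and `EXPECTED` are hypothetical names, and the check covers just the entries named in the listing above, not anything the tool itself validates.

```python
from pathlib import Path

# Relative paths taken from the layout above. The individual grader CSV
# files are left out because their names are set by the configuration.
EXPECTED = ("config/grade_config.yaml", "input/grader", "output")


def missing_paths(root: str) -> list[str]:
    """Return the expected entries that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]
```

Calling `missing_paths("./my_model")` returns an empty list when all three entries are present.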

Using this in your project.

Adjust the command below to use appropriate values for your Echo Bot project, and add it to your pipenv scripts. See the test_data folder for sample files. When you run the tool, grades are written to a file called {datetime}-grades.json in the output folder.

docker run --rm \
    -v ./my_model/input:/input \
    -v ./my_model/config:/config \
    -v ./my_model/output:/output \
    ghcr.io/agile-learning-institute/stage0-echo-grade:latest
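If you would rather launch the container from a Python script than from a pipenv script entry, the same command can be assembled as an argument list. This is a sketch under stated assumptions: `docker_command` is a hypothetical helper, the `my_model` folder name comes from the example above, and host paths are resolved to absolute form because some Docker versions do not accept relative host paths in `-v`.

```python
from pathlib import Path

# Image name copied from the docker run example above.
IMAGE = "ghcr.io/agile-learning-institute/stage0-echo-grade:latest"


def docker_command(project_dir: str) -> list[str]:
    """Build the docker run argument list with absolute bind-mount paths."""
    root = Path(project_dir).resolve()
    cmd = ["docker", "run", "--rm"]
    for name in ("input", "config", "output"):
        # Mount <project>/<name> on the host at /<name> in the container.
        cmd += ["-v", f"{root / name}:/{name}"]
    cmd.append(IMAGE)
    return cmd
```

The resulting list can be passed to `subprocess.run(docker_command("./my_model"), check=True)`. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.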

Contributing

Prerequisites

Ensure the following tools are installed:

Testing

All testing uses config/input/output folders in ./test_data.

Install Dependencies

pipenv install

Run Evaluate Runbook locally.

pipenv run grade

Debug Evaluate Runbook locally

pipenv run debug

Runs locally with the logging level set to DEBUG.

Build the Gary the Grader model

See Gary.modelfile, which builds on llama3.2:latest and sets the temperature to 0.

pipenv run model

Build the Evaluate Runbook container

pipenv run build

Build and run the Evaluate Runbook container

pipenv run container
