Description
We've discussed adding lm-eval as a native pathway in GuideLLM so that developers can run both performance benchmarks and accuracy evaluations in a single tool. This ticket outlines the work needed to integrate lm-eval into GuideLLM.
Acceptance Criteria
- Package up the lm-eval-harness library into GuideLLM
- Create a new command, `guidellm-eval`, that accepts the same input parameters as lm-eval (a minimal wiring sketch follows this list), for example:

      guidellm-eval --model hf \
          --model_args pretrained=EleutherAI/gpt-j-6B \
          --tasks hellaswag \
          --device cuda:0 \
          --batch_size 8
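
A minimal sketch of how `guidellm-eval` could forward these parameters to the lm-eval-harness Python API (`lm_eval.simple_evaluate`). The entry-point name, argument handling, and output formatting below are assumptions for illustration, not the final design:

```python
# Hypothetical `guidellm-eval` entry point: the command name, argument names,
# and output handling are assumptions for illustration. It forwards the same
# parameters lm-eval accepts to lm_eval.simple_evaluate().
import argparse
import json

import lm_eval  # lm-eval-harness, packaged as a GuideLLM dependency


def main() -> None:
    parser = argparse.ArgumentParser(prog="guidellm-eval")
    parser.add_argument("--model", default="hf")
    parser.add_argument("--model_args", default="")
    parser.add_argument("--tasks", required=True, help="Comma-separated task names")
    parser.add_argument("--device", default=None)
    parser.add_argument("--batch_size", type=int, default=None)
    args = parser.parse_args()

    # Delegate directly to lm-eval so task definitions and metrics stay
    # identical to upstream lm-eval-harness.
    results = lm_eval.simple_evaluate(
        model=args.model,
        model_args=args.model_args,
        tasks=args.tasks.split(","),
        device=args.device,
        batch_size=args.batch_size,
    )
    print(json.dumps(results["results"], indent=2, default=str))


if __name__ == "__main__":
    main()
```

Delegating to `simple_evaluate` (rather than re-implementing evaluation logic) would keep guidellm-eval's behavior in lockstep with upstream lm-eval while GuideLLM only owns the CLI surface.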
Metadata
Status
Backlog