Unify the torchao evaluation framework

TorchAO has multiple evaluation scripts that need to be unified under the torchao benchmarking framework:

torchao/_models/llama/eval.py
benchmarks/_models/eval_hf_models.py
.github/scripts/torchao_model_releases