
compute_metrics with causal LM training #26474

@steremma

Description


Feature request

Besides loss, users often need to report additional metrics throughout training to drive decision making and communicate results. For Seq2Seq models this is elegantly done with the compute_metrics argument of the Trainer: generative metrics fit this framework easily by setting predict_with_generate=True. The same is much less straightforward with an underlying causal LM. The only "working" approach I found is this:

```python
def compute_metrics(eval_preds):
    ...
```

But I think this is an erroneous calculation: the logits.argmax(dim=-1) call does not really generate in inference mode; it "cheats" because of teacher forcing, so any metric computed that way is probably inflated. Ideally, the argument passed to compute_metrics would include a predictions attribute that has been properly generated using the trainer's generation config.
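For concreteness, the teacher-forced argmax pattern described above can be sketched as follows. This is a minimal sketch, assuming EvalPrediction-style (logits, labels) numpy arrays, -100 as the usual ignore index for padded label positions, and a hypothetical token-level accuracy metric (the function name and metric key are illustrative, not a transformers API):

```python
import numpy as np

def compute_metrics(eval_preds):
    # logits: (batch, seq_len, vocab_size); labels: (batch, seq_len)
    logits, labels = eval_preds
    # Greedy pick per position -- this is the teacher-forced "cheat":
    # every prediction is conditioned on the gold prefix, not on
    # previously generated tokens.
    preds = np.argmax(logits, axis=-1)
    # In a causal LM, position i predicts token i+1, so align by shifting.
    preds = preds[:, :-1]
    labels = labels[:, 1:]
    # Ignore padded positions (labels set to -100).
    mask = labels != -100
    accuracy = (preds[mask] == labels[mask]).mean()
    return {"token_accuracy": float(accuracy)}
```

Because every prediction here is conditioned on the gold prefix, this measures next-token accuracy under teacher forcing rather than free-running generation quality.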

Motivation

I am always frustrated when I can't observe the learning trajectory of a generative metric (say BLEU or ROUGE) when training a causal LM, even though it is trivial to do with a Seq2Seq model.

Your contribution

If you confirm that this is an issue and important enough to justify a fix, I may be able to make a PR, but I can't promise it.

Metadata


Labels

WIP (work in progress)
