
compute_metrics with causal LM training #26474

@steremma

Description


Feature request

Besides loss, users often need to report additional metrics throughout training to drive decision making and communicate results. For Seq2Seq models this is elegantly done with the compute_metrics argument of the Trainer: generative metrics fit this framework easily by setting predict_with_generate=True. The same is much less straightforward with an underlying causal LM. The only "working" approach I found is this:

```python
def compute_metrics(eval_preds):
    ...
```

But I think this is an erroneous calculation: the logits.argmax(dim=-1) call does not really generate in inference mode; it "cheats" because of teacher forcing, so any metric computed that way is probably inflated. Ideally, the argument passed to compute_metrics would include a predictions attribute that has been properly generated using the trainer's generation config.
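For concreteness, the teacher-forced argmax pattern described above can be sketched as follows. This is a minimal sketch, assuming EvalPrediction-style (logits, labels) numpy arrays, -100 as the usual ignore index for padded label positions, and a hypothetical token-level accuracy metric (the function name and metric key are illustrative, not a transformers API):

```python
import numpy as np

def compute_metrics(eval_preds):
    # logits: (batch, seq_len, vocab_size); labels: (batch, seq_len)
    logits, labels = eval_preds
    # Greedy pick per position -- this is the teacher-forced "cheat":
    # every prediction is conditioned on the gold prefix, not on
    # previously generated tokens.
    preds = np.argmax(logits, axis=-1)
    # In a causal LM, position i predicts token i+1, so align by shifting.
    preds = preds[:, :-1]
    labels = labels[:, 1:]
    # Ignore padded positions (labels set to -100).
    mask = labels != -100
    accuracy = (preds[mask] == labels[mask]).mean()
    return {"token_accuracy": float(accuracy)}
```

Because every prediction here is conditioned on the gold prefix, this measures next-token accuracy under teacher forcing rather than free-running generation quality.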

Motivation

I am always frustrated when I can't observe the learning trajectory of a generative metric (say BLEU or ROUGE) when training a causal LM, even though it is trivial to do with a Seq2Seq model.

Your contribution

If you confirm that this is an issue and important enough to justify a fix, I may be able to make a PR, but I can't promise it.

Metadata


Labels

WIP (work in progress)
