System Info
Hi Team,
First of all, huge thanks for all the great work you are doing.
Recently, I was benchmarking inference for the T5 model on AWS EC2 (a G6E machine with an L40 GPU) for batch sizes of 1, 2, and 4.
I have heard a lot about torch.compile and wanted to try it out to see whether it reduces inference time. Surprisingly, it had the opposite effect: on average, I saw an increase of ~1 second in inference time over a sample of 50 inputs, each between 2200 and 3000 characters long (around 2550 characters on average).
I discussed this with a friend, who told me that T5 is not yet a very suitable architecture for compilation and that it hits a lot of graph breaks. On his advice, I decided to open an issue here.
In my experience, T5 is still a very good model, and I would like to see it work seamlessly with torch.compile. If given the chance, I am happy to put in my own time and contribute to the cause. Let me know what you think.
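For reference, a rough sketch of how the graph breaks could be inspected with `torch._dynamo.explain` (this is not the exact code I ran; the model name `t5-base` and the input text are placeholders):

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval().cuda()

inputs = tokenizer(
    "translate English to German: Hello world", return_tensors="pt"
).to("cuda")
decoder_input_ids = torch.tensor(
    [[model.config.decoder_start_token_id]], device="cuda"
)

# torch._dynamo.explain traces the forward pass and reports where Dynamo
# falls back to eager execution (i.e. where the graph breaks happen).
explanation = torch._dynamo.explain(model)(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    decoder_input_ids=decoder_input_ids,
)
print(f"{explanation.graph_break_count} graph breaks")
for reason in explanation.break_reasons:
    print(reason)
```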
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Run T5 inference on AWS EC2 (G6E machine with an L40 GPU) with batch sizes of 1, 2, and 4, with and without torch.compile, and compare latencies.
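A minimal benchmark sketch along these lines (not the exact script I used; the model name, placeholder text, and generation settings are assumptions):

```python
import time
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = "cuda"
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval().to(device)

# Batch of 4 inputs, roughly in the 2200-3000 character range (placeholder text)
texts = ["summarize: " + "some long input text " * 130] * 4
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True).to(device)

@torch.no_grad()
def bench(m, n_iters=10):
    # Warmup runs (these also trigger compilation for the compiled model)
    for _ in range(3):
        m.generate(**inputs, max_new_tokens=64)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_iters):
        m.generate(**inputs, max_new_tokens=64)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters

eager_time = bench(model)

# One common way to apply torch.compile to generation: compile only the
# forward pass, while generate() itself stays in Python
model.forward = torch.compile(model.forward)
compiled_time = bench(model)

print(f"eager: {eager_time:.3f}s  compiled: {compiled_time:.3f}s per batch")
```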
Expected behavior
Inference time should decrease after compilation, not increase.