🐛 Describe the bug
FusionExecutor caches output sizes: for the same input tensor sizes, we reuse the cached output sizes instead of evaluating the output extents again.
This no longer works with our recent codegen expansion (TensorFactory methods & RNG ops), whose output sizes can depend on a runtime scalar input rather than on the input tensor sizes. Unfortunately the issue wasn't caught earlier, since cache re-use was accidentally disabled in my refactor code: https://github.com/csarofeen/pytorch/pull/1914/files#diff-3e62c8296c8362cd8c14a3d3300e5b2758d09b163ade856eefbf7361d75d7acaR373
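To illustrate the failure mode, here is a minimal, self-contained C++ model (not actual nvFuser code; `Executor`, `allocateOutputs`, and the cache layout are hypothetical simplifications): when the cache key is built only from input tensor sizes, a stale output size is returned once an output's extent comes from a runtime scalar.

```cpp
#include <cstdio>
#include <map>
#include <vector>

// Cache key built from input tensor sizes only -- the scalar input that
// determines the factory op's output size is not part of the key.
using SizeKey = std::vector<long>;

struct Executor {
  std::map<SizeKey, std::vector<long>> output_size_cache;

  // Models cached output allocation for something like full({n}, 1.0):
  // the output size is {n}, where n is a runtime scalar input.
  std::vector<long> allocateOutputs(const SizeKey& input_sizes, long n) {
    auto it = output_size_cache.find(input_sizes);
    if (it != output_size_cache.end()) {
      return it->second;  // BUG: stale when n changes between runs
    }
    std::vector<long> out_sizes = {n};
    output_size_cache[input_sizes] = out_sizes;
    return out_sizes;
  }
};

int main() {
  Executor ex;
  auto a = ex.allocateOutputs({}, 8);    // evaluates and caches {8}
  auto b = ex.allocateOutputs({}, 16);   // same key -> stale {8} returned
  std::printf("%ld %ld\n", a[0], b[0]);  // prints "8 8", not "8 16"
}
```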
So I'm going to add a check in the executor that walks through the fusion ops and disables cache re-use whenever it sees those ops. I think there should be a more robust way to handle this than that aggressive blanket disable: we only need to disable the cached output allocation when a factory method depends on a runtime scalar input. A sketch of the check follows below.
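Below is a hedged sketch of what that check could look like, again against a simplified model rather than real nvFuser types (`ExprType`, `FullOp`, `RandOp`, and `disableOutputSizeCache` are illustrative names, not actual identifiers):

```cpp
#include <cstdio>
#include <vector>

// Illustrative op kinds; the real IR has a richer Expr hierarchy.
enum class ExprType { Binary, Unary, FullOp, RandOp };

struct Expr { ExprType type; };
struct Fusion { std::vector<Expr> exprs; };

// Conservative version of the proposed check: any factory/RNG op in the
// fusion disables output-size cache re-use. A tighter version would only
// trip when the op's extents actually depend on a runtime scalar input.
bool disableOutputSizeCache(const Fusion& fusion) {
  for (const Expr& e : fusion.exprs) {
    if (e.type == ExprType::FullOp || e.type == ExprType::RandOp) {
      return true;
    }
  }
  return false;
}

int main() {
  Fusion f{{{ExprType::Binary}, {ExprType::FullOp}}};
  std::printf("disable cache: %d\n", disableOutputSizeCache(f));  // 1
}
```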
Versions
ToT devel