Closed
Description
Bugzilla Link | 36890 |
Resolution | FIXED |
Resolved on | Oct 20, 2018 09:06 |
Version | trunk |
OS | All |
Blocks | #31672 |
CC | @RKSimon |
Fixed by commit(s) | 329415 |
Extended Description
This comment, which appears in all the Intel scheduler models, is incorrect:
// A folded store needs a cycle on port 4 for the store data, but it does not
// need an extra port 2/3 cycle to recompute the address.
def : WriteRes<WriteRMW, [SKLPort4]>;
The load uop and the store address uop are separate micro-ops. The address computation isn't shared between them. You can see this in Agner's tables, where RMW add shows 4 unfused uops: load, add, store address, and store data.
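To make the discrepancy concrete, here is a rough Python sketch of the uop accounting. It is not LLVM code; the names `RMW_ADD_UOPS`, `port_cycles`, and `as_modeled_by_comment` are hypothetical, and the port labels are simplified (both the load and the store address are charged to a single "port 2/3" group, ignoring any dedicated store-address port on newer cores).

```python
from collections import Counter

# Unfused uops of a read-modify-write add, per the breakdown above.
# Each entry is (uop name, simplified port group). The port groups are
# illustrative labels, not the resource names used in the .td files.
RMW_ADD_UOPS = [
    ("load", "port 2/3"),
    ("add", "ALU"),
    ("store address", "port 2/3"),
    ("store data", "port 4"),
]


def port_cycles(uops):
    """Count how many uops are charged to each port group."""
    return Counter(port for _, port in uops)


def as_modeled_by_comment(uops):
    """Drop the store-address uop, which is what the quoted comment assumes."""
    return [uop for uop in uops if uop[0] != "store address"]


actual = port_cycles(RMW_ADD_UOPS)
modeled = port_cycles(as_modeled_by_comment(RMW_ADD_UOPS))

# Port groups where the comment's assumption undercounts the real pressure.
undercounted = {port: actual[port] - modeled[port]
                for port in actual if actual[port] > modeled[port]}
```

Under these assumptions, `undercounted` contains only the "port 2/3" group, short by one cycle, which is the cycle the current `WriteRMW` definition omits.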