[AMDGPU] Baseline gfx1250 speed model. #145217
@llvm/pr-subscribers-backend-amdgpu
Author: Stanislav Mekhanoshin (rampitec)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/145217.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/GCNProcessors.td b/llvm/lib/Target/AMDGPU/GCNProcessors.td
index 0b331bd3f3fb6..b5ffa64c3a4b4 100644
--- a/llvm/lib/Target/AMDGPU/GCNProcessors.td
+++ b/llvm/lib/Target/AMDGPU/GCNProcessors.td
@@ -326,6 +326,6 @@ def : ProcessorModel<"gfx12-generic", GFX12SpeedModel,
FeatureISAVersion12_Generic.Features
>;
-def : ProcessorModel<"gfx1250", GFX12SpeedModel,
+def : ProcessorModel<"gfx1250", GFX1250SpeedModel,
FeatureISAVersion12_50.Features
>;
diff --git a/llvm/lib/Target/AMDGPU/SISchedule.td b/llvm/lib/Target/AMDGPU/SISchedule.td
index 2a374b360b04a..1679cee320067 100644
--- a/llvm/lib/Target/AMDGPU/SISchedule.td
+++ b/llvm/lib/Target/AMDGPU/SISchedule.td
@@ -99,6 +99,7 @@ def SIDPGFX950FullSpeedModel : SISchedMachineModel;
def GFX10SpeedModel : SISchedMachineModel;
def GFX11SpeedModel : SISchedMachineModel;
def GFX12SpeedModel : SISchedMachineModel;
+def GFX1250SpeedModel : SISchedMachineModel;
// XXX: Are the resource counts correct?
def HWBranch : ProcResource<1> {
@@ -455,3 +456,35 @@ def : HWWriteRes<WriteBarrier, [HWBranch], 2000>;
def : InstRW<[WriteCopy], (instrs COPY)>;
} // End SchedModel = GFX12SpeedModel
+
+multiclass GFX125xCommonWriteRes {
+
+def : HWWriteRes<Write32Bit, [HWVALU, HWRC], 5>;
+def : HWWriteRes<WriteFloatCvt, [HWVALU, HWRC], 5>;
+def : HWWriteRes<WriteTrans32, [HWTransVALU, HWRC], 7>;
+def : HWWriteRes<WriteQuarterRate32, [HWVALU, HWRC], 6>;
+def : HWWriteRes<WriteFloatFMA, [HWVALU, HWRC], 5>;
+def : HWWriteRes<WritePseudoScalarTrans, [HWVALU, HWRC], 8>;
+
+def : HWWriteRes<WriteBranch, [HWBranch], 32>;
+def : HWWriteRes<WriteExport, [HWExport, HWRC], 16>;
+def : HWWriteRes<WriteLDS, [HWLGKM, HWRC], 20>;
+def : HWWriteRes<WriteSALU, [HWSALU, HWRC], 2>;
+def : HWWriteRes<WriteSFPU, [HWSALU, HWRC], 4>;
+def : HWWriteRes<WriteSMEM, [HWLGKM, HWRC], 20>;
+def : HWWriteRes<WriteVMEM, [HWVMEM, HWRC], 320>;
+def : HWWriteRes<WriteBarrier, [HWBranch], 2000>;
+
+def : InstRW<[WriteCopy], (instrs COPY)>;
+} // End GFX125xCommonWriteRes
+
+let SchedModel = GFX1250SpeedModel in {
+defm : GFX125xCommonWriteRes;
+
+def : HWWriteRes<Write64Bit, [HWVALU, HWRC], 7>;
+def : HWWriteRes<WriteIntMul, [HWVALU, HWRC], 11>;
+def : HWWriteRes<WriteDouble, [HWVALU, HWRC], 32>;
+def : HWWriteRes<WriteDoubleAdd, [HWVALU, HWRC], 32>;
+def : HWWriteRes<WriteDoubleCvt, [HWVALU, HWRC], 32>;
+def : HWWriteRes<WriteTrans64, [HWVALU, HWTransVALU, HWRC], 38>;
+} // SchedModel = GFX1250SpeedModel
def : HWWriteRes<WriteTrans32, [HWTransVALU, HWRC], 7>;
def : HWWriteRes<WriteQuarterRate32, [HWVALU, HWRC], 6>;
def : HWWriteRes<WriteFloatFMA, [HWVALU, HWRC], 5>;
def : HWWriteRes<WritePseudoScalarTrans, [HWVALU, HWRC], 8>;
Why do WriteTrans32 and WritePseudoScalarTrans use different resources (HWTransVALU vs. HWVALU)? It also seems unintuitive that the pseudo-scalar trans cost (8) is higher than the trans32 cost (7). Is that correct?
According to the spec, it uses the VALU pipeline. I suspect it actually uses both, but that is what the spec says. It is also simply inherited from the gfx12 baseline.
And yes, the higher cost is correct, because the data also has to be moved to the pipeline.
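To make the comparison in this thread concrete, here is a minimal Python sketch that tabulates three entries copied from the GFX125xCommonWriteRes multiclass in the diff and compares the two trans writes. It is only an illustration: `WRITE_RES` and `compare` are hypothetical names that do not exist in LLVM, and the sketch assumes the third `HWWriteRes` argument is the latency in cycles.

```python
# Entries copied from GFX125xCommonWriteRes in the diff:
# write name -> (resources, latency). Names here are illustrative only.
WRITE_RES = {
    "Write32Bit": (("HWVALU", "HWRC"), 5),
    "WriteTrans32": (("HWTransVALU", "HWRC"), 7),
    "WritePseudoScalarTrans": (("HWVALU", "HWRC"), 8),
}


def compare(a, b, table=WRITE_RES):
    """Return (resources only a uses, resources only b uses,
    latency of b minus latency of a)."""
    res_a, lat_a = table[a]
    res_b, lat_b = table[b]
    only_a = sorted(set(res_a) - set(res_b))
    only_b = sorted(set(res_b) - set(res_a))
    return only_a, only_b, lat_b - lat_a


only_trans, only_pseudo, extra = compare("WriteTrans32", "WritePseudoScalarTrans")
print(only_trans, only_pseudo, extra)  # ['HWTransVALU'] ['HWVALU'] 1
```

The one extra cycle is consistent with the explanation that the data also has to be moved to the pipeline. The table only reflects what the model declares, so it says nothing about whether the hardware really occupies both pipelines.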
llvm-mca tests would be good.