@@ -1107,51 +1107,51 @@ is shown in a single column in the table below.
11071107==== Intel XMX Supported Combinations
11081108This is currently available in devices with the architecture
11091109`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1110- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_dg2_g10`,
1111- `architecture::intel_gpu_dg2_g11`, `architecture::intel_gpu_dg2_g12`,
1112- `architecture::intel_gpu_arl_h`, `architecture::intel_gpu_ptl_h`, and
1113- `architecture::intel_gpu_ptl_u`.
1110+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1111+ `architecture::intel_gpu_dg2_g10`, `architecture::intel_gpu_dg2_g11`,
1112+ `architecture::intel_gpu_dg2_g12`, `architecture::intel_gpu_arl_h`,
1113+ `architecture::intel_gpu_ptl_h`, and `architecture::intel_gpu_ptl_u`.
11141114
11151115[frame="none",options="header"]
11161116|======================
11171117| A type | B type | C type | D type | M | N | K | device
11181118.2+| `matrix_type::uint8` .2+| `matrix_type::uint8` .2+|
11191119`matrix_type::sint32` .2+| `matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32
11201120|`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1121- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1122- `architecture::intel_gpu_ptl_u`
1121+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1122+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11231123|8|`architecture::intel_gpu_dg2_g10,
11241124architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`,
11251125`architecture::intel_gpu_arl_h`
11261126.2+| `matrix_type::uint8` .2+| `matrix_type::sint8` .2+|
11271127`matrix_type::sint32` .2+|`matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32 |
11281128`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1129- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1130- `architecture::intel_gpu_ptl_u`
1129+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1130+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11311131|8|`architecture::intel_gpu_dg2_g10,
11321132architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`,
11331133`architecture::intel_gpu_arl_h`
11341134.2+| `matrix_type::sint8` .2+| `matrix_type::uint8` .2+|
11351135`matrix_type::sint32` .2+|`matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32 |
11361136`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1137- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1138- `architecture::intel_gpu_ptl_u`
1137+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1138+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11391139|8|`architecture::intel_gpu_dg2_g10,
11401140architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`,
11411141`architecture::intel_gpu_arl_h`
11421142.2+| `matrix_type::sint8` .2+| `matrix_type::sint8` .2+|
11431143`matrix_type::sint32` .2+| `matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32 |
11441144`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1145- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1146- `architecture::intel_gpu_ptl_u`
1145+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1146+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11471147|8|`architecture::intel_gpu_dg2_g10,
11481148architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`,
11491149`architecture::intel_gpu_arl_h`
11501150.8+|`matrix_type::fp16` .8+| `matrix_type::fp16` .8+|
11511151`matrix_type::fp32` .8+|`matrix_type::fp32` .1+| 16 .1+| 16 | 16
11521152.6+|`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1153- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1154- `architecture::intel_gpu_ptl_u`
1153+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1154+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11551155.2+| 1 .2+| 64 | 16 |32
11561156.2+| 32 .2+| 64 | 16 |32
11571157.2+| +<=+ 8 | 16 .2+| 16
@@ -1162,28 +1162,28 @@ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`,
11621162.6+|`matrix_type::fp16` .6+| `matrix_type::fp16` .6+|
11631163`matrix_type::fp16` .6+|`matrix_type::fp32` .1+| +<=+ 8 | 16 .1+| 16
11641164.6+| `architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1165- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1166- `architecture::intel_gpu_ptl_u`
1165+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1166+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11671167| 16 | 16 | 16 .2+| 1 .2+| 64 | 16 | 32
11681168.2+| 32 .2+| 64 | 16 | 32
11691169.6+|`matrix_type::fp16` .6+| `matrix_type::fp16` .6+|
11701170`matrix_type::fp32` .6+|`matrix_type::fp16` .1+| +<=+ 8 | 16 .1+| 16
11711171.6+|`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1172- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1173- `architecture::intel_gpu_ptl_u`
1172+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1173+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11741174| 16 | 16 | 16 .2+| 1 .2+| 64 | 16 | 32
11751175.2+| 32 .2+| 64 |16 | 32
11761176.6+|`matrix_type::fp16` .6+| `matrix_type::fp16` .6+|
11771177`matrix_type::fp16` .6+|`matrix_type::fp16` .1+| +<=+ 8 | 16 .1+| 16
11781178.6+|`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1179- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1180- `architecture::intel_gpu_ptl_u`
1179+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1180+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11811181| 16 | 16 | 16 .2+| 1 .2+| 64 | 16 |32 .2+| 32 .2+| 64 | 16 | 32
11821182.8+| `matrix_type::bf16` .8+| `matrix_type::bf16` .8+|
11831183`matrix_type::fp32` .8+| `matrix_type::fp32` | 16 | 16 | 16
11841184.6+|`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1185- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1186- `architecture::intel_gpu_ptl_u`
1185+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1186+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11871187.2+| 1 .2+| 64 | 16 | 32
11881188.2+| 32 .2+| 64 | 16 |32
11891189.2+| +<=+ 8 | 16 .2+| 16
@@ -1194,34 +1194,35 @@ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`,
11941194.6+|`matrix_type::bf16` .6+| `matrix_type::bf16` .6+|
11951195`matrix_type::bf16` .6+|`matrix_type::fp32` .1+| +<=+ 8 | 16 .1+| 16 .6+|
11961196`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1197- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1198- `architecture::intel_gpu_ptl_u`
1197+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1198+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
11991199| 16 | 16 | 16 .2+| 1 .2+| 64 | 16 | 32
12001200.2+| 32 .2+| 64 |16 | 32
12011201.6+|`matrix_type::bf16` .6+| `matrix_type::bf16` .6+|
12021202`matrix_type::fp32` .6+|`matrix_type::bf16` .1+| +<=+ 8 | 16 .1+| 16 .6+|
12031203`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1204- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1205- `architecture::intel_gpu_ptl_u`
1204+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1205+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
12061206| 16 | 16 | 16 .2+| 1 .2+| 64 | 16 | 32
12071207.2+| 32 .2+| 64 |16 | 32
12081208.6+|`matrix_type::bf16` .6+| `matrix_type::bf16` .6+|
12091209`matrix_type::bf16` .6+|`matrix_type::bf16` .1+| +<=+ 8 | 16 .1+| 16 .6+|
12101210`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1211- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1212- `architecture::intel_gpu_ptl_u`
1211+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1212+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
12131213| 16 | 16 | 16 .2+| 1 .2+| 64 | 16 | 32
12141214.2+| 32 .2+| 64 |16 | 32
12151215| `matrix_type::tf32` | `matrix_type::tf32` |
12161216`matrix_type::fp32` .2+| `matrix_type::fp32` | +<=+ 8 | 16 | 8 |
12171217`architecture::intel_gpu_pvc`, `architecture::intel_gpu_bmg_g21`,
1218- `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1219- `architecture::intel_gpu_ptl_u`
1218+ `architecture::intel_gpu_bmg_g31`, `architecture::intel_gpu_lnl_m`,
1219+ `architecture::intel_gpu_ptl_h`, `architecture::intel_gpu_ptl_u`
12201220|======================
12211221
12221222===== Restrictions on `architecture::intel_gpu_pvc`,
1223- `architecture::intel_gpu_bmg_g21`, `architecture::intel_gpu_lnl_m`,
1224- `architecture::intel_gpu_ptl_h`, and `architecture::intel_gpu_ptl_u`
1223+ `architecture::intel_gpu_bmg_g21`, `architecture::intel_gpu_bmg_g31`,
1224+ `architecture::intel_gpu_lnl_m`, `architecture::intel_gpu_ptl_h`,
1225+ and `architecture::intel_gpu_ptl_u`
12251226
12261227- The `stride` parameter to `joint_matrix_load` and
12271228 `joint_matrix_store` has the following restrictions:
@@ -1363,4 +1364,4 @@ load/store overloads
13631364|11 |2024-04-29 |Yury Plyakhin | Add 1x64x16 supported combination for
13641365Intel XMX (intel_gpu_pvc)
13651366|12 |2024-06-14 |Jack Kirk | Add note on sm version device matching issue.
1366- |======================
1367+ |======================
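
Review note: the 8-bit integer rows of the Intel XMX table touched by this diff can be mirrored in a small host-side check, which may help when reviewing that `architecture::intel_gpu_bmg_g31` lands in the N = 16 group. The sketch below is illustrative only. The `arch` enum, `is_dg2_or_arl_h`, `int8_tile_n`, and `int8_shape_supported` are hypothetical names introduced here, not part of the extension; no SYCL header is used, and the lower bound M >= 1 is an assumption (the table states only M <= 8).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the architecture enumerators named in the table.
// This is NOT the SYCL enum; it only mirrors the names for illustration.
enum class arch {
  intel_gpu_pvc, intel_gpu_bmg_g21, intel_gpu_bmg_g31, intel_gpu_lnl_m,
  intel_gpu_ptl_h, intel_gpu_ptl_u,
  intel_gpu_dg2_g10, intel_gpu_dg2_g11, intel_gpu_dg2_g12, intel_gpu_arl_h
};

// True for the device group that the table lists with N = 8 on the
// uint8/sint8 rows (DG2 variants and ARL-H).
constexpr bool is_dg2_or_arl_h(arch a) {
  return a == arch::intel_gpu_dg2_g10 || a == arch::intel_gpu_dg2_g11 ||
         a == arch::intel_gpu_dg2_g12 || a == arch::intel_gpu_arl_h;
}

// N for the 8-bit integer rows: 8 on DG2/ARL-H, 16 on every other
// architecture listed above (including intel_gpu_bmg_g31 after this change).
constexpr std::size_t int8_tile_n(arch a) {
  return is_dg2_or_arl_h(a) ? 8 : 16;
}

// Checks an (M, N, K) shape against the 8-bit integer rows:
// M <= 8 (assumed M >= 1), N as above, K = 32.
constexpr bool int8_shape_supported(arch a, std::size_t m, std::size_t n,
                                    std::size_t k) {
  return m >= 1 && m <= 8 && n == int8_tile_n(a) && k == 32;
}

// Compile-time spot checks taken directly from the table rows.
static_assert(int8_shape_supported(arch::intel_gpu_bmg_g31, 8, 16, 32),
              "bmg_g31 is listed with N = 16, K = 32");
static_assert(!int8_shape_supported(arch::intel_gpu_dg2_g10, 8, 16, 32),
              "dg2_g10 is listed with N = 8, not 16");
```

In real code the supported shapes would normally be obtained from the device itself rather than hard-coded; the sketch is only a quick cross-check of the table contents.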