Closed
Description
This can be implemented using xegpu.load_nd %x {transpose = {1, 0}} for B tiles.
We should support both patterns:
- Explicit transpose in the op: linalg.matmul_transpose_b.
- Transpose before matmul (this is what the MLIR from OV will look like):
  %b_tr = linalg.transpose %b ...
  %res = linalg.matmul %a, %b_tr, ...
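For reference, the two patterns above could look roughly like this at the linalg-on-tensors level. This is only a sketch: the function names, the 16x32 / 64x32 shapes, and the f32 element type are made up for illustration, and it assumes an MLIR version in which the named op linalg.matmul_transpose_b is still available. The actual IR coming from OV may differ.

```mlir
// Pattern 1: the transpose is part of the named op.
// %b is stored as N x K (64x32) and is consumed directly.
func.func @matmul_transpose_b(%a: tensor<16x32xf32>, %b: tensor<64x32xf32>,
                              %c: tensor<16x64xf32>) -> tensor<16x64xf32> {
  %res = linalg.matmul_transpose_b
           ins(%a, %b : tensor<16x32xf32>, tensor<64x32xf32>)
           outs(%c : tensor<16x64xf32>) -> tensor<16x64xf32>
  return %res : tensor<16x64xf32>
}

// Pattern 2: an explicit linalg.transpose produces a K x N (32x64) value
// that feeds a plain linalg.matmul.
func.func @transpose_then_matmul(%a: tensor<16x32xf32>, %b: tensor<64x32xf32>,
                                 %c: tensor<16x64xf32>) -> tensor<16x64xf32> {
  // Destination tensor for the transposed B (hypothetical name).
  %b_tr_init = tensor.empty() : tensor<32x64xf32>
  %b_tr = linalg.transpose
            ins(%b : tensor<64x32xf32>)
            outs(%b_tr_init : tensor<32x64xf32>)
            permutation = [1, 0]
  %res = linalg.matmul
           ins(%a, %b_tr : tensor<16x32xf32>, tensor<32x64xf32>)
           outs(%c : tensor<16x64xf32>) -> tensor<16x64xf32>
  return %res : tensor<16x64xf32>
}
```

Both functions compute the same result, so in either case the lowering could presumably load the B tile with a transposing xegpu.load_nd instead of materializing %b_tr.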
This functionality is required for OV integration (#207).