bf16 matmul's corresponding `tensor.pack` not properly optimized

Currently, the following two single-layer bf16 MLPs perform worse than GC v1:

dtype | batch size | hidden list | GC v1 | 8c55a0544 (remove brgemm read lock) | GC v1 / 8c55a0544
-- | -- | -- | -- | -- | --
bf16 | 128 | 1024x1024 | 0.0286 | 0.0828 | 34.52%
bf16 | 128 | 1024x512 | 0.0204 | 0.0670 | 30.45%


We performed a detailed breakdown as follows:

128x1024x1024 | GC v1 | 8c55a0544
-- | -- | --
matmul only | 0.01766 | 0.01989
tiled pack (or reorder) | 0.02634 | 0.04632
total | 0.04418 | 0.077969

and 

128x1024x512 | GC v1 | 8c55a0544
-- | -- | --
matmul only | 0.01587 | 0.01591
tiled pack (or reorder) | 0.01278 | 0.0398
total | 0.02881 | 0.06917

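To make the gap per stage explicit, here is a small sketch that computes the slowdown of `8c55a0544` relative to GC v1 from the two breakdown tables. The numbers are copied from those tables as reported; the dictionary layout and the `slowdowns` helper are only illustrative.

```python
# Timings copied from the two breakdown tables (units as reported there).
# Each entry is (GC v1, 8c55a0544).
breakdown = {
    "128x1024x1024": {
        "matmul only": (0.01766, 0.01989),
        "tiled pack (or reorder)": (0.02634, 0.04632),
    },
    "128x1024x512": {
        "matmul only": (0.01587, 0.01591),
        "tiled pack (or reorder)": (0.01278, 0.0398),
    },
}


def slowdowns(table):
    """Return {shape: {stage: current / GC v1}} for every stage."""
    return {
        shape: {stage: cur / v1 for stage, (v1, cur) in stages.items()}
        for shape, stages in table.items()
    }


if __name__ == "__main__":
    for shape, stages in slowdowns(breakdown).items():
        for stage, ratio in stages.items():
            print(f"{shape:>14}  {stage:<24} {ratio:.2f}x")
```

With these numbers, the matmul itself is about 1.13x and 1.00x of GC v1, while the tiled pack is about 1.76x and 3.11x, so the pack accounts for most of the regression.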
Is there any further optimization opportunity for the VNNI pack?
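For reference, a minimal sketch of the data movement a bf16 VNNI pack performs, written as a plain loop nest. It assumes the commonly used blocked layout `[N/nb, K/kb, kb/2, nb, 2]` for a row-major `[K, N]` weight; the tile sizes `kb` and `nb`, the function name, and the flat-list representation are hypothetical and not taken from the compiler's actual lowering.

```python
def vnni_pack(weight, k, n, kb=32, nb=32, vnni=2):
    """Pack a row-major [k, n] flat list into [n/nb, k/kb, kb/vnni, nb, vnni] order.

    Assumed layout only: `vnni` consecutive K elements of one column end up
    adjacent in memory, which is what bf16 dot-product instructions consume.
    """
    assert len(weight) == k * n
    assert k % kb == 0 and n % nb == 0 and kb % vnni == 0
    packed = []
    for nt in range(n // nb):              # tile index along N
        for kt in range(k // kb):          # tile index along K
            for ki in range(kb // vnni):   # pair index inside the K tile
                for ni in range(nb):       # column inside the N tile
                    for v in range(vnni):  # element inside the VNNI pair
                        row = kt * kb + ki * vnni + v
                        col = nt * nb + ni
                        packed.append(weight[row * n + col])
    return packed


if __name__ == "__main__":
    # 4x2 matrix with rows [0, 1], [2, 3], [4, 5], [6, 7].
    print(vnni_pack(list(range(8)), k=4, n=2, kb=4, nb=2))
    # prints [0, 2, 1, 3, 4, 6, 5, 7]
```

The innermost two loops read with a stride of `n` elements and write contiguously, so how this nest is tiled, vectorized, and parallelized is a likely place to look when the pack is slower than expected.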
