hanno-becker
Contributor

Fixes #1144

@hanno-becker added the "benchmark" label (this PR should be benchmarked in CI) on Aug 4, 2025

@oqs-bot left a comment

Mac Mini (M1, 2020) benchmarks

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 12306 cycles 12305 cycles 1.00
ML-KEM-512 encaps 14703 cycles 14702 cycles 1.00
ML-KEM-512 decaps 19246 cycles 19247 cycles 1.00
ML-KEM-768 keypair 21358 cycles 21360 cycles 1.00
ML-KEM-768 encaps 23564 cycles 23565 cycles 1.00
ML-KEM-768 decaps 30140 cycles 30139 cycles 1.00
ML-KEM-1024 keypair 30353 cycles 30354 cycles 1.00
ML-KEM-1024 encaps 34539 cycles 34539 cycles 1.00
ML-KEM-1024 decaps 44340 cycles 44338 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 9679 cycles 9706 cycles 1.00
ML-KEM-512 encaps 10962 cycles 10981 cycles 1.00
ML-KEM-512 decaps 15179 cycles 15218 cycles 1.00
ML-KEM-768 keypair 16448 cycles 16394 cycles 1.00
ML-KEM-768 encaps 17779 cycles 17696 cycles 1.00
ML-KEM-768 decaps 23325 cycles 23236 cycles 1.00
ML-KEM-1024 keypair 22177 cycles 22183 cycles 1.00
ML-KEM-1024 encaps 24391 cycles 24386 cycles 1.00
ML-KEM-1024 decaps 32030 cycles 32026 cycles 1.00

@oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 29016 cycles 28987 cycles 1.00
ML-KEM-512 encaps 34449 cycles 34407 cycles 1.00
ML-KEM-512 decaps 44382 cycles 44361 cycles 1.00
ML-KEM-768 keypair 48142 cycles 48040 cycles 1.00
ML-KEM-768 encaps 56117 cycles 56196 cycles 1.00
ML-KEM-768 decaps 68166 cycles 68077 cycles 1.00
ML-KEM-1024 keypair 72270 cycles 72392 cycles 1.00
ML-KEM-1024 encaps 84492 cycles 84387 cycles 1.00
ML-KEM-1024 decaps 100217 cycles 100306 cycles 1.00

@oqs-bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 52350 cycles 51706 cycles 1.01
ML-KEM-512 encaps 60283 cycles 59476 cycles 1.01
ML-KEM-512 decaps 76917 cycles 75991 cycles 1.01
ML-KEM-768 keypair 88355 cycles 89126 cycles 0.99
ML-KEM-768 encaps 97242 cycles 96280 cycles 1.01
ML-KEM-768 decaps 120620 cycles 120042 cycles 1.00
ML-KEM-1024 keypair 132536 cycles 132417 cycles 1.00
ML-KEM-1024 encaps 144922 cycles 145358 cycles 1.00
ML-KEM-1024 decaps 178350 cycles 177828 cycles 1.00
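
The Ratio column can be reproduced from the two cycle counts. The sketch below assumes the ratio is Current divided by Previous, formatted to two decimals; that matches the rows above, but the tool's exact rounding rule is not stated in this thread. The function name `ratio` is illustrative only.

```python
def ratio(current_cycles: int, previous_cycles: int) -> str:
    """Current / Previous, formatted to two decimals (assumed convention)."""
    return f"{current_cycles / previous_cycles:.2f}"


# Rows copied from the Arm Cortex-A72 (Raspberry Pi 4) table above:
# (benchmark, current cycles at 7f40838, previous cycles at 0632f8e)
rows = [
    ("ML-KEM-512 keypair", 52350, 51706),
    ("ML-KEM-768 keypair", 88355, 89126),
    ("ML-KEM-768 decaps", 120620, 120042),
]

for name, current, previous in rows:
    print(f"{name}: {ratio(current, previous)}")
```

Under this convention a value above 1.00 means the current commit needs more cycles than the previous one, and a value below 1.00 means it needs fewer.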

@oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 11926 cycles 11916 cycles 1.00
ML-KEM-512 encaps 13477 cycles 13482 cycles 1.00
ML-KEM-512 decaps 18358 cycles 18356 cycles 1.00
ML-KEM-768 keypair 20744 cycles 20766 cycles 1.00
ML-KEM-768 encaps 21705 cycles 21727 cycles 1.00
ML-KEM-768 decaps 28721 cycles 28758 cycles 1.00
ML-KEM-1024 keypair 27468 cycles 27493 cycles 1.00
ML-KEM-1024 encaps 29620 cycles 29637 cycles 1.00
ML-KEM-1024 decaps 39065 cycles 39024 cycles 1.00

@oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 16829 cycles 16830 cycles 1.00
ML-KEM-512 encaps 18589 cycles 18606 cycles 1.00
ML-KEM-512 decaps 24004 cycles 23986 cycles 1.00
ML-KEM-768 keypair 28563 cycles 28624 cycles 1.00
ML-KEM-768 encaps 29685 cycles 29688 cycles 1.00
ML-KEM-768 decaps 37819 cycles 37549 cycles 1.01
ML-KEM-1024 keypair 41637 cycles 41648 cycles 1.00
ML-KEM-1024 encaps 43986 cycles 44025 cycles 1.00
ML-KEM-1024 decaps 54527 cycles 54485 cycles 1.00

@oqs-bot left a comment

Graviton4

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 17852 cycles 17941 cycles 1.00
ML-KEM-512 encaps 20789 cycles 21034 cycles 0.99
ML-KEM-512 decaps 27381 cycles 27651 cycles 0.99
ML-KEM-768 keypair 30569 cycles 30895 cycles 0.99
ML-KEM-768 encaps 33272 cycles 33562 cycles 0.99
ML-KEM-768 decaps 42701 cycles 43137 cycles 0.99
ML-KEM-1024 keypair 44337 cycles 44601 cycles 0.99
ML-KEM-1024 encaps 49308 cycles 49619 cycles 0.99
ML-KEM-1024 decaps 62179 cycles 62596 cycles 0.99

@oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 36397 cycles 36397 cycles 1.00
ML-KEM-512 encaps 42991 cycles 42981 cycles 1.00
ML-KEM-512 decaps 56039 cycles 56056 cycles 1.00
ML-KEM-768 keypair 59961 cycles 59868 cycles 1.00
ML-KEM-768 encaps 68225 cycles 68221 cycles 1.00
ML-KEM-768 decaps 85775 cycles 85753 cycles 1.00
ML-KEM-1024 keypair 87716 cycles 87481 cycles 1.00
ML-KEM-1024 encaps 99878 cycles 99816 cycles 1.00
ML-KEM-1024 decaps 121706 cycles 121276 cycles 1.00

@oqs-bot left a comment

Graviton2

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 28855 cycles 28890 cycles 1.00
ML-KEM-512 encaps 33996 cycles 34154 cycles 1.00
ML-KEM-512 decaps 44572 cycles 44603 cycles 1.00
ML-KEM-768 keypair 49309 cycles 49218 cycles 1.00
ML-KEM-768 encaps 54355 cycles 54281 cycles 1.00
ML-KEM-768 decaps 69341 cycles 69133 cycles 1.00
ML-KEM-1024 keypair 71548 cycles 71596 cycles 1.00
ML-KEM-1024 encaps 79841 cycles 79911 cycles 1.00
ML-KEM-1024 decaps 100054 cycles 100057 cycles 1.00

@oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 38367 cycles 38345 cycles 1.00
ML-KEM-512 encaps 46667 cycles 46649 cycles 1.00
ML-KEM-512 decaps 60033 cycles 59999 cycles 1.00
ML-KEM-768 keypair 63816 cycles 63831 cycles 1.00
ML-KEM-768 encaps 74735 cycles 74334 cycles 1.01
ML-KEM-768 decaps 92283 cycles 92617 cycles 1.00
ML-KEM-1024 keypair 94253 cycles 94240 cycles 1.00
ML-KEM-1024 encaps 107970 cycles 108069 cycles 1.00
ML-KEM-1024 decaps 130548 cycles 130476 cycles 1.00

@oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 16289 cycles 16255 cycles 1.00
ML-KEM-512 encaps 18375 cycles 18404 cycles 1.00
ML-KEM-512 decaps 24944 cycles 24884 cycles 1.00
ML-KEM-768 keypair 29593 cycles 28961 cycles 1.02
ML-KEM-768 encaps 29957 cycles 29855 cycles 1.00
ML-KEM-768 decaps 39287 cycles 39351 cycles 1.00
ML-KEM-1024 keypair 37355 cycles 37374 cycles 1.00
ML-KEM-1024 encaps 40469 cycles 40479 cycles 1.00
ML-KEM-1024 decaps 56130 cycles 55959 cycles 1.00

@oqs-bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 28871 cycles 28836 cycles 1.00
ML-KEM-512 encaps 34104 cycles 34036 cycles 1.00
ML-KEM-512 decaps 44547 cycles 44605 cycles 1.00
ML-KEM-768 keypair 49215 cycles 49308 cycles 1.00
ML-KEM-768 encaps 54273 cycles 54331 cycles 1.00
ML-KEM-768 decaps 69155 cycles 69163 cycles 1.00
ML-KEM-1024 keypair 71594 cycles 71594 cycles 1.00
ML-KEM-1024 encaps 79888 cycles 79884 cycles 1.00
ML-KEM-1024 decaps 100043 cycles 100099 cycles 1.00

@oqs-bot left a comment

Graviton4 (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 35320 cycles 35795 cycles 0.99
ML-KEM-512 encaps 40189 cycles 40699 cycles 0.99
ML-KEM-512 decaps 50819 cycles 52103 cycles 0.98
ML-KEM-768 keypair 59592 cycles 59563 cycles 1.00
ML-KEM-768 encaps 64726 cycles 66487 cycles 0.97
ML-KEM-768 decaps 79393 cycles 81100 cycles 0.98
ML-KEM-1024 keypair 87765 cycles 88530 cycles 0.99
ML-KEM-1024 encaps 97011 cycles 98622 cycles 0.98
ML-KEM-1024 decaps 116309 cycles 117280 cycles 0.99

@oqs-bot left a comment

Graviton3

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 19064 cycles 19076 cycles 1.00
ML-KEM-512 encaps 22310 cycles 22310 cycles 1.00
ML-KEM-512 decaps 29553 cycles 29546 cycles 1.00
ML-KEM-768 keypair 32617 cycles 32607 cycles 1.00
ML-KEM-768 encaps 35702 cycles 35640 cycles 1.00
ML-KEM-768 decaps 46030 cycles 45983 cycles 1.00
ML-KEM-1024 keypair 46889 cycles 46891 cycles 1.00
ML-KEM-1024 encaps 52091 cycles 52142 cycles 1.00
ML-KEM-1024 decaps 65966 cycles 66085 cycles 1.00

@oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 46402 cycles 46376 cycles 1.00
ML-KEM-512 encaps 54855 cycles 54856 cycles 1.00
ML-KEM-512 decaps 70269 cycles 70270 cycles 1.00
ML-KEM-768 keypair 75888 cycles 75898 cycles 1.00
ML-KEM-768 encaps 86880 cycles 86802 cycles 1.00
ML-KEM-768 decaps 106784 cycles 106765 cycles 1.00
ML-KEM-1024 keypair 111281 cycles 111282 cycles 1.00
ML-KEM-1024 encaps 125323 cycles 124979 cycles 1.00
ML-KEM-1024 decaps 150395 cycles 150669 cycles 1.00

@oqs-bot left a comment

Graviton2 (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 59311 cycles 59292 cycles 1.00
ML-KEM-512 encaps 67871 cycles 67957 cycles 1.00
ML-KEM-512 decaps 86566 cycles 86637 cycles 1.00
ML-KEM-768 keypair 99186 cycles 99025 cycles 1.00
ML-KEM-768 encaps 109816 cycles 110173 cycles 1.00
ML-KEM-768 decaps 134786 cycles 135290 cycles 1.00
ML-KEM-1024 keypair 149346 cycles 149426 cycles 1.00
ML-KEM-1024 encaps 164716 cycles 164748 cycles 1.00
ML-KEM-1024 decaps 195994 cycles 196045 cycles 1.00

@oqs-bot left a comment

Graviton3 (no-opt)

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 38865 cycles 38851 cycles 1.00
ML-KEM-512 encaps 44664 cycles 44621 cycles 1.00
ML-KEM-512 decaps 56482 cycles 56506 cycles 1.00
ML-KEM-768 keypair 64396 cycles 64336 cycles 1.00
ML-KEM-768 encaps 71586 cycles 71543 cycles 1.00
ML-KEM-768 decaps 87529 cycles 87747 cycles 1.00
ML-KEM-1024 keypair 95914 cycles 96029 cycles 1.00
ML-KEM-1024 encaps 106339 cycles 106281 cycles 1.00
ML-KEM-1024 decaps 126846 cycles 126794 cycles 1.00

@oqs-bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 59451 cycles 59447 cycles 1.00
ML-KEM-512 encaps 66751 cycles 66713 cycles 1.00
ML-KEM-512 decaps 85312 cycles 85291 cycles 1.00
ML-KEM-768 keypair 101331 cycles 101335 cycles 1.00
ML-KEM-768 encaps 112312 cycles 112126 cycles 1.00
ML-KEM-768 decaps 139136 cycles 138986 cycles 1.00
ML-KEM-1024 keypair 154058 cycles 153913 cycles 1.00
ML-KEM-1024 encaps 172096 cycles 171681 cycles 1.00
ML-KEM-1024 decaps 210998 cycles 206561 cycles 1.02

@oqs-bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks

Benchmark suite Current: 7f40838 Previous: 0632f8e Ratio
ML-KEM-512 keypair 226585 cycles 226726 cycles 1.00
ML-KEM-512 encaps 270582 cycles 270767 cycles 1.00
ML-KEM-512 decaps 344885 cycles 345103 cycles 1.00
ML-KEM-768 keypair 375802 cycles 375969 cycles 1.00
ML-KEM-768 encaps 432799 cycles 433029 cycles 1.00
ML-KEM-768 decaps 529254 cycles 529473 cycles 1.00
ML-KEM-1024 keypair 554679 cycles 554952 cycles 1.00
ML-KEM-1024 encaps 631012 cycles 631498 cycles 1.00
ML-KEM-1024 decaps 752423 cycles 752734 cycles 1.00

@hanno-becker force-pushed the arch_specs branch 2 times, most recently from 8860eac to fed25f8, August 5, 2025 09:50
@hanno-becker marked this pull request as ready for review August 5, 2025 09:50
@hanno-becker requested a review from a team as a code owner August 5, 2025 09:50
@hanno-becker marked this pull request as draft August 5, 2025 11:05
@hanno-becker force-pushed the arch_specs branch 3 times, most recently from eec3423 to c228927, August 6, 2025 13:24
@hanno-becker marked this pull request as ready for review August 6, 2025 13:31
@hanno-becker added the "benchmark" label (this PR should be benchmarked in CI) and removed the "benchmark" label, Aug 6, 2025
@hanno-becker added the "benchmark" label (this PR should be benchmarked in CI) and removed the "benchmark" label, Aug 6, 2025
Fixes #1144

This commit extends the Makefiles used for tests so that, when archflags
are set with AUTO=1, they check both host compiler and host CPU capabilities.

As a result, SHA3 is now enabled for the valgrind constant-time (CT) tests
in CI, which breaks those tests because valgrind does not support the SHA3
instructions. As a workaround, the user can force the results of the
compiler feature detection on the command line.

Signed-off-by: Hanno Becker <[email protected]>
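
The detect-then-override logic described in the commit message can be sketched outside of Make. The Python below is only an illustration under stated assumptions, not the project's Makefile code: the helper names, the `FORCE_SHA3` variable, and the probe flag `-march=armv8.4-a+sha3` are all hypothetical. It probes the compiler by compiling an empty program, looks for the feature in `/proc/cpuinfo`-style text, and lets a forced value take precedence over both.

```python
import os
import shutil
import subprocess
from typing import Optional


def compiler_accepts(cc: str, flag: str) -> bool:
    """Return True if `cc` compiles an empty C program when given `flag`."""
    if shutil.which(cc) is None:
        return False  # no such compiler on PATH
    result = subprocess.run(
        [cc, flag, "-x", "c", "-c", "-", "-o", os.devnull],
        input="int main(void) { return 0; }\n",
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def cpu_has_feature(cpuinfo_text: str, feature: str) -> bool:
    """Search the 'Features' (Arm) or 'flags' (x86) lines for `feature`."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() in ("features", "flags") and feature in value.split():
            return True
    return False


def resolve(forced: Optional[str], detected: bool) -> bool:
    """A forced "0" or "1" wins; anything else falls back to detection."""
    if forced in ("0", "1"):
        return forced == "1"
    return detected


def sha3_enabled(cc: str = "cc", cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Combine compiler probe, CPU probe, and the hypothetical FORCE_SHA3 override."""
    try:
        with open(cpuinfo_path, encoding="utf-8") as f:
            cpuinfo = f.read()
    except OSError:
        cpuinfo = ""  # e.g. not Linux; treat the feature as undetected
    detected = compiler_accepts(cc, "-march=armv8.4-a+sha3") and cpu_has_feature(cpuinfo, "sha3")
    return resolve(os.environ.get("FORCE_SHA3"), detected)
```

In this sketch, setting `FORCE_SHA3=0` in the environment would switch the feature off even on a host whose compiler and CPU both support it, which is the kind of override the valgrind CT tests need.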

Contributor

@mkannwischer left a comment

Thanks @hanno-becker. Mostly looks good.
I ran into some problems when CROSS_PREFIX is passed as an environment variable (like we do in CI).

@mkannwischer merged commit 51d641a into main Aug 8, 2025
360 checks passed
@mkannwischer deleted the arch_specs branch August 8, 2025 05:19
Labels
benchmark this PR should be benchmarked in CI
Development

Successfully merging this pull request may close these issues.

Makefile: Add compile-time detection for compiler and host platform feature support
3 participants