sorting networks #1

I haven't done much work on sorting lately, but figured I'd share some findings.

I looked into unstable sorting networks this week and haven't been able to reproduce the suggested performance gain. I suspect there's some instruction-cache pollution due to the large code size when sorting networks are used inside a quicksort.
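For readers unfamiliar with the term, here is a minimal sketch of an unstable sorting network for 4 `int` keys. The names `cmp_swap` and `sort4_network` are made up for illustration and are not taken from any of the libraries mentioned here. Every comparator is a separate piece of straight-line code, so networks for 8 or 16 elements grow quickly in code size, which is the effect suspected above.

```c
#include <stddef.h>

/* Compare-exchange: after the call, *a <= *b.
   Written with ternaries so the compiler is free to use
   conditional moves; whether it does is compiler-dependent. */
static inline void cmp_swap(int *a, int *b)
{
    int x = *a, y = *b;
    *a = x < y ? x : y;
    *b = x < y ? y : x;
}

/* 5-comparator network for 4 elements (hypothetical helper).
   The sequence of comparisons is fixed and independent of the data. */
static void sort4_network(int v[4])
{
    cmp_swap(&v[0], &v[1]);
    cmp_swap(&v[2], &v[3]);
    cmp_swap(&v[0], &v[2]);
    cmp_swap(&v[1], &v[3]);
    cmp_swap(&v[1], &v[2]);
}
```

In a quicksort, a routine like this would be called on partitions that fall below some small size threshold.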

So far my best results have come from piposort with a threshold of 96, using unrolled 4-, 8-, and 16-element parity merges and twice-unguarded insertion to fill the gaps.
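A rough sketch of the parity merge idea, assuming `int` keys and two sorted halves of equal length `n` stored back to back in `src`. This is a plain loop rather than the unrolled 4/8/16 variants mentioned above, and the function name and signature are illustrative only, not piposort's actual code. The head merge emits the `n` smallest elements while the tail merge emits the `n` largest, so neither direction can run past its half and no bounds checks are needed.

```c
#include <stddef.h>

/* Merge src[0..n) and src[n..2n) (both sorted) into dest[0..2n).
   Requires n >= 1 and dest not overlapping src. */
void parity_merge(int *dest, const int *src, size_t n)
{
    size_t hl = 0, hr = n;              /* heads of left and right half */
    size_t tl = n - 1, tr = 2 * n - 1;  /* tails of left and right half */
    size_t hd = 0, td = 2 * n - 1;      /* output head and tail */

    for (size_t i = 0; i < n; i++) {
        /* Head step: take the smaller head, left wins ties. */
        dest[hd++] = src[hl] <= src[hr] ? src[hl++] : src[hr++];
        /* Tail step: take the larger tail, right wins ties.
           tl may wrap after the final step; it is never read again. */
        dest[td--] = src[tl] > src[tr] ? src[tl--] : src[tr--];
    }
}
```

Because the trip count is known up front, the loop body can be unrolled for fixed sizes without any data-dependent exit condition.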

As for the high performance reported for Rust sorts, I suspect it's primarily due to Rust compiling `? :` ternary operations as branchless code. This makes the benchmarks quite misleading, since gcc does not do the same.
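To make the ternary point concrete, here is a sketch of the same merge loop written two ways, assuming `int` keys; both function names are invented for this example. The first relies on the compiler turning `? :` into a conditional move, which depends on the compiler, version, and flags. The second expresses the selection with pointer arithmetic on the comparison result, a workaround of the kind used in the quadsort family as far as I recall, so the selection itself contains no branch to mispredict.

```c
#include <stddef.h>

/* Ternary form: branchless only if the compiler chooses to make it so. */
void merge_ternary(int *dest, const int *l, size_t nl,
                   const int *r, size_t nr)
{
    const int *le = l + nl, *re = r + nr;
    while (l < le && r < re)
        *dest++ = (*l <= *r) ? *l++ : *r++;
    while (l < le) *dest++ = *l++;
    while (r < re) *dest++ = *r++;
}

/* Arithmetic form: the comparison result is used as an offset. */
void merge_arith(int *dest, const int *l, size_t nl,
                 const int *r, size_t nr)
{
    const int *le = l + nl, *re = r + nr;
    while (l < le && r < re) {
        size_t x = (*l <= *r);  /* 1 if the left element wins, else 0 */
        *dest = *l;             /* speculatively store the left element */
        l += x;
        dest[x] = *r;           /* x == 0: overwrite with the right element;
                                   x == 1: scratch write to the next slot,
                                   which is still inside the output because
                                   at least two elements remain */
        r += !x;
        dest++;
    }
    while (l < le) *dest++ = *l++;
    while (r < re) *dest++ = *r++;
}
```

Comparing the generated assembly of the two functions under gcc and clang is a quick way to check how each compiler treats the ternary form on a given setup.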

When comparing crumsort compiled with clang to pdqsort compiled with g++, pdqsort is nearly twice as slow as crumsort for 10000 elements.
