sorting networks #1

@scandum

I haven't done much work on sorting lately, but figured I'd share some findings.

I looked into unstable sorting networks this week and haven't been able to reproduce the suggested performance gain. I suspect there's some instruction cache pollution due to the large code size when utilizing sorting networks in a quicksort.
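For reference, this is the kind of network I mean. It's a minimal sketch of the standard 5-comparator network for 4 elements, assuming plain `int` keys; `cswap` and `sort4_network` are names made up for illustration and aren't taken from any of the sorts mentioned here. Every comparator is fully unrolled, which is where the code size comes from once you go to 8 or 16 elements.

```c
/* Compare-and-swap: after the call, *a <= *b. */
void cswap(int *a, int *b)
{
    int lo = *a < *b ? *a : *b;
    int hi = *a < *b ? *b : *a;

    *a = lo;
    *b = hi;
}

/* 5-comparator sorting network for 4 elements.
   Unstable: the (0,2) and (1,3) comparators can reorder equal keys. */
void sort4_network(int *v)
{
    cswap(&v[0], &v[1]);
    cswap(&v[2], &v[3]);
    cswap(&v[0], &v[2]);
    cswap(&v[1], &v[3]);
    cswap(&v[1], &v[2]);
}
```

Calling `sort4_network` on a 4-element array sorts it in place with a fixed sequence of comparisons, regardless of the input order.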

So far my best results have been using piposort on a threshold of 96, with unrolled 4, 8, 16 element parity merges and twice-unguarded insertion to fill the gaps.
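To illustrate the parity merge idea, here's a rough sketch, assuming `int` keys and two adjacent sorted blocks of equal length. It's written as a loop rather than unrolled, and `parity_merge` is a hypothetical name, so don't read it as the actual piposort code. The front half of the output is filled from the heads and the back half from the tails at the same time, so after `n` steps all `2n` slots are written without any bounds checks.

```c
#include <stddef.h>

/* Merge the sorted blocks src[0..n) and src[n..2n) into dst[0..2n).
   Each step writes one element at the front and one at the back. */
void parity_merge(int *dst, const int *src, size_t n)
{
    ptrdiff_t lh = 0;                      /* head of left block  */
    ptrdiff_t rh = (ptrdiff_t)n;           /* head of right block */
    ptrdiff_t lt = (ptrdiff_t)n - 1;       /* tail of left block  */
    ptrdiff_t rt = 2 * (ptrdiff_t)n - 1;   /* tail of right block */
    ptrdiff_t dh = 0;                      /* next front slot     */
    ptrdiff_t dt = 2 * (ptrdiff_t)n - 1;   /* next back slot      */

    for (size_t k = 0; k < n; k++) {
        /* Front: take the smaller head, left block wins ties. */
        dst[dh++] = (src[lh] <= src[rh]) ? src[lh++] : src[rh++];
        /* Back: take the larger tail, right block wins ties. */
        dst[dt--] = (src[lt] > src[rt]) ? src[lt--] : src[rt--];
    }
}
```

Because the two blocks have the same length, neither head nor tail index can run out of its block within `n` steps, which is what lets the loop body go without guards.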

As for the high performance reported by rust sorts, I suspect it's primarily due to rust compiling conditional selects (the equivalent of C's ? : ternary operations) as branchless. This makes the benchmarks quite misleading, since gcc doesn't do the same.
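To make the difference concrete, here's a small sketch of the two styles, assuming `int` keys; both function names are made up for illustration. Whether the ternary version ends up as a conditional move or a branch depends on the compiler, version and flags, so I'm not claiming anything about the generated code here. The indexed version uses the comparison result as an array index, so the source itself contains no conditional at all.

```c
/* Ternary style: relies on the compiler choosing a branchless lowering. */
void sort2_ternary(int *v)
{
    int a = v[0], b = v[1];

    v[0] = a > b ? b : a;
    v[1] = a > b ? a : b;
}

/* Indexed style: the comparison result (0 or 1) selects the slot. */
void sort2_indexed(int *v)
{
    int a = v[0], b = v[1];
    int x = a > b;   /* 1 if the pair is out of order, else 0 */

    v[x] = a;        /* a lands in slot 1 only when it is larger */
    v[!x] = b;
}
```

Both functions leave the pair in ascending order. The indexed one is the style to reach for when the compiler can't be trusted to emit a conditional move.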

When comparing crumsort compiled with clang to pdqsort compiled with g++, pdqsort is nearly two times slower than crumsort for 10000 elements.
