This repository was archived by the owner on Dec 18, 2018. It is now read-only.

Loop unroll; avoid bounds checks #399

Closed
wants to merge 1 commit into from

Conversation

benaadams
Contributor

No description provided.

@davidfowl
Member

@benaadams Before/after numbers?

@benaadams
Contributor Author

Will benchmark

@@ -13,7 +13,7 @@ public class AsciiDecoderTests
  [Fact]
  private void FullByteRangeSupported()
  {
-     var byteRange = Enumerable.Range(0, 255).Select(x => (byte)x).ToArray();
+     var byteRange = Enumerable.Range(0, 256).Select(x => (byte)x).ToArray();
Member

?

Contributor Author

I got it wrong: the second argument to Enumerable.Range is a count, not an upper bound. The current code still passes with this change, though.
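
To make the off-by-one concrete: `Enumerable.Range(start, count)` yields `count` consecutive integers beginning at `start`. The sketch below builds the array the same way the test does; the `RangeDemo` class and `BuildByteRange` helper are names invented here for illustration and are not part of the PR.

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    // Hypothetical helper: mirrors the test's expression, with the count as a parameter.
    public static byte[] BuildByteRange(int count) =>
        Enumerable.Range(0, count).Select(x => (byte)x).ToArray();
}
```

With a count of 255 the array has 255 elements and stops at 0xFE; with a count of 256 it covers every byte value from 0x00 through 0xFF.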

@benaadams
Contributor Author

Current is SequentialArray; proposed is UnrolledParallelPointer.

BenchmarkDotNet=v0.7.8.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz, ProcessorCount=8
HostCLR=MS.NET 4.0.30319.42000, Arch=64-bit  [RyuJIT]
Type=AsciiToString  Mode=Throughput  Platform=X64  Jit=RyuJit  .NET=HostFramework
| Method | Param | AvrTime | StdDev | op/s |
|---|---:|---:|---:|---:|
| SequentialArray | 1 | 1.8446 ns | 0.0359 ns | 542,130,112.58 |
| SequentialPointer | 1 | 1.4709 ns | 0.0290 ns | 679,872,918.30 |
| UnrolledPointer | 1 | 1.5475 ns | 0.0068 ns | 646,184,034.29 |
| UnrolledParallelPointer | 1 | 1.4456 ns | 0.0280 ns | 691,731,058.85 |
| SequentialArray | 2 | 1.8347 ns | 0.0301 ns | 545,041,308.79 |
| SequentialPointer | 2 | 1.7573 ns | 0.0055 ns | 569,058,870.74 |
| UnrolledPointer | 2 | 1.7582 ns | 0.0302 ns | 568,748,690.53 |
| UnrolledParallelPointer | 2 | 1.6387 ns | 0.0255 ns | 610,257,762.49 |
| SequentialArray | 8 | 2.6034 ns | 0.0473 ns | 384,119,102.12 |
| SequentialPointer | 8 | 2.7821 ns | 0.0495 ns | 359,435,697.96 |
| UnrolledPointer | 8 | 2.3780 ns | 0.0419 ns | 420,515,824.62 |
| UnrolledParallelPointer | 8 | 1.9083 ns | 0.0923 ns | 524,037,000.21 |
| SequentialArray | 64 | 15.5845 ns | 1.2281 ns | 64,166,181.67 |
| SequentialPointer | 64 | 12.2456 ns | 0.2320 ns | 81,662,055.86 |
| UnrolledPointer | 64 | 9.5331 ns | 0.1572 ns | 104,897,218.65 |
| UnrolledParallelPointer | 64 | 5.8002 ns | 0.1617 ns | 172,408,326.49 |
| SequentialArray | 512 | 109.1622 ns | 3.9246 ns | 9,160,708.35 |
| SequentialPointer | 512 | 77.9059 ns | 1.3984 ns | 12,836,002.73 |
| UnrolledPointer | 512 | 68.6465 ns | 1.2514 ns | 14,567,379.93 |
| UnrolledParallelPointer | 512 | 39.9602 ns | 1.6196 ns | 25,024,919.12 |
| SequentialArray | 1024 | 216.8628 ns | 4.3980 ns | 4,611,210.56 |
| SequentialPointer | 1024 | 147.6454 ns | 2.6145 ns | 6,772,985.46 |
| UnrolledPointer | 1024 | 134.3055 ns | 2.3314 ns | 7,445,713.61 |
| UnrolledParallelPointer | 1024 | 77.5874 ns | 1.8344 ns | 12,888,690.04 |
| SequentialArray | 4096 | 838.0113 ns | 8.0319 ns | 1,193,301.35 |
| SequentialPointer | 4096 | 603.3346 ns | 10.7317 ns | 1,657,455.03 |
| UnrolledPointer | 4096 | 528.5438 ns | 8.8499 ns | 1,891,990.90 |
| UnrolledParallelPointer | 4096 | 306.2025 ns | 4.9520 ns | 3,265,812.70 |
| SequentialArray | 8192 | 1,665.9055 ns | 17.9379 ns | 600,274.20 |
| SequentialPointer | 8192 | 1,203.2110 ns | 20.8846 ns | 831,109.43 |
| UnrolledPointer | 8192 | 1,054.4902 ns | 0.7537 ns | 948,325.53 |
| UnrolledParallelPointer | 8192 | 606.6170 ns | 10.3170 ns | 1,648,486.66 |
| SequentialArray | 16384 | 3,331.0892 ns | 31.2668 ns | 300,202.12 |
| SequentialPointer | 16384 | 2,404.2652 ns | 1.2482 ns | 415,927.49 |
| UnrolledPointer | 16384 | 2,105.9399 ns | 2.7216 ns | 474,847.36 |
| UnrolledParallelPointer | 16384 | 1,447.2493 ns | 25.6027 ns | 690,965.98 |

Approx 1 hour to benchmark
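
The benchmarked method bodies are not shown in this conversation, so the following is only a rough sketch of the general technique the PR title names (loop unrolling and pointer access to avoid array bounds checks), not the author's implementation. The class name `AsciiSketch` is invented here, the method names are borrowed from the benchmark table purely as labels, and the "Parallel" variant is not reproduced because its details are not given. The code widens each byte to a `char` without validating that it is 7-bit ASCII, and it needs unsafe code enabled (`AllowUnsafeBlocks`) to compile.

```csharp
using System;

public static class AsciiSketch
{
    // Baseline: one byte per iteration through the array indexers.
    public static char[] SequentialArray(byte[] input)
    {
        var output = new char[input.Length];
        for (var i = 0; i < input.Length; i++)
        {
            output[i] = (char)input[i];
        }
        return output;
    }

    // Assumed shape of an unrolled pointer loop: four bytes per iteration
    // through pinned pointers, then a scalar loop for the remaining 0-3 bytes.
    public static unsafe char[] UnrolledPointer(byte[] input)
    {
        var output = new char[input.Length];
        fixed (byte* pIn = input)
        fixed (char* pOut = output)
        {
            byte* src = pIn;
            char* dst = pOut;
            int remaining = input.Length;

            while (remaining >= 4)
            {
                dst[0] = (char)src[0];
                dst[1] = (char)src[1];
                dst[2] = (char)src[2];
                dst[3] = (char)src[3];
                src += 4;
                dst += 4;
                remaining -= 4;
            }

            while (remaining > 0)
            {
                *dst++ = (char)*src++;
                remaining--;
            }
        }
        return output;
    }
}
```

Both methods should return identical output for any input; the pointer version trades the safety of bounds-checked indexing for fewer checks and branches per byte.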

@benaadams
Contributor Author

@davidfowl The new implementation is:

x 1.36 for 8 bytes
x 2 for 64 bytes
x 2.7 for 512 bytes

(Not including the string constructor, which is the same in all cases; the GC cost would dwarf the ASCII decode.)
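
These factors can be checked against the benchmark table by dividing the SequentialArray mean time by the UnrolledParallelPointer mean time at the same size. A minimal sketch follows; `SpeedupDemo` and `Speedup` are names made up for this illustration.

```csharp
using System;

public static class SpeedupDemo
{
    // Ratio of the current mean time to the proposed mean time (both in ns).
    public static double Speedup(double currentNs, double proposedNs) =>
        currentNs / proposedNs;
}
```

Using the table's figures, `Speedup(2.6034, 1.9083)` is about 1.36 for 8 bytes, `Speedup(15.5845, 5.8002)` is about 2.69 for 64 bytes, and `Speedup(109.1622, 39.9602)` is about 2.73 for 512 bytes.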

benaadams force-pushed the faster-GetAsciiString branch from 161ea88 to c3042ba on November 18, 2015 19:34
benaadams force-pushed the faster-GetAsciiString branch from 66f8171 to 0f9abfb on November 18, 2015 20:04
This was referenced Nov 20, 2015
@benaadams
Contributor Author

Will come back to this

benaadams closed this on Dec 9, 2015
benaadams deleted the faster-GetAsciiString branch on May 10, 2016 02:49