Closed
Labels: multithreading (Base.Threads and related functionality), performance (Must go faster)
Description
I guess the core devs already know about this, but I couldn't find a dedicated issue, so I'm filing one.
I discovered that the new per-field atomics API (@atomic) cannot match the performance of the previous API (Threads.Atomic).
mutable struct Atomic{T}
    @atomic data::T
end
increment!(x::Atomic) = @atomic x.data += 1
increment!(x::Threads.Atomic) = Threads.atomic_add!(x, 1)

As can be seen, the new API is almost 30 times slower:
julia> using BenchmarkTools

julia> x = Atomic{Int}(0);

julia> @btime for _ in 1:1_000_000
           increment!($x)
       end
  109.877 ms (3000000 allocations: 61.04 MiB)

julia> x = Threads.Atomic{Int}(0);

julia> @btime for _ in 1:1_000_000
           increment!($x)
       end
  3.895 ms (0 allocations: 0 bytes)
Of course, this is because the @atomic x.data += 1 call fails to be optimized down to a lock xadd instruction sequence on AMD64. If we deprecate the old API, I think the new API should provide an alternative that is comparable in performance.
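One way to check what each method compiles to is to dump the LLVM IR and look for an `atomicrmw` instruction, which is what LLVM typically lowers to a `lock`-prefixed add or xadd on AMD64. The following is only a sketch: the struct is renamed to `FieldAtomic` (a hypothetical name, used here to keep it visually distinct from `Threads.Atomic`), the helper `llvm_ir` is also made up for illustration, and the result will depend on the Julia version in use.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical name for illustration; same layout as the struct in this issue.
mutable struct FieldAtomic{T}
    @atomic data::T
end

increment!(x::FieldAtomic) = @atomic x.data += 1
increment!(x::Threads.Atomic) = Threads.atomic_add!(x, 1)

# Hypothetical helper: capture the optimized LLVM IR of a method as a String.
llvm_ir(f, types) = sprint(io -> code_llvm(io, f, types))

ir_field = llvm_ir(increment!, (FieldAtomic{Int},))
ir_old   = llvm_ir(increment!, (Threads.Atomic{Int},))

# Report whether each method contains a native atomic read-modify-write.
println("per-field @atomic uses atomicrmw: ", occursin("atomicrmw", ir_field))
println("Threads.Atomic uses atomicrmw:    ", occursin("atomicrmw", ir_old))
```

If the first line prints `false`, the per-field version is going through a slower path (for example a runtime call or a compare-and-swap loop) rather than a single atomic add. Note also that the two methods return different values: `@atomic x.data += 1` returns the new value, while `Threads.atomic_add!` returns the old one.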