I have a reimplementation of LinearAlgebra.qr in my codebase which uses @turbo. In it there is:
using LinearAlgebra, LoopVectorization, BenchmarkTools

function reflectorApply!(x::AbstractVector{<: Real}, τ::Real, A::StridedMatrix{<: Real})
    m, n = size(A)
    @inbounds for j = 1:n
        # dot
        vAj = A[1, j]
        @turbo for i = 2:m
            vAj += conj(x[i]) * A[i, j]
        end
        vAj = conj(τ)*vAj
        # ger
        A[1, j] -= vAj
        @turbo for i = 2:m
            A[i, j] -= x[i]*vAj
        end
    end
    return A
end
which is called with two views into the same array:
input = rand(64, 64)
n = 64
j = 17
τ = 1.7
x = view(input, j:n, j)        # reflector: rows j:n of column j
y = view(input, j:n, j+1:n)    # trailing block: rows j:n, columns j+1:n
@benchmark reflectorApply!($x, $τ, $y)
I've noticed that after a package update things have been running more slowly, and I identified that the problem is here. Trying different versions of LoopVectorization showed that v0.12.119 made this much slower:
# median times via @benchmark, after julia restart
# no turbo: 2.9µs
# 0.12.124: 48µs
# 0.12.120: 46µs
# 0.12.119: 46µs
# 0.12.118: 0.72µs
# 0.12.115: 0.74µs
# 0.12.110: 0.72µs
# 0.12.100: 0.71µs
Rewriting this to use no views restores the performance from before 0.12.119.
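For reference, here is a rough sketch of what a no-view variant could look like. It is not the exact code from my codebase: the name reflectorApplyNoViews! and its signature (parent matrix plus column index j) are made up for illustration, and the inner loops use @simd instead of @turbo so the snippet runs without LoopVectorization. It indexes the parent array directly, with rows j:m of column j playing the role of x and columns j+1:n playing the role of A.

```julia
# Hypothetical no-view variant (name and signature are assumptions for
# illustration): operate on the parent matrix and index it with offsets
# instead of passing SubArrays.
function reflectorApplyNoViews!(A::Matrix{<:Real}, τ::Real, j::Int)
    m, n = size(A)
    @inbounds for k = j+1:n
        # dot: column j (rows j+1:m) holds the reflector, first entry implicitly 1
        vAk = A[j, k]
        @simd for i = j+1:m
            vAk += A[i, j] * A[i, k]
        end
        vAk = τ * vAk
        # ger
        A[j, k] -= vAk
        @simd for i = j+1:m
            A[i, k] -= A[i, j] * vAk
        end
    end
    return A
end

# Same setup as above, but without constructing views.
input = rand(64, 64)
reflectorApplyNoViews!(input, 1.7, 17)
```

Since @simd may reorder the floating-point reduction, results should be compared against the view-based version with ≈ rather than ==.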