As of Zygote version 0.6.15, I get the following:
julia> Zygote.gradient(^, 0.0, 0.9)
(0.0, 0.0)
julia> Zygote.gradient((x, y) -> sum(x .^ y), zeros(3), fill(0.9, 3))
([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
julia> f(x, y) = x ^ y;
julia> Zygote.gradient((x, y) -> sum(f.(x, y)), zeros(3), fill(0.9, 3))
([Inf, Inf, Inf], [NaN, NaN, NaN])
The same code in version 0.6.14:
julia> Zygote.gradient(^, 0.0, 0.9)
(0.0, 0.0)
julia> Zygote.gradient((x, y) -> sum(x .^ y), zeros(3), fill(0.9, 3))
([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
julia> f(x, y) = x ^ y;
julia> Zygote.gradient((x, y) -> sum(f.(x, y)), zeros(3), fill(0.9, 3))
([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
The crucial difference is in the last case of each: broadcasting the user-defined f used to give zero gradients and now gives Inf and NaN.
If you do this with ForwardDiff, you get
julia> ForwardDiff.gradient(x -> x[1]^x[2], [0.0, 0.9])
2-element Vector{Float64}:
Inf
NaN
So this smells to me like ForwardDiff is being invoked for f, whereas individual chain rules are used for sum(x .^ y), because Zygote un-fuses broadcasting where it can (it can't un-fuse f).
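For reference, the Inf/NaN pair is what the textbook partial derivatives of x^y give when evaluated naively in floating point at x = 0, y = 0.9. This is only a sketch of the arithmetic, not a claim about how either package implements it internally:

```latex
\begin{align*}
\frac{\partial}{\partial x}\, x^{y} &= y\,x^{y-1}
  &&\Rightarrow\quad 0.9 \cdot 0^{-0.1} = 0.9 \cdot \infty = \infty, \\
\frac{\partial}{\partial y}\, x^{y} &= x^{y}\,\log x
  &&\Rightarrow\quad 0^{0.9} \cdot \log 0 = 0 \cdot (-\infty) = \mathrm{NaN}.
\end{align*}
```

The NaN in the second line comes purely from IEEE arithmetic: for y > 0, x^y log x tends to 0 as x tends to 0 from above, which agrees with the 0.0 that Zygote.gradient(^, 0.0, 0.9) returns for y.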
@oxinabox @mcabbott @DhairyaLGandhi I'm assuming that this is related to our recent efficiency upgrades?
This is breaking the build in KernelFunctions.jl. We can of course allow the tests to fail for the time being, but it essentially means that we can't use Zygote with one of our kernels.