Gradient dimension mismatch error when training RNNs #1891

```
using Flux

# Each element is a 1×3 Matrix{Float32} (1 feature × batch of 3); 2 time steps
xs = [[1f0 2f0 3f0], [2f0 3f0 4f0]]
ys = [[2f0 3f0 4f0], [3f0 4f0 5f0]]
m = GRU(1, 1)

function loss(xs, ys)
    Flux.reset!(m)
    sum(Flux.mse.([m(x) for x in xs], ys))
end

opt = ADAM()
ps = params(m)
grads = gradient(ps) do
    loss(xs, ys)
end

julia> Flux.update!(opt, ps, grads)
ERROR: DimensionMismatch("new dimensions (1, 1) must be consistent with array size 3")
Stacktrace:
 [1] (::Base.var"#throw_dmrsa#272")(dims::Tuple{Int64, Int64}, len::Int64)
   @ Base ./reshapedarray.jl:41
 [2] reshape
   @ ./reshapedarray.jl:45 [inlined]
 [3] reshape
   @ ./reshapedarray.jl:116 [inlined]
 [4] restructure
   @ ~/.julia/packages/ArrayInterface/mJodK/src/ArrayInterface.jl:400 [inlined]
 [5] update!(opt::ADAM, x::Matrix{Float32}, x̄::Matrix{Float32})
   @ Flux.Optimise ~/.julia/packages/Flux/qAdFM/src/optimise/train.jl:24
 [6] update!(opt::ADAM, xs::Zygote.Params, gs::Zygote.Grads)
   @ Flux.Optimise ~/.julia/packages/Flux/qAdFM/src/optimise/train.jl:32
 [7] top-level scope
   @ REPL[9]:1
 [8] top-level scope
   @ ~/.julia/packages/CUDA/Axzxe/src/initialization.jl:52

julia> [size(p) for p in ps]
4-element Vector{Tuple{Int64, Vararg{Int64}}}:
 (3, 1)
 (3, 1)
 (3,)
 (1, 1)

julia> [size(grads[p]) for p in ps]
4-element Vector{Tuple{Int64, Vararg{Int64}}}:
 (3, 1)
 (3, 1)
 (3,)
 (1, 3)
```
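This happens when using `RNN` or `GRU`, but not when using `LSTM`. As the sizes above show, the gradient for the fourth parameter has size `(1, 3)` while the parameter itself is `(1, 1)`, which is the reshape that `update!` fails on.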
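For anyone trying to narrow this down, here is a minimal sketch of a check that reports which parameters have a gradient of a different size. `size_mismatches` is a hypothetical helper, not part of Flux or Zygote, and the demo uses plain stand-in arrays with the sizes printed above rather than a real model, so it runs without Flux.

```julia
# Hypothetical helper (not a Flux/Zygote function): collect the index,
# parameter size, and gradient size for every parameter whose gradient
# does not have the same size as the parameter.
function size_mismatches(params, grad_of)
    mismatches = Tuple{Int, Dims, Dims}[]
    for (i, p) in enumerate(params)
        g = grad_of(p)
        g === nothing && continue  # skip parameters that have no gradient
        size(p) == size(g) || push!(mismatches, (i, size(p), size(g)))
    end
    return mismatches
end

# Stand-in arrays (assumption: these only mimic the sizes reported above).
ps_demo = (zeros(Float32, 3, 1), zeros(Float32, 3, 1),
           zeros(Float32, 3), zeros(Float32, 1, 1))

# Gradients keyed by object identity, so equal-looking arrays stay distinct.
gs_demo = IdDict{Any, Any}(
    ps_demo[1] => zeros(Float32, 3, 1),
    ps_demo[2] => zeros(Float32, 3, 1),
    ps_demo[3] => zeros(Float32, 3),
    ps_demo[4] => zeros(Float32, 1, 3),  # the mismatched gradient
)

result = size_mismatches(ps_demo, p -> get(gs_demo, p, nothing))
println(result)
```

With the objects from the report, the equivalent call would presumably be `size_mismatches(ps, p -> grads[p])`, assuming `grads[p]` returns either an array or `nothing` for each parameter.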
