In the CKY implementation in #586, trying to take a gradient gives the following error:
Can't differentiate wrt type (Ref h' ((tmp77<..) => Float32))
CallStack (from HasCallStack):
error, called at src/lib/Autodiff.hs:402:23 in dex-0.1.0.0-1c1KJ1laoEjKvvidxV1QZ9:Autodiff
I think this should be a valid thing to do, since I've been able to use State and grad together before. My guess is that the slice indexing is what breaks it.