Description
Hey there!
I'm getting an error when trying to do a custom implementation of a neural network (for fun). My code runs fine with the CPU backend (slowly, but fine), but on the CUDA and OpenCL backends it crashes with the following (actually, only OpenCL shows this; CUDA doesn't print anything, but I assume it's the same issue):
```
In function void opencl::evalNodes(std::vector<opencl::Param>&, std::vector<opencl::JIT::Node*>)
In file src/backend/opencl/jit.cpp:267
OpenCL Error (-48): Invalid Kernel when calling clEnqueueNDRangeKernel
In function af_err af_transpose(void**, af_array, bool)
In file src/api/c/transpose.cpp:51
thread 'main' panicked at 'Error message: Eror either in ArrayFire or in a project upstream', /home/jared/.cargo/registry/src/git.colasdn.top-1ecc6299db9ec823/arrayfire-3.4.0/src/error.rs:72
```
Happy to provide a Rust stacktrace and the code if it's helpful. This is on Ubuntu 16.04, CUDA 8.0, ArrayFire 3.4.2, arrayfire-rust 3.4.0, running on an Nvidia GeForce GTX 980 Ti. Let me know what other information might be useful, and thanks in advance =D
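For context, here's a stripped-down sketch of the kind of path that I'd guess triggers it (hypothetical code, not from my project): build an array out of element-wise ops so it stays a lazy JIT node, then transpose it on the OpenCL backend:

```rust
use arrayfire::{af_print, constant, mul, set_backend, transpose, Backend, Dim4};

fn main() {
    // Hypothetical minimal repro sketch; the real failure is inside a backprop fold.
    set_backend(Backend::OPENCL);

    // Element-wise ops stay lazy (JIT nodes) until something forces evaluation.
    let a = constant::<f64>(1.0, Dim4::new(&[4, 1, 1, 1]));
    let b = mul(&a, &a, false);

    // transpose() forces evaluation of the pending JIT tree; this is roughly
    // where the "Invalid Kernel" error surfaces for me.
    let t = transpose(&b, false);
    af_print!("t", t);
}
```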
Edit: In case it's helpful, here's the Rust code:
```rust
// `iter` yields items of type (usize, (&Array, &Array)) => (index, (activation, zeta)),
// walking the layers from last to first.
let iter = activations.iter().rev().zip(zetas.iter().rev()).rev().enumerate().rev();

// inject / fold / reduce with a blank starting delta, receiving
// Array, (usize, (&Array, &Array)) => delta, (index, (activation, zeta))
let _ = iter.fold(constant::<f64>(0f64, Dim4::new(&[1, 1, 1, 1])), |delta, (index, (activation, zeta))| {
    if index == self.num_layers - 1 {
        // Output layer: seed the delta from the cost derivative.
        self.cost_derivative(activation, target) * tanh_prime(zeta)
    } else if index == 0 {
        // Input layer: record the gradients; nothing left to propagate to.
        nabla_b.push(delta.clone());
        let activation_t = transpose(activation, false);
        let mul = matmul(&delta, &activation_t, MatProp::NONE, MatProp::NONE);
        nabla_w.push(mul.clone());
        delta
    } else {
        // Hidden layers: record the gradients, then propagate the delta back a layer.
        let activation_t = transpose(activation, false);
        let mul = matmul(&delta, &activation_t, MatProp::NONE, MatProp::NONE);
        nabla_w.push(mul.clone());
        nabla_b.push(delta.clone());
        let activated = tanh_prime(zeta);
        let weight = transpose(&self.weights[index], false);
        matmul(&weight, &delta, MatProp::NONE, MatProp::NONE) * activated
    }
});
```
It's blowing up at the `let activation_t = transpose(activation, false);` line in the `else` clause (presumably because that clause is hit before the `else if` clause, since the fold walks the layers in reverse?). Specifically, it blows up on any access to `activation`'s data, even an `af_print!("", activation)` call.
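In case it helps triage: forcing the pending JIT tree to evaluate right before the transpose seems like a plausible thing to try. This is just a guess on my part that the lazy evaluation is what's tripping; a sketch:

```rust
// Guessed workaround: eval() forces any pending JIT computation on the
// array before transpose() has to consume it.
activation.eval();
let activation_t = transpose(activation, false);
```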
In the `iter` construction up there, both `activations` and `zetas` are of type `Vec<Array>`. This only happens when, for some reason, I create a neural network whose last layer has more than 3 nodes in it.
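To make the iteration order concrete, here's a standalone sketch of that same chain with string slices standing in for the `Array`s:

```rust
fn main() {
    let activations = vec!["a0", "a1", "a2"];
    let zetas = vec!["z0", "z1", "z2"];

    // Pair the two vectors from the back (the double rev keeps them aligned
    // at the last element even if their lengths differ), then walk the pairs
    // last-to-first while keeping the forward indices from enumerate().
    let iter = activations.iter().rev().zip(zetas.iter().rev()).rev().enumerate().rev();

    for (index, (activation, zeta)) in iter {
        println!("{}: ({}, {})", index, activation, zeta);
    }
    // Prints:
    // 2: (a2, z2)
    // 1: (a1, z1)
    // 0: (a0, z0)
}
```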
Edit 2: I'm using ArrayFire 3.4.2, not 3.4.0:

```
ArrayFire v3.4.2 (OpenCL, 64-bit Linux, build 2da9967)
[0] NVIDIA : GeForce GTX 980 Ti, 6076 MB
```