Getting Invalid Kernel when calling clEnqueueNDRangeKernel #95

Closed
@jaredonline

Description

Hey there!

I'm getting an error when trying to do a custom implementation of a neural network (for fun). My code runs fine with the CPU backend (slowly, but fine), but with the CUDA and OpenCL backends it crashes with the following. (Actually, OpenCL shows this; CUDA doesn't print anything, but I assume it's the same issue.)

In function void opencl::evalNodes(std::vector<opencl::Param>&, std::vector<opencl::JIT::Node*>)
In file src/backend/opencl/jit.cpp:267
OpenCL Error (-48): Invalid Kernel when calling clEnqueueNDRangeKernel

In function af_err af_transpose(void**, af_array, bool)
In file src/api/c/transpose.cpp:51

thread 'main' panicked at 'Error message: Eror either in ArrayFire or in a project upstream', /home/jared/.cargo/registry/src/github.com-1ecc6299db9ec823/arrayfire-3.4.0/src/error.rs:72

Happy to provide a Rust stack trace and the code if it's helpful. This is on Ubuntu 16.04, CUDA 8.0, ArrayFire 3.4.2, arrayfire-rust 3.4.0, running on an Nvidia GeForce GTX 980 Ti. Let me know what other information might be useful, and thanks in advance =D

Edit: In case it's helpful, here's the Rust code:

// iter of type (usize, (Array, Array)) => (index, (activation, zeta))
let iter = activations.iter().rev().zip(zetas.iter().rev()).rev().enumerate().rev();
// inject / fold / reduce with starting blank delta, receiving
// Array, (usize, (Array, Array)) => delta, (index, (activation, zeta))
let _ = iter.fold(constant::<f64>(0f64, Dim4::new(&[1, 1, 1, 1])), |delta, (index, (activation, zeta))| {
    if index == self.num_layers - 1 {
        self.cost_derivative(activation, target) * tanh_prime(zeta)
    } else if index == 0 {
        nabla_b.push(delta.clone());
        let activation_t = transpose(activation, false);
        let mul = matmul(&delta, &activation_t, MatProp::NONE, MatProp::NONE);
        nabla_w.push(mul.clone());
        delta
    } else {
        let activation_t = transpose(activation, false);
        let mul = matmul(&delta, &activation_t, MatProp::NONE, MatProp::NONE);
        nabla_w.push(mul.clone());
        nabla_b.push(delta.clone());

        let activated = tanh_prime(zeta);
        let weight = transpose(&self.weights[index], false);
        let new_delta = matmul(&weight, &delta, MatProp::NONE, MatProp::NONE) * activated;
        new_delta
    }
});

It's blowing up at the let activation_t = transpose(activation, false); line in the else clause (presumably because that clause is hit before the else if clause?). Specifically, it blows up on any access to the activation's data, even an af_print!("", activation) call.
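In case a minimal version is easier to poke at, here's a stripped-down sketch of what I think the failing path boils down to, assuming the trigger is the lazy JIT evaluation running on first data access (the shape and names are made up for illustration, and tanh stands in for my own tanh_prime):

#[macro_use]
extern crate arrayfire;

use arrayfire::{constant, tanh, transpose, Dim4};

fn main() {
    // A column vector standing in for a layer with more than 3 nodes.
    let zeta = constant::<f64>(0.5f64, Dim4::new(&[4, 1, 1, 1]));
    // tanh builds a lazy JIT node on the CUDA/OpenCL backends.
    let activation = tanh(&zeta);
    // The first real access forces the JIT kernel to build and run;
    // this is where I'd expect the Invalid Kernel error to surface.
    let activation_t = transpose(&activation, false);
    af_print!("activation_t", activation_t);
}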

In the iter construction up there, both activations and zetas are of type Vec<Array>. This only happens when, for some reason, I create a neural network whose last layer has more than 3 nodes in it.
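To make those shapes concrete, here's roughly how activations and zetas get built in my forward pass (simplified: layer_sizes and the constant-initialized weights are stand-ins for my actual fields, not the real code):

use arrayfire::{constant, matmul, tanh, Array, Dim4, MatProp};

// Simplified forward pass showing how activations/zetas accumulate.
fn forward_sketch() -> (Vec<Array>, Vec<Array>) {
    // A last layer of 4 (> 3) nodes is what triggers the crash for me.
    let layer_sizes = [2u64, 5, 4];
    let mut activations: Vec<Array> = Vec::new();
    let mut zetas: Vec<Array> = Vec::new();

    let mut a = constant::<f64>(1f64, Dim4::new(&[layer_sizes[0], 1, 1, 1]));
    activations.push(a.clone());
    for i in 1..layer_sizes.len() {
        // Placeholder weights; the real ones are randomly initialized.
        let w = constant::<f64>(0.1f64, Dim4::new(&[layer_sizes[i], layer_sizes[i - 1], 1, 1]));
        let z = matmul(&w, &a, MatProp::NONE, MatProp::NONE);
        a = tanh(&z);
        zetas.push(z);
        activations.push(a.clone());
    }
    (activations, zetas)
}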

Edit 2: I'm using ArrayFire 3.4.2, not 3.4.0:

ArrayFire v3.4.2 (OpenCL, 64-bit Linux, build 2da9967)
[0] NVIDIA  : GeForce GTX 980 Ti, 6076 MB
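(For reference, that banner is just the output of arrayfire::info(); I call it at startup roughly like this, with the explicit backend selection being my own setup:)

use arrayfire::{info, set_backend, Backend};

fn main() {
    // Select the OpenCL backend explicitly, then print the
    // version/device banner quoted above.
    set_backend(Backend::OPENCL);
    info();
}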
