Introduction: How would you simulate gradients
If you want to simulate the gradient of a random function $f$ on $\mathbb{R}^d$ with covariance kernel $k$, you can use the fact that (interchanging differentiation and expectation) the covariances of function values and derivatives are given by derivatives of $k$:

$$
\operatorname{Cov}\bigl(f(x), \partial_{y_j} f(y)\bigr) = \partial_{y_j} k(x, y).
$$

And similarly

$$
\operatorname{Cov}\bigl(\partial_{x_i} f(x), \partial_{y_j} f(y)\bigr) = \partial_{x_i} \partial_{y_j} k(x, y).
$$

For stationary covariance functions $k(x, y) = C(x - y)$ this reduces to derivatives of $C$ alone:

$$
\operatorname{Cov}\bigl(f(x), \partial_{y_j} f(y)\bigr) = -\partial_j C(x - y),
\qquad
\operatorname{Cov}\bigl(\partial_{x_i} f(x), \partial_{y_j} f(y)\bigr) = -\partial_i \partial_j C(x - y).
$$

So if we consider the multivariate random function $x \mapsto \bigl(f(x), \nabla f(x)\bigr)$, its covariance is given by the matrix-valued kernel

$$
\tilde{k}(x, y) =
\begin{pmatrix}
k(x, y) & \nabla_y k(x, y)^\top \\
\nabla_x k(x, y) & \nabla_x \nabla_y^\top k(x, y)
\end{pmatrix},
$$

where we use $\nabla_x \nabla_y^\top k$ for the matrix of mixed second derivatives $\bigl(\partial_{x_i} \partial_{y_j} k(x, y)\bigr)_{i,j}$. Simulating gradients then amounts to simulating from this kernel.
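Concretely, a minimal sketch of this (assuming ForwardDiff.jl for the kernel derivatives; `joint_block` is a made-up helper, not existing API) could look like:

```julia
using KernelFunctions, ForwardDiff, LinearAlgebra

k = with_lengthscale(SEKernel(), 0.5)   # any kernel that ForwardDiff can differentiate
xs = [randn(2) for _ in 1:5]            # simulation points in R^2
d = length(first(xs))

# (d+1)×(d+1) covariance block between (f(x), ∇f(x)) and (f(y), ∇f(y))
function joint_block(k, x, y)
    kxy  = k(x, y)
    ∇yk  = ForwardDiff.gradient(y_ -> k(x, y_), y)   # Cov(f(x), ∇f(y))
    ∇xk  = ForwardDiff.gradient(x_ -> k(x_, y), x)   # Cov(∇f(x), f(y))
    ∇x∇y = ForwardDiff.jacobian(y_ -> ForwardDiff.gradient(x_ -> k(x_, y_), x), y)
    return [kxy ∇yk'; ∇xk ∇x∇y]
end

# assemble the full covariance matrix and draw one joint sample of (f, ∇f)
K = reduce(vcat, [reduce(hcat, [joint_block(k, x, y) for y in xs]) for x in xs])
sample = cholesky(Symmetric(K + 1e-10 * I)).L * randn(length(xs) * (d + 1))
```

Each block costs one kernel evaluation, two gradients, and one Jacobian-of-gradient, which brings me to the question of whether AD can be made to exploit the structure of the kernel here.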
Performance Considerations
In principle you could just directly apply autodiff (AD) to any kernel and obtain these derivative kernels.
Unfortunately, the way that KernelFunctions.jl implements length scales results in general kernel functions, so I am not completely sure how to tell the compiler that "these derivatives are much simpler than they look".
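For concreteness, this is what the derivatives look like when written out by hand for the squared-exponential kernel $k(x, y) = \exp\bigl(-\lVert x - y \rVert^2 / (2\ell^2)\bigr)$ (just a sketch with made-up names, not existing KernelFunctions.jl API):

```julia
using LinearAlgebra

se(x, y; ℓ = 1.0) = exp(-sum(abs2, x - y) / (2ℓ^2))

# Cov(f(x), ∇f(y)) = ∇_y k(x, y) = (x - y) / ℓ² ⋅ k(x, y)
grad_cross(x, y; ℓ = 1.0) = (x - y) ./ ℓ^2 .* se(x, y; ℓ = ℓ)

# Cov(∇f(x), ∇f(y)) = ∇_x ∇_yᵀ k(x, y) = (I / ℓ² - (x - y)(x - y)ᵀ / ℓ⁴) ⋅ k(x, y)
function grad_grad(x, y; ℓ = 1.0)
    δ = x - y
    return (I / ℓ^2 - δ * δ' / ℓ^4) .* se(x, y; ℓ = ℓ)
end
```

Compared to pushing dual numbers through a generic transformed kernel, these expressions reuse the already-computed kernel value and the difference vector, which is exactly the structure I would like AD to see.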
One possibility might be to add the abstract types IsotropicKernel and StationaryKernel and to carry these types over when transformations do not violate them: scaling would violate neither, more general affine transformations would violate isotropy but not stationarity, etc. This could probably be done with type parameters.
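Roughly what I have in mind, as a sketch only (the type names are placeholders, `MyScaledKernel` stands in for the existing wrapper types, and this version uses trait functions instead of the type parameters mentioned above):

```julia
abstract type StationaryKernel end
abstract type IsotropicKernel <: StationaryKernel end  # isotropy implies stationarity

struct MySEKernel <: IsotropicKernel end               # placeholder isotropic kernel

struct MyScaledKernel{K}                               # stands in for a transform wrapper
    kernel::K
    scale::Float64
end

# trait-style queries that wrappers forward when the property is preserved
isisotropic(::IsotropicKernel) = true
isisotropic(k::MyScaledKernel) = isisotropic(k.kernel)  # scalar scaling keeps isotropy
isisotropic(::Any) = false

isstationary(::StationaryKernel) = true
isstationary(k::MyScaledKernel) = isstationary(k.kernel)
isstationary(::Any) = false
```

Whether trait functions or type parameters on the wrappers are the better mechanism, I am not sure.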
But even once that is implemented, how do you tell autodiff what to differentiate? I have seen the file chainrules.jl in this repository, so I thought I would ask if someone already knows how to implement this.
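One option might be something along these lines: a ChainRulesCore rule for a (hypothetical, stand-alone) isotropic kernel type that hands AD the simple closed-form derivatives directly. This is only a sketch of the idea, not a proposal for the actual chainrules.jl:

```julia
using ChainRulesCore

struct MyIsoSE                 # hypothetical isotropic kernel, not a KernelFunctions type
    ℓ::Float64
end
(k::MyIsoSE)(x, y) = exp(-sum(abs2, x .- y) / (2 * k.ℓ^2))

function ChainRulesCore.rrule(k::MyIsoSE, x, y)
    δ = x .- y
    val = exp(-sum(abs2, δ) / (2 * k.ℓ^2))
    function se_pullback(Δ)
        ∂x = Δ .* (-(δ ./ k.ℓ^2) .* val)             # ∂k/∂x = -(x - y)/ℓ² ⋅ k(x, y)
        ∂ℓ = Δ * val * sum(abs2, δ) / k.ℓ^3           # ∂k/∂ℓ = ‖x - y‖²/ℓ³ ⋅ k(x, y)
        return (Tangent{MyIsoSE}(ℓ = ∂ℓ), ∂x, -∂x)    # ∂k/∂y = -∂k/∂x
    end
    return val, se_pullback
end
```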
Kernels for multiple outputs considerations
Since you implemented kernels for multiple outputs as an extension of the input space, the reuse of the derivative computations across output indices is not obvious: evaluating the $(d+1)\times(d+1)$ block for a pair $(x, y)$ one output pair at a time redoes work that a single evaluation of the whole block could share.
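To illustrate, a gradient-plus-value kernel written against the `(input, output_index)` convention might look roughly like this (hypothetical type, not existing API), and every matrix entry then triggers its own AD call:

```julia
using ForwardDiff

struct GradientValueKernel{K}   # hypothetical, not a KernelFunctions.jl type
    kernel::K
end

# output index 1 is the function value, indices 2:d+1 are the partial derivatives
function (gk::GradientValueKernel)((x, px)::Tuple, (y, py)::Tuple)
    k = gk.kernel
    if px == 1 && py == 1
        return k(x, y)
    elseif px == 1
        return ForwardDiff.gradient(y_ -> k(x, y_), y)[py - 1]
    elseif py == 1
        return ForwardDiff.gradient(x_ -> k(x_, y), x)[px - 1]
    else
        # the whole mixed-derivative matrix is recomputed for every (px, py) pair
        H = ForwardDiff.jacobian(y_ -> ForwardDiff.gradient(x_ -> k(x_, y_), x), y)
        return H[px - 1, py - 1]
    end
end
```

Building the whole $(d+1)\times(d+1)$ block at once, as in the first sketch above, avoids that duplication but does not fit the entry-wise interface.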
Maybe this is all premature optimization, as the evaluation of the kernel is, complexity-wise, in the shadow of the Cholesky decomposition.
Extension: Simulate $n$-th order derivatives
In principle you could similarly simulate higher-order derivatives, since for multi-indices $\alpha$ and $\beta$

$$
\operatorname{Cov}\bigl(\partial^\alpha f(x), \partial^\beta f(y)\bigr) = \partial_x^\alpha \partial_y^\beta k(x, y),
$$

so everything above carries over with higher mixed derivatives of the kernel.
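In the one-dimensional case a minimal sketch of this (made-up helper names, again leaning on ForwardDiff.jl) could be:

```julia
using KernelFunctions, ForwardDiff

# n-th derivative of a scalar function by nesting ForwardDiff.derivative
deriv(f, n) = n == 0 ? f : deriv(x -> ForwardDiff.derivative(f, x), n - 1)

# covariance of the n-th derivative at x and the m-th derivative at y: ∂ₓⁿ ∂ᵧᵐ k(x, y)
cov_deriv(k, x, y, n, m) = deriv(x_ -> deriv(y_ -> k(x_, y_), m)(y), n)(x)

k = with_lengthscale(SEKernel(), 1.0)
cov_deriv(k, 0.3, 0.7, 2, 2)   # covariance of f''(0.3) and f''(0.7)
```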
What do you think? I handcrafted something for first-order derivatives in a personal project, but for KernelFunctions.jl a more general approach is probably needed.