Wishlist / redesign proposal for predict_spatial() #99
CBonannella started this conversation in Ideas
Context
I’m using mlr3spatial::predict_spatial() for large-area species-distribution mapping (binary + probabilistic output). While porting a stacked model pipeline I ran into several pain points that make the current helper too narrow for real-world classification workflows.
Below is a short review of the current behaviour, why it breaks, and extensions that would solve the issues without changing defaults.
Current limitations (classification)
- Only pred$response (hard labels) is written, even with predict_type = "prob". I saw there's a pull request up for this but it's quite old (~2 years ago).
- The output always goes through terra::categories().
- The output datatype is fixed (FLT8S), great for regression but not always ideal for classification + no GDAL opts.
The PR I mentioned above also doesn't address GDAL options but focuses only on implementing probabilities, which would already be a good step forward, but you know, one can dream, no?
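To make the datatype point concrete, here is a small back-of-the-envelope sketch of uncompressed layer size per datatype. The byte widths are the standard ones for these datatype codes; the raster dimensions are a made-up example, not taken from the actual project.

```r
# Bytes per cell for a few raster datatype codes.
bytes_per_cell <- c(INT1U = 1, INT2S = 2, FLT4S = 4, FLT8S = 8)

# Hypothetical raster dimensions, chosen only for illustration.
n_cells <- 50000 * 50000

# Uncompressed size in GiB of a single layer for each datatype.
size_gib <- n_cells * bytes_per_cell / 1024^3
print(round(size_gib, 1))
```

A hard-label classification with a handful of classes fits in INT1U, so writing it as FLT8S costs eight times the space before compression.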
What a more general helper should cover
It's great that terra::writeRaster() uses data chunking, which helps a lot in real-world / production pipelines, but here is what I think could be improved:
- Respect predict_type: keep the current behaviour for response. For prob, write the class probabilities (e.g. pred$prob[,"1"]); if the prob object has ncol > 2, allow the user to select how many raster layers need to be written.
Pseudocode / sketch
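As a rough illustration of the column-selection part, here is a minimal base-R sketch. The helper name select_prob_layers() is hypothetical and not part of mlr3spatial; it assumes prob is a matrix shaped like pred$prob, with one column per class named by class label.

```r
# Hypothetical helper (not part of mlr3spatial): pick the probability
# columns that should become raster layers.
# `prob` is assumed to look like pred$prob: one named column per class.
select_prob_layers <- function(prob, prob_class = NULL) {
  classes <- colnames(prob)
  if (is.null(prob_class)) {
    # Assumed default from the proposal: the second factor level.
    prob_class <- classes[2]
  }
  unknown <- setdiff(prob_class, classes)
  if (length(unknown) > 0) {
    stop("Unknown class(es): ", paste(unknown, collapse = ", "))
  }
  out <- prob[, prob_class, drop = FALSE]
  # Name the layers p_<class>, as suggested in the proposal.
  colnames(out) <- paste0("p_", prob_class)
  out
}

# Toy binary example: two cells, classes "0" and "1".
prob <- matrix(c(0.8, 0.3, 0.2, 0.7), ncol = 2,
               dimnames = list(NULL, c("0", "1")))
select_prob_layers(prob)                # one layer for class "1"
select_prob_layers(prob, c("0", "1"))   # one layer per requested class
```

For a multiclass learner the same call with a longer prob_class vector would give one layer per requested class.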
So basically:
- If learner$predict_type == "prob":
  - add a prob_class argument (defaults to the second factor level, i.e. the "positive" class)
  - write only the probability column(s) selected by prob_class
  - skip terra::categories() entirely; name output layer(s) as p_<class>, where <class> is the corresponding class label (coming from prob_class).
- Otherwise (predict_type == "response") keep the current categorical workflow.
- Add datatype and gdal_opts arguments, pass them through to terra::writeStart().
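A minimal sketch of how the probability branch could be wired into a chunked writer is shown here. It is not the mlr3spatial implementation: write_prob_raster() and predict_fun are hypothetical names, predict_fun stands in for the learner and is assumed to return a probability matrix with one named column per class, and the default values for datatype and gdal_opts are illustrative only. NA handling is left out.

```r
library(terra)

# Hypothetical chunked writer for probability layers (a sketch only).
write_prob_raster <- function(x, predict_fun, filename, prob_class,
                              datatype = "FLT4S",
                              gdal_opts = "COMPRESS=DEFLATE") {
  # Empty target with the geometry of x and one layer per requested class.
  out <- terra::rast(x, nlyrs = length(prob_class))
  names(out) <- paste0("p_", prob_class)

  terra::readStart(x)
  on.exit(terra::readStop(x), add = TRUE)

  # datatype and gdal are forwarded via `...` as writeRaster options.
  b <- terra::writeStart(out, filename = filename, overwrite = TRUE,
                         datatype = datatype, gdal = gdal_opts)
  for (i in seq_len(b$n)) {
    newdata <- terra::readValues(x, row = b$row[i], nrows = b$nrows[i],
                                 dataframe = TRUE)
    prob <- predict_fun(newdata)
    # Column-major flattening: all cells of layer 1, then layer 2, ...
    v <- as.vector(prob[, prob_class, drop = FALSE])
    terra::writeValues(out, v, b$row[i], b$nrows[i])
  }
  terra::writeStop(out)
  terra::rast(filename)
}

# Toy demo: one predictor layer and a stand-in "model" (logistic curve).
x <- terra::rast(nrows = 20, ncols = 20,
                 vals = seq(-3, 3, length.out = 400))
names(x) <- "elev"
predict_fun <- function(d) {
  p1 <- 1 / (1 + exp(-d$elev))
  cbind("0" = 1 - p1, "1" = p1)
}
out <- write_prob_raster(x, predict_fun, tempfile(fileext = ".tif"),
                         prob_class = "1")
```

With a real learner, predict_fun would wrap something like learner$predict_newdata(newdata)$prob; the response branch would keep the existing categorical path untouched.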
I'm currently prototyping / doing this ad hoc myself, but I'd be happy to contribute / help with these implementations, do some testing, and eventually update docs/examples when needed.
Let me know if this direction is acceptable, or if there are design constraints I missed.
Thanks for the great package!