
Conversation

@phoeenniixx (Member) commented Oct 18, 2025

This PR adds the predict function and related functionality to v2.
Fixes #1883

  • Add PredictCallback
  • Add Base_pkg class and predict wrapper
  • Add tests for predict
  • Add functionality to return the predictions in a readable format (csv or pandas)
  • Add tests for checkpointing
  • Add support for reading cfgs from YAML files (similar to pytorch-tabular); see the sketch below
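For the last bullet: the cfg arguments accept either a dict or a path to a YAML file (see the datamodule_cfg docstring quoted further below), so the reading step presumably amounts to something like the following sketch (hypothetical helper name and file contents, not the PR's actual loader):

```python
from pathlib import Path

import yaml  # PyYAML


def load_cfg(cfg):
    """Return a config dict, reading a YAML file if a path is given.

    Hypothetical helper for illustration only; the PR's loader may differ.
    """
    if isinstance(cfg, (str, Path)):
        with open(cfg) as f:
            return yaml.safe_load(f) or {}
    return dict(cfg)


# e.g. a trainer.yml containing
#   max_epochs: 3
#   accelerator: cpu
# could then be passed as trainer_cfg = load_cfg("trainer.yml")
```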

Summary of changes:

  • Added Base_pkg, which acts as a base class for the model_pkg classes; these wrap the fit and predict methods of the model layer and also act as "fixture containers" for test coverage.

    • It internally creates the trainer, model object, datamodules, and dataloaders (when fit or predict is called).
    • It gets TimeSeries classes from data_scenarios, which are used by test fixtures to create datamodules for different data scenarios (in the get_test_dataset_from method).
  • Added PredictCallback, which is used by BaseModel's predict to perform predictions based on different modes (raw, predict, quantile).

  • Updated BaseModel to have a predict function, which performs predictions using PredictCallback.

  • Updated the pkg (or model_pkg) classes to use Base_pkg.

    • Removed the _get_test_datamodule_from method from the classes, as the datamodules are now created by Base_pkg's predict and fit functions.
    • Added get_datamodule_cls, which returns the datamodule class a model is compatible with (tslib or EncoderDecoder).
  • Updated the _integration function to use the pkg class's predict and fit functions and to drop the creation of a trainer object, since that now happens inside the Base_pkg class. (A sketch of the wrapper pattern follows below.)
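To illustrate the wrapper pattern summarised above, here is a small self-contained sketch of a config-driven pkg base class that builds the datamodule, model, and trainer only when fit or predict is called. It is not the PR's actual code; the class name, signatures, and defaults are simplified assumptions.

```python
from lightning.pytorch import Trainer


class BasePkgSketch:
    """Simplified, illustrative stand-in for the Base_pkg idea (not the PR's code)."""

    model_cls = None       # set by concrete pkg classes
    datamodule_cls = None  # e.g. the class returned by get_datamodule_cls

    def __init__(self, model_cfg=None, trainer_cfg=None, datamodule_cfg=None):
        self.model_cfg = model_cfg or {}
        self.trainer_cfg = trainer_cfg or {}
        self.datamodule_cfg = datamodule_cfg or {}

    def fit(self, data, **trainer_kwargs):
        # trainer, model and datamodule are only created here, not in __init__
        self.datamodule = self.datamodule_cls(data, **self.datamodule_cfg)
        self.model = self.model_cls(**self.model_cfg)
        self.trainer = Trainer(**{**self.trainer_cfg, **trainer_kwargs})
        self.trainer.fit(self.model, datamodule=self.datamodule)
        return self

    def predict(self, data=None):
        # reuse the fitted datamodule unless new data is supplied;
        # the PR additionally routes predict kwargs through a PredictCallback,
        # which is omitted in this sketch
        dm = (
            self.datamodule
            if data is None
            else self.datamodule_cls(data, **self.datamodule_cfg)
        )
        return self.trainer.predict(self.model, datamodule=dm)
```

The point of the pattern is that users and test helpers only pass configs; the trainer, model, and datamodule objects are assembled inside the wrapper.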

@fkiraly added the "enhancement (New feature or request)" and "module:models" labels on Oct 18, 2025
@codecov codecov bot commented Oct 18, 2025

Codecov Report

❌ Patch coverage is 79.33579% with 56 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@c78b273). Learn more about missing BASE report.

Files with missing lines                              | Patch % | Missing lines
pytorch_forecasting/base/_base_pkg.py                 | 76.23%  | 24 ⚠️
pytorch_forecasting/callbacks/predict.py              | 64.28%  | 20 ⚠️
pytorch_forecasting/models/base/_base_model_v2.py     | 60.71%  | 11 ⚠️
...ytorch_forecasting/tests/test_all_estimators_v2.py | 96.55%  | 1 ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1984   +/-   ##
=======================================
  Coverage        ?   86.31%           
=======================================
  Files           ?      162           
  Lines           ?     9594           
  Branches        ?        0           
=======================================
  Hits            ?     8281           
  Misses          ?     1313           
  Partials        ?        0           
Flag Coverage Δ
cpu 86.31% <79.33%> (?)
pytest 86.31% <79.33%> (?)


Configs to initialise ``lightning.Trainer``. Defaults to {}.
datamodule_cfg : Union[dict, str, Path], optional
Configs to initialise a ``LightningDataModule``.
- If dict, the keys and values are used as configuration parameters.
Collaborator

minor formatting issue: please have newlines around bullet point lists

@phoeenniixx (Member Author)

Thanks for pointing it out. I will make the changes to the PR soon.

@fkiraly (Collaborator) left a comment

Extremely neat!

I like the design, and I think this is brilliantly done!

Some questions and requests:

  • could you write a summary for the PR?
  • could you add a vignette somewhere in a notebook or an "examples" section? I think this is crucial to have a full review.
  • can you explain how or why the test configs can be replaced by a single test vignette?
  • design questions: why are fit and predict kwargs there and not in __init__? Have you thought about this?

@agobbifbk

Hello there! I like the prediction callback, it is easy to read and fits into the whole architecture! I have a few comments:
1- Not sure if we want to put the scaling part at the model level (like here); probably this logic should happen at the base level, taking into account the quantile loss (rescaling all the quantiles). But I see it is a WIP, so it does not bother me so much.
2- I see the return of the predict callback is a tensor; is there any additional interface for having the prediction in a more readable format? Or is that the next piece of the puzzle we need to clarify?
3- Given 2, we probably need to insert the temporal information in the output, not as a tensor but probably as a numpy array. I think we can have access to the time array using batch_idx and dataloader_idx in the on_predict_batch_end function, does that seem reasonable?

THX!
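For readers who have not used the hook referenced above: a generic Lightning callback that collects per-batch outputs during trainer.predict() could look roughly like this (a sketch only, assuming tensor outputs and the lightning.pytorch namespace; it is not the PredictCallback implemented in this PR):

```python
import torch
from lightning.pytorch.callbacks import Callback


class CollectPredictions(Callback):
    """Generic sketch: gather per-batch outputs during trainer.predict()."""

    def __init__(self):
        self._outputs = []

    def on_predict_start(self, trainer, pl_module):
        self._outputs = []

    def on_predict_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0
    ):
        # `outputs` is whatever predict_step returned for this batch
        # (assumed here to be a tensor); batch_idx / dataloader_idx are
        # available in this hook, which is what makes aligning predictions
        # with the dataloader's time index conceivable.
        self._outputs.append(outputs.detach().cpu())

    @property
    def predictions(self):
        return torch.cat(self._outputs, dim=0) if self._outputs else None
```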

@phoeenniixx (Member Author)

  • can you explain how or why the test configs can be replaced by a single test vignette?

Sorry, I did not understand the question, can you please rephrase it?
Right now, the test configs are taken from get_test_train_params and are tested using the "loop over" method, similar to v1.

  • design questions: why are fit and predict kwargs there and not in __init__? Have you thought about this?

Well, the kwargs in fit are mainly for the trainer, to add any further customization to the trainer's initialization and fit. For predict, right now, they are mainly sent to PredictCallback, which passes them on to to_prediction and to_quantiles - something I took from v1 (see here).

But the usage of kwargs is still not mature; mainly, they are added to provide a little more flexibility in the usage of predict and fit. Right now, I am not completely sure what further params these functions may require, so I was thinking that once more people use this we might converge on a concrete design. That is why I added these kwargs to both functions.
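To make the intended routing concrete, the mode handling described above amounts to roughly the following forwarding pattern (hypothetical class and method layout, not the PR's actual signatures):

```python
class PredictKwargsRoutingSketch:
    """Sketch of how predict kwargs could be routed to to_prediction / to_quantiles."""

    def __init__(self, mode="prediction", **mode_kwargs):
        # `mode_kwargs` are the extra kwargs the user passed to predict()
        self.mode = mode
        self.mode_kwargs = mode_kwargs

    def postprocess(self, model, raw_output):
        # forward the user's kwargs to the model's conversion helpers
        if self.mode == "quantiles":
            return model.to_quantiles(raw_output, **self.mode_kwargs)
        if self.mode == "prediction":
            return model.to_prediction(raw_output, **self.mode_kwargs)
        return raw_output  # mode == "raw": no conversion
```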

@phoeenniixx (Member Author)

Not sure if we want to put the scaling part at the model level (like here); probably this logic should happen at the base level, taking into account the quantile loss (rescaling all the quantiles). But I see it is a WIP, so it does not bother me so much.

I totally agree! I think this was added because the scaling was not present at the D2 level when this model was implemented and we needed scaling here for some reason? (@PranavBhatP please correct me if I am wrong here.)

Anyways, I think #1983 will solve this issue to some extent.

I see the return of the predict callback is a tensor; is there any additional interface for having the prediction in a more readable format? Or is that the next piece of the puzzle we need to clarify?

Well, a lot of the work is done by the BaseModel's to_prediction, which creates the tensors of final predictions (according to my understanding). I was thinking of having a method at the pkg layer to create the final human-readable format, but the design is not yet concrete in my mind. I need some time to figure it out. Maybe we will need some discussion about it in the near future to guide me in the right direction.

Given 2, we probably need to insert the temporal information in the output, not as a tensor but probably as a numpy array. I think we can have access to the time array using batch_idx and dataloader_idx in the on_predict_batch_end function, does that seem reasonable?

I am not quite clear about this right now; can you please elaborate on your idea here? I agree we need temporal info in the output, but as numpy... I am not sure about that (how and why should we have numpy here?). I need some time to bring all the thoughts together!

@phoeenniixx (Member Author)

I will add a summary of the complete PR and an example vignette to the notebooks as well. I just need some time; sorry, I got busy with some other things :)

@fkiraly (Collaborator) commented Oct 28, 2025

  • can you explain how or why the test configs can be replaced by a single test vignette?

Sorry, I did not understand the question, can you please rephrase it? Right now, the test configs are taken from get_test_train_params and are tested using the "loop over" method, similar to v1.

The _integration method seems to change substantially, and _get_test_datamodule_from gets deleted everywhere. Why is that ok? I would like to understand the details, as I worry about loss of test coverage.

  • design questions: why are fit and predict kwargs there and not in __init__? Have you thought about this?

Well, the kwargs in fit are mainly for the trainer, to add any further customization to the trainer's initialization and fit. For predict, right now, they are mainly sent to PredictCallback, which passes them on to to_prediction and to_quantiles - something I took from v1 (see here).

But the usage of kwargs is still not mature; mainly, they are added to provide a little more flexibility in the usage of predict and fit. Right now, I am not completely sure what further params these functions may require, so I was thinking that once more people use this we might converge on a concrete design. That is why I added these kwargs to both functions.

This is consequential though - also think about composite models where fit etc gets called not by the user but internally by the pipeline. Does it also make sense to have them there in such a scenario?

@agobbifbk

Given 2, we probably need to insert the temporal information in the output, not as a tensor but probably as a numpy array. I think we can have access to the time array using batch_idx and dataloader_idx in the on_predict_batch_end function, does that seem reasonable?

I am not quite clear about this right now; can you please elaborate on your idea here? I agree we need temporal info in the output, but as numpy... I am not sure about that (how and why should we have numpy here?). I need some time to bring all the thoughts together!

At some point we need to link the tensor of the prediction to the correct timestamp. I think we have it in D1 as an np.array (since it cannot be cast as a tensor), and then we propagate it to the D2 dataset (I suppose). What I mean is that when we produce the prediction we need to associate the correct timestamp with it. We need to find the correct spot in which we have both pieces of information aligned (prediction and timestamp). The temporal information is in the dataset (or in dataloader.dataset), but I think we need to have a meeting, it is hard to describe here :-)
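To make the alignment question concrete, one possible "readable" output shape is a long-format DataFrame that joins the forecast horizon's timestamps with the predicted values, e.g. (toy data and column names only, not the PR's interface):

```python
import pandas as pd
import torch

# Toy example: 2 series, prediction horizon of 3 steps (made-up data).
predictions = torch.rand(2, 3)  # stand-in for the tensor returned by predict
horizon = pd.date_range("2025-01-01", periods=3, freq="D")  # stand-in time index

readable = pd.concat(
    [
        pd.DataFrame(
            {
                "series": series_id,
                "timestamp": horizon,
                "prediction": predictions[series_id].numpy(),
            }
        )
        for series_id in range(predictions.shape[0])
    ],
    ignore_index=True,
)
# e.g. readable.to_csv("predictions.csv", index=False)
```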

@phoeenniixx (Member Author)

The _integration method seems to change substantially, and _get_test_datamodule_from gets deleted everywhere. Why is that ok? I would like to understand the details, as I worry about loss of test coverage.

The main change to _integration is because of the wrapper. Now we don't need to define the trainer beforehand, as this is handled by the pkg class; we just need the trainer_cfg and the rest happens in the pkg class. For the same reason we don't need the dataloaders, just the cfgs, and the rest happens inside the wrapper.

About _get_test_datamodule_from, we don't need it anymore, as it was creating the compatible datamodule for the model (an EncoderDecoder or tslib datamodule, based on the type of model). Again, due to the wrapper, we just need the compatible datamodule class (which we get using get_datamodule_cls) and the rest happens under the hood.
I tried to keep the flow the same as much as I could; the main thing I removed was early stopping, which used val_loss. Right now, the train-test split is done based on groups and there is no temporal split, so when there are very few groups (like in the current test scenario) we do not get val_dataloaders, and this leads to issues.
On the main branch, we are passing the same dataset as both the train and val datasets; I did not feel good about that, so I have removed this "hack". If you think it is important, I can add it back.

This is consequential though - also think about composite models where fit etc gets called not by the user but internally by the pipeline. Does it also make sense to have them there in such a scenario?

True, I did not think about this issue :)
When using a pipeline, this code could work, but we would have to be very careful. I agree that in that scenario we could face a problem. I will look into it, thanks!

@phoeenniixx (Member Author)

At some point we need to link the tensor of the prediction to the correct timestamp. I think we have it in D1 as an np.array (since it cannot be cast as a tensor), and then we propagate it to the D2 dataset (I suppose). What I mean is that when we produce the prediction we need to associate the correct timestamp with it. We need to find the correct spot in which we have both pieces of information aligned (prediction and timestamp). The temporal information is in the dataset (or in dataloader.dataset), but I think we need to have a meeting, it is hard to describe here :-)

Yes, I also think we should have a meeting someday and discuss some of these things!

@fkiraly (Collaborator) commented Oct 30, 2025

The main change to _integration is because of the wrapper. Now we don't need to define the trainer beforehand, as this is handled by the pkg class; we just need the trainer_cfg and the rest happens in the pkg class. For the same reason we don't need the dataloaders, just the cfgs, and the rest happens inside the wrapper.

Ok, that is good news - if you vouch for this being 1:1 (except for val_loader), this is a really great sign - it means the design goal is successfully realized, in reducing the complexity of usage substantially!

When using a pipeline, this code could work, but we would have to be very careful. I agree that in that scenario we could face a problem. I will look into it, thanks!

I think that is the main reason for not having fit kwargs in sktime - it does not compose properly.



Development

Successfully merging this pull request may close these issues.

[API] v2 interface design - output (predict, backtesting etc) interface
