Add FX feature extraction as an alternative to intermediate_layer_getter #4302
Conversation
@fmassa just wanted to make sure this topic stays active so decided to get a draft going anyway. I have some checklist items above explaining what I think needs to be done to close this out. Can you please help me add what's needed there? I believe you mentioned you'd chat to the fx team about the train/eval problem so I'll be on standby to pick it back up based on the outcome of that. Also, do you think there's an appropriate place to add more docs than what's in the docstrings?
Thanks a ton for the PR, let me know when you want me to have a closer look at it. For the points you brought up, I have a few questions:
Maybe I'm missing something here, but is it even possible to have two modules with the same name? If this is for handling things like …
Should this be a new …
I've discussed this with @jamesr66a. His view was that this is a special case of the more general re-specialization problem (like specializing over shapes / ranks of Tensors), and as of now FX is not going to handle it explicitly, but nothing blocks us from having our own tooling to do this, and I think we should. A naive way would be to trace the model twice (once in train mode and once in eval mode). A naive solution is something along the lines of:

```python
class WrappedModel(nn.Module):
    def __init__(self, model, ...):
        super().__init__()
        mode = model.training
        self.train_model = trace(model.train(), ...)
        self.eval_model = trace(model.eval(), ...)
        model.train(mode)

    def forward(self, *args, **kwargs):
        if self.training:
            return self.train_model(*args, **kwargs)
        else:
            return self.eval_model(*args, **kwargs)
```

but ideally we wouldn't duplicate parameters nor change the namespace of the parameters.
Yes, I think we should either make a new doc page under https://github.com/pytorch/vision/tree/main/docs/source or use the new galleries to better illustrate how this functionality can be used.
The points I've made above might still require iterating a bit on some of the aspects implemented here (but please let me know if I'm missing something!). Also, adding tests will be important to validate that the code behaves as expected (including dummy …). From my perspective, we can break this PR into multiple ones if it facilitates things for you:
Thoughts?
@fmassa thanks for your input! I've made some good progress with the train/eval problem but I'm out of time for this week and will move onto tests/docs next week. Might be worth waiting till then before you have a closer look. But to answer the points that you already have:
I'm talking about something like: user defines …
FX creates the new nodes and the node to module mapping, and I'm thinking we leave that behaviour as it is. All we're doing is making a node to string mapping. And the only reason we do that is because we want to use our nicer convention for naming nodes so it's easy for our user to specify which nodes they want.
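For context, this is roughly what the nicer naming convention looks like through the helper that exposes it (a sketch assuming the `get_graph_node_names` helper as it eventually landed; the printed names are illustrative):

```python
from torchvision.models import resnet18
from torchvision.models.feature_extraction import get_graph_node_names

# Qualified node names follow the module hierarchy, separated by dots, so the
# user can read them straight off the model definition.
train_nodes, eval_nodes = get_graph_node_names(resnet18())
print(train_nodes[:6])
# e.g. ['x', 'conv1', 'bn1', 'relu', 'maxpool', 'layer1.0.conv1']
```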
I agree. I just deleted it and incorporated the desired behaviour into …
Good suggestion. That's what I did + some gymnastics to not duplicate params or change the namespace.
100% of timm models are covered when following the guidelines here. When I say "covered" I mean that they can be converted, their forward and backward methods work, they are scriptable and the scripted forward and backward work (wherever the original model is scriptable), and their output features match those of the conventional methods. Would it make sense to put a version of those guidelines in the docs and gradually phase them out as/if core FX updates solve those problems?
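As a rough sketch of what "covered" means in practice (using the API names as they eventually landed; this is not the actual test code):

```python
import torch
from torchvision.models import resnet18
from torchvision.models.feature_extraction import create_feature_extractor

model = resnet18().eval()
extractor = create_feature_extractor(model, return_nodes=['layer4'])

x = torch.randn(1, 3, 224, 224)
out = extractor(x)['layer4']             # forward works
out.sum().backward()                     # backward works
scripted = torch.jit.script(extractor)   # scriptable wherever the original model is
assert torch.allclose(out, scripted(x)['layer4'])  # outputs match
```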
Thanks for the clarifications, makes sense! Let's get this first version of the PR merged into torchvision this week. Just tests + docs should be good for now.
Good question, I think it might be good to have some high-level guidelines in the documentation as well. I'm not sure FX will be able to capture … I'll do a first pass on the PR today with some more high-level comments.
This is looking great, thanks a lot @alexander-soare !
I think we are almost ready to get this merged.
Also, I was wondering if we should add it in a different location in torchvision that is not private, maybe torchvision.models.utils or torchvision.models.feature_extraction or something like that?
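For reference, with the `torchvision.models.feature_extraction` location, basic usage would look something like this (a sketch assuming the names this PR ended up exposing; shapes shown are illustrative):

```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

# return_nodes maps qualified node names to user-chosen output names.
extractor = create_feature_extractor(
    resnet50(), return_nodes={'layer2': 'mid', 'layer4': 'deep'})
features = extractor(torch.rand(1, 3, 224, 224))
print({k: v.shape for k, v in features.items()})
# e.g. {'mid': torch.Size([1, 512, 28, 28]), 'deep': torch.Size([1, 2048, 7, 7])}
```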
@fmassa (sorry, edited this a few times as I was working)
That makes sense. I used …
Makes sense - see my list of tips in the docstring for …
Thanks for the changes @alexander-soare !
I've made … I'm bikeshedding here, but I'm wondering if we can find another name for …
This is awesome, thanks a ton @alexander-soare !
I've made a few more comments, and I think you can make the PR ready for a more detailed review already.
I've looked into the test and some of the APIs we are providing, but I would like to pull the PR locally and play a bit with it once you mark the PR ready for review.
@fmassa thanks for the continued support on this! I've sorted it all out in the latest commit. Two more things. Thing 1: About the naming. Cool, I've changed them. I went for … I also dropped the use of the term "qualified node name" in the parts that are visible to the user. I do have one bikesheddy concern about this though. Calling it just a "node name" might not be specific enough, and may clash with other notions: an actual fx Node already has some sort of notion of a name attached to it. In fact, the fx docs even mention qualified node names. But that's slightly different from the node naming convention I've put together. BUT, happy to go ahead with this PR as is, and leave this as a note.
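To make the potential clash concrete: raw fx Node names flatten the module hierarchy with underscores, which is a different thing from the dotted convention described above (a sketch; the printed names are illustrative):

```python
import torch.fx
from torchvision.models import resnet18

traced = torch.fx.symbolic_trace(resnet18())
print([node.name for node in traced.graph.nodes][:6])
# e.g. ['x', 'conv1', 'bn1', 'relu', 'maxpool', 'layer1_0_conv1']
# vs. the dotted 'layer1.0.conv1' style used here for specifying return nodes.
```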
Thing 2: Maybe it's ready for you to go ahead and do that - and in fact that would be welcome at this point. But before marking as ready, I wanted to have a go at training a detection model as the final test. Hoping to do that on the weekend.
docs/source/index.rst (Outdated)

```
@@ -32,6 +32,7 @@ architectures, and common image transformations for computer vision.
    :caption: Package Reference

    datasets
+   feature_extraction
```
It looks like feature_extraction is a submodule or a folder in torchvision, but I think the files are placed under models/feature_extraction
Wasn't keen on putting it in the models docs as that seems to have a consistent theme of talking about models, how to load them, and their metrics. And I'm a noob with auto-docs so didn't know if I could nest it in the table of contents but still have its own page. Any suggestions?
Re. your comment below:
Probably there is a small confusion in placing the files?
Yes probably :) Happy to hear further suggestions if you have any.
I'm pretty much a noob too 😄 Probably the maintainers here can suggest a better solution for the docs.
Hi, I just had a very small look at this. I was very eager for this feature.
Trained ImageNet classification with pretrained resnet18 + concatenated FPN features with both … @fmassa this covers off my "Thing 2" above, so from my end this is ready for review.
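For reference, a rough sketch of the kind of resnet18 + FPN setup described here, assuming the final `create_feature_extractor` API and `torchvision.ops.FeaturePyramidNetwork` (the channel counts are the standard resnet18 stage widths):

```python
import torch
from torchvision.models import resnet18
from torchvision.models.feature_extraction import create_feature_extractor
from torchvision.ops import FeaturePyramidNetwork

# Pull out the four residual-stage outputs of a pretrained resnet18.
body = create_feature_extractor(
    resnet18(pretrained=True),
    return_nodes={'layer1': '0', 'layer2': '1', 'layer3': '2', 'layer4': '3'},
)
# Stage channel counts for resnet18; the FPN maps each level to 256 channels.
fpn = FeaturePyramidNetwork([64, 128, 256, 512], 256)

features = body(torch.randn(1, 3, 224, 224))  # dict of stage outputs
pyramid = fpn(features)                       # same keys, each with 256 channels
```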
I've tried the PR locally and it worked great, thanks!
I've just noted one potential (unwanted?) side-effect of the prefix matching, let me know what you think.
Also, the test failures can be fixed with some of the comments I made.
I think after this batch of comments is addressed we are good to merge, thanks a ton @alexander-soare !
@alexander-soare also, following some discussion we had, maybe using …
@fmassa that's all sorted. Thanks!
torchvision/models/__init__.py (Outdated)

```
@@ -8,8 +8,8 @@
 from .mobilenet import *
 from .mnasnet import *
 from .shufflenetv2 import *
-from .efficientnet import *
```
Can you add this line back?
EDIT: don't bother, I'll do it here
Yep on it :) Sorry!
Nope, there's more to do. For the tests, I need to add a leaf module from effnet that can't be traced through. Will send it soon. @fmassa
@fmassa see my comment above about your last push. Will pull your changes and tack on the last bit
@alexander-soare should we change the implementation of EfficientNet so that it can be properly traced? Anyway, we can skip testing efficientnets for now and tackle that in a follow-up PR? cc @datumbox
@alexander-soare don't bother about fixing EfficientNet, we will do it ourselves. For now, can you just disable efficientnets in the testing of FX, so that we can move forward? Something like filtering …
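Presumably something along these lines (purely illustrative; the actual test helpers differ):

```python
import torchvision.models as models

# Purely illustrative: collect classification model builder names, but leave
# the efficientnets out of the FX feature-extraction tests for now.
classification_models = [
    name for name, obj in models.__dict__.items()
    if callable(obj) and name.islower() and not name.startswith('_')
]
fx_test_models = [name for name in classification_models if 'efficientnet' not in name]
```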
@alexander-soare FYI we are changing … Anyway, we can get your diff merged once tests finish (lint is failing btw), and then revert the latest changes so that we test …
@fmassa yeah this does make sense. Was just trying to make sure there wasn't another hidden reason for the issue. So as it stands, stochastic depth is treated as a leaf module (in the testing), and I can revert it once you're ready.
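For reference, the leaf-module treatment amounts to something like this (a sketch assuming the `tracer_kwargs` plumbing from this PR and `torchvision.ops.StochasticDepth`; the return node is chosen purely for illustration):

```python
from torchvision.models import efficientnet_b0
from torchvision.models.feature_extraction import create_feature_extractor
from torchvision.ops import StochasticDepth

# Ask the tracer to keep StochasticDepth as an opaque call_module node
# instead of tracing through its internals.
extractor = create_feature_extractor(
    efficientnet_b0(),
    return_nodes=['features.6'],
    tracer_kwargs={'leaf_modules': [StochasticDepth]},
)
```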
@alexander-soare there are still some minor test failures following your last changes, do you mind fixing them so that we can get this merged?
@fmassa sorted - I introduced it last time... This is the one though, I can feel it (also, I let the tests finish before I pushed this time 😛)
Thanks a ton for all your work @alexander-soare !
@fmassa was a pleasure. And thank you again for all the support!
…layer_getter (#4302)

Summary:

* add fx feature extraction util
* Make it possible to use train and eval mode
* FX feature extraction - Tweaks and small bug fixes
* FX feature extraction - add tests
* move to feature_extraction.py, add LeafModuleAwareTracer, add docs
* Tweaks to docs
* addressing latest round of feedback
* undo line spacing changes
* change type hints in docstrings
* fix sphinx indentation
* expose feature_extraction
* add maskrcnn example
* add api refernce subheading
* address latest review notes, refactor names, fix regex, cosmetics
* Add back efficientnet to models
* fix tests for effnet
* fix linting issue
* fix test tracer kwargs

Reviewed By: fmassa

Differential Revision: D30793334

fbshipit-source-id: 34b3497ce75c5b4773f4f3bab328330c114b4193

Co-authored-by: Francisco Massa <[email protected]>
Motivation is similar to #3597. I just happened to pick up on this while working on timm: huggingface/pytorch-image-models#800
This builds on #3597 with a few adjustments:
A caveat, and possibly a dealbreaker that we should figure out how to handle: model.training
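To spell the caveat out: symbolic tracing specializes on whichever mode is active at trace time, so mode-dependent control flow like the toy example below gets baked in rather than preserved:

```python
import torch
from torch import fx, nn


class NoisyHead(nn.Module):
    # Toy module purely for illustration.
    def forward(self, x):
        # This branch is resolved at trace time, not at run time.
        if self.training:
            return x + 0.1 * torch.randn_like(x)
        return x


traced = fx.symbolic_trace(NoisyHead())  # traced while in train mode (the default)
traced.eval()
# The resulting graph still contains the train-time noise branch.
```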
Other things I will do once we are ready to commit:
cc @datumbox