Rework how PreTrainedModel.from_pretrained handles its arguments #866
Conversation
Hmm, well that's embarrassing. I'll inspect the failing tests some more to see what's up.
Thanks, this is a nice solution, I like it.
I think we still need to add a mention of this new behavior in the migration guide, since people might use the new Configuration classes (which have `num_classes` attributes, so they will eat the `num_classes` supplied in `**kwargs`) at the same time as models derived from the old `ForSequence` pattern.
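Something like this toy example of the collision (`MyConfig` and `MyHead` are hypothetical names, just to show the mechanism):

```python
# Toy illustration of the collision described above; not actual library classes.
class MyConfig(object):
    num_classes = 2                     # the new Configuration class owns num_classes

class MyHead(object):                   # old "ForSequence"-style model
    def __init__(self, config, num_classes=2):
        self.num_classes = num_classes  # expects num_classes via the constructor

# With the new behavior, num_classes=10 passed to from_pretrained matches a config
# attribute, so it updates the config and never reaches MyHead.__init__ -- the
# head silently keeps its default of 2.
```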
Regarding Python 2: yes, we want to keep supporting it, and thanks for taking care of it. Google (which is still using Python 2) is a major supplier of pretrained models and architectures, and having Python 2 support in the library makes the job of re-implementing the models a lot easier (I can load TF and PT models side-by-side) :)
I have updated the README breaking change section on this (ba52fe6).
Thanks for the feedback: in my latest commits I've updated the documentation as requested and renamed the I also removed the unused
Looks good to me, thanks a lot @xanlsh!
Unification of the `from_pretrained` functions belonging to various modules (`GPT2PreTrainedModel`, `OpenAIGPTPreTrainedModel`, `BertPreTrainedModel`) brought changes to the function's argument handling which don't cause any issues within the repository itself (afaik), but have the potential to break a variety of downstream code (e.g. my own).

In the last release (`pytorch-pretrained-bert` v0.6.2), the `from_pretrained` functions took in `*args` and `**kwargs` and passed them directly to the relevant model's constructor (perhaps with some processing along the way). For a typical example, see `from_pretrained`'s signature in `modeling.py` here
https://github.com/huggingface/pytorch-transformers/blob/b832d5bb8a6dfc5965015b828e577677eace601e/pytorch_pretrained_bert/modeling.py#L526
and the relevant usage of said arguments (after some small modifications)
https://github.com/huggingface/pytorch-transformers/blob/b832d5bb8a6dfc5965015b828e577677eace601e/pytorch_pretrained_bert/modeling.py#L600
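For instance, a call like the one below relied on that forwarding (a sketch against the 0.6.2-era API, where `num_labels` is a constructor argument of `BertForSequenceClassification`):

```python
from pytorch_pretrained_bert import BertForSequenceClassification

# Anything extra passed to from_pretrained was handed straight to the model
# constructor, so task-specific arguments like num_labels simply worked.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=5)
```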
In the latest release, the function's signature remains unchanged, but the `*args` and most of the `**kwargs` parameters, in particular pretty much anything not explicitly accessed in [1]
https://github.com/huggingface/pytorch-transformers/blob/b33a385091de604afb566155ec03329b84c96926/pytorch_transformers/modeling_utils.py#L354-L358
are ignored. If a key of `kwargs` is shared with the relevant model's configuration file then its value is still used to override said key (see the relevant logic here), but the current architecture breaks, for example, the following pattern which was previously possible:
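(A reconstruction of the kind of pattern meant here; `SomeModel` and `useful_argument` are illustrative names taken from the description below.)

```python
from pytorch_transformers import BertModel

class SomeModel(BertModel):
    # A downstream model that adds its own constructor argument on top of config.
    def __init__(self, config, useful_argument=False):
        super(SomeModel, self).__init__(config)
        self.useful_argument = useful_argument

# The old from_pretrained forwarded useful_argument=True to SomeModel.__init__;
# with the current argument handling it is silently dropped and the default
# value is used instead.
model = SomeModel.from_pretrained('bert-base-uncased', useful_argument=True)
```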
What's more, if these arguments have default values declared in `__init__`, then the entire pattern is broken silently, because those default values will never be overwritten via pretrained instantiation. Thus end users might continue running experiments passing different values of `useful_argument` to `from_pretrained`, unaware that nothing is actually being changed.

As evidenced by issue #833, I'm not the only one whose code was broken. This commit implements behavior which is a compromise between the old and new behaviors. From my docstring:
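The gist, as a sketch rather than the verbatim docstring: keys of `kwargs` that correspond to configuration attributes override the configuration, and the remaining keys are forwarded to the model's `__init__` instead of being ignored. Roughly:

```python
def split_kwargs(config, **kwargs):
    """Illustrative helper, not actual library code: route kwargs that match
    configuration attributes to the config and keep the rest for the model."""
    config_kwargs = {k: v for k, v in kwargs.items() if hasattr(config, k)}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_kwargs}
    for key, value in config_kwargs.items():
        setattr(config, key, value)     # e.g. num_labels overrides the config value
    return config, model_kwargs         # model_kwargs later reach the model __init__

class DummyConfig(object):              # stand-in for a PretrainedConfig
    num_labels = 2

config, model_kwargs = split_kwargs(DummyConfig(), num_labels=5, useful_argument=True)
# config.num_labels == 5 and model_kwargs == {'useful_argument': True}
```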
It would actually be ideal to avoid mixing configuration and model parameters entirely (via some sort of `model_args` parameter, for example); however, this fix has the advantages of:

- staying compatible with the `pytorch-pretrained-bert`-era `from_pretrained`, and
- keeping the `**kwargs` parameter introduced with `pytorch-transformers`.

I have also included various other (smaller) changes in this pull request:
- Making `PreTrainedModel.__init__` not accept `*args` and `**kwargs` parameters which it has no use for and currently ignores
  - Apparently necessary for the tests to pass :(
- Stop using the "popping from kwargs" antipattern (see [1]). Keyword arguments with default values achieve the same thing more quickly, and are strictly more informative since linters/autodoc modules can actually make use of them. I've replaced all instances that I could find; if this pattern exists elsewhere it should be removed.
  - Oops: turns out this is a Python 2 compatibility thing. With that said, is there really a need to continue supporting Python 2? Especially with its EOL coming up in just a few months, and especially when it necessitates such ugly code...
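For reference, the kind of change meant by the second bullet (a simplified example, not code from the repository; `cache_dir` and `from_tf` stand in for the real options):

```python
# Before: the "popping from kwargs" antipattern. The options the function
# accepts are invisible in its signature, so linters and autodoc miss them.
def from_pretrained_old(name, *inputs, **kwargs):
    cache_dir = kwargs.pop('cache_dir', None)
    from_tf = kwargs.pop('from_tf', False)
    return name, inputs, cache_dir, from_tf, kwargs

# After: explicit keyword arguments with defaults document themselves.
# Note that combining them with *inputs (def f(name, *inputs, cache_dir=None))
# is Python 3 only, which is the Python 2 compatibility issue mentioned above.
def from_pretrained_new(name, cache_dir=None, from_tf=False, **kwargs):
    return name, cache_dir, from_tf, kwargs
```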