Re-upload weights for old model with new serialization #2068

Closed
fmassa opened this issue Apr 7, 2020 · 7 comments · Fixed by #3620

Comments

@fmassa
Member

fmassa commented Apr 7, 2020

Some of the oldest models we have (including the resnets) were serialized with a legacy format which, while mostly compatible with the current code, has some corner cases that no longer work as expected; see pytorch/pytorch#31615 (basically, loading fails if the data is in a BytesIO object).

Here is a snippet to reproduce the problem:

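import io
import urllib.request

import torch

# Download a checkpoint that was saved with the legacy serialization format.
url = 'https://download.pytorch.org/models/resnet50-19c8e357.pth'
path = './tmp.pth'
urllib.request.urlretrieve(url, path)

# Read the whole file into an in-memory buffer.
with open(path, "rb") as fd:
    buf = io.BytesIO(fd.read())

# Loading the legacy file from a BytesIO object is the step that fails.
state_dict = torch.load(buf, map_location="cpu")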

This fails. We should re-save those weights and re-upload them (while keeping the old files around for backward compatibility), and update the paths in the model files to point to the new files. The following code re-downloads an old file and re-serializes it with the new format:

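import io
import urllib.request

import torch

url = 'https://download.pytorch.org/models/resnet18-5c106cde.pth'
path = './tmp.pth'
urllib.request.urlretrieve(url, path)

# Load from the file path and overwrite the file using the new format.
torch.save(torch.load(path), path)

with open(path, "rb") as fd:
    buf = io.BytesIO(fd.read())

# The re-serialized file can now be loaded from a BytesIO object.
state_dict = torch.load(buf, map_location="cpu")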

One thing to keep in mind is whether changing the default weights files (even though they are equivalent to the old ones) will cause unexpected issues for some users, since it will trigger a re-download of the file, which might fail for many reasons (the server has no internet access, the files were downloaded manually, etc.).
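
For the re-upload step described above, the re-saved file needs a new name, because the hex suffix in names such as resnet18-5c106cde.pth follows the torch.hub convention of using the first eight or more digits of the file's SHA256 hash. A minimal sketch of computing such a name, assuming the re-serialized file is already on disk; the helper name `hashed_filename` and its parameters are hypothetical and not part of torchvision:

```python
import hashlib
from pathlib import Path


def hashed_filename(path, stem, prefix_len=8):
    """Return '<stem>-<sha256 prefix><ext>' for the file at `path`.

    Hypothetical helper: `stem` is the model name (e.g. 'resnet18') and
    `prefix_len` is the number of hex digits kept from the SHA256 digest.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fd:
        # Hash in 1 MiB chunks so large checkpoints are not read at once.
        for chunk in iter(lambda: fd.read(1 << 20), b""):
            digest.update(chunk)
    return f"{stem}-{digest.hexdigest()[:prefix_len]}{Path(path).suffix}"
```

Calling `hashed_filename('./tmp.pth', 'resnet18')` after the re-save step would give the name to upload under; the old file keeps its old name, so both can coexist on the server.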

@NicolasHug
Member

For reference, the following classification model URLs raise an UnpicklingError:

  • resnet34-333f7ec4.pth
  • alexnet-owt-4df8aa71.pth
  • squeezenet1_0-a815701f.pth
  • resnet152-b121ed2d.pth
  • squeezenet1_1-f364aa15.pth
  • resnet50-19c8e357.pth
  • resnet101-5d3b4d8f.pth
  • resnet18-5c106cde.pth

@BIGBALLON

Same issue for me.

@NicolasHug
Member

@BIGBALLON the issue should already be fixed in the latest release.
If you're still experiencing it, could you please tell us exactly which model is affected and which torchvision version you're using, and provide a minimal reproducing example? Thanks

@BIGBALLON

Hi, @NicolasHug, thanks for your reply.

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

1. Download the official resnet18 model (resnet18-5c106cde.pth). I made sure that torch.load('resnet18-5c106cde.pth') works fine.
2. Run the following code:

import io
import torch
with open('resnet18-5c106cde.pth', 'rb') as f:
    buffer = io.BytesIO(f.read())
torch.load(buffer)

3. This produces the following error:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    torch.load(buffer)
  File "/mnt/test/miniconda/envs/open-mmlab/lib/python3.7/site-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/mnt/test/miniconda/envs/open-mmlab/lib/python3.7/site-packages/torch/serialization.py", line 764, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: unpickling stack underflow

Expected behavior

Environment

Additional context

  1. I also tested with PyTorch 1.8.1 and PyTorch 1.5.1 and got the same error.
  2. Both macOS and Linux have the same issue.
  3. Related issue: 'torch.load' report 'bad pickle data' (pytorch/pytorch#31615)
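
Since the report above notes that loading the same file by path works, one possible workaround for data that only exists in memory is to write the buffer to a temporary file and load it from there. This is a sketch under that assumption rather than a fix for the legacy format; the helper name `spill_to_temp_file` is hypothetical:

```python
import io
import os
import tempfile


def spill_to_temp_file(buf, suffix=".pth"):
    """Write the full contents of a BytesIO buffer to a temp file.

    Returns the path of the new file; the caller is responsible for
    deleting it once the checkpoint has been loaded.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as out:
        out.write(buf.getvalue())
    return path


# Intended usage (assumes torch is installed and `buffer` holds the file):
#   path = spill_to_temp_file(buffer)
#   state_dict = torch.load(path, map_location="cpu")
#   os.remove(path)
```

The cost is one extra copy of the checkpoint on disk, which is usually acceptable for a one-off load.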

@NicolasHug
Member

OK, that's because you're still using the "old" weights. The issue won't and cannot be fixed for the old weights. The new weights are at https://download.pytorch.org/models/resnet18-f37072fd.pth

Just delete the old ones and re-download the new ones (you can just call resnet18() from torchvision.models)
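
To find which cached files are still the old ones, it can help to know that the serialization format used by torch.save since PyTorch 1.6 is a zip archive, while the legacy format is not. The sketch below lists .pth files in a directory that are not zip archives. It is only a heuristic (a corrupted download would also be reported), and the function name `find_non_zip_checkpoints` is hypothetical; the directory to scan is an assumption too, typically the torch.hub cache at ~/.cache/torch/hub/checkpoints unless it was relocated:

```python
import zipfile
from pathlib import Path


def find_non_zip_checkpoints(directory):
    """Return the .pth files in `directory` that are not zip archives.

    Files written with the newer zip-based format pass the zipfile
    check; legacy-format (or corrupted) files do not.
    """
    return sorted(
        p for p in Path(directory).glob("*.pth") if not zipfile.is_zipfile(p)
    )
```

Any file this reports can be deleted and re-downloaded as described above.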

@BIGBALLON

Thanks, @NicolasHug, the new weights can be loaded!

@JiangtianPan

Hi, I also encountered this issue when using torch.load() for vgg19-dcbb9e9d.pth, with torch==1.13.0 and torchvision==0.14.0. @NicolasHug
