convert.py safetensors updates #4043

Merged: 7 commits, Nov 14, 2023
5 changes: 3 additions & 2 deletions convert.py
@@ -1036,7 +1036,8 @@ def load_some_model(path: Path) -> ModelPlus:
     # Be extra-friendly and accept either a file or a directory:
     if path.is_dir():
         # Check if it's a set of safetensors files first
-        files = list(path.glob("model-00001-of-*.safetensors"))
+        globs = ["model-00001-of-*.safetensors", "model.safetensors"]
@AlpinDale (Contributor), Nov 12, 2023:

Why not just *.safetensors? That's the common approach.

Contributor:

I thought the same thing, but under some absurd circumstances that could cause problems. We never know what people do with their stuff... We might wait for someone with another option, though.

Collaborator:

@AlpinDale The glob looks like it's deliberately trying to target the first part of the set with model-00001-of-*. If it was just *.safetensors then you could get model-99999-of-99999.safetensors which is probably not what you want to load.

Contributor:

Python indexes the files alphabetically when using a glob, so that is a non-issue. I'm simply pointing out that this way of doing it is unconventional and I've not seen any other project do this.

@KerfuffleV2 (Collaborator), Nov 13, 2023:

> Python indexes the files alphabetically when using a glob

I'm pretty sure that's not the case. The documentation doesn't even mention order: https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob

Note also that their examples are like sorted(Path('.').glob('*.py')) which would be redundant if it was guaranteed to be already sorted.
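
To illustrate the point about ordering, here is a minimal sketch that sorts the glob results explicitly instead of relying on whatever order the filesystem happens to return. The helper name `sorted_safetensors` and the shard file names are hypothetical and not part of convert.py.

```python
import tempfile
from pathlib import Path
from typing import List


def sorted_safetensors(model_dir: Path) -> List[Path]:
    """List *.safetensors files in an explicit, reproducible order."""
    # The Path.glob() documentation does not mention ordering,
    # so sort the results rather than assume one.
    return sorted(model_dir.glob("*.safetensors"))


# Demo on a throwaway directory; the shards are created out of order on purpose.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    for name in ("model-00002-of-00002.safetensors",
                 "model-00001-of-00002.safetensors"):
        (model_dir / name).touch()
    names = [p.name for p in sorted_safetensors(model_dir)]
    print(names)
```

With `sorted()` the first element is always the lowest-numbered shard, whatever order the directory listing comes back in.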

> I'm simply pointing out that this way of doing it is unconventional

That may be the case, but your proposed change would break it. There's actually a

    if len(files) > 1:
        raise # ...

a couple lines down. This is specifically supposed to pull in the first file of the set, not all of them.
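
A rough sketch of why the broader pattern would trip that check, under stated assumptions: `find_entry_file` is a hypothetical stand-in for the relevant part of `load_some_model`, and `ValueError` replaces the exception elided by `raise # ...` in the snippet above. The glob patterns are the ones from the diff.

```python
import tempfile
from pathlib import Path
from typing import Sequence

# Patterns from the diff: first shard of a set, or a single-file model.
FIRST_SHARD_GLOBS = ("model-00001-of-*.safetensors", "model.safetensors")


def find_entry_file(model_dir: Path, globs: Sequence[str]) -> Path:
    """Return the single file matched by `globs`, or raise."""
    files = [f for g in globs for f in model_dir.glob(g)]
    if not files:
        raise FileNotFoundError(f"no model file found in {model_dir}")
    if len(files) > 1:
        # Assumption: ValueError stands in for the elided `raise # ...`.
        raise ValueError(f"multiple candidates: {sorted(f.name for f in files)}")
    return files[0]


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp)
        # A hypothetical three-shard model.
        for i in (1, 2, 3):
            (model_dir / f"model-{i:05d}-of-00003.safetensors").touch()
        picked = find_entry_file(model_dir, FIRST_SHARD_GLOBS).name
        try:
            find_entry_file(model_dir, ["*.safetensors"])
            broad_ok = True
        except ValueError:
            broad_ok = False
        return picked, broad_ok


picked, broad_ok = demo()
print(picked, broad_ok)  # model-00001-of-00003.safetensors False
```

The targeted patterns match exactly one file in a sharded directory, while `*.safetensors` matches every shard and so fails the `len(files) > 1` check.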

+        files = [file for glob in globs for file in path.glob(glob)]
         if not files:
             # Try the PyTorch patterns too, with lower priority
             globs = ["consolidated.00.pth", "pytorch_model-00001-of-*.bin", "*.pt", "pytorch_model.bin"]
@@ -1123,7 +1124,7 @@ def main(args_in: list[str] | None = None) -> None:
     parser.add_argument("--outtype", choices=output_choices, help="output format - note: q8_0 may be very slow (default: f16 or f32 based on input)")
     parser.add_argument("--vocab-dir", type=Path, help="directory containing tokenizer.model, if separate from model file")
     parser.add_argument("--outfile", type=Path, help="path to write to; default: based on input")
-    parser.add_argument("model", type=Path, help="directory containing model file, or model file itself (*.pth, *.pt, *.bin)")
+    parser.add_argument("model", type=Path, help="directory containing model file, or model file itself (*.pth, *.pt, *.bin, *.safetensors)")
     parser.add_argument("--vocabtype", choices=["spm", "bpe"], help="vocab format (default: spm)", default="spm")
     parser.add_argument("--ctx", type=int, help="model training context (default: based on input)")
     parser.add_argument("--concurrency", type=int, help=f"concurrency used for conversion (default: {DEFAULT_CONCURRENCY})", default = DEFAULT_CONCURRENCY)