[pytorch2] tomj@MC:/workspace/git/gguf-llama (master ✘)✭ ᐅ python3 ./convert-hf-to-gguf.py /workspace/process/pansophic_rocket-3b/source --outtype f16 --outfile /workspace/process/pansophic_rocket-3b/gguf/rocket-3b.fp16.gguf
Loading model: source
gguf: This GGUF file is for Little Endian only
Set model parameters
Set model tokenizer
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
gguf: Adding 50009 merge(s).
gguf: Setting special token type bos to 0
gguf: Setting special token type eos to 0
gguf: Setting special token type unk to 0
Exporting model to '/workspace/process/pansophic_rocket-3b/gguf/rocket-3b.fp16.gguf'
gguf: loading model part 'pytorch_model.bin'
Traceback (most recent call last):
File "/workspace/git/gguf-llama/./convert-hf-to-gguf.py", line 897, in <module>
model_instance.write()
File "/workspace/git/gguf-llama/./convert-hf-to-gguf.py", line 126, in write
self.write_tensors()
File "/workspace/git/gguf-llama/./convert-hf-to-gguf.py", line 98, in write_tensors
data = data_torch.squeeze().numpy()
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
I noticed that the latest commits mentioned StableLM, so I tried rolling back to before them, but I still got the same error.
I have confirmed that the model loads fine via Transformers, so the checkpoint itself appears to be valid.
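For reference, the error is easy to reproduce in isolation: it occurs when a checkpoint tensor still has `requires_grad=True` (e.g. it was saved as an `nn.Parameter`), since PyTorch refuses to convert a gradient-tracking tensor to NumPy. A minimal sketch of the failure and the `detach()` workaround the error message suggests (this is illustrative only, not the converter's actual code):

```python
import torch

# A tensor that tracks gradients, as parameters in a pytorch_model.bin often do
t = torch.nn.Parameter(torch.ones(1, 2, 2))

# This mirrors the failing line: data_torch.squeeze().numpy()
try:
    t.squeeze().numpy()
except RuntimeError as e:
    print(e)  # "Can't call numpy() on Tensor that requires grad. ..."

# Detaching first breaks the autograd link and allows the conversion
arr = t.squeeze().detach().numpy()
print(arr.shape)
```

So a likely fix in the conversion script would be to call `.detach()` (or load the state dict with gradients stripped) before the `.numpy()` conversion.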