Description
System Info
- transformers version: 4.43.1
- Platform: Linux-6.8.5-1-default-x86_64-with-glibc2.39
- Python version: 3.11.9
- Huggingface_hub version: 0.23.5
- Safetensors version: 0.4.3
- Accelerate version: 0.29.3
- Accelerate config:
  - compute_environment: LOCAL_MACHINE
  - distributed_type: MULTI_GPU
  - mixed_precision: bf16
  - use_cpu: False
  - debug: False
  - num_processes: 8
  - machine_rank: 0
  - num_machines: 1
  - gpu_ids: all
  - rdzv_backend: static
  - same_network: True
  - main_training_function: main
  - enable_cpu_affinity: False
  - downcast_bf16: no
  - tpu_use_cluster: False
  - tpu_use_sudo: False
  - tpu_env: []
  - dynamo_config: {'dynamo_backend': 'INDUCTOR'}
- PyTorch version (GPU?): 2.4.0+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: True
- Using GPU in script?: True
- GPU type: NVIDIA L40S
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Description
I am trying to use the AutoAWQ library to quantize a Pixtral model (mistral-community/Pixtral-Large-Instruct-2411). However, I am encountering the following error:
File "/quantization/quant/lib64/python3.11/site-packages/transformers/models/llava/modeling_llava.py", line 303, in _merge_input_ids_with_image_features
num_images, num_image_patches, embed_dim = image_features.shape
^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'shape'
Code
Here is the code I am using:
import os
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = r'/data/models/mistral/pixtral-large-instruct-2411' # from https://huggingface.co/mistral-community/Pixtral-Large-Instruct-2411
quant_path = r'/data/models/mistral/pixtral-large-instruct-2411-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
os.makedirs(quant_path, exist_ok=True)
# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Quantize
model.quantize(tokenizer, quant_config=quant_config)
# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
print(f'Model is quantized and saved at "{quant_path}"')
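For reference, the crash can likely be reproduced without AutoAWQ at all, since the failure happens as soon as a text-only forward pass reaches the LLaVA merge path. Below is a minimal, untested sketch; it reuses the same local model directory as above and assumes a plain text batch (no pixel_values) takes the legacy processing branch:
import torch
from transformers import AutoTokenizer, LlavaForConditionalGeneration

model_path = '/data/models/mistral/pixtral-large-instruct-2411'

model = LlavaForConditionalGeneration.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Text-only input: pixel_values is never passed, so get_image_features()
# is never called and image_features stays None inside forward().
inputs = tokenizer("Hello, world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)  # expected: AttributeError on image_features.shape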
Analysis
The model I am using is Pixtral-Large-Instruct-2411, but its configuration maps to LlavaForConditionalGeneration. The issue arises in the Transformers library's source code, where image_features remains None whenever pixel_values is None; since AutoAWQ calibrates on text-only batches by default, pixel_values is indeed None during quantization. Consequently, the first line of _merge_input_ids_with_image_features, num_images, num_image_patches, embed_dim = image_features.shape, tries to access the shape attribute of None, resulting in an AttributeError.
image_features = None
if pixel_values is not None:
    image_features = self.get_image_features(
        pixel_values=pixel_values,
        vision_feature_layer=vision_feature_layer,
        vision_feature_select_strategy=vision_feature_select_strategy,
    )

if legacy_processing:
    logger.warning_once(
        "Expanding inputs for image tokens in LLaVa should be done in processing. "
        "Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly "
        "with `processor.patch_size = {{patch_size}}` and processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. "
        "Using processors without these attributes in the config is deprecated and will throw an error in v4.50."
    )
    # prefill stage vs decoding stage (legacy behavior copied)
    if input_ids.shape[1] != 1:
        inputs_embeds, attention_mask, labels, position_ids = self._merge_input_ids_with_image_features(
            image_features, inputs_embeds, input_ids, attention_mask, labels  # <-- image_features is still None here
        )
        cache_position = torch.arange(attention_mask.shape[1], device=attention_mask.device)
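A straightforward guard would avoid the crash. The sketch below is my own suggestion, not a confirmed upstream patch; it simply skips the legacy merge when no image features were computed, so text-only batches fall through to the plain language-model path:
# Suggested guard (sketch): only merge when image features actually exist.
if input_ids.shape[1] != 1 and image_features is not None:
    inputs_embeds, attention_mask, labels, position_ids = self._merge_input_ids_with_image_features(
        image_features, inputs_embeds, input_ids, attention_mask, labels
    )
    cache_position = torch.arange(attention_mask.shape[1], device=attention_mask.device)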
Steps to Reproduce
- Ensure the Pixtral-Large-Instruct-2411 model is available at the specified path.
- Run the provided code snippet.
Actual Behavior
An AttributeError is raised because image_features is None.
Expected behavior
The model should be loaded, quantized, and saved without any errors.