Conversation

ieBoytsov
Contributor


When running the DPO script with a PEFT configuration, e.g.:
```
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/dpo.py --config recipes/zephyr-7b-beta/dpo/config_qlora.yaml
```

the following error is raised:
```
ValueError: You passed both a ref_model and a peft_config. For training PEFT adapters with DPO there is no need to pass a reference model. Please pass `ref_model=None` in case you want to train PEFT adapters, or pass a ref_model with `force_use_ref_model=True` in DPOTrainer's init. if you want to use a different ref_model.
```

This happens because the script tries to initialize a reference model even when LoRA/PEFT is enabled.
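
The failing call presumably looks something like the sketch below (variable names are illustrative; the exact arguments in `scripts/dpo.py` may differ):

```python
from trl import DPOTrainer

# Sketch of the problematic flow: a reference model is always created
# and passed in, even when a PEFT config is present.
trainer = DPOTrainer(
    model,                    # policy model, loaded earlier in the script
    ref_model,                # always passed ...
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,  # ... even when this is not None
)
# -> raises the ValueError above whenever peft_config is set
```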

This PR updates the logic to skip reference model initialization when PEFT adapters are used. 

This mirrors the logic used in the trl example DPO script:

https://github.com/huggingface/trl/blob/main/trl/scripts/dpo.py#L109
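
The pattern there is roughly the following (a minimal sketch; `model_args` and `model_kwargs` stand in for whatever the script actually builds when loading the policy model):

```python
from transformers import AutoModelForCausalLM
from trl import DPOTrainer, get_peft_config

peft_config = get_peft_config(model_args)
if peft_config is None:
    # Full fine-tuning: DPO needs an explicit frozen reference model.
    ref_model = AutoModelForCausalLM.from_pretrained(
        model_args.model_name_or_path, **model_kwargs
    )
else:
    # PEFT/LoRA: DPOTrainer treats the base model with adapters
    # disabled as the implicit reference, so nothing is loaded here.
    ref_model = None

trainer = DPOTrainer(
    model,
    ref_model,                # None in the PEFT case
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
```

With `ref_model=None` alongside a `peft_config`, DPOTrainer no longer raises, and the PEFT path also avoids keeping a second copy of the base model in memory.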
@ieBoytsov
Contributor Author

Hey @lewtun, could you please have a look at this when you have time? My team would benefit from this fix :)
