Do not initialise ref model when peft is used #224
When running the DPO script with a PEFT configuration, e.g. with an invocation along these lines (a hypothetical command; the flags follow trl's `ModelConfig` and may differ from the script's actual arguments):
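```shell
# Hypothetical invocation; --use_peft, --lora_r, and --lora_alpha are
# borrowed from trl's ModelConfig for illustration.
python dpo.py \
    --model_name_or_path Qwen/Qwen2-0.5B-Instruct \
    --use_peft \
    --lora_r 16 \
    --lora_alpha 32
```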
the following error is raised:
ValueError: You passed both a ref_model and a peft_config. For training PEFT adapters with DPO there is no need to pass a reference model. Please pass `ref_model=None` in case you want to train PEFT adapters, or pass a ref_model with `force_use_ref_model=True` in DPOTrainer's init. if you want to use a different ref_model.
This happens because the script tries to initialize a reference model even when LoRA/PEFT is enabled.
This PR updates the logic to skip reference model initialization when PEFT adapters are used.
This follows the same logic used in the trl example DPO script:
https://github.com/huggingface/trl/blob/main/trl/scripts/dpo.py#L109
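Condensed, the relevant pattern from that script looks like this (a minimal sketch; `model_args`, `training_args`, `dataset`, and `tokenizer` stand in for the script's setup code):

```python
from transformers import AutoModelForCausalLM
from trl import DPOTrainer, get_peft_config

# Build the policy model as usual.
model = AutoModelForCausalLM.from_pretrained(model_args.model_name_or_path)

# Only instantiate a separate reference model when PEFT is *not* used;
# with PEFT adapters, DPOTrainer recovers the reference policy by
# disabling the adapters, so ref_model must stay None.
peft_config = get_peft_config(model_args)
if peft_config is None:
    ref_model = AutoModelForCausalLM.from_pretrained(model_args.model_name_or_path)
else:
    ref_model = None

trainer = DPOTrainer(
    model,
    ref_model,  # None when training PEFT adapters
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,
)
```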