[QEff Finetune]: Refactor the finetune main __call__ #289
Conversation
Force-pushed from 2f19722 to 48061ee.
Force-pushed from 3ff66eb to c0d2315.
Force-pushed from 7f2d367 to b2ee39a.
tests/finetune/test_finetune.py (Outdated):

    - finetune(**kwargs)
    + results = finetune(**kwargs)

      assert np.allclose(results["avg_train_prep"], 1.002326, atol=1e-5), "Train perplexity is not matching."
avg_train_prep should be changed to avg_train_metric as per the changes in PR 292.
Updated in latest.
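For reference, a minimal sketch of what the assertion could look like after the rename (the `avg_train_metric` key is an assumption based on PR 292; the expected value is the one from the snippet above):

```python
import numpy as np


def check_train_metric(results: dict) -> None:
    # Hypothetical post-rename assertion; the key name is assumed from PR 292.
    assert np.allclose(results["avg_train_metric"], 1.002326, atol=1e-5), "Train metric is not matching."
```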
      gradient_accumulation_steps,
    - train_config: TRAIN_CONFIG,
    + train_config: TrainConfig,
      device,
There is no need to pass all three of train_config.gradient_accumulation_steps, train_config, and train_config.device; passing only train_config is enough.
Updated in latest.
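For illustration, a minimal sketch of the suggested signature, assuming a stand-in TrainConfig with only the two fields discussed here:

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # Stand-in for the real TrainConfig; fields and defaults are illustrative.
    gradient_accumulation_steps: int = 4
    device: str = "qaic"


def train_step(loss: float, train_config: TrainConfig) -> tuple[float, str]:
    # Only the config object is passed in; the helper reads the fields it needs.
    scaled_loss = loss / train_config.gradient_accumulation_steps
    return scaled_loss, train_config.device
```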
Force-pushed from d0fff22 to e27deeb.
    - Ensures types match expected values (int, float, list, etc.).
    """
    if config_type.lower() != "lora":
        raise ValueError(f"Unsupported config_type: {config_type}. Only 'lora' is supported.")
Since we are not doing LoRA fine-tuning in the case of BERT, this will raise an error.
No, this is used only when peft_config_file is provided to the main() function of finetune.py. Currently there won't be any peft_config_file for BERT, but if one is provided, PEFT training will happen for BERT.
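For reference, a hedged sketch of the validation being discussed, assuming it simply rejects keys that do not exist on `peft.LoraConfig`; the helper name and exact checks are illustrative, not the actual code in finetune.py:

```python
from dataclasses import fields
from typing import Any, Dict

from peft import LoraConfig


def validate_config(config_data: Dict[str, Any], config_type: str = "lora") -> None:
    # Only the LoRA config type is supported, matching the snippet above.
    if config_type.lower() != "lora":
        raise ValueError(f"Unsupported config_type: {config_type}. Only 'lora' is supported.")
    # Reject keys that are not fields of peft.LoraConfig (illustrative check).
    valid_keys = {f.name for f in fields(LoraConfig)}
    for key in config_data:
        if key not in valid_keys:
            raise ValueError(f"Unknown LoraConfig field: {key}")
```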
    Args:
        config_data (Dict[str, Any]): The configuration dictionary loaded from YAML/JSON.
        config_type (str): Type of config to validate ("lora" for LoraConfig, default: "lora").
We need to add a field in config_type corresponding to BERT, since we don't do LoRA fine-tuning for it.
LoRA can work with BERT, but there is no point, since BERT for sequence classification has randomly initialized weights in its classifier head.
QEfficient/cloud/finetune.py (Outdated):

    # local_args = {k: v for k, v in locals().items() if v is not None and k != "peft_config_file" and k != "kwargs"}
    update_config(train_config, **kwargs)

    lora_config = LoraConfig()
This line is not required.
Removed in latest.
QEfficient/cloud/finetune.py (Outdated):

    longest_seq_length, _ = get_longest_seq_length(train_dataloader.dataset)
    lora_config = LoraConfig()

    update_config(lora_config, **kwargs)
Why do we need to update lora_config here with kwargs?
Moved this code inside the generate_peft_config function. It was required in order to update the config params based on the CLI arguments.
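For illustration, a minimal sketch of this flow, assuming a simple update_config helper that overrides attributes of a default LoraConfig with CLI kwargs; the actual generate_peft_config in finetune.py may differ:

```python
from peft import LoraConfig


def update_config(config, **kwargs):
    # Override only the attributes that exist on the config object.
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
    return config


# Example: CLI arguments override the LoraConfig defaults.
lora_config = update_config(LoraConfig(), r=16, lora_alpha=64, lora_dropout=0.1)
```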
Force-pushed from 0bb2a51 to ada9de8.
QEfficient/cloud/finetune.py (Outdated):

    - def main(**kwargs):
    -     """
    -     Helper function to finetune the model on QAic.
    + def setup_distributed_training(config: TrainConfig) -> None:
Please move the functions setup_distributed_training, setup_seeds, load_model_and_tokenizer, apply_peft, and setup_dataloaders to a separate utils file and import them into this file.
We can take that up in the next PR. Keeping the refactored code in the same file for now makes the comparison easier. A lot of refactoring is still required; we will do it in an incremental fashion.
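For reference, a minimal sketch of one of these helpers (setup_seeds), assuming the usual seeding of torch, random, and numpy; the actual body in QEfficient/cloud/finetune.py may differ:

```python
import random

import numpy as np
import torch


def setup_seeds(seed: int) -> None:
    # Seed all RNGs used during fine-tuning for reproducibility (illustrative).
    torch.manual_seed(seed)
    random.seed(seed)
    np.random.seed(seed)
```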
Force-pushed from 1709b85 to 0aead82.
Force-pushed from 8a5b6b6 to a4c8c50.
Force-pushed from a4c8c50 to 777c7ca.
- Refactor the finetune main API
- Add support to override the PEFT config (YAML/JSON)
- Add support to validate the correctness of the PEFT config
- Some nit changes

#### Using a Custom LoRA Config (`lora_config.yaml`):
```yaml
r: 16
lora_alpha: 64
target_modules:
- q_proj
- v_proj
- k_proj
bias: none
task_type: CAUSAL_LM
lora_dropout: 0.1
```
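For illustration, a hedged sketch of how an override file like the one above could be mapped onto a `peft.LoraConfig`; the actual loading and validation in finetune.py may differ:

```python
import yaml
from peft import LoraConfig

# Illustrative only: read the YAML overrides and construct a LoraConfig from them.
# The PR also adds validation of these keys before they are applied.
with open("lora_config.yaml") as f:
    overrides = yaml.safe_load(f)

lora_config = LoraConfig(**overrides)
```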
Command:
```bash
python -m QEfficient.cloud.finetune \
--model_name "meta-llama/Llama-3.2-1B" \
--lr 5e-4 \
--peft_config_file "lora_config.yaml"
```
#### Using Default LoRA Config:
```bash
python -m QEfficient.cloud.finetune \
--model_name "meta-llama/Llama-3.2-1B" \
--lr 5e-4
```
---------
Signed-off-by: vbaddi <[email protected]>
Signed-off-by: Meet Patel <[email protected]>
Co-authored-by: Meet Patel <[email protected]>
Signed-off-by: Mohit Soni <[email protected]>