
Commit ad8f599

lekurile authored and LeetJoe committed
Add step 2 sweep script, clean up scripts (deepspeedai#664)
This PR adds a step 2 sweeping script in DS Chat and cleans up the existing step 1 and 3 scripts.
1 parent 328b5b2 commit ad8f599

File tree

10 files changed: +100 -12 lines


applications/DeepSpeed-Chat/training/README.md

Lines changed: 4 additions & 3 deletions
@@ -60,12 +60,13 @@ We are sharing our training logs for all three steps for an OPT-1.3b actor and O
 | 3 | [single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/single_node/run_1.3b.sh) | [actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_log_output/actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log) |
 
 ### Characterization Scripts
-Scripts for sweeping training across various parameters (Zero Stage, Offload, Lora, etc) are available for Step 1 and 3. These scripts can be further extended to sweep across additional parameters such as learning rate.
+Scripts for sweeping training across various parameters (Zero Stage, Offload, Lora, etc) are available for Step 1, 2, and 3. These scripts can be further extended to sweep across additional parameters such as learning rate.
 
 | Step | Sweep Script | README |
 |--------------|-----------|-----------|
-| 1 | [run_step1_opt_sweep.sh](./step1_supervised_finetuning/training_scripts/single_node/sweep/run_step1_opt_sweep.sh) | [README](./step1_supervised_finetuning/training_scripts/single_node/sweep/README.md) |
-| 3 | [run_step3_opt_sweep.sh](./step3_rlhf_finetuning/training_scripts/single_node/sweep/run_step3_opt_sweep.sh) | [README](./step3_rlhf_finetuning/training_scripts/single_node/sweep/README.md) |
+| 1 | [run_step1_sweep.sh](./step1_supervised_finetuning/training_scripts/single_node/sweep/run_step1_sweep.sh) | [README](./step1_supervised_finetuning/training_scripts/single_node/sweep/README.md) |
+| 2 | [run_step2_sweep.sh](./step2_reward_model_finetuning/training_scripts/single_node/sweep/run_step2_sweep.sh) | [README](./step2_reward_model_finetuning/training_scripts/single_node/sweep/README.md) |
+| 3 | [run_step3_sweep.sh](./step3_rlhf_finetuning/training_scripts/single_node/sweep/run_step3_sweep.sh) | [README](./step3_rlhf_finetuning/training_scripts/single_node/sweep/README.md) |
 
 ### Others
 RLHF (Reinforcement Learning for Human Feedback) training is still an open problem, and DeepSpeed-Chat is designed to be a starting point for researchers and practitioners to work on it with an efficient and fast training experience. The Hybrid-Engine and other efficient components, like LoRA, can be inherited from DeepSpeed-Chat, allowing you to develop your own RLHF training pipeline for exploration, research, and other purposes.

applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_scripts/single_node/sweep/README.md

Lines changed: 3 additions & 3 deletions
@@ -5,17 +5,17 @@
 * [Usage](#usage)
 
 # Introduction
-The step 1 characterization script is intented to sweep across various training parameters. Currently, the following are parameters are swept:
+The step 1 characterization script sweeps across various training parameters. Currently, the following parameters are swept:
 <pre>
 Zero Stage: 2, 3
 Offload: True, False
 Lora: True, False
 </pre>
 
-The `run_step1_opt_sweep.sh` script passes configuration arguments to `run_1.3b_lora_swp.sh`, which can be extended to sweep beyond the parameters listed above (learning rate, weight decay, etc).
+The `run_step1_sweep.sh` script passes configuration arguments to `run_single.sh`, which can be extended to sweep beyond the parameters listed above (e.g. learning rate, weight decay, etc).
 
 # Usage
 The sweep script can be run as follows:
 <pre>
-DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning$ bash training_scripts/single_node/sweep/run_step1_opt_sweep.sh
+DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning$ bash training_scripts/single_node/sweep/run_step1_sweep.sh
 </pre>
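For orientation, the grid this README lists (two Zero stages, offload on/off, LoRA on/off) comes to eight runs. A minimal, hypothetical shell sketch (not part of the commit) that enumerates the resulting output-name suffixes:

```shell
#!/bin/bash
# Enumerate the 2 x 2 x 2 = 8 configurations the step 1 sweep covers.
count=0
for z in 2 3; do
  for offload in true false; do
    for lora in true false; do
      echo "z${z}_offload_${offload}_lora_${lora}"
      count=$((count + 1))
    done
  done
done
echo "total runs: ${count}"   # total runs: 8
```

Extending the sweep (e.g. to learning rate) multiplies this count, which is worth keeping in mind before adding loop dimensions.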
File renamed without changes.
Lines changed: 2 additions & 2 deletions
@@ -9,11 +9,11 @@ do
 do
 for lora in true false
 do
-cmd="bash training_scripts/single_node/sweep/run_1.3b_lora_swp.sh \
+cmd="bash training_scripts/single_node/sweep/run_single.sh \
 ${z} \
 ${offload} \
 ${lora} \
-step1_z${z}_offload_${offload}_lora_${lora}"
+z${z}_offload_${offload}_lora_${lora}"
 echo "----------------------------- CALLING SHELL SCRIPT -----------------------------"
 echo $cmd
 $cmd
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# DeepSpeed Characterization Script
+
+# Contents
+* [Introduction](#introduction)
+* [Usage](#usage)
+
+# Introduction
+The step 2 characterization script sweeps across various training parameters. Currently, the following parameters are swept:
+<pre>
+Zero Stage: 2, 3
+Offload: True, False
+</pre>
+
+The `run_step2_sweep.sh` script passes configuration arguments to `run_single.sh`, which can be extended to sweep beyond the parameters listed above (e.g. learning rate, weight decay, etc).
+
+# Usage
+The sweep script can be run as follows:
+<pre>
+DeepSpeedExamples/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning$ bash training_scripts/single_node/sweep/run_step2_sweep.sh
+</pre>
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+#!/bin/bash
+# Copyright (c) Microsoft Corporation.
+# SPDX-License-Identifier: Apache-2.0
+
+# DeepSpeed Team
+ZERO_STAGE=$1
+OFFLOAD=$2
+OUTPUT=$3
+if [ "$ZERO_STAGE" == "" ]; then
+ZERO_STAGE=0
+fi
+if [ "$OFFLOAD" == true ]; then
+OFFLOAD="--offload"
+else
+OFFLOAD=""
+fi
+if [ "$OUTPUT" == "" ]; then
+OUTPUT=./output
+fi
+mkdir -p $OUTPUT
+
+cmd="deepspeed main.py \
+--data_path Dahoas/rm-static Dahoas/full-hh-rlhf Dahoas/synthetic-instruct-gptj-pairwise yitingxie/rlhf-reward-datasets \
+--data_split 2,4,4 \
+--model_name_or_path facebook/opt-350m \
+--num_padding_at_beginning 1 \
+--per_device_train_batch_size 4 \
+--per_device_eval_batch_size 4 \
+--max_seq_len 512 \
+--learning_rate 5e-5 \
+--weight_decay 0.1 \
+--num_train_epochs 1 \
+--disable_dropout \
+--gradient_accumulation_steps 1 \
+--lr_scheduler_type cosine \
+--num_warmup_steps 0 \
+--seed 1234 \
+--zero_stage $ZERO_STAGE \
+--deepspeed \
+--output_dir $OUTPUT \
+$OFFLOAD"
+
+echo "----------------------------- DS COMMAND -----------------------------"
+echo $cmd
+
+$cmd &> $OUTPUT/${OUTPUT}.log
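A small pattern worth noting in the `run_single.sh` diff above: the second positional argument becomes the `--offload` flag only when it is the literal string `true`, and an empty string otherwise. Isolated as a hypothetical helper (the function name is ours, not the script's):

```shell
#!/bin/bash
# Map a true/false argument to a CLI flag, as run_single.sh does inline.
to_offload_flag() {
  if [ "$1" == true ]; then
    echo "--offload"
  else
    echo ""
  fi
}

to_offload_flag true    # prints: --offload
to_offload_flag false   # prints an empty line
```

Because the empty string simply disappears when `$cmd` is expanded unquoted, the same command template works for both the offload and non-offload runs.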
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+#!/bin/bash
+# Copyright (c) Microsoft Corporation.
+# SPDX-License-Identifier: Apache-2.0
+
+# DeepSpeed Team
+for z in {2..3}
+do
+for offload in true false
+do
+cmd="bash training_scripts/single_node/sweep/run_single.sh \
+${z} \
+${offload} \
+z${z}_offload_${offload}"
+echo "----------------------------- CALLING SHELL SCRIPT -----------------------------"
+echo $cmd
+$cmd
+pkill -9 python
+sleep 60
+echo ""
+done
+done
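For reference, the step 2 loop above expands to four `run_single.sh` invocations; a hypothetical dry-run sketch that prints the generated command lines rather than executing them:

```shell
#!/bin/bash
# Dry-run: print the four commands the step 2 sweep would execute.
for z in {2..3}; do
  for offload in true false; do
    echo "bash training_scripts/single_node/sweep/run_single.sh ${z} ${offload} z${z}_offload_${offload}"
  done
done
```

Note that the real loop also runs `pkill -9 python` and sleeps between configurations, presumably so worker processes from the previous run release GPU memory before the next one starts.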

applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/single_node/sweep/README.md

Lines changed: 3 additions & 3 deletions
@@ -5,18 +5,18 @@
 * [Usage](#usage)
 
 # Introduction
-The step 3 characterization script is intented to sweep across various training parameters. Currently, the following are parameters are swept:
+The step 3 characterization script sweeps across various training parameters. Currently, the following parameters are swept:
 <pre>
 Zero Stage: 2, 3
 Hybrid Engine: True, False
 Offload: True, False
 Lora: True, False
 </pre>
 
-The `run_step3_opt_sweep.sh` script passes configuration arguments to `run_1.3b_lora_swp.sh`, which can be extended to sweep beyond the parameters listed above (learning rate, weight decay, etc).
+The `run_step3_sweep.sh` script passes configuration arguments to `run_single.sh`, which can be extended to sweep beyond the parameters listed above (e.g. learning rate, weight decay, etc).
 
 # Usage
 The sweep script can be run as follows:
 <pre>
-DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning$ bash training_scripts/single_node/sweep/run_step3_opt_sweep.sh
+DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning$ bash training_scripts/single_node/sweep/run_step3_sweep.sh
 </pre>
File renamed without changes.
Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ do
 do
 for lora in true false
 do
-cmd="bash training_scripts/single_node/sweep/run_1.3b_lora_swp.sh \
+cmd="bash training_scripts/single_node/sweep/run_single.sh \
 $ACTOR_MODEL_PATH \
 $CRITIC_MODEL_PATH \
 ${z} \
