Commit 873c041

Update script location and docs for all 3 steps (#681)
1 parent f237ca5 commit 873c041

File tree

40 files changed: +19 -19 lines changed

applications/DeepSpeed-Chat/README.md (3 additions & 3 deletions)

@@ -192,7 +192,7 @@ The train.py script has an easy-to-use command-line interface and can be launche
 cd training/step1_supervised_finetuning/
 
 # Run the training script
-bash training_scripts/single_gpu/run_1.3b.sh
+bash training_scripts/opt/single_gpu/run_1.3b.sh
 
 # Evaluate the model
 bash evaluation_scripts/run_prompt.sh
@@ -209,7 +209,7 @@ bash evaluation_scripts/run_prompt.sh
 cd training/step2_reward_model_finetuning
 
 # Run the training script
-bash training_scripts/single_gpu/run_350m.sh
+bash training_scripts/opt/single_gpu/run_350m.sh
 
 # Evaluate the model
 bash evaluation_scripts/run_eval.sh
@@ -237,7 +237,7 @@ As the most complex step of the entire 3-step InstructGPT pipeline, DeepSpeed Ch
 cd training/step3_rlhf_finetuning/
 
 # Run the training script
-bash training_scripts/single_gpu/run_1.3b.sh
+bash training_scripts/opt/single_gpu/run_1.3b.sh
 ```
 </p></details>

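All three commands change the same way: the per-step run scripts now live under a model-family subdirectory of `training_scripts/`. A minimal sketch of that path rewrite (the `relocate` helper and the step list are hypothetical illustration, not part of the commit):

```python
# Each per-step script path gains an "opt/" model-family segment
# directly under training_scripts/.
old_paths = [
    "training_scripts/single_gpu/run_1.3b.sh",  # step 1
    "training_scripts/single_gpu/run_350m.sh",  # step 2
    "training_scripts/single_gpu/run_1.3b.sh",  # step 3
]

def relocate(path: str) -> str:
    """Insert the 'opt' family directory after 'training_scripts'."""
    return path.replace("training_scripts/", "training_scripts/opt/", 1)

new_paths = [relocate(p) for p in old_paths]
print(new_paths[0])  # → training_scripts/opt/single_gpu/run_1.3b.sh
```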
applications/DeepSpeed-Chat/train.py (1 addition & 1 deletion)

@@ -127,7 +127,7 @@ def get_script(args, step_num):
     script = os.path.join(
         os.getcwd(),
         step_dirs[step_num],
-        "training_scripts",
+        "training_scripts/opt/",
         args.deployment_type,
         f"run_{model_size}.sh",
     )

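The effect of the one-line change in `get_script` can be checked in isolation; this standalone sketch mirrors the `os.path.join` call after the commit, with hypothetical example values standing in for `step_dirs[step_num]`, `args.deployment_type`, and `model_size`:

```python
import os

# Mirrors the updated os.path.join call: the "opt" model-family folder
# now sits between "training_scripts" and the deployment type.
step_dir = "training/step1_supervised_finetuning"  # hypothetical example value
deployment_type = "single_gpu"                     # hypothetical example value
model_size = "1.3b"                                # hypothetical example value

script = os.path.join(
    os.getcwd(),
    step_dir,
    "training_scripts/opt/",  # was "training_scripts"
    deployment_type,
    f"run_{model_size}.sh",
)
print(os.path.relpath(script))
```

Note that `os.path.join` normalizes the trailing slash in `"training_scripts/opt/"`, so the resulting path is `.../training_scripts/opt/single_gpu/run_1.3b.sh`, matching the script locations used in the updated READMEs.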
applications/DeepSpeed-Chat/training/README.md (6 additions & 6 deletions)

@@ -55,18 +55,18 @@ We are sharing our training logs for all three steps for an OPT-1.3b actor and O
 
 | Step | Run Script | Training Log |
 |--------------|-----------|------------|
-| 1 | [single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_scripts/single_node/run_1.3b.sh) | [opt-1.3b-globalBatchSize128.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_log_output/opt-1.3b-globalBatchSize128.log) |
-| 2 | [single_node/run_350m.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/run_350m.sh) | [opt-350m_globalBatchSize-64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_log_output/opt-350m_globalBatchSize-64.log) |
-| 3 | [single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/single_node/run_1.3b.sh) | [actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_log_output/actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log) |
+| 1 | [opt/single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_scripts/opt/single_node/run_1.3b.sh) | [opt-1.3b-globalBatchSize128.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_log_output/opt-1.3b-globalBatchSize128.log) |
+| 2 | [opt/single_node/run_350m.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/opt/single_node/run_350m.sh) | [opt-350m_globalBatchSize-64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_log_output/opt-350m_globalBatchSize-64.log) |
+| 3 | [opt/single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/opt/single_node/run_1.3b.sh) | [actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_log_output/actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log) |
 
 ### Characterization Scripts
 Scripts for sweeping training across various parameters (Zero Stage, Offload, Lora, etc) are available for Step 1, 2, and 3. These scripts can be further extended to sweep across additional parameters such as learning rate.
 
 | Step | Sweep Script | README |
 |--------------|-----------|-----------|
-| 1 | [run_step1_sweep.sh](./step1_supervised_finetuning/training_scripts/single_node/sweep/run_step1_sweep.sh) | [README](./step1_supervised_finetuning/training_scripts/single_node/sweep/README.md) |
-| 2 | [run_step2_sweep.sh](./step2_reward_model_finetuning/training_scripts/single_node/sweep/run_step2_sweep.sh) | [README](./step2_reward_model_finetuning/training_scripts/single_node/sweep/README.md) |
-| 3 | [run_step3_sweep.sh](./step3_rlhf_finetuning/training_scripts/single_node/sweep/run_step3_sweep.sh) | [README](./step3_rlhf_finetuning/training_scripts/single_node/sweep/README.md) |
+| 1 | [run_step1_sweep.sh](./step1_supervised_finetuning/training_scripts/opt/single_node/sweep/run_step1_sweep.sh) | [README](./step1_supervised_finetuning/training_scripts/opt/single_node/sweep/README.md) |
+| 2 | [run_step2_sweep.sh](./step2_reward_model_finetuning/training_scripts/opt/single_node/sweep/run_step2_sweep.sh) | [README](./step2_reward_model_finetuning/training_scripts/opt/single_node/sweep/README.md) |
+| 3 | [run_step3_sweep.sh](./step3_rlhf_finetuning/training_scripts/opt/single_node/sweep/run_step3_sweep.sh) | [README](./step3_rlhf_finetuning/training_scripts/opt/single_node/sweep/README.md) |
 
 ### Others
 RLHF (Reinforcement Learning for Human Feedback) training is still an open problem, and DeepSpeed-Chat is designed to be a starting point for researchers and practitioners to work on it with an efficient and fast training experience. The Hybrid-Engine and other efficient components, like LoRA, can be inherited from DeepSpeed-Chat, allowing you to develop your own RLHF training pipeline for exploration, research, and other purposes.

applications/DeepSpeed-Chat/training/step1_supervised_finetuning/README.md (1 addition & 1 deletion)

@@ -5,7 +5,7 @@ Supervised finetuning (SFT) is very similar to standard language model finetunin
 We provide multiple scripts for training on single GPUs (e.g., a single A6000-48G, V100-32G, A100-40G, etc.), single nodes (e.g., 8/16x V100-32G, 8 A100-40G/80G), and multiple nodes setting (e.g., 64x A100-80G), which can be found in the 'training_scripts' directory. For example, if you have a single A6000-48G, you can simply run the corresponding script.
 
 ```bash
-training_scripts/single_gpu/run_1.3b.sh
+training_scripts/opt/single_gpu/run_1.3b.sh
 ```
 
 to train a OPT-1.3b model. It is easy to extend our single-node script to multi-node system.
