Commit 873c041

Update script location and docs for all 3 steps (#681)
1 parent f237ca5 commit 873c041

File tree

40 files changed: +19 -19 lines changed

applications/DeepSpeed-Chat/README.md (3 additions & 3 deletions)

@@ -192,7 +192,7 @@ The train.py script has an easy-to-use command-line interface and can be launche
 cd training/step1_supervised_finetuning/
 
 # Run the training script
-bash training_scripts/single_gpu/run_1.3b.sh
+bash training_scripts/opt/single_gpu/run_1.3b.sh
 
 # Evaluate the model
 bash evaluation_scripts/run_prompt.sh
@@ -209,7 +209,7 @@ bash evaluation_scripts/run_prompt.sh
 cd training/step2_reward_model_finetuning
 
 # Run the training script
-bash training_scripts/single_gpu/run_350m.sh
+bash training_scripts/opt/single_gpu/run_350m.sh
 
 # Evaluate the model
 bash evaluation_scripts/run_eval.sh
@@ -237,7 +237,7 @@ As the most complex step of the entire 3-step InstructGPT pipeline, DeepSpeed Ch
 cd training/step3_rlhf_finetuning/
 
 # Run the training script
-bash training_scripts/single_gpu/run_1.3b.sh
+bash training_scripts/opt/single_gpu/run_1.3b.sh
 ```
 </p></details>

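All three commands change the same way: the per-step run scripts now live under a model-family subdirectory of `training_scripts/`. A minimal sketch of that path rewrite (the `relocate` helper and the step list are hypothetical illustration, not part of the commit):

```python
# Each per-step script path gains an "opt/" model-family segment
# directly under training_scripts/.
old_paths = [
    "training_scripts/single_gpu/run_1.3b.sh",  # step 1
    "training_scripts/single_gpu/run_350m.sh",  # step 2
    "training_scripts/single_gpu/run_1.3b.sh",  # step 3
]

def relocate(path: str) -> str:
    """Insert the 'opt' family directory after 'training_scripts'."""
    return path.replace("training_scripts/", "training_scripts/opt/", 1)

new_paths = [relocate(p) for p in old_paths]
print(new_paths[0])  # → training_scripts/opt/single_gpu/run_1.3b.sh
```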
applications/DeepSpeed-Chat/train.py (1 addition & 1 deletion)

@@ -127,7 +127,7 @@ def get_script(args, step_num):
     script = os.path.join(
         os.getcwd(),
         step_dirs[step_num],
-        "training_scripts",
+        "training_scripts/opt/",
         args.deployment_type,
         f"run_{model_size}.sh",
     )

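The effect of the one-line change in `get_script` can be checked in isolation; this standalone sketch mirrors the `os.path.join` call after the commit, with hypothetical example values standing in for `step_dirs[step_num]`, `args.deployment_type`, and `model_size`:

```python
import os

# Mirrors the updated os.path.join call: the "opt" model-family folder
# now sits between "training_scripts" and the deployment type.
step_dir = "training/step1_supervised_finetuning"  # hypothetical example value
deployment_type = "single_gpu"                     # hypothetical example value
model_size = "1.3b"                                # hypothetical example value

script = os.path.join(
    os.getcwd(),
    step_dir,
    "training_scripts/opt/",  # was "training_scripts"
    deployment_type,
    f"run_{model_size}.sh",
)
print(os.path.relpath(script))
```

Note that `os.path.join` normalizes the trailing slash in `"training_scripts/opt/"`, so the resulting path is `.../training_scripts/opt/single_gpu/run_1.3b.sh`, matching the script locations used in the updated READMEs.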
applications/DeepSpeed-Chat/training/README.md (6 additions & 6 deletions)

@@ -55,18 +55,18 @@ We are sharing our training logs for all three steps for an OPT-1.3b actor and O
 
 | Step | Run Script | Training Log |
 |--------------|-----------|------------|
-| 1 | [single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_scripts/single_node/run_1.3b.sh) | [opt-1.3b-globalBatchSize128.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_log_output/opt-1.3b-globalBatchSize128.log) |
-| 2 | [single_node/run_350m.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/run_350m.sh) | [opt-350m_globalBatchSize-64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_log_output/opt-350m_globalBatchSize-64.log) |
-| 3 | [single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/single_node/run_1.3b.sh) | [actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_log_output/actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log) |
+| 1 | [opt/single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_scripts/opt/single_node/run_1.3b.sh) | [opt-1.3b-globalBatchSize128.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_log_output/opt-1.3b-globalBatchSize128.log) |
+| 2 | [opt/single_node/run_350m.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/opt/single_node/run_350m.sh) | [opt-350m_globalBatchSize-64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_log_output/opt-350m_globalBatchSize-64.log) |
+| 3 | [opt/single_node/run_1.3b.sh](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/opt/single_node/run_1.3b.sh) | [actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_log_output/actor_opt-1.3b_critic_opt-350m_globalBatchSize64.log) |
 
 ### Characterization Scripts
 Scripts for sweeping training across various parameters (Zero Stage, Offload, Lora, etc) are available for Step 1, 2, and 3. These scripts can be further extended to sweep across additional parameters such as learning rate.
 
 | Step | Sweep Script | README |
 |--------------|-----------|-----------|
-| 1 | [run_step1_sweep.sh](./step1_supervised_finetuning/training_scripts/single_node/sweep/run_step1_sweep.sh) | [README](./step1_supervised_finetuning/training_scripts/single_node/sweep/README.md) |
-| 2 | [run_step2_sweep.sh](./step2_reward_model_finetuning/training_scripts/single_node/sweep/run_step2_sweep.sh) | [README](./step2_reward_model_finetuning/training_scripts/single_node/sweep/README.md) |
-| 3 | [run_step3_sweep.sh](./step3_rlhf_finetuning/training_scripts/single_node/sweep/run_step3_sweep.sh) | [README](./step3_rlhf_finetuning/training_scripts/single_node/sweep/README.md) |
+| 1 | [run_step1_sweep.sh](./step1_supervised_finetuning/training_scripts/opt/single_node/sweep/run_step1_sweep.sh) | [README](./step1_supervised_finetuning/training_scripts/opt/single_node/sweep/README.md) |
+| 2 | [run_step2_sweep.sh](./step2_reward_model_finetuning/training_scripts/opt/single_node/sweep/run_step2_sweep.sh) | [README](./step2_reward_model_finetuning/training_scripts/opt/single_node/sweep/README.md) |
+| 3 | [run_step3_sweep.sh](./step3_rlhf_finetuning/training_scripts/opt/single_node/sweep/run_step3_sweep.sh) | [README](./step3_rlhf_finetuning/training_scripts/opt/single_node/sweep/README.md) |
 
 ### Others
 RLHF (Reinforcement Learning for Human Feedback) training is still an open problem, and DeepSpeed-Chat is designed to be a starting point for researchers and practitioners to work on it with an efficient and fast training experience. The Hybrid-Engine and other efficient components, like LoRA, can be inherited from DeepSpeed-Chat, allowing you to develop your own RLHF training pipeline for exploration, research, and other purposes.

applications/DeepSpeed-Chat/training/step1_supervised_finetuning/README.md (1 addition & 1 deletion)

@@ -5,7 +5,7 @@ Supervised finetuning (SFT) is very similar to standard language model finetunin
 We provide multiple scripts for training on single GPUs (e.g., a single A6000-48G, V100-32G, A100-40G, etc.), single nodes (e.g., 8/16x V100-32G, 8 A100-40G/80G), and multiple nodes setting (e.g., 64x A100-80G), which can be found in the 'training_scripts' directory. For example, if you have a single A6000-48G, you can simply run the corresponding script.
 
 ```bash
-training_scripts/single_gpu/run_1.3b.sh
+training_scripts/opt/single_gpu/run_1.3b.sh
 ```
 
 to train a OPT-1.3b model. It is easy to extend our single-node script to multi-node system.
