diff --git a/README.md b/README.md
index 09a7a3a40..014f5683f 100644
--- a/README.md
+++ b/README.md
@@ -16,6 +16,10 @@ Model compression examples.
 ## 5. Benchmarks
 All benchmarks that use the DeepSpeed library are maintained in this folder.
 
+# Build Pipeline Status
+| Description | Status |
+| ----------- | ------ |
+| Integrations | [![nv-ds-chat](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-ds-chat.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-ds-chat.yml) |
 
 # Contributing
 
diff --git a/applications/DeepSpeed-Chat/README.md b/applications/DeepSpeed-Chat/README.md
index 6099ae299..58b1e5042 100644
--- a/applications/DeepSpeed-Chat/README.md
+++ b/applications/DeepSpeed-Chat/README.md
@@ -48,6 +48,7 @@ A fast, affordable, scalable and open system framework for enabling end-to-end R
 - [🐼 Serving Your Model: Plug-in and Test!](#-serving-plug-in-your-final-model-trained-by-deepspeed-chat-and-test-it-out)
 - [🔥 Training Performance Evaluation 🔥](#-training-performance-evaluation-)
 - [😽 Supported Models 😽](#-supported-models-)
+- [🔬 Build Pipeline Status 🔬](#-build-pipeline-status-)
 - [⚓ Documentation and Tutorial ⚓](#-documentation-and-tutorial-)
 - [🌱 DeepSpeed Chat's Roadmap 🌱](#-deepspeed-chats-roadmap-)
 - [💬 DeepSpeed Chat and DeepSpeed Community 💬](#-deepspeed-chat-and-deepspeed-community-)
@@ -387,6 +388,33 @@ model family | size range
 
 * All performance and accuracy tests have been performed using the OPT model family only. For other models, please see our training_scripts folder on how to change model families.
 
+
+## 🔬 Build Pipeline Status 🔬
+
+| Description | Status |
+| ----------- | ------ |
+| Integrations | [![nv-ds-chat](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-ds-chat.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-ds-chat.yml) |
+
+A DeepSpeed CI workflow runs the DeepSpeed-Chat Step 3 pipeline nightly across the following test configurations:
+
+Models
+```
+Actor: facebook/opt-125m
+Critic: facebook/opt-125m (trained in DS-Chat Step 2)
+```
+
+Parameters comprising test matrix
+```
+Zero Stage: 2, 3
+Hybrid Engine: True, False
+Offload: True, False
+LoRA: True, False
+```
+
+Each configuration (16 total) runs through a limited number of Step 3 non-overflow training steps (i.e. steps where neither actor nor critic overflow) and saves the actor/critic models.
+Assertions are used to check if the training pipeline executed correctly and if the actor and critic models were saved properly.
+
+
 ## ⚓ Documentation and Tutorial ⚓
 
 For more APIs, example scripts, and evaluation results, please refer to
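
Review note: the "16 total" figure in the added README text follows from the four two-valued parameters in the test matrix (2 × 2 × 2 × 2). The sketch below enumerates those combinations to confirm the count. It is only an illustration: the dictionary keys and the `enumerate_configs` helper are hypothetical names chosen here, not the arguments or functions used by the actual `nv-ds-chat` workflow.

```python
from itertools import product

# Parameter values copied from the test matrix in the README hunk.
# The keys are illustrative names (an assumption), not the workflow's real argument names.
TEST_MATRIX = {
    "zero_stage": [2, 3],
    "hybrid_engine": [True, False],
    "offload": [True, False],
    "lora": [True, False],
}


def enumerate_configs(matrix):
    """Return one dict per combination of parameter values (hypothetical helper)."""
    names = list(matrix)
    return [dict(zip(names, values)) for values in product(*matrix.values())]


configs = enumerate_configs(TEST_MATRIX)
print(len(configs))  # prints 16
```

Each entry in `configs` corresponds to one Step 3 run described in the README text, so the count matches the documented total.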