Fix typo in README #56
Merged
arashashari approved these changes on Sep 24, 2020
Fix #52
tjruwase added a commit that referenced this pull request on Apr 12, 2025
tjruwase added a commit that referenced this pull request on Apr 12, 2025
* Fast model checkpointing
* Support both legacy and serialized formats
* Add io_buffer_mb option
* Bug fix
* Force flush
* More model options; Refactor common codes
* --gpu option
* --half and more flexible options
* Add deepspeed.save_checkpoint()
* Free ds memory
* Improve repro
* Double I/O buffer (#56)
* Double I/O buffer (#60)
* Add checkpoint comparison (#62)
* Add checkpoint comparison
* Corrected a typo
Co-authored-by: Yang Li <[email protected]>
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* Perf statistics for save_checkpoint (#64)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* add logs for a100-80
* add torch* error log with half flag but without fused flag
* log for error
* local rank arg
* Handle local_rank arg (#78)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Single writer option (#79)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Allow missing folder
* DP writer refactor
* Update for DS; Add GDS
Signed-off-by: Olatunji Ruwase <[email protected]>
* Integrate GDS into deepspeed_model_save
---------
Signed-off-by: Olatunji Ruwase <[email protected]>
Co-authored-by: jerryyangli <[email protected]>
Co-authored-by: Yang Li <[email protected]>
Co-authored-by: GuanhuaWang <[email protected]>
tjruwase added a commit that referenced this pull request on Jun 9, 2025
* Fast model checkpointing
* Support both legacy and serialized formats
* Add io_buffer_mb option
* Bug fix
* Force flush
* More model options; Refactor common codes
* --gpu option
* --half and more flexible options
* Add deepspeed.save_checkpoint()
* Free ds memory
* Improve repro
* Double I/O buffer (#56)
* Double I/O buffer (#60)
* Add checkpoint comparison (#62)
* Add checkpoint comparison
* Corrected a typo
Co-authored-by: Yang Li <[email protected]>
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* Perf statistics for save_checkpoint (#64)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* add logs for a100-80
* add torch* error log with half flag but without fused flag
* log for error
* local rank arg
* Handle local_rank arg (#78)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Single writer option (#79)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Allow missing folder
* DP writer refactor
* Update for DS; Add GDS
Signed-off-by: Olatunji Ruwase <[email protected]>
* Integrate GDS into deepspeed_model_save
* Rebase fast persist (#184)
* Fast model checkpointing
* Support both legacy and serialized formats
* Add io_buffer_mb option
* Bug fix
* Force flush
* More model options; Refactor common codes
* --gpu option
* --half and more flexible options
* Add deepspeed.save_checkpoint()
* Free ds memory
* Improve repro
* Double I/O buffer (#56)
* Double I/O buffer (#60)
* Add checkpoint comparison (#62)
* Add checkpoint comparison
* Corrected a typo
Co-authored-by: Yang Li <[email protected]>
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* Perf statistics for save_checkpoint (#64)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* add logs for a100-80
* add torch* error log with half flag but without fused flag
* log for error
* local rank arg
* Handle local_rank arg (#78)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Single writer option (#79)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Allow missing folder
* DP writer refactor
* Update for DS; Add GDS
Signed-off-by: Olatunji Ruwase <[email protected]>
* Integrate GDS into deepspeed_model_save
---------
Signed-off-by: Olatunji Ruwase <[email protected]>
Co-authored-by: jerryyangli <[email protected]>
Co-authored-by: Yang Li <[email protected]>
Co-authored-by: GuanhuaWang <[email protected]>
* Move folder
Signed-off-by: Olatunji Ruwase <[email protected]>
* Remove folder
Signed-off-by: Olatunji Ruwase <[email protected]>
* More cleanup
Signed-off-by: Olatunji Ruwase <[email protected]>
* torch changes
Signed-off-by: Olatunji Ruwase <[email protected]>
* sglang+zero_inference
* Remove file
* Add offload configs
* Add pin_memory
* Cleanup scripts
* SGLang README
* Remove file
---------
Signed-off-by: Olatunji Ruwase <[email protected]>
Co-authored-by: jerryyangli <[email protected]>
Co-authored-by: Yang Li <[email protected]>
Co-authored-by: GuanhuaWang <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Co-authored-by: Hongwei Chen <[email protected]>
Co-authored-by: Zhipeng Wang <[email protected]>
Remove non-existent package requirement: NVIDIA/Megatron-LM#4