🚀 Feature
Motivation
The motivation is to have a separate `accelerator_strategy` flag to support passing training type aliases (`ddp`, `ddp_spawn`, etc.) and custom `TrainingTypePlugin` objects:
```python
Trainer(strategy="ddp", accelerator="gpu", devices=4)
Trainer(strategy=DDPPlugin(find_unused_parameters=False), accelerator="gpu", devices=4)
Trainer(strategy="ddp_spawn", accelerator="cpu", devices=4)
Trainer(strategy="ddp_spawn", accelerator="tpu", devices=4)
```
Background
At the moment, there is a single `accelerator` flag that is tied to both Accelerators and Training Type plugins. We wish to have them decoupled and would like to add a separate `accelerator_strategy` flag for Training Type plugins!
```python
trainer = Trainer(accelerator=GPUAccelerator(...))
trainer = Trainer(accelerator="ddp_spawn")
```
Alternate flags to set Training Types
- `accelerator`
  - type: `Optional[Union[str, Accelerator]] = None`
  - Supports training types and Accelerator objects
- `distributed_backend`
  - type: `Optional[str] = None`
  - Deprecated, should use `accelerator` instead
- `plugins`
  - type: `Optional[Union[List[Union[Plugin, ClusterEnvironment, str]], Plugin, ClusterEnvironment, str]] = None`
  - Supports custom Lightning plugins & environments
What's the difference between passing the training type to `accelerator`, `distributed_backend`, or `plugins`?
`accelerator` and `distributed_backend` only support `DistributedType` aliases (`ddp`, `ddp_spawn`, etc.), whereas `plugins` also supports custom training types (`DDPPlugin()`, `ddp_find_unused_parameters_false`, etc.).
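For illustration, a rough sketch of the current behaviour under these flags (the `gpus` argument and import paths reflect the v1.4-era API and are assumptions here, not part of the proposal):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPPlugin

# `accelerator` / `distributed_backend` only accept DistributedType aliases:
Trainer(accelerator="ddp_spawn", gpus=2)
Trainer(distributed_backend="ddp", gpus=2)  # deprecated spelling of the same thing

# Custom training types have to go through `plugins`:
Trainer(plugins=DDPPlugin(find_unused_parameters=False), gpus=2)
Trainer(plugins="ddp_find_unused_parameters_false", gpus=2)
```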
Proposed Solution
- Introduce a `strategy` flag to the Trainer.
- Support the exceptions and deprecations mentioned below.
Exceptions:
```python
Trainer(distributed_backend="ddp_cpu", strategy="ddp_spawn")
Trainer(accelerator="ddp", strategy="ddp_spawn")
Trainer(plugins="ddp_find_unused_parameters_false", strategy="ddp_spawn")
```
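A possible way to enforce these exceptions, sketched purely for illustration (the helper name `_validate_strategy_flag` and the alias set are made up; only `MisconfigurationException` and `TrainingTypePlugin` are existing Lightning names):

```python
from pytorch_lightning.plugins import TrainingTypePlugin
from pytorch_lightning.utilities.exceptions import MisconfigurationException

# Illustrative only; the real registry of training type aliases is larger.
_TRAINING_TYPE_ALIASES = {"ddp", "ddp_spawn", "ddp_cpu", "ddp_find_unused_parameters_false"}


def _validate_strategy_flag(strategy, accelerator=None, distributed_backend=None, plugins=None):
    """Hypothetical guard: once `strategy` is set, the legacy flags must not
    also carry a training type."""
    if strategy is None:
        return
    for name, value in (
        ("distributed_backend", distributed_backend),
        ("accelerator", accelerator),
        ("plugins", plugins),
    ):
        if isinstance(value, TrainingTypePlugin) or (
            isinstance(value, str) and value in _TRAINING_TYPE_ALIASES
        ):
            raise MisconfigurationException(
                f"`Trainer(strategy={strategy!r}, {name}={value!r})` is not allowed: "
                "please set the training type only through `strategy`."
            )
```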
Deprecations: (Deprecated in v1.5 & will be removed in v1.6)
- Passing a training type to the `accelerator` flag
- Passing a training type to the `plugins` flag
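The deprecation path could be surfaced with a standard Python warning, sketched below (the helper name is hypothetical; Lightning has its own rank-zero deprecation utilities that the real implementation would likely use):

```python
import warnings


def _warn_training_type_on_legacy_flag(flag_name: str, value) -> None:
    """Hypothetical helper: warn when a training type is passed via a legacy flag."""
    warnings.warn(
        f"Passing a training type to `{flag_name}` (got {value!r}) is deprecated in v1.5"
        " and will be removed in v1.6. Please use `Trainer(strategy=...)` instead.",
        DeprecationWarning,
    )


# e.g. Trainer(accelerator="ddp_spawn") would trigger:
_warn_training_type_on_legacy_flag("accelerator", "ddp_spawn")
```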
Related PR: #8597
Related Issue: #6090
If you agree with this change, react with 🎉; if not, react with 🙅🏽 and leave a comment.
Alternatives
- Only deprecate passing the TrainingTypePlugin into the `plugins` argument, not the `accelerator` argument.
- Use the simpler `strategy` argument instead of `accelerator_strategy`.
Additional context
If you enjoy Lightning, check out our other projects! ⚡
- Metrics: Machine learning metrics for distributed, scalable PyTorch applications.
- Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, finetuning, and solving problems with deep learning.
- Bolts: Pretrained SOTA deep learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.
- Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers, leveraging PyTorch Lightning, Transformers, and Hydra.