
[Horovod] ModelCheckpoint and EarlyStopping CBs hit errors with Torch 1.13+ #15802

@chongxiaoc

Bug description

Since PyTorch 1.13, we have observed that the ModelCheckpoint and EarlyStopping callbacks hit an undefined symbol error when used with the Horovod strategy.

Details and examples are in horovod/horovod@e392eb9

It is reproducible with Torch 1.13 alone, but I think the underlying issue is that reduce_op from DDP should not be mixed with Horovod: on 1.13, ReduceOp.__eq__ only accepts ReduceOp or RedOpType arguments, so comparing a ReduceOp against None raises a TypeError instead of returning False. This line in PTL hits the error:

https://github.com/Lightning-AI/lightning/blob/master/src/pytorch_lightning/strategies/horovod.py#L179

How to reproduce the bug

from torch.distributed import ReduceOp

op = None
# On torch 1.13+, the membership test falls back to ReduceOp.__eq__(None),
# which raises a TypeError instead of returning False:
op in (ReduceOp.SUM, None)
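
For comparison, here is a minimal sketch of a check that does not trip the new behavior (an illustration, not the actual PTL code): testing for None with `is` first means ReduceOp.__eq__ is never invoked with a non-ReduceOp argument.

from torch.distributed import ReduceOp

op = None
# Short-circuit on None before any ReduceOp comparison, so that
# ReduceOp.__eq__ never receives a non-ReduceOp argument.
safe = op is None or op == ReduceOp.SUM   # True, no TypeError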

Error messages and logs

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: __eq__(): incompatible function arguments. The following argument types are supported:
        1. (self: torch._C._distributed_c10d.ReduceOp, arg0: c10d::ReduceOp::RedOpType) -> bool
        2. (self: torch._C._distributed_c10d.ReduceOp, arg0: torch._C._distributed_c10d.ReduceOp) -> bool

    Invoked with: <torch.distributed.distributed_c10d.ReduceOp object at 0x7fba78c9e0b0>, None

Environment


#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 1.10): 1.13+
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):

More info

Comments and suggestions are welcome.
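
One possible direction, sketched under the assumption that the strategy only needs to map the incoming reduce_op onto Horovod's own constants (hvd.Sum / hvd.Average): do an isinstance/None check up front so that DDP's ReduceOp is never compared against None or a string. The _to_hvd_op name and the exact None/"sum" mapping are assumptions for illustration, not the current PTL implementation.

import horovod.torch as hvd
from torch.distributed import ReduceOp

def _to_hvd_op(reduce_op):
    # Hypothetical helper: pick the Horovod op without ever calling
    # ReduceOp.__eq__ with a non-ReduceOp argument (which raises on torch 1.13+).
    if isinstance(reduce_op, ReduceOp):
        return hvd.Sum if reduce_op == ReduceOp.SUM else hvd.Average
    if reduce_op in (None, "sum"):
        return hvd.Sum
    return hvd.Average

A guard along these lines keeps ReduceOp comparisons confined to arguments that are actually ReduceOp instances, which is what the 1.13 __eq__ overloads expect.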

cc @awaelchli
