
Allow user to opt-in to Node draining during Cluster delete #9692

@dlipovetsky

Description


What would you like to be added (User Story)?

As an operator, I would like to opt in to Node draining during Cluster delete, so that I can delete Clusters managed by Infrastructure Providers that impose additional conditions on deleting underlying VMs, e.g. detaching all secondary storage (which in turn can require all Pods using this storage to be evicted).

Detailed Description

When it reconciles a Machine resource marked for deletion, the Machine controller attempts to drain the corresponding Node. However, if the Cluster resource is itself marked for deletion, the controller skips the drain. This behavior was added in #2746 to decrease the time it takes to delete a cluster.
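For illustration, here is a minimal Go sketch of that decision, not the actual controller source: `shouldDrainNode` is an invented helper, while the exclude-node-draining annotation is the existing per-Machine opt-out Cluster API already provides.

```go
// Illustrative sketch only, not the actual Cluster API source. shouldDrainNode
// is an invented helper; the types and the exclude-node-draining annotation
// come from sigs.k8s.io/cluster-api.
package sketch

import clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"

func shouldDrainNode(cluster *clusterv1.Cluster, machine *clusterv1.Machine) bool {
	// Skip the drain entirely when the owning Cluster is being deleted
	// (the optimization introduced in #2746).
	if !cluster.DeletionTimestamp.IsZero() {
		return false
	}
	// A Machine can also opt out of draining individually.
	if _, ok := machine.Annotations[clusterv1.ExcludeNodeDrainingAnnotation]; ok {
		return false
	}
	return true
}
```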

As part of reconciling a Machine marked for deletion, the Machine controller marks the referenced InfraMachine resource for deletion. The infrastructure provider's InfraMachine controller reconciles that delete, and it can refuse to delete the underlying infrastructure until some condition is met. That condition could be that the corresponding Node must first be drained.

For example, if I create a cluster with the VMware Cloud Director infrastructure provider (CAPVCD), and use secondary storage ("VCD Named Disk Volumes"), then the delete cannot proceed until I drain all Nodes with Pods that use this storage. Cluster API does not drain the Nodes.

CAPVCD requires that the Node corresponding to the InfraMachine be drained, because it requires all secondary storage to be detached from the VM underlying the InfraMachine. Detaching this storage is the responsibility of the VCD CSI driver, which refuses to do so until all volumes that use this storage can be unmounted; in turn, that means all Pods using these volumes must be evicted from the Node.

Because Cluster API does not drain Nodes during Cluster delete, the volumes are never unmounted, and CAPVCD refuses to delete the underlying VMs. The Cluster delete cannot complete until the Nodes are drained manually.

Anything else you would like to add?

As @lubronzhan helpfully points out below, draining on Cluster delete is possible by implementing and deploying a PreDrainDeleteHook.
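For reference, a rough sketch of that workaround. The annotation prefix is the one Cluster API defines for pre-drain delete hooks; the hook name and owner value are placeholders for an external controller you would have to write, which drains the Node itself and then removes the annotation so deletion can proceed.

```go
// Sketch of the PreDrainDeleteHook workaround. The annotation prefix is the
// one Cluster API defines for pre-drain hooks; the hook name and owner value
// are placeholders for a controller you would have to write and deploy.
package sketch

import clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"

const preDrainHook = "pre-drain.delete.hook.machine.cluster.x-k8s.io/drain-on-cluster-delete"

// addPreDrainHook registers the hook on a Machine. The Machine controller
// will not proceed past the pre-drain step while this annotation exists.
func addPreDrainHook(m *clusterv1.Machine) {
	if m.Annotations == nil {
		m.Annotations = map[string]string{}
	}
	m.Annotations[preDrainHook] = "example-drain-controller"
}
```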

It is my understanding that the Machine controller skips the drain on Cluster delete as an optimization, not for correctness. For some infrastructure providers, however, draining is required for correctness. Given that, I think Cluster API itself should allow users to disable this optimization in favor of correctness, and requiring users to implement and deploy a webhook is too high a bar. I would prefer a simpler solution, e.g., an annotation or an API field.
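Purely to illustrate the shape of such a solution, here is a hypothetical Cluster-level annotation the Machine controller could consult before skipping the drain; neither the key nor the check exists in Cluster API today.

```go
// Hypothetical sketch: neither the annotation key nor this check exists in
// Cluster API; they only illustrate the kind of opt-in proposed above.
package sketch

import clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"

// drainOnClusterDeleteAnnotation is an invented opt-in key.
const drainOnClusterDeleteAnnotation = "cluster.x-k8s.io/drain-on-cluster-delete"

// drainDuringClusterDelete reports whether the user opted in to Node
// draining even though the Cluster itself is being deleted.
func drainDuringClusterDelete(cluster *clusterv1.Cluster) bool {
	_, ok := cluster.Annotations[drainOnClusterDeleteAnnotation]
	return ok
}
```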

Label(s) to be applied

/kind feature

Metadata


Labels

help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/api-change: Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
kind/feature: Categorizes issue or PR as related to a new feature.
priority/backlog: Higher priority than priority/awaiting-more-evidence.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
