Commit 27bcf07

refactor: address pr feedback

1 parent 6b15d54, commit 27bcf07

File tree: 1 file changed, +34 -18 lines

vcluster/configure/vcluster-yaml/control-plane/components/backing-store/etcd/embedded.mdx (34 additions, 18 deletions)
@@ -158,11 +158,15 @@ kubectl logs [[VAR:VCLUSTER NAME:my-vcluster]]-0 -n [[VAR:NAMESPACE:vcluster-my-
<TabItem value="first-replica-failing" label="First replica is failing">

:::warning
-Before attempting any recovery procedure, create a backup of the virtual cluster namespace on the host cluster. If using namespace syncing, back up all synced namespaces as well.
+Before attempting any recovery procedure, create a backup of your virtual cluster using `vcluster snapshot create`, or manually back up the virtual cluster namespace on the host cluster. If using namespace syncing, back up all synced namespaces as well.
:::

The recovery procedure depends on your StatefulSet `podManagementPolicy` configuration. vCluster version 0.20 and later use `Parallel` by default. Earlier versions used `OrderedReady`.

+:::info
+If more than one pod is down with `podManagementPolicy: OrderedReady`, you must first [migrate to `Parallel`](#migrate-to-parallel) before attempting recovery.
+:::
+
Check your configuration:

<InterpolatedCodeBlock
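Rendered, the configuration check boils down to a one-liner. A hedged sketch using the doc's placeholder names (`my-vcluster`, `vcluster-my-team`); it prints `unknown` when `kubectl` is missing or no host cluster is reachable:

```shell
# Print the StatefulSet's podManagementPolicy (placeholder names from the docs;
# falls back to "unknown" when kubectl or the host cluster is unavailable).
policy=$(kubectl get statefulset my-vcluster -n vcluster-my-team \
  -o jsonpath='{.spec.podManagementPolicy}' 2>/dev/null) || policy=unknown
echo "podManagementPolicy: ${policy:-unknown}"
```

If the field prints empty, note that `OrderedReady` is the Kubernetes default when `podManagementPolicy` is unset.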
@@ -175,7 +179,16 @@ Check your configuration:

<Flow>
<Step title="Delete the failed pod and PVC">
-Delete the corrupted pod and PVC for replica-0:
+First, identify the PVC for replica-0:
+
+<InterpolatedCodeBlock
+code={`kubectl get pvc -l [[VAR:VCLUSTER LABEL:app=vcluster]] -n [[VAR:NAMESPACE:vcluster-my-team]]`}
+language="bash"
+/>
+
+<br />
+
+Delete the corrupted pod and its PVC:

<InterpolatedCodeBlock
code={`kubectl delete pod [[VAR:VCLUSTER NAME:my-vcluster]]-0 -n [[VAR:NAMESPACE:vcluster-my-team]]
@@ -185,7 +198,7 @@ kubectl delete pvc [[VAR:PVC PREFIX:data]]-[[VAR:VCLUSTER NAME:my-vcluster]]-0 -

<br />

-The pod restarts with a new empty PVC. After 1-3 pod restarts, the automatic recovery adds it back to the etcd cluster.
+The pod restarts with a new empty PVC. The initial join attempts fail because the new member tries to join the existing etcd cluster without the required data. After 1-3 pod restarts, vCluster's automatic recovery detects the empty member and adds it back as a learner, letting it sync data from the healthy members before rejoining the cluster.
</Step>

<Step title="Monitor recovery">
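Since recovery is expected to take 1-3 restarts, the restart count is a simple progress signal. A hedged sketch with the doc's placeholder names; it prints `n/a` when no host cluster is reachable:

```shell
# Show how many times replica-0 has restarted while rejoining the etcd cluster
# (placeholder names from the docs; "n/a" when the cluster is unreachable).
restarts=$(kubectl get pod my-vcluster-0 -n vcluster-my-team \
  -o jsonpath='{.status.containerStatuses[0].restartCount}' 2>/dev/null) || restarts=n/a
echo "replica-0 restarts: ${restarts:-n/a}"
```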
@@ -239,29 +252,21 @@ Delete the StatefulSet without deleting the pods:
</Step>

<Step title="Update configuration to Parallel">
+<a id="migrate-to-parallel"></a>
+
Update your virtual cluster configuration to use `Parallel` pod management policy.

If using a VirtualClusterInstance:

<InterpolatedCodeBlock
-code={`kubectl edit virtualclusterinstance [[VAR:VCLUSTER NAME:my-vcluster]] -n [[VAR:NAMESPACE:vcluster-my-team]]`}
+code={`kubectl patch virtualclusterinstance [[VAR:VCLUSTER NAME:my-vcluster]] -n [[VAR:NAMESPACE:vcluster-my-team]] \\
+  --type merge \\
+  -p '{"spec":{"controlPlane":{"statefulSet":{"scheduling":{"podManagementPolicy":"Parallel"}}}}}'`}
language="bash"
/>

<br />

-Add or update the following configuration:
-
-<InterpolatedCodeBlock
-code={`controlPlane:
-  statefulSet:
-    scheduling:
-      podManagementPolicy: Parallel`}
-language="yaml"
-/>
-
-<br />
-
If using Helm, update your `values.yaml` and run:

<InterpolatedCodeBlock
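The merge-patch payload is easy to get wrong by hand. A quick local sanity check before running `kubectl patch` (assumes only `python3` on the PATH; no cluster needed):

```shell
# The merge patch from the docs; parse it locally to catch quoting mistakes
# before applying it with kubectl patch.
PATCH='{"spec":{"controlPlane":{"statefulSet":{"scheduling":{"podManagementPolicy":"Parallel"}}}}}'
echo "$PATCH" | python3 -c 'import json, sys; json.load(sys.stdin)' \
  && echo "patch is valid JSON"
```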
@@ -279,7 +284,18 @@ The StatefulSet is recreated with `Parallel` policy and pods pick up the existin
</Step>

<Step title="Delete the failed pod and PVC">
-Now follow the same procedure as for `Parallel` mode:
+Now follow the same procedure as for `Parallel` mode.
+
+First, identify the PVC for replica-0:
+
+<InterpolatedCodeBlock
+code={`kubectl get pvc -l [[VAR:VCLUSTER LABEL:app=vcluster]] -n [[VAR:NAMESPACE:vcluster-my-team]]`}
+language="bash"
+/>
+
+<br />
+
+Delete the corrupted pod and its PVC:

<InterpolatedCodeBlock
code={`kubectl delete pod [[VAR:VCLUSTER NAME:my-vcluster]]-0 -n [[VAR:NAMESPACE:vcluster-my-team]]
@@ -289,7 +305,7 @@ kubectl delete pvc [[VAR:PVC PREFIX:data]]-[[VAR:VCLUSTER NAME:my-vcluster]]-0 -

<br />

-The pod restarts with a new empty PVC and automatic recovery adds it back to the cluster after 1-3 pod restarts.
+The pod restarts with a new empty PVC. The initial join attempts fail because the new member tries to join the existing etcd cluster without the required data. After 1-3 pod restarts, vCluster's automatic recovery detects the empty member and adds it back as a learner, letting it sync data from the healthy members before rejoining the cluster.
</Step>
</Flow>
