[SPARK-41388][K8S] getReusablePVCs should ignore recently created PVCs in the previous batch
#38912
.thenReturn(Seq(persistentVolumeClaim("pvc-0", "gp2", "200Gi")).asJava)
val pvc = persistentVolumeClaim("pvc-0", "gp2", "200Gi")
pvc.getMetadata
  .setCreationTimestamp(Instant.now().minus(podAllocationDelay + 1, MILLIS).toString)
This is an existing test case for SPARK-35416: Support PersistentVolumeClaim Reuse.
Since our test framework doesn't fill in CreationTimestamp, this PR sets it explicitly.
Lines 224 to 227 in 9482906
.withNewMetadata()
  .withName(claimName)
  .addToLabels(SPARK_APP_ID_LABEL, TEST_SPARK_APP_ID)
  .endMetadata()

Could you review this when you have some time, @viirya?
viirya left a comment
The rationale makes sense.
Thank you so much!

All tests passed. Merged to master/3.3/3.2.
…VCs in the previous batch

This PR aims to prevent `getReusablePVCs` from choosing recently created PVCs in the very previous batch by excluding newly created PVCs whose creation time is within `spark.kubernetes.allocation.batch.delay`.

In case of slow K8s control plane situation where `spark.kubernetes.allocation.batch.delay` is too short relatively or `spark.kubernetes.executor.enablePollingWithResourceVersion=true` is used, `onNewSnapshots` may not return the full list of executor pods created by the previous batch. This sometimes makes Spark driver think the PVCs in the previous batch are reusable for the next batch.

No.

Pass the CIs with the newly created test case.

Closes #38912 from dongjoon-hyun/SPARK-41388.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit e234cd8)
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 651f5da)
Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?

This PR aims to prevent `getReusablePVCs` from choosing PVCs that were created in the immediately preceding batch, by excluding PVCs whose creation time is within `spark.kubernetes.allocation.batch.delay`.

### Why are the changes needed?

When the K8s control plane is slow, so that `spark.kubernetes.allocation.batch.delay` is relatively too short, or when `spark.kubernetes.executor.enablePollingWithResourceVersion=true` is used, `onNewSnapshots` may not return the full list of executor pods created by the previous batch. This sometimes makes the Spark driver think that the PVCs from the previous batch are reusable for the next batch.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs with the newly created test case.
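
The age-based exclusion described above can be illustrated with a small, self-contained sketch. This is not the code from the PR: the `Pvc` case class, the `ReusablePvcSketch` object, the `reusablePvcs` signature, and the `pvcsInUse` parameter are all hypothetical stand-ins chosen for illustration, and the real `getReusablePVCs` works against Kubernetes client objects rather than plain case classes. Only the idea is taken from the description: a PVC is treated as reusable only if it was created more than `podAllocationDelay` milliseconds ago.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit.MILLIS

// Hypothetical stand-in for a Kubernetes PersistentVolumeClaim; only the
// fields needed for this illustration are modeled.
final case class Pvc(name: String, creationTimestamp: String)

object ReusablePvcSketch {
  // Keep only PVCs that are not mounted by a known pod AND were created
  // more than `podAllocationDelay` milliseconds before `now`.
  // PVCs younger than the delay may belong to executor pods from the
  // previous batch that the driver has not observed yet, so they are skipped.
  def reusablePvcs(
      candidates: Seq[Pvc],
      pvcsInUse: Set[String],
      podAllocationDelay: Long,
      now: Instant): Seq[Pvc] = {
    val threshold = now.minus(podAllocationDelay, MILLIS)
    candidates.filter { pvc =>
      !pvcsInUse.contains(pvc.name) &&
        Instant.parse(pvc.creationTimestamp).isBefore(threshold)
    }
  }
}

object ReusablePvcExample {
  // Arbitrary fixed timestamp so the example is deterministic.
  val now: Instant = Instant.parse("2022-12-05T00:00:10Z")
  val delay: Long = 1000L

  // Created just over one delay ago: old enough to be considered.
  val oldPvc: Pvc = Pvc("pvc-0", now.minus(delay + 1, MILLIS).toString)
  // Created "now": still inside the delay window, so it is excluded.
  val freshPvc: Pvc = Pvc("pvc-1", now.toString)

  val result: Seq[Pvc] =
    ReusablePvcSketch.reusablePvcs(Seq(oldPvc, freshPvc), Set.empty, delay, now)
}
```

The `delay + 1` offset mirrors the updated test above, which sets the creation timestamp to `podAllocationDelay + 1` milliseconds in the past so that the existing reuse test case still sees its PVC as eligible.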