[GCP] Add retry for transient error during launching GCP clusters (#2669)
* Add retry for flaky error during launching GCP clusters
* handle error
* format
* Do not log out stderr
* Add retry for gcloud crash
* fix retry return code
         # Retry. Unlikely will succeed if it's due to no capacity.
-        logger.info(
-            'Retrying due to the possibly flaky RESOURCE_NOT_FOUND '
-            'error.')
+        logger.info('Retrying due to the possibly transient '
+                    'RESOURCE_NOT_FOUND error.')
+        logger.debug(f'-- Stderr --\n{stderr}\n ----')
+        return True
+
+    # "The resource 'projects/skypilot-375900/regions/us-central1/subnetworks/default' is not ready". Details: "[{'message': "The resource 'projects/xxx/regions/us-central1/subnetworks/default' is not ready", 'domain': 'global', 'reason': 'resourceNotReady'}]"> # pylint: disable=line-too-long
+    pattern = (r'is not ready(.*)\'reason\': \'resourceNotReady\'')
+    result = re.search(pattern, stderr)
+    if result is not None:
+        # Retry. Unlikely will succeed if it's due to no capacity.
+        logger.info('Retrying due to the possibly transient '
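
The `resourceNotReady` check in this change can be tried in isolation. The following is a minimal sketch, not the code in the commit: the pattern is copied from the diff, while the helper name `is_resource_not_ready` and the sample string (adapted from the comment in the diff) are illustrative assumptions.

```python
import re

# Pattern copied verbatim from the diff.
RESOURCE_NOT_READY_PATTERN = r'is not ready(.*)\'reason\': \'resourceNotReady\''


def is_resource_not_ready(stderr: str) -> bool:
    """Hypothetical helper: True if stderr looks like the transient error."""
    return re.search(RESOURCE_NOT_READY_PATTERN, stderr) is not None


# Sample error text adapted from the comment in the diff (project redacted).
sample = (
    '"The resource \'projects/xxx/regions/us-central1/subnetworks/default\' '
    'is not ready". Details: "[{\'message\': "The resource '
    '\'projects/xxx/regions/us-central1/subnetworks/default\' is not ready", '
    '\'domain\': \'global\', \'reason\': \'resourceNotReady\'}]"'
)

if __name__ == '__main__':
    print(is_resource_not_ready(sample))
    print(is_resource_not_ready('some unrelated failure'))
```

Note that `.` does not match newlines unless `re.DOTALL` is passed, so the pattern as written only matches when "is not ready" and the `'reason'` field appear on the same line of stderr.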