Caching Introduced by Pull #374 Absorbs Errors and Impedes Error Retry #523

@githubwua

Description

The following commit introduced request caching.

85bf2bc

The caching layer catches and absorbs errors, so the retry library never sees them.

As a result, failed requests are not being caught and retried.
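
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the library's code: `FlakyBackend`, `retry_call`, `AbsorbingJob`, and `PropagatingJob` are hypothetical names invented for this example. It only shows the general pattern of a retry wrapper being unable to retry an exception that was caught and cached one level below it.

```python
class FlakyBackend:
    """Hypothetical stand-in for a remote API whose requests always fail."""

    def __init__(self):
        self.calls = 0

    def fetch(self):
        self.calls += 1
        raise ValueError("400 Syntax error")


def retry_call(func, attempts=3):
    """Toy retry loop: call func up to `attempts` times, then re-raise the last error."""
    last_exc = None
    for _ in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_exc = exc
    raise last_exc


class AbsorbingJob:
    """Caches the first error inside done(), so the retry loop never sees it."""

    def __init__(self, backend):
        self.backend = backend
        self.cached_error = None

    def done(self):
        if self.cached_error is None:
            try:
                self.backend.fetch()
            except Exception as exc:
                self.cached_error = exc  # the error is swallowed here
        return True  # reports "finished" although the request failed

    def result(self):
        retry_call(self.done)  # done() never raises, so nothing is retried
        if self.cached_error is not None:
            raise self.cached_error


class PropagatingJob:
    """Lets errors escape from done(), so the retry loop can act on them."""

    def __init__(self, backend):
        self.backend = backend

    def done(self):
        self.backend.fetch()
        return True

    def result(self):
        retry_call(self.done)


def count_calls(job_cls):
    """Run one job to failure and report how many requests were sent."""
    backend = FlakyBackend()
    try:
        job_cls(backend).result()
    except ValueError:
        pass
    return backend.calls


print(count_calls(AbsorbingJob))    # 1: the request is sent once and never retried
print(count_calls(PropagatingJob))  # 3: the request is retried on every attempt
```

In this sketch the absorbing variant sends a single request, while the propagating variant uses all three attempts. That mirrors the difference observed between 2.4.0 and 2.3.1 below.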

Environment details

Last Known good version 2.3.1

google-cloud-bigquery==2.3.1

First version that broke retry

google-cloud-bigquery==2.4.0

Steps to reproduce

  1. Run repro.py below with google-cloud-bigquery==2.3.1
    Result: The error is retried until the 60-second deadline expires.

  2. Run repro.py below with google-cloud-bigquery==2.4.0
    Result: The error is not retried and the script exits early.
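
For reference, the retry settings used in repro.py below (initial 1 s, multiplier 2, maximum 10 s, deadline 60 s) imply roughly the following schedule. This is a simplified sketch under stated assumptions: `nominal_delays` is a hypothetical helper, not part of google-api-core, and it ignores both request latency and any random jitter the real library may apply, so actual timings will differ.

```python
def nominal_delays(initial, maximum, multiplier, deadline):
    """Return the sleep times of a jitter-free exponential backoff that fit in the deadline."""
    delays = []
    elapsed = 0.0
    delay = initial
    # Keep scheduling sleeps while the next one still fits before the deadline.
    while elapsed + delay <= deadline:
        delays.append(delay)
        elapsed += delay
        # Grow the delay geometrically, capped at the configured maximum.
        delay = min(delay * multiplier, maximum)
    return delays


delays = nominal_delays(initial=1, maximum=10, multiplier=2, deadline=60)
print(delays)
print(len(delays), sum(delays))
```

Under these assumptions the request would be attempted several times over about a minute before the retry gives up, which is consistent with the 2.3.1 run ending in a deadline error rather than an immediate failure.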

Code example

from google import api_core
from google.cloud import bigquery
from google.api_core import retry

# Treat every exception as retryable, so even the BadRequest raised below should be retried
if_transient_error = retry.if_exception_type(Exception)

RETRY_INITIAL = 1
RETRY_MAXIMUM = 10
RETRY_MULTIPLIER = 2
RETRY_DEADLINE = 60

my_retry = retry.Retry(
    predicate=if_transient_error,
    initial=RETRY_INITIAL,
    maximum=RETRY_MAXIMUM,
    multiplier=RETRY_MULTIPLIER,
    deadline=RETRY_DEADLINE
)

# Print the library versions under test
print("bq version:", bigquery.__version__)
print("api-core version:", api_core.__version__)

client = bigquery.Client()
sql = "hoge"  # deliberately invalid SQL, so the request fails with 400 BadRequest
query_job = client.query(sql, retry=my_retry)
for row in query_job.result(retry=my_retry):
    print(row)

Pinned dependencies (pip requirements format):

google-api-core==1.23.0

# Last Known good version 2.3.1
#google-cloud-bigquery==2.3.1

# First version that broke retry
google-cloud-bigquery==2.4.0

Stack trace

# Error is not being retried starting from google-cloud-bigquery==2.4.0

$ python3 repro.py 
bq version: 2.4.0
api-core version: 1.23.0
Traceback (most recent call last):
  File "repro.py", line 31, in <module>
    for row in query_job.result(retry=my_retry):
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1160, in result
    super(QueryJob, self).result(retry=retry, timeout=timeout)
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 631, in result
    return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 134, in result
    raise self._exception
google.api_core.exceptions.BadRequest: 400 Syntax error: Expected end of input but got identifier "hoge" at [1:1]


# In google-cloud-bigquery==2.3.1, error is retried as configured

$ python3 repro.py 
bq version: 2.3.1
api-core version: 1.23.0
Traceback (most recent call last):
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/_http.py", line 438, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.BadRequest: 400 GET https://bigquery.googleapis.com/bigquery/v2/projects/wua-repro/queries/1ca1cb14-f9a5-4298-9058-91ab784be278?maxResults=0&location=US&prettyPrint=false: Syntax error: Expected end of input but got identifier "hoge" at [1:1]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 107, in _blocking_poll
    retry_(self._done_or_raise)(**kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 85, in _done_or_raise
    if not self.done(**kwargs):
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1022, in done
    self._query_results = self._client._get_query_results(
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 1557, in _get_query_results
    resource = self._call_api(
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 636, in _call_api
    return call()
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 199, in retry_target
    six.raise_from(
  File "<string>", line 3, in raise_from
google.api_core.exceptions.RetryError: Deadline of 60.0s exceeded while calling functools.partial(functools.partial(<bound method JSONConnection.api_request of <google.cloud.bigquery._http.Connection object at 0x7f9b79f60e50>>, method='GET', path='/projects/wua-repro/queries/1ca1cb14-f9a5-4298-9058-91ab784be278', query_params={'maxResults': 0, 'location': 'US'}, timeout=None)), last exception: 400 GET https://bigquery.googleapis.com/bigquery/v2/projects/wua-repro/queries/1ca1cb14-f9a5-4298-9058-91ab784be278?maxResults=0&location=US&prettyPrint=false: Syntax error: Expected end of input but got identifier "hoge" at [1:1]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "repro.py", line 31, in <module>
    for row in query_job.result(retry=my_retry):
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1146, in result
    super(QueryJob, self).result(retry=retry, timeout=timeout)
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 631, in result
    return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 129, in result
    self._blocking_poll(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1042, in _blocking_poll
    super(QueryJob, self)._blocking_poll(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 109, in _blocking_poll
    raise concurrent.futures.TimeoutError(
concurrent.futures._base.TimeoutError: Operation did not complete within the designated timeout.

Note: Retry works correctly if we roll back the change in google/cloud/bigquery/job/query.py.

Can we either fix google/cloud/bigquery/job/query.py or roll it back to the previous version?
