
tests.system.test_gbq.TestToGBQIntegration: test_upload_data_if_table_exists_append failed #684


Closed
flaky-bot bot opened this issue Oct 18, 2023 · 3 comments
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-pandas API. flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments


flaky-bot bot commented Oct 18, 2023

Note: #674 was also for this test, but it was closed more than 10 days ago. So, I didn't mark it flaky.


commit: e596b74
buildURL: Build Status, Sponge
status: failed

Test output
self = 
dataframe =    bools      flts  ints strs                            times
0  False -0.307413     7    6 2023-10-18 03:47:59.96752...ue -0.817531     4    4 2023-10-18 03:47:59.967521-07:00
9   True -1.709661     1    6 2023-10-18 03:47:59.967521-07:00
destination_table_ref = TableReference(DatasetReference('precise-truck-742', 'python_bigquery_pandas_tests_system_20231018104759_3e314e'), 'new_test3')
write_disposition = 'WRITE_APPEND', chunksize = None
schema = {'fields': [{'mode': 'NULLABLE', 'name': 'bools', 'type': 'BOOLEAN'}, {'mode': 'NULLABLE', 'name': 'flts', 'type': 'FL...}, {'mode': 'NULLABLE', 'name': 'strs', 'type': 'STRING'}, {'mode': 'NULLABLE', 'name': 'times', 'type': 'TIMESTAMP'}]}
progress_bar = True, api_method = 'load_parquet'
billing_project = 'precise-truck-742'
def load_data(
    self,
    dataframe,
    destination_table_ref,
    write_disposition,
    chunksize=None,
    schema=None,
    progress_bar=True,
    api_method: str = "load_parquet",
    billing_project: Optional[str] = None,
):
    from pandas_gbq import load

    total_rows = len(dataframe)

    try:
        chunks = load.load_chunks(
            self.client,
            dataframe,
            destination_table_ref,
            chunksize=chunksize,
            schema=schema,
            location=self.location,
            api_method=api_method,
            write_disposition=write_disposition,
            billing_project=billing_project,
        )

pandas_gbq/gbq.py:602:


pandas_gbq/load.py:243: in load_chunks
load_parquet(
pandas_gbq/load.py:131: in load_parquet
client.load_table_from_dataframe(
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py:922: in result
return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
.nox/prerelease/lib/python3.8/site-packages/google/api_core/future/polling.py:256: in result
self._blocking_poll(timeout=timeout, retry=retry, polling=polling)
.nox/prerelease/lib/python3.8/site-packages/google/api_core/future/polling.py:137: in _blocking_poll
polling(self._done_or_raise)(retry=retry)
.nox/prerelease/lib/python3.8/site-packages/google/api_core/retry.py:366: in retry_wrapped_func
return retry_target(
.nox/prerelease/lib/python3.8/site-packages/google/api_core/retry.py:204: in retry_target
return target()
.nox/prerelease/lib/python3.8/site-packages/google/api_core/future/polling.py:119: in _done_or_raise
if not self.done(retry=retry):
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py:889: in done
self.reload(retry=retry, timeout=timeout)
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py:781: in reload
api_response = client._call_api(
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/client.py:816: in _call_api
return call()


self = <google.cloud.bigquery._http.Connection object at 0x7f3bdc791fd0>
method = 'GET'
path = '/projects/precise-truck-742/jobs/831b2ed1-ac03-4e7b-a48a-004b1726cb85'
query_params = {'location': 'US'}, data = None, content_type = None
headers = None, api_base_url = None, api_version = None, expect_json = True
_target_object = None, timeout = None, extra_api_info = None

def api_request(
    self,
    method,
    path,
    query_params=None,
    data=None,
    content_type=None,
    headers=None,
    api_base_url=None,
    api_version=None,
    expect_json=True,
    _target_object=None,
    timeout=_DEFAULT_TIMEOUT,
    extra_api_info=None,
):
    """Make a request over the HTTP transport to the API.

    You shouldn't need to use this method, but if you plan to
    interact with the API using these primitives, this is the
    correct one to use.

    :type method: str
    :param method: The HTTP method name (ie, ``GET``, ``POST``, etc).
                   Required.

    :type path: str
    :param path: The path to the resource (ie, ``'/b/bucket-name'``).
                 Required.

    :type query_params: dict or list
    :param query_params: A dictionary of keys and values (or list of
                         key-value pairs) to insert into the query
                         string of the URL.

    :type data: str
    :param data: The data to send as the body of the request. Default is
                 the empty string.

    :type content_type: str
    :param content_type: The proper MIME type of the data provided. Default
                         is None.

    :type headers: dict
    :param headers: extra HTTP headers to be sent with the request.

    :type api_base_url: str
    :param api_base_url: The base URL for the API endpoint.
                         Typically you won't have to provide this.
                         Default is the standard API base URL.

    :type api_version: str
    :param api_version: The version of the API to call.  Typically
                        you shouldn't provide this and instead use
                        the default for the library.  Default is the
                        latest API version supported by
                        google-cloud-python.

    :type expect_json: bool
    :param expect_json: If True, this method will try to parse the
                        response as JSON and raise an exception if
                        that cannot be done.  Default is True.

    :type _target_object: :class:`object`
    :param _target_object:
        (Optional) Protected argument to be used by library callers. This
        can allow custom behavior, for example, to defer an HTTP request
        and complete initialization of the object at a later time.

    :type timeout: float or tuple
    :param timeout: (optional) The amount of time, in seconds, to wait
        for the server response.

        Can also be passed as a tuple (connect_timeout, read_timeout).
        See :meth:`requests.Session.request` documentation for details.

    :type extra_api_info: string
    :param extra_api_info: (optional) Extra api info to be appended to
        the X-Goog-API-Client header

    :raises ~google.cloud.exceptions.GoogleCloudError: if the response code
        is not 200 OK.
    :raises ValueError: if the response content type is not JSON.
    :rtype: dict or str
    :returns: The API response payload, either as a raw string or
              a dictionary if the response is valid JSON.
    """
    url = self.build_api_url(
        path=path,
        query_params=query_params,
        api_base_url=api_base_url,
        api_version=api_version,
    )

    # Making the executive decision that any dictionary
    # data will be sent properly as JSON.
    if data and isinstance(data, dict):
        data = json.dumps(data)
        content_type = "application/json"

    response = self._make_request(
        method=method,
        url=url,
        data=data,
        content_type=content_type,
        headers=headers,
        target_object=_target_object,
        timeout=timeout,
        extra_api_info=extra_api_info,
    )

    if not 200 <= response.status_code < 300:
        raise exceptions.from_http_response(response)

E google.api_core.exceptions.ServiceUnavailable: 503 GET https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/jobs/831b2ed1-ac03-4e7b-a48a-004b1726cb85?location=US&prettyPrint=false: Error encountered during execution. Retrying may solve the problem.

.nox/prerelease/lib/python3.8/site-packages/google/cloud/_http/__init__.py:494: ServiceUnavailable

The above exception was the direct cause of the following exception:

self = <system.test_gbq.TestToGBQIntegration object at 0x7f3bdef60910>
project_id = 'precise-truck-742'

def test_upload_data_if_table_exists_append(self, project_id):
    test_id = "3"
    test_size = 10
    df = make_mixed_dataframe_v2(test_size)
    df_different_schema = make_mixed_dataframe_v1()

    # Initialize table with sample data
    gbq.to_gbq(
        df,
        self.destination_table + test_id,
        project_id,
        chunksize=10000,
        credentials=self.credentials,
    )

    # Test the if_exists parameter with value 'append'
    gbq.to_gbq(
        df,
        self.destination_table + test_id,
        project_id,
        if_exists="append",
        credentials=self.credentials,
    )

tests/system/test_gbq.py:722:


pandas_gbq/gbq.py:1220: in to_gbq
connector.load_data(
pandas_gbq/gbq.py:622: in load_data
self.process_http_error(ex)


ex = ServiceUnavailable('GET https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/jobs/831b2ed1-ac03-4e7b-a48a-004b1726cb85?location=US&prettyPrint=false: Error encountered during execution. Retrying may solve the problem.')

@staticmethod
def process_http_error(ex):
    # See `BigQuery Troubleshooting Errors
    # <https://cloud.google.com/bigquery/troubleshooting-errors>`__

    if "cancelled" in ex.message:
        raise QueryTimeout("Reason: {0}".format(ex))
    elif "Provided Schema does not match" in ex.message:
        error_message = ex.errors[0]["message"]
        raise InvalidSchema(f"Reason: {error_message}")
    elif "Already Exists: Table" in ex.message:
        error_message = ex.errors[0]["message"]
        raise TableCreationError(f"Reason: {error_message}")
    else:
        raise GenericGBQException("Reason: {0}".format(ex)) from ex

E pandas_gbq.exceptions.GenericGBQException: Reason: 503 GET https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/jobs/831b2ed1-ac03-4e7b-a48a-004b1726cb85?location=US&prettyPrint=false: Error encountered during execution. Retrying may solve the problem.

pandas_gbq/gbq.py:396: GenericGBQException
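
The chained traceback above shows `GenericGBQException` being raised `from` the original `ServiceUnavailable`, so the underlying 503 should still be reachable through `__cause__`. The following is a minimal sketch of how a caller might check for that. The two exception classes are stand-ins defined here so the example runs without the Google libraries, and `find_cause` and `wrap_like_process_http_error` are hypothetical helpers, not part of pandas-gbq.

```python
class ServiceUnavailable(Exception):
    """Stand-in for google.api_core.exceptions.ServiceUnavailable (assumption)."""
    code = 503


class GenericGBQException(Exception):
    """Stand-in for pandas_gbq.exceptions.GenericGBQException (assumption)."""


def find_cause(exc, exc_type):
    """Return the first exception of exc_type in the __cause__ chain, else None."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, exc_type):
            return exc
        seen.add(id(exc))  # guard against a cyclic cause chain
        exc = exc.__cause__
    return None


def wrap_like_process_http_error(ex):
    # Mirrors only the final `else` branch of process_http_error shown above.
    raise GenericGBQException("Reason: {0}".format(ex)) from ex


try:
    wrap_like_process_http_error(ServiceUnavailable("503 GET ..."))
except GenericGBQException as err:
    root = find_cause(err, ServiceUnavailable)
    is_transient = root is not None
```

With the real libraries, the same check would presumably pass the imported exception types instead of the stand-ins.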

@flaky-bot flaky-bot bot added flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Oct 18, 2023
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-pandas API. label Oct 18, 2023
@flaky-bot flaky-bot bot added the flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. label Oct 18, 2023

flaky-bot bot commented Oct 18, 2023

Looks like this issue is flaky. 😟

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (e596b74), this test passed in one build (Build Status, Sponge) and failed in another build (Build Status, Sponge).


@Linchin Linchin self-assigned this Oct 23, 2023

Linchin commented Oct 23, 2023

This failure was caused by a 503 (Service Unavailable) response from the BigQuery API, which is likely an upstream issue. Since the later continuous test runs all passed, I will close this issue for now. If the flakiness recurs, we will investigate further.
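
The 503 message itself says that retrying may solve the problem. As a rough illustration only, a caller or test could wrap a call in a small retry loop with exponential backoff. This is a sketch under assumptions: `retry_on_transient` and `flaky_upload` are hypothetical names, and `ServiceUnavailable` is a local stand-in so the example runs without the Google libraries. It is not how pandas-gbq or the test suite handles this.

```python
import time


class ServiceUnavailable(Exception):
    """Stand-in for google.api_core.exceptions.ServiceUnavailable (assumption)."""


def retry_on_transient(func, transient=(ServiceUnavailable,), attempts=3,
                       base_delay=1.0, sleep=time.sleep):
    """Call func(); on a transient error, wait and retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except transient:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... base_delay


# Demonstration with a fake upload that fails twice, then succeeds.
calls = {"n": 0}
delays = []  # records requested delays instead of actually sleeping


def flaky_upload():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServiceUnavailable("503: Retrying may solve the problem.")
    return "uploaded"


result = retry_on_transient(flaky_upload, sleep=delays.append)
```

One caution on the design: in the traceback above the 503 arrived while polling the load job (a `GET` on the job resource), so the job itself may still have completed. Blindly re-running an `if_exists="append"` upload could then duplicate rows, which makes this pattern safer for idempotent calls.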
