Skip to content

Conversation

plamut
Copy link
Contributor

@plamut plamut commented May 17, 2021

Closes #629.

This PR adds logic to detect obsolete versions of BQ Storage extra and emit a warning or raise an error where applicable.

Traced all BQ Storage-related code and call paths and I think the changes here cover them all. As a bonus, we get a graceful fallback to tabledata.list in most cases (a warning is emitted instead of raising an error).

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label May 17, 2021
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label May 17, 2021
@plamut plamut marked this pull request as ready for review May 18, 2021 11:05
@plamut plamut requested a review from a team May 18, 2021 11:05
@plamut plamut requested a review from a team as a code owner May 18, 2021 11:05
@plamut plamut requested review from stephaniewang526 and tswast and removed request for a team May 18, 2021 11:05
@plamut
Copy link
Contributor Author

plamut commented May 18, 2021

Will fix the coverage error son, one line not hit by the tests.

@jimfulton jimfulton self-requested a review May 18, 2021 14:06
Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can be a little DRYer if we reuse _create_bqstorage_client more after adding some arguments to it.

@plamut
Copy link
Contributor Author

plamut commented May 18, 2021

I think we can be a little DRYer if we reuse _create_bqstorage_client more after adding some arguments to it.

Sounds reasonable, I'll probably give it a shot. 👍

@jimfulton
Copy link
Contributor

I'm proposing that _create_bqstorage_client becomes something like:

    def _create_bqstorage_client(self, bqstorage_client=None, client_info=None, client_options=None):
        """Create a BigQuery Storage API client using this client's credentials.

        If a client cannot be created due to missing dependencies, raise a
        warning and return ``None``.

        Returns:
            Optional[google.cloud.bigquery_storage.BigQueryReadClient]:
                A BigQuery Storage API client.
        """
        try:
            from google.cloud import bigquery_storage
        except ImportError:
            warnings.warn(
                "Cannot create BigQuery Storage client, the dependency "
                "google-cloud-bigquery-storage is not installed."
            )
            return None

        try:
            _verify_bq_storage_version()
        except LegacyBigQueryStorageError as exc:
            warnings.warn(str(exc))
            return None

        if bqstorage_client is None:
            bqstorage_client = bigquery_storage.BigQueryReadClient(
                credentials=self._credentials,
                client_info=client_info,
                client_options=client_options,
                )

        return bqstorage_client

and maybe gets renamed :) and that it's called even when the caller is given a client.

If the library is out of date, the caller will end up with None even if given a value.

plamut added 4 commits May 19, 2021 17:30
The method is renamed to _ensure_bqstorage_client() and now performs
a check if BQ Storage dependency is recent enough.
The check is now performed in dbapi.Connection, which is sufficient.
The methods in higher layers already do the same check before
a BQ Storage client instance is passed to
_pandas_helpers._download_table_bqstorage() helper.
Lean more heavily on client._ensure_bqstorage_client() to de-duplicate
logic.
@plamut
Copy link
Contributor Author

plamut commented May 19, 2021

I checked the call chains and if I didn't miss anything, the only place left where a bad BQ Storage client could leak down is dbapi.Connection, which I sealed. Some of the lower-level checks turned out to be redundant, thus removed them:

  • In dbapi.Cursor._bqstorage_fetch()
  • In _pandas_helpers._download_table_bqstorage()

Refactored Client._make_bqstorage_client as suggested and de-duplicated logic in magics and dbapi.

Apart from the client's factory, the other central place where we check if BQ Storage client is compatible is table._validate_bqstorage() (and issues a warning if it isn't). The top level table methods use this check and set bqstorage_client to None, if necessary, before sending it down the call chain to lower level utility functions.

Edit: Will fix the two coverage misses tomorrow.

Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@plamut plamut merged commit bd7dbda into googleapis:master May 20, 2021
@plamut plamut deleted the iss-629 branch May 20, 2021 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support google-cloud-bigquery-storage v1
2 participants