pd.read_gbq broken with 1.26.0 and pyarrow #177

@inglesp

If pyarrow is installed, then with google-cloud-bigquery==1.26.0 and pandas-gbq==0.13.2, calling pd.read_gbq raises an exception inside this library.

>>> pd.read_gbq("SELECT 1", project_id="ebmdatalab")
/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/google/auth/_default.py:69: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/
  warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/google/cloud/bigquery/client.py:407: UserWarning: Cannot create BigQuery Storage client, the dependency google-cloud-bigquery-storage is not installed.
  "Cannot create BigQuery Storage client, the dependency "
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.23rows/s]
Traceback (most recent call last):
  File "/home/inglesp/.pyenv/versions/3.5.9/lib/python3.5/code.py", line 91, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>
  File "/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/pandas/io/gbq.py", line 176, in read_gbq
    **kwargs
  File "/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/pandas_gbq/gbq.py", line 967, in read_gbq
    progress_bar_type=progress_bar_type,
  File "/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/pandas_gbq/gbq.py", line 532, in run_query
    progress_bar_type=progress_bar_type,
  File "/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/pandas_gbq/gbq.py", line 562, in _download_results
    progress_bar_type=progress_bar_type,
  File "/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/google/cloud/bigquery/table.py", line 1727, in to_dataframe
    create_bqstorage_client=create_bqstorage_client,
  File "/home/inglesp/.pyenv/versions/openp/lib/python3.5/site-packages/google/cloud/bigquery/table.py", line 1561, in to_arrow
    bqstorage_client.transport.channel.close()
AttributeError: 'NoneType' object has no attribute 'transport'

If pyarrow is not installed, there is no exception. The same code works with google-cloud-bigquery 1.25.0, so I'm raising the issue against this library rather than pydata/pandas-gbq or apache/arrow.
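
To illustrate the kind of failure the traceback points at, here is a minimal sketch that does not use the real library. It assumes a factory that returns None when an optional dependency is missing, followed by cleanup code that closes the client unconditionally. Every name below (FakeStorageClient, create_storage_client, download_unguarded, download_guarded) is hypothetical and is not taken from google-cloud-bigquery; the guarded variant shows one possible defensive pattern, not necessarily the fix the maintainers would choose.

```python
import warnings


class FakeChannel:
    """Hypothetical stand-in for a gRPC channel."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeTransport:
    """Hypothetical stand-in for a client transport that owns a channel."""

    def __init__(self):
        self.channel = FakeChannel()


class FakeStorageClient:
    """Hypothetical stand-in for a storage client."""

    def __init__(self):
        self.transport = FakeTransport()


def create_storage_client(dependency_installed):
    # Assumption: the factory warns and returns None instead of raising
    # when the optional dependency is not available.
    if not dependency_installed:
        warnings.warn("Cannot create storage client: optional dependency missing.")
        return None
    return FakeStorageClient()


def download_unguarded(dependency_installed):
    client = create_storage_client(dependency_installed)
    try:
        return "rows"
    finally:
        # Fails with AttributeError when client is None.
        client.transport.channel.close()


def download_guarded(dependency_installed):
    client = create_storage_client(dependency_installed)
    try:
        return "rows"
    finally:
        # Only close a client that was actually created.
        if client is not None:
            client.transport.channel.close()
```

With the dependency marked as missing, download_unguarded(False) raises an AttributeError about 'transport' on a NoneType object, while download_guarded(False) returns normally.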

Here are details of the various versions used to reproduce this.

$ python --version
Python 3.8.2
$ cat requirements.in 
google-cloud-bigquery
pandas-gbq
pyarrow
$ pip freeze
cachetools==4.1.1
certifi==2020.6.20
chardet==3.0.4
click==7.1.2
google-api-core==1.22.0
google-auth==1.19.2
google-auth-oauthlib==0.4.1
google-cloud-bigquery==1.26.0
google-cloud-core==1.3.0
google-resumable-media==0.5.1
googleapis-common-protos==1.52.0
idna==2.10
numpy==1.19.1
oauthlib==3.1.0
pandas==1.0.5
pandas-gbq==0.13.2
pip-tools==5.2.1
protobuf==3.12.2
pyarrow==0.17.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pydata-google-auth==1.1.0
python-dateutil==2.8.1
pytz==2020.1
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
six==1.15.0
urllib3==1.25.9
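
Since the behaviour depends on which optional packages are present (pyarrow installed, google-cloud-bigquery-storage not installed), a small check like the following may help when comparing environments. It is a sketch only: the helper names are hypothetical, and the import names listed for the storage client are assumptions that vary by release of google-cloud-bigquery-storage.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be located, without importing the module itself."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing,
        # and ValueError when an already-imported module has no spec.
        return False


# Assumption: candidate import names; adjust for the versions in use.
OPTIONAL_MODULES = (
    "pyarrow",
    "google.cloud.bigquery_storage",
    "google.cloud.bigquery_storage_v1",
)


def optional_dependency_report(module_names=OPTIONAL_MODULES):
    """Map each module name to whether it appears to be installed."""
    return {name: is_installed(name) for name in module_names}


if __name__ == "__main__":
    for name, present in optional_dependency_report().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

In the environment above, pyarrow would be expected to show as installed and the storage modules as missing, matching the "Cannot create BigQuery Storage client" warning.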

Labels

api: bigquery (Issues related to the googleapis/python-bigquery API.)
priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
