RowIterator to_dataframe requires pyarrow >= 1.0.0 to work #249

@BradLewis

Description

Currently the google-cloud-bigquery library requires pyarrow > 0.16.0. However, the method RowIterator.to_dataframe passes the kwarg "timestamp_as_object" to to_pandas(), and that kwarg is only supported in pyarrow >= 1.0.0. If we install pyarrow >= 1.0.0, everything works as expected, but we are using other libraries which require pyarrow < 1.0.0.

So either the requirements should be updated to require pyarrow >= 1.0.0, or to_dataframe should be changed so that it also works with pyarrow versions below 1.0.0.

Environment details

  • OS type and version: Any
  • Python version: 3.6.9
  • pip version: 20.2.2
  • google-cloud-bigquery version: 1.27.2

Steps to reproduce

  1. Use pyarrow < 1.0.0
  2. Call RowIterator.to_dataframe()

Stack trace

#     result = future.result()
  File "<path>/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "<path>/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "<path>/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "<path>", line 133, in run_query
    bqstorage_client=client_storage
  File "<path>/python3.6/site-packages/google/cloud/bigquery/table.py", line 1757, in to_dataframe
    df = record_batch.to_pandas(date_as_object=date_as_object, **extra_kwargs)
  File "pyarrow/array.pxi", line 503, in pyarrow.lib._PandasConvertible.to_pandas
TypeError: to_pandas() got an unexpected keyword argument 'timestamp_as_object'

Metadata

Labels

  • api: bigquery (Issues related to the googleapis/python-bigquery API.)
  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
