-
Notifications
You must be signed in to change notification settings - Fork 323
Description
This code works with with the python package
google-cloud-bigquery==2.2.0
but it does not work with this package
google-cloud-bigquery==2.6.0
What you expected to happen:
I expect the client library to take a datetime of
'2017-11-01 04:08 AM -07:00' and convert it to
'2017-11-01 11:08 AM -00:00' in big query. but instead it uploads it to
'2017-11-01 04:08 AM -00:00'
Steps to reproduce:
import datetime
from dateutil.tz import tzoffset
from google.cloud.bigquery import SchemaField, Table
schema = [SchemaField('c', 'TIMESTAMP', 'NULLABLE', None, (), None)]
problem_date = datetime.datetime(2017, 11, 1, 4, 8, tzinfo=tzoffset(None, -25200))
table_name = ‘my_test_table’
dataset = <an existing dataset>
bigquery = bigquery.Client(credentials=<somecreds>, project=<ourproj>)
dataset_ref = bigquery.dataset(dataset, project=<ourproj>)
table_ref = dataset_ref.table(table_name)
table = bigquery.create_table(Table(table_ref, schema=schema))
errors = bigquery.insert_rows(table, [[problem_date]])
Other info
I worked with the google BigQuery team and here is their feedback.
Hello,
I have been investigating the issue you described and it seems like this change of behaviour of the python client library was introduced in the version 2.5.0 1 with the pull request #393 2, whose objective was to fix a previous issue with floating points and timestamps in the method insert_rows
.
I have been able to reproduce your issue in new versions after 2.5.0 while it does not happen in the previous ones. The suggested way to proceed here would be to open an issue in the Github repository 3 so you can transmit your findings to the Engineering Team, since the code is open source .
Alternatively, you might want to add the following line of code, so the timezone is converted to UTC before trying to insert the row into BigQuery. Please let me know if this workaround worked for you:
problem_date = datetime.datetime(2017, 11, 1, 4, 8, tzinfo=tzoffset(None, -25200))
problem_date = problem_date.astimezone(datetime.timezone.utc)
We will look into the feasibility of the workaround