I upgraded my script that uses pandas-gbq from Python 2.7 to Python 3.6, and it started giving me an error when trying to load the dataframe:
Errors:
--
file-00000000: Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 329; errors: 1. (error code: invalid)
file-00000000: Error while reading data, error message: JSON parsing error in row starting at position 30858: Parser terminated before end of string (error code: invalid)
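For reference, this is a minimal sketch of the kind of call that triggers it for me (the dataset, table, project id, and data values below are made up, not my real ones):

import pandas as pd
import pandas_gbq

# DataFrame containing non-ASCII characters (hypothetical data).
df = pd.DataFrame({'city': ['São Paulo', 'Goiânia'], 'count': [1, 2]})

# Destination table and project are placeholders.
pandas_gbq.to_gbq(df, 'my_dataset.my_table', project_id='my-project',
                  if_exists='replace')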
Using the Python client for BigQuery directly, I found out that the problem was related to the load_table_from_file function.
It turns out that these lines (586, 587) in the gbq.py file were not right for the file_obj parameter, as stated in the API Reference here (https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/reference.html):
- file_obj (file) – A file handle opened in binary mode for reading.
body = body.encode('utf-8')
body = BytesIO(body)
try:
    self.client.load_table_from_file(
        body,
        destination_table,
        job_config=job_config).result()
except self.http_error as ex:
    self.process_http_error(ex)
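My guess (not confirmed) is that something measures the payload by character count while BigQuery reads it as bytes, and the two diverge as soon as non-ASCII characters appear:

# Character count vs. UTF-8 byte count differ for non-ASCII text.
payload = '{"city": "São Paulo, ação"}'
print(len(payload))                  # character count
print(len(payload.encode('utf-8')))  # byte count is larger: ã and ç take two bytes each

That kind of offset mismatch would be consistent with the "Parser terminated before end of string ... position 30858" message, but I'm only speculating.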
So I changed it to look like this:
# Write the rows to a temporary file in text mode, then re-open it in
# binary mode for the upload, and remove it afterwards.
fd = tempfile.NamedTemporaryFile(mode="w", delete=False)
fd.write('{}\n'.format('\n'.join(rows)))
fd.close()
with open(fd.name, 'rb') as body:
    try:
        self.client.load_table_from_file(
            body,
            destination_table,
            job_config=job_config).result()
    except self.http_error as ex:
        self.process_http_error(ex)
os.remove(fd.name)
This way it works.
I know it has something to do with special characters like ç and ã, é, ó, etc., but I don't know of another way to solve it.
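In case it helps, here is a slightly tightened variant of the same temp-file workaround (my sketch, not tested inside gbq.py), with the text encoding made explicit and the cleanup moved into a finally block:

import os
import tempfile

# Write the newline-delimited JSON rows with an explicit UTF-8 encoding,
# then re-open the file in binary mode, as the docs ask for file_obj.
with tempfile.NamedTemporaryFile(mode='w', encoding='utf-8',
                                 suffix='.json', delete=False) as fd:
    fd.write('{}\n'.format('\n'.join(rows)))
try:
    with open(fd.name, 'rb') as body:
        try:
            self.client.load_table_from_file(
                body,
                destination_table,
                job_config=job_config).result()
        except self.http_error as ex:
            self.process_http_error(ex)
finally:
    os.remove(fd.name)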