fix: reduce bigquery table modification via DML for to_gbq #1737
for field in table_schema:
    if field.name not in schema.names:
        return False
    if bigframes.dtypes.convert_schema_field(field)[1] != schema.get_type(
Do we need to do anything special here for the duration/timedelta type that we added recently?
convert_schema_field() is able to handle timedelta:
python-bigquery-dataframes/bigframes/dtypes.py
Lines 717 to 722 in d937be0
elif (
    field.field_type == "INTEGER"
    and field.description is not None
    and field.description.endswith(TIMEDELTA_DESCRIPTION_TAG)
):
    return field.name, TIMEDELTA_DTYPE
I believe we are good
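For readers following the thread, here is a minimal, self-contained sketch of the idea in the quoted branch: an INTEGER column is treated as a timedelta when its description ends with a marker tag. The `Field` class, the `convert_field` helper, and both constant values are hypothetical stand-ins for illustration; the real definitions live in bigframes/dtypes.py and may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Placeholder values for illustration only (assumptions, not the
# constants actually defined in bigframes/dtypes.py).
TIMEDELTA_DESCRIPTION_TAG = "#example-timedelta-tag"
TIMEDELTA_DTYPE = "duration[us]"
INT_DTYPE = "Int64"


@dataclass
class Field:
    """Hypothetical stand-in for a BigQuery schema field."""

    name: str
    field_type: str
    description: Optional[str] = None


def convert_field(field: Field) -> Tuple[str, str]:
    """Map a field to (name, dtype), checking the timedelta tag first."""
    if (
        field.field_type == "INTEGER"
        and field.description is not None
        and field.description.endswith(TIMEDELTA_DESCRIPTION_TAG)
    ):
        return field.name, TIMEDELTA_DTYPE
    if field.field_type == "INTEGER":
        return field.name, INT_DTYPE
    raise ValueError(f"unsupported field type in this sketch: {field.field_type}")
```

Because the tagged case is checked before the plain INTEGER case, a schema comparison built on such a conversion would distinguish a timedelta column from an ordinary integer column, which is the concern raised in the question above.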
Thanks!
To avoid exceeding BigQuery's limit of 1,500 table modifications per table per day, to_gbq now prioritizes INSERT or MERGE DML statements. This path is used when the target table exists and has the same schema, and it supports both replacing and appending data. If the schemas differ, to_gbq falls back to its original table modification process.
Fixes internal issue 409086472 🦕
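The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names, the returned labels, the `(name, type)` schema representation, and the mapping of "append" to INSERT and "replace" to MERGE are all hypothetical. The schema check mirrors the one-directional loop shown in the review snippet.

```python
from typing import Optional, Sequence, Tuple

# A schema here is a list of (column name, type name) pairs (an assumption
# made to keep the sketch free of BigQuery client dependencies).
Schema = Sequence[Tuple[str, str]]


def schemas_match(table_schema: Schema, incoming: Schema) -> bool:
    """Return True if every existing column appears in `incoming` with the same type."""
    incoming_types = dict(incoming)
    for name, type_name in table_schema:
        if name not in incoming_types:
            return False
        if incoming_types[name] != type_name:
            return False
    return True


def choose_write_strategy(
    table_schema: Optional[Schema], incoming: Schema, if_exists: str
) -> str:
    """Pick a write path; the returned labels are hypothetical."""
    if table_schema is None:
        # Target table does not exist yet, so DML cannot be used.
        return "table_modification"
    if not schemas_match(table_schema, incoming):
        # Schema discrepancy: fall back to the original process.
        return "table_modification"
    if if_exists == "append":
        return "insert_dml"  # assumed mapping: append -> INSERT
    if if_exists == "replace":
        return "merge_dml"  # assumed mapping: replace -> MERGE
    raise ValueError(f"unsupported if_exists value in this sketch: {if_exists}")
```

For example, `choose_write_strategy([("a", "INT64")], [("a", "INT64")], "append")` takes the DML path, while a type mismatch on column `a` routes the write through the original table modification process.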