Query performance optimizations #362

@tswast

Description

This issue tracks the "fast query path" changes for the Python client(s):

  • perf: use jobs.getQueryResults to download result sets #363 -- Update QueryJob to use getQueryResults in RowIterator. Project down to only the needed fields so that RowIterator avoids fetching the schema and other unnecessary job statistics.
  • perf: cache first page of jobs.getQueryResults rows #374 -- Update QueryJob and RowIterator to cache the first page of results, which is already fetched as part of the logic that waits for the job to finish. Discard the cache if maxResults or startIndex is set.
  • perf: use getQueryResults from DB-API #375 -- Update the DB-API to avoid a direct call to list_rows().
  • perf: avoid extra API calls from to_dataframe if all rows are cached #384 -- Update to_dataframe and related methods in RowIterator to skip the BQ Storage API when the cached results are the only page.
  • Update the DB-API to skip the BQ Storage API when the cached results are the only page.
  • Update Client.query to call the jobs.query backend API method when the job_config is compatible with it.
  • (optional?) Avoid the call to jobs.get in certain cases, such as QueryJob.to_dataframe and QueryJob.to_arrow:
    • Add a "reload" argument to QueryJob.result(), defaulting to True.
    • Update RowIterator to call get_job to fetch the destination table ID before attempting to use the BQ Storage API (if the destination table ID isn't available).
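The first-page cache in #374 is the core of this plan: the page of rows fetched while waiting for the job to finish is reused instead of refetched, unless maxResults or startIndex forces a fresh listing. Below is a minimal, self-contained sketch of that decision logic. All names here (FakeBackend, the simplified RowIterator) are illustrative stand-ins, not the actual google-cloud-bigquery internals, and the sketch models only the cache-discard rule, not result limiting itself.

```python
class FakeBackend:
    """Stands in for the jobs.getQueryResults API; counts calls made."""

    def __init__(self, pages):
        self.pages = pages
        self.calls = 0

    def get_query_results(self, page_index):
        self.calls += 1
        return self.pages[page_index]


class RowIterator:
    """Yields rows, reusing a cached first page when it is safe to do so."""

    def __init__(self, backend, first_page=None, max_results=None, start_index=None):
        # Per #374: the cached page is only valid for a plain full scan,
        # so discard it if maxResults or startIndex is set.
        if max_results is not None or start_index is not None:
            first_page = None
        self._backend = backend
        self._first_page = first_page

    def __iter__(self):
        start = 0
        if self._first_page is not None:
            yield from self._first_page  # served from cache, no API call
            start = 1
        for i in range(start, len(self._backend.pages)):
            yield from self._backend.get_query_results(i)


backend = FakeBackend([[1, 2], [3, 4]])
# Waiting for the job already fetched page 0, so pass it in as the cache:
rows = list(RowIterator(backend, first_page=[1, 2]))
print(rows, backend.calls)  # [1, 2, 3, 4] 1 -- only one extra API call
```

This also illustrates why #384 follows naturally: when the cached page is the only page, iteration completes with zero additional API calls, so to_dataframe and the DB-API can skip the BQ Storage API entirely.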

Metadata

Labels

  • api: bigquery -- Issues related to the googleapis/python-bigquery API.
  • type: feature request -- "Nice-to-have" improvement, new feature, or different behavior or design.
