
feat: progress bar during query in QueryJob.to_dataframe and QueryJob.to_arrow #343

@tswast


Is your feature request related to a problem? Please describe.

The %%bigquery magics, QueryJob.to_dataframe, and QueryJob.to_arrow all accept a progress_bar_type argument when running a query or waiting for it to finish.

Currently, this only shows the progress of the query results download. It would be great if it also gave an indicator while the query is executing.

Describe the solution you'd like

When a value is passed to progress_bar_type, show some kind of progress bar. Ideally, it would work similarly to the UI. For example,

  • Show the job state in the progress bar description. For example, is it currently "pending" (queued, waiting for resources) or "running" (actually executing).
  • Show how many "stages" there are via the length of query_plan.
  • Find the latest incomplete stage.
  • Use parallel_inputs as the total amount of work (per stage).
  • Use completed_parallel_inputs as the amount of work completed so far.
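The bullets above could be turned into progress-bar values roughly as follows. This is only a sketch: `Stage` is a hypothetical stand-in for an entry of `query_plan` that carries just the three attributes mentioned here, `latest_incomplete_stage` and `describe_progress` are illustrative names, and "latest incomplete stage" is read as "the last stage in plan order that still has inputs left", which is one possible interpretation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Stage:
    """Hypothetical stand-in for one entry of QueryJob.query_plan."""

    name: str
    parallel_inputs: Optional[int]
    completed_parallel_inputs: Optional[int]


def latest_incomplete_stage(plan: Sequence[Stage]) -> Optional[Tuple[int, Stage]]:
    """Return (index, stage) for the last stage with work remaining, else None."""
    for index in range(len(plan) - 1, -1, -1):
        stage = plan[index]
        total = stage.parallel_inputs or 0
        done = stage.completed_parallel_inputs or 0
        if done < total:
            return index, stage
    return None


def describe_progress(state: str, plan: Sequence[Stage]) -> Tuple[str, int, int]:
    """Build (description, completed, total) values for a progress bar."""
    found = latest_incomplete_stage(plan)
    if found is None:
        # No plan yet (e.g. the job is still pending) or every stage is done.
        return state.lower(), 0, 0
    index, stage = found
    description = f"{state.lower()}: stage {index + 1} of {len(plan)} ({stage.name})"
    return description, stage.completed_parallel_inputs or 0, stage.parallel_inputs or 0
```

The returned tuple maps onto a progress bar's description, current count, and total. Missing counters are treated as zero so that a stage that has not reported any work yet does not raise an error.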

To populate this, instead of calling result() once:

  • Call result(timeout=[a few seconds]) every few seconds.
  • Call job.reload() to fetch the latest job statistics.
  • Update the progress bar.
  • Repeat.
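A minimal sketch of that loop, assuming a job object whose `result(timeout=...)` raises `concurrent.futures.TimeoutError` when the timeout expires and which exposes `reload()`. The name `wait_with_progress` and the `update` callback are hypothetical and only illustrate the idea; they are not part of the library.

```python
import concurrent.futures
from typing import Any, Callable


def wait_with_progress(
    job: Any,
    update: Callable[[Any], None],
    poll_seconds: float = 1.0,
) -> Any:
    """Wait for `job`, calling `update(job)` after every poll.

    `update` is a caller-supplied callback (hypothetical) that redraws the
    progress bar from the job's current state and statistics.
    """
    while True:
        try:
            # Block for at most `poll_seconds` waiting for the job to finish.
            rows = job.result(timeout=poll_seconds)
        except concurrent.futures.TimeoutError:
            # Still running: fetch the latest statistics, redraw, and retry.
            job.reload()
            update(job)
            continue
        # Finished: redraw one last time so the bar shows the final state.
        update(job)
        return rows
```

Passing the redraw logic in as a callback keeps the loop independent of any particular progress bar library, so the same loop could serve each supported progress_bar_type.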

Describe alternatives you've considered

Additional context

Metadata

Labels

  • api: bigquery (Issues related to the googleapis/python-bigquery API.)
  • type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)
