Description
Is your feature request related to a problem? Please describe.
When running a query via the %%bigquery magics or waiting for it to finish via QueryJob.to_dataframe or QueryJob.to_arrow, an argument progress_bar_type is accepted.
Currently, this only shows the progress of the query results download. It would be great if it also gave an indicator while the query is executing.
Describe the solution you'd like
When a value is passed to progress_bar_type, show some kind of progress bar. Ideally, it would work similarly to the UI. For example,
- Show the job state in the progress bar description. For example, is it currently "pending" (queued, waiting for resources) or "running" (actually executing).
- Show how many "stages" there are via the length of the query_plan.
- Find the latest incomplete stage.
- Use parallel_inputs as the total amount of work (per stage).
- Use completed_parallel_inputs as the amount of work completed so far.
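A rough sketch of how the numbers above could be derived, for illustration only. The helper name summarize_progress is hypothetical. It assumes each plan entry exposes status, parallel_inputs and completed_parallel_inputs, and that a finished stage reports the status string "COMPLETE". It also reads "latest incomplete stage" as the last plan entry that is not yet complete; scanning from the front instead would be an equally plausible reading. SimpleNamespace objects stand in for real plan entries so the snippet runs without a BigQuery connection.

```python
from types import SimpleNamespace


def summarize_progress(state, query_plan):
    """Return (description, completed, total) for a progress bar.

    `state` is the job state string; `query_plan` is a list of objects with
    `status`, `parallel_inputs` and `completed_parallel_inputs` attributes.
    """
    # Assumption: a finished stage reports the status string "COMPLETE".
    incomplete = [
        (index, stage)
        for index, stage in enumerate(query_plan)
        if stage.status != "COMPLETE"
    ]
    if not incomplete:
        # Either no plan is available yet, or every stage has finished.
        return f"{state}: {len(query_plan)} stage(s)", 0, 0

    # Take the last incomplete entry in the plan.
    index, stage = incomplete[-1]
    total = stage.parallel_inputs or 0  # may be None before the stage starts
    completed = stage.completed_parallel_inputs or 0
    return f"{state}: stage {index + 1}/{len(query_plan)}", completed, total


# Stand-in plan entries (not real BigQuery objects).
plan = [
    SimpleNamespace(status="COMPLETE", parallel_inputs=4, completed_parallel_inputs=4),
    SimpleNamespace(status="RUNNING", parallel_inputs=10, completed_parallel_inputs=3),
]
description, completed, total = summarize_progress("RUNNING", plan)
print(description, completed, total)
```

The returned description could serve as the progress bar label, with completed and total as its position and length.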
To populate this, instead of calling result() once:
- Call result(timeout=[a few seconds]) every few seconds.
- Call job.reload() to fetch the latest job statistics.
- Update the progress bar.
- Repeat.
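The loop above might look roughly like the following sketch. The names wait_with_progress, on_update and FakeJob are hypothetical and not part of any library. The sketch assumes that result(timeout=...) raises concurrent.futures.TimeoutError when the job has not finished in time. FakeJob is only a stand-in so the example runs without BigQuery; a real query job would be passed in its place.

```python
import concurrent.futures


def wait_with_progress(job, on_update, poll_seconds=2.0):
    """Poll `job` until it finishes, calling `on_update(job)` between polls."""
    while True:
        try:
            # Wait briefly; return the result as soon as the job is done.
            return job.result(timeout=poll_seconds)
        except concurrent.futures.TimeoutError:
            # Not finished yet: refresh the statistics, then report progress.
            job.reload()
            on_update(job)


class FakeJob:
    """Stand-in for a query job that finishes after a fixed number of polls."""

    def __init__(self, polls_needed):
        self.polls_left = polls_needed
        self.state = "PENDING"
        self.reloads = 0

    def result(self, timeout=None):
        if self.polls_left > 0:
            self.polls_left -= 1
            raise concurrent.futures.TimeoutError()
        self.state = "DONE"
        return ["row"]

    def reload(self):
        self.reloads += 1
        self.state = "RUNNING"


states = []
job = FakeJob(polls_needed=2)
rows = wait_with_progress(job, lambda j: states.append(j.state), poll_seconds=0.01)
print(rows, states)
```

In practice, on_update would be the place to refresh the progress bar description and counts from the reloaded job state and query plan.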
Describe alternatives you've considered
- Some kind of spinner that only shows that some time has elapsed. (This is actually harder than it sounds because tqdm doesn't actually support spinners, and if it ever does, it sounds like it'll be via a different API than the current progress bars. See "Indefinite" progress bar tqdm/tqdm#427 and Add spinners tqdm/tqdm#925.)
Additional context
- pandas-gbq feature request: Implement initial "waiting" logs with tqdm? python-bigquery-pandas#327
- Example of BigQuery UI progress bar: