Illogical use "full_table_id" #115

@pbkool

Hi there! Up front, I am not that experienced with either Python or the BigQuery API, so this may be intended. But if it is, I do not understand why 😅

I have the following simple script that looks for a given data set, and then loops through tables inside that data set and fires queries. Right now it does a simple select, but I intend to delete some old partitions that are no longer needed.

import os
import logging
from google.cloud import bigquery

logging.basicConfig(level=logging.INFO,format='%(levelname)s %(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "FILE_NAME"

bq = bigquery.Client(project="XXXX")
SEARCH_DATASET = "YYYY"                
MAX_AGE_PARTITIONS = 7

for database in list(bq.list_datasets()):
    if SEARCH_DATASET in database.dataset_id:
        temp_tables_object = bq.list_tables(database.dataset_id)
        for table in temp_tables_object:
            if (str(table.table_id).startswith("p_")
                and "stats" not in str(table.table_id).lower()): 
                    logging.info(  "Currently processing partitions for table " + str(table.table_id) + " It contains " + str(len(bq.list_partitions(table))) +
                                    " partitions. With a total size of " + str(round(bq.get_table(table).num_bytes /pow(10,9),2)) + " GB" )
                    logging.info( "Deleting the partitions older than "  + str(MAX_AGE_PARTITIONS))
                    QUERY = (
                                "SELECT * FROM " + 
                                str(bq.get_table(table).full_table_id) + 
                                " where _PARTITIONTIME < TIMESTAMP(DATE_SUB(current_date(), INTERVAL " + str(MAX_AGE_PARTITIONS) + " DAY)) LIMIT 1")
                    query_job = bq.query(QUERY)
                    result = query_job.result()
                    logging.info(result)

I am stuck at str(bq.get_table(table).full_table_id). I expected a table reference such as bigquery-public-data.austin_311.311_request, but it returns something like bigquery-public-data:austin_311.311_request (note the : ).

And as expected my script fails with google.api_core.exceptions.BadRequest: 400 Syntax error: Unexpected ":" at [1:27]
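
For reference, the colon form (project:dataset.table) is the legacy SQL style of table reference, while standard SQL expects dots between all three parts. Below is a minimal sketch of one possible workaround, not necessarily the intended usage of the library. The helper names (standard_sql_table_ref, from_full_table_id, build_partition_query) are hypothetical and made up for illustration. It uses plain strings only, so it runs without any credentials; with a real table object, the three parts would presumably come from table.project, table.dataset_id and table.table_id.

```python
def standard_sql_table_ref(project, dataset_id, table_id):
    """Return a backtick-quoted reference in the project.dataset.table form."""
    return "`{}.{}.{}`".format(project, dataset_id, table_id)


def from_full_table_id(full_table_id):
    """Convert a 'project:dataset.table' string into the dotted form.

    Hypothetical helper: splits on the LAST colon, so a project ID that
    itself contains a colon is kept intact.
    """
    project, colon, rest = full_table_id.rpartition(":")
    dataset_id, dot, table_id = rest.partition(".")
    if not (colon and dot and project and dataset_id and table_id):
        raise ValueError("unexpected table id format: " + repr(full_table_id))
    return standard_sql_table_ref(project, dataset_id, table_id)


def build_partition_query(table_ref, max_age_days):
    """Build the SELECT used in the script for a given table reference."""
    # Spaces around the interval value keep "INTERVAL 7 DAY" as three tokens.
    return (
        "SELECT * FROM " + table_ref
        + " WHERE _PARTITIONTIME < TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL "
        + str(int(max_age_days)) + " DAY)) LIMIT 1"
    )


if __name__ == "__main__":
    ref = from_full_table_id("bigquery-public-data:austin_311.311_request")
    print(ref)
    print(build_partition_query(ref, 7))
```

The backticks are there because a project ID with hyphens (or a table name starting with a digit) would otherwise not be accepted as a plain identifier in a standard SQL query.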

pip freeze
gcloud==0.18.3
google-api-core==1.17.0
google-auth==1.14.1
google-cloud==0.34.0
google-cloud-bigquery==1.24.0
google-cloud-core==1.3.0
google-resumable-media==0.5.0
googleapis-common-protos==1.51.0

python --version
Python 3.8.1

Labels

api: bigquery (Issues related to the googleapis/python-bigquery API.)
type: question (Request for information or clarification. Not an issue.)