
sqlglot.errors.ParseError for sample from website #315

@samiabboud

Hello everyone,

Running a sample copied and pasted from the documentation site raises a sqlglot.errors.ParseError. Could this be a version issue? Please see the details below.

Your help is appreciated!

Cheers,
Sami

Environment details

  • OS type and version: macOS Sonoma 14.2.1 (on M2 Max)
  • Python version (python --version): 3.9.18
  • pip version (pip --version): 23.0.1
  • bigframes version (pip show bigframes): 0.19.0
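
Since the stack trace further down passes through bigframes, ibis, and sqlglot, it may help to record the installed versions of all three rather than bigframes alone. The following is a minimal sketch using only the standard library; the package list is an assumption based on the paths in the traceback, and `installed_versions` is a hypothetical helper name, not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed list of distributions involved, inferred from the traceback paths.
# "ibis-framework" is the PyPI distribution name for the `ibis` package.
PACKAGES = ("bigframes", "ibis-framework", "sqlglot")


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this inside the same virtual environment as the failing script should show whether the ibis and sqlglot versions differ from what bigframes 0.19.0 expects.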

Steps to reproduce

  1. Run the code sample from https://cloud.google.com/bigquery/docs/bigquery-dataframes#bigframes-ml-regression after adding a project ID.

Code example

from bigframes.ml.linear_model import LinearRegression
import bigframes.pandas as bpd

bpd.options.bigquery.project = "our_project_id"

# Load data from BigQuery
query_or_table = "bigquery-public-data.ml_datasets.penguins"
bq_df = bpd.read_gbq(query_or_table)

# Filter down to the data for the Adelie Penguin species
adelie_data = bq_df[bq_df.species == "Adelie Penguin (Pygoscelis adeliae)"]

# Drop the species column
adelie_data = adelie_data.drop(columns=["species"])

# Drop rows with nulls to get training data
training_data = adelie_data.dropna()

# Specify your feature (or input) columns and the label (or output) column:
feature_columns = training_data[
    ["island", "culmen_length_mm", "culmen_depth_mm", "flipper_length_mm", "sex"]
]
label_columns = training_data[["body_mass_g"]]

test_data = adelie_data[adelie_data.body_mass_g.isnull()]

# Create the linear model
model = LinearRegression()
model.fit(feature_columns, label_columns)

# Score the model
score = model.score(feature_columns, label_columns)

# Predict using the model
result = model.predict(test_data)
# example

Stack trace

% python src/bq_run.py
Query job bb049054-a4f0-4d88-b128-b97eb020038b is DONE. 28.9 kB processed.
https://console.cloud.google.com/bigquery?project=platform-dev-285607&j=bq:US:bb049054-a4f0-4d88-b128-b97eb020038b&page=queryresults
Traceback (most recent call last):
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/sqlglot/parser.py", line 1039, in parse_into
    return self._parse(parser, raw_tokens, sql)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/sqlglot/parser.py", line 1078, in _parse
    self.raise_error("Invalid expression / Unexpected token")
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/sqlglot/parser.py", line 1119, in raise_error
    raise error
sqlglot.errors.ParseError: Invalid expression / Unexpected token. Line 1, Col: 61.
  platform-dev-285607._21e83bdd53455fdc8544000e45591de500adacc2.anon0277dfb0_f1fc_47b2_a519_1493d286435f

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/samiabboud/dev/aampe/modeling/src/bq_run.py", line 33, in <module>
    model.fit(feature_columns, label_columns)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/ml/base.py", line 162, in fit
    return self._fit(X, y)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/core/log_adapter.py", line 44, in wrapper
    return method(*args, **kwargs)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/ml/linear_model.py", line 136, in _fit
    self._bqml_model = self._bqml_model_factory.create_model(
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/ml/core.py", line 245, in create_model
    input_data = X_train._cached().join(y_train._cached(), how="outer")
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/core/log_adapter.py", line 44, in wrapper
    return method(*args, **kwargs)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/dataframe.py", line 3045, in _cached
    self._set_block(self._block.cached())
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/core/blocks.py", line 1677, in cached
    self.session._execute_and_cache(self.expr, cluster_cols=self.index_columns),
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/bigframes/session/__init__.py", line 1479, in _execute_and_cache
    table_expression = self.ibis_client.table(
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/ibis/backends/bigquery/__init__.py", line 509, in table
    table = sg.parse_one(name, into=sg.exp.Table, read=self.name)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/sqlglot/__init__.py", line 124, in parse_one
    result = dialect.parse_into(into, sql, **opts)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/sqlglot/dialects/dialect.py", line 325, in parse_into
    return self.parser(**opts).parse_into(expression_type, self.tokenize(sql), sql)
  File "/Users/samiabboud/dev/aampe/modeling/venv/lib/python3.9/site-packages/sqlglot/parser.py", line 1044, in parse_into
    raise ParseError(
sqlglot.errors.ParseError: Failed to parse 'platform-dev-285607._21e83bdd53455fdc8544000e45591de500adacc2.anon0277dfb0_f1fc_47b2_a519_1493d286435f' into <class 'sqlglot.expressions.Table'>
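
The failing call in the traceback is `sg.parse_one(name, into=sg.exp.Table, read=self.name)` inside the ibis BigQuery backend, so the error might be reproducible without BigQuery at all. The sketch below is an assumption-laden attempt at that: it passes the table name from the traceback to sqlglot directly, assuming the dialect name is `"bigquery"`. The helpers `try_parse_table` and `component_at_column` are hypothetical names introduced here. The second one only locates which dotted component contains the reported column, assuming that column is a 1-based offset into the name.

```python
# Table name copied verbatim from the traceback above.
NAME = (
    "platform-dev-285607"
    "._21e83bdd53455fdc8544000e45591de500adacc2"
    ".anon0277dfb0_f1fc_47b2_a519_1493d286435f"
)


def component_at_column(name, col):
    """Return the dotted component of `name` containing 1-based column `col`."""
    start = 1
    for part in name.split("."):
        end = start + len(part) - 1
        if start <= col <= end:
            return part
        start = end + 2  # skip over the "." separator
    return None


def try_parse_table(name, dialect="bigquery"):
    """Attempt the same parse as the traceback; report the outcome as a string."""
    try:
        import sqlglot
        from sqlglot.errors import SqlglotError
    except ImportError:
        return "sqlglot not installed"
    try:
        sqlglot.parse_one(name, into=sqlglot.exp.Table, read=dialect)
    except SqlglotError as exc:
        return f"{type(exc).__name__}: {exc}"
    return "parsed OK"


if __name__ == "__main__":
    # Col 61 is the column reported in the first ParseError.
    print("component at col 61:", component_at_column(NAME, 61))
    print(try_parse_table(NAME))
```

Under that column assumption, column 61 falls on the last character of the dataset component (the one starting with an underscore), which may be where the parser gives up. Whether the parse succeeds or fails will depend on the installed sqlglot version.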

Labels

  • api: bigquery (Issues related to the googleapis/python-bigquery-dataframes API.)
  • priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
  • samples (Issues that are directly related to samples.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
