Skip to content

Spark: Cannot create expression literals from Instant #2530

@RussellSpitzer

Description

@RussellSpitzer

Summary:
Literals.from doesn't have support for creating literals from java.time.Instant which can be passed in Spark 3.1.1

Cannot create expression literal from java.time.Instant: 2021-04-25T19:00:00Z
at org.apache.iceberg.expressions.Literals.from(Literals.java:85)
at org.apache.iceberg.expressions.UnboundPredicate.(UnboundPredicate.java:40)


Long story

Reports from some users testing out Spark 3.1 which has an extremely weird behavior.

  1. Create a table with a timestamp partition key
spark.sql("CREATE TABLE foo (ts timestamp) USING ICEBERG PARTITIONED BY (hours(ts)) LOCATION '/Users/russellspitzer/temp'")
spark.sql("INSERT into foo VALUES (cast(\"2021-04-25T19:01:00+00:00\" as timestamp))")
  1. Query this table
SELECT * from foo WHERE ts >= \"2021-04-25T19:00:00+00:00\" and ts < \"2021-04-25T20:00:00+00:00\"

The user reported the exception I noted at the top of this issue, I attempt to reproduce and cannot. I then find the following matrix of behaviors.

Spark 3.0.2 - spark-shell - No Issue
Spark 3.0.2 - spark-sql - No issue
Spark 3.1.1 - spark-shell - No issue
Spark 3.1.1 - spark-sql - Exception - Query Fails

While i'm slightly confused as why there is a difference in parsing behavior between spark-shell and spark-sql in 3.1.1 I think the fix is pretty clear and we need to just add support for creating timestamp literals from java.time.Instants

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions