Skip to content

Conversation

@peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Jun 13, 2019

What changes were proposed in this pull request?

This PR is to port with.sql from PostgreSQL regression tests. https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/with.sql

The expected results can be found in the link: https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/expected/with.out

When porting the test cases, found 7 PostgreSQL specific features that do not exist in Spark SQL:

Also, found one inconsistent behavior:

Also, added the following notes:

  • Spark SQL doesn't support DELETE statement
  • Spark SQL doesn't support UPDATE statement
  • Spark SQL doesn't support RULEs
  • Spark SQL doesn't support UNIQUE constraints
  • Spark SQL doesn't support ON CONFLICT clause
  • Spark SQL doesn't support TRIGGERs
  • Spark SQL doesn't support INHERITS clause

How was this patch tested?

N/A

@peter-toth
Copy link
Contributor Author

peter-toth commented Jun 13, 2019

This is a work in progress PR as currently almost nothing can be ported and most of the with.sql is commented out, but I believe SPARK-19799, SPARK-28002 and SPARK-24497 could help a lot.
(Actually, SPARK-24497 could be moved under SPARK-27764 as a subtask.)

I left a lot of TODOs in the PR and will try to clean them up next week.

@peter-toth
Copy link
Contributor Author

peter-toth commented Jun 25, 2019

@dongjoon-hyun, @gatorsmile, @wangyum I finished the analysis of with.sql. Unfortunately there is a large discrepancy between PostgreSQL and Spark SQL so most of the file is still commented out.

I have a PR open to address WITH in subqueries: #24831,
and another work-in-progress one to support recursive queries: #23531 that probably could help a lot.

@peter-toth peter-toth changed the title [WIP][SPARK-28034][SQL] Port with.sql [SPARK-28034][SQL] Port with.sql Jun 25, 2019
@dongjoon-hyun
Copy link
Member

ok to test

@SparkQA
Copy link

SparkQA commented Jun 26, 2019

Test build #106939 has finished for PR 24860 at commit fd8abe0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 28, 2019

Test build #107009 has finished for PR 24860 at commit b9e3388.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Jul 4, 2019

Test build #107214 has finished for PR 24860 at commit b9e3388.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 4, 2019

Test build #107235 has finished for PR 24860 at commit 292be80.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Since #25054 is merged, could you regenerate the result once more?

@peter-toth
Copy link
Contributor Author

Since #25054 is merged, could you regenerate the result once more?

Sure, done.

@SparkQA
Copy link

SparkQA commented Jul 5, 2019

Test build #107271 has finished for PR 24860 at commit ad7a86c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@peter-toth
Copy link
Contributor Author

I still left 4 TODOs in the file, could you please help me @dongjoon-hyun what to do with these?

  • I can't find an option in Spark SQL to query the DDL of a view, but I might be missing something. Do you know if there is an option to do it? If there isn't, shall I create a ticket for it?
  • WITH is a reserved keyword in PostgreSQL, but it isn't in Spark SQL. How shall we handle this?

Also please check the notes in the PR description. I left these notes in the ported SQL file where I though it makes no sense to create a ticket.

@dongjoon-hyun
Copy link
Member

Spark had Views-to-DDL before, but it's removed. IIRC, there is a ticket. If you search, you can find the deletion ticket.

@SparkQA
Copy link

SparkQA commented Jul 8, 2019

Test build #107345 has finished for PR 24860 at commit 5b9997e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@peter-toth
Copy link
Contributor Author

Views-to-DDL

Spark had Views-to-DDL before, but it's removed. IIRC, there is a ticket. If you search, you can find the deletion ticket.

Oh, it seems SHOW CREATE TABLE does work on views.

@SparkQA
Copy link

SparkQA commented Jul 8, 2019

Test build #107351 has finished for PR 24860 at commit 4a356e5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-28034][SQL] Port with.sql [SPARK-28034][SQL][TEST] Port with.sql Jul 9, 2019
@dongjoon-hyun
Copy link
Member

Retest this please.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jul 10, 2019

So, the remaining things looks like cleaning up the TODOs. Did I understand correctly?

@peter-toth
Copy link
Contributor Author

peter-toth commented Jul 10, 2019

So, the remaining things looks like cleaning up the TODOs. Did I understand correctly?

Yes, and actually the only remaining question is if WITH should be a reserved keyword as it is in PostgreSQL?

@dongjoon-hyun
Copy link
Member

WITH is already a reserved keyword in ANSI mode in Spark like the following. I think we don't need to do anything here.

@peter-toth
Copy link
Contributor Author

peter-toth commented Jul 10, 2019

WITH is already a reserved keyword in ANSI mode in Spark like the following. I think we don't need to do anything here.

Thanks! Does that mean that we should turn on spark.sql.parser.ansi.enabled=true in with.sql? Or shall I just leave a note about it?

@dongjoon-hyun
Copy link
Member

Please wrap those create statements with set spark.sql.parser.ansi.enabled=true and set spark.sql.parser.ansi.enabled=false. That will be enough. We don't want to break the existing workload. I think we don't need TODO or other comments since spark.sql.parser.ansi.enabled has a clean meaning.


== SQL ==
create table foo (with baz)
-----------------------^^^
Copy link
Member

@dongjoon-hyun dongjoon-hyun Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With set spark.sql.parser.ansi.enabled=true, this will raise the following instead of DataType baz is not supported error.

Error in query:
no viable alternative at input 'with'(line 1, pos 18)

== SQL ==
create table foo (with baz)
------------------^^^

@SparkQA
Copy link

SparkQA commented Jul 10, 2019

Test build #107472 has finished for PR 24860 at commit 4a356e5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

@maropu . Could you file a JIRA for that PosrgreSQL test should run with the following?

  • set spark.sql.crossJoin.enabled=true;
  • set spark.sql.parser.ansi.enabled=true;

@maropu
Copy link
Member

maropu commented Jul 11, 2019

Looks nice, will do now
oh, @wangyum seems to be working on it.

@SparkQA
Copy link

SparkQA commented Jul 11, 2019

Test build #107515 has finished for PR 24860 at commit 615f592.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Jul 11, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Jul 11, 2019

Test build #107520 has finished for PR 24860 at commit 615f592.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Hi, @maropu and @wangyum , and @peter-toth .
Since the followings seems to need more time, I'll merge this first. Please rebase and update those PRs.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.

@peter-toth
Copy link
Contributor Author

Thanks @dongjoon-hyun, @maropu for the review!

vinodkc pushed a commit to vinodkc/spark that referenced this pull request Jul 18, 2019
## What changes were proposed in this pull request?

This PR is to port with.sql from PostgreSQL regression tests. https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/with.sql

The expected results can be found in the link: https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/expected/with.out

When porting the test cases, found 7 PostgreSQL specific features that do not exist in Spark SQL:

- [SPARK-19799](https://issues.apache.org/jira/browse/SPARK-19799) Support WITH clause in subqueries
- [SPARK-24497](https://issues.apache.org/jira/browse/SPARK-24497) Support recursive SQL query
- [SPARK-28297](https://issues.apache.org/jira/browse/SPARK-28297) Handling outer links in CTE subquery expressions
- [SPARK-28296](https://issues.apache.org/jira/browse/SPARK-28296) Improved VALUES support
- [SPARK-28146](https://issues.apache.org/jira/browse/SPARK-28146) Support IS OF type predicate
- [SPARK-28147](https://issues.apache.org/jira/browse/SPARK-28147) Support RETURNING clause
- [SPARK-27878](https://issues.apache.org/jira/browse/SPARK-27878) Support ARRAY(sub-SELECT) expressions

Also, found one inconsistent behavior:
- [SPARK-28299](https://issues.apache.org/jira/browse/SPARK-28299) Evaluation of multiple CTE uses

Also, added the following notes:
- Spark SQL doesn't support DELETE statement
- Spark SQL doesn't support UPDATE statement
- Spark SQL doesn't support RULEs
- Spark SQL doesn't support UNIQUE constraints
- Spark SQL doesn't support ON CONFLICT clause
- Spark SQL doesn't support TRIGGERs
- Spark SQL doesn't support INHERITS clause

## How was this patch tested?

N/A

Closes apache#24860 from peter-toth/SPARK-28034.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants