Skip to content

Conversation

@shivusondur
Copy link
Contributor

@shivusondur shivusondur commented Jul 15, 2019

What changes were proposed in this pull request?

changed the test according to steps mentioned in SPARK-27921

difference comparing to select_having.sql

diff --git a/sql/core/src/test/resources/sql-tests/results/pgSQL/select_having.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/pgSQL/udf-select_having.sql.out
index 02536eb..f731d11 100644
--- a/sql/core/src/test/resources/sql-tests/results/pgSQL/select_having.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/udf/pgSQL/udf-select_having.sql.out
@@ -91,54 +91,54 @@ struct<>
 
 
 -- !query 11
-SELECT b, c FROM test_having
-	GROUP BY b, c HAVING count(*) = 1 ORDER BY b, c
+SELECT udf(b), udf(c) FROM test_having
+	GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c)
 -- !query 11 schema
-struct<b:int,c:string>
+struct<CAST(udf(cast(b as string)) AS INT):int,CAST(udf(cast(c as string)) AS STRING):string>
 -- !query 11 output
 1	XXXX
 3	bbbb
 
 
 -- !query 12
-SELECT b, c FROM test_having
-	GROUP BY b, c HAVING b = 3 ORDER BY b, c
+SELECT udf(b), udf(c) FROM test_having
+	GROUP BY b, c HAVING udf(b) = 3 ORDER BY udf(b), udf(c)
 -- !query 12 schema
-struct<b:int,c:string>
+struct<CAST(udf(cast(b as string)) AS INT):int,CAST(udf(cast(c as string)) AS STRING):string>
 -- !query 12 output
 3	BBBB
 3	bbbb
 
 
 -- !query 13
-SELECT c, max(a) FROM test_having
-	GROUP BY c HAVING count(*) > 2 OR min(a) = max(a)
+SELECT udf(c), max(udf(a)) FROM test_having
+	GROUP BY c HAVING udf(count(*)) > 2 OR udf(min(a)) = udf(max(a))
 	ORDER BY c
 -- !query 13 schema
-struct<c:string,max(a):int>
+struct<CAST(udf(cast(c as string)) AS STRING):string,max(CAST(udf(cast(a as string)) AS INT)):int>
 -- !query 13 output
 XXXX	0
 bbbb	5
 
 
 -- !query 14
-SELECT min(a), max(a) FROM test_having HAVING min(a) = max(a)
+SELECT udf(udf(min(udf(a)))), udf(udf(max(udf(a)))) FROM test_having HAVING udf(udf(min(udf(a)))) = udf(udf(max(udf(a))))
 -- !query 14 schema
-struct<min(a):int,max(a):int>
+struct<CAST(udf(cast(cast(udf(cast(min(cast(udf(cast(a as string)) as int)) as string)) as int) as string)) AS INT):int,CAST(udf(cast(cast(udf(cast(max(cast(udf(cast(a as string)) as int)) as string)) as int) as string)) AS INT):int>
 -- !query 14 output
 
 
 
 -- !query 15
-SELECT min(a), max(a) FROM test_having HAVING min(a) < max(a)
+SELECT udf(min(udf(a))), udf(udf(max(a))) FROM test_having HAVING udf(min(a)) < udf(max(udf(a)))
 -- !query 15 schema
-struct<min(a):int,max(a):int>
+struct<CAST(udf(cast(min(cast(udf(cast(a as string)) as int)) as string)) AS INT):int,CAST(udf(cast(cast(udf(cast(max(a) as string)) as int) as string)) AS INT):int>
 -- !query 15 output
 0	9
 
 
 -- !query 16
-SELECT a FROM test_having HAVING min(a) < max(a)
+SELECT udf(a) FROM test_having HAVING udf(min(a)) < udf(max(a))
 -- !query 16 schema
 struct<>
 -- !query 16 output
@@ -147,16 +147,16 @@ grouping expressions sequence is empty, and 'default.test_having.`a`' is not an
 
 
 -- !query 17
-SELECT 1 AS one FROM test_having HAVING a > 1
+SELECT 1 AS one FROM test_having HAVING udf(a) > 1
 -- !query 17 schema
 struct<>
 -- !query 17 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`a`' given input columns: [one]; line 1 pos 40
+cannot resolve '`a`' given input columns: [one]; line 1 pos 44
 
 
 -- !query 18
-SELECT 1 AS one FROM test_having HAVING 1 > 2
+SELECT 1 AS one FROM test_having HAVING udf(udf(1) > udf(2))
 -- !query 18 schema
 struct<one:int>
 -- !query 18 output
@@ -164,7 +164,7 @@ struct<one:int>
 
 
 -- !query 19
-SELECT 1 AS one FROM test_having HAVING 1 < 2
+SELECT 1 AS one FROM test_having HAVING udf(udf(1) < udf(2))
 -- !query 19 schema
 struct<one:int>
 -- !query 19 output
@@ -172,7 +172,7 @@ struct<one:int>
 
 
 -- !query 20
-SELECT 1 AS one FROM test_having WHERE 1/a = 1 HAVING 1 < 2
+SELECT 1 AS one FROM test_having WHERE 1/udf(a) = 1 HAVING 1 < 2
 -- !query 20 schema
 struct<one:int>
 -- !query 20 output

How was this patch tested?

by:

sudo SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/test-only *SQLQueryTestSuite -- -z udf/pgSQL/udf-select_having.sql"

@HyukjinKwon
Copy link
Member

Thanks for working on this. Let's wait for #25130. We need to fix UDF testing base itself

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After merging the PR I pointed out, let's try HAVING udf(1 > 2) combination as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@HyukjinKwon
Copy link
Member

Can you post the diff in the PR description as guided in the parent JIRA?

@HyukjinKwon HyukjinKwon changed the title [SPARK-28390][SQL][Test] Convert and port 'pgSQL/select_having.sql' into UDF test base [SPARK-28390][SQL][PYTHON][TESTS] Convert and port 'pgSQL/select_having.sql' into UDF test base Jul 15, 2019
@shivusondur
Copy link
Contributor Author

Can you post the diff in the PR description as guided in the parent JIRA?

Done

@HyukjinKwon
Copy link
Member

PR description isn't exactly guided. It should be foldable with

Details html tag.

@shivusondur
Copy link
Contributor Author

PR description isn't exactly guided. It should be foldable with
@HyukjinKwon
Handled, Thanks for reminding me.

@HyukjinKwon
Copy link
Member

ok to test

@HyukjinKwon
Copy link
Member

#25130 is merged. Can you rebase and update it against the current master?

@HyukjinKwon
Copy link
Member

Looks fine otherwise if the tests pass

@SparkQA
Copy link

SparkQA commented Jul 18, 2019

Test build #107821 has finished for PR 25161 at commit 6f44282.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

changed the test according to steps mentnioned in SPARK-27921
@SparkQA
Copy link

SparkQA commented Jul 19, 2019

Test build #107886 has finished for PR 25161 at commit b0a4fd1.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jul 19, 2019

Test build #107889 has finished for PR 25161 at commit b0a4fd1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 21, 2019

Test build #107969 has finished for PR 25161 at commit fc65bcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

@shivusondur, is the diff correct? Looks weird:

diff --git a/sql/core/src/test/resources/sql-tests/results/pgSQL/select_having.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/pgSQL/udf-select_having.sql.out
index 02536eb..96ccd15 100644
--- a/sql/core/src/test/resources/sql-tests/results/pgSQL/select_having.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/udf/pgSQL/udf-select_having.sql.out
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 22
+-- Number of queries: 42

@SparkQA
Copy link

SparkQA commented Jul 22, 2019

Test build #107979 has finished for PR 25161 at commit a1ba9a8.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Thanks for updating and addressing comments. Looks fine. Let me take a final look again before merging it.

@SparkQA
Copy link

SparkQA commented Jul 22, 2019

Test build #107997 has finished for PR 25161 at commit ef39cdd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

-- SELECT_HAVING
-- https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/select_having.sql
--
-- This test file was converted from inputs/pgSQL/select_having.sql
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add this todo on the top:

-- TODO: We should add UDFs in GROUP BY clause when [SPARK-28445] is resolved.

and remove same lines below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

-- Per SQL spec, these should generate 0 or 1 row, even without aggregates

SELECT udf(udf(min(udf(a)))), udf(udf(max(udf(a)))) FROM test_having HAVING udf(udf(min(udf(a)))) = udf(udf(max(udf(a))));
SELECT udf(udf(min(udf(a)))), udf(udf(max(udf(a)))) FROM test_having HAVING udf(udf(min(udf(a)))) < udf(udf(max(udf(a))));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use different UDF combination comparing to the one right above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated like below

SELECT udf(udf(min(udf(a)))), udf(udf(max(udf(a)))) FROM test_having HAVING udf(udf(min(udf(a)))) = udf(udf(max(udf(a))));
SELECT udf(min(udf(a))), udf(udf(max(a))) FROM test_having HAVING udf(min(a)) < udf(max(udf(a)));

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good otherwise

@SparkQA
Copy link

SparkQA commented Jul 22, 2019

Test build #108021 has finished for PR 25161 at commit c4ad657.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 23, 2019

Test build #108031 has finished for PR 25161 at commit 8795d66.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivusondur
Copy link
Contributor Author

retest this please

1 similar comment
@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jul 24, 2019

Test build #108068 has finished for PR 25161 at commit 8795d66.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants