Skip to content

Conversation

@shivusondur
Copy link
Contributor

What changes were proposed in this pull request?

only todo message updated. Need to add udf() for GroupBy Tests, after resolving following jira
[SPARK-28386] and [SPARK-26741]

How was this patch tested?

NA, only TODO message updated.

@shivusondur
Copy link
Contributor Author

@HyukjinKwon
plz check
or we can wait for corresponding jira([SPARK-28386] and [SPARK-26741]) to handle?

--
-- This test file was converted from inputs/pgSQL/select_having.sql
-- TODO: We should add UDFs in GROUP BY clause when [SPARK-28445] is resolved.
-- TODO: We should add UDFs in GROUP BY clause when [SPARK-28386] and [SPARK-26741] is resolved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPARK-28445 was wrong from the beginning, @shivusondur ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun
After resolving the SPARK-28445 also, test were failing and found [SPARK-28386] and [SPARK-26741] are blocking it.

for furher details follow #25215 (comment)

--
-- This test file was converted from inputs/pgSQL/select_having.sql
-- TODO: We should add UDFs in GROUP BY clause when [SPARK-28445] is resolved.
-- TODO: We should add UDFs in GROUP BY clause when [SPARK-28386] and [SPARK-26741] is resolved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit lost about this or I forget something.
Can't we add UDF in group-by clause (resolved in SPARK-28445)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HyukjinKwon
From this
#25215 (comment)
I thought I need to update todo with blocking jira numbers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I forgot. Can we enable all other tests with UDF in group-by and comment out the test?

-- !query 11
SELECT udf(b), udf(c) FROM test_having
GROUP BY udf(b), udf(c) HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c)
-- !query 11 schema
struct<>
-- !query 11 output
org.apache.spark.sql.AnalysisException
cannot resolve 'b' given input columns: [CAST(udf(cast(b as string)) AS INT), CAST(udf(cast(c as string)) AS STRING)]; line 2 pos 63

I guess we can still add some more tests?

@dongjoon-hyun
Copy link
Member

BTW, @shivusondur . The follow-up PR had better have its own PR title. The current one seems to be copied from the original PR.

@shivusondur shivusondur changed the title [SPARK-28390][SQL][PYTHON][TESTS] [FOLLOW-UP]Convert and port 'pgSQL/select_having.sql' into UDF test base [SPARK-28390][SQL][PYTHON][TESTS] [FOLLOW-UP] Update the TODO with actual blocking JIRA IDs Aug 13, 2019
@dongjoon-hyun
Copy link
Member

Usually, it's not worth to change it immediately because it will be resolved in 3.0.0 timeframe. But, I'm okay to keep it up-to-date, too.
I'll leave this PR to @HyukjinKwon .

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Aug 14, 2019

Yea that's fine to update comments. But @shivusondur can you confirm if you are unable to fix any test or some tests to have use in GROUP BY clause due to both JIRAs? If you can, let's add some tests and only comment out the other tests not working by both JIRAs.

@SparkQA
Copy link

SparkQA commented Aug 15, 2019

Test build #4829 has finished for PR 25415 at commit dd41b26.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivusondur
Copy link
Contributor Author

Yea that's fine to update comments. But @shivusondur can you confirm if you are unable to fix any test or some tests to have used in GROUP BY clause due to both JIRAs? If you can, let's add some tests and only comment out the other tests not working by both JIRAs.

@HyukjinKwon

There are 3 instances of groupby test in udf-select_having.sql, all 3 are not working due to the same reason.
Originally we have copied this "udf-select_having.sql" from "select_having.sql", so we are maintaining the same tests as original, (#25161 (comment))
if the extra tests we can add in new file.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Aug 19, 2019

Test build #109304 has finished for PR 25415 at commit dd41b26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants