Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This PR adds SQL generation support for Generate operator. It always converts Generate operator into LATERAL VIEW format as there are many limitations to put UDTF in project list.

This PR is based on #11658, please see the last commit to review the real changes.

Thanks @dilipbiswal for his initial work! Takes over #11596

How was this patch tested?

new tests in LogicalPlanToSQLSuite

@cloud-fan
Copy link
Contributor Author

@SparkQA
Copy link

SparkQA commented Mar 14, 2016

Test build #53071 has finished for PR 11696 at commit cfc74ef.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Copy link
Contributor

yhuai commented Mar 14, 2016

test this please

@SparkQA
Copy link

SparkQA commented Mar 14, 2016

Test build #53075 has finished for PR 11696 at commit cfc74ef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan Just for my understanding, do we miss some case now that we want to improve upon ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to SQL standard, the operator order should be SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY ... LIMIT ..., we should re-order operators to make them in standard order in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan Thank you !!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put your comment actually in the code so we know why we need to re-order the operators

@dilipbiswal
Copy link
Contributor

@cloud-fan Thank you. LGTM

@gatorsmile
Copy link
Member

LGTM. Thank you for your work! @cloud-fan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just inline this since it is used only once

@rxin
Copy link
Contributor

rxin commented Mar 15, 2016

@cloud-fan the logic looks good to me.

I left some comments -- try make this understandable (by adding more comments or renaming some functions) for people that haven't spent a lot of time understanding this code.

@yhuai
Copy link
Contributor

yhuai commented Mar 16, 2016

OK. I have merged https://github.com/apache/spark/pull/11658/commits. Let me rebase this one to just keep the top commit.

@yhuai
Copy link
Contributor

yhuai commented Mar 16, 2016

#11768 is the commit for generator. To avoid of destroying @cloud-fan's branch by any accident, I create that new PR.

@SparkQA
Copy link

SparkQA commented Mar 17, 2016

Test build #53402 has finished for PR 11696 at commit 6fca665.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 17, 2016

Test build #53404 has finished for PR 11696 at commit dcbbcb8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

I'm going to merge it to unblock following works, will address comments of @yhuai and @liancheng if you have any.

@asfgit asfgit closed this in 1974d1d Mar 17, 2016
}

private def generateToSQL(g: Generate): String = {
val columnAliases = g.generatorOutput.map(_.sql).mkString(",")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please add a space after ,

@liancheng
Copy link
Contributor

Sorry for the late review. LGTM except for two minor comments.

asfgit pushed a commit that referenced this pull request Mar 18, 2016
…while generate SQL

## What changes were proposed in this pull request?

We only need to make sub-query names unique every time we generate a SQL string, but not all the time. This PR moves the `newSubqueryName` method to `class SQLBuilder` and remove `object SQLBuilder`.

also addressed 2 minor comments in #11696

## How was this patch tested?

existing tests.

Author: Wenchen Fan <[email protected]>

Closes #11783 from cloud-fan/tmp.
asfgit pushed a commit that referenced this pull request Mar 18, 2016
PR #11696 introduced a complex pattern match that broke Scala 2.10 match unreachability check and caused build failure.  This PR fixes this issue by expanding this pattern match into several simpler ones.

Note that tuning or turning off `-Dscalac.patmat.analysisBudget` doesn't work for this case.

Compilation against Scala 2.10

Author: tedyu <[email protected]>

Closes #11798 from yy2016/master.
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
## What changes were proposed in this pull request?

This PR adds SQL generation support for `Generate` operator. It always converts `Generate` operator into `LATERAL VIEW` format as there are many limitations to put UDTF in project list.

This PR is based on apache#11658, please see the last commit to review the real changes.

Thanks dilipbiswal for his initial work! Takes over apache#11596

## How was this patch tested?

new tests in `LogicalPlanToSQLSuite`

Author: Wenchen Fan <[email protected]>

Closes apache#11696 from cloud-fan/generate.
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
…while generate SQL

## What changes were proposed in this pull request?

We only need to make sub-query names unique every time we generate a SQL string, but not all the time. This PR moves the `newSubqueryName` method to `class SQLBuilder` and remove `object SQLBuilder`.

also addressed 2 minor comments in apache#11696

## How was this patch tested?

existing tests.

Author: Wenchen Fan <[email protected]>

Closes apache#11783 from cloud-fan/tmp.
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
PR apache#11696 introduced a complex pattern match that broke Scala 2.10 match unreachability check and caused build failure.  This PR fixes this issue by expanding this pattern match into several simpler ones.

Note that tuning or turning off `-Dscalac.patmat.analysisBudget` doesn't work for this case.

Compilation against Scala 2.10

Author: tedyu <[email protected]>

Closes apache#11798 from yy2016/master.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants