Conversation

@xinrong-meng
Member

@xinrong-meng xinrong-meng commented Apr 8, 2021

What changes were proposed in this pull request?

Now that we have merged the Koalas main code into the PySpark code base (#32036), we should port the Koalas DataFrame unit tests to PySpark.

Why are the changes needed?

Currently, the pandas-on-Spark modules are not tested at all. We should enable the DataFrame unit test first.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Enable the DataFrame unit test.

Keyword: SPARK-34849

@xinrong-meng xinrong-meng changed the title from "[WIP] [SPARK-34886] Port/integrate Koalas DataFrame unit test into PySpark" to "[WIP][SPARK-34886][PYTHON] Port/integrate Koalas DataFrame unit test into PySpark" on Apr 8, 2021

@ueshin
Member

ueshin commented Apr 8, 2021

ok to test.

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41627/

@SparkQA

SparkQA commented Apr 8, 2021

Test build #137049 has finished for PR 32083 at commit 3b924c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

add to whitelist

from pyspark.pandas.utils import default_session, sql_conf as sqlc, SPARK_CONF_ARROW_ENABLED


class SQLTestUtils(object):
Member

I think we already have this util in PySpark. We should probably merge them.

Member Author

That's a good idea!

May I handle it as a separate task later and port the test files first?

Member

@ueshin ueshin Apr 8, 2021

Shall we file a JIRA ticket to track the task then?

Member Author

Certainly, I filed https://issues.apache.org/jira/browse/SPARK-34999 to track this.
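
For context on the util discussed in this thread: a test helper of this kind is commonly a context manager that sets configuration values for the duration of a block and restores the previous values afterwards. The sketch below is not the Koalas or PySpark implementation. `FakeConf` and `temp_conf` are hypothetical names, and a plain dict stands in for the session's runtime configuration so the example runs without Spark. The configuration key is used only as a sample value.

```python
from contextlib import contextmanager

_UNSET = object()  # sentinel marking keys that had no previous value


class FakeConf:
    """Hypothetical stand-in for a session's runtime configuration."""

    def __init__(self):
        self._values = {}

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value

    def unset(self, key):
        self._values.pop(key, None)


@contextmanager
def temp_conf(conf, pairs):
    """Apply `pairs` to `conf` inside the block, then restore the old state."""
    old = {key: conf.get(key, _UNSET) for key in pairs}
    for key, value in pairs.items():
        conf.set(key, value)
    try:
        yield conf
    finally:
        # Restore even if the test body raised.
        for key, value in old.items():
            if value is _UNSET:
                conf.unset(key)
            else:
                conf.set(key, value)


ARROW_KEY = "spark.sql.execution.arrow.pyspark.enabled"  # sample key only

conf = FakeConf()
conf.set(ARROW_KEY, "true")
with temp_conf(conf, {ARROW_KEY: "false", "sample.unset.key": "1"}):
    inside = conf.get(ARROW_KEY)
after = conf.get(ARROW_KEY)
```

Keeping the restore step in a `finally` clause means one failing test cannot leak its settings into the next, which is the main reason to share a single helper rather than keep two copies.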

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41654/

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41654/

@SparkQA

SparkQA commented Apr 8, 2021

Test build #137076 has finished for PR 32083 at commit 3b924c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 8, 2021

Test build #137092 has finished for PR 32083 at commit 6d13af9.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41670/

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41670/

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41675/

@ueshin
Member

ueshin commented Apr 8, 2021

I guess we should add pyspark.pandas.tests to heavy_tests in python/run-tests.py?
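
To illustrate the suggestion, here is a minimal sketch of prefix-based prioritization, assuming `heavy_tests` is a list of module-name prefixes whose test goals are queued ahead of the rest so that long-running suites start first. The function name `order_test_goals`, the priority values, and the sample goals are illustrative assumptions, not the contents of `python/run-tests.py`.

```python
from queue import PriorityQueue


def order_test_goals(test_goals, heavy_tests):
    """Return test goals in the order a priority queue would hand them out."""
    queue = PriorityQueue()
    for goal in test_goals:
        # Goals under a heavy prefix get the smaller number, so they pop first.
        is_heavy = any(goal.startswith(prefix) for prefix in heavy_tests)
        queue.put((0 if is_heavy else 100, goal))
    ordered = []
    while not queue.empty():
        ordered.append(queue.get()[1])
    return ordered


# Sample goals for illustration only.
goals = [
    "pyspark.sql.tests.test_column",
    "pyspark.pandas.tests.test_dataframe",
    "pyspark.tests.test_conf",
]
ordered = order_test_goals(goals, heavy_tests=["pyspark.pandas.tests"])
```

Starting the slowest suites first tends to shorten total wall-clock time when tests run in parallel, because the long jobs overlap with the short ones instead of trailing at the end.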

Member

@ueshin ueshin left a comment

LGTM, pending tests.

@SparkQA

SparkQA commented Apr 8, 2021

Test build #137097 has finished for PR 32083 at commit 4abe527.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xinrong-meng xinrong-meng changed the title from "[WIP][SPARK-34886][PYTHON] Port/integrate Koalas DataFrame unit test into PySpark" to "[SPARK-34886][PYTHON] Port/integrate Koalas DataFrame unit test into PySpark" on Apr 8, 2021

@xinrong-meng xinrong-meng marked this pull request as ready for review April 8, 2021 22:03

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41678/

@SparkQA

SparkQA commented Apr 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41678/

@SparkQA

SparkQA commented Apr 9, 2021

Test build #137100 has finished for PR 32083 at commit 3a73eba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 9, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41685/

@SparkQA

SparkQA commented Apr 9, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41685/

@SparkQA

SparkQA commented Apr 9, 2021

Test build #137107 has finished for PR 32083 at commit f5d9fd3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Apr 9, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41692/

@SparkQA

SparkQA commented Apr 9, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41692/

@SparkQA

SparkQA commented Apr 9, 2021

Test build #137113 has finished for PR 32083 at commit f5d9fd3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@HyukjinKwon
Member

I manually checked that all related tests passed across multiple runs. I don't believe this causes any extra test failures.

Merged to master.

@SparkQA

SparkQA commented Apr 9, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41698/

@SparkQA

SparkQA commented Apr 9, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41698/

@SparkQA

SparkQA commented Apr 9, 2021

Test build #137119 has finished for PR 32083 at commit f5d9fd3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
