[SPARK-34886][PYTHON] Port/integrate Koalas DataFrame unit test into PySpark #32083
Conversation
|
ok to test. |
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Test build #137049 has finished for PR 32083 at commit
|
|
add to whitelist |
| from pyspark.pandas.utils import default_session, sql_conf as sqlc, SPARK_CONF_ARROW_ENABLED | ||
|
|
||
|
|
||
| class SQLTestUtils(object): |
I think we already have this util in PySpark. We should probably merge them.
That's a good idea!
May I take it as a separate task later and port test files first?
Shall we file a JIRA ticket to track the task then?
Certainly, I filed https://issues.apache.org/jira/browse/SPARK-34999 to track this.
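For readers following this thread: the imported `sql_conf` helper and the `SQLTestUtils` class are test utilities for temporarily changing settings during a test. The snippet below is only a minimal sketch of that general pattern (override some values, run the test body, restore the previous values), not the code in this PR or in PySpark. `FakeConf` and `temporary_conf` are hypothetical names, and a plain dict stands in for a Spark session's runtime configuration so the example runs without Spark.

```python
from contextlib import contextmanager


class FakeConf:
    """Hypothetical stand-in for a session's runtime conf, backed by a dict."""

    def __init__(self):
        self._values = {}

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value

    def unset(self, key):
        self._values.pop(key, None)


@contextmanager
def temporary_conf(conf, pairs):
    """Apply the overrides in `pairs`, then restore the previous values on exit."""
    # Remember the old value of every key we are about to touch.
    old = {key: conf.get(key, None) for key in pairs}
    for key, value in pairs.items():
        conf.set(key, value)
    try:
        yield
    finally:
        # Restore even if the test body raised.
        for key, old_value in old.items():
            if old_value is None:
                conf.unset(key)  # the key was not set before
            else:
                conf.set(key, old_value)
```

A test would wrap only the statements that need the override, for example `with temporary_conf(conf, {"some.key": "true"}): ...`, so the setting does not leak into other tests.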
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137076 has finished for PR 32083 at commit
|
|
Test build #137092 has finished for PR 32083 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
I guess we should add |
ueshin
left a comment
LGTM, pending tests.
|
Test build #137097 has finished for PR 32083 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137100 has finished for PR 32083 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137107 has finished for PR 32083 at commit
|
|
retest this please |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137113 has finished for PR 32083 at commit
|
|
retest this please |
|
I manually checked that all related tests passed across multiple runs. I don't believe this causes any extra test failures. Merged to master. |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137119 has finished for PR 32083 at commit
|
What changes were proposed in this pull request?
Now that we merged the Koalas main code into the PySpark code base (#32036), we should port the Koalas DataFrame unit test to PySpark.
Why are the changes needed?
Currently, the pandas-on-Spark modules are not tested at all. We should enable the DataFrame unit test first.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
By enabling the DataFrame unit test.
Keyword: SPARK-34849