
Conversation

@yanboliang (Contributor) commented on Apr 25, 2017

What changes were proposed in this pull request?

Some PySpark & SparkR tests run with a tiny dataset and a tiny maxIter, which means they do not converge. I don't think checking intermediate results during the iterations makes sense, and these intermediate results may be fragile and unstable, so we should switch to checking the converged result. We hit this issue in #17746 when we upgraded breeze to 0.13.1.
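
For reference, a minimal hypothetical sketch of the difference (not the actual test change; the dataset, `regParam`, and `maxIter` values below are made up for illustration):

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.master("local[1]").appName("converged-check").getOrCreate()

# Tiny toy dataset, similar in spirit to the small fixtures the tests use.
df = spark.createDataFrame(
    [(0.0, Vectors.dense(0.0, 1.0)),
     (1.0, Vectors.dense(1.0, 0.0))],
    ["label", "features"])

# With maxIter=2 the solver stops before convergence; the coefficients at that
# point depend on solver internals and can shift across breeze versions.
unconverged = LogisticRegression(maxIter=2, regParam=0.1).fit(df)

# With a generous maxIter (and the default tolerance) the regularized fit
# converges, so the coefficients are stable enough to compare, within a
# tolerance, against expected values.
converged = LogisticRegression(maxIter=100, regParam=0.1).fit(df)

print(unconverged.coefficients)
print(converged.coefficients)

spark.stop()
```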

How was this patch tested?

Existing tests.

@SparkQA commented on Apr 25, 2017

Test build #76133 has finished for PR 17757 at commit 57d4446.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

Hi @yanboliang, do you mind if I ask why they were flaky? I am just curious and want to know.

@yanboliang (Contributor, Author)

@HyukjinKwon We hit this issue in #17746 when we upgraded breeze to 0.13.1. Since these tests don't converge, they check intermediate results, which are fragile, so I switched to checking the last converged result. Thanks.

@yanboliang (Contributor, Author)

If maxIter = 2 is set, the result has not converged, so it is fragile. We should check the last converged result instead.

@dbtsai (Member) commented on Apr 25, 2017

LGTM. Please merge the current master to resolve the conflicts.

@felixcheung (Member)

So is there any result we could use when it is converged?
We removed a call to predict; we should keep that call to make sure the API works, and ideally check the prediction results too if we can.

@yanboliang (Contributor, Author)

Yeah, here we just removed the unconverged test (with maxIter = 2), since we can't guarantee any equality during the iterations. I think the best way to test that the API works well is to check the number of iterations: with proper initial weights, the number of iterations to converge would differ from other initial weights or from no initial weights. Let's open a separate JIRA to expose the training summary for MLP on the MLlib side; then we can expose it in SparkR and add a check here. Thanks.
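
For illustration only, a rough PySpark sketch of the idea (the toy dataset, layer sizes, seed, and initial weights below are made up; the MLP training summary is not exposed yet, so this only exercises fit and the predict path):

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import MultilayerPerceptronClassifier
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.master("local[1]").appName("mlp-sketch").getOrCreate()

# A tiny two-feature, two-class toy dataset (made up for illustration).
df = spark.createDataFrame(
    [(0.0, Vectors.dense(0.0, 0.0)),
     (0.0, Vectors.dense(0.0, 1.0)),
     (1.0, Vectors.dense(1.0, 0.0)),
     (1.0, Vectors.dense(1.0, 1.0))],
    ["label", "features"])

layers = [2, 4, 2]  # input size, one hidden layer, two output classes

# Without initial weights the optimizer starts from a seeded random point.
model_default = MultilayerPerceptronClassifier(
    layers=layers, maxIter=100, seed=11).fit(df)

# Explicit initial weights pin the starting point, so the converged solution
# (and, once a training summary is exposed, the iteration count) is reproducible.
num_weights = 2 * 4 + 4 + 4 * 2 + 2  # weights plus biases for both layers
model_warm = MultilayerPerceptronClassifier(
    layers=layers, maxIter=100, seed=11,
    initialWeights=Vectors.dense([0.1] * num_weights)).fit(df)

# Calling transform() still exercises the predict path end to end.
model_warm.transform(df).select("features", "prediction").show()

spark.stop()
```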

@felixcheung (Member)

I got the unconverged test with the maxIter.
My main concern on this end is to at least exercise calling from R to the JVM for each public API we export (i.e., by calling predict on the MLP model). We have had issues in the past where an API never worked and/or was broken and we didn't know.

@felixcheung (Member)

Checking more closely, it looks like the earlier tests do call predict. I'm good with simplifying this part of the test with weights.

@felixcheung (Member) left a comment

with comments above.

@SparkQA commented on Apr 26, 2017

Test build #76168 has finished for PR 17757 at commit a87d5c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Apr 26, 2017
## What changes were proposed in this pull request?
Some PySpark & SparkR tests run with a tiny dataset and a tiny ```maxIter```, which means they do not converge. I don't think checking intermediate results during the iterations makes sense, and these intermediate results may be fragile and unstable, so we should switch to checking the converged result. We hit this issue in #17746 when we upgraded breeze to 0.13.1.

## How was this patch tested?
Existing tests.

Author: Yanbo Liang <[email protected]>

Closes #17757 from yanboliang/flaky-test.

(cherry picked from commit dbb06c6)
Signed-off-by: Yanbo Liang <[email protected]>
@yanboliang (Contributor, Author)

Merged into master and branch-2.2. Thanks to all for reviewing.

@asfgit asfgit closed this in dbb06c6 Apr 26, 2017
@yanboliang yanboliang deleted the flaky-test branch April 26, 2017 13:38