Skip to content

Conversation

@alismess-db
Copy link
Contributor

What changes were proposed in this pull request?

#28671 introduced a change where the order in which CANCELED state for SparkExecuteStatementOperation is set was changed. Before setting the state to CANCELED, cleanup() was called which kills the jobs, causing an exception to be thrown inside execute(). This causes the state to transiently become ERROR before being set to CANCELED. This PR fixes the order.

Why are the changes needed?

Bug: wrong operation state is set.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test in SparkExecuteStatementOperationSuite.scala.

@alismess-db
Copy link
Contributor Author

cc @juliuszsompolski

@SparkQA
Copy link

SparkQA commented Jun 23, 2020

Test build #124431 has finished for PR 28912 at commit be5e2fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu maropu changed the title [SPARK-32057] ExecuteStatement: cancel should not transiently ERROR [SPARK-32057][SQL] ExecuteStatement: cancel should not transiently ERROR Jun 23, 2020
@maropu
Copy link
Member

maropu commented Jun 23, 2020

also cc: @yaooqinn

@HyukjinKwon
Copy link
Member

and also @wangyum

@SparkQA
Copy link

SparkQA commented Jun 24, 2020

Test build #124488 has finished for PR 28912 at commit 49f4335.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thanks, and sorry for the breakage...
Since #28671 went to 3.0, I think this should get backported as fell.

@alismess-db alismess-db changed the title [SPARK-32057][SQL] ExecuteStatement: cancel should not transiently ERROR [SPARK-32057][SQL] ExecuteStatement: cancel and close should not transiently ERROR Jun 24, 2020
Copy link
Member

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maropu
Copy link
Member

maropu commented Jun 24, 2020

retest this please

(OperationState.CANCELED, (_: SparkExecuteStatementOperation).cancel()),
(OperationState.CLOSED, (_: SparkExecuteStatementOperation).close())
).foreach { case (finalState, transition) =>
test(s"SPARK-32057 SparkExecuteStatementOperation should not transiently become ERROR " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: drop s in the head.

@SparkQA
Copy link

SparkQA commented Jun 24, 2020

Test build #124503 has finished for PR 28912 at commit 49f4335.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jun 25, 2020

Test build #124509 has finished for PR 28912 at commit 49f4335.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

retest this please

1 similar comment
@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jun 25, 2020

Test build #124514 has finished for PR 28912 at commit 49f4335.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Jun 26, 2020

@alismess-db Looks the valid test failures.

@SparkQA
Copy link

SparkQA commented Jun 26, 2020

Test build #124546 has finished for PR 28912 at commit 03b6c93.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 26, 2020

Test build #124548 has finished for PR 28912 at commit d68e77b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@juliuszsompolski
Copy link
Contributor

sbt.ForkMain$ForkError: org.mockito.exceptions.base.MockitoException: 
ClassCastException occurred while creating the mockito mock :
  class to mock : 'org.apache.hive.service.cli.session.HiveSession', loaded by classloader : 'sun.misc.Launcher$AppClassLoader@77f03bb1'
  created class : 'org.mockito.codegen.HiveSession$MockitoMock$1654160678', loaded by classloader : 'net.bytebuddy.dynamic.loading.MultipleParentClassLoader@55c6518c'
  proxy instance class : 'org.mockito.codegen.HiveSession$MockitoMock$1654160678', loaded by classloader : 'net.bytebuddy.dynamic.loading.MultipleParentClassLoader@55c6518c'
  instance creation by : ObjenesisInstantiator

You might experience classloading issues, please ask the mockito mailing-list.

@gatorsmile @maropu @HyukjinKwon @yaooqinn I have seen a lot of such issues with mockito in Spark tests lately... Do you know if there is some underlying build/test infra issue that causes all of these?

@alismess-db
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Jul 1, 2020

Test build #124767 has finished for PR 28912 at commit 839f44c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member

yaooqinn commented Jul 1, 2020

The test failures look like that we need to set up a dedicated JVM for this newly added test

@maropu
Copy link
Member

maropu commented Jul 1, 2020

@alismess-db Could you try to add the test suite in testsWhichShouldRunInTheirOwnDedicatedJvm?
https://github.com/apache/spark/blob/master/project/SparkBuild.scala#L464

@SparkQA
Copy link

SparkQA commented Jul 1, 2020

Test build #124798 has finished for PR 28912 at commit 4aaa34b.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@alismess-db
Copy link
Contributor Author

retest this please

1 similar comment
@maropu
Copy link
Member

maropu commented Jul 2, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jul 2, 2020

Test build #124834 has finished for PR 28912 at commit 4aaa34b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@alismess-db
Copy link
Contributor Author

retest this please

1 similar comment
@yaooqinn
Copy link
Member

yaooqinn commented Jul 3, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jul 6, 2020

Test build #124944 has finished for PR 28912 at commit 4aaa34b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member

yaooqinn commented Jul 6, 2020

The tag test-hive1.2 works wrongly with hadoop-3.2, I opened another PR #29006 to fix that. May you add test-hadoop2.7 to the pr title and retry jenkins? @maropu @alismess-db

@HyukjinKwon HyukjinKwon changed the title [SPARK-32057][SQL][test-hive1.2] ExecuteStatement: cancel and close should not transiently ERROR [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR Jul 6, 2020
@HyukjinKwon
Copy link
Member

retest this please

@maropu
Copy link
Member

maropu commented Jul 6, 2020

Ah, I see. we need the tag [test-hadoop2.7], too...

@SparkQA
Copy link

SparkQA commented Jul 6, 2020

Test build #125019 has finished for PR 28912 at commit 4aaa34b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@juliuszsompolski
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Jul 6, 2020

Test build #125056 has finished for PR 28912 at commit 4aaa34b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@wangyum wangyum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retest this please.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jul 7, 2020

Test build #125210 has finished for PR 28912 at commit 4aaa34b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to master and branch-3.0.

HyukjinKwon pushed a commit that referenced this pull request Jul 8, 2020
…ncel and close should not transiently ERROR

### What changes were proposed in this pull request?
#28671 introduced a change where the order in which CANCELED state for SparkExecuteStatementOperation is set was changed. Before setting the state to CANCELED, `cleanup()` was called which kills the jobs, causing an exception to be thrown inside `execute()`. This causes the state to transiently become ERROR before being set to CANCELED. This PR fixes the order.

### Why are the changes needed?
Bug: wrong operation state is set.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Unit test in SparkExecuteStatementOperationSuite.scala.

Closes #28912 from alismess-db/execute-statement-operation-cleanup-order.

Authored-by: Ali Smesseim <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 8b0a54e)
Signed-off-by: HyukjinKwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants