@elek elek commented Jun 13, 2019

TestEventWatcher fails intermittently (it failed twice in 44 executions).

Error is:

{code}
Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.764 s <<< FAILURE! - in org.apache.hadoop.hdds.server.events.TestEventWatcher
testMetrics(org.apache.hadoop.hdds.server.events.TestEventWatcher) Time elapsed: 2.384 s <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.hdds.server.events.TestEventWatcher.testMetrics(TestEventWatcher.java:197)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{code}

In the test we do the following:

  1. fire start-event1
  2. fire start-event2
  3. fire start-event3
  4. fire end-event1
  5. wait

Usually event2 and event3 time out and event1 completes. However, if an accidental delay occurs between steps 3 and 4 (in fact, anywhere between steps 1 and 4), event1 can also time out, which gives three timed-out events instead of the expected two.
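
To make the race concrete, here is a minimal, deterministic sketch of the scenario described above. It is not the real EventWatcher: `TimeoutRaceSketch` and `countTimedOut` are hypothetical names, and the timestamps are illustrative values chosen only to show how a late end-event turns the expected count of 2 into 3.

```java
/** Hypothetical model of the race; this is not the real EventWatcher. */
public class TimeoutRaceSketch {

  /**
   * Counts how many started events end up timed out.
   *
   * @param startTimesMs     time at which each start-event was fired
   * @param completionTimeMs time at which the end-event for the first event was fired
   * @param leaseTimeoutMs   lease timeout applied to every started event
   */
  public static int countTimedOut(long[] startTimesMs, long completionTimeMs,
      long leaseTimeoutMs) {
    int timedOut = 0;
    for (int i = 0; i < startTimesMs.length; i++) {
      long deadline = startTimesMs[i] + leaseTimeoutMs;
      // Only the first event ever receives an end-event in this scenario.
      boolean completedInTime = (i == 0) && completionTimeMs < deadline;
      if (!completedInTime) {
        timedOut++;
      }
    }
    return timedOut;
  }

  public static void main(String[] args) {
    long leaseTimeoutMs = 2000L;      // illustrative value
    long[] starts = {0L, 10L, 20L};   // start-event1..3 fired back to back

    // Normal run: end-event1 arrives well before the lease expires.
    System.out.println(countTimedOut(starts, 30L, leaseTimeoutMs));   // prints 2

    // Stalled run: end-event1 arrives after the lease has expired.
    System.out.println(countTimedOut(starts, 2100L, leaseTimeoutMs)); // prints 3
  }
}
```

The second call corresponds to the `expected:<2> but was:<3>` assertion failure in the stack trace.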

I improved the unit test and fixed the metrics calculation (the completed counter should be incremented only if the event has not already timed out).
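
The metrics rule can be sketched as follows. This is a simplified stand-in written for illustration, not the patched EventWatcher code: the class, its method names, and the use of a plain `Set` of ids are all assumptions. The point it shows is that a completion is counted only while the event is still tracked.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the watcher's bookkeeping; all names are illustrative. */
public class GuardedMetricsSketch {
  private final Set<Long> tracked = new HashSet<>();
  private long completedCount;
  private long timedOutCount;

  public synchronized void onStart(long id) {
    tracked.add(id);
  }

  public synchronized void onTimeout(long id) {
    // A timeout only counts if the event is still being tracked.
    if (tracked.remove(id)) {
      timedOutCount++;
    }
  }

  public synchronized void onCompleted(long id) {
    // Count the completion only if the event has not already timed out.
    if (tracked.remove(id)) {
      completedCount++;
    }
  }

  public synchronized long getCompleted() {
    return completedCount;
  }

  public synchronized long getTimedOut() {
    return timedOutCount;
  }

  public static void main(String[] args) {
    GuardedMetricsSketch metrics = new GuardedMetricsSketch();
    metrics.onStart(1L);
    metrics.onTimeout(1L);    // the lease expires first
    metrics.onCompleted(1L);  // the late completion must not be counted
    System.out.println(metrics.getCompleted()); // prints 0
    System.out.println(metrics.getTimedOut());  // prints 1
  }
}
```

With this guard each event contributes to exactly one of the two counters, so the completed and timed-out totals never exceed the number of started events.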

See: https://issues.apache.org/jira/browse/HDDS-1682

@elek elek added the ozone label Jun 13, 2019
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 34 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 515 | trunk passed |
| +1 | compile | 293 | trunk passed |
| +1 | checkstyle | 92 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 897 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 180 | trunk passed |
| 0 | spotbugs | 334 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 526 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 480 | the patch passed |
| +1 | compile | 297 | the patch passed |
| +1 | javac | 297 | the patch passed |
| +1 | checkstyle | 94 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 655 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 181 | the patch passed |
| +1 | findbugs | 543 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 154 | hadoop-hdds in the patch failed. |
| -1 | unit | 1361 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 46 | The patch does not generate ASF License warnings. |
| | | 6550 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.container.common.impl.TestHddsDispatcher |
| | hadoop.ozone.client.rpc.TestOzoneRpcClient |
| | hadoop.ozone.om.TestScmSafeMode |
| | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
| | hadoop.ozone.TestStorageContainerManager |
| | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
| | hadoop.hdds.scm.pipeline.TestRatisPipelineProvider |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-962/1/artifact/out/Dockerfile |
| GITHUB PR | #962 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8f67fc261bf2 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 940bcf0 |
| Default Java | 1.8.0_212 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/1/artifact/out/patch-unit-hadoop-hdds.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/1/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/1/testReport/ |
| Max. process+thread count | 5030 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/framework U: hadoop-hdds/framework |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/1/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.


queue.fireEvent(REPLICATION_COMPLETED, event1Completed);

//lease manager timeout = 2000L
Contributor

You may want to increase this even further, say to 2x or 3x the lease manager (LM) timeout. A buffer of 200 ms is too low.
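
One way to act on this suggestion is to derive the test's wait from the lease timeout instead of hard-coding it. The sketch below is only an illustration: `WaitBudgetSketch` and `waitMillis` are hypothetical names, and the 2200 ms "previous" wait is an assumption inferred from the 2000L lease timeout plus the 200 ms buffer mentioned here, not a value taken from the test source.

```java
/** Hypothetical helper; not part of the actual test. */
public class WaitBudgetSketch {

  /** Returns a wait that is a whole multiple of the lease timeout. */
  public static long waitMillis(long leaseTimeoutMs, int multiplier) {
    if (leaseTimeoutMs <= 0 || multiplier < 1) {
      throw new IllegalArgumentException("timeout and multiplier must be positive");
    }
    // multiplyExact throws ArithmeticException on overflow instead of wrapping.
    return Math.multiplyExact(leaseTimeoutMs, (long) multiplier);
  }

  public static void main(String[] args) {
    long leaseTimeoutMs = 2000L;                  // from the code comment under review
    long previousWaitMs = leaseTimeoutMs + 200L;  // assumed: timeout + 200 ms buffer
    long suggestedWaitMs = waitMillis(leaseTimeoutMs, 3);

    System.out.println(previousWaitMs);   // prints 2200
    System.out.println(suggestedWaitMs);  // prints 6000
  }
}
```

Tying the wait to the timeout keeps the margin proportional if the lease timeout is changed later.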

@arp7 arp7 left a comment

+1 with a minor comment to increase delay.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 36 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 1 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 469 | trunk passed |
| +1 | compile | 253 | trunk passed |
| +1 | checkstyle | 63 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 787 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 147 | trunk passed |
| 0 | spotbugs | 302 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 488 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 433 | the patch passed |
| +1 | compile | 256 | the patch passed |
| +1 | javac | 256 | the patch passed |
| +1 | checkstyle | 80 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 661 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 154 | the patch passed |
| +1 | findbugs | 513 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 278 | hadoop-hdds in the patch failed. |
| -1 | unit | 1498 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 40 | The patch does not generate ASF License warnings. |
| | | 6314 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdds.scm.container.placement.algorithms.TestSCMContainerPlacementRackAware |
| | hadoop.hdds.scm.block.TestBlockManager |
| | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
| | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
| | hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException |
| | hadoop.ozone.client.rpc.TestOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
| | hadoop.ozone.container.server.TestSecureContainerServer |
| | hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer |
| | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
| | hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory |
| | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestFailureHandlingByClient |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-962/2/artifact/out/Dockerfile |
| GITHUB PR | #962 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux f1116a5f31ca 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 4e66cb9 |
| Default Java | 1.8.0_212 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/2/artifact/out/patch-unit-hadoop-hdds.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/2/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/2/testReport/ |
| Max. process+thread count | 5050 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/framework U: hadoop-hdds/framework |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/2/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 40 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 614 | trunk passed |
| +1 | compile | 364 | trunk passed |
| +1 | checkstyle | 63 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 835 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 148 | trunk passed |
| 0 | spotbugs | 444 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 636 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 555 | the patch passed |
| +1 | compile | 352 | the patch passed |
| +1 | javac | 352 | the patch passed |
| +1 | checkstyle | 66 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 611 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 140 | the patch passed |
| +1 | findbugs | 615 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 292 | hadoop-hdds in the patch failed. |
| -1 | unit | 1825 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 36 | The patch does not generate ASF License warnings. |
| | | 7333 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdds.scm.pipeline.TestRatisPipelineProvider |
| | hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
| | hadoop.ozone.om.TestScmSafeMode |
| | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
| | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
| | hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-962/3/artifact/out/Dockerfile |
| GITHUB PR | #962 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux dbe65afceb8c 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / c0a0c35 |
| Default Java | 1.8.0_212 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/3/artifact/out/patch-unit-hadoop-hdds.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/3/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/3/testReport/ |
| Max. process+thread count | 4733 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/framework U: hadoop-hdds/framework |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-962/3/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

elek commented Jul 29, 2019

Thanks @arp7 for the review. I am committing it with the suggested 3x timeout (= 3x the lease timeout).

@elek elek closed this in b039f75 Jul 29, 2019
amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019