@adoroszlai
Contributor

What changes were proposed in this pull request?

If any container in the sample cluster fails to start, all successfully started containers are left running. This prevents any further acceptance tests from completing normally. It is only a minor inconvenience, since the acceptance test run as a whole fails either way.

This change makes sure the cluster is stopped if startup fails.
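A minimal sketch of the idea, not the patch itself: chain the startup steps and, if any of them fails, stop whatever is already running before reporting the failure. The functions `start_containers`, `wait_until_ready`, and `stop_cluster` are hypothetical stand-ins for the real helpers (such as `start_docker_env` and `wait_for_datanodes`); here they only record that they were called.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the real startup helpers. They only record
# that they were called and return a configurable exit code.
calls=""
start_containers() { calls="$calls start"; return "${START_RC:-0}"; }
wait_until_ready() { calls="$calls wait"; return "${WAIT_RC:-0}"; }
stop_cluster()     { calls="$calls stop"; }

# Start the cluster; if any step fails, stop whatever is already running
# so that later tests do not find leftover containers.
start_cluster() {
  if start_containers && wait_until_ready; then
    return 0
  fi
  stop_cluster
  return 1
}
```

With `WAIT_RC=1`, calling `start_cluster` runs the start and wait steps, then the stop step, and returns a non-zero status so the caller still sees the failure.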

https://issues.apache.org/jira/browse/HDDS-2045

How was this patch tested?

Temporarily added fake failures in start_docker_env and wait_for_datanodes, and verified that the cluster is stopped:

$ ./test.sh
Removing network ozone_default
WARNING: Network ozone_default not found.
Creating network "ozone_default" with the default driver
Creating ozone_scm_1      ... done
Creating ozone_datanode_1 ... done
Creating ozone_datanode_2 ... done
Creating ozone_datanode_3 ... done
Creating ozone_om_1       ... done
0 datanode is up and healthy (until now)
Stopping ozone_datanode_1 ... done
Stopping ozone_datanode_3 ... done
Stopping ozone_om_1       ... done
Stopping ozone_datanode_2 ... done
Stopping ozone_scm_1      ... done
Removing ozone_datanode_1 ... done
Removing ozone_datanode_3 ... done
Removing ozone_om_1       ... done
Removing ozone_datanode_2 ... done
Removing ozone_scm_1      ... done
Removing network ozone_default

Verified that the test succeeds without the fake failure.

$ ./test.sh
Removing network ozone_default
WARNING: Network ozone_default not found.
Creating network "ozone_default" with the default driver
Creating ozone_scm_1      ... done
Creating ozone_om_1       ... done
Creating ozone_datanode_1 ... done
Creating ozone_datanode_2 ... done
Creating ozone_datanode_3 ... done
0 datanode is up and healthy (until now)
3 datanodes are up and registered to the scm
==============================================================================
ozone-auditparser
==============================================================================
ozone-auditparser.Auditparser :: Smoketest ozone cluster startup
==============================================================================
Initiating freon to generate data                                     | PASS |
------------------------------------------------------------------------------
Testing audit parser                                                  | PASS |
------------------------------------------------------------------------------
ozone-auditparser.Auditparser :: Smoketest ozone cluster startup      | PASS |
2 critical tests, 2 passed, 0 failed
2 tests total, 2 passed, 0 failed
==============================================================================
ozone-auditparser                                                     | PASS |
2 critical tests, 2 passed, 0 failed
2 tests total, 2 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-auditparser-om.xml
==============================================================================
ozone-basic :: Smoketest ozone cluster startup
==============================================================================
Check webui static resources                                          | PASS |
------------------------------------------------------------------------------
Start freon testing                                                   | PASS |
------------------------------------------------------------------------------
ozone-basic :: Smoketest ozone cluster startup                        | PASS |
2 critical tests, 2 passed, 0 failed
2 tests total, 2 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-basic-scm.xml
Stopping ozone_datanode_1 ... done
Stopping ozone_datanode_3 ... done
Stopping ozone_datanode_2 ... done
Stopping ozone_om_1       ... done
Stopping ozone_scm_1      ... done
Removing ozone_datanode_1 ... done
Removing ozone_datanode_3 ... done
Removing ozone_datanode_2 ... done
Removing ozone_om_1       ... done
Removing ozone_scm_1      ... done
Removing network ozone_default

@adoroszlai
Contributor Author

/label ozone

@elek elek added the ozone label Aug 27, 2019
&& wait_for_datanodes "$COMPOSE_FILE" "${datanode_count}" \
&& sleep 10

if [[ $? -gt 0 ]]; then

shellcheck:9: note: Check exit code directly with e.g. 'if mycmd;', not indirectly with $?. [SC2181]
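For illustration, the direct form that SC2181 recommends could look roughly like the sketch below. `startup_steps` is a hypothetical function standing in for the chained startup commands; it is not part of the patch.

```shell
#!/usr/bin/env bash
# Hypothetical command that always fails, standing in for the chained
# startup steps.
startup_steps() { return 3; }

# Indirect check, the pattern SC2181 flags:
#   startup_steps
#   if [ $? -gt 0 ]; then ...; fi

# Direct check: test the command itself in the condition.
check_startup() {
  if ! startup_steps; then
    echo "startup failed"
    return 1
  fi
  echo "startup ok"
}
```

Testing the command directly avoids the risk that another command slips in between the call and the `$?` check and silently changes the status being tested.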

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| 0 | reexec | 40 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| 0 | shelldocs | 0 | Shelldocs was not available. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 580 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 802 | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 563 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| -1 | shellcheck | 0 | The patch generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 696 | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ |
| +1 | unit | 112 | hadoop-hdds in the patch passed. |
| +1 | unit | 292 | hadoop-ozone in the patch passed. |
| +1 | asflicense | 51 | The patch does not generate ASF License warnings. |
| | | 3349 | |

| Subsystem | Report/Notes |
|---|---|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1358/1/artifact/out/Dockerfile |
| GITHUB PR | #1358 |
| Optional Tests | dupname asflicense mvnsite unit shellcheck shelldocs |
| uname | Linux 6dd06018c83b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 6e37d65 |
| shellcheck | https://builds.apache.org/job/hadoop-multibranch/job/PR-1358/1/artifact/out/diff-patch-shellcheck.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1358/1/testReport/ |
| Max. process+thread count | 447 (vs. ulimit of 5500) |
| modules | C: hadoop-ozone/dist U: hadoop-ozone/dist |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1358/1/console |
| versions | git=2.7.4 maven=3.3.9 shellcheck=0.4.6 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

Member

@elek elek left a comment

+1, thanks for the contribution @adoroszlai.

Will commit it soon.

@elek elek closed this in c749f62 Aug 29, 2019
@adoroszlai adoroszlai deleted the HDDS-2045 branch August 29, 2019 08:23
@adoroszlai
Contributor Author

Thanks @elek for committing it.

amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019
RogPodge pushed a commit to RogPodge/hadoop that referenced this pull request Mar 25, 2020