Closed
Description
I'm seeing these error in a dtest that adds 2 new nodes (in parallel) to a 3-nodes cluster.
08:56:37,665 1736154 ccm DEBUG cluster.py :762 | test_tablets_add_two_nodes_in_parallel: node4: Starting scylla: args=['/home/bhalevy/.dtest/dtest-u7u3ay78/test/node4/bin/scylla', '--options-file', '/home/bhalevy/.dtest/dtest-u7u3ay78/test/no
de4/conf/scylla.yaml', '--log-to-stdout', '1', '--api-address', '127.0.73.4', '--smp', '2', '--memory', '1024M', '--developer-mode', 'true', '--default-log-level', 'info', '--overprovisioned', '--prometheus-address', '127.0.73.4', '--unsafe-bypass-fsync', '1', '--kernel-page-cache'
, '1', '--commitlog-use-o-dsync', '0', '--max-networking-io-control-blocks', '1000'] wait_other_notice=False wait_for_binary_proto=False
08:56:37,667 1736154 ccm DEBUG cluster.py :762 | test_tablets_add_two_nodes_in_parallel: node4: Starting scylla-jmx: args=['/home/bhalevy/.dtest/dtest-u7u3ay78/test/node4/bin/symlinks/scylla-jmx', '-Dapiaddress=127.0.73.4', '-Djavax.managemen
t.builder.initial=com.scylladb.jmx.utils.APIBuilder', '-Djava.rmi.server.hostname=127.0.73.4', '-Dcom.sun.management.jmxremote', '-Dcom.sun.management.jmxremote.host=127.0.73.4', '-Dcom.sun.management.jmxremote.port=7199', '-Dcom.sun.management.jmxremote.rmi.port=7199', '-Dcom.sun.
management.jmxremote.local.only=false', '-Xmx256m', '-XX:+UseSerialGC', '-Dcom.sun.management.jmxremote.authenticate=false', '-Dcom.sun.management.jmxremote.ssl=false', '-jar', '/home/bhalevy/.dtest/dtest-u7u3ay78/test/node4/bin/scylla-jmx-1.0.jar']
08:56:37,972 1736154 ccm DEBUG cluster.py :762 | test_tablets_add_two_nodes_in_parallel: node5: Starting scylla: args=['/home/bhalevy/.dtest/dtest-u7u3ay78/test/node5/bin/scylla', '--options-file', '/home/bhalevy/.dtest/dtest-u7u3ay78/test/no
de5/conf/scylla.yaml', '--log-to-stdout', '1', '--api-address', '127.0.73.5', '--smp', '2', '--memory', '1024M', '--developer-mode', 'true', '--default-log-level', 'info', '--overprovisioned', '--prometheus-address', '127.0.73.5', '--unsafe-bypass-fsync', '1', '--kernel-page-cache'
, '1', '--commitlog-use-o-dsync', '0', '--max-networking-io-control-blocks', '1000'] wait_other_notice=False wait_for_binary_proto=False
08:56:37,973 1736154 ccm DEBUG cluster.py :762 | test_tablets_add_two_nodes_in_parallel: node5: Starting scylla-jmx: args=['/home/bhalevy/.dtest/dtest-u7u3ay78/test/node5/bin/symlinks/scylla-jmx', '-Dapiaddress=127.0.73.5', '-Djavax.managemen
t.builder.initial=com.scylladb.jmx.utils.APIBuilder', '-Djava.rmi.server.hostname=127.0.73.5', '-Dcom.sun.management.jmxremote', '-Dcom.sun.management.jmxremote.host=127.0.73.5', '-Dcom.sun.management.jmxremote.port=7199', '-Dcom.sun.management.jmxremote.rmi.port=7199', '-Dcom.sun.
management.jmxremote.local.only=false', '-Xmx256m', '-XX:+UseSerialGC', '-Dcom.sun.management.jmxremote.authenticate=false', '-Dcom.sun.management.jmxremote.ssl=false', '-jar', '/home/bhalevy/.dtest/dtest-u7u3ay78/test/node5/bin/scylla-jmx-1.0.jar']
08:56:43,620 1736154 cassandra.cluster WARNING thread.py :58 | test_tablets_add_two_nodes_in_parallel: Found an invalid row for peer (127.0.73.5). Ignoring host.
08:56:43,620 1736154 cassandra.cluster INFO thread.py :58 | test_tablets_add_two_nodes_in_parallel: New Cassandra host <Host: 127.0.73.4:9042 datacenter1> discovered
08:56:45,521 1736154 cassandra.cluster WARNING thread.py :58 | test_tablets_add_two_nodes_in_parallel: Found an invalid row for peer (127.0.73.5). Ignoring host.
08:56:46,585 1736154 update_cluster_layout_tests DEBUG update_cluster_layout_tests.py:436 | test_tablets_add_two_nodes_in_parallel: Check that nodes started successfully
08:56:46,586 1736154 update_cluster_layout_tests DEBUG update_cluster_layout_tests.py:441 | test_tablets_add_two_nodes_in_parallel: Verify tablet load-balancing after 10 seconds
08:56:48,206 1736154 cassandra.cluster INFO thread.py :58 | test_tablets_add_two_nodes_in_parallel: New Cassandra host <Host: 127.0.73.5:9042 datacenter1> discovered
08:56:56,590 1736154 cassandra.cluster INFO dtest_setup.py :493 | test_tablets_add_two_nodes_in_parallel: New Cassandra host <Host: 127.0.73.5:9042 datacenter1> discovered
08:56:56,591 1736154 cassandra.cluster INFO dtest_setup.py :493 | test_tablets_add_two_nodes_in_parallel: New Cassandra host <Host: 127.0.73.3:9042 datacenter1> discovered
08:56:56,591 1736154 cassandra.cluster INFO dtest_setup.py :493 | test_tablets_add_two_nodes_in_parallel: New Cassandra host <Host: 127.0.73.2:9042 datacenter1> discovered
08:56:56,591 1736154 cassandra.cluster INFO dtest_setup.py :493 | test_tablets_add_two_nodes_in_parallel: New Cassandra host <Host: 127.0.73.4:9042 datacenter1> discovered
It looks like the error heals itself eventually but it's unclear why the the row in system.peers was considered as invalid, and on which node it was read from.
It would help if we printed more information to facilitate debugging in case something is wrong in the scylla server side.
Metadata
Metadata
Assignees
Labels
No labels