Skip to content

Implement getting the cluster nodes list from nodes #135

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 10, 2019

Conversation

msiomkin
Copy link

Periodically call a user-defined Lua function on a node to obtain or
refresh the full cluster nodes list.

Resolves: #134

@msiomkin msiomkin force-pushed the get-node-list-from-nodes branch 3 times, most recently from 7a4659b to 0448226 Compare March 23, 2019 09:52
@msiomkin msiomkin changed the title WIP: Implement getting the cluster nodes list from nodes Implement getting the cluster nodes list from nodes Mar 23, 2019
@msiomkin msiomkin force-pushed the get-node-list-from-nodes branch 4 times, most recently from 9dce8ff to b087be0 Compare March 28, 2019 07:05
Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch and sorry for the long delay before the review. Please, consider some initial comments about the patch.

Please, factor out all unrelated changes into separate PRs: say, updating of Travis-CI jobs, Sophia → Vinyl update in the README, fix of the authentification issue (see below). Then squash all relevant commits: nobody is interesting in intermediate changes.

@Totktonada
Copy link
Member

BTW, there is the task about distros updating (#123). I have pinged Alexander about that.

@denis-ignatenko denis-ignatenko force-pushed the get-node-list-from-nodes branch from ddcf29c to cf62582 Compare May 23, 2019 16:42
Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, rebase and squash changes. Factor out unrelated readme updates (I suggest to file separate PR to don't mix discussions, but don't insist: separate commit is ok too).

I look at the PR more or less briefly, so feel free to discuss certain points here or privately in text or voicely.

@Totktonada
Copy link
Member

I decided to skip review ping-pong and reworked some parts of the patch where I guess I can improve things. I have made the following changes:

  • Fixed all flake8 warnings in mesh_connection.py and test_mesh.py: trailing whitespaces and long lines.
  • Prevent a user provided addrs argument from changing: copy it before inserting host+port.
  • Added warnings for parse_uri() and other cases like a network error during a call to a discovery function.
  • Removed _opt_reconnect() before a call to a discovery function, but give a warning if something went wrong during the call.
  • Restrict validation (say, catch ':1234' or 'localhost:65536'), extract it from RoundRobinStrategy; reworked to return warning messages as values to use as warnings or error (see the following bullet).
  • Raise a configuration error if user provided addresses does not pass validation. Give warnings as before for cluster discovery function result.
  • Added test cases for wrong cluster discovery function results.
  • Give a configuration error for empty address list even with connect_now=False. It is needed to keep self.{host,port} consistent with self.strategy.pos and so simplify the code: say, we don't need to explicitly get a next address in connect(), only in _opt_reconnect().
  • Reverted cycle usage in RoundRobinStrategy to implement self.pos adjusting properly (when a current address is in the new address list).
  • Reverted self.connected = False in connection.py::close(). I looked at the git history and it is intended to be the internal field. Added test case for connect() after close() in PR test: verify connect() after close() #144. If we want to provide the way to determine whether a connection is alive we need to expose a function. But maybe ping() should be used for this.
  • Fixed the bug when self.strategy.getnext() gives currently used address: this is why +1 was needed in the loop around super(...)._opt_reconnect() + getnext() in MeshConnection._opt_reconnect().
  • Re-raise a last error in MeshConnection._opt_reconnect() (before it raises NetworkError class).
  • Reworked test_mesh.py a bit: added / enhanced comments, cropped extra arguments where it is appropriate, some taste changes.
  • Expanded the commit message.

Maybe I missed to mention something in the list.

@robinhood23rus Can you please give me feedback about the new version of the patch (force-pushed here).

This feature adds the new optional arguments to the MeshConnection
contructor:

* `cluster_discovery_function` -- a name of the function which will be
  periodically called on a currently connected tarantool instance to
  update a list of MeshConnection addresses.
* `cluster_discovery_delay` -- minimal amount of seconds between address
  list updates (default is 60 seconds).

The update of addresses is performed right after successful connecting
and before performing a request (if a minimal time passes).

This commits changes the round robin retry strategy. Before it performs
two attempts to connect to each address reconnect_max_attempts times (3
by default), now it do that only once.

The new type of error is added: ConfigurationError. It is risen when a
user provides incorrect configuration: say, one of provided addresses is
not correct.

The new type of warning is added: ClusterDiscoveryWarning. This warning
is shown when a something went wrong during cluster discovery: say, one
of returned addresses is not correct. Note the difference: a user
provided configuration verified strictly, while a cluster discovery
function result is filtered (with warnings) and good addresses are
applied (if the list is not empty).

Aside of the new functionality this commit improves compatibility of
MeshConnection API with Connection. The following arguments are added to
the MeshConnection constructor: `host`, `port`, `call_16`,
`connection_timeout`. An address from `host` / `port` arguments is added
to `addrs` (if provided) as the first item.

Fixes #134.
@Totktonada Totktonada force-pushed the get-node-list-from-nodes branch from 1576d2c to d64c52b Compare June 10, 2019 07:42
@Totktonada
Copy link
Member

Reorganized parse_uri() to get rid of asserts and make it obvious that its conditions are always met.

@denis-ignatenko
Copy link
Contributor

Changes ok. Approve

@Totktonada Totktonada merged commit 4dfe5f9 into master Jun 10, 2019
@Totktonada Totktonada deleted the get-node-list-from-nodes branch June 10, 2019 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Get a list of 'connectable' instances from the first node
3 participants