-
Notifications
You must be signed in to change notification settings - Fork 884
Default to empty DC failover preferred remoted DCs #2027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Default to empty DC failover preferred remoted DCs #2027
Conversation
8e1836e
to
1872c81
Compare
Edited the commit with reformatted code. Also rebased while there. |
1872c81
to
2f420ab
Compare
is it possible to add a unit test to cover this case? |
Yes, sure. Should be easy. |
Now that I've checked, I cannot remember back the exact error I observed during my testing and also seems like putting non existing DCs to the preferredRemoteDcs config is kinda makes it no-op and narrows down to behave as its is unset. So this change is needed only for the cosmetic-perfectioninsm purposes (which is optional). I came to that conclusion after attempting to build a test to reproduce the problem. I will close this MR until finding back what was causing the issues. Sry for the noise. |
So I can demonstrate pretty conclusively that without this patch we are getting a non-empty list for preferred remote DCs in the default case: @Test
public void default_preferred_dcs_should_be_empty() {
CqlSession session = new CqlSessionBuilder().build();
DriverExecutionProfile profile = session.getContext().getConfig().getDefaultProfile();
assertThat(profile.getStringList(DefaultDriverOption.LOAD_BALANCING_DC_FAILOVER_PREFERRED_REMOTE_DCS)).isEmpty();
}
The fix provided by @jahstreet in this PR does indeed make this problem go away (as expected). That said, I agree with @jahstreet that it's not immediately clear to me that there's a problem here. The fact that the preferred DC list is always non-empty puts us into buildRemoteQueryPlanPreferred() here which means that we'll prefer nodes from a DC whose name is the empty string first in this method. But we won't find any nodes in that DC so... we basically wind up looping through the existing nodes anyway, which leaves us with something very much like what we'd get out of buildRemoteQueryPlanAll(). All of that said, the analysis above is entirely dependent on the current impl of these functions; that could easily change and a change in buildRemoteQueryPlanPreferred() might cause unexpected side effects if it's still being used when we don't expect it to be. I agree with @jahstreet that this is hardly the most significant problem we're facing but it seems to me like (a) we do have a pretty clear error in the code in this PR and (b) we have a very simple fix (in the form of this PR) for it. So my inclination is to say let's merge it. @netudima is right to ask about a test for this case but honestly I'm not sure how we'd do that. You could try a variation on BasicLoadBalancingPolicyPreferredRemoteDcsTest with a different impl of createAndInitPolicy() which sets up the preferred remote DCs list to be either empty or (probably better) equal to the default. But here you hit a problem based on the behaviour mentioned above; because buildRemoteQueryPlanPreferred() and buildRemoteQueryPlanAll() should produce similar results here there's no way to tell which function was actually used. I suppose there's an argument that it doesn't matter; our only interest is in making sure that the expected value is returned in this case and we don't care which of those two functions actually produces it... as long as we can confirm we get the right output. I guess I'd be okay with that... but it seems like we'd need a pretty good comment (at least) saying what the pitfall is and why we're doing it. |
Thanks @absurdfarce for summarizing. Your comments are always detailed enough while kept straight to the point. I need to learn doing it alike from you! My initial motivation to do that change was a weird routing bug which I observed in my remote environment causing empty query plan to be returned, which I mistakenly thought was caused by this default setting, but most probably it was something different. I wasn't able to remember and reproduce it back while coming up with the test case. For now, I see this problem as a Magic String problem, not more than that. Happy to push it through though if we all agree on it. wrt the comment - can it be as simple as "Default preferred remote DC config to empty list"? |
In #1896 we added a feature to prefer the order of the remote DCs on the x-DC failovers. As described in #1896 (comment), the default is misleading to errors. With this tiny change I propose to default to empty list i/o containing a single empty string value.
cc @nitinitt