Skip to content

Conversation

@zhangkun83
Copy link
Contributor

@zhangkun83 zhangkun83 commented Mar 26, 2019

Resolves #5497

Notes for reviewers

The API and the needed changes in LoadBalancer implementations are relatively simple. The large changes are mostly in tests. Please focus on non-test files first.

Motivation

In hierarchical LoadBalancers (e.g., XdsLoadBalancer) or wrapped LoadBalancers (e.g., HealthCheckingLoadBalancerFactory, the top-level LoadBalancer receives Subchannel state updates from the Channel impl, and they almost always pass it down to its children LoadBalancers.

Sometimes the children LoadBalancers are not directly created by the parent, thus requires whatever API in the middle to also pass Subchannel state updates, complicating that API. For example, the proposed RequestDirector includes handleSubchannelState() solely to plumb state updates to where they are used. We also see this pattern in HealthCheckingLoadBalancerFactory, GrpclbState and SubchannelPool.

Another minor issue is, the parent LoadBalancer would need to intercept the Helper passed to its children to map Subchannels to the children LoadBalancers, so that it pass updates about relevant Subchannels to the children. Otherwise, a child LoadBalancer may be surprised by seeing Subchannel not created by it, and it's not efficient to broadcast Subchannel updates to all children.

API Proposal

We will pass a SubchannelStateListener when creating a Subchannel to accept state updates, those updates could be directly passed to where the Subchannel is created, skipping the explicit chaining in the middle.

Also define a first-class object CreateSubchannelArgs to pass arguments for the reasons below:

  1. It may avoid breakages when we add new arguments to createSubchannel(). For example, a LoadBalancer may wrap Helper and intercept createSubchannel() in a hierarchical case. It may not be interested in all arguments. Passing a single CreateSubchannelArgs will not break the parent LoadBalancer if we add new fields later.
  2. This also reduces the eventual size of Helper interface, as the convenience createSubchannel() that accepts one EAG instead of a List is no longer necessary, since that convenience is moved into CreateSubchannelArgs.
interface SubchannelStateListener {
  void onSubchannelState(Subchannel subchannel, ConnectivityStateInfo newState);
}

abstract class LoadBalancer.Helper {
  abstract Subchannel createSubchannel(CreateSubchannelArgs args);
}

final class CreateSubchannelArgs {
  final List<EquivalentAddressGroup> getAddresses();
  final Attributes getAttributes();
  final SubchannelStateListener getStateListener();
  final class Builder () {
    ...
  }
}

The new createSubchannel() must be called from synchronization context, as a step towards #5015.

How the new API helps

Most hierarchical LoadBalancers would just let the listener from the child LoadBalancers directly pass through to gRPC core, which is less boilerplate than before.

Without any effort by the parent, each child will only see updates for the Subchannels it has created, which is clearer and more efficient.

If a parent LoadBalancer does want to look at or alter the Subchannel state updates for its delegate (like in HealthCheckingLoadBalancerFactory), it can still do so in the wrapping LoadBalancer.Helper passed to the delegate by intercepting the SubchannelStateListener.

Migration implications

Existing LoadBalancer implementations will continue to work, while they will see deprecation warnings when compiled:

  • The old LoadBalancer.Helper#createSubchannel variants are now deprecated, but will work until they are deleted. They create a SubchannelStateListener that delegates to LoadBalancer#handleSubchannelState.
  • LoadBalancer#handleSubchannelState is now deprecated, and will throw if called and the implementation doesn't override it. It will be deleted in a future release.

The migration for most LoadBalancer implementation would be moving the logic from LoadBalancer#handleSubchannelState into a SubchannelStateListener.

@carl-mastrangelo
Copy link
Contributor

Quick thoughts before diving in:

  • Listeners in our code typically get passed in at start, not construction. (e.g. NameResolver.Listener, ClientCall.Listener, ServerCall.Listener, InternalServer.Listener, ManagedClientTransport.Listener) I think you should keep with this pattern, and attach the listener to the subchannel later. This would fill a missing gap in Subchannels API, which already has a shutdown() method.

  • In context of the comment above, I think that would allow the Subchannel arg from the listener to go away, since the LB can late-associate the listener with the Subchannel.

@zhangkun83
Copy link
Contributor Author

Listeners in our code typically get passed in at start, not construction. (e.g. NameResolver.Listener, ClientCall.Listener, ServerCall.Listener, InternalServer.Listener, ManagedClientTransport.Listener) I think you should keep with this pattern, and attach the listener to the subchannel later. This would fill a missing gap in Subchannels API, which already has a shutdown() method.
In context of the comment above, I think that would allow the Subchannel arg from the listener to go away, since the LB can late-associate the listener with the Subchannel.

I thought about that option, but didn't go that route. The Subchannel arg from the listener actually fits naturally. A LoadBalancer that creates more than one Subchannels will almost always have one listener that looks at the states of all Subchannels and update the picker. If we stop passing Subchannel to the listener, they would end up doing something like:

...
subchannel.start((state) -> handleState(subchannel, state));
...
private void handleState(Subchannel subchannel, ConnectivityStateInfo state) {
...
}

For the sake of API style consistency, and a relatively smaller churn on the API (which mostly affects unit tests), it may be the right thing to do.

@zhangkun83
Copy link
Contributor Author

Another reason that may justify passing the listener to createSubchannel(): prevent bugs in LoadBalancer implementations.

The interfaces that has start(), e.g.,ClientCall, are active objects in the sense they act on their own, and the listener is the only way to receive outputs from it. Thus it's impossible to make the mistake of "forgetting to call start()". It would be very obvious in the code.

Subchannel is a passive object, as it's returned as a result of a pick. It by its own doesn't produce any output, and the listener is for making decisions related to it (include it or not). It's possible that a LoadBalancer forgets to call start() and misses out all the updates from that Subchannel.

As suggested by @carl-mastrangelo, using a first-class object to pass
arguments may avoid breakages when we add new arguments to
createSubchannel().  For example, a LoadBalancer may wrap Helper and
intercept createSubchannel() in a hierarchical case.  It may not be
interested in all arguments.  Passing a single CreateSubchannelArgs
will not break the parent LoadBalancer if new fields are added.

This also reduces the eventual size of Helper interface, as the
convenience createSubchannel() that accepts one EAG instead of a List
is no longer necessary. That convenience is moved into
CreateSubchannelArgs.
Copy link
Contributor

@carl-mastrangelo carl-mastrangelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM. A few nits first

private CreateSubchannelArgs(
List<EquivalentAddressGroup> addrs, Attributes attrs,
SubchannelStateListener stateListener) {
this.addrs = checkNotNull(addrs, "addresses are not set");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a copy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made Builder copy it.

this.addrs = Collections.singletonList(addrs);
return this;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make Builders ctor package private

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (entry == null) {
subchannel = helper.createSubchannel(eag, defaultAttributes);
final Attributes attrs = defaultAttributes.toBuilder()
.set(STATE_LISTENER, new AtomicReference<>(listener)).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: .build() should be on the next line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

private RoundRobinPicker currentPicker =
new RoundRobinPicker(Collections.<DropEntry>emptyList(), Arrays.asList(BUFFER_ENTRY));

private final SubchannelStateListener subchannelStateListener = new SubchannelStateListener() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing you should consider: Make these listeners a named private class. It makes debugging much easier in a stack trace. Also, the named classes can be made final, which avoids letting Mockito.spy extend them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed SubchannelStateListener per @ejona86's request. Some LoadBalancers now implement it directly. In other cases I have made named classes for the listeners.

*/
public Builder setAddresses(List<EquivalentAddressGroup> addrs) {
checkArgument(!addrs.isEmpty(), "addrs is empty");
this.addrs = Collections.unmodifiableList(addrs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An immutable copy or doc that the given arg should not be mutated afterwards?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

@Test
public void helper_createSubchannel_delegates() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this does not test anything useful. (It only tests the ArgsBuilder creates equal instances for the same input though, but that test could be simpler.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Deleting this test.

new NoopHelper().createSubchannel(Arrays.asList(eag), attrs);
}

@Test(expected = UnsupportedOperationException.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this will be an errorprone when import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. TestExceptionChecker discourages use of @Test(expected) because "the test will pass if any statement in the test method throws the expected exception". Switched away from it.

.setStateListener(subchannelStateListener)
.build());
fail("Should throw");
} catch (IllegalStateException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This will also be an errorprone when import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does errorprone prefer assertThrows? It's not used in grpc.

@SuppressWarnings({"deprecation", "unchecked"})
public void tearDown() throws Exception {
verifyNoMoreInteractions(mockArgs);
verify(mockHelper, never()).createSubchannel(any(List.class), any(Attributes.class));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ArgumentMatchers.<List<EAG>>any() does not need suppress unchecked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// in the cache.
subchannel.getAttributes().get(STATE_LISTENER).set(listener);
// Make the listener up-to-date with the latest state in case it has changed while it's in the
// cache.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the whole method in syncContext? Why not run listener.onSubchannelState(subchannel, entry.state) directly, so that there will be no state change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onSubchannelState() is not re-entrant with the other logic from GrpclbState when called by channel impl. Although reentrancy may not cause a problem, it's better to keep the baheavior closer to with channel impl.

That said, I think there is a problem with the current code. If channel impl schedules an update before the task on the next line while this method is running, the update from the channel, which is newer, would be overwritten with the older value by the latter task. I went with a different approach of the implementation that always track Subchannels and their states, whether in or out of the pool, which ends up simpler.

Copy link
Contributor

@dapengzhang0 dapengzhang0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

when a subchannel is returned and channel impl sends an update at the
same time, channel impl may schedule the state update BEFORE the fake
update call for the previously saved state, thus the newer state would
be overwritten by the older state.
Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spoke with Kun about the API change yesterday. In the end I wasn't really convinced and close to being on the fence, but was willing to defer to Kun.

Without this API change, parent policies with multiple subpolicies concurrently need to track which subchannels go to which subpolicy. They can do that with ~2ish lines via the per-Subchannel Attributes, so I didn't find it that big of a deal. So I was sort of leaning toward the conservative approach and let sleeping dogs lie; changes to the API cause instability that can lead to more changes. But I also agreed there weren't too many ways this could cause additional issues.

Copy link
Contributor Author

@zhangkun83 zhangkun83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ejona86 PTAL

private RoundRobinPicker currentPicker =
new RoundRobinPicker(Collections.<DropEntry>emptyList(), Arrays.asList(BUFFER_ENTRY));

private final SubchannelStateListener subchannelStateListener = new SubchannelStateListener() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed SubchannelStateListener per @ejona86's request. Some LoadBalancers now implement it directly. In other cases I have made named classes for the listeners.

}

@Test
public void helper_createSubchannel_delegates() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Deleting this test.

new NoopHelper().createSubchannel(Arrays.asList(eag), attrs);
}

@Test(expected = UnsupportedOperationException.class)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. TestExceptionChecker discourages use of @Test(expected) because "the test will pass if any statement in the test method throws the expected exception". Switched away from it.

.setStateListener(subchannelStateListener)
.build());
fail("Should throw");
} catch (IllegalStateException e) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does errorprone prefer assertThrows? It's not used in grpc.

@SuppressWarnings({"deprecation", "unchecked"})
public void tearDown() throws Exception {
verifyNoMoreInteractions(mockArgs);
verify(mockHelper, never()).createSubchannel(any(List.class), any(Attributes.class));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoadBalancer and CachedSubchannelPool look good. I didn't look much at anything else.

@zhangkun83 zhangkun83 merged commit 62b03fd into grpc:master Apr 12, 2019
@zhangkun83 zhangkun83 deleted the handle_subchannel_state branch April 12, 2019 17:58
zhangkun83 added a commit to zhangkun83/grpc-java that referenced this pull request May 8, 2019
…r rather than LoadBalancer (grpc#5503)"

This reverts commit 62b03fd.

Effectively reverts its follow-up commits:
dc218b6
405d8c3
44840fe
zhangkun83 added a commit that referenced this pull request May 8, 2019
…r rather than LoadBalancer (#5503)" (#5684)

This reverts commit 62b03fd.

Effectively reverts its follow-up commits:
dc218b6
405d8c3
44840fe
zhangkun83 added a commit that referenced this pull request May 17, 2019
…than LoadBalancer (take 2) (#5722)

This is a revised version of #5503 (62b03fd), which was rolled back in f8d0868. The newer version passes SubchannelStateListener to Subchannel.start() instead of SubchannelCreationArgs, which allows us to remove the Subchannel argument from the listener, which works as a solution for #5676.

LoadBalancers that call the old createSubchannel() will get start() implicitly called with a listener that passes updates to the deprecated LoadBalancer.handleSubchannelState(). Those who call the new createSubchannel() will have to call start() explicitly.

GRPCLB code is still using the old API, because it's a pain to migrate the SubchannelPool to the new API.  Since CachedSubchannelHelper is on the way, it's easier to switch to it when it's ready. Keeping
GRPCLB with the old API would also confirm the backward compatibility.
@lock lock bot locked as resolved and limited conversation to collaborators Jul 11, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move LoadBalancer.handleSubchannelState() to a per-Subchannel listener interface

4 participants