@arjan-bal
Contributor

@arjan-bal arjan-bal commented Jan 23, 2025

This change allows the children of endpointsharding to remain in IDLE state. This allows lazy initialization of children and therefore lazy creation of subchannels for endpoints. This enables the usage of endpointsharding in the ringhash balancer for managing pickfirst children.
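The lazy behaviour described above can be illustrated with a small, self-contained sketch. It does not use the real grpc-go API: `lazyChild`, `exitIdle`, `buildChildren`, and `totalSubchannels` are hypothetical names, and a "subchannel" is simulated by a counter. The only point is that a child created in IDLE allocates nothing until it is asked to leave IDLE.

```go
package main

import "fmt"

type connState int

const (
	idle connState = iota
	connecting
)

// lazyChild is a hypothetical stand-in for a per-endpoint child policy.
// It is NOT a grpc-go type.
type lazyChild struct {
	endpoint    string
	state       connState
	subchannels int // number of simulated subchannels created so far
}

// exitIdle creates the simulated subchannel on first use only.
// Repeated calls are no-ops once the child has left IDLE.
func (c *lazyChild) exitIdle() {
	if c.state != idle {
		return
	}
	c.subchannels++
	c.state = connecting
}

// buildChildren creates one IDLE child per endpoint without connecting.
func buildChildren(endpoints []string) []*lazyChild {
	children := make([]*lazyChild, 0, len(endpoints))
	for _, e := range endpoints {
		children = append(children, &lazyChild{endpoint: e})
	}
	return children
}

// totalSubchannels sums the simulated subchannels across all children.
func totalSubchannels(children []*lazyChild) int {
	total := 0
	for _, c := range children {
		total += c.subchannels
	}
	return total
}

func main() {
	children := buildChildren([]string{"10.0.0.1:443", "10.0.0.2:443", "10.0.0.3:443"})
	fmt.Println(totalSubchannels(children)) // prints 0: every child is still IDLE

	// Only the child that is actually needed leaves IDLE.
	children[1].exitIdle()
	children[1].exitIdle() // second call is a no-op
	fmt.Println(totalSubchannels(children)) // prints 1
}
```

In this sketch a parent such as a ring-hash style policy would call `exitIdle` only on the child it picks, so untouched endpoints never create connections.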

RELEASE NOTES: N/A

@arjan-bal arjan-bal added Type: Feature New features or improvements in behavior Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. labels Jan 23, 2025
@arjan-bal arjan-bal added this to the 1.71 Release milestone Jan 23, 2025
@codecov
codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 89.18919% with 4 lines in your changes missing coverage. Please review.

Project coverage is 82.26%. Comparing base (67bee55) to head (930c996).
Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
balancer/endpointsharding/endpointsharding.go 89.18% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8031      +/-   ##
==========================================
+ Coverage   82.22%   82.26%   +0.03%     
==========================================
  Files         383      384       +1     
  Lines       38688    38926     +238     
==========================================
+ Hits        31813    32024     +211     
- Misses       5555     5575      +20     
- Partials     1320     1327       +7     
Files with missing lines Coverage Δ
balancer/endpointsharding/endpointsharding.go 77.12% <89.18%> (+0.96%) ⬆️

... and 44 files with indirect coverage changes

bal := child.(*balancerWrapper)
if _, ok := newChildren.Get(e); !ok {
bal.Close()
bal.isClosed = true
Member

Can we move this into balancerWrapper's Close method?

Also since calls into a balancer.Balancer must not happen concurrently, I don't think it requires a lock.

Contributor Author

Added a Close method and moved this into the Close method.

@dfawley dfawley assigned arjan-bal and unassigned easwars and dfawley Jan 24, 2025
@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Jan 24, 2025
func (bw *balancerWrapper) ExitIdle() {
if ei, ok := bw.Balancer.(balancer.ExitIdler); ok {
go func() {
bw.es.childMu.Lock()
Member

Because we're calling into the child from a goroutine here:

  • All calls into the child need to grab bw.es.childMu to avoid concurrent calls in,
  • That means Close() needs it (and also needed it before you added the method explicitly here, so good that we caught it), and
  • That means we shouldn't embed the child Balancer, since directly calling it is wrong.
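The three points above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from the PR: `child`, `fakeChild`, `wrapper`, `exitIdleAsync`, `close`, and `run` are hypothetical names, and the mutex is shared with a parent that is not shown. The child is held in a named field rather than embedded, so its methods are not promoted onto the wrapper and every call has to go through a method that takes the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// child is a hypothetical stand-in for a child balancer. Like a real
// balancer, it must never be called concurrently.
type child interface {
	Close()
	ExitIdle()
}

// fakeChild records the calls it receives, for demonstration only.
type fakeChild struct{ calls []string }

func (c *fakeChild) Close()    { c.calls = append(c.calls, "Close") }
func (c *fakeChild) ExitIdle() { c.calls = append(c.calls, "ExitIdle") }

// wrapper keeps the child in a named (non-embedded) field, so callers
// cannot reach child.Close() or child.ExitIdle() without the lock.
type wrapper struct {
	mu       *sync.Mutex // shared with the parent (assumed, not shown)
	child    child
	isClosed bool // guarded by mu
}

// exitIdleAsync calls into the child from a goroutine, which is why
// every other call into the child must also hold mu.
func (w *wrapper) exitIdleAsync(wg *sync.WaitGroup) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		w.mu.Lock()
		defer w.mu.Unlock()
		if w.isClosed {
			return // never call into a closed child
		}
		w.child.ExitIdle()
	}()
}

// close closes the child and marks the wrapper closed under the lock.
func (w *wrapper) close() {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.child.Close()
	w.isClosed = true
}

// run exercises the wrapper and returns the calls the child observed.
func run() []string {
	fc := &fakeChild{}
	w := &wrapper{mu: &sync.Mutex{}, child: fc}
	var wg sync.WaitGroup

	w.exitIdleAsync(&wg)
	wg.Wait() // wait so the ordering below is deterministic
	w.close()
	w.exitIdleAsync(&wg) // ignored: the wrapper is already closed
	wg.Wait()
	return fc.calls
}

func main() {
	fmt.Println(run()) // prints [ExitIdle Close]
}
```

Had `child` been embedded, `w.Close()` would compile and call the child directly, bypassing `mu`; the named field turns that mistake into a compile error.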

Contributor Author

That means Close() needs it (and also needed it before you added the method explicitly here, so good that we caught it)

Before this PR, calls to balancerWrapper.Close() and updates to endpointsharding.closed were holding es.ChildMu:
In endpointSharding.Close():

func (es *endpointSharding) Close() {

In endpointSharding.UpdateClientConnState():

es.childMu.Lock()
defer es.childMu.Unlock()

// Delete old children that are no longer present.
for _, e := range children.Keys() {
	child, _ := children.Get(e)
	bal := child.(balancer.Balancer)
	if _, ok := newChildren.Get(e); !ok {
		bal.Close()
	}
}

Maybe I'm not understanding this point correctly?

That means we shouldn't embed the child Balancer, since directly calling it is wrong.

I've removed the embedded balancer.Balancer from balancerWrapper and added methods named updateClientConnStateLocked and closeLocked, which expect the caller to hold es.ChildMu.
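A rough sketch of that "Locked" naming convention follows, assuming a simplified parent that tracks children in a plain map. `parent`, `childWrapper`, `update`, and `updateLocked` are hypothetical names for illustration; only `closeLocked` and the lock name `childMu` appear in this conversation, and the real methods have different signatures.

```go
package main

import (
	"fmt"
	"sync"
)

// childWrapper is a hypothetical, simplified wrapper around one child.
type childWrapper struct {
	name     string
	isClosed bool
	updates  int
}

// updateLocked forwards an update to the child.
// The caller must hold parent.childMu.
func (c *childWrapper) updateLocked() { c.updates++ }

// closeLocked closes the child.
// The caller must hold parent.childMu.
func (c *childWrapper) closeLocked() { c.isClosed = true }

// parent owns the lock that serializes every call into the children.
type parent struct {
	childMu  sync.Mutex
	children map[string]*childWrapper
}

// update takes childMu once and then uses only the *Locked methods.
func (p *parent) update(endpoints []string) {
	p.childMu.Lock()
	defer p.childMu.Unlock()

	next := make(map[string]*childWrapper, len(endpoints))
	for _, e := range endpoints {
		c, ok := p.children[e]
		if !ok {
			c = &childWrapper{name: e}
		}
		c.updateLocked()
		next[e] = c
	}
	// Close old children that are no longer present.
	for e, c := range p.children {
		if _, ok := next[e]; !ok {
			c.closeLocked()
		}
	}
	p.children = next
}

func main() {
	p := &parent{children: map[string]*childWrapper{}}
	p.update([]string{"a", "b"})
	a, b := p.children["a"], p.children["b"]

	p.update([]string{"a", "c"})
	fmt.Println(a.updates, a.isClosed) // prints 2 false
	fmt.Println(b.updates, b.isClosed) // prints 1 true
}
```

The suffix documents the locking requirement at every call site, which is what makes the requirement explicit to readers and reviewers.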

Member

Before this PR, calls to balancerWrapper.Close() and updates to endpointsharding.closed were holding es.ChildMu:

Oh, I see... I missed where it grabbed the lock in Close and thought it was just delegating directly to the wrappers.

This now seems good and makes it explicit what is required.

@dfawley dfawley assigned arjan-bal and unassigned dfawley Jan 24, 2025
@arjan-bal arjan-bal force-pushed the endpointsharding-configure-auto-reconnect branch from 79159f2 to a38a6ce Compare January 27, 2025 09:54
@arjan-bal arjan-bal force-pushed the endpointsharding-configure-auto-reconnect branch from a38a6ce to b95ae1d Compare January 27, 2025 10:06
@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Jan 27, 2025
inhibitChildUpdates atomic.Bool

mu sync.Mutex // Sync updateState callouts and childState recent state updates
// mu synchronizes access to the stored children balancer states.
Member

Suggested change
- // mu synchronizes access to the stored children balancer states.
+ // mu synchronizes access to the stored children balancer states in children.

?

Or "in the children field" or something to be more specific.

// update.
continue
}
var bal *balancerWrapper
Member

Could you rename this local to childBalancerWrapper or something more clear -- at least child or something? bal to me makes me think this is the endpointsharding balancer itself when I see it later on. I guess we use es for that, but the more common convention is that the Balancer implementation within the package is simply named balancer, and b is the receiver.

for _, e := range children.Keys() {
child, _ := children.Get(e)
- bal := child.(balancer.Balancer)
+ bal := child.(*balancerWrapper)
Member

Similar nit about naming. This local isn't even necessary:

child, _ := children.Get(e)
if _, ok := newChildren.Get(e); !ok {
	child.(*balancerWrapper).closeLocked()
}

// The following fields are initialized at build time and read-only after
// that and therefore do not need to be guarded by a mutex.

// child contains the wrapped balancer. Access it's methods only through
Member

*its

@dfawley dfawley assigned arjan-bal and unassigned dfawley Jan 27, 2025
@arjan-bal arjan-bal merged commit b0e2ae9 into grpc:master Jan 28, 2025
15 checks passed
@arjan-bal arjan-bal deleted the endpointsharding-configure-auto-reconnect branch January 28, 2025 09:47
purnesh42H pushed a commit to purnesh42H/grpc-go that referenced this pull request Jan 30, 2025
janardhanvissa pushed a commit to janardhanvissa/grpc-go that referenced this pull request Feb 13, 2025
janardhanvissa pushed a commit to janardhanvissa/grpc-go that referenced this pull request Feb 13, 2025
janardhanvissa pushed a commit to janardhanvissa/grpc-go that referenced this pull request Feb 15, 2025
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 30, 2025