Modify the mechanism to pause indexing #128405

Merged
merged 17 commits into from
May 30, 2025

Conversation

ankikuma
Contributor

This PR changes the mechanism to pause indexing which was introduced in #127173.

The original PR caused IndexStatsIT#testThrottleStats to fail. See #126359.

@ankikuma ankikuma added Team:Distributed Indexing Meta label for Distributed Indexing team >bug and removed v9.1.0 labels May 23, 2025
@ankikuma
Contributor Author

The problem with the original implementation of the pause lock mechanism was as follows:

  1. First note that the EngineConcurrentMergeScheduler#beforeMerge and EngineConcurrentMergeScheduler#afterMerge can be called concurrently by 2 different merge threads.

  2. Also note that ThreadPoolMergeScheduler#checkMergeTaskThrottling can be called concurrently from 2 different threads via ThreadPoolMergeScheduler#mergeTaskDone and ThreadPoolMergeScheduler#submitNewMergeTask.

  3. Next note that the above concurrency can cause Engine#IndexThrottle#activate and Engine#IndexThrottle#deactivate to be called concurrently. The IndexStatsIT#testThrottleStats test exposed that this concurrency was not handled correctly by the pause mechanism in place.

This is because the semaphore-based approach relies on acquiring and releasing a precise number of permits during activate and deactivate, which is not possible to synchronize. Let me try to explain with the following scenario:
a. Thread 1: activates throttling on the shard --> acquires all but 1 permit.
b. Multiple indexing jobs arrive, all waiting to acquire the single available permit --> only one goes through at a time (all is good so far).
c. Thread 2: deactivates throttling on the shard --> releases all but 1 permit --> all is good, but note that the jobs in step (b) will still need to acquire and release these permits.
d. Thread 3: activates throttling immediately after the deactivate in step (c), but it is not able to acquire the requested number of permits immediately because the indexing threads in step (b) have acquired permits in the meantime to work on indexing. So the activate is waiting for permits before it can switch the lock to pauseLock.
e. Thread 4: deactivates throttling immediately following step (d), finds that we still have a NOOP lock, and asserts. In any case, it cannot release the precise number of permits that deactivate would like to release, because activate hasn't even acquired them yet.
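
The permit accounting in steps (a) to (e) can be replayed on a single thread. The sketch below is a simplified stand-in, not the Elasticsearch code: the class name, the `MAX_PERMITS` value of 8, and the use of non-blocking `tryAcquire` in place of a blocking acquire are all assumptions, chosen so that the sequence is deterministic.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical stand-in for the semaphore-based pause; not the Elasticsearch code. */
final class SemaphorePauseSketch {
    static final int MAX_PERMITS = 8; // assumed value, for illustration only
    final Semaphore permits = new Semaphore(MAX_PERMITS);

    /** activate: take all but one permit; returns false if they are not all free right now. */
    boolean tryActivate() {
        return permits.tryAcquire(MAX_PERMITS - 1);
    }

    /** deactivate: hand back the permits that activate is assumed to be holding. */
    void deactivate() {
        permits.release(MAX_PERMITS - 1);
    }

    /** Replays steps (a)-(e) and returns the number of available permits at the end. */
    static int replayScenario() {
        SemaphorePauseSketch s = new SemaphorePauseSketch();
        s.tryActivate();                       // (a) 7 taken, 1 left
        s.permits.tryAcquire();                // (b) one indexing job takes the last permit
        s.deactivate();                        // (c) 7 released, 7 available, 1 held by indexing
        s.permits.tryAcquire();                // (b) a second waiting job gets in, 6 available
        boolean reactivated = s.tryActivate(); // (d) needs 7 but only 6 are free, so it fails
        s.deactivate();                        // (e) releases 7 permits that were never acquired
        return reactivated ? -1 : s.permits.availablePermits();
    }
}
```

With these assumed numbers the replay ends with more available permits than the semaphore was created with, which is the over-release described in step (e).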

@ankikuma ankikuma marked this pull request as ready for review May 27, 2025 17:48
@ankikuma ankikuma requested review from henningandersen and tlrx May 27, 2025 17:48
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Distributed Indexing Meta label for Distributed Indexing team labels May 27, 2025
@ankikuma ankikuma requested a review from bcully May 27, 2025 19:21
@ankikuma ankikuma added the :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. label May 27, 2025
@elasticsearchmachine elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team and removed needs:triage Requires assignment of a team area label labels May 27, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@ankikuma ankikuma added needs:triage Requires assignment of a team area label v9.1.0 and removed Team:Distributed Indexing Meta label for Distributed Indexing team labels May 27, 2025
@elasticsearchmachine elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team and removed needs:triage Requires assignment of a team area label labels May 27, 2025
@elasticsearchmachine
Collaborator

Hi @ankikuma, I've created a changelog YAML for you.

Contributor

@henningandersen henningandersen left a comment

Left a few comments.

I am ok to revert to this if you find it simpler. I see how it reduces the race condition surface, but there is still a race where if a thread deactivates throttling and at the same time another thread activates throttling, we risk running with no throttling at all.

I think that we could make activate/deactivate throttling synchronized on a lock object instead to avoid all of this?
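
A minimal sketch of what synchronizing activate/deactivate on a lock object could look like. The class, field, and method names are hypothetical, and the real Engine#IndexThrottle does more than flip two fields; the point is only that both pieces of state change together under one monitor.

```java
/** Hypothetical stand-in for a throttle whose state changes are serialized on one monitor. */
final class SynchronizedThrottleSketch {
    private final Object mutex = new Object(); // assumed dedicated lock object
    private boolean throttled;                 // guarded by mutex
    private String currentLock = "noop";       // guarded by mutex; stands in for the lock swap

    void activate() {
        synchronized (mutex) {
            // Runs to completion before a concurrent deactivate() can start.
            throttled = true;
            currentLock = "pause";
        }
    }

    void deactivate() {
        synchronized (mutex) {
            throttled = false;
            currentLock = "noop";
        }
    }

    boolean isThrottled() {
        synchronized (mutex) {
            return throttled;
        }
    }

    /** True when the flag and the lock selection agree with each other. */
    boolean isConsistent() {
        synchronized (mutex) {
            return throttled == currentLock.equals("pause");
        }
    }
}
```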

if (lock == pauseLockReference) {
pauseLockReference.acquire();
try {
while (pauseIndexing.getAcquire()) {
Contributor

Can this be:

Suggested change
- while (pauseIndexing.getAcquire()) {
+ while (lock == pauseLockReference) {

and thus we can avoid the extra boolean?

pauseLockReference.acquire();
try {
// System.out.println("Deactivate pause");
pauseIndexing.setRelease(false);
Contributor

To avoid the pauseIndexing boolean as suggested above, we need to set lock = noop no later than here.

I think we could simplify this to:

lock = NOOP_LOCK;
try (var ignored = pauseLockReference.acquire()) {
    pauseCondition.signalAll();
}

// System.out.println("Acquired pause indexing lock");
logger.trace("Acquired pause indexing lock");
}
return pauseLockReference;
Contributor

I think we might as well release the lock immediately and return a noop releasable?

That would allow us to use try-with-resource when acquiring the pauseLockReference too.

Contributor

I'd still like to see this carried through.

As we wake up after pause throttling, with the current mechanism, we'll only wake up one of the paused threads at a time until each of their indexing requests complete. Notice that await only allows one thread at a time to come out of await until the lock is released, since returning from await reacquires the lock.

This means that if the situation is cured, but the first few indexing requests take "a long time", we are still essentially throttling to fewer threads for a while.

I'll LGTM it for now, since it is not that harmful, but would prefer to see this sorted in this PR.
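
A hedged sketch of the "release the lock immediately and return a noop releasable" idea. Everything here is hypothetical: the class name, the `paused` flag, and the nested `Releasable` interface (a local stand-in, not the Elasticsearch type). Each waiter drops the lock as soon as it wakes, so the other waiters are not held up until its indexing request completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical pause gate; names are illustrative, not the Elasticsearch implementation. */
final class PauseGateSketch {
    /** Local stand-in for a releasable handle whose close() throws nothing. */
    interface Releasable extends AutoCloseable {
        @Override
        void close();
    }

    private final ReentrantLock pauseLock = new ReentrantLock();
    private final Condition resumed = pauseLock.newCondition();
    private boolean paused = true; // guarded by pauseLock

    /** Blocks while paused, then drops the lock before returning a no-op handle. */
    Releasable acquireThrottle() throws InterruptedException {
        pauseLock.lock();
        try {
            while (paused) {
                resumed.await(); // returning from await reacquires pauseLock
            }
        } finally {
            pauseLock.unlock(); // released here, not when the indexing request finishes
        }
        return () -> {}; // nothing left to release
    }

    void resume() {
        pauseLock.lock();
        try {
            paused = false;
            resumed.signalAll();
        } finally {
            pauseLock.unlock();
        }
    }

    /** Starts n paused workers, resumes, and reports whether all n got past the gate at once. */
    static boolean allRunConcurrently(int n) {
        PauseGateSketch gate = new PauseGateSketch();
        CountDownLatch insideTogether = new CountDownLatch(n);
        CountDownLatch allDone = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            Thread worker = new Thread(() -> {
                try (Releasable ignored = gate.acquireThrottle()) {
                    insideTogether.countDown();
                    // The simulated request finishes only once every worker is inside.
                    if (insideTogether.await(5, TimeUnit.SECONDS)) {
                        allDone.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
        gate.resume();
        try {
            return allDone.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If `acquireThrottle` instead kept holding `pauseLock` until the returned handle was closed, the first worker in this harness would block the rest and the check would time out.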

Contributor Author

I had also thought about this problem: even after throttling gets deactivated, we still only let one indexing job pass at a time. But I reasoned that we were already doing that before I moved to the pause indexing option.
But I think I now understand your solution.

@ankikuma
Contributor Author

ankikuma commented May 29, 2025

Thanks for reviewing, @henningandersen. I made some changes based on your comments.
I also added 2 APIs for pausing and unpausing throttling, which are currently unused but which I plan to call from IndexShardOperationPermits#blockOperations in a follow-up to allow indexing permits to be acquired.
I thought it might be good to think about how we would like to unpause indexing in this PR.

Contributor

@henningandersen henningandersen left a comment

LGTM.

private final Lock pauseIndexingLock = new ReentrantLock();
private final Condition pauseCondition = pauseIndexingLock.newCondition();
private final ReleasableLock pauseLockReference = new ReleasableLock(pauseIndexingLock);
private volatile AtomicBoolean pauseThrottling = new AtomicBoolean();
Contributor

Can we name this and the method differently, like suspendThrottling? pauseThrottling sounds like a variable indicating that the "pause throttling" mechanism is enabled when instead it is that it is disabled ;-)

Contributor Author

Yes.

}

public Releasable acquireThrottle() {
return lock.acquire();
if (lock == pauseLockReference) {
Contributor

We read lock twice when doing regular throttling, once for the condition here and once below. Since it is volatile, it would be good to only read it once, i.e., copy it to a local variable:

var lock = this.lock;
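
As a small illustration of the single-read pattern, here is a sketch in which plain `Object` sentinels replace the real lock types. The class and method names are hypothetical.

```java
/** Hypothetical holder showing one volatile read per decision. */
final class VolatileReadOnceSketch {
    static final Object NOOP_LOCK = new Object();
    static final Object PAUSE_LOCK = new Object();

    private volatile Object lock = NOOP_LOCK;

    void setLock(Object newLock) {
        this.lock = newLock;
    }

    /** Branches on a local snapshot so both comparisons see the same value. */
    String describe() {
        final Object lock = this.lock; // the only volatile read in this method
        if (lock == PAUSE_LOCK) {
            return "pause";
        }
        return lock == NOOP_LOCK ? "noop" : "throttle";
    }
}
```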

Contributor Author

But don't we need to read the latest value of lock inside the try block?

@ankikuma ankikuma merged commit 3e0584a into elastic:main May 30, 2025
18 checks passed
@ankikuma ankikuma linked an issue May 30, 2025 that may be closed by this pull request
Labels
>bug :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. Team:Distributed Indexing Meta label for Distributed Indexing team v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] IndexStatsIT testThrottleStats failing
3 participants