Modify the mechanism to pause indexing #128405

ankikuma · 2025-05-23T19:49:10Z

This PR changes the mechanism to pause indexing which was introduced in #127173.

The original PR caused IndexStatsIT#testThrottleStats to fail. See #126359.

ankikuma · 2025-05-23T20:21:36Z

The problem with the original implementation of the pause lock mechanism was as follows:

First, note that EngineConcurrentMergeScheduler#beforeMerge and EngineConcurrentMergeScheduler#afterMerge can be called concurrently by 2 different merge threads.
Also note that ThreadPoolMergeScheduler#checkMergeTaskThrottling can be called concurrently from 2 different threads via ThreadPoolMergeScheduler#mergeTaskDone and ThreadPoolMergeScheduler#submitNewMergeTask.
As a result, Engine#IndexThrottle#activate and Engine#IndexThrottle#deactivate can be called concurrently. The IndexStatsIT#testThrottleStats test exposed that the pause mechanism in place did not handle this concurrency correctly.

This is because the semaphore-based approach relies on acquiring and releasing a precise number of permits during activate and deactivate, which is not possible to synchronize. Let me try to explain with the following scenario:
a. Thread 1: activates throttling on the shard --> acquires all but 1 permit
b. Multiple indexing jobs arrive, all waiting to acquire the single available permit --> only one goes through at a time (all is good so far)
c. Thread 2: deactivates throttling on the shard --> releases all but 1 permit --> all is good, but note that the jobs in step (b) will still need to acquire and release these permits.
d. Thread 3: activates throttling immediately after the deactivate in step (c), but it cannot acquire the requested number of permits immediately because the indexing threads in step (b) have acquired permits in the meantime to work on indexing. So the activate is waiting for permits before it can switch the lock to pauseLock.
e. Thread 4: deactivates throttling immediately following step (d), finds that we still have a NOOP lock, and trips an assertion. In any case, it cannot release the precise number of permits that deactivate would like to release, because activate hasn't even acquired them yet.
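
The permit arithmetic behind steps (a) to (d) can be replayed on a single thread. This is only a sketch: the class name, the permit count, and the number of running jobs are made up for illustration, and it uses tryAcquire so the shortfall shows up as a boolean instead of a blocked thread. It is not the code from Engine.java.

```java
import java.util.concurrent.Semaphore;

class SemaphorePauseRace {
    /**
     * Replays steps (a)-(d) in order and reports whether the second activate
     * could take "all but one" permits without waiting.
     * totalPermits and runningJobs are hypothetical values for illustration.
     */
    static boolean secondActivateSucceeds(int totalPermits, int runningJobs) {
        Semaphore permits = new Semaphore(totalPermits);

        // (a) activate: take all but one permit
        permits.acquireUninterruptibly(totalPermits - 1);
        // (b) indexing jobs queue up behind the single remaining permit
        // (c) deactivate: give back all but one permit
        permits.release(totalPermits - 1);
        // the jobs queued in (b) now start indexing, each holding one permit
        permits.acquireUninterruptibly(runningJobs);
        // (d) activate again: "all but one" is only available if at most one job is running
        return permits.tryAcquire(totalPermits - 1);
    }

    public static void main(String[] args) {
        System.out.println("second activate got its permits: " + secondActivateSucceeds(8, 2));
    }
}
```

With 8 permits and 2 running jobs only 6 permits are free, so the second activate cannot take 7 and a real blocking acquire would wait, which is the window in which step (e) runs.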


elasticsearchmachine · 2025-05-27T19:21:41Z

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

elasticsearchmachine · 2025-05-27T22:34:50Z

Hi @ankikuma, I've created a changelog YAML for you.


henningandersen

Left a few comments.

I am ok to revert to this if you find it simpler. I see how it reduces the race condition surface, but there is still a race where if a thread deactivates throttling and at the same time another thread activates throttling, we risk running with no throttling at all.

I think that we could make activate/deactivate throttling synchronized on a lock object instead to avoid all of this?
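
A rough sketch of what synchronizing on a lock object could look like. Everything here (the class, the Mode enum, the transitions counter) is hypothetical and only meant to show that a multi-step state change stays consistent when activate and deactivate exclude each other; it is not the implementation in Engine.java.

```java
class SynchronizedThrottle {
    enum Mode { NOOP, PAUSE }

    private final Object mutex = new Object(); // hypothetical lock object
    private Mode mode = Mode.NOOP;             // guarded by mutex
    private int transitions;                   // guarded by mutex

    void activate() {
        synchronized (mutex) {
            if (mode == Mode.NOOP) {
                mode = Mode.PAUSE;
                transitions++;
            }
        }
    }

    void deactivate() {
        synchronized (mutex) {
            if (mode == Mode.PAUSE) {
                mode = Mode.NOOP;
                transitions++;
            }
        }
    }

    /** An even number of transitions must mean NOOP, an odd number PAUSE. */
    boolean isConsistent() {
        synchronized (mutex) {
            return (transitions % 2 == 0) == (mode == Mode.NOOP);
        }
    }

    /** Half of the threads activate, the other half deactivate, all at once. */
    static boolean staysConsistentUnderContention(int threads, int iterations) {
        SynchronizedThrottle throttle = new SynchronizedThrottle();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final boolean activator = i % 2 == 0;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    if (activator) {
                        throttle.activate();
                    } else {
                        throttle.deactivate();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return throttle.isConsistent();
    }
}
```

Because both fields change inside the same synchronized block, no thread can observe the mode flipped without the matching counter update, which is the kind of interleaving the semaphore approach could not rule out.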

henningandersen · 2025-05-28T12:16:03Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

+            if (lock == pauseLockReference) {
+                pauseLockReference.acquire();
+                try {
+                    while (pauseIndexing.getAcquire()) {


Can this be:

Suggested change

- while (pauseIndexing.getAcquire()) {
+ while (lock == pauseLockReference) {

and thus we can avoid the extra boolean?

henningandersen · 2025-05-28T12:19:08Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

+                pauseLockReference.acquire();
+                try {
+                    // System.out.println("Deactivate pause");
+                    pauseIndexing.setRelease(false);


To avoid the pauseIndexing boolean as suggested above, we need to set the lock = noop no later than here.

I think we could simplify this to:

lock = NOOP_LOCK;
try (var ignored = pauseLockReference.acquire()) {
    pauseCondition.signalAll();
}

henningandersen · 2025-05-28T12:21:53Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

+                    // System.out.println("Acquired pause indexing lock");
+                    logger.trace("Acquired pause indexing lock");
+                }
+                return pauseLockReference;


I think we might as well release the lock immediately and return a noop releasable?

That would allow us to use try-with-resource when acquiring the pauseLockReference too.

I'd still like to see this carried through.

As we wake up after pause throttling, with the current mechanism, we'll only wake up one of the paused threads at a time until each of their indexing requests complete. Notice that await only allows one thread at a time to come out of await until the lock is released, since returning from await reacquires the lock.

This means that if the situation is cured, but the first few indexing requests take "a long time", we are still essentially throttling to fewer threads for a while.

I'll LGTM it for now, since it is not that harmful, but would prefer to see this sorted in this PR.

I had also thought about this problem: even after throttling gets deactivated, we still only let one indexing job pass at a time. But I reasoned that we were already doing that before I moved to the pause indexing option.
But I think I understand your solution now.
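
To make the wake-up behaviour discussed above concrete, here is a minimal sketch of a pause gate that releases the lock as soon as a waiter leaves await, so every woken thread can index concurrently instead of one at a time. PauseGate, its method names, and the demo harness are assumptions for illustration and do not mirror the names in Engine.java.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class PauseGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition resumed = lock.newCondition();
    private volatile boolean paused;

    void pause() {
        paused = true;
    }

    void resume() {
        paused = false; // flip the state first so woken waiters see it
        lock.lock();
        try {
            resumed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Blocks while paused. The lock is released before returning, so callers never hold it while working. */
    void awaitIfPaused() throws InterruptedException {
        if (paused == false) {
            return;
        }
        lock.lock();
        try {
            while (paused) {
                resumed.await(); // reacquires the lock on wake-up
            }
        } finally {
            lock.unlock(); // released immediately, the next waiter can wake up
        }
    }

    /** Returns true if all workers were past the gate at the same time after resume(). */
    static boolean allWorkersRunConcurrently(int workers) {
        PauseGate gate = new PauseGate();
        CountDownLatch allInside = new CountDownLatch(workers);
        CountDownLatch finish = new CountDownLatch(1);
        gate.pause();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    gate.awaitIfPaused();
                    allInside.countDown();
                    finish.await(); // stands in for a long-running indexing request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            gate.resume(); // workers may already be parked or may arrive later; either way they pass
            boolean concurrent = allInside.await(5, TimeUnit.SECONDS);
            finish.countDown();
            for (Thread thread : threads) {
                thread.join();
            }
            return concurrent;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finish.countDown();
            return false;
        }
    }
}
```

If awaitIfPaused returned while still holding the lock, the workers in the demo would pass the gate one at a time, each only after the previous one finished its "request", and allInside would never reach zero.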


ankikuma · 2025-05-29T01:39:21Z

Thanks for reviewing, @henningandersen. I made some changes based on your comments.
I also added 2 APIs for pausing and unpausing throttling. They are currently unused, but I plan to call them from IndexShardOperationPermits#blockOperations in a followup to allow indexing permits to be acquired.
I thought it might be good to think about how we would like to unpause indexing in this PR.


henningandersen

LGTM.

henningandersen · 2025-05-30T06:50:46Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

+        private final Lock pauseIndexingLock = new ReentrantLock();
+        private final Condition pauseCondition = pauseIndexingLock.newCondition();
+        private final ReleasableLock pauseLockReference = new ReleasableLock(pauseIndexingLock);
+        private volatile AtomicBoolean pauseThrottling = new AtomicBoolean();


Can we name this and the method differently, like suspendThrottling? pauseThrottling sounds like a variable indicating that the "pause throttling" mechanism is enabled, when it actually indicates that it is disabled ;-)

henningandersen · 2025-05-30T06:57:25Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

+                    // System.out.println("Acquired pause indexing lock");
+                    logger.trace("Acquired pause indexing lock");
+                }
+                return pauseLockReference;


I'd still like to see this carried through.

As we wake up after pause throttling, with the current mechanism, we'll only wake up one of the paused threads at a time until each of their indexing requests complete. Notice that await only allows one thread at a time to come out of await until the lock is released, since returning from await reacquires the lock.

This means that if the situation is cured, but the first few indexing requests take "a long time", we are still essentially throttling to fewer threads for a while.

I'll LGTM it for now, since it is not that harmful, but would prefer to see this sorted in this PR.

henningandersen · 2025-05-30T06:58:46Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

        }

        public Releasable acquireThrottle() {
-            return lock.acquire();
+            if (lock == pauseLockReference) {


We read lock twice when doing regular throttling, once for the condition here and once below. Since it is volatile, it would be good to only read it once, i.e., copy it to a local variable:

var lock = this.lock;

But don't we need to read the latest value of lock inside the try block?
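
A small sketch of the single-read pattern, with hypothetical names (SingleReadThrottle, throttleLock, usePauseLock). Copying the volatile field to a local means the check, the lock call, and the later unlock all refer to the same object even if another thread swaps the field in between. A wait loop that needs to notice the swap would presumably still re-read the field in its condition; that part is an assumption and is not shown here.

```java
import java.util.concurrent.locks.ReentrantLock;

class SingleReadThrottle {
    final ReentrantLock throttleLock = new ReentrantLock();
    final ReentrantLock pauseLock = new ReentrantLock();

    // Swapped by activate/deactivate style calls on other threads.
    private volatile ReentrantLock lock = throttleLock;

    void usePauseLock(boolean pause) {
        lock = pause ? pauseLock : throttleLock;
    }

    /** Locks whichever lock is current and returns it, so the caller unlocks the same object. */
    ReentrantLock acquire() {
        final ReentrantLock current = this.lock; // the only volatile read
        current.lock();
        return current;
    }
}
```

A caller would hold on to the returned lock and unlock that exact object, so a concurrent call to usePauseLock cannot make it release a lock it never took.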


ankikuma added 3 commits May 23, 2025 15:23

Change indexing pause throttling implementation

60ed208

refresh branch

e1728df

Fix check in afterMerge()

6d4202e

elasticsearchmachine added the v9.1.0 label May 23, 2025

ankikuma added Team:Distributed Indexing Meta label for Distributed Indexing team >bug and removed v9.1.0 labels May 23, 2025

Merge remote-tracking branch 'upstream/main' into 05232025/ModifyPaus…

00dfd32

…eLock Refresh branch

ankikuma marked this pull request as ready for review May 27, 2025 17:48

ankikuma requested review from henningandersen and tlrx May 27, 2025 17:48

elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Distributed Indexing Meta label for Distributed Indexing team labels May 27, 2025

ankikuma requested a review from bcully May 27, 2025 19:21

ankikuma added the :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. label May 27, 2025

elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team and removed needs:triage Requires assignment of a team area label labels May 27, 2025

ankikuma added needs:triage Requires assignment of a team area label v9.1.0 and removed Team:Distributed Indexing Meta label for Distributed Indexing team labels May 27, 2025

elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team and removed needs:triage Requires assignment of a team area label labels May 27, 2025

Update docs/changelog/128405.yaml

aa4d58a

ankikuma added 3 commits May 27, 2025 18:37

Add test

ff014b8

Merge branch '05232025/ModifyPauseLock' of github.com:ankikuma/elasti…

a2c8eb2

…csearch into 05232025/ModifyPauseLock Pull

remove comments

50129f3

henningandersen reviewed May 28, 2025


ankikuma added 3 commits May 28, 2025 20:58

review comments

0fabe84

Merge remote-tracking branch 'upstream/main' into 05232025/ModifyPaus…

93723b6

…eLock Refresh branch

remove unused code

72b3b75

Merge remote-tracking branch 'upstream/main' into 05232025/ModifyPaus…

8d2c216

…eLock Refresh branch

henningandersen approved these changes May 30, 2025


ankikuma and others added 5 commits May 30, 2025 09:24

review comments

a88680b

Merge remote-tracking branch 'upstream/main' into 05232025/ModifyPaus…

ab6644e

…eLock Refresh branch

code cleanup

4755af4

Merge remote-tracking branch 'upstream/main' into 05232025/ModifyPaus…

941c181

…eLock Refresh branch

[CI] Auto commit changes from spotless

f3d91a9

ankikuma merged commit 3e0584a into elastic:main May 30, 2025
18 checks passed

ankikuma linked an issue May 30, 2025 that may be closed by this pull request

[CI] IndexStatsIT testThrottleStats failing #126359

Open
