Skip to content

Fix intermittent test failure in TestLevelQueue #15777

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

zeripath
Copy link
Contributor

@zeripath zeripath commented May 7, 2021

Fix #15776

Signed-off-by: Andrew Thornton [email protected]

@zeripath zeripath added this to the 1.15.0 milestone May 7, 2021
@GiteaBot GiteaBot added the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label May 7, 2021
@silverwind
Copy link
Member

silverwind commented May 7, 2021

Did not resolve it for me, here is another failure with this PR applied:

$ while true; do go test -count=1 -run TestLevelQueue ./modules/queue; done
ok    code.gitea.io/gitea/modules/queue 0.450s
ok    code.gitea.io/gitea/modules/queue 0.455s
ok    code.gitea.io/gitea/modules/queue 0.458s
ok    code.gitea.io/gitea/modules/queue 0.467s
ok    code.gitea.io/gitea/modules/queue 0.489s
2021/05/07 20:43:52 ...les/queue/manager.go:134:Add() [T] Queue Manager registered: queue-1 (QID: 1)
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:96:Run() [D] level:  Starting
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:104:Run() [T] level:  Waiting til closed
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
2021/05/07 20:43:52 .../queue/workerpool.go:251:commonRegisterWorkers() [T] WorkerPool: 1 (for queue-1) adding 1 workers with group id: 1
2021/05/07 20:43:52 .../queue/workerpool.go:93:zeroBoost() [W] WorkerPool: 1 (for queue-1) has zero workers - adding 5 temporary workers for 5m0s
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
2021/05/07 20:43:52 .../queue/workerpool.go:420:doWork() [T] Handling: 1 data, [0xc000c7a1b0]
2021/05/07 20:43:52 .../queue/workerpool.go:420:doWork() [T] Handling: 1 data, [0xc00067c180]
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:106:Run() [T] level:  Waiting til done
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:109:Run() [T] level:  Waiting til cleaned
2021/05/07 20:43:52 .../queue/workerpool.go:309:CleanUp() [T] WorkerPool: 1 CleanUp
2021/05/07 20:43:52 .../queue/workerpool.go:321:CleanUp() [T] WorkerPool: 1 CleanUp Done
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:174:Terminate() [T] level:  Terminating
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:186:Terminate() [D] level:  Closing with 2 tasks left in queue
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:191:Terminate() [D] level:  Terminated
2021/05/07 20:43:52 ...les/queue/manager.go:134:Add() [T] Queue Manager registered: queue-2 (QID: 2)
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:96:Run() [D] level:  Starting
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:104:Run() [T] level:  Waiting til closed
2021/05/07 20:43:52 .../queue/workerpool.go:251:commonRegisterWorkers() [T] WorkerPool: 2 (for queue-2) adding 1 workers with group id: 1
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
2021/05/07 20:43:52 .../queue/workerpool.go:412:doWork() [T] Handling: 2 data, [0xc00058e378 0xc00058e5b8]
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:106:Run() [T] level:  Waiting til done
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:174:Terminate() [T] level:  Terminating
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:174:Terminate() [T] level:  Terminating
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:186:Terminate() [D] level:  Closing with 0 tasks left in queue
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:191:Terminate() [D] level:  Terminated
--- FAIL: TestLevelQueue (0.70s)
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
FAIL
FAIL  code.gitea.io/gitea/modules/queue 0.742s
FAIL
ok    code.gitea.io/gitea/modules/queue 0.438s
ok    code.gitea.io/gitea/modules/queue 0.447s
ok    code.gitea.io/gitea/modules/queue 0.438s
ok    code.gitea.io/gitea/modules/queue 0.447s
ok    code.gitea.io/gitea/modules/queue 0.496s

@zeripath
Copy link
Contributor Author

zeripath commented May 7, 2021

Can you try on the wait-on-empty PR because I would hate to spend an evening fixing this if I have already fixed and prevented it there,

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath zeripath force-pushed the fix-15776-prevent-intermittent-testlevelqueue-failure branch from 169a702 to a2d3420 Compare May 7, 2021 18:54
@zeripath
Copy link
Contributor Author

zeripath commented May 7, 2021

Ah it's a boost situation. I think there's a race with zeroWorkers and addWorkers - so you may get a boost before the initial workers have started. I will look at fixing this in my wait-on-empty branch.

@silverwind
Copy link
Member

silverwind commented May 7, 2021

Fails with similar error on that branch as well, just with different logging:

failure log on main
Add() [T] Queue Manager registered: queue-1 (QID: 1)
Run() [D] level:  Starting
Run() [T] level:  Waiting til closed
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
commonRegisterWorkers() [T] WorkerPool: 1 (for queue-1) adding 1 workers with group id: 1
eroBoost() [W] WorkerPool: 1 (for queue-1) has zero workers - adding 5 temporary workers for 5m0s
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
doWork() [T] Handling: 1 data, [0xc000c7a1b0]
doWork() [T] Handling: 1 data, [0xc00067c180]
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
Run() [T] level:  Waiting til cleaned
CleanUp() [T] WorkerPool: 1 CleanUp
CleanUp() [T] WorkerPool: 1 CleanUp Done
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Terminate() [D] level:  Closing with 2 tasks left in queue
Terminate() [D] level:  Terminated
Add() [T] Queue Manager registered: queue-2 (QID: 2)
un() [D] level:  Starting
Run() [T] level:  Waiting til closed
commonRegisterWorkers() [T] WorkerPool: 2 (for queue-2) adding 1 workers with group id: 1
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
doWork() [T] Handling: 2 data, [0xc00058e378 0xc00058e5b8]
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Terminate() [D] level:  Closing with 0 tasks left in queue
Terminate() [D] level:  Terminated
--- FAIL: TestLevelQueue (0.70s)
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
FAIL
FAIL  code.gitea.io/gitea/modules/queue 0.742s
FAIL
failure log on wait-on-empty
Add() [T] Queue Manager registered: queue-1 (QID: 1)
Run() [D] level:  Starting
Run() [T] level:  Now running
commonRegisterWorkers() [T] WorkerPool: 1 (for queue-1) adding 1 workers with group id: 1
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
readToChan() [T] level:  Waiting on Empty
readToChan() [T] level:  Waiting on Empty
doWork() [T] Handling: 2 data, [0xc00068c0d8 0xc00068c210]
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
doWork() [T] Worker shutting down
Run() [T] level:  Waiting til cleaned
CleanUp() [T] WorkerPool: 1 CleanUp
CleanUp() [T] WorkerPool: 1 CleanUp Done
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Add() [T] Queue Manager registered: queue-2 (QID: 2)
Run() [D] level:  Starting
Run() [T] level:  Now running
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
commonRegisterWorkers() [T] WorkerPool: 2 (for queue-2) adding 1 workers with group id: 1
eroBoost() [W] WorkerPool: 2 (for queue-2) has zero workers - adding 5 temporary workers for 5m0s
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
readToChan() [T] level:  Waiting on Empty
doWork() [T] Handling: 1 data, [0xc0005ac468]
doWork() [T] Handling: 1 data, [0xc000667e90]
Shutdown() [T] level:  Shutting down
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
Terminate() [T] level:  Terminating
doWork() [T] Worker shutting down
Shutdown() [T] level:  Shutting down
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Run() [T] level:  Waiting til cleaned
Terminate() [D] level:  Closing with 0 tasks left in queue
CleanUp() [T] WorkerPool: 2 CleanUp
CleanUp() [T] WorkerPool: 2 CleanUp Done
Terminate() [D] level:  Terminated
--- FAIL: TestLevelQueue (0.59s)
    queue_disk_test.go:20:
          Error Trace:  queue_disk_test.go:20
                              workerpool.go:408
                              workerpool.go:261
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
    queue_disk_test.go:20:
          Error Trace:  queue_disk_test.go:20
                              workerpool.go:408
                              workerpool.go:261
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
FAIL
FAIL  code.gitea.io/gitea/modules/queue 0.639s
FAIL

@zeripath
Copy link
Contributor Author

zeripath commented May 7, 2021

I've just pushed up a fix to wait-on-empty

@silverwind
Copy link
Member

That last change seems to have done it, 150 runs without failure 👍

@silverwind silverwind closed this May 7, 2021
@silverwind
Copy link
Member

To clear any misunderstandings, this PR does not fix it, but #15693 does, closing this one.

@zeripath zeripath deleted the fix-15776-prevent-intermittent-testlevelqueue-failure branch May 12, 2021 19:26
@go-gitea go-gitea locked and limited conversation to collaborators Oct 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 1 This PR needs approval from one additional maintainer to be merged. type/testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

queue: Flaky test TestLevelQueue
4 participants