
Race in shutting down event broadcasters (manifests as test flake) #1171

@DirectXMan12

Description

Full description of the issue is at kubernetes/kubernetes#94906.

The TL;DR is that it's inherently unsafe (racy) to stop an event broadcaster, because of the way it performs non-blocking event emission. We may have to fork that code entirely to fix this in a timely manner.
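
To make the failure mode concrete, here is a minimal sketch of the pattern (a simplified model, not the actual client-go code): one goroutine performs a non-blocking send on the broadcaster's incoming channel while Shutdown closes it from another goroutine. Running it with `go run -race` reports the same closechan/chansend pair as the trace below.

```go
// Hypothetical model of the racy broadcaster pattern; names are
// illustrative, not client-go's.
package main

import "time"

type broadcaster struct {
	incoming chan string
}

// emit models the non-blocking event emission: send if there is room,
// otherwise drop the event rather than block.
func (b *broadcaster) emit(ev string) {
	select {
	case b.incoming <- ev: // races with close() in shutdown
	default:
	}
}

// shutdown models Shutdown: it closes incoming without any
// coordination with in-flight emit calls.
func (b *broadcaster) shutdown() {
	close(b.incoming)
}

func main() {
	b := &broadcaster{incoming: make(chan string, 1)}
	go func() {
		for {
			b.emit("LeaderElection")
		}
	}()
	time.Sleep(time.Millisecond)
	b.shutdown() // concurrent close: the race detector flags this
}
```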

Symptoms look like the following race detector output / stack traces:
==================
WARNING: DATA RACE
Write at 0x00c0001762b0 by goroutine 171:
  runtime.closechan()
      /usr/local/go/src/runtime/chan.go:334 +0x0
  k8s.io/client-go/tools/record.(*eventBroadcasterImpl).Shutdown()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/pkg/watch/mux.go:199 +0x68
  sigs.k8s.io/controller-runtime/pkg/internal/recorder.(*Provider).Stop.func1()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go:73 +0xa0
Previous read at 0x00c0001762b0 by goroutine 168:
  runtime.chansend()
      /usr/local/go/src/runtime/chan.go:142 +0x0
  k8s.io/client-go/tools/record.(*recorderImpl).generateEvent.func1()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/pkg/watch/mux.go:189 +0x11b
Goroutine 171 (running) created at:
  sigs.k8s.io/controller-runtime/pkg/internal/recorder.(*Provider).Stop()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go:66 +0x8c
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:563 +0x38d
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start.func1()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:468 +0x49
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:518 +0x339
  sigs.k8s.io/controller-runtime/pkg/manager.glob..func3.3.8.3.6()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/manager/manager_test.go:273 +0x22b
Goroutine 168 (finished) created at:
  k8s.io/client-go/tools/record.(*recorderImpl).generateEvent()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:341 +0x42a
  k8s.io/client-go/tools/record.(*recorderImpl).Event()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:349 +0xcf
  k8s.io/client-go/tools/record.(*recorderImpl).Eventf()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:353 +0xd7
  sigs.k8s.io/controller-runtime/pkg/internal/recorder.(*lazyRecorder).Eventf()
      /home/prow/go/src/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go:151 +0xf3
  k8s.io/client-go/tools/leaderelection/resourcelock.(*LeaseLock).RecordEvent()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/resourcelock/leaselock.go:85 +0x30b
  k8s.io/client-go/tools/leaderelection/resourcelock.(*MultiLock).RecordEvent()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/resourcelock/multilock.go:88 +0xa5
  k8s.io/client-go/tools/leaderelection.(*LeaderElector).acquire.func1()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:251 +0x213
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x6f
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xb3
  k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x10d
  k8s.io/client-go/tools/leaderelection.(*LeaderElector).acquire()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:244 +0x2b2
  k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run()
      /home/prow/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:203 +0xee
==================
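
If we do end up forking that code, one possible shape for the fix (an assumption on my part, not a decided design) is to serialize emission and shutdown behind a mutex so close() can never run concurrently with a send:

```go
// Sketch of a race-free variant; safeBroadcaster and its methods are
// hypothetical names, not an adopted solution.
package main

import "sync"

type safeBroadcaster struct {
	mu       sync.Mutex
	stopped  bool
	incoming chan string
}

// emit stays non-blocking, but now checks the stopped flag under the
// same lock that shutdown holds while closing the channel.
func (b *safeBroadcaster) emit(ev string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.stopped {
		return // shutdown already ran; drop the event
	}
	select {
	case b.incoming <- ev:
	default: // channel full: still drop rather than block
	}
}

// shutdown closes incoming exactly once, never concurrently with a send.
func (b *safeBroadcaster) shutdown() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if !b.stopped {
		b.stopped = true
		close(b.incoming)
	}
}

func main() {
	b := &safeBroadcaster{incoming: make(chan string, 1)}
	done := make(chan struct{})
	go func() {
		for i := 0; i < 1000; i++ {
			b.emit("LeaderElection")
		}
		close(done)
	}()
	b.shutdown()
	<-done // passes `go run -race` cleanly
}
```

The trade-off is that every emission now takes a lock, which the current lock-free design deliberately avoids; that is part of why the fix may require forking rather than a small patch.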

Labels: kind/bug, kind/flake, lifecycle/rotten, priority/critical-urgent