Description
From kubernetes-sigs/azurefile-csi-driver#495
What happened:
After installing azurefile-csi-driver and azuredisk-csi-driver in a Kubernetes cluster, the csi-resizer container inside the csi-azurefile-controller and csi-azuredisk-controller pods crashes every 1 or 2 minutes with the following message:
csi-resizer log:
...
I1211 12:27:26.339777 1 leaderelection.go:283] successfully renewed lease kube-system/external-resizer-file-csi-azure-com
I1211 12:27:31.349381 1 leaderelection.go:283] successfully renewed lease kube-system/external-resizer-file-csi-azure-com
runtime: mlock of signal stack failed: 12
runtime: increase the mlock limit (ulimit -l) or
runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
fatal error: mlock failed
runtime stack:
runtime.throw(0x15d27f3, 0xc)
/usr/lib/go-1.14/src/runtime/panic.go:1112 +0x72
runtime.mlockGsignal(0xc000682a80)
/usr/lib/go-1.14/src/runtime/os_linux_x86.go:72 +0x107
runtime.mpreinit(0xc000079180)
/usr/lib/go-1.14/src/runtime/os_linux.go:341 +0x78
runtime.mcommoninit(0xc000079180)
/usr/lib/go-1.14/src/runtime/proc.go:630 +0x108
runtime.allocm(0xc00004f800, 0x1672e98, 0x14f676dd7e26c)
/usr/lib/go-1.14/src/runtime/proc.go:1390 +0x14e
runtime.newm(0x1672e98, 0xc00004f800)
/usr/lib/go-1.14/src/runtime/proc.go:1704 +0x39
runtime.startm(0x0, 0xc000103201)
/usr/lib/go-1.14/src/runtime/proc.go:1869 +0x12a
runtime.wakep(...)
/usr/lib/go-1.14/src/runtime/proc.go:1953
runtime.resetspinning()
/usr/lib/go-1.14/src/runtime/proc.go:2415 +0x93
runtime.schedule()
/usr/lib/go-1.14/src/runtime/proc.go:2527 +0x2de
runtime.park_m(0xc000103200)
/usr/lib/go-1.14/src/runtime/proc.go:2690 +0x9d
runtime.mcall(0x0)
/usr/lib/go-1.14/src/runtime/asm_amd64.s:318 +0x5b
goroutine 1 [select, 2 minutes]:
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1.1(0x13763e0, 0x0, 0xc0000eee40)
...
What you expected to happen:
The container should not fail so frequently.
How to reproduce it:
The failure started right after installing v0.7.0 of azurefile-csi-driver. I upgraded to v0.9.0 (for both azurefile and azuredisk) with the same results. The Kubernetes cluster consists of 3 master nodes and 3 workers running on Azure VMs (not AKS).
Anything else we need to know?:
Found a couple of issues in the golang/go repository that seem to be related:
- runtime: mlock of signal stack failed: 12 [1.14 backport] (golang/go#37807)
- runtime: mlock of signal stack failed: 12 (golang/go#37436)
Upgrading the Go version from 1.14 to 1.15 may solve the problem.
Environment:
- CSI Driver version: v0.7.0 and v0.9.0
- Kubernetes version (use `kubectl version`): v1.19.14
- OS (e.g. from /etc/os-release): Ubuntu v20.04.1 LTS
- Kernel (e.g. `uname -a`): 5.4.0-1032-azure #33-Ubuntu SMP Fri Nov 13 14:23:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
- Install tools: Helm v3.4.2
- Others:
- Master node size: Standard D2ds_v4 (2 vcpus, 8 GiB memory)
- Worker node size: Standard D16ds_v4 (16 vcpus, 64 GiB memory)
Complete log file: csi-resizer.log
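
For reference, the runtime message asks for kernel 5.3.15+, 5.4.2+, or 5.5+, while the nodes report 5.4.0-1032-azure. The following is a minimal Go sketch that compares a `uname -r` style release string against only those three listed versions. The function names `parseKernelRelease` and `meetsListedKernel` are hypothetical, and this is not the check the Go runtime itself performs. It only looks at the numeric prefix of the release string, so it cannot tell whether a distribution kernel carries backported fixes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseKernelRelease extracts major, minor, and patch from a `uname -r` style
// string such as "5.4.0-1032-azure". A missing patch number is treated as 0.
// (Hypothetical helper for illustration only.)
func parseKernelRelease(release string) (major, minor, patch int, err error) {
	// Drop any distribution suffix ("-1032-azure") before splitting on dots.
	base := release
	if i := strings.IndexAny(base, "-+ "); i >= 0 {
		base = base[:i]
	}
	parts := strings.Split(base, ".")
	if len(parts) < 2 {
		return 0, 0, 0, fmt.Errorf("unrecognized kernel release %q", release)
	}
	nums := make([]int, 3)
	for i := 0; i < len(parts) && i < 3; i++ {
		n, convErr := strconv.Atoi(parts[i])
		if convErr != nil {
			return 0, 0, 0, fmt.Errorf("unrecognized kernel release %q: %v", release, convErr)
		}
		nums[i] = n
	}
	return nums[0], nums[1], nums[2], nil
}

// meetsListedKernel reports whether the release is at or above one of the
// versions named in the runtime message: 5.3.15+, 5.4.2+, or 5.5+.
// Any other release returns false, because the message does not mention it.
func meetsListedKernel(release string) (bool, error) {
	major, minor, patch, err := parseKernelRelease(release)
	if err != nil {
		return false, err
	}
	switch {
	case major > 5, major == 5 && minor >= 5:
		return true, nil
	case major == 5 && minor == 4:
		return patch >= 2, nil
	case major == 5 && minor == 3:
		return patch >= 15, nil
	default:
		return false, nil
	}
}

func main() {
	ok, err := meetsListedKernel("5.4.0-1032-azure")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("meets listed kernel versions:", ok)
}
```

For the kernel string from this cluster the sketch reports `false`, because the numeric prefix 5.4.0 is below 5.4.2.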