
external-resizer crashing in azuredisk-csi-driver and azurefile-csi-driver #130

@emiliodangelo

Description

From kubernetes-sigs/azurefile-csi-driver#495

What happened:
After installing azurefile-csi-driver and azuredisk-csi-driver in a Kubernetes cluster, the csi-resizer container inside the csi-azurefile-controller and csi-azuredisk-controller pods crashes every 1 or 2 minutes with the following message:

csi-resizer log:

...
I1211 12:27:26.339777       1 leaderelection.go:283] successfully renewed lease kube-system/external-resizer-file-csi-azure-com
I1211 12:27:31.349381       1 leaderelection.go:283] successfully renewed lease kube-system/external-resizer-file-csi-azure-com
runtime: mlock of signal stack failed: 12
runtime: increase the mlock limit (ulimit -l) or
runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
fatal error: mlock failed

runtime stack:
runtime.throw(0x15d27f3, 0xc)
    /usr/lib/go-1.14/src/runtime/panic.go:1112 +0x72
runtime.mlockGsignal(0xc000682a80)
    /usr/lib/go-1.14/src/runtime/os_linux_x86.go:72 +0x107
runtime.mpreinit(0xc000079180)
    /usr/lib/go-1.14/src/runtime/os_linux.go:341 +0x78
runtime.mcommoninit(0xc000079180)
    /usr/lib/go-1.14/src/runtime/proc.go:630 +0x108
runtime.allocm(0xc00004f800, 0x1672e98, 0x14f676dd7e26c)
    /usr/lib/go-1.14/src/runtime/proc.go:1390 +0x14e
runtime.newm(0x1672e98, 0xc00004f800)
    /usr/lib/go-1.14/src/runtime/proc.go:1704 +0x39
runtime.startm(0x0, 0xc000103201)
    /usr/lib/go-1.14/src/runtime/proc.go:1869 +0x12a
runtime.wakep(...)
    /usr/lib/go-1.14/src/runtime/proc.go:1953
runtime.resetspinning()
    /usr/lib/go-1.14/src/runtime/proc.go:2415 +0x93
runtime.schedule()
    /usr/lib/go-1.14/src/runtime/proc.go:2527 +0x2de
runtime.park_m(0xc000103200)
    /usr/lib/go-1.14/src/runtime/proc.go:2690 +0x9d
runtime.mcall(0x0)
    /usr/lib/go-1.14/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [select, 2 minutes]:
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1.1(0x13763e0, 0x0, 0xc0000eee40)
...
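For context, this looks like the Go 1.14 runtime's mlock workaround for a kernel signal-stack bug: on kernels its version check considers affected (anything below 5.3.15 / 5.4.2 / 5.5), the runtime mlocks every new signal stack and aborts with "mlock failed" once the locked-memory limit (commonly 64 KiB) is exhausted. The Ubuntu 5.4.0 azure kernel may already carry the backported fix, but the runtime only looks at the version string. A quick way to check both inputs to that decision on an affected node (run on the node itself or from a debug pod with host access; `<pid>` is a placeholder) might be:

    # kernel release string the Go runtime parses
    uname -r
    # max locked memory for new processes, in KiB (the default of 64 is easy to exhaust)
    ulimit -l
    # limit actually applied to a running csi-resizer process
    grep 'Max locked memory' /proc/<pid>/limits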

What you expected to happen:
The container should not fail so frequently.

How to reproduce it:
The failure started right after installing v0.7.0 of azurefile-csi-driver. I upgraded to v0.9.0 (for both azurefile and azuredisk) with the same results. The Kubernetes cluster is composed of 3 master nodes and 3 worker nodes running on Azure VMs (not AKS). A way to watch the crash loop is sketched below.
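To observe the restarts and capture the log of the previous (crashed) container instance, something like the following should work (the label selector and pod name are assumptions based on a default Helm install and may differ):

    kubectl -n kube-system get pods -l app=csi-azurefile-controller
    kubectl -n kube-system logs <csi-azurefile-controller-pod> -c csi-resizer --previous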

Anything else we need to know?:
Found a couple of issues in the golang/go repository that seem to be related.

Upgrading the Go version used to build the sidecars from 1.14 to 1.15 may solve the problem.
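Until the sidecar images are rebuilt with a newer Go (1.15 drops the mlock workaround entirely), a possible node-level mitigation is to raise the memlock limit that containers inherit. A sketch for Docker-based nodes (this assumes Docker is the container runtime on these VMs; values are illustrative and the daemon must be restarted, after which the affected pods need to be recreated):

    # /etc/docker/daemon.json -- merge into any existing configuration
    {
      "default-ulimits": {
        "memlock": { "Name": "memlock", "Hard": -1, "Soft": -1 }
      }
    }
    # apply the change
    sudo systemctl restart docker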

Environment:

  • CSI Driver version: v0.7.0 and v0.9.0
  • Kubernetes version (use kubectl version): v1.19.14
  • OS (e.g. from /etc/os-release): Ubuntu v20.04.1 LTS
  • Kernel (e.g. uname -a): 5.4.0-1032-azure #33-Ubuntu SMP Fri Nov 13 14:23:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Helm v3.4.2
  • Others:
    • Master node size: Standard D2ds_v4 (2 vcpus, 8 GiB memory)
    • Worker node size: Standard D16ds_v4 (16 vcpus, 64 GiB memory)

Complete log file: csi-resizer.log
