Closed
Bug description
/dev/fuse can't be opened in a Gitpod workspace:
openat(AT_FDCWD, "/dev/fuse", O_RDWR) = -1 EPERM (Operation not permitted)
Steps to reproduce
I suggest reproducing this in gitpod-staging, because the mknod fix for fuse is not in production yet (PR #4594).
Create a file named open_fuse.c in the workspace with the following content:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/fs.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>

int main(void) {
    const char *src_path = "/dev/fuse";
    int flags = O_RDWR;

    /* Call openat directly; the wrapper returns -1 on failure and sets errno. */
    printf("RET: %ld\n", syscall(SYS_openat, AT_FDCWD, src_path, flags));
    return 0;
}
Now compile it
gcc open_fuse.c
Now execute it. It prints RET: -1: the syscall() wrapper returns -1 and sets errno to EPERM (the kernel itself returns -EPERM, as the strace output above shows).
Expected behavior
The fuse device can be opened and used.
Example repository
No response
Anything else?
Analysis of a solution
I debugged this a bit, and the error is triggered in device_cgroup.c:
Relevant code
/**
 * __devcgroup_check_permission - checks if an inode operation is permitted
 * @dev_cgroup: the dev cgroup to be tested against
 * @type: device type
 * @major: device major number
 * @minor: device minor number
 * @access: combination of DEVCG_ACC_WRITE, DEVCG_ACC_READ and DEVCG_ACC_MKNOD
 *
 * returns 0 on success, -EPERM case the operation is not permitted
 */
int __devcgroup_check_permission(short type, u32 major, u32 minor,
				 short access)
{
	struct dev_cgroup *dev_cgroup;
	bool rc;

	rcu_read_lock();
	dev_cgroup = task_devcgroup(current);
	if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW)
		/* Can't match any of the exceptions, even partially */
		rc = !match_exception_partial(&dev_cgroup->exceptions,
					      type, major, minor, access);
	else
		/* Need to match completely one exception to be allowed */
		rc = match_exception(&dev_cgroup->exceptions, type, major,
				     minor, access);
	rcu_read_unlock();

	if (!rc)
		return -EPERM;

	return 0;
}
The reason is that the device cgroup that containerd/runc sets up does not allow access to that type of device, even when the device node is created with mknod inside the container (which is allowed), as we do.
From man 7 cgroups
devices (since Linux 2.6.26; CONFIG_CGROUP_DEVICE)
This supports controlling which processes may create
(mknod) devices as well as open them for reading or
writing. The policies may be specified as allow-lists and
deny-lists. Hierarchy is enforced, so new rules must not
violate existing rules for the target or ancestor cgroups.
Further information can be found in the kernel source file
Documentation/admin-guide/cgroup-v1/devices.rst (or
Documentation/cgroup-v1/devices.txt in Linux 5.2 and
earlier).
Here is the cgroup of the containerd-shim process:
CGROUP
12:memory:/system.slice/containerd.service,9:devices:/system.slice/containerd.service,8:cpu,cpuacct:/system.slice/containerd.service,6:blkio:/system.slice/containerd.servic