Device cgroup for /dev/fuse #4652

Closed
@fntlnz

Bug description

/dev/fuse cannot be opened in a Gitpod workspace. The openat call fails with EPERM:

openat(AT_FDCWD, "/dev/fuse", O_RDWR)   = -1 EPERM (Operation not permitted)

Steps to reproduce

I suggest reproducing this in gitpod-staging, because the mknod fix for fuse (PR #4594) is not in production yet.

Create a file named open_fuse.c in the workspace with the following content:

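#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
  const char *src_path = "/dev/fuse";
  int flags = O_RDWR;

  /* Call openat(2) directly; syscall() returns -1 and sets errno on failure. */
  long ret = syscall(SYS_openat, AT_FDCWD, src_path, flags);
  if (ret < 0) {
    printf("RET: %ld (errno %d: %s)\n", ret, errno, strerror(errno));
    return 1;
  }

  printf("RET: %ld\n", ret);
  close((int)ret);
  return 0;
}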

Now compile it:

gcc open_fuse.c

Now execute the resulting binary (./a.out). The call returns -1, with errno set to EPERM.

Expected behavior

The fuse device can be opened and used.

Example repository

No response

Anything else?

Analysis of a solution

I debugged this a bit, and the EPERM is returned by __devcgroup_check_permission in the kernel's security/device_cgroup.c:

Relevant code

/**
 * __devcgroup_check_permission - checks if an inode operation is permitted
 * @dev_cgroup: the dev cgroup to be tested against
 * @type: device type
 * @major: device major number
 * @minor: device minor number
 * @access: combination of DEVCG_ACC_WRITE, DEVCG_ACC_READ and DEVCG_ACC_MKNOD
 *
 * returns 0 on success, -EPERM case the operation is not permitted
 */
int __devcgroup_check_permission(short type, u32 major, u32 minor,
				 short access)
{
	struct dev_cgroup *dev_cgroup;
	bool rc;

	rcu_read_lock();
	dev_cgroup = task_devcgroup(current);
	if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW)
		/* Can't match any of the exceptions, even partially */
		rc = !match_exception_partial(&dev_cgroup->exceptions,
					      type, major, minor, access);
	else
		/* Need to match completely one exception to be allowed */
		rc = match_exception(&dev_cgroup->exceptions, type, major,
				     minor, access);
	rcu_read_unlock();

	if (!rc)
		return -EPERM;

	return 0;
}

The reason is that the device cgroup set up by containerd/runc does not allow access to this device, even when the device node is created with mknod inside the container (which is allowed), as we do.

From man 7 cgroups:

       devices (since Linux 2.6.26; CONFIG_CGROUP_DEVICE)
              This supports controlling which processes may create
              (mknod) devices as well as open them for reading or
              writing.  The policies may be specified as allow-lists and
              deny-lists.  Hierarchy is enforced, so new rules must not
              violate existing rules for the target or ancestor cgroups.

              Further information can be found in the kernel source file
              Documentation/admin-guide/cgroup-v1/devices.rst (or
              Documentation/cgroup-v1/devices.txt in Linux 5.2 and
              earlier).

Here is the cgroup of the containerd-shim process:

CGROUP
12:memory:/system.slice/containerd.service,9:devices:/system.slice/containerd.service,8:cpu,cpuacct:/system.slice/containerd.service,6:blkio:/system.slice/containerd.servic

Labels

type: bug (Something isn't working)
