Running Podman in Rootless Mode

Publication Date: 05 Dec 2024

WHAT?

Everything you need to know about using rootless containers with Podman.

WHY?

Using Podman in rootless mode makes managing containers more efficient and secure.

EFFORT

Approximately 20 minutes of reading time.

GOAL

Understand how to configure, use and troubleshoot rootless containers with Podman.

REQUIREMENTS

A SUSE Linux Enterprise Server system with Podman installed.
Working knowledge of Podman.


1 Rootless Podman on SUSE Linux Enterprise

Podman is the default container management and orchestration tool on SUSE Linux Enterprise. In addition to providing a drop-in replacement for Docker Open Source Engine, Podman offers several advantages, including the ability to run containers in rootless mode. This allows regular users to deploy containers without elevated privileges. In other words, rootless mode means that you can deploy a container without becoming root or using sudo.

By default, Podman launches containers as the current regular user. Support for rootless containers is enabled for all newly created users in SLE by default, and no additional steps are necessary.

Note: Data storage for rootless containers

When working with container images and containers as a regular user, all relevant data is stored in $HOME/.local/share/containers/storage instead of /var/lib/containers. Make sure that the home directory has enough storage space to accommodate the data.
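
To check the available space before pulling large images, a small helper along the following lines may be useful. This is only a sketch: the function name rootless_storage_dir is made up for this example, and it assumes that a set XDG_DATA_HOME variable replaces $HOME/.local/share in the path.

```shell
# Print the default storage path for rootless containers.
# Assumption: XDG_DATA_HOME, when set and non-empty, replaces $HOME/.local/share.
rootless_storage_dir() {
  printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/containers/storage"
}

# Report free space on the file system that holds (or will hold) the storage.
# Fall back to the home directory if the storage directory does not exist yet.
target=$(rootless_storage_dir)
[ -d "$target" ] || target=${HOME:-/}
df -h "$target" 2>/dev/null || echo "cannot determine free space for $target"
```

If the home directory is on a small partition, compare the value in the Avail column with the size of the images you plan to pull.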

2 When to use rootless containers

Improved security is the key advantage of using rootless containers. Similar to regular users, rootless containers cannot access and manipulate resources that require root privileges. This safeguards the host system from malicious processes running within rootless containers.

Because running containers in rootless mode provides better security and normally does not require any additional configuration, use it as the default method of deploying containers in most situations.

A typical scenario for rootless containers would be running a development environment based on a Language Stack SLE Base Container Image. As long as the development environment does not require write access to the system files on the host, you can run it as a rootless container.

Rootless containers also make it possible for multiple regular users to run containers on the same machine. This can be particularly useful for deploying containers in high-performance computing environments.

3 When not to use rootless containers

Although Podman makes it easy to run containers in rootless mode, there are scenarios, such as those listed below, where that is not an option.

You need the container to have write access to parts of the host's file system that only root can modify. For example, you need to launch a container using a database dump stored in /var/lib/postgresql, and the container must be able to write data to that directory.
You need to bind the container to a port lower than 1024, without reconfiguring sysctl.

Note: Rootless containers and slirp4netns

Because unprivileged users cannot configure network namespaces on Linux, Podman relies on a user space network implementation called slirp4netns. It emulates the full TCP/IP stack, which can severely degrade performance for workloads that rely on high network transfer rates. Expect slower network transfers in rootless containers than in containers started by root.

If slirp4netns is not on your system, you can install it using the zypper install slirp4netns command.

4 Configuring rootless containers

4.1 Giving Podman access to ports below 1024

On Linux, unprivileged users cannot open ports below port number 1024. This limitation also applies to Podman, so by default, rootless containers cannot expose ports below port number 1024. You can remove this limitation temporarily using the following command:

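> sudo sysctl net.ipv4.ip_unprivileged_port_start=0

A value set this way is lost on reboot. To remove the limitation permanently, add the line net.ipv4.ip_unprivileged_port_start=0 to a file in the /etc/sysctl.d/ directory (for example, /etc/sysctl.d/99-unprivileged-ports.conf; the file name is only an illustration), then run sudo sysctl --system to apply it.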

Note that this allows all unprivileged applications to bind to ports below 1024.
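
Before changing the setting, it can help to check whether a given port is actually affected. The following sketch compares a port number with the current threshold. The function name can_bind_unprivileged is made up for this example, and the fallback value of 1024 is an assumption used when the kernel setting cannot be read.

```shell
# Return success if an unprivileged process may bind to the given port.
# $1: port number
# $2: optional threshold; defaults to the kernel's current setting,
#     or to 1024 (assumed) if the setting cannot be read.
can_bind_unprivileged() {
  port=$1
  start=${2:-$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024)}
  [ "$port" -ge "$start" ]
}

if can_bind_unprivileged 80; then
  echo "rootless containers can publish port 80"
else
  echo "port 80 is still privileged"
fi
```

An alternative that avoids changing the system-wide setting is to publish a port at or above the threshold, for example 8080, and to forward traffic to it from the privileged port.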

4.2 Using cgroups v2

When using rootless containers with Podman, it is recommended to use cgroups v2. cgroups v1 have limited functionality compared to v2. For example, cgroups v1 do not allow proper hierarchical delegation to the user's subtrees. Additionally, Podman is unable to read container logs properly with cgroups v1 and the systemd log driver.

To find out which cgroups version is default on your system, use the following command:

> mount|grep ^cgroup|awk '{print $1}'|uniq
cgroup2
cgroup

The first entry in the output indicates the default cgroups version.
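
Another common way to check the cgroups mode is to look at the file system type of /sys/fs/cgroup. The sketch below assumes GNU stat and the usual convention that cgroup2fs indicates the unified (v2) hierarchy, while tmpfs indicates a v1 or hybrid setup. The function name cgroup_mode is made up for this example.

```shell
# Translate the file system type of /sys/fs/cgroup into a cgroups mode.
# Assumption: cgroup2fs means unified (v2), tmpfs means v1 or hybrid.
cgroup_mode() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1 or hybrid" ;;
    *)         echo "unknown" ;;
  esac
}

# stat -fc %T prints the file system type (GNU coreutils).
if [ -d /sys/fs/cgroup ]; then
  cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"
fi
```

Podman itself should report the same information. To the best of our knowledge, podman info --format '{{.Host.CgroupsVersion}}' prints the version it detected, but verify the field name against your Podman version.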

If you are using a version of SUSE Linux Enterprise Server with cgroups v1, you can enable cgroups v2 by adding the systemd.unified_cgroup_hierarchy=1 parameter to the kernel command line. Then regenerate the GRUB configuration using the grub2-mkconfig -o /boot/grub2/grub.cfg command and reboot the system.

Even when SUSE Linux Enterprise Server uses cgroups v2, the default configuration delegates no controllers to user sessions for performance reasons. Enable the controllers you need explicitly, as described in https://documentation.suse.com/sles/single-html/SLES-tuning/#sec-cgroups-user-sessions.
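
To see which controllers are currently delegated to your user session, you can compare the delegated list with the controllers you need. The following is a sketch under two assumptions: systemd exposes the delegated controllers in the cgroup.controllers file of the user manager's cgroup (the path below), and the wanted list "cpu cpuset io memory pids" is only an illustration. The function name missing_controllers is made up for this example.

```shell
# Print each controller from the wanted list ($1) that is absent from the
# delegated list ($2). Both arguments are space-separated controller names.
missing_controllers() {
  for c in $1; do
    case " $2 " in
      *" $c "*) ;;                   # controller is delegated
      *) printf '%s\n' "$c" ;;       # controller is missing
    esac
  done
}

# Assumed location of the controllers delegated to the user's systemd instance.
f="/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers"
if [ -r "$f" ]; then
  missing_controllers "cpu cpuset io memory pids" "$(cat "$f")"
fi
```

No output means that all controllers in the wanted list are delegated.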

4.3 Enabling read access to the SUSE Customer Center credentials

Running a container with Podman in rootless mode on SUSE Linux Enterprise Server may fail, because the container needs read access to the SUSE Customer Center credentials. For example, running a container with the command podman run -it --rm registry.suse.com/suse/sle15 bash and then executing zypper ref results in the following error message:

Refreshing service 'container-suseconnect-zypp'.
  Problem retrieving the repository index file for service 'container-suseconnect-zypp':
  [container-suseconnect-zypp|file:/usr/lib/zypp/plugins/services/container-suseconnect-zypp]
  Warning: Skipping service 'container-suseconnect-zypp' because of the above error.
  Warning: There are no enabled repositories defined.
  Use 'zypper addrepo' or 'zypper modifyrepo' commands to add or enable repositories

To solve the problem, grant the current user the required access rights by running the following command on the host:

> sudo setfacl -m u:$(id -nu):r /etc/zypp/credentials.d/*

Log out and log in again to apply the changes.

To give multiple users the required access, create a dedicated group using the groupadd GROUPNAME command. Then use the following commands to change the group ownership and permissions of the files in the /etc/zypp/credentials.d/ directory:

> sudo chgrp GROUPNAME /etc/zypp/credentials.d/*
> sudo chmod g+r /etc/zypp/credentials.d/*

You can then grant a specific user read access by adding them to the created group.
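
To verify the result, you can list the credential files that the current user still cannot read. This is a minimal sketch, and the function name unreadable_files is made up for this example.

```shell
# Print every file in the given directory that the current user cannot read.
unreadable_files() {
  dir=$1
  for f in "$dir"/*; do
    [ -e "$f" ] || continue          # skip the unexpanded pattern of an empty directory
    [ -r "$f" ] || printf '%s\n' "$f"
  done
}

unreadable_files /etc/zypp/credentials.d
```

Run the check as the regular user who launches the containers. No output means that every credential file is readable (or that the directory contains no files).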

5 Understanding user mapping

5.1 User mapping and rootless containers

By definition, rootless containers are run by a regular user. At the same time, certain applications deployed using containers expect to be run as root. This leads to a problem: how do you run a container as root, when you are not root on the host system? To solve the issue, Podman relies on user namespaces to map user IDs in the container to different user IDs on the host. By default, Podman maps the root user inside the container with the user ID (UID) 0 to the UID of the current user on the host system.

To illustrate how this is done in practice, create a temporary SLE BCI-Base container (the sleep value determines for how long the container runs):

> podman run -d --rm registry.suse.com/bci/bci-base sleep 600

Then run the podman top command as follows:

> podman top -l user huser
USER  HUSER
root  1000

The output indicates that the root user in the container is mapped to the user with UID 1000 on the host, so a root process inside the container is treated by the kernel as UID 1000 outside the container. This means that even though the process is running as root inside the container, it does not have root privileges outside of the container. Moreover, if a file is owned by a UID that is not mapped into the user namespace, the file is treated as owned by “nobody” (UID 65534). The container process can then access the file only with the permissions granted to others, that is, only if the file is world readable or writable.
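
The mapping can also be worked through by hand. The kernel describes it in /proc/self/uid_map as lines of the form "first container UID, first host UID, range length". The sketch below translates a container UID to a host UID from such lines. The function name container_uid_to_host is made up for this example, and the sample map assumes a host user with UID 1000 and a sub-UID range of 65536 IDs starting at 100000.

```shell
# Translate a container UID ($1) to the host UID, using uid_map-style lines
# ("<first container UID> <first host UID> <range length>") read from stdin.
container_uid_to_host() {
  awk -v uid="$1" '
    uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
    END { if (!found) print "unmapped" }
  '
}

# Sample map (assumed values): container root maps to host UID 1000,
# container UIDs 1 to 65536 map to host UIDs 100000 to 165535.
sample_map='0 1000 1
1 100000 65536'

printf '%s\n' "$sample_map" | container_uid_to_host 0    # prints 1000
printf '%s\n' "$sample_map" | container_uid_to_host 33   # prints 100032
```

To inspect the map that applies to your own user, run podman unshare cat /proc/self/uid_map and pipe its output into the function instead of the sample.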

There are also situations where a process or an application inside a container must run as the current host user. The --userns=keep-id option solves this problem by instructing Podman to map the host user into the container under the same UID. To see how this works, create a temporary SLE BCI-Base container, but this time with --userns=keep-id:

> podman run -d --rm --userns=keep-id registry.suse.com/bci/bci-base sleep 600

Run the podman top command again:

> podman top -l user huser

USER  HUSER
tux   tux

The output now shows that the regular user on the host system is mapped into the container.

There is another way to see how this works in practice. First, run the following command:

> podman run --rm -it -v ~/Downloads/:/downloads:Z registry.suse.com/bci/bci-base:15.5 ls -al /downloads

The command creates a container from the SLE BCI-Base container image, mounts the Downloads directory from the host system as /downloads inside the container, and then runs the ls -al command on that directory. The output of the command looks something like this:

-rw-r--r-- 1 root root  4417088 Aug 25 09:21 document.pdf

This shows that the file is owned by root. Now, run the same command, but this time with the --userns=keep-id option:

> podman run --userns=keep-id --rm -it -v ~/Downloads/:/downloads:Z registry.suse.com/bci/bci-base:15.5 ls -al /downloads

This time, the output looks slightly different, indicating that the file is owned by the host user:

-rw-r--r-- 1 tux users 4417088 Aug 25 09:21 document.pdf

6 Troubleshooting

6.1 Rootless mode fails

If Podman fails to launch containers in rootless mode, check whether an entry for the current user is present in /etc/subuid on the host system:

> grep $(id -nu) /etc/subuid
  user:100000:65536

If no entry is found (check /etc/subgid in the same way), add the required sub-UID and sub-GID entries using the following command:

> sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 $(id -nu)

To enable the change, reboot the machine or stop the session of the current user. To do the latter, run loginctl list-sessions | grep USER and note the session ID. Then run loginctl kill-session SESSION_ID to stop the session.

The usermod command above defines a range of local UIDs to which the UIDs allocated to users inside the container are mapped on the host. Note that the ranges defined for different users must not overlap. It is also important that the ranges do not reuse the UID of an existing local user or group. By default, adding a user with the useradd command on SUSE Linux Enterprise Server automatically allocates sub-UID and sub-GID ranges.
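
When you assign ranges manually, it is easy to overlook an overlap. The following sketch reads entries in the NAME:START:COUNT format used by /etc/subuid and /etc/subgid and reports overlapping ranges. The function name find_subid_overlaps is made up for this example.

```shell
# Read NAME:START:COUNT lines from stdin and report overlapping ranges.
# Entries are sorted by start ID, then each start is compared with the
# highest range end seen so far.
find_subid_overlaps() {
  sort -t: -k2,2n | awk -F: '
    NF == 3 {
      if (seen && $2 < prev_end) print prev_name " overlaps " $1
      stop = $2 + $3
      if (!seen || stop > prev_end) { prev_end = stop; prev_name = $1 }
      seen = 1
    }'
}

# Check the sub-UID ranges configured on this host.
if [ -r /etc/subuid ]; then
  find_subid_overlaps < /etc/subuid
fi
```

No output means that no overlap was found. Run the same check against /etc/subgid for the sub-GID ranges.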

6.2 Rootless containers and the storage graph root

Podman stores the containers' data in the storage graph root (default is ~/.local/share/containers/storage). Because of the way Podman remaps user IDs in rootless containers, the graph root may contain files that are not owned by your current user but by a user ID in the sub-UID region assigned to your user. As these files do not belong to your current user, they can be inaccessible to you.

To read or modify any file in the graph root, enter a shell as follows:

> podman unshare bash
> id
  uid=0(root) gid=0(root) groups=0(root),65534(nobody)

Note that podman unshare performs the same user remapping as podman run does when launching a rootless container. You cannot gain elevated privileges via podman unshare.

Warning

Do not modify files in the graph root as this can corrupt Podman's internal state and render your containers, images and volumes inoperable.

7 Legal Notice

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.

For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.