User Namespace

HackTricks in 🐦 Twitter 🐦 - 🎙️ Twitch Wed - 18.30(UTC) 🎙️ - 🎥 Youtube 🎥

Basic Information

A user namespace is a Linux kernel feature that provides isolation of user and group ID mappings, allowing each user namespace to have its own set of user and group IDs. This isolation enables processes running in different user namespaces to have different privileges and ownership, even if they share the same user and group IDs numerically.

User namespaces are particularly useful in containerization, where each container should have its own independent set of user and group IDs, allowing for better security and isolation between containers and the host system.

How it works:

  1. When a new user namespace is created, it starts with an empty set of user and group ID mappings. This means that any process running in the new user namespace will initially have no privileges outside of the namespace.
  2. ID mappings can be established between the user and group IDs in the new namespace and those in the parent (or host) namespace. This allows processes in the new namespace to have privileges and ownership corresponding to user and group IDs in the parent namespace. However, the ID mappings can be restricted to specific ranges and subsets of IDs, allowing for fine-grained control over the privileges granted to processes in the new namespace.
  3. Within a user namespace, processes can have full root privileges (UID 0) for operations inside the namespace, while still having limited privileges outside the namespace. This allows containers to run with root-like capabilities within their own namespace without having full root privileges on the host system.
  4. Processes can move between namespaces using the setns() system call or create new namespaces using the unshare() or clone() system calls with the CLONE_NEWUSER flag. When a process moves to a new namespace or creates one, it will start using the user and group ID mappings associated with that namespace.

Lab:

Create different Namespaces

CLI

sudo unshare -U [--mount-proc] /bin/bash

If you pass the --mount-proc parameter, unshare mounts a new instance of the /proc filesystem, so the new mount namespace gets an accurate and isolated view of the process information specific to that namespace.

Error: bash: fork: Cannot allocate memory

You get this error when unshare also creates a new PID namespace (-p) and you run it without -f.
The error occurs because the process with PID 1 in the new namespace exits.

Once bash starts, it forks several sub-processes to do its work. If you run unshare without -f, bash keeps the same PID as the current "unshare" process. The "unshare" process calls the unshare system call and creates a new PID namespace, but it is not itself placed in that namespace. This is the intended behavior of the Linux kernel: when process A creates a new PID namespace, process A stays where it is and only its sub-processes are placed in the new namespace. So when you run:

unshare -p /bin/bash

The unshare process execs /bin/bash, and /bin/bash forks several sub-processes. The first sub-process of bash becomes PID 1 of the new namespace and exits once it has finished its job, so PID 1 of the new namespace is gone.

PID 1 has a special role: it becomes the parent of all orphaned processes. If PID 1 in the root namespace exits, the kernel panics. If PID 1 in a child namespace exits, the Linux kernel calls the disable_pid_allocation function, which clears the PIDNS_HASH_ADDING flag in that namespace. When the kernel creates a new process, it calls the alloc_pid function to allocate a PID in the namespace, and if the PIDNS_HASH_ADDING flag is not set, alloc_pid returns -ENOMEM. That is why you get the "Cannot allocate memory" error.

You can resolve this issue by using the '-f' option:

unshare -fp /bin/bash

With the '-f' option, unshare forks a new process after it creates the new PID namespace and runs /bin/bash in that process, which becomes PID 1 of the new PID namespace. Bash still forks several sub-processes to do its work, but because bash itself is PID 1 of the new PID namespace, those sub-processes can exit without any problem.

Copied from https://stackoverflow.com/questions/44666700/unshare-pid-bin-bash-fork-cannot-allocate-memory
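The effect of -f can be checked without opening an interactive shell. The following is a minimal sketch, not part of the original answer: the helper name first_pid_in_new_ns is hypothetical, and it assumes either root or a kernel that allows unprivileged user namespaces (hence the added -U -r).

```shell
# Hypothetical helper: report the PID that the first process sees
# inside a freshly created PID namespace.
# -U -r  create a user namespace and map the caller to root, so that
#        creating the PID namespace does not require sudo (assumption:
#        unprivileged user namespaces are enabled on this kernel)
# -f -p  create a PID namespace and fork, so the child is its PID 1
first_pid_in_new_ns() {
    unshare -U -r -f -p sh -c 'echo $$'
}

# Usage (left commented out because it depends on kernel settings):
# first_pid_in_new_ns
```

Because of -f, the sh process is the first process in the new PID namespace, so the PID it reports for itself should be 1.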

Docker

docker run -ti --name ubuntu1 -v /usr:/ubuntu1 ubuntu bash

To use user namespaces, the Docker daemon needs to be started with --userns-remap=default (in Ubuntu 14.04, this can be done by modifying /etc/default/docker and then executing sudo service docker restart).

Check which namespace your process is in

ls -l /proc/self/ns/user
lrwxrwxrwx 1 root root 0 Apr  4 20:57 /proc/self/ns/user -> 'user:[4026531837]'
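Two processes are in the same user namespace when their /proc/<pid>/ns/user links point to the same identifier. A small sketch of that comparison follows; the helper name same_userns is hypothetical, and reading another user's process links may require root.

```shell
# Hypothetical helper: same_userns PID_A PID_B
# Succeeds (exit 0) if both processes share a user namespace,
# fails with 1 if they differ, and with 2 if a link cannot be read.
same_userns() {
    userns_a=$(readlink "/proc/$1/ns/user") || return 2
    userns_b=$(readlink "/proc/$2/ns/user") || return 2
    [ "$userns_a" = "$userns_b" ]
}

# Example: compare the current shell with itself.
if same_userns "$$" self; then
    echo "same user namespace"
fi
```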

It's possible to check the user map from the docker container with:

cat /proc/self/uid_map 
         0          0 4294967295  --> Without remapping: root in the container is root on the host
         0     231072      65536  --> With remapping: root in the container is UID 231072 on the host

Or from the host with:

cat /proc/<pid>/uid_map 
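Each uid_map line has three fields: the first ID inside the namespace, the first ID outside it, and the length of the mapped range. As an illustration, the sketch below translates an inside UID to the corresponding outside UID; the helper name map_uid is hypothetical and is not part of any tool mentioned here.

```shell
# Hypothetical helper: map_uid INSIDE_UID [MAP_FILE]
# Prints the outside UID for INSIDE_UID according to MAP_FILE
# (defaults to /proc/self/uid_map), or "unmapped" if no range covers it.
map_uid() {
    awk -v id="$1" '
        # $1 = first inside ID, $2 = first outside ID, $3 = range length
        id >= $1 && id < $1 + $3 {
            printf "%.0f\n", $2 + (id - $1)
            found = 1
            exit
        }
        END { if (!found) print "unmapped" }
    ' "${2:-/proc/self/uid_map}"
}

# Usage against the current process:
# map_uid 0
```

With the remapped line shown above (0 231072 65536), inside UID 1000 corresponds to 231072 + 1000 = 232072 on the host, and any inside UID of 65536 or higher is unmapped.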

Find all User namespaces

{% code overflow="wrap" %}

sudo find /proc -maxdepth 3 -type l -name user -exec readlink {} \; 2>/dev/null | sort -u
# Find the processes with a specific namespace
sudo find /proc -maxdepth 3 -type l -name user -exec ls -l  {} \; 2>/dev/null | grep <ns-number>

{% endcode %}
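As a complement to the find commands, the following sketch groups processes by user namespace and counts them. The helper name list_userns is hypothetical; run it as root to see the processes of other users.

```shell
# Hypothetical helper: print "<count> user:[<inode>]" for every user
# namespace that has at least one readable process, busiest first.
list_userns() {
    for link in /proc/[0-9]*/ns/user; do
        # Processes may exit or be unreadable; skip those silently.
        readlink "$link" 2>/dev/null
    done | sort | uniq -c | sort -rn
}

# Usage:
# list_userns
```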

Enter inside a User namespace

nsenter -t TARGET_PID -U /bin/bash

Also, you can only enter another process's namespace if you are root, and you cannot enter another namespace without a descriptor pointing to it (like /proc/self/ns/user).

Create new User namespace (with mappings)

{% code overflow="wrap" %}

unshare -U [--map-user=<uid>|<name>] [--map-group=<gid>|<name>] [--map-root-user] [--map-current-user]

{% endcode %}

# Container
sudo unshare -U /bin/bash
nobody@ip-172-31-28-169:/home/ubuntu$ # Note that the user is now nobody

# From the host
ps -ef | grep bash # The user inside the host is still root, not nobody
root       27756   27755  0 21:11 pts/10   00:00:00 /bin/bash
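The difference between an unmapped namespace and one created with --map-root-user can be seen by printing the UID from inside. This is a minimal sketch with a hypothetical helper name (inside_uid); it assumes root or a kernel that allows unprivileged user namespaces.

```shell
# Hypothetical helper: inside_uid [UNSHARE_OPTIONS...]
# Prints the UID seen by a process inside a fresh user namespace.
# Extra options are passed to unshare, e.g. -r (--map-root-user).
inside_uid() {
    unshare -U "$@" id -u
}

# Usage (commented out because it depends on kernel settings):
# inside_uid      # no mapping: the overflow UID, usually 65534 ("nobody")
# inside_uid -r   # caller mapped to root inside the namespace: 0
```

In both cases the process keeps its original UID when viewed from the host, which is what the ps output above shows.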

Recovering Capabilities

In the case of user namespaces, when a new user namespace is created, the process that enters the namespace is granted a full set of capabilities within that namespace. These capabilities allow the process to perform privileged operations such as mounting filesystems, creating devices, or changing ownership of files, but only within the context of its user namespace.

For example, when you have the CAP_SYS_ADMIN capability within a user namespace, you can perform operations that typically require this capability, like mounting filesystems, but only within the context of your user namespace. Any operations you perform with this capability won't affect the host system or other namespaces.

{% hint style="warning" %} Therefore, even though getting a new process inside a new user namespace gives you all the capabilities back (CapEff: 000001ffffffffff), you can actually use only the ones related to the namespace (mount, for example), not all of them. So this, on its own, is not enough to escape from a Docker container. {% endhint %}
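To read a CapEff value such as the one above, each bit of the hex mask can be tested individually. The sketch below uses hypothetical helper names (has_cap, cap_eff); the only outside fact it relies on is that CAP_SYS_ADMIN is capability number 21 in linux/capability.h.

```shell
# Hypothetical helper: has_cap MASK BIT
# Succeeds if capability number BIT is set in the hex MASK.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Hypothetical helper: effective capability mask of the current process.
cap_eff() {
    awk '/^CapEff:/ { print $2 }' /proc/self/status
}

# CAP_SYS_ADMIN is capability number 21.
if has_cap 000001ffffffffff 21; then
    echo "CAP_SYS_ADMIN is set in the mask"
fi

# Usage against the current process:
# has_cap "$(cap_eff)" 21 && echo "CAP_SYS_ADMIN is effective here"
```

A set bit only says the capability is held inside the process's own user namespace; as the warning above explains, it does not extend to resources owned by the host.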

# These are the syscalls that are filtered after changing the user namespace with:
unshare -UmCpf  bash

Testing: 0x067 . . . Error
Testing: 0x070 . . . Error
Testing: 0x074 . . . Error
Testing: 0x09b . . . Error
Testing: 0x0a3 . . . Error
Testing: 0x0a4 . . . Error
Testing: 0x0a7 . . . Error
Testing: 0x0a8 . . . Error
Testing: 0x0aa . . . Error
Testing: 0x0ab . . . Error
Testing: 0x0af . . . Error
Testing: 0x0b0 . . . Error
Testing: 0x0f6 . . . Error
Testing: 0x12c . . . Error
Testing: 0x130 . . . Error
Testing: 0x139 . . . Error
Testing: 0x140 . . . Error
Testing: 0x141 . . . Error
Testing: 0x143 . . . Error