ZFS is confused by user namespaces (uid/gid mapping) when used with acltype=posixacl

Bug #1567558 reported by Stéphane Graber
This bug affects 3 people
Affects              Status         Importance   Assigned to   Milestone
linux (Ubuntu)       Fix Released   Undecided    Tim Gardner
  Xenial             Fix Released   Undecided    Tim Gardner
  Yakkety            Fix Released   Undecided    Tim Gardner
zfs-linux (Ubuntu)   Fix Released   Undecided    Unassigned
  Xenial             Confirmed      Undecided    Unassigned
  Yakkety            Fix Released   Undecided    Unassigned

Bug Description

This report is copied from the following upstream issue: https://github.com/zfsonlinux/zfs/issues/4177

I was asked to file a matching Ubuntu bug for tracking.

Hello,

# First a quick introduction to the world of containers

I'm the project leader for LXC and LXD, working on containers on Linux. We now use user namespaces extensively to provide an extra layer of security in Linux containers.

The user namespace allows one to map a range of uid and gid from the host or parent namespace into another range of uid and gid of a new namespace.

Typically, 65536 uids and gids are set aside for each non-system user on the host. Through a couple of setuid helpers (newuidmap and newgidmap), those users can then set up a uid and gid map for their processes. Their 65536-id allocation is therefore mapped to uid/gid 0 through 65535 of the new namespace, providing a POSIX-compatible environment.

That means that given a user on the host with uid and gid range 100000 through 165535, uid 100 in their container will be mapped to uid 100100 outside of it.
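
The arithmetic behind that mapping can be sketched in a few lines of shell. This is only an illustration: ns_to_host_id is a hypothetical helper (not part of LXC or any other tool), and it assumes a single map entry given as first id inside the namespace, first id on the host, and number of ids.

```shell
# Hypothetical helper, for illustration only: translate an id seen inside
# the namespace into the corresponding host id for one map entry.
# Arguments: <id> <first id inside> <first id on host> <count>
ns_to_host_id() {
    n_id=$1 n_ns_start=$2 n_host_start=$3 n_count=$4
    if [ "$n_id" -ge "$n_ns_start" ] && [ "$n_id" -lt $((n_ns_start + n_count)) ]; then
        # Inside the mapped range: shift by the offset between the two ranges.
        echo $((n_host_start + n_id - n_ns_start))
    else
        # Outside the mapped range: there is no corresponding host id.
        echo "unmapped"
        return 1
    fi
}

ns_to_host_id 100 0 100000 65536    # prints 100100
```

With the map used in this report (0 100000 65536), the last valid id inside the namespace is 65535, which corresponds to host id 165535.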

# The problem with ZFS

When using ZFS with acltype=posixacl, an ACL entry set on the host for a uid (or gid) that is then mapped into the container does not show the correct mapped value when the ACL is queried from inside the namespace.

# Example with zfs (broken)

root@dakara:~# zfs create lxd/test -o mountpoint=/tmp/test
root@dakara:~# zfs set acltype=posixacl lxd/test
root@dakara:~# cd /tmp/test/
root@dakara:/tmp/test# mkdir a
root@dakara:/tmp/test# setfacl -m default:user:100100:rwX a
root@dakara:/tmp/test# setfacl -m user:100100:rwX a
root@dakara:/tmp/test# getfacl a
# file: a
# owner: root
# group: root
user::rwx
user:100100:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:100100:rwx
default:group::r-x
default:mask::rwx
default:other::r-x

root@dakara:/tmp/test# lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /bin/bash
root@dakara:/tmp/test (in userns)# ls -lh
total 512
drwxrwxr-x+ 2 nobody nogroup 2 Jan 7 22:19 a

root@dakara:/tmp/test (in userns)# getfacl -n a
# file: a
# owner: nobody
# group: nogroup
user::rwx
user:4294967295:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:4294967295:rwx
default:group::r-x
default:mask::rwx
default:other::r-x

# Example with ext4 (working)

root@dakara:/tmp/test.ext4# mkdir a

root@dakara:/tmp/test.ext4# setfacl -m default:user:100100:rwX a

root@dakara:/tmp/test.ext4# setfacl -m user:100100:rwX a

root@dakara:/tmp/test.ext4# getfacl a
# file: a
# owner: root
# group: root
user::rwx
user:100100:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:100100:rwx
default:group::r-x
default:mask::rwx
default:other::r-x

root@dakara:/tmp/test.ext4# lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /bin/bash
root@dakara:/tmp/test.ext4 (in userns)# ls -lh
total 4.0K
drwxrwxr-x+ 2 nobody nogroup 4.0K Jan 7 22:22 a

root@dakara:/tmp/test.ext4 (in userns)# getfacl -n a
# file: a
# owner: 65534
# group: 65534
user::rwx
user:100:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:100:rwx
default:group::r-x
default:mask::rwx
default:other::r-x
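
The values in the two getfacl -n outputs can be checked with a small sketch of the reverse translation (host id to namespace id). host_to_ns_id is a hypothetical helper, and the fallback of 65534 for ids outside the map is an assumption taken from the owner and group shown in the ext4 output above.

```shell
# Hypothetical helper, for illustration only: translate a host id into the
# id seen inside the namespace for one map entry.
# Arguments: <host id> <first id inside> <first id on host> <count>
host_to_ns_id() {
    h_id=$1 h_ns_start=$2 h_host_start=$3 h_count=$4
    if [ "$h_id" -ge "$h_host_start" ] && [ "$h_id" -lt $((h_host_start + h_count)) ]; then
        echo $((h_ns_start + h_id - h_host_start))
    else
        # Assumption: unmapped ids are reported as 65534, as in the ext4 output.
        echo 65534
    fi
}

host_to_ns_id 100100 0 100000 65536   # the ACL entry user:100100 -> 100
host_to_ns_id 0 0 100000 65536        # host root is outside the map -> 65534
echo $(( (1 << 32) - 1 ))             # 4294967295, the id ZFS reported
```

The ext4 output matches the first two results. The 4294967295 reported on ZFS equals 2^32 - 1, the all-ones 32-bit value, which is consistent with the id not having been translated correctly rather than with any real mapped user.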

# Environment

This was noticed on Ubuntu 14.04 using the zfs stable PPA. I first found it in production environments, with file servers misbehaving due to the problem, then reproduced it on my development systems.

The zfs version here is 0.6.5.3-1~trusty and I've seen this on 3.13, 3.16, 3.19 and 4.2 kernels (not that it should matter, the dkms code was the same). zfs-dkms is at 2.53-zfs1.

The lxc-usernsexec helper tool I'm using there comes from the LXC package in Ubuntu. It essentially causes a call to fork() followed by a call to unshare(CLONE_NEWUSER), then calls the newuidmap and newgidmap setuid helpers with the provided map so that the namespace can be configured properly.

You could reproduce something similar using the simple unshare tool and manual writes to /proc/PID/{u,g}id_map.
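
For the manual route, each line written to /proc/PID/uid_map or /proc/PID/gid_map has the form "<first id inside> <first id outside> <count>". A minimal sketch of converting the -m arguments used above into that form follows; map_spec_to_line is a hypothetical helper and does not come from the LXC package.

```shell
# Hypothetical helper, for illustration only: turn an lxc-usernsexec style
# "-m" argument such as u:0:100000:65536 into a uid_map/gid_map line.
map_spec_to_line() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split the spec on ':' into $1..$4
    IFS=$old_ifs
    # $1 = u or g, $2 = first id inside, $3 = first id outside, $4 = count
    echo "$2 $3 $4"
}

map_spec_to_line u:0:100000:65536   # prints: 0 100000 65536
```

The resulting line would then be written, from a sufficiently privileged process outside the namespace, to the uid_map (for u: entries) or gid_map (for g: entries) file of the process that called unshare(CLONE_NEWUSER). That last step is not exercised here.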

Richard Laager (rlaager) wrote :
Richard Laager (rlaager)
Changed in linux (Ubuntu):
status: New → Confirmed
Changed in zfs-linux (Ubuntu):
status: New → Confirmed
Richard Laager (rlaager) wrote :

@stgraber: If this is something you can reproduce (e.g. in a VM) using zfs-dkms rather than the pre-compiled zfs.ko from linux-image, can you please test from this PPA:
https://launchpad.net/~rlaager/+archive/ubuntu/zfs

The package there has the patch from upstream.

If you do test from my PPA, please remove it from your APT sources when you're done testing. I don't want some future experiment I upload to break your system.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Yakkety):
assignee: nobody → Tim Gardner (timg-tpi)
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in zfs-linux (Ubuntu Xenial):
status: New → Confirmed
Kamal Mostafa (kamalmostafa) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-23.41

---------------
linux (4.4.0-23.41) xenial; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1582431

  * zfs: disable module checks for zfs when cross-compiling (LP: #1581127)
    - [Packaging] disable zfs module checks when cross-compiling

  * Xenial update to v4.4.10 stable release (LP: #1580754)
    - Revert "UBUNTU: SAUCE: (no-up) ACPICA: Dispatcher: Update thread ID for
      recursive method calls"
    - Revert "UBUNTU: SAUCE: nbd: ratelimit error msgs after socket close"
    - Revert: "powerpc/tm: Check for already reclaimed tasks"
    - RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
    - ipvs: handle ip_vs_fill_iph_skb_off failure
    - ipvs: correct initial offset of Call-ID header search in SIP persistence
      engine
    - ipvs: drop first packet to redirect conntrack
    - mfd: intel-lpss: Remove clock tree on error path
    - nbd: ratelimit error msgs after socket close
    - ata: ahci_xgene: dereferencing uninitialized pointer in probe
    - mwifiex: fix corner case association failure
    - CNS3xxx: Fix PCI cns3xxx_write_config()
    - clk-divider: make sure read-only dividers do not write to their register
    - soc: rockchip: power-domain: fix err handle while probing
    - clk: rockchip: free memory in error cases when registering clock branches
    - clk: meson: Fix meson_clk_register_clks() signature type mismatch
    - clk: qcom: msm8960: fix ce3_core clk enable register
    - clk: versatile: sp810: support reentrance
    - clk: qcom: msm8960: Fix ce3_src register offset
    - lpfc: fix misleading indentation
    - ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p
      initialisation
    - mac80211: fix statistics leak if dev_alloc_name() fails
    - tracing: Don't display trigger file for events that can't be enabled
    - MD: make bio mergeable
    - Minimal fix-up of bad hashing behavior of hash_64()
    - mm, cma: prevent nr_isolated_* counters from going negative
    - mm/zswap: provide unique zpool name
    - ARM: EXYNOS: Properly skip unitialized parent clock in power domain on
    - ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel
    - xen: Fix page <-> pfn conversion on 32 bit systems
    - xen/balloon: Fix crash when ballooning on x86 32 bit PAE
    - xen/evtchn: fix ring resize when binding new events
    - HID: wacom: Add support for DTK-1651
    - HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
    - Input: zforce_ts - fix dual touch recognition
    - proc: prevent accessing /proc/<PID>/environ until it's ready
    - mm: update min_free_kbytes from khugepaged after core initialization
    - batman-adv: fix DAT candidate selection (must use vid)
    - batman-adv: Check skb size before using encapsulated ETH+VLAN header
    - batman-adv: Fix broadcast/ogm queue limit on a removed interface
    - batman-adv: Reduce refcnt of removed router when updating route
    - writeback: Fix performance regression in wb_over_bg_thresh()
    - MAINTAINERS: Remove asterisk from EFI directory names
    - x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
    - ARM: cpuidle: Pass on arm_cpuidle_s...

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Andreas Fuchs (asf) wrote :

I tested the xenial-proposed kernel (4.4.0-23) on a Xenial machine that was showing the exact symptoms described by the original reporter. Here's the sequence of commands on the -proposed kernel:

root@bonnetmaker:~# uname -a
Linux bonnetmaker 4.4.0-23-lowlatency #41-Ubuntu SMP PREEMPT Mon May 16 23:55:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@bonnetmaker:~# zfs create lxd/test -o mountpoint=/tmp/test
root@bonnetmaker:~# zfs set acltype=posixacl lxd/test
root@bonnetmaker:~# cd /tmp/test/
root@bonnetmaker:/tmp/test# mkdir a
root@bonnetmaker:/tmp/test# setfacl -m default:user:100100:rwX a
root@bonnetmaker:/tmp/test# setfacl -m user:100100:rwX a
root@bonnetmaker:/tmp/test# getfacl -n a
# file: a
# owner: 0
# group: 0
user::rwx
user:100100:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:100100:rwx
default:group::r-x
default:mask::rwx
default:other::r-x

root@bonnetmaker:/tmp/test# lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /bin/bash
bash: /root/.bashrc: Permission denied
root@bonnetmaker:/tmp/test# ls -lh
total 512
drwxrwxr-x+ 2 nobody nogroup 2 May 23 16:24 a
root@bonnetmaker:/tmp/test# getfacl -n a
# file: a
# owner: 65534
# group: 65534
user::rwx
user:100:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:100:rwx
default:group::r-x
default:mask::rwx
default:other::r-x

root@bonnetmaker:/tmp/test#

Numbers check out - looks like it's working now!

tags: added: verification-done-xenial
removed: verification-needed-xenial
Richard Laager (rlaager) wrote :

This is fixed in zfs-linux in Yakkety, which has the 0.6.5.7 release.

Changed in zfs-linux (Ubuntu Yakkety):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-24.43

---------------
linux (4.4.0-24.43) xenial; urgency=low

  [ Kamal Mostafa ]

  * CVE-2016-1583 (LP: #1588871)
    - ecryptfs: fix handling of directory opening
    - SAUCE: proc: prevent stacking filesystems on top
    - SAUCE: ecryptfs: forbid opening files without mmap handler
    - SAUCE: sched: panic on corrupted stack end

  * arm64: statically link rtc-efi (LP: #1583738)
    - [Config] Link rtc-efi statically on arm64

 -- Kamal Mostafa <email address hidden> Fri, 03 Jun 2016 10:02:16 -0700

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released