
"mkdir: cannot create directory '/sys/fs/cgroup/memory/mediawiki/job/5': Permission denied" "limit.sh: failed to create the cgroup."
Closed, Resolved (Public)

Description

Example command:

'/usr/bin/firejail' '--quiet' '--profile=/srv/mediawiki/php-1.36.0-wmf.30/includes/shell/firejail.profile' '--blacklist=/srv/mediawiki/php-1.36.0-wmf.30/LocalSettings.php' '--noroot' '--seccomp' '--private-dev' '--net=none' -- /bin/bash '/srv/mediawiki/php-1.36.0-wmf.30/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/srv/mediawiki/php-1.36.0-wmf.30/extensions/SyntaxHighlight_GeSHi/includes/../pygments/pygmentize'\'' '\''-l'\'' '\''python'\'' '\''-f'\'' '\''html'\'' '\''-O'\'' '\''cssclass=mw-highlight,encoding=utf-8'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=50; SB_CGROUP='\''/sys/fs/cgroup/memory/mediawiki/job'\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=536870912; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes'

Error output is:

mkdir: cannot create directory '/sys/fs/cgroup/memory/mediawiki/job/5': Permission denied
limit.sh: failed to create the cgroup.

More examples are easy to find in logstash under channel:exec.

This doesn't appear to actually cause the command to fail though?

Event Timeline

It's caused by firejail. I can reproduce it by running the command, but if I remove firejail then it works.

It's necessary to do "noblacklist /sys/fs". That suppresses the default blacklist of that location. It can be in MW's profile since it's not specific to our servers.

Actually, that doesn't work, and it's hard to see how to make it work or how it could have ever worked. Firejail unconditionally unmounts the host's /sys, and doesn't mount /sys/fs/cgroup/memory inside the container -- that's a separate mount in the host which would have to be duplicated into the container. Normally it blacklists /sys/fs but if you suppress that, there's still nothing inside /sys/fs/cgroup.
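To check the point about the missing mount, one could compare /proc/self/mounts outside and inside the sandbox. The following is a minimal Python sketch; the function name and the two sample mount tables are hypothetical illustrations, not output captured from the affected servers, and it assumes a cgroup v1 layout like the /sys/fs/cgroup/memory path in the error.

```python
from typing import Optional


def find_memory_cgroup_mount(mounts_text: str) -> Optional[str]:
    """Return the mount point of the cgroup v1 memory controller, if any.

    mounts_text is expected to be in /proc/self/mounts format:
    device, mount point, filesystem type, options, dump, pass.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        # On cgroup v1 the controller name appears among the mount options.
        if fs_type == "cgroup" and "memory" in options.split(","):
            return mount_point
    return None


# Illustrative sample data only (assumed, not taken from a real host).
HOST_MOUNTS = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
"""

SANDBOX_MOUNTS = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
"""

host_result = find_memory_cgroup_mount(HOST_MOUNTS)
sandbox_result = find_memory_cgroup_mount(SANDBOX_MOUNTS)
```

On a real system the input would come from reading /proc/self/mounts once directly and once from a process started under firejail. If the second lookup returns None, no blacklist setting can make the mkdir succeed.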

I found https://github.com/netblue30/firejail/issues/862 which led me to T145623, where @Gilles discovered the same thing basically. That task is declined, so I'm guessing he never found a workaround.

But how come this hasn't been spewing warnings since firejail was enabled...?

FWIW, firejail does have:

--rlimit-as=number
    Set the maximum size of the process's virtual memory (address space) in bytes.
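For illustration, here is a rough Python sketch of passing the memory limit to firejail through that option instead of a cgroup. The helper name is hypothetical, and the other flags are copied from the example command in the description. Note that an address-space limit is not equivalent to a cgroup memory limit: it caps virtual memory for each process, not resident memory for the whole group.

```python
from typing import List


def firejail_argv(command: List[str], mem_limit_bytes: int) -> List[str]:
    """Build a firejail command line that caps the address space.

    Hypothetical helper: only --rlimit-as is the point here; the remaining
    flags mirror the example command in the task description.
    """
    return [
        "/usr/bin/firejail",
        "--quiet",
        "--noroot",
        "--seccomp",
        "--private-dev",
        "--net=none",
        f"--rlimit-as={mem_limit_bytes}",
        "--",
    ] + list(command)


# 1073741824 is the SB_MEM_LIMIT value from the example command.
argv = firejail_argv(["/bin/true"], 1073741824)
```

The resulting list could be handed to subprocess.run(argv) on a host with firejail installed.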

Previously limit.sh was running outside of firejail. So the wrapper priority in Shellbox is wrong.

	/**
	 * Get an integer priority level used to determine the order in which to
	 * run multiple wrappers. Low numbers are innermost, high numbers are
	 * outermost, run last.
	 *
	 * If you nest sandboxes, it makes sense to have the most privileged
	 * hypervisor/wrapper at the outside, and the least privileged on the
	 * inside. Suggested values:
	 *
	 *    - 20: ulimit
	 *    - 40: chroot
	 *    - 60: system-level container
	 *    - 80: initial shell
	 *
	 * @return int
	 */

We could switch the order of ulimit and chroot here. I can't use a cgroup from inside my systemd-nspawn container, which is an argument for limit.sh being 60. Or the priority could depend on whether a cgroup is being used.
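The last option, making the priority depend on whether a cgroup is used, could look roughly like the Python sketch below. This is not Shellbox code: the Wrapper class, the function names, and the value 40 for firejail are assumptions for illustration. Only the 20/60 suggestions come from the docblock above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Wrapper:
    name: str
    priority: int


def limit_sh_priority(cgroup: Optional[str]) -> int:
    """Assumed rule: ulimit-only use stays at 20, cgroup use moves to 60."""
    return 60 if cgroup else 20


def nesting_order(wrappers: List[Wrapper]) -> List[str]:
    """Return wrapper names from outermost to innermost.

    Per the docblock, high priority numbers are outermost.
    """
    ordered = sorted(wrappers, key=lambda w: w.priority, reverse=True)
    return [w.name for w in ordered]


FIREJAIL_PRIORITY = 40  # assumed value, chosen only for this example

without_cgroup = nesting_order([
    Wrapper("limit.sh", limit_sh_priority(None)),
    Wrapper("firejail", FIREJAIL_PRIORITY),
])

with_cgroup = nesting_order([
    Wrapper("limit.sh", limit_sh_priority("/sys/fs/cgroup/memory/mediawiki/job")),
    Wrapper("firejail", FIREJAIL_PRIORITY),
])
```

Under these assumptions, limit.sh without a cgroup stays inside firejail, as in the failing command. With a cgroup it becomes the outer wrapper, so its mkdir runs where the host's /sys/fs/cgroup/memory is still mounted.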

Change 666226 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/libs/Shellbox@master] Raise priority of limit.sh if it uses a cgroup

https://gerrit.wikimedia.org/r/666226

Change 666226 merged by jenkins-bot:
[mediawiki/libs/Shellbox@master] Raise priority of limit.sh if it uses a cgroup

https://gerrit.wikimedia.org/r/666226

Legoktm assigned this task to tstarling.

The last error in logstash is from Mar 4, 2021 @ 14:03:52.830 which coincides with https://sal.toolforge.org/log/O5-N_XcBa_6PSCT9lCZq \o/