Toolforge: sgebastion: systemd resource control not working
Closed, Resolved, Public

Description

I just found this:

aborrero@tools-sgebastion-06:~$ sudo systemctl status user-.slice
● user-.slice
   Loaded: error (Reason: Invalid argument)
  Drop-In: /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: inactive (dead)

Feb 04 11:14:26 tools-sgebastion-06 systemd[1]: [/etc/systemd/system/user-.slice.d/puppet-override.conf:11] Memory limit '0' out of range. Ignoring.
Feb 04 11:14:26 tools-sgebastion-06 systemd[1]: [/etc/systemd/system/user-.slice.d/puppet-override.conf:14] Unknown lvalue 'IPAccounting' in section 'Slice'
Feb 04 11:14:26 tools-sgebastion-06 systemd[1]: user-.slice: Slice name user-.slice is not valid. Refusing.

aborrero@tools-sgebastion-06:~$ apt-cache policy systemd
systemd:
  Installed: 232-25+deb9u8
  Candidate: 232-25+deb9u8
  Version table:
     239-12~bpo9+1 100
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
 *** 232-25+deb9u8 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u6 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages

Possible causes:
  • We have the wrong systemd version installed
  • We have a typo in the config

Event Timeline

aborrero created this task.

Change 487823 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: bastion: introduce apt pinning for systemd

https://gerrit.wikimedia.org/r/487823

Change 487823 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: bastion: introduce apt pinning for systemd

https://gerrit.wikimedia.org/r/487823

Mentioned in SAL (#wikimedia-cloud) [2019-02-04T11:36:32Z] <arturo> T215154 manually install systemd 239 in tools-sgebastion-06

Mentioned in SAL (#wikimedia-cloud) [2019-02-04T11:38:49Z] <arturo> T215154 reboot tools-sgebastion-06 to totally refresh systemd status

Mentioned in SAL (#wikimedia-cloud) [2019-02-04T12:26:18Z] <arturo> T215154 another reboot for tools-sgebastion-06. Puppet is disabled

Something is not making any sense. I can go over the memory limit and systemd does nothing despite limits being set:

root@tools-sgebastion-06:~# systemctl status user-18194.slice
● user-18194.slice
   Loaded: loaded
  Drop-In: /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: active since Mon 2019-02-04 12:43:51 UTC; 1min 6s ago
    Tasks: 9 (limit: 100)
   Memory: 760.5M (high: 100.0M max: 150.0M swap max: 0B limit: 150.0M)
      CPU: 4.993s
   CGroup: /user.slice/user-18194.slice
           ├─session-9.scope
           │ ├─1307 sshd: aborrero [priv]
           │ ├─1333 sshd: aborrero@pts/1
           │ ├─1334 -bash
           │ ├─2378 stress -c 2 --vm 1 --vm-bytes 1G
           │ ├─2379 stress -c 2 --vm 1 --vm-bytes 1G
           │ ├─2380 stress -c 2 --vm 1 --vm-bytes 1G
           │ └─2381 stress -c 2 --vm 1 --vm-bytes 1G

I'm now running the correct systemd version:

root@tools-sgebastion-06:~# dpkg -s systemd | grep Version
Version: 239-12~bpo9+1
root@tools-sgebastion-06:~# systemd --version
systemd 239
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid

Another weird thing is this:

root@tools-sgebastion-06:~# systemctl status user-.slice
Warning: The unit file, source configuration file or drop-ins of user-.slice changed on disk. Run 'systemctl daemon-reload' to reload units.
● user-.slice
   Loaded: error (Reason: Unit user-.slice failed to loaded properly: Invalid argument.)
  Drop-In: /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: inactive (dead)

systemctl reports that slice as failing with 'Invalid argument', but gives no further information on what is failing. Apparently daemon-reload does nothing.

Is this one of those template units that need to be enabled for each user separately? I see there's a "user-0.slice" for root.

Mentioned in SAL (#wikimedia-cloud) [2019-02-04T13:19:58Z] <arturo> T215154 another reboot for tools-sgebastion-06

Is this one of those template units that need to be enabled for each user separately? I see there's a "user-0.slice" for root.

Yes, user-0.slice is for root. We need an explicit config for root because otherwise it would get the same limits as the rest of the users (i.e. those from user-.slice).

I just wrote this https://wikitech.wikimedia.org/wiki/Systemd_resource_control

The problem has been mostly solved now:

root@tools-sgebastion-06:~# systemctl status user-.slice
Warning: The unit file, source configuration file or drop-ins of user-.slice changed on disk. Run 'systemctl daemon-reload' to reload units.
● user-.slice
   Loaded: error (Reason: Unit user-.slice failed to loaded properly: Invalid argument.)
  Drop-In: /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: inactive (dead)

^^^ this happens because of a feature/bug in systemctl status, which doesn't like the 'fake' unit name ending in a dash

root@tools-sgebastion-06:~# systemctl status user-18194.slice
● user-18194.slice
   Loaded: loaded
  Drop-In: /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: active since Mon 2019-02-04 12:43:51 UTC; 1min 6s ago
    Tasks: 9 (limit: 100)
   Memory: 760.5M (high: 100.0M max: 150.0M swap max: 0B limit: 150.0M)

^^^ this happened because I already had a lot of memory allocated and was trying to set a limit below the current allocation (760.5M in use against a 150M limit). In the logs I saw:
user-18194.slice: Failed to set memory.limit_in_bytes: Device or resource busy
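
The "Device or resource busy" case above can be checked for in advance. The following is a minimal sketch, assuming the cgroup v1 memory controller (the log line refers to memory.limit_in_bytes); the helper name limit_fits and the example path are hypothetical and only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the current memory usage of a cgroup (v1
# memory controller) is at or below the limit we want to set, i.e. the limit
# can be applied without the slice first having to free memory.
limit_fits() {
    cgroup_dir=$1   # assumed path, e.g. /sys/fs/cgroup/memory/user.slice/user-18194.slice
    new_limit=$2    # desired limit in bytes
    usage=$(cat "$cgroup_dir/memory.usage_in_bytes") || return 2
    [ "$usage" -le "$new_limit" ]
}

# Usage (not run here):
#   limit_fits /sys/fs/cgroup/memory/user.slice/user-18194.slice 157286400 \
#       || echo "usage is above the new limit; log out or free memory first"
```

If the check fails, logging the user out (so that the slice is recreated) or rebooting are ways to get the limit applied from a clean state.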

aborrero@tools-sgebastion-06:~$ sudo systemctl status user-.slice
● user-.slice
   Loaded: error (Reason: Invalid argument)
  Drop-In: /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: inactive (dead)

Feb 04 11:14:26 tools-sgebastion-06 systemd[1]: [/etc/systemd/system/user-.slice.d/puppet-override.conf:11] Memory limit '0' out of range. Ignoring.
Feb 04 11:14:26 tools-sgebastion-06 systemd[1]: [/etc/systemd/system/user-.slice.d/puppet-override.conf:14] Unknown lvalue 'IPAccounting' in section 'Slice'
Feb 04 11:14:26 tools-sgebastion-06 systemd[1]: user-.slice: Slice name user-.slice is not valid. Refusing.

^^^ all these are gone after upgrading to systemd 239.

Change 487847 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: bastion: also apt pin udev

https://gerrit.wikimedia.org/r/487847

Change 487886 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: bastion: split resource control puppet code

https://gerrit.wikimedia.org/r/487886

Change 487886 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: bastion: split resource control puppet code

https://gerrit.wikimedia.org/r/487886

Change 487847 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: bastion: also apt pin udev

https://gerrit.wikimedia.org/r/487847

aborrero moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

I think we are all set.

Reopening because we moved from tools-sgebastion-06 to tools-sgebastion-07 and the systemd version isn't right there, so we don't have resource control:

aborrero@tools-sgebastion-07:~$ sudo apt-get install systemd
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 systemd : Depends: libsystemd0 (= 239-12~bpo9+1) but 232-25+deb9u8 is to be installed
E: Unable to correct problems, you have held broken packages.
aborrero@tools-sgebastion-07:~$ apt-cache policy udev systemd libsystemd0
udev:
  Installed: 232-25+deb9u8
  Candidate: 239-12~bpo9+1
  Version table:
     239-12~bpo9+1 1001
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
 *** 232-25+deb9u8 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u6 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages
systemd:
  Installed: 232-25+deb9u8
  Candidate: 239-12~bpo9+1
  Version table:
     239-12~bpo9+1 1001
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
 *** 232-25+deb9u8 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u6 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages
libsystemd0:
  Installed: 232-25+deb9u8
  Candidate: 232-25+deb9u8
  Version table:
     239-12~bpo9+1 100
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
 *** 232-25+deb9u8 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u6 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages

This indicates we should extend the apt pinning to libsystemd0 as well.
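
For illustration only, a pin covering all three packages could look like the stanza printed by the sketch below. The actual Puppet change is not reproduced here; the helper name apt_pin_stanza and the preferences file name are hypothetical. The priority 1001 matches what apt-cache policy shows above for the already-pinned packages.

```shell
#!/bin/sh
# Hypothetical helper: print an apt preferences stanza that pins the given
# packages to a version glob. A priority above 1000 lets apt pick that
# version even if doing so is a downgrade.
apt_pin_stanza() {
    version=$1
    shift
    printf 'Package: %s\nPin: version %s\nPin-Priority: 1001\n' "$*" "$version"
}

# Usage (not run here; the file name is an assumption):
#   apt_pin_stanza '239*' systemd udev libsystemd0 \
#       | sudo tee /etc/apt/preferences.d/systemd-backports.pref
```

Pinning the packages together matters because systemd depends on the exact same version of libsystemd0, as the apt-get error above shows.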

Change 490605 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] systemd: user slice: pinning for libsystemd0 as well

https://gerrit.wikimedia.org/r/490605

Change 490605 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] systemd: user slice: pinning for libsystemd0 as well

https://gerrit.wikimedia.org/r/490605

Mentioned in SAL (#wikimedia-cloud) [2019-02-14T17:35:24Z] <arturo> T215154 tools-sgebastion-07 now running systemd 239 and starts enforcing user limits

Reopening because I saw something weird on the server regarding this, and I don't think it is working as expected. I don't have any clue yet about what's wrong, though.

On the new tools-sgebastion-08.tools.eqiad.wmflabs:

$ apt-cache policy udev systemd libsystemd0
udev:
  Installed: 239-12~bpo9+1
  Candidate: 239-12~bpo9+1
  Version table:
 *** 239-12~bpo9+1 100
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u9 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
     232-25+deb9u8 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages
systemd:
  Installed: 239-12~bpo9+1
  Candidate: 239-12~bpo9+1
  Version table:
 *** 239-12~bpo9+1 100
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u9 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
     232-25+deb9u8 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages
libsystemd0:
  Installed: 239-12~bpo9+1
  Candidate: 239-12~bpo9+1
  Version table:
 *** 239-12~bpo9+1 1001
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
        100 /var/lib/dpkg/status
     232-25+deb9u9 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
     232-25+deb9u8 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages
$ sudo systemctl --no-pager status user-.slice
● user-.slice - User Slice of UID
   Loaded: error (Reason: Unit user-.slice failed to loaded properly: Invalid argument.)
  Drop-In: /lib/systemd/system/user-.slice.d
           └─10-defaults.conf
           /etc/systemd/system/user-.slice.d
           └─puppet-override.conf
   Active: inactive (dead)
Warning: The unit file, source configuration file or drop-ins of user-.slice changed on disk. Run 'systemctl daemon-reload' to reload units.

This does not make any sense:

root@tools-sgebastion-07:~# loginctl list-users
No users.
root@tools-sgebastion-07:~# loginctl list-sessions
No sessions.
root@tools-sgebastion-07:~# w | wc -l
31

The user.slice is not being used at all:

# in a sane system:
sudo systemd-cgls | grep [u]ser.slice
 ├─user.slice

# in the bastion
sudo systemd-cgls | grep [u]ser.slice
[.. nothing ..]

After reading some docs, I discovered that systemd learns about users and sessions through the libpam-systemd mechanism, i.e. when PAM triggers, users and sessions are reported to systemd-logind, which then creates the slices, etc.
Well, it turns out we don't have libpam-systemd installed on the bastions:
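
Besides checking the package, one can check whether any PAM configuration actually loads the module. A minimal sketch, assuming the usual Debian layout under /etc/pam.d; the helper name pam_systemd_enabled is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if any file under the given PAM config
# directory references pam_systemd.so on a non-commented line.
pam_systemd_enabled() {
    pam_dir=${1:-/etc/pam.d}
    grep -rqsE '^[[:space:]]*[^#[:space:]].*pam_systemd\.so' "$pam_dir"
}

# Usage (not run here):
#   pam_systemd_enabled || echo "pam_systemd is not configured; logind will not see sessions"
```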

root@tools-sgebastion-07:~# apt-cache policy libpam-systemd
libpam-systemd:
  Installed: (none)
  Candidate: 232-25+deb9u9
  Version table:
     241-1~bpo9+1 100
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
     232-25+deb9u9 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
     232-25+deb9u8 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages

And we can't simply install it, because of the specific pinning we have for systemd:

root@tools-sgebastion-07:~# aptitude install libpam-systemd
The following NEW packages will be installed:
  libpam-systemd{b} 
The following packages will be REMOVED:
  libargon2-1{u} libsodium23{u} 
0 packages upgraded, 1 newly installed, 2 to remove and 9 not upgraded.
Need to get 188 kB of archives. After unpacking 85.0 kB will be freed.
The following packages have unmet dependencies:
 libpam-systemd : Depends: systemd (= 232-25+deb9u9) but 239-12~bpo9+1 is installed
The following actions will resolve these dependencies:

     Keep the following packages at their current version:
1)     libpam-systemd [Not Installed]                     



Accept this solution? [Y/n/q/?] ^C

It turns out that the apt pinning we have in puppet is for systemd 239, but that version is no longer present in the stretch-backports repo.

root@tools-sgebastion-07:~# apt-cache policy systemd
systemd:
  Installed: 239-12~bpo9+1
  Candidate: 239-12~bpo9+1
  Version table:
     241-1~bpo9+1 100
        100 http://mirrors.wikimedia.org/debian stretch-backports/main amd64 Packages
 *** 239-12~bpo9+1 1001
        100 /var/lib/dpkg/status
     232-25+deb9u9 500
        500 http://security.debian.org stretch/updates/main amd64 Packages
     232-25+deb9u8 500
        500 http://deb.debian.org/debian stretch/main amd64 Packages

We will need to fix this apt pinning thing.
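
This situation (the installed version only being known from /var/lib/dpkg/status) can be detected from the apt-cache policy output itself. The following is a rough sketch that relies on the output layout shown above; the helper name installed_in_repo is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `apt-cache policy <pkg>` output on stdin and
# succeed only if the installed version (the line marked with ***) is offered
# by at least one source other than the local dpkg status file.
installed_in_repo() {
    awk '
        /^ \*\*\* /   { in_installed = 1; next }  # version line of the installed package
        !/^        /  { in_installed = 0; next }  # any line that is not a source line ends the block
        in_installed && $2 != "/var/lib/dpkg/status" { found = 1 }
        END { exit !found }
    '
}

# Usage (not run here):
#   apt-cache policy systemd | installed_in_repo \
#       || echo "installed systemd version is no longer available from any repo"
```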

Change 496146 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] systemd: slice for all users: fix apt pinning

https://gerrit.wikimedia.org/r/496146

Change 496147 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] systemd: slice for all users: also add pinning for libpam-systemd

https://gerrit.wikimedia.org/r/496147

Mentioned in SAL (#wikimedia-cloud) [2019-03-13T11:20:44Z] <arturo> disable puppet in tools-sgebastion-07 for testing T215154

Mentioned in SAL (#wikimedia-cloud) [2019-03-13T11:53:32Z] <arturo> enable puppet in tools-sgebastion-07 (T215154)

Change 496146 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] systemd: slice for all users: fix apt pinning

https://gerrit.wikimedia.org/r/496146

Change 496147 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] systemd: slice for all users: also add pinning for libpam-systemd

https://gerrit.wikimedia.org/r/496147

After these couple of patches, everything looks far better:

systemd-cgls
Control group /:
-.slice
├─user.slice
│ └─user-18194.slice
│   ├─session-35002.scope
│   │ ├─24779 sshd: aborrero [priv]
│   │ ├─24806 sshd: aborrero@pts/3
│   │ ├─24807 -bash
│   │ ├─24845 sudo systemd-cgls
│   │ ├─24846 systemd-cgls
│   │ └─24847 pager
│   └─user@18194.service
│     └─init.scope
│       ├─24785 /lib/systemd/systemd --user
│       └─24786 (sd-pam)
├─init.scope
[...]
aborrero@tools-sgebastion-07:~$ sudo loginctl list-users
  UID USER    
18194 aborrero

1 users listed.

I tested the limits:

aborrero@tools-sgebastion-07:~$ stress -c 2 --vm 2 --vm-bytes 1400M --vm-keep
stress: info: [27671] dispatching hogs: 2 cpu, 0 io, 2 vm, 0 hdd
stress: FAIL: [27671] (415) <-- worker 27675 got signal 9
stress: WARN: [27671] (417) now reaping child worker processes
stress: FAIL: [27671] (451) failed run completed in 6s

Thing is, there are several logged in users who aren't covered by this. For proper cleanup, we should reboot the server.
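
To see which logged-in users are not yet tracked by systemd-logind, the output of who can be compared against loginctl. A minimal sketch; the helper name untracked_users and the temporary file names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print user names present in the first file (one name
# per line, e.g. taken from `who`) but absent from the second file (e.g.
# taken from `loginctl list-users`).
untracked_users() {
    sort -u "$1" | grep -vxF -f "$2"
}

# Usage (not run here; file names are assumptions):
#   who | awk '{print $1}' > /tmp/who.txt
#   loginctl list-users --no-legend | awk '{print $2}' > /tmp/logind.txt
#   untracked_users /tmp/who.txt /tmp/logind.txt
```

Any user printed by this has a session that predates the fix and is therefore outside the user slices until they log in again.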

Mentioned in SAL (#wikimedia-cloud) [2019-03-13T12:17:07Z] <arturo> reboot tools-sgebastion-07 (T215154)

aborrero added a subscriber: Lucas_Werkmeister_WMDE.

The reboot was a bit disruptive for our users. @Lucas_Werkmeister_WMDE suggested I use a wall message next time.

Mentioned in SAL (#wikimedia-cloud) [2019-03-13T12:33:36Z] <arturo> reboot tools-sgebastion-08 (T215154)