
Project deployment-prep instance deployment-sessionstore06 is down
Closed, Duplicate (Public)

Description

Common information

  • summary: Project deployment-prep instance deployment-sessionstore06 is down
  • alertname: InstanceDown
  • instance: deployment-sessionstore06
  • job: node
  • project: deployment-prep
  • severity: warning

Firing alerts


  • summary: Project deployment-prep instance deployment-sessionstore06 is down
  • alertname: InstanceDown
  • instance: deployment-sessionstore06
  • job: node
  • project: deployment-prep
  • severity: warning

Event Timeline

$ ssh deployment-sessionstore06.deployment-prep.eqiad1.wikimedia.cloud
Linux deployment-sessionstore06 5.10.0-35-cloud-amd64 #1 SMP Debian 5.10.237-1 (2025-05-19) x86_64
Debian GNU/Linux 11 (bullseye)
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
*  /!\ Please take extra care and AVOID MAKING UNPUPPETIZED CHANGES.  *
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

Access is subject to Wikimedia Cloud Services Terms of Use:
- https://wikitech.wikimedia.org/wiki/Terms_of_use

Intro: https://www.mediawiki.org/wiki/Beta_Cluster

The last Puppet run was at Thu Mar 12 20:34:30 UTC 2026 (30 minutes ago).
Last Puppet commit: (c25f0eda5e) gitpuppet - MediaWiki: Only proxy existing .php files, otherwise return nice 404
Last login: Tue Jan 20 21:09:37 2026 from 172.16.17.143
bd808@deployment-sessionstore06:~$ w
 21:15:59 up 244 days,  6:25,  2 users,  load average: 3.47, 2.87, 1.66
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     ttyS0    -                11Jul25 244days  0.03s  0.01s -bash
bd808    pts/0    172.16.17.143    21:04    0.00s  0.04s  0.01s w

This is probably another occurrence of T415021 (Cassandra killed by oom-killer and Prometheus scrapes failing intermittently on deployment-sessionstore06), but I will poke around a bit before merging as a duplicate.

The instance became unresponsive while I was logged in. The Horizon console shows:

[13099164.309067] Out of memory: Killed process 1648239 (java) total-vm:2922196kB, anon-rss:1363180kB, file-rss:29816kB, shmem-rss:0kB, UID:115 pgtables:3028kB oom_score_adj:0
[14470071.115979] Out of memory: Killed process 2800123 (java) total-vm:2901392kB, anon-rss:1367192kB, file-rss:30140kB, shmem-rss:0kB, UID:115 pgtables:3008kB oom_score_adj:0
[15369293.335222] Out of memory: Killed process 3524021 (java) total-vm:2932620kB, anon-rss:1358068kB, file-rss:30168kB, shmem-rss:0kB, UID:115 pgtables:3036kB oom_score_adj:0
[15416796.899673] Out of memory: Killed process 3998159 (java) total-vm:2993124kB, anon-rss:1354184kB, file-rss:29752kB, shmem-rss:0kB, UID:115 pgtables:3116kB oom_score_adj:0
[15419567.140715] Out of memory: Killed process 4030266 (java) total-vm:2922256kB, anon-rss:1336872kB, file-rss:31524kB, shmem-rss:0kB, UID:115 pgtables:2900kB oom_score_adj:0
[15425032.065816] Out of memory: Killed process 4032686 (java) total-vm:2924640kB, anon-rss:1341072kB, file-rss:31592kB, shmem-rss:0kB, UID:115 pgtables:2908kB oom_score_adj:0
[16658298.596796] Out of memory: Killed process 4036288 (java) total-vm:3150900kB, anon-rss:1353600kB, file-rss:30016kB, shmem-rss:0kB, UID:115 pgtables:3384kB oom_score_adj:0
[16701241.208735] Out of memory: Killed process 493627 (java) total-vm:3052504kB, anon-rss:1335200kB, file-rss:30176kB, shmem-rss:0kB, UID:115 pgtables:3132kB oom_score_adj:0
[16905026.142859] Out of memory: Killed process 517253 (java) total-vm:3059584kB, anon-rss:1335820kB, file-rss:29452kB, shmem-rss:0kB, UID:115 pgtables:3012kB oom_score_adj:0
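
To make the console output above easier to compare against uptime, here is a minimal Python sketch that parses "Out of memory: Killed process" lines. It relies on the bracketed kernel timestamp being seconds since boot, and it assumes the line layout matches what is pasted above. The names `OOM_RE`, `parse_oom_lines`, and `SAMPLE` are hypothetical helpers for illustration, not part of any existing tooling on this host.

```python
import re

# Matches the kernel OOM-killer lines in the format pasted from the Horizon console.
# Assumption: the field order (total-vm, then anon-rss) is the same on every line.
OOM_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] Out of memory: Killed process (?P<pid>\d+) "
    r"\((?P<comm>[^)]+)\) total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_lines(text):
    """Return one dict per OOM kill found in text; other lines are skipped."""
    events = []
    for line in text.splitlines():
        m = OOM_RE.search(line)
        if m is None:
            continue
        events.append({
            # Kernel timestamp is seconds since boot; convert to days of uptime.
            "uptime_days": float(m["ts"]) / 86400,
            "pid": int(m["pid"]),
            "comm": m["comm"],
            # kB to MiB
            "anon_rss_mib": int(m["anon_rss"]) / 1024,
        })
    return events


# First and last lines copied from the console output above.
SAMPLE = """\
[13099164.309067] Out of memory: Killed process 1648239 (java) total-vm:2922196kB, anon-rss:1363180kB, file-rss:29816kB, shmem-rss:0kB, UID:115 pgtables:3028kB oom_score_adj:0
[16905026.142859] Out of memory: Killed process 517253 (java) total-vm:3059584kB, anon-rss:1335820kB, file-rss:29452kB, shmem-rss:0kB, UID:115 pgtables:3012kB oom_score_adj:0
"""

for ev in parse_oom_lines(SAMPLE):
    print(
        f"day {ev['uptime_days']:.1f}: pid {ev['pid']} ({ev['comm']}) "
        f"anon-rss {ev['anon_rss_mib']:.0f} MiB"
    )
```

Feeding it the full console paste should show whether the kills cluster around particular days of uptime, and it makes it easy to confirm that every victim is a `java` process with roughly the same resident size.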