
ocg1003 partitions are severely misconfigured
Closed, Resolved, Public

Description

We just had an alert for disk space on ocg1003:

08:51 <+icinga-wm> PROBLEM - Disk space on ocg1003 is CRITICAL: DISK CRITICAL - free space: / 1660 MB (3% inode=85%)

Upon analysis, I found that, unlike ocg1002, where /dev/md2 is mounted on /srv, ocg1003 has no partition mounted under /srv.

It has, instead, a mostly empty LVM volume group.
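
One quick way to confirm the difference between the two hosts is to check whether /srv is a mount point of its own. The following is a minimal sketch, assuming a Linux host that exposes /proc/mounts; the function name is_mount_point and its optional second argument are hypothetical and not part of any tooling mentioned in this task. On hosts with util-linux installed, `mountpoint -q /srv` answers the same question.

```shell
#!/bin/sh
# Report whether a directory is a mount point of its own, by looking it up in
# a mounts table in /proc/mounts format (device, mount point, fstype, ...).
# The function name and the optional table argument are illustrative only.
is_mount_point() {
    dir=$1
    table=${2:-/proc/mounts}
    # Field 2 of each line is the mount point; succeed only on an exact match.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

# Example usage (not executed here):
#   if is_mount_point /srv; then
#       echo "/srv is a separate mount"
#   else
#       echo "/srv lives on the parent filesystem"
#   fi
```

If /srv is not a separate mount, everything written under it consumes space on the parent filesystem, which is consistent with the alert on / quoted above.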

We need to fix this by creating a new LVM logical volume and mounting it on /srv. Doing so without causing a major service disruption is challenging, given that ocg1001 is currently under maintenance and depooled.
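
The fix described above could look roughly like the sketch below. It is not the procedure that was actually followed. The volume group name vg0, the logical volume name srv, the ext4 filesystem, and the choice to use all free space are assumptions made for illustration; none of them come from this task. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: create a logical volume and mount it on /srv.
# ASSUMPTIONS (not taken from the task): volume group "vg0", logical volume
# "srv", an ext4 filesystem, and allocating all free space in the group.
VG=${VG:-vg0}
LV=${LV:-srv}
MNT=${MNT:-/srv}

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

provision_srv() {
    run lvcreate -l 100%FREE -n "$LV" "$VG"   # allocate all free extents
    run mkfs.ext4 "/dev/$VG/$LV"              # create the filesystem
    run mkdir -p "$MNT"                       # make sure the mount point exists
    run mount "/dev/$VG/$LV" "$MNT"           # mount the new volume
}

# Example usage (not executed here):
#   provision_srv              # dry run, prints the four commands
#   DRY_RUN=0 provision_srv    # actually runs them, requires root
```

Files already present under the mount point are hidden once the new volume is mounted over it, so they have to be moved or cleaned up separately, and the service writing to that path needs to be stopped while this happens. A persistent mount entry, for example in /etc/fstab, would also be needed for the mount to survive a reboot.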

Event Timeline

Joe triaged this task as High priority.
Volans renamed this task from ogc1003 partitions are severely misconfigured to ocg1003 partitions are severely misconfigured. Apr 12 2017, 9:29 AM

Mentioned in SAL (#wikimedia-operations) [2017-04-12T09:47:01Z] <_joe_> remounting the new partition under /srv/deployment/ocg/output, cleaning out the old dir. Will cause a service interruption for requests to ocg1003 for a few minutes. T162780

Mentioned in SAL (#wikimedia-operations) [2017-04-16T15:35:25Z] <elukey> executing sudo find -name *.pdf -mtime +3 -exec rm {} \; on ocg1003's /srv/deployment/ocg/output to clean up some disk space - T162780

Mentioned in SAL (#wikimedia-operations) [2017-04-23T03:11:59Z] <andrewbogott> removing files in /srv/deployment/ocg/postmortem on ocg1003, another case of T162780

Mentioned in SAL (#wikimedia-operations) [2017-07-21T14:30:18Z] <_joe_> stopping ocg temporarily on ocg1003, T162780