
ocg1003 partitions are severely misconfigured
Closed, Resolved · Public

Description

We just had a disk space alert on ocg1003:

08:51 <+icinga-wm> PROBLEM - Disk space on ocg1003 is CRITICAL: DISK CRITICAL - free space: / 1660 MB (3% inode=85%)

Upon analysis, I found that, unlike ocg1002, where /dev/md2 is mounted on /srv, ocg1003 has no partition mounted under /srv.

It has, instead, a mostly empty LVM volume group.

We need to fix this by creating a new partition on LVM and mounting it as /srv. Doing so without causing a major service disruption is challenging, given that ocg1001 is currently under maintenance and depooled.
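
The condition described above (no dedicated filesystem under /srv, with the root filesystem nearly full) can be checked with a small script. This is only a minimal sketch: the function name `mount_report` is hypothetical, and the free-space percentage it reports may not match the figure Icinga prints, since monitoring checks can compute it differently.

```python
import os
import shutil


def mount_report(path):
    """Report whether `path` is its own mount point and how much space is free.

    Hypothetical helper for illustration; not part of any OCG tooling.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "path": path,
        "is_mount": os.path.ismount(path),
        "free_mb": usage.free // (1024 * 1024),
        "free_pct": round(100 * usage.free / usage.total, 1),
    }


if __name__ == "__main__":
    # On a host laid out like ocg1002, /srv would be expected to report
    # is_mount=True; on a host laid out like ocg1003 it would not.
    for p in ("/", "/srv"):
        if os.path.exists(p):
            print(mount_report(p))
```

If `/srv` reports `is_mount` as false, everything written under it counts against the root filesystem, which is consistent with the alert above.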

Event Timeline

Joe created this task. Apr 12 2017, 9:05 AM
Restricted Application added a subscriber: Aklapper. Apr 12 2017, 9:05 AM
Joe claimed this task. Apr 12 2017, 9:05 AM
Joe triaged this task as High priority.
Volans renamed this task from ogc1003 partitions are severely misconfigured to ocg1003 partitions are severely misconfigured. Apr 12 2017, 9:29 AM

Mentioned in SAL (#wikimedia-operations) [2017-04-12T09:47:01Z] <_joe_> remounting the new partition under /srv/deployment/ocg/output, cleaning out the old dir. Will cause a service interruption for requests to ocg1003 for a few minutes. T162780

Mentioned in SAL (#wikimedia-operations) [2017-04-16T15:35:25Z] <elukey> executing sudo find -name *.pdf -mtime +3 -exec rm {} \; on ocg1003's /srv/deployment/ocg/output to clean up some disk space - T162780
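
The cleanup logged above could also be scripted. The following is a rough sketch rather than the command that was actually run: the function name `remove_old_pdfs` and its `dry_run` option are hypothetical. It mirrors the way `find -mtime +3` counts age in whole days (so a file must be at least four days old to match), but unlike `find -name` it only considers regular files. It also avoids the unquoted `*.pdf` in the logged command, which the shell may expand before `find` sees it if matching files exist in the current directory.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def remove_old_pdfs(root, min_days=3, dry_run=False):
    """Delete *.pdf files under `root` whose age in whole days exceeds `min_days`.

    Hypothetical helper for illustration. Returns the list of matched paths.
    """
    now = time.time()
    removed = []
    for path in Path(root).rglob("*.pdf"):
        if not path.is_file():
            continue  # skip directories and broken symlinks
        # Truncate the age to whole days, as find's -mtime does.
        age_days = int((now - path.stat().st_mtime) // SECONDS_PER_DAY)
        if age_days > min_days:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Calling `remove_old_pdfs("/srv/deployment/ocg/output", dry_run=True)` first would list the candidates without deleting anything.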

Mentioned in SAL (#wikimedia-operations) [2017-04-23T03:11:59Z] <andrewbogott> removing files in /srv/deployment/ocg/postmortem on ocg1003, another case of T162780

Mentioned in SAL (#wikimedia-operations) [2017-07-21T14:30:18Z] <_joe_> stopping ocg temporarily on ocg1003, T162780

Joe closed this task as Resolved. Jul 21 2017, 2:34 PM