
Move `tmpdir` on deployment-db instances
Open, Medium, Public

Description

There have been a couple of instances in the past weeks (e.g. T116447: [postmortem] Beta Cluster outage: deployment-db2 disk filled up, locked db replication) where runaway queries have taken down Beta Cluster database servers by filling up the /mnt partition, which the datadir and the tmpdir share. We should isolate the tmpdir on its own partition (or perhaps with a quota) to mitigate the damage in such scenarios. A separate volume for tmpdir is probably the most foolproof setup and should also help performance slightly, though performance is definitely not a major concern.
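As a rough way to verify the situation before and after any change, the sketch below checks whether two directories sit on the same filesystem by comparing device IDs. The function name `share_filesystem` is hypothetical, and the paths are supplied by the caller; the real datadir and tmpdir locations on the deployment-db instances are not stated in this task and would need to be looked up (for example from the server's `datadir` and `tmpdir` variables).

```python
import os
import sys


def share_filesystem(path_a, path_b):
    """Return True if both paths are on the same filesystem.

    Compares the st_dev device ID reported by os.stat() for each path.
    This is a heuristic sketch: two paths with the same device ID compete
    for the same free space, which is the failure mode described above.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Usage: python check_tmpdir.py <datadir> <tmpdir>
    # Both arguments are assumptions supplied by the operator.
    if len(sys.argv) != 3:
        sys.exit("usage: check_tmpdir.py <datadir> <tmpdir>")
    datadir, tmpdir = sys.argv[1], sys.argv[2]
    if share_filesystem(datadir, tmpdir):
        print("datadir and tmpdir share a filesystem")
    else:
        print("datadir and tmpdir are on separate filesystems")
```

If the check still reports a shared filesystem after the change, a runaway query writing temporary files could fill the data volume just as before.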

Event Timeline

dduvall raised the priority of this task from to Needs Triage.
dduvall updated the task description.
dduvall subscribed.

On the labs instances we use a Puppet class that allocates all of the extended disk to a single LVM partition mounted at /mnt. The end result for db2:

Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/vd-second--local--disk  147G   58G   82G  42% /mnt

The instance has role::mariadb::beta. In theory, we could invoke a Puppet define to generate two partitions, one for data and the other for tmp.

We would need to resize the existing partition, which most probably requires MySQL to be shut down, so we would have to schedule downtime for the Beta Cluster.

thcipriani triaged this task as Medium priority. Nov 2 2015, 8:31 PM
thcipriani moved this task from To Triage to Backlog on the Beta-Cluster-Infrastructure board.