deployment-ores-redis /srv/ redis is too small (500MBytes)
Closed, Resolved, Public


deployment-ores-redis.deployment-prep.eqiad.wmflabs triggers disk alarms:

Free space - all mounts on deployment-ores-redis is CRITICAL: CRITICAL: deployment-prep.deployment-ores-redis.diskspace._srv.byte_percentfree (<44.44%)

It has redis set up on the instance's extended disk, mounted on /srv. However, the instance is an m1.small with a 20 GByte disk, and / already takes up about 19 GBytes of it, leaving only 484 MBytes for /srv (438 MBytes of which are in use):

$ df -h -t ext4
Filesystem                          Size  Used Avail Use% Mounted on
/dev/vda3                            19G  2.2G   16G  13% /
/dev/mapper/vd-second--local--disk  484M  438M   17M  97% /srv
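
For reference, the df figures above can be turned into a free-space percentage as sketched below. This is only an illustration: the helper names percent_free and mount_percent_free are made up here, and the alerting check may well compute byte_percentfree differently (for example by accounting for reserved blocks).

```shell
#!/bin/sh
# percent_free AVAIL SIZE: print AVAIL as a percentage of SIZE, one decimal.
# Hypothetical helper; both arguments must be in the same unit.
percent_free() {
    LC_ALL=C awk -v avail="$1" -v size="$2" \
        'BEGIN { printf "%.1f\n", avail * 100 / size }'
}

# mount_percent_free MOUNTPOINT: same calculation against a live mount,
# using POSIX df columns (2 = total 1K-blocks, 4 = available 1K-blocks).
mount_percent_free() {
    df -P -k "$1" | awk 'NR == 2 { print $4, $2 }' | {
        read -r avail size
        percent_free "$avail" "$size"
    }
}

# Values from the /srv line of the df output (Avail 17M, Size 484M).
percent_free 17 484   # prints 3.5
```

On the instance itself, `mount_percent_free /srv` would give the current figure.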

There are two ways to solve it:

  • Remove the extended disk mount and store the redis data directly on /. This involves a bit of puppet work to remove the Mount['/srv']; then one has to copy the data, unmount /srv, and move the data to /.
  • Migrate to a new instance using the flavor c1.m2.s80, which comes with 80 GBytes of disk. That probably involves more configuration work to update the IP wherever it is used.
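
The manual part of the first option (copy the data, unmount, move the data to /) could look roughly like the sketch below. It is not a tested procedure: the service name redis-server and the staging directory /srv.new are assumptions made for illustration, and the puppet change removing Mount['/srv'] would have to land first, otherwise puppet would presumably remount the disk. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: SERVICE and STAGING are hypothetical, not taken from the task.
SERVICE="${SERVICE:-redis-server}"   # assumed service name
MOUNT="${MOUNT:-/srv}"               # extended disk mount point
STAGING="${STAGING:-/srv.new}"       # assumed temporary directory on /
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands, 0 = execute them

# run CMD...: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_srv_to_root() {
    run service "$SERVICE" stop          # stop writes before copying
    run mkdir -p "$STAGING"
    run cp -a "$MOUNT/." "$STAGING/"     # copy the data onto /
    run umount "$MOUNT"                  # detach the extended disk
    run rmdir "$MOUNT"                   # remove the now-empty mount point
    run mv "$STAGING" "$MOUNT"           # data now lives on / under /srv
    run service "$SERVICE" start
}

migrate_srv_to_root
```

Running it with `DRY_RUN=0` as root would execute the steps instead of printing them.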

Mentioned in SAL (#wikimedia-releng) [2017-03-21T16:07:14Z] <Amir1> ladsgroup@deployment-ores-redis:~$ redis-cli -h deployment-ores-redis.deployment-prep.eqiad.wmflabs -p 6380 -a areallysecretpassword flushall (T160762)

This should make the problem less severe until I migrate redis to a bigger instance.

Mentioned in SAL (#wikimedia-releng) [2017-03-21T16:47:37Z] <halfak> halfak@deployment-ores-redis:~$ redis-cli -h deployment-ores-redis.deployment-prep.eqiad.wmflabs -p 6380 -a areallysecretpassword flushall (T160762)

Halfak triaged this task as High priority. Mar 24 2017, 9:21 PM

I will get this fixed by migrating to a new instance ASAP.

greg added a subscriber: greg. Mar 24 2017, 9:31 PM

(Let us know if you need any assistance.)

Mentioned in SAL (#wikimedia-releng) [2017-03-24T21:34:35Z] <Amir1> launching deployment-ores-redis-02 (T160762)

Mentioned in SAL (#wikimedia-releng) [2017-03-25T10:39:55Z] <Amir1> changing ores redis address to deployment-ores-redis-01 (T160762)

Mentioned in SAL (#wikimedia-releng) [2017-03-25T10:46:17Z] <Amir1> deleting deployment-ores-redis (T160762)

Okay. I migrated the redis server from deployment-ores-redis to deployment-ores-redis-01, which is a medium-size instance and should not run into space issues any time soon (at least for years). Due to T148929 (New instances attached to a role::puppetmaster::standalone Puppetmaster need manual changes after switching from the default Puppetmaster), it took way longer than I expected, but I got it done (and made some notes in the task). This is done now and ores in beta works as expected.

Ladsgroup moved this task from Active to Done on the Scoring-platform-team board. Mar 25 2017, 10:46 AM
Ladsgroup moved this task from In progress to Done on the User-Ladsgroup board.
Ladsgroup closed this task as Resolved.