Boxes out of warranty
Event Timeline
Change 364814 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] AH, yes, statistics-privatedata-users should be on cruncher, it is a superset of perms
Change 364814 merged by Ottomata:
[operations/puppet@production] AH, yes, statistics-privatedata-users should be on cruncher, it is a superset of perms
Change 364817 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use conditionals instead of new role files to deal with stat box migration
Change 364817 merged by Ottomata:
[operations/puppet@production] Use conditionals instead of new role files to deal with stat box migration
Change 364823 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Run reportupdater::jobs::hadoop from stat1005 instead of stat1002
Change 364823 merged by Ottomata:
[operations/puppet@production] Run reportupdater::jobs::hadoop from stat1005 instead of stat1002
Change 364826 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Move refinery::job::data_check to stat1005 from stat1002
Change 364826 merged by Ottomata:
[operations/puppet@production] Move refinery::job::data_check to stat1005 from stat1002
Change 364829 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Rsync MW API logs to stat1005
Change 364829 merged by Ottomata:
[operations/puppet@production] Rsync MW API logs to stat1005
Change 365614 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] logrotate reportupdater logs as proper user/group
Change 365614 merged by Ottomata:
[operations/puppet@production] logrotate reportupdater logs as proper user/group
Change 365634 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Backup eventlogging log data from stat1005 srv-log-eventlogging
Change 365634 abandoned by Ottomata:
Backup eventlogging log data from stat1005 srv-log-eventlogging
Reason:
Ah, I forgot, we were going to stop backing up this data, since it exists on 3 different servers.
Change 365640 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Move geowiki from stat1003 to stat1006
Change 365640 merged by Ottomata:
[operations/puppet@production] Move geowiki from stat1003 to stat1006
Change 365666 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use https gerrit url to clone geowiki data-public on stat1006
Change 365666 merged by Ottomata:
[operations/puppet@production] Use https gerrit url to clone geowiki data-public on stat1006
Change 365668 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Set up rsync module for /home on stat boxes
Change 365668 merged by Ottomata:
[operations/puppet@production] Set up rsync module for /home on stat boxes
Change 365669 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/geowiki@master] No longer push to public data repo, we don't use it
Change 365670 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Remove public data geowiki push
Change 365669 merged by Ottomata:
[analytics/geowiki@master] No longer push to public data repo, we don't use it
Change 365670 merged by Ottomata:
[operations/puppet@production] Remove public data geowiki push
Change 365671 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/geowiki@master] Remove checking of public web page data, since it no longer exists
Change 365671 merged by Ottomata:
[analytics/geowiki@master] Remove checking of public web page data, since it no longer exists
Change 365672 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Remove dependency on geowiki public data job
Change 365672 merged by Ottomata:
[operations/puppet@production] Remove dependency on geowiki public data job
Change 365684 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Move reportupdater jobs from stat1003 -> stat1006
Change 365684 merged by Ottomata:
[operations/puppet@production] Move reportupdater jobs from stat1003 -> stat1006
The new server ran out of disk space.
13:29 < icinga-wm> PROBLEM - Disk space on stat1006 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=95%)
Mentioned in SAL (#wikimedia-operations) [2017-07-18T07:32:10Z] <elukey> moved /home to /srv/home on stat1006 to free disk space (created symling from /home -> /srv/home too) - T152712
Change 366107 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[operations/puppet@production] statistics::packages: Add libssl-dev and comments
Change 366107 merged by Ottomata:
[operations/puppet@production] statistics::packages: Add libssl-dev and comments
@Ottomata I checked stat1002:/a/. Can you copy psinger's folder to stat1005? That's my only request. Thanks.
Done. /a/$USER directories from stat1002 are in /srv/stat1002-a/user_dirs_from_stat1002.
stat1002 has been powered off.
Change 368461 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Install virtualenv bin on stat boxes
Change 368461 merged by Ottomata:
[operations/puppet@production] Install virtualenv bin on stat boxes
Change 368612 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Remove stat1002 configuration as part of decom
Change 368763 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Install libcgi-pm-perl for wikistats 1.0 ezachte
Change 368763 merged by Ottomata:
[operations/puppet@production] Install libcgi-pm-perl for wikistats 1.0 ezachte
Change 368794 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Sync published datasets more often, and allow users to rsync to speed up the process.
Change 368794 merged by Ottomata:
[operations/puppet@production] Sync published datasets more often, allow users to rsync
@Catrope just emailed:
I would love to migrate to stat1006 from stat1003, but stat1006 is unusably slow right now while stat1003 is snappy. Connecting to analytics-store from stat1006 times out sometimes, and even when it doesn't, simple DESCRIBE queries take 5-15 seconds. I just talked to Adam Wight on IRC and he's experiencing similar issues.
Really!? Is it the MySQL connection that is slow, and not actual usage of stat1006?
Change 370478 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Allow rsync to dataset1001 for pagecounts-ez
Change 370478 merged by Ottomata:
[operations/puppet@production] Allow rsync to dataset1001 for pagecounts-ez
Change 368612 merged by Elukey:
[operations/puppet@production] Remove stat1002 configuration as part of decom
Change 371487 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] statistics: re-add working_path variable
Change 371487 merged by Elukey:
[operations/puppet@production] statistics: re-add working_path variable
Change 371486 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Remove stat1002 from puppet as part of decom process
Change 371486 merged by Elukey:
[operations/puppet@production] Remove stat1002 from puppet as part of decom process
There is some cronspam from stat1006:
Cron Daemon root@stat1006.eqiad.wmnet via wikimedia.org
4:30 PM (1 hour ago)
to stats
Error: Value 54312 (2017-08-15) below lower absolute threshold 55000 for column 'Global North (5+)' of global_south (stride: 7)
I have been trying to fix it for the past two weeks. I will probably move it to analytics-alerts@ to reduce the noise to ops :)
For the record, I created https://phabricator.wikimedia.org/T173486 and moved the cron alert to analytics-alerts@.
Change 374332 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] stat1003: remove puppet configuration as part of decom
Today I rsynced /home from stat1003 -> stat1006 with rsync -av --update stat1003.eqiad.wmnet::home/ /home/. Only files that either did not exist on stat1006 or have a newer modification timestamp on stat1003 were copied over. The list of files that were copied is here: https://gist.github.com/ottomata/2743d43188d7d7446a133dac656b12bc
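To illustrate the selection rule described in the comment above, here is a minimal Python sketch of which files such an update-only sync would pick up. It is not what was run on the hosts (that was the rsync command quoted above); the names needs_copy and sync_update are hypothetical, and the sketch works on two local directory trees rather than an rsync module. It also simplifies rsync's behaviour: real rsync with --update additionally transfers a file whose modification time is equal but whose size differs, which this sketch ignores.

```python
import shutil
from pathlib import Path
from typing import List


def needs_copy(src: Path, dst: Path) -> bool:
    """True if dst does not exist, or src has a newer modification time."""
    if not dst.exists():
        return True
    return src.stat().st_mtime > dst.stat().st_mtime


def sync_update(src_root: Path, dst_root: Path) -> List[str]:
    """Copy files from src_root into dst_root, never overwriting newer files.

    Returns the relative paths (POSIX style) of the files that were copied.
    """
    copied = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue  # directories are created on demand below
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves the mtime
            copied.append(rel.as_posix())
    return copied
```

Under these assumptions, a file that a user had already edited on the destination host after the migration is left alone, while anything missing or stale there is refreshed from the source.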
Change 374332 merged by Elukey:
[operations/puppet@production] stat1003: remove puppet configuration as part of decom
stat1003 is officially no longer an analytics host, and ssh keys have been removed accordingly. Everything (including your home dirs) should already be on stat1006.
Change 376248 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Remove stat1003 traces for decom
Change 376248 merged by Elukey:
[operations/puppet@production] Remove stat1003 traces for decom
stat1002/3 have been cleaned out of Puppet/Salt; the last steps are for DC-ops in the related tasks.