Consider removing NFS home directories from deployment-prep. This will improve stability (fewer 'cannot log in when NFS is dead' problems) and make it match prod more closely (prod has no NFS home directories). You will also get beers from Yuvi Panda.
Description
Status | Assigned | Task
---|---|---
Resolved | yuvipanda | T105720 Labs team reliability goal for Q1 2015/16
Resolved | Andrew | T102240 Audit projects' use of NFS, and remove it where not necessary
Resolved | AlexMonk-WMF | T102953 Completely remove Beta Cluster dependency on NFS
Resolved | hashar | T102169 Disable NFS home directories on deployment-prep
Resolved | hashar | T103731 deployment-parsoid01-test fails puppet: Could not find class role::parsoid
Event Timeline
Is there a good way to distribute dotfiles to the beta cluster hosts if we drop shared NFS? I guess this would be an incentive for me to finally get mine cleaned up and on github if not.
Yes, I think putting them in git is the best way to go about this. A Labs-specific solution is probably needed since a lot of people will want this. Perhaps a git repo you can associate with your account on wikitech or something, and it gets auto-cloned on every project you're a member of? That's actually probably crazy. Something in puppet similar to the admin module, perhaps.
Needs a separate bug though. Do you consider this a blocker for disabling home NFS on deployment-prep?
Fancy! Probably/maybe too fancy but it would be pretty cool.
Not a blocker for me, just a bit of pain. I already have a solution to manage this on the prod cluster (tarball I unpack).
Jenkins slaves on the beta cluster use the jenkins-deploy user, which has its home on the instance's extended disk (/mnt), so the workspaces live under /mnt/home/jenkins-deploy/workspace/.
So there should be no impact on the Jenkins jobs. YMMV.
Reopening, some instances apparently still rely on NFS because puppet does not run properly :(
An example is deployment-zookeeper01.
Not yet.
hashar@deployment-salt:~$ sudo salt '*' cmd.run 'mount|grep /home'
deployment-fluorine.deployment-prep.eqiad.wmflabs:
deployment-sca02.deployment-prep.eqiad.wmflabs:
deployment-logstash2.deployment-prep.eqiad.wmflabs:
deployment-db2.deployment-prep.eqiad.wmflabs:
deployment-sentry2.deployment-prep.eqiad.wmflabs:
deployment-memc02.deployment-prep.eqiad.wmflabs:
deployment-logstash1.deployment-prep.eqiad.wmflabs:
deployment-memc03.deployment-prep.eqiad.wmflabs:
deployment-zotero01.deployment-prep.eqiad.wmflabs:
i-000002de.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home on /home type nfs (rw,noatime,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,nofsc)
deployment-mediawiki01.deployment-prep.eqiad.wmflabs:
deployment-cache-mobile03.deployment-prep.eqiad.wmflabs:
deployment-elastic07.deployment-prep.eqiad.wmflabs:
deployment-urldownloader.deployment-prep.eqiad.wmflabs:
deployment-jobrunner01.deployment-prep.eqiad.wmflabs:
deployment-elastic08.deployment-prep.eqiad.wmflabs:
deployment-mx.deployment-prep.eqiad.wmflabs:
deployment-restbase01.deployment-prep.eqiad.wmflabs:
deployment-stream.deployment-prep.eqiad.wmflabs:
deployment-apertium01.deployment-prep.eqiad.wmflabs:
deployment-db1.deployment-prep.eqiad.wmflabs:
deployment-redis01.deployment-prep.eqiad.wmflabs:
deployment-mediawiki03.deployment-prep.eqiad.wmflabs:
i-00000958.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home on /home type nfs (rw,noatime,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,nofsc)
deployment-kafka02.deployment-prep.eqiad.wmflabs:
deployment-elastic05.deployment-prep.eqiad.wmflabs:
i-000008d5.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home on /home type nfs (rw,noatime,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,nofsc)
deployment-upload.deployment-prep.eqiad.wmflabs:
deployment-elastic06.deployment-prep.eqiad.wmflabs:
deployment-redis02.deployment-prep.eqiad.wmflabs:
deployment-cxserver03.deployment-prep.eqiad.wmflabs:
    tmpfs on /mnt/home/jenkins-deploy/tmpfs type tmpfs (rw,noatime,size=512M,mode=1777)
deployment-memc04.deployment-prep.eqiad.wmflabs:
deployment-mediawiki02.deployment-prep.eqiad.wmflabs:
hashar@deployment-salt:~$
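With this many minions, the few that still have NFS mounted on /home are easy to miss. Here is a minimal Python sketch that picks them out of the pasted output. It assumes salt's output layout as seen in this paste (an unindented minion name ending in a colon, followed by indented command output). The function name hosts_with_nfs_home is made up for illustration, and the sample is a shortened excerpt of the output above.

```python
from typing import List


def hosts_with_nfs_home(salt_output: str) -> List[str]:
    """Return the minions whose output shows an NFS mount on /home.

    Assumes each minion name sits on its own unindented line ending in
    ':' and that the command output for that minion is indented below it.
    """
    hosts = []
    current = None
    for line in salt_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace() and line.rstrip().endswith(":"):
            # Unindented "name:" line starts a new minion section.
            current = line.rstrip()[:-1]
        elif current and " on /home type nfs" in line:
            # tmpfs mounts under /mnt/home do not match this pattern.
            hosts.append(current)
    return hosts


# Shortened excerpt of the salt output pasted in this task.
sample = """\
deployment-fluorine.deployment-prep.eqiad.wmflabs:
i-000002de.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home on /home type nfs (rw,noatime,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,nofsc)
deployment-cxserver03.deployment-prep.eqiad.wmflabs:
    tmpfs on /mnt/home/jenkins-deploy/tmpfs type tmpfs (rw,noatime,size=512M,mode=1777)
"""

print(hosts_with_nfs_home(sample))
```

On the full output above this would report only the three i-* instances, since the tmpfs mount on deployment-cxserver03 is not an NFS /home mount.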
And a couple of instances have not been migrated to the new DNS FQDN :-/
I have fixed DNS on the i-** instances.
- deployment-cache-upload02 fixed up (dns/puppet/certs etc)
- deployment-zookeeper01 no longer has the /home NFS mount after a reboot
- deployment-parsoid01-test still has the NFS home because puppet fails there, see T103731: deployment-parsoid01-test fails puppet: Could not find class role::parsoid
Still has a bunch:
root@deployment-salt:~# salt '*' cmd.run 'grep /home /etc/fstab|egrep ^labstore'
deployment-fluorine.deployment-prep.eqiad.wmflabs:
deployment-sca02.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-logstash2.deployment-prep.eqiad.wmflabs:
deployment-mediawiki01.deployment-prep.eqiad.wmflabs:
deployment-restbase01.deployment-prep.eqiad.wmflabs:
deployment-parsoid05.deployment-prep.eqiad.wmflabs:
deployment-stream.deployment-prep.eqiad.wmflabs:
deployment-jobrunner01.deployment-prep.eqiad.wmflabs:
deployment-db1.deployment-prep.eqiad.wmflabs:
deployment-elastic05.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-redis02.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-parsoidcache02.deployment-prep.eqiad.wmflabs:
deployment-memc03.deployment-prep.eqiad.wmflabs:
deployment-mediawiki02.deployment-prep.eqiad.wmflabs:
deployment-memc04.deployment-prep.eqiad.wmflabs:
deployment-db2.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-test.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-cache-bits01.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-kafka02.deployment-prep.eqiad.wmflabs:
deployment-bastion.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-mediawiki03.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-sca01.deployment-prep.eqiad.wmflabs:
deployment-pdf02.deployment-prep.eqiad.wmflabs:
deployment-zotero01.deployment-prep.eqiad.wmflabs:
deployment-elastic08.deployment-prep.eqiad.wmflabs:
deployment-videoscaler01.deployment-prep.eqiad.wmflabs:
deployment-memc02.deployment-prep.eqiad.wmflabs:
deployment-apertium01.deployment-prep.eqiad.wmflabs:
deployment-salt.deployment-prep.eqiad.wmflabs:
deployment-zookeeper01.deployment-prep.eqiad.wmflabs:
deployment-mx.deployment-prep.eqiad.wmflabs:
deployment-eventlogging02.deployment-prep.eqiad.wmflabs:
deployment-elastic07.deployment-prep.eqiad.wmflabs:
deployment-upload.deployment-prep.eqiad.wmflabs:
deployment-cxserver03.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-cache-mobile03.deployment-prep.eqiad.wmflabs:
deployment-logstash1.deployment-prep.eqiad.wmflabs:
deployment-sentry2.deployment-prep.eqiad.wmflabs:
deployment-redis01.deployment-prep.eqiad.wmflabs:
    labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
deployment-urldownloader.deployment-prep.eqiad.wmflabs:
deployment-pdf01.deployment-prep.eqiad.wmflabs:
deployment-restbase02.deployment-prep.eqiad.wmflabs:
deployment-elastic06.deployment-prep.eqiad.wmflabs:
deployment-mathoid.deployment-prep.eqiad.wmflabs:
deployment-cache-text02.deployment-prep.eqiad.wmflabs:
root@deployment-salt:~#
I have removed the commented-out #labstore... lines from /etc/fstab with:
salt '*' cmd.run "sed -i '/^#labstore/d' /etc/fstab"
Manually cleaned the /home entry on:
- deployment-sca02 - puppet lock file from June 5th
- deployment-redis02
- deployment-test
- deployment-cxserver03
- deployment-redis01
- deployment-elastic05
- deployment-cache-bits01
- deployment-db2
- deployment-mediawiki03
- deployment-bastion
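For reference, the manual cleanup on those hosts amounts to dropping one line from /etc/fstab. The following Python sketch shows that filtering on a string rather than on the real file. It is not what was actually run here (the entries were edited by hand). The function name drop_nfs_home and the /data/project and /dev/vda1 sample lines are hypothetical; only the /home entry is copied from the output above.

```python
def drop_nfs_home(fstab_text: str) -> str:
    """Return fstab content without the labstore NFS entry for /home.

    Lines are kept verbatim unless they are a labstore source mounted
    on /home with filesystem type nfs. Other labstore entries, such as
    /data/project, are left alone.
    """
    kept = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_nfs_home = (
            len(fields) >= 3
            and fields[0].startswith("labstore")
            and fields[1] == "/home"
            and fields[2] == "nfs"
        )
        if not is_nfs_home:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# The /home line is taken from the salt output in this task; the other
# two lines are made-up placeholders for illustration only.
sample_fstab = """\
/dev/vda1 / ext4 defaults 0 1
labstore.svc.eqiad.wmnet:/project/deployment-prep/home /home nfs rw,vers=4,bg,hard,intr,sec=sys,proto=tcp,port=0,noatime,nofsc 0 0
labstore.svc.eqiad.wmnet:/project/deployment-prep/project /data/project nfs rw 0 0
"""

print(drop_nfs_home(sample_fstab), end="")
```

Matching on the mount point and filesystem type, rather than on the word labstore alone, keeps the /data/project entry intact.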
labstore is no longer referenced in /etc/fstab, apart from /data/project: salt '*' cmd.run 'grep labstore /etc/fstab|grep -v /data/project'
Nothing left mounted: salt '*' cmd.run 'mount |grep /home'