madhuvishy (Madhu)
User

User Details

User Since
Apr 13 2015, 10:09 PM (144 w, 1 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
MViswanathan (WMF)

Recent Activity

Today

madhuvishy closed T185102: labstore2003 reboots into mode missing /srv disks as Resolved.

Drives not being mounted at /srv is the right behavior. The lvms aren't mounted by default because, if they were, our bdsync-based backups would fail, complaining that the lvms were mounted. Puppet defines /srv/backup as the canonical location for mounting our lvms manually (https://github.com/wikimedia/puppet/blob/production/modules/role/manifests/labs/nfs/secondary_backup/misc.pp#L5), in case we need to look at or test things.

Wed, Jan 17, 6:12 PM · Operations, cloud-services-team

Mon, Jan 15

madhuvishy triaged T184958: tools.wikidata-exports using 369G of 8T tools NFS storage as High priority.
Mon, Jan 15, 10:52 PM · cloud-services-team (Kanban), Cloud-VPS

Thu, Jan 11

madhuvishy added a comment to T171540: Figure out how NFS failovers will work for the dumps servers - labstore1006|7.

Here's a draft of the failover plan for the dumps distribution servers:

Thu, Jan 11, 10:52 PM · Patch-For-Review, Data-Services

Tue, Jan 9

madhuvishy added a comment to T184500: PAWS is down again.

@Discasto Please continue the conversation on the same ticket, thank you. Could you check now?

Tue, Jan 9, 10:29 PM · PAWS
madhuvishy triaged T184540: Maintain-views and maintain_meta-p scripts shouldn't run if mysql-upgrade is running as Normal priority.
Tue, Jan 9, 5:34 PM · DBA, Data-Services, cloud-services-team
madhuvishy created T184540: Maintain-views and maintain_meta-p scripts shouldn't run if mysql-upgrade is running.
Tue, Jan 9, 5:33 PM · DBA, Data-Services, cloud-services-team
madhuvishy added a comment to T184500: PAWS is down again.

@Discasto Should be resolved now, could you check, and resolve this task if things are good? Thanks!

Tue, Jan 9, 5:20 PM · PAWS
madhuvishy closed T184535: PAWS pods hung in state "Terminating" as Resolved.

Looks like the flannel pod in tools-paws-worker-1007 was also stuck in a weird state.

Tue, Jan 9, 5:19 PM · Data-Services, cloud-services-team
madhuvishy added a comment to T184500: PAWS is down again.

Hi @Discasto, this is due to T184535. I'm poking at it, but I'm not sure when it will get resolved. Will keep you posted!

Tue, Jan 9, 4:48 PM · PAWS
madhuvishy added a parent task for T184500: PAWS is down again: T184535: PAWS pods hung in state "Terminating".
Tue, Jan 9, 4:48 PM · PAWS
madhuvishy added a subtask for T184535: PAWS pods hung in state "Terminating": T184500: PAWS is down again.
Tue, Jan 9, 4:48 PM · Data-Services, cloud-services-team
madhuvishy triaged T184535: PAWS pods hung in state "Terminating" as High priority.
Tue, Jan 9, 4:47 PM · Data-Services, cloud-services-team
madhuvishy created T184535: PAWS pods hung in state "Terminating".
Tue, Jan 9, 4:47 PM · Data-Services, cloud-services-team
madhuvishy added a comment to T181925: Remove als.wik(ibooks|iquote|tionary), mo.wik(ipedia|tionary) views from replicas.

@jcrespo Thanks for pointing that out! Will add that to our docs.

Tue, Jan 9, 5:44 AM · Patch-For-Review, cloud-services-team (Kanban), Data-Services

Sat, Jan 6

madhuvishy added a comment to T181925: Remove als.wik(ibooks|iquote|tionary), mo.wik(ipedia|tionary) views from replicas.

@bd808 jfyi I also cleaned up the dns entries for these replicas, see patch in above comments.

Sat, Jan 6, 1:20 AM · Patch-For-Review, cloud-services-team (Kanban), Data-Services
madhuvishy updated the task description for T182722: Ferm changes on the host node break networking for Kubernetes pods.
Sat, Jan 6, 1:04 AM · Patch-For-Review, PAWS, cloud-services-team (Kanban), Kubernetes, Toolforge
madhuvishy added a comment to T182722: Ferm changes on the host node break networking for Kubernetes pods.

PAWS cluster DNS broke too, and all the workers had switched the default policy for Chain FORWARD to DROP again. I fixed it by running sudo iptables -P FORWARD ACCEPT across the paws-workers. So these two things seem related.
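
A minimal sketch of the check behind that fix, assuming the input is the text printed by iptables -S FORWARD on a worker (which lists the chain policy as a line of the form "-P FORWARD <policy>"). The function names are hypothetical, and the code only builds the command from the comment above; it does not run it.

```python
import re
from typing import List, Optional


def forward_policy(rules_listing: str) -> Optional[str]:
    """Return the default policy of the FORWARD chain, or None if not listed.

    Expects text in the format of `iptables -S FORWARD`.
    """
    for line in rules_listing.splitlines():
        match = re.fullmatch(r"-P FORWARD (\S+)", line.strip())
        if match:
            return match.group(1)
    return None


def fix_command(rules_listing: str) -> Optional[List[str]]:
    """Return the command that resets the policy, only when it is DROP."""
    if forward_policy(rules_listing) == "DROP":
        # Same command as quoted in the comment above.
        return ["sudo", "iptables", "-P", "FORWARD", "ACCEPT"]
    return None
```

Checking the policy first keeps the reset a no-op on workers that are already in the expected state.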

Sat, Jan 6, 12:35 AM · Patch-For-Review, PAWS, cloud-services-team (Kanban), Kubernetes, Toolforge

Fri, Jan 5

madhuvishy added a comment to T165136: Ferm rules for labstore NFS hosts.

Noting that I merged https://gerrit.wikimedia.org/r/353508 and applied profile::wmcs::nfs::ferm to the new dumps distribution servers labstore1006&7, and the ferm rules seem to be working well.

Fri, Jan 5, 7:32 PM · Patch-For-Review, Cloud-VPS, Operations
madhuvishy placed T171394: Better monitoring for labstore backup crons up for grabs.
Fri, Jan 5, 6:55 PM · Data-Services
madhuvishy closed T158196: Reimage labstore1001 and labstore1002 for DRBD storage setup as Resolved.

We'll do the upgrade to stretch for all labstore servers as a separate step after testing stretch for NFS. We would like to have parity in the OS versions across all the labstores to keep operational overhead minimal. I'm opening a different task - T184290 for upgrading labstore* to stretch and resolving this for now.

Fri, Jan 5, 6:52 PM · Data-Services, cloud-services-team (Kanban), Patch-For-Review, Operations
madhuvishy closed T158196: Reimage labstore1001 and labstore1002 for DRBD storage setup, a subtask of T126083: overhaul labstore setup [tracking], as Resolved.
Fri, Jan 5, 6:52 PM · Data-Services, Tracking, Operations
madhuvishy created T184290: Upgrade labstore servers in eqiad to Stretch.
Fri, Jan 5, 6:51 PM · Data-Services
madhuvishy placed T156934: Functionality to share & view SWAP notebooks up for grabs.
Fri, Jan 5, 6:44 PM · PAWS

Thu, Jan 4

madhuvishy added a comment to T183953: tools.iabot is using 1.3T of 8T available tools nfs storage.

@Cyberpower678 Any update on this? Thanks!

Thu, Jan 4, 6:53 PM · cloud-services-team (Kanban), Operations, Cloud-VPS
madhuvishy added a comment to T183970: wikidumpparse is using 1.2TB of 5T available NFS misc storage.

@notconfusing @Dfko @Hargup Hello! Poking on this task again: could you please clean up the home folder soon? Thank you.

Thu, Jan 4, 6:52 PM · cloud-services-team, Operations, Cloud-VPS
madhuvishy closed T183954: templatetiger is using 827G of 8T available tools nfs storage as Resolved.

Thank you!

Thu, Jan 4, 6:49 PM · cloud-services-team, Operations, Cloud-VPS
madhuvishy closed T183954: templatetiger is using 827G of 8T available tools nfs storage, a subtask of T183920: 2018-01-02: labstore Tools and Misc share very full, as Resolved.
Thu, Jan 4, 6:49 PM · cloud-services-team (Kanban), Operations, Cloud-VPS
madhuvishy added a comment to T183983: Re-institute query killer for the analytics WikiReplica.

@jcrespo Sounds great! Let's puppetize and tweak later if needed. Thank you :)

Thu, Jan 4, 6:30 PM · Data-Services, DBA
madhuvishy closed T183142: paws: 502 bad gateway as Resolved.

With T184018, this should be resolved for now. We plan to upgrade to the newer k8s version to mitigate the iptables rule issue soon. Let's reopen/make a new task if this issue happens again.

Thu, Jan 4, 6:26 PM · PAWS, Cloud-Services
madhuvishy added a parent task for T184018: Remove overlay from kernel blacklist on toolforge: T183142: paws: 502 bad gateway.
Thu, Jan 4, 6:25 PM · Patch-For-Review, Toolforge
madhuvishy added a subtask for T183142: paws: 502 bad gateway: T184018: Remove overlay from kernel blacklist on toolforge.
Thu, Jan 4, 6:25 PM · PAWS, Cloud-Services

Tue, Jan 2

madhuvishy triaged T183983: Re-institute query killer for the analytics WikiReplica as Normal priority.
Tue, Jan 2, 7:57 PM · Data-Services, DBA
madhuvishy added a comment to T174468: VPS Project dumps is using 1.7T at /data/project on NFS.

We are at high utilization by the dumps project again: 2T of 5T available storage. Please clean up excess files and data soon, thank you!

Tue, Jan 2, 7:44 PM · Cloud-VPS
madhuvishy added a subtask for T183920: 2018-01-02: labstore Tools and Misc share very full: T174468: VPS Project dumps is using 1.7T at /data/project on NFS.
Tue, Jan 2, 7:43 PM · cloud-services-team (Kanban), Operations, Cloud-VPS
madhuvishy added a parent task for T174468: VPS Project dumps is using 1.7T at /data/project on NFS: T183920: 2018-01-02: labstore Tools and Misc share very full.
Tue, Jan 2, 7:43 PM · Cloud-VPS
madhuvishy merged task T183971: dumps project is using 2T of 5T available NFS misc storage into T174468: VPS Project dumps is using 1.7T at /data/project on NFS.
Tue, Jan 2, 7:42 PM · cloud-services-team, Operations, Cloud-VPS
madhuvishy merged T183971: dumps project is using 2T of 5T available NFS misc storage into T174468: VPS Project dumps is using 1.7T at /data/project on NFS.
Tue, Jan 2, 7:42 PM · Cloud-VPS
madhuvishy triaged T183971: dumps project is using 2T of 5T available NFS misc storage as High priority.
Tue, Jan 2, 7:05 PM · cloud-services-team, Operations, Cloud-VPS
madhuvishy triaged T183970: wikidumpparse is using 1.2TB of 5T available NFS misc storage as High priority.
Tue, Jan 2, 7:02 PM · cloud-services-team, Operations, Cloud-VPS
madhuvishy added a comment to T183953: tools.iabot is using 1.3T of 8T available tools nfs storage.

@Cyberpower678 See Chase's comments on the parent task for more info T183920.

Tue, Jan 2, 6:59 PM · cloud-services-team (Kanban), Operations, Cloud-VPS
madhuvishy triaged T183954: templatetiger is using 827G of 8T available tools nfs storage as High priority.
Tue, Jan 2, 6:56 PM · cloud-services-team, Operations, Cloud-VPS
madhuvishy triaged T183953: tools.iabot is using 1.3T of 8T available tools nfs storage as High priority.
Tue, Jan 2, 6:53 PM · cloud-services-team (Kanban), Operations, Cloud-VPS

Thu, Dec 28

madhuvishy lowered the priority of T183142: paws: 502 bad gateway from High to Normal.
Thu, Dec 28, 8:00 PM · PAWS, Cloud-Services
madhuvishy added a comment to T183142: paws: 502 bad gateway.

I think this is fixed now, and PAWS is back up.

Thu, Dec 28, 7:59 PM · PAWS, Cloud-Services
madhuvishy created P6505 Paws worker iptables.
Thu, Dec 28, 7:52 PM

Dec 15 2017

madhuvishy added a comment to T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script.

The script is unbroken now and runs alright. Leaving this open until we remove all the metadata for labsdb1001 and 3 from labsdbaccount.account_host post decommission.

Dec 15 2017, 8:30 PM · cloud-services-team (Kanban), Data-Services
madhuvishy lowered the priority of T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script from High to Normal.
Dec 15 2017, 8:29 PM · cloud-services-team (Kanban), Data-Services
madhuvishy added a comment to T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script.

I dropped all the status=absent accounts from labsdbaccount.account_host for labsdb1001 and 1003.

Dec 15 2017, 8:28 PM · cloud-services-team (Kanban), Data-Services
madhuvishy triaged T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script as High priority.
Dec 15 2017, 8:06 PM · cloud-services-team (Kanban), Data-Services

Dec 6 2017

madhuvishy added a comment to T181518: kafka1018 fails to boot.

@elukey I recommend copying the home directories on notebook1002, backing them up somewhere on notebook1001, and sending a note to analytics and research-l asking folks to just use 1001. I don't think anyone uses 1002, but a few users have notebooks there, so notifying would be good.

Dec 6 2017, 5:34 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Operations, ops-eqiad

Nov 27 2017

madhuvishy updated the task description for T168486: Migrate customer-facing Dumps endpoints to Cloud Services.
Nov 27 2017, 7:56 PM · Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy added a parent task for T171540: Figure out how NFS failovers will work for the dumps servers - labstore1006|7: T181431: Setup NFS on dumps servers.
Nov 27 2017, 7:55 PM · Patch-For-Review, Data-Services
madhuvishy added a subtask for T181431: Setup NFS on dumps servers: T171540: Figure out how NFS failovers will work for the dumps servers - labstore1006|7.
Nov 27 2017, 7:55 PM · Patch-For-Review, Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy created T181431: Setup NFS on dumps servers.
Nov 27 2017, 7:54 PM · Patch-For-Review, Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy closed T174590: Revert patch that adds a temporary exception to the block-for-export check for the testlabs project as Resolved.

Done

Nov 27 2017, 7:52 PM · cloud-services-team (Kanban), Cloud-Services
madhuvishy closed T174590: Revert patch that adds a temporary exception to the block-for-export check for the testlabs project, a subtask of T171508: Investigate and implement alternative for showmount based check at instance boot time, as Resolved.
Nov 27 2017, 7:52 PM · cloud-services-team (Kanban), Patch-For-Review, Cloud-Services

Nov 20 2017

madhuvishy closed T171539: Puppetize and setup initial lvms and directory structures for labstore1006|7 as Resolved.

With https://gerrit.wikimedia.org/r/#/c/391892/, this is all done now.

Nov 20 2017, 5:20 AM · Patch-For-Review, Data-Services
madhuvishy closed T171539: Puppetize and setup initial lvms and directory structures for labstore1006|7, a subtask of T168486: Migrate customer-facing Dumps endpoints to Cloud Services, as Resolved.
Nov 20 2017, 5:20 AM · Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal

Nov 17 2017

madhuvishy added a comment to T171508: Investigate and implement alternative for showmount based check at instance boot time.

Raw notes from the Etherpad on rolling this all out:

Nov 17 2017, 12:00 AM · cloud-services-team (Kanban), Patch-For-Review, Cloud-Services

Nov 16 2017

madhuvishy added a comment to T171541: Setup periodic rsync jobs from dataset1001/dumpsdata1001|2 to labstore1006|7.

OK, I declare the patch ready to merge as soon as the following happen on labstore1006:

  • a new directory /srv/dumps/xmldatadumps created with owner/group root and 755 perms
  • move all directories and files under /srv/dumps, to /srv/dumps/xmldatadumps

This is all done now :)
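
For illustration only, the two steps above could be sketched as follows. The helper name is hypothetical, and this is not the procedure that was actually run on labstore1006. Setting owner/group to root is left out because it needs root privileges.

```python
import os
import shutil
from pathlib import Path


def nest_under_subdir(base: Path, subdir_name: str = "xmldatadumps") -> Path:
    """Create base/subdir_name with 755 perms and move everything else in base into it."""
    target = base / subdir_name
    target.mkdir(exist_ok=True)
    # Set the mode explicitly; the mode passed to mkdir is subject to the umask.
    os.chmod(target, 0o755)
    # Snapshot the listing first so the new directory is not moved into itself.
    for entry in list(base.iterdir()):
        if entry == target:
            continue
        shutil.move(str(entry), str(target / entry.name))
    return target
```

Called as nest_under_subdir(Path("/srv/dumps")), this would leave /srv/dumps containing only xmldatadumps, with the previous contents underneath it.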

Nov 16 2017, 8:22 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
madhuvishy added a project to T180659: Investigate the use of the shared NFS mount from labstore1003 to dataset1001: Data-Services.
Nov 16 2017, 6:31 AM · Patch-For-Review, Data-Services
madhuvishy created T180659: Investigate the use of the shared NFS mount from labstore1003 to dataset1001.
Nov 16 2017, 6:31 AM · Patch-For-Review, Data-Services

Nov 13 2017

madhuvishy closed T173647: Prepare and check storage layer for hif.wiktionary as Resolved.

Everything seems to be good now, I'm resolving this task. Thanks a ton @Marostegui!

Nov 13 2017, 7:05 PM · cloud-services-team (Kanban), Cloud-Services, DBA
madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

@jcrespo Understood, I wasn't aware of that. We are on the right track then :)

Nov 13 2017, 7:00 PM · cloud-services-team (Kanban), Cloud-Services, DBA
madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

@jcrespo That makes sense, I didn't know the private data was the reason we didn't do the wildcard grants. Let's leave it as is then. @aborrero may soon be working on better automating our flow for importing a new DB into the replicas and setting up access, and we can explore giving the grant at a per-view-database level, in an automated fashion, every time we do that.

Nov 13 2017, 6:57 PM · cloud-services-team (Kanban), Cloud-Services, DBA

Nov 10 2017

madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

@aborrero and I caught up on this, and it looks like all the DNS records are created now:

Nov 10 2017, 9:12 PM · cloud-services-team (Kanban), Cloud-Services, DBA

Nov 9 2017

madhuvishy added a comment to T180179: Evaluate the possibility to add Juniper images to Openstack.

Noting here that proprietary software is not usually installed on WMCS environments per https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#What_uses_of_Labs_do_we_not_like.3F (Proprietary Software).

Nov 9 2017, 11:58 PM · cloud-services-team (Kanban), Cloud-VPS, netops, Operations, Traffic
madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

@Marostegui right, okay. Thanks! Do we have a ticket for this issue?

Nov 9 2017, 6:55 PM · cloud-services-team (Kanban), Cloud-Services, DBA
madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

@aborrero FYI, after Manuel's magic, I've run

Nov 9 2017, 6:50 PM · cloud-services-team (Kanban), Cloud-Services, DBA
madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

@Marostegui Worked now! what did you have to do?

Nov 9 2017, 6:47 PM · cloud-services-team (Kanban), Cloud-Services, DBA
madhuvishy added a comment to T173647: Prepare and check storage layer for hif.wiktionary.

Also running directly on labsdb1011,

Nov 9 2017, 6:00 PM · cloud-services-team (Kanban), Cloud-Services, DBA

Nov 8 2017

madhuvishy added a comment to T171508: Investigate and implement alternative for showmount based check at instance boot time.

The pam_nologin behavior you're reporting sounds very odd indeed. If it's actually the case it will be CVE-worthy! It's an old, popular and well-audited piece of code though, so it'd be surprising to me if the root cause lies with pam_nologin and not somewhere in our configuration. It's not impossible of course, bugs and CVEs do happen :)

Have you encountered this behavior only during early/first boot, or is this reproducible after the first boot when e.g. creating that file? Is this perhaps a race that occurs while puppet is running and changing the system's configuration? Maybe something as innocent as sshd's UsePAM setting, or another PAM configuration, given that we're messing with it in the first puppet run to add LDAP auth?

When the config is account required nologin.so, I've only been able to reproduce this behavior during the firstboot stage. I've tried applying auth required nologin.so post boot to see how the behavior changes, and I've been able to log in every time, despite that config existing.

Nov 8 2017, 9:29 PM · cloud-services-team (Kanban), Patch-For-Review, Cloud-Services

Nov 3 2017

madhuvishy closed T179153: pawikisource_p.page table not available as Resolved.

I've fixed the grants for pawikisource_p now.

Nov 3 2017, 12:33 AM · Data-Services

Nov 2 2017

madhuvishy added a comment to T168584: Labsdb* servers need to be rebooted.

fyi @Cmjohnson We are not doing the labsdb1003 reboot on Tuesday Nov 7, due to T179464.

Nov 2 2017, 11:53 PM · Patch-For-Review, Scoring-platform-team (Current), DBA, cloud-services-team, Operations
madhuvishy awarded T179461: Use the term "developer account" for Wikimedia LDAP accounts a Love token.
Nov 2 2017, 10:06 PM · Operations, Cloud-Services, Developer-Relations

Nov 1 2017

madhuvishy added a comment to T179464: labsdb1001 crashed - storage issue.

It looks like it may be time to say goodbye to this server. I've spent some time today looking at the state of the storage configuration and the damage, and at whether anything at all can be done to recover the disk.

Nov 1 2017, 11:18 PM · Operations, cloud-services-team (Kanban)
madhuvishy edited P6241 Badblocks labsdb1001.
Nov 1 2017, 9:39 PM
madhuvishy created P6241 Badblocks labsdb1001.
Nov 1 2017, 8:11 PM
madhuvishy added a comment to T179464: labsdb1001 crashed - storage issue.

Disk setup for labsdb1001

Nov 1 2017, 5:06 PM · Operations, cloud-services-team (Kanban)

Oct 30 2017

madhuvishy added a comment to T168584: Labsdb* servers need to be rebooted.

The 1001 reboot is all done. Notes from my planning etherpad:

Oct 30 2017, 5:24 PM · Patch-For-Review, Scoring-platform-team (Current), DBA, cloud-services-team, Operations
madhuvishy closed T178128: Access to raw database tables on labsdb* for wmcs-admin users as Resolved.
Oct 30 2017, 1:27 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA

Oct 27 2017

madhuvishy added a comment to T178128: Access to raw database tables on labsdb* for wmcs-admin users.

I've now rolled this out to labsdb10[01|03|09|10|11]. @Marostegui Is there a file/config/logs somewhere you'd like me to persist these grants? Thanks for your help :)

Oct 27 2017, 6:07 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA
madhuvishy added a comment to T178128: Access to raw database tables on labsdb* for wmcs-admin users.

Cool, I've run

Oct 27 2017, 5:38 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA
madhuvishy added a comment to T178128: Access to raw database tables on labsdb* for wmcs-admin users.

@Marostegui Sounds good, thanks

Oct 27 2017, 5:26 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA
madhuvishy added a comment to T178128: Access to raw database tables on labsdb* for wmcs-admin users.

@Marostegui Yeah that sounds right to me! Cool if I run that across the wiki replicas?

Oct 27 2017, 5:10 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA

Oct 26 2017

madhuvishy updated the task description for T178807: Onboard aborrero to WMF.
Oct 26 2017, 9:19 PM · Patch-For-Review, cloud-services-team
madhuvishy added a comment to T168584: Labsdb* servers need to be rebooted.

Started a planning doc for the reboots here - https://etherpad.wikimedia.org/p/labsdb-reboots

Oct 26 2017, 6:16 PM · Patch-For-Review, Scoring-platform-team (Current), DBA, cloud-services-team, Operations
madhuvishy added a comment to T179075: User s53550 unable to connect to tools-db with given credentials.

Fixed! @MusikAnimal can you verify that your credentials work now and close this? Thank you :)

Oct 26 2017, 4:21 PM · cloud-services-team (Kanban), Data-Services

Oct 25 2017

madhuvishy added a comment to T179024: nfsiostat collector appears to be broken.

+1 That sounds like the right thing to do

Oct 25 2017, 9:47 PM · Patch-For-Review, cloud-services-team
madhuvishy updated subscribers of T178128: Access to raw database tables on labsdb* for wmcs-admin users.
Oct 25 2017, 9:18 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA
madhuvishy added a comment to T178128: Access to raw database tables on labsdb* for wmcs-admin users.

@bd808 I looked at the account setup we have now, and it looks like the labsdbadmin user is already set up with remote (specific IPs) permissions, but it only has Grant_priv and Create_user_priv, which in turn we use to create accounts and grant view privileges for toolforge users/tool accounts.

Oct 25 2017, 9:14 PM · Patch-For-Review, cloud-services-team (Kanban), Ops-Access-Requests, Operations, DBA
madhuvishy added a comment to T179024: nfsiostat collector appears to be broken.

nfsiostat.py has

Oct 25 2017, 8:13 PM · Patch-For-Review, cloud-services-team
madhuvishy added a comment to T178920: tools-package-builder-01.tools.eqiad.wmflabs Puppet failing for pbuilder changes.

Awesome thanks @akosiaris!

Oct 25 2017, 4:30 PM · cloud-services-team (Kanban), Toolforge

Oct 24 2017

madhuvishy updated the task description for T142807: Migrate all users to new Wiki Replica cluster and decommission old hardware.
Oct 24 2017, 8:17 PM · Patch-For-Review, Goal, cloud-services-team (FY2017-18), Data-Services, DBA
madhuvishy added a comment to T168584: Labsdb* servers need to be rebooted.

I've updated the lists, and our wiki here - https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_shutdown

Oct 24 2017, 8:14 PM · Patch-For-Review, Scoring-platform-team (Current), DBA, cloud-services-team, Operations
madhuvishy added a comment to T168584: Labsdb* servers need to be rebooted.

Proposed timing for the 2 reboots:

Oct 24 2017, 7:53 PM · Patch-For-Review, Scoring-platform-team (Current), DBA, cloud-services-team, Operations
madhuvishy reopened T168584: Labsdb* servers need to be rebooted as "Open".

Reopening since we are scheduling the labsdb1001 and 1003 reboots over the next couple weeks.

Oct 24 2017, 7:04 PM · Patch-For-Review, Scoring-platform-team (Current), DBA, cloud-services-team, Operations
madhuvishy reopened T168584: Labsdb* servers need to be rebooted, a subtask of T168445: Reboots of cloud servers, as Open.
Oct 24 2017, 7:04 PM · cloud-services-team, Operations
madhuvishy reopened T168584: Labsdb* servers need to be rebooted, a subtask of T142807: Migrate all users to new Wiki Replica cluster and decommission old hardware, as Open.
Oct 24 2017, 7:04 PM · Patch-For-Review, Goal, cloud-services-team (FY2017-18), Data-Services, DBA
madhuvishy added a comment to T178805: Increase Tools available quota.

+1

Oct 24 2017, 6:30 PM · Cloud-VPS (Quota-requests)