Mar 13 2018

madhuvishy closed T136192: templatetiger is using 613G in Tools out of 8T, a subtask of T136212: Contact tool maintainters using large amounts of disk space, as Resolved.
Mar 13 2018, 4:39 PM · Goal, Toolforge, Cloud-Services
madhuvishy lowered the priority of T183954: templatetiger is using 827G of 8T available tools nfs storage from High to Medium.
Mar 13 2018, 4:36 PM · cloud-services-team, SRE, Cloud-VPS
madhuvishy reopened T183954: templatetiger is using 827G of 8T available tools nfs storage as "Open".

@Kolossos I see utilization has climbed up again to over 600G. How can we ensure we don't have to keep making these tickets to clean up? We are happy to help figure out long term strategies!

Mar 13 2018, 4:35 PM · cloud-services-team, SRE, Cloud-VPS
madhuvishy reopened T183954: templatetiger is using 827G of 8T available tools nfs storage, a subtask of T183920: 2018-01-02: labstore Tools and Misc share very full, as Open.
Mar 13 2018, 4:35 PM · cloud-services-team (Kanban), SRE, Cloud-VPS
madhuvishy closed T174468: VPS Project dumps is using 1.7T at /data/project on NFS as Resolved.

Resolving this for now. This project still has high utilization, albeit less than before. We can discuss strategies to mitigate in T159930.

Mar 13 2018, 4:31 PM · Cloud-VPS
madhuvishy closed T174468: VPS Project dumps is using 1.7T at /data/project on NFS, a subtask of T183920: 2018-01-02: labstore Tools and Misc share very full, as Resolved.
Mar 13 2018, 4:31 PM · cloud-services-team (Kanban), SRE, Cloud-VPS

Mar 11 2018

madhuvishy moved T159930: Create custom instance flavor for Dumps project from Inbox to Discussion needed on the Cloud-VPS (Quota-requests) board.
Mar 11 2018, 8:26 PM · cloud-services-team (Kanban), Cloud-VPS (Quota-requests), User-Hydriz

Mar 9 2018

madhuvishy triaged T189284: Stop serving slowparse logs from dumps distribution servers as Medium priority.
Mar 9 2018, 7:28 AM · Performance-Team (Radar), Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
madhuvishy triaged T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches as Medium priority.
Mar 9 2018, 7:15 AM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown

Mar 7 2018

madhuvishy edited P6813 drafts of announcements for migration from dataset1001->labstore1006,7.
Mar 7 2018, 11:27 PM
madhuvishy added a comment to T188647: Announce/Communicate dumps migration to labstore1006|7 to stakeholders.

Draft timeline - T168486#4033572

Mar 7 2018, 10:02 PM · cloud-services-team (Kanban), Data-Services, User-ArielGlenn, Datasets-General-or-Unknown
madhuvishy added a comment to T168486: Migrate customer-facing Dumps endpoints to Cloud Services.

Draft timeline for migration:

Mar 7 2018, 10:01 PM · Patch-For-Review, Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy edited P6813 drafts of announcements for migration from dataset1001->labstore1006,7.
Mar 7 2018, 9:50 PM

Mar 6 2018

madhuvishy added a comment to T159930: Create custom instance flavor for Dumps project.

If the higher usage is periodic, I want to encourage setting up automatic clean up jobs after the dumps are processed

I might have missed something, but this is not something that clean up would help with. It's about transferring bigger datasets which are only produced occasionally. Clearly they can be split into smaller pieces, but to do so we might end up increasing the usage of resources (download something, write it, read it, process and split it, write again, move elsewhere etc.).

Mar 6 2018, 10:11 PM · cloud-services-team (Kanban), Cloud-VPS (Quota-requests), User-Hydriz
madhuvishy added a comment to T159930: Create custom instance flavor for Dumps project.

I have managed to reduce the disk usage to less than 500G. However, the original problem still stands where the dumps project may have a very high utilization of disk space during certain periods of time which may negatively affect other CloudVPS projects. Is it possible for a separate labstore volume to be created just for the dumps project?

Mar 6 2018, 7:17 PM · cloud-services-team (Kanban), Cloud-VPS (Quota-requests), User-Hydriz
madhuvishy added a comment to T189018: Toolforge Iinstances (maybe only Jessie?) are having issues with NFS/LDAP.

tools-worker-1011 was having issues allowing non-root logins. I rebooted it:

Mar 6 2018, 4:07 PM · Patch-For-Review, cloud-services-team (Kanban), Toolforge
madhuvishy added a project to T189018: Toolforge Iinstances (maybe only Jessie?) are having issues with NFS/LDAP: cloud-services-team.
Mar 6 2018, 3:34 PM · Patch-For-Review, cloud-services-team (Kanban), Toolforge

Mar 5 2018

madhuvishy added a comment to T171541: Setup periodic rsync jobs from dumps generation hosts to labstore1006|7.

See T188726 for new task on datasets in other/

Mar 5 2018, 6:01 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
madhuvishy closed T171541: Setup periodic rsync jobs from dumps generation hosts to labstore1006|7 as Resolved.

These have been running for a while now. The only things that don't get synced over on a regular basis are the various datasets pulled or pushed onto dataset1001 from kiwix, mwlog hosts, etc. Instead of setting up an additional sync job for those, we ought to just enable those syncs to happen on labstore1006 and sync from there to 1007.

  • profile::dumps::fetcher with appropriate hiera settings and permissions on stat1005 will take care of the incoming datasets
  • profile/manifests/phabricator/main.pp has a stanza for the push to dataset1001, so it should get a new stanza added, or convert this to pull
  • role/manifests/logging/mediawiki/udp2log.pp has a stanza for push to dumps.wikimedia.org, so it should get a new stanza added, or convert this to pull

Then this task could be closed.

Mar 5 2018, 5:58 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
madhuvishy closed T171541: Setup periodic rsync jobs from dumps generation hosts to labstore1006|7, a subtask of T168486: Migrate customer-facing Dumps endpoints to Cloud Services, as Resolved.
Mar 5 2018, 5:58 PM · Patch-For-Review, Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy closed T171541: Setup periodic rsync jobs from dumps generation hosts to labstore1006|7, a subtask of T182540: get datset1001, ms1001 ready for decommission, as Resolved.
Mar 5 2018, 5:58 PM · Patch-For-Review, Dumps-Generation
madhuvishy renamed T171541: Setup periodic rsync jobs from dumps generation hosts to labstore1006|7 from Setup periodic rsync jobs from dataset1001/dumpsdata1001|2 to labstore1006|7 to Setup periodic rsync jobs from dumps generation hosts to labstore1006|7.
Mar 5 2018, 5:57 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown

Mar 1 2018

madhuvishy added a comment to T188680: Fix maintain-dbusers to handle clashes with old existing accounts.

That seems like it would work yes :)

Mar 1 2018, 11:32 PM · cloud-services-team (Kanban), Data-Services
madhuvishy triaged T188680: Fix maintain-dbusers to handle clashes with old existing accounts as High priority.
Mar 1 2018, 11:25 PM · cloud-services-team (Kanban), Data-Services
madhuvishy triaged T188681: Maintain-dbusers should handle failures due to replicas being in maintenance as Medium priority.
Mar 1 2018, 11:24 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services
madhuvishy added a project to T188681: Maintain-dbusers should handle failures due to replicas being in maintenance: Data-Services.
Mar 1 2018, 11:24 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services
madhuvishy created T188681: Maintain-dbusers should handle failures due to replicas being in maintenance.
Mar 1 2018, 11:24 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services
madhuvishy created T188680: Fix maintain-dbusers to handle clashes with old existing accounts.
Mar 1 2018, 11:21 PM · cloud-services-team (Kanban), Data-Services
madhuvishy closed T188508: MySQL access not working for wmde-inline-movedparagraphs on tools as Resolved.
Mar 1 2018, 11:18 PM · Toolforge
madhuvishy added a comment to T188176: Revert: iiab temporary m1.xlarge increase.

Hey @Tim-moody, Chase is on-call this week and will make the changes soon :) Thanks for your patience!

Mar 1 2018, 11:16 PM · Wikimedia-Medicine, cloud-services-team (Kanban), Cloud-VPS (Quota-requests)
madhuvishy assigned T188176: Revert: iiab temporary m1.xlarge increase to chasemp.
Mar 1 2018, 11:16 PM · Wikimedia-Medicine, cloud-services-team (Kanban), Cloud-VPS (Quota-requests)
madhuvishy added a comment to T188508: MySQL access not working for wmde-inline-movedparagraphs on tools.

@WMDE-Fisch Argh sorry, should be fixed for real now!

Mar 1 2018, 11:15 PM · Toolforge
madhuvishy added a comment to T188500: toolforge and misc NFS share backups log errors when reading old snapshots.

Our current theory is that when snapshot-manager runs lvs to check if a snapshot exists, lvs throws these read errors, potentially because the older snapshots are full or unreadable for some reason. The snapshots get deleted anyway, so these errors are red herrings and don't affect the backups. We can either fix the logging of these errors, or remove the snapshots at the source server after the backup is done to avoid this problem.

Mar 1 2018, 11:01 PM · cloud-services-team, Data-Services, Cloud-VPS
madhuvishy renamed T188500: toolforge and misc NFS share backups log errors when reading old snapshots from toolforge and misc NFS share backups failed to toolforge and misc NFS share backups log errors when reading old snapshots.
Mar 1 2018, 10:57 PM · cloud-services-team, Data-Services, Cloud-VPS
madhuvishy added a comment to T188500: toolforge and misc NFS share backups log errors when reading old snapshots.

Despite the error lines in both cron logs for reading misc-snap, both backups seem to have completed successfully. Pasting everything but the Error lines from the above logs

Mar 1 2018, 10:43 PM · cloud-services-team, Data-Services, Cloud-VPS
madhuvishy added a comment to T188500: toolforge and misc NFS share backups log errors when reading old snapshots.
On labstore1004 I see:

root@labstore1004:~# lvs
  /dev/misc/misc-snap: read failed after 0 of 4096 at 5497558073344: Input/output error
  /dev/misc/misc-snap: read failed after 0 of 4096 at 5497558130688: Input/output error
  /dev/misc/misc-snap: read failed after 0 of 4096 at 0: Input/output error
  /dev/misc/misc-snap: read failed after 0 of 4096 at 4096: Input/output error
  /dev/tools/tools-snap: read failed after 0 of 4096 at 8796092956672: Input/output error
  /dev/tools/tools-snap: read failed after 0 of 4096 at 8796093014016: Input/output error
  /dev/tools/tools-snap: read failed after 0 of 4096 at 0: Input/output error
  /dev/tools/tools-snap: read failed after 0 of 4096 at 4096: Input/output error
  LV            VG    Attr       LSize  Pool Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  misc-project  misc  owi-aos---  5.00t
  misc-snap     misc  swi-I-s---  1.00t      misc-project  100.00
  test          misc  -wi-ao---- 10.00g
  tools-project tools owi-aos---  8.00t
  tools-snap    tools swi-I-s---  1.00t      tools-project 100.00
root@labstore1004:~#

Are snapshots not being created each time or did 1T of content really fill up while sync was in progress?
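
One way to read the snapshot state out of the lvs table above is to parse the Attr and Data% columns. The following is a minimal Python sketch, not the snapshot-manager code: the function name snapshot_status is hypothetical, and it assumes the whitespace-separated column layout shown above with an empty Pool column. It relies on the lvs(8) attribute convention that a leading "s" marks a snapshot volume and an "I" in the fifth position marks an invalid snapshot.

```python
def snapshot_status(lvs_output):
    """Return (vg/lv, data_percent, is_invalid) for each snapshot row.

    Hypothetical helper: assumes rows shaped like
    'LV VG Attr LSize Origin Data%' (Pool column empty), as in the
    lvs output pasted in this comment.
    """
    results = []
    for line in lvs_output.splitlines():
        fields = line.split()
        # Skip blank lines, the shell prompt, the header row and the
        # "read failed" error lines that lvs prints before the table.
        if len(fields) < 6 or fields[0] == "LV" or "read failed" in line:
            continue
        name, vg, attr = fields[0], fields[1], fields[2]
        # First Attr character "s" means the volume is a snapshot.
        if len(attr) < 5 or attr[0] != "s":
            continue
        try:
            data_percent = float(fields[5])
        except ValueError:
            continue
        # Fifth Attr character "I" means the snapshot is invalid.
        results.append((f"{vg}/{name}", data_percent, attr[4] == "I"))
    return results
```

Applied to the table above, this would report misc/misc-snap and tools/tools-snap, both at 100% and both flagged invalid, which is consistent with the snapshots having filled up rather than not having been created.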
Mar 1 2018, 10:35 PM · cloud-services-team, Data-Services, Cloud-VPS
madhuvishy triaged T188647: Announce/Communicate dumps migration to labstore1006|7 to stakeholders as Medium priority.
Mar 1 2018, 6:15 PM · cloud-services-team (Kanban), Data-Services, User-ArielGlenn, Datasets-General-or-Unknown
madhuvishy added a parent task for T185101: Labstore1006/7 profile for meltdown kernel: T168486: Migrate customer-facing Dumps endpoints to Cloud Services.
Mar 1 2018, 6:12 PM · cloud-services-team (Kanban), SRE
madhuvishy added a subtask for T168486: Migrate customer-facing Dumps endpoints to Cloud Services: T185101: Labstore1006/7 profile for meltdown kernel.
Mar 1 2018, 6:12 PM · Patch-For-Review, Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy updated the task description for T168486: Migrate customer-facing Dumps endpoints to Cloud Services.
Mar 1 2018, 6:12 PM · Patch-For-Review, Datasets-General-or-Unknown, cloud-services-team (FY2017-18), Goal
madhuvishy updated the task description for T182540: get datset1001, ms1001 ready for decommission.
Mar 1 2018, 6:11 PM · Patch-For-Review, Dumps-Generation
madhuvishy triaged T188646: Point dumps.wikimedia.org to labstore1006/7 as Medium priority.
Mar 1 2018, 6:11 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services, Datasets-General-or-Unknown
madhuvishy triaged T188645: Get all the rsync mirror sites to to switch over to labstore1006,7 as Medium priority.
Mar 1 2018, 6:10 PM · cloud-services-team (Kanban), Data-Services, Datasets-General-or-Unknown
madhuvishy triaged T188644: Migrate the stat* mount from dataset1001 to labstore1006/7 as Medium priority.
Mar 1 2018, 6:07 PM · Patch-For-Review, cloud-services-team (Kanban), Datasets-General-or-Unknown
madhuvishy triaged T188643: Migrate Dumps WMCS NFS users from labstore1003 to labstore1006/7 as Medium priority.
Mar 1 2018, 6:05 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services, Datasets-General-or-Unknown
madhuvishy triaged T188642: Set up labstore1006|7 as the source for rsync mirror sites as Medium priority.
Mar 1 2018, 5:59 PM · cloud-services-team (Kanban), Data-Services, Datasets-General-or-Unknown
madhuvishy triaged T188641: Set up the web service that serves dumps.wikimedia.org as Medium priority.
Mar 1 2018, 5:57 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services, Datasets-General-or-Unknown
madhuvishy added a comment to T188589: m5-master overloaded by idle connections to the nova database.

Some logs from nova-conductor corresponding to the time of the incident. This doesn't seem like the root cause, but it correlates with the db spike. https://phabricator.wikimedia.org/P6770

Mar 1 2018, 7:26 AM · SRE, Cloud-Services, DBA
madhuvishy created P6770 Nova Conductor logs for db1009 overload.
Mar 1 2018, 7:25 AM
madhuvishy added a comment to T188589: m5-master overloaded by idle connections to the nova database.

Things seem a lot better now since

Mar 1 2018, 7:03 AM · SRE, Cloud-Services, DBA

Feb 28 2018

madhuvishy added a comment to T188508: MySQL access not working for wmde-inline-movedparagraphs on tools.

The script is failing due to the existing user account clash issue that we hoped would go away with the 1001|3 decommission. It looks like we still have older accounts in labsdb1005 that cause the same problem.

Feb 28 2018, 7:43 PM · Toolforge
madhuvishy added a comment to T188500: toolforge and misc NFS share backups log errors when reading old snapshots.
Feb 28 2018, 7:43 PM · cloud-services-team, Data-Services, Cloud-VPS

Feb 23 2018

madhuvishy created T188137: Fix tools dumps availability over NFS check.
Feb 23 2018, 8:48 PM · cloud-services-team, Data-Services
madhuvishy added a comment to T188073: Maintain symlinks for WMCS NFS Dumps users with new directory structure.

+1 to handling on labstore boxes. Puppet should be able to do it.

Feb 23 2018, 7:45 PM · Patch-For-Review, cloud-services-team (Kanban), Data-Services
madhuvishy triaged T188073: Maintain symlinks for WMCS NFS Dumps users with new directory structure as Medium priority.
Feb 23 2018, 7:40 AM · Patch-For-Review, cloud-services-team (Kanban), Data-Services

Feb 22 2018

madhuvishy edited projects for T156934: Functionality to share & view notebooks, added: Analytics; removed PAWS.
Feb 22 2018, 10:14 PM · Data-Engineering, Product-Analytics, Data-Engineering-Jupyter

Feb 20 2018

madhuvishy closed T186235: Tools and Misc NFS weekly backups failing since 1/17 as Resolved.

Yup looks good, backups have been running fine for the last 2 weeks.

Feb 20 2018, 4:29 PM · cloud-services-team (Kanban), Patch-For-Review, Data-Services
madhuvishy closed T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D as Resolved.

The servers are moved and up and running! Thanks for your work @Cmjohnson.

Feb 20 2018, 4:27 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy closed T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D, a subtask of T182540: get datset1001, ms1001 ready for decommission, as Resolved.
Feb 20 2018, 4:27 PM · Patch-For-Review, Dumps-Generation

Feb 18 2018

madhuvishy added a comment to T185434: PAWS fails creating a server for new user .

I renamed the hack script in tools.paws to paws-userhomes-hack.bash, it now looks like:

Feb 18 2018, 12:46 AM · Tracking-Neverending, PAWS

Feb 14 2018

madhuvishy added a comment to T186585: Review m5 backups.

Thanks for this work @jcrespo!

Feb 14 2018, 6:02 PM · DBA, cloud-services-team (Kanban), Data-Services
madhuvishy closed T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script, a subtask of T142807: Migrate all users to new Wiki Replica cluster and decommission old hardware, as Resolved.
Feb 14 2018, 6:02 PM · User-bd808, Patch-For-Review, Goal, cloud-services-team (FY2017-18), Data-Services, DBA
madhuvishy closed T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script as Resolved.
Feb 14 2018, 6:02 PM · cloud-services-team (Kanban), Data-Services
madhuvishy added a comment to T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script.

I've dropped the metadata for labsdb1001 and 1003 from labsdbaccounts.account_host. It now looks like

Feb 14 2018, 6:01 PM · cloud-services-team (Kanban), Data-Services
madhuvishy added a comment to T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script.

This is the new list of m5 databases being backed up. If you confirm that it is as intended, you can proceed now that we have proper[sic] backups.

root@dbstore2001:/srv/backups/m5.20180214102352$ ls *-schema-create.sql.gz
ceilometer-schema-create.sql.gz
designate_pool_manager-schema-create.sql.gz
designate-schema-create.sql.gz
glance-schema-create.sql.gz
keystone-schema-create.sql.gz
labsdbaccounts-schema-create.sql.gz
labspuppet-schema-create.sql.gz
neutron-schema-create.sql.gz
nodepooldb-schema-create.sql.gz
nova-schema-create.sql.gz
striker-schema-create.sql.gz
Feb 14 2018, 5:55 PM · cloud-services-team (Kanban), Data-Services

Feb 8 2018

madhuvishy renamed T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D from Move labstore1006 and 1007 to 10G enabled racks in row D to Move labstore1006 and 1007 to 10G enabled racks in row A & D.
Feb 8 2018, 12:12 AM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE

Feb 7 2018

madhuvishy added a comment to T186585: Review m5 backups.

puppet seems to be the only other one, but no one in Cloud Services knows much about it or maintains it. We only found data in there from 2012, and it doesn't seem to be referenced anywhere in puppet.

Feb 7 2018, 11:27 PM · DBA, cloud-services-team (Kanban), Data-Services
madhuvishy added a comment to T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D.

+1 on moving only once!

Feb 7 2018, 11:25 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy added a comment to T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D.

@ayounsi No, we can't lose both without service interruption. I am not sure how we can have row-level redundancy in this case if there is only 10G availability in one row.

Feb 7 2018, 11:20 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy added a comment to T175768: Improvements for the Toolforge 'webservice' command.

@srishakatux Perfect, thank you!

Feb 7 2018, 11:13 PM · Toolforge
madhuvishy renamed T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D from set up labstore1006,1007 for use of their 10G nics to Move labstore1006 and 1007 to 10G enabled racks in row D.
Feb 7 2018, 11:01 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy added a comment to T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D.

@Cmjohnson So to clarify, do both row A and D (or the racks we have these servers in - D6 and A1) not have 10G enabled?

Feb 7 2018, 10:50 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy added a comment to T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D.

@Cmjohnson Can we move them to a row with 10G then? These are in the public vlan, so they don't need labs-support. I believe they are currently in A and D.

Feb 7 2018, 10:45 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy updated subscribers of T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D.

@Cmjohnson When we racked labstore1006 & 7 we approved the proposal for racking in 1GBE racks (T167984). I did not know that we had specifically ordered (Hardware request - T161311) 10G NICs on these boxes because the public dumps servers need those enabled (discussed in T118154#3017229)

Feb 7 2018, 10:41 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE
madhuvishy edited projects for T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D, added: DC-Ops, Data-Services; removed Cloud-Services.
Feb 7 2018, 10:35 PM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, SRE

Feb 6 2018

madhuvishy added a comment to T175768: Improvements for the Toolforge 'webservice' command.

@srishakatux Yes, we are willing to mentor this for GSoC 2018 or Outreachy Round 16. Let me know if there's anything I need to do on my side to have this up as a project. Thanks :)

Feb 6 2018, 5:07 PM · Toolforge
madhuvishy added a comment to T186585: Review m5 backups.

Drop: test_labsdbaccounts
Backup: labsdbaccounts

Feb 6 2018, 4:27 PM · DBA, cloud-services-team (Kanban), Data-Services
madhuvishy added a comment to T183029: Stop managing account creation for labsdb1001 and 1003 through the maintain-dbusers script.

@jcrespo I'd like to drop all the accounts metadata for labsdb1001 & 3 from labsdbaccounts.account_host on m5-master to close this task.

Feb 6 2018, 12:31 AM · cloud-services-team (Kanban), Data-Services

Feb 5 2018

madhuvishy closed T185851: Request increased quota for Wikiapiary Cloud VPS project as Resolved.
Feb 5 2018, 8:28 PM · Cloud-VPS (Quota-requests)
madhuvishy updated the task description for T185493: Onboard bstorm to WMF.
Feb 5 2018, 7:22 PM · cloud-services-team (Kanban), SRE
madhuvishy updated the task description for T185493: Onboard bstorm to WMF.
Feb 5 2018, 6:32 PM · cloud-services-team (Kanban), SRE
madhuvishy updated the task description for T185493: Onboard bstorm to WMF.
Feb 5 2018, 6:28 PM · cloud-services-team (Kanban), SRE
madhuvishy closed T185591: Requesting access to ops group in admin for bstorm as Resolved.
Feb 5 2018, 6:27 PM · Patch-For-Review, SRE, SRE-Access-Requests
madhuvishy closed T185591: Requesting access to ops group in admin for bstorm, a subtask of T185493: Onboard bstorm to WMF, as Resolved.
Feb 5 2018, 6:27 PM · cloud-services-team (Kanban), SRE
madhuvishy updated the task description for T185591: Requesting access to ops group in admin for bstorm.
Feb 5 2018, 6:26 PM · Patch-For-Review, SRE, SRE-Access-Requests
madhuvishy added a comment to T186235: Tools and Misc NFS weekly backups failing since 1/17.

@chasemp The crons are scheduled for tomorrow and the day after :) My manual backups completed fine.

Feb 5 2018, 5:18 PM · cloud-services-team (Kanban), Patch-For-Review, Data-Services
madhuvishy moved T185851: Request increased quota for Wikiapiary Cloud VPS project from Inbox to Approved on the Cloud-VPS (Quota-requests) board.
Feb 5 2018, 4:59 AM · Cloud-VPS (Quota-requests)
madhuvishy added a comment to T185851: Request increased quota for Wikiapiary Cloud VPS project.

@MarkAHershberger I've applied the quota increase - let me know if it's all good. Thanks!

Feb 5 2018, 4:51 AM · Cloud-VPS (Quota-requests)

Feb 1 2018

madhuvishy committed R2073:51d39e39bab2: Add API endpoint for hierakey.
Feb 1 2018, 10:27 PM
madhuvishy triaged T186235: Tools and Misc NFS weekly backups failing since 1/17 as Medium priority.
Feb 1 2018, 5:11 PM · cloud-services-team (Kanban), Patch-For-Review, Data-Services
madhuvishy added a comment to T186235: Tools and Misc NFS weekly backups failing since 1/17.

Fixed with https://gerrit.wikimedia.org/r/407460. I'm running manual backup jobs now for both shares on screen. Will close after confirming that the scheduled crons run successfully next week.

Feb 1 2018, 5:11 PM · cloud-services-team (Kanban), Patch-For-Review, Data-Services
madhuvishy created T186235: Tools and Misc NFS weekly backups failing since 1/17.
Feb 1 2018, 4:59 PM · cloud-services-team (Kanban), Patch-For-Review, Data-Services

Jan 31 2018

madhuvishy added a comment to T183970: wikidumpparse is using 1.2TB of 5T available NFS misc storage.

@notconfusing Great, thank you!

Jan 31 2018, 5:43 PM · cloud-services-team, Cloud-VPS

Jan 29 2018

madhuvishy added a comment to T185574: toolsadmin removed rush from admin tool and can't find the user.

Noting here that I added Brooke on Tue, Jan 23rd after Bryan's fix, and made sure Rush was still in the list after I did so.

Jan 29 2018, 8:03 PM · cloud-services-team (Kanban), Striker
madhuvishy added a comment to T183970: wikidumpparse is using 1.2TB of 5T available NFS misc storage.

@notconfusing Is this service still active? Are there ongoing clean up jobs in place to delete files that are generated? I see that the usage has now grown to 160G, and want to make sure we don't end up with really high utilization again. Thanks!
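
As an illustration of the kind of ongoing clean up job asked about here, the following is a minimal Python sketch that removes files older than a given age. It is an assumption about what such a job could look like, not anything the tool actually runs: the function name remove_old_files and its parameters are hypothetical, and the directory and age to use would depend on how the generated files are consumed.

```python
import time
from pathlib import Path


def remove_old_files(root, max_age_days, dry_run=True, now=None):
    """Delete regular files under root last modified more than max_age_days ago.

    Hypothetical helper for illustration. With dry_run=True nothing is
    deleted; the function only returns what it would remove.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Materialise the listing first so deletions do not disturb the walk.
    for path in sorted(Path(root).rglob("*")):
        # Leave directories and symlinks alone; only plain files are aged out.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Running it first with dry_run=True shows which files would go before anything is deleted; a scheduled job would then call it with dry_run=False.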

Jan 29 2018, 5:51 PM · cloud-services-team, Cloud-VPS

Jan 23 2018

madhuvishy added a parent task for T185591: Requesting access to ops group in admin for bstorm: T185493: Onboard bstorm to WMF.
Jan 23 2018, 7:46 PM · Patch-For-Review, SRE, SRE-Access-Requests
madhuvishy added a subtask for T185493: Onboard bstorm to WMF: T185591: Requesting access to ops group in admin for bstorm.
Jan 23 2018, 7:46 PM · cloud-services-team (Kanban), SRE
madhuvishy updated the task description for T185493: Onboard bstorm to WMF.
Jan 23 2018, 7:43 PM · cloud-services-team (Kanban), SRE
madhuvishy updated the task description for T185493: Onboard bstorm to WMF.
Jan 23 2018, 7:41 PM · cloud-services-team (Kanban), SRE
madhuvishy added a comment to T185591: Requesting access to ops group in admin for bstorm.

@RobH Yup, +1. We are tracking the full list here at T185493

Jan 23 2018, 7:37 PM · Patch-For-Review, SRE, SRE-Access-Requests