elukey (Luca Toscano)
User

User Details

User Since
Jan 5 2016, 9:54 PM (145 w, 5 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
LToscano (WMF) [ Global Accounts ]

Recent Activity

Today

elukey closed T207487: Puppet broken on deployment-deploy* as Resolved.

Fixed by Joe with https://gerrit.wikimedia.org/r/468937

Mon, Oct 22, 9:10 AM · Patch-For-Review, Puppet, Beta-Cluster-Infrastructure
elukey added a comment to T203786: Mcrouter periodically reports soft TKOs for mc[1,2]035 leading to MW Memcached exceptions.

Very interesting discovery today. The probe_delay_initial_ms (time to wait before sending the first health check to memcached after it has been marked with TKO) is 10s. This is the timeline of one mcrouter TKO workflow on mw1347:
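
As a rough illustration of what a 10s initial delay implies (this is not the timeline referred to above), the sketch below computes when health-check probes might be sent after a server is marked TKO. Only the 10000 ms initial delay comes from the comment; the function name, the backoff factor and the maximum delay are assumptions made for the example, and any jitter is ignored.

```python
def probe_schedule(initial_ms=10_000, max_ms=60_000, factor=1.5, count=5):
    """Return offsets (ms since the TKO mark) at which probes would be sent.

    initial_ms mirrors probe_delay_initial_ms from the comment above.
    max_ms and factor are hypothetical values chosen for illustration only.
    """
    offsets = []
    delay = initial_ms
    elapsed = 0
    for _ in range(count):
        # Wait for the current delay, then send a probe.
        elapsed += delay
        offsets.append(elapsed)
        # Grow the delay for the next probe, capped at max_ms.
        delay = min(int(delay * factor), max_ms)
    return offsets


if __name__ == "__main__":
    for i, offset in enumerate(probe_schedule(), start=1):
        print(f"probe {i}: {offset / 1000:.1f}s after TKO")
```

Under these assumptions a server that recovers right after being marked TKO would still not be probed for the first 10 seconds, which is the point the comment raises.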

Mon, Oct 22, 8:16 AM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Performance-Team (Radar), Wikimedia-production-error, Gadgets, User-Elukey, MediaWiki-Cache, Operations

Fri, Oct 19

elukey triaged T207424: Many errors on "MobileWikiAppiOSSearch" and "MobileWikiAppiOSUserHistory" as High priority.
Fri, Oct 19, 5:50 AM · iOS-app-v6.1-Narwhal-On-A-Bumper-Car, iOS-app-Bugs, iOS-app-feature-Analytics, Analytics, Wikipedia-iOS-App-Backlog

Thu, Oct 18

elukey moved T206839: Upgrade to Druid 0.12.3 from Backlog to In Progress on the User-Elukey board.
Thu, Oct 18, 12:27 PM · Analytics-Kanban, User-Elukey, Analytics
elukey added a comment to T203786: Mcrouter periodically reports soft TKOs for mc[1,2]035 leading to MW Memcached exceptions.

Recap of what we did so far:

Thu, Oct 18, 9:26 AM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Performance-Team (Radar), Wikimedia-production-error, Gadgets, User-Elukey, MediaWiki-Cache, Operations
elukey updated the task description for T172532: Refactor analytics cronjobs to alarm on failure reliably.
Thu, Oct 18, 9:04 AM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics

Wed, Oct 17

elukey set the point value for T128623: Purge all Schema:Echo data after 90 days to 3.
Wed, Oct 17, 3:18 PM · Patch-For-Review, Analytics-Kanban, Growth-Team, Collaboration-Team-Triage, Notifications, Analytics, DBA
elukey moved T128623: Purge all Schema:Echo data after 90 days from Next Up to Done on the Analytics-Kanban board.
Wed, Oct 17, 3:18 PM · Patch-For-Review, Analytics-Kanban, Growth-Team, Collaboration-Team-Triage, Notifications, Analytics, DBA
elukey added a comment to T128623: Purge all Schema:Echo data after 90 days.

Tables dropped with Marcel on db110[7,8] (eventlogging master/slave). Marcel checked and nothing is there on HDFS.

Wed, Oct 17, 3:18 PM · Patch-For-Review, Analytics-Kanban, Growth-Team, Collaboration-Team-Triage, Notifications, Analytics, DBA
elukey updated the task description for T172532: Refactor analytics cronjobs to alarm on failure reliably.
Wed, Oct 17, 12:36 PM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey moved T206943: JVM pauses cause Yarn master to failover from Backlog to In Progress on the User-Elukey board.
Wed, Oct 17, 11:46 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
elukey added a comment to T206943: JVM pauses cause Yarn master to failover.

For reference, https://www.slideshare.net/HadoopSummit/operating-and-supporting-apache-hbase-best-practices-and-improvements (slide 15) shows a similar problem on HBase, which was related to the kernel driver for the disk controller.

Wed, Oct 17, 11:37 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
elukey added a comment to T206943: JVM pauses cause Yarn master to failover.

It also happened on the 16th, but didn't lead to any failover:

Wed, Oct 17, 10:26 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
elukey added a comment to T206943: JVM pauses cause Yarn master to failover.

For reference, this is what https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/JvmPauseMonitor.html does:

Wed, Oct 17, 10:22 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
elukey added a comment to T206943: JVM pauses cause Yarn master to failover.

I was about to send another code change for the GC, but then I took another look at the logs in the description and realized that I had missed an important bit:

Wed, Oct 17, 9:45 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics

Tue, Oct 16

elukey moved T206542: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005 from In Progress to In Code Review on the Analytics-Kanban board.
Tue, Oct 16, 1:51 PM · Analytics-Kanban, Patch-For-Review, Analytics
elukey moved T172532: Refactor analytics cronjobs to alarm on failure reliably from Next Up to In Progress on the Analytics-Kanban board.
Tue, Oct 16, 1:51 PM · Patch-For-Review, Analytics-Kanban, User-Elukey, Analytics
elukey added a comment to T203669: Return to real time banner impressions in Druid.

@AndyRussG sorry for the lag but I had to clarify with Joseph some details :)

Tue, Oct 16, 1:43 PM · Analytics-Kanban, User-Elukey, Analytics
elukey moved T207165: eventlogging_db_sanitization script failed from Next Up to Done on the Analytics-Kanban board.
Tue, Oct 16, 1:30 PM · Analytics-Kanban, Analytics
elukey claimed T207165: eventlogging_db_sanitization script failed.
Tue, Oct 16, 1:30 PM · Analytics-Kanban, Analytics
elukey added a comment to T207165: eventlogging_db_sanitization script failed.

15:24 <icinga-wm> RECOVERY - Check systemd state on db1108 is OK: OK - running: The system is fully operational
15:29 <icinga-wm> RECOVERY - Check systemd state on db1107 is OK: OK - running: The system is fully operational

Tue, Oct 16, 1:30 PM · Analytics-Kanban, Analytics
elukey added a comment to T207165: eventlogging_db_sanitization script failed.

Thanks! So this might be a case of a schema present only on Hadoop and not in MySQL? If so, the logic that triggered the above check needs to be removed :)

Tue, Oct 16, 12:57 PM · Analytics-Kanban, Analytics
elukey updated subscribers of T207165: eventlogging_db_sanitization script failed.

@Gilles hi! Do you know when ResourceTiming will start registering events in Eventlogging?

Tue, Oct 16, 12:50 PM · Analytics-Kanban, Analytics
elukey added a comment to T207165: eventlogging_db_sanitization script failed.

This should be a protection mechanism that, in this case, caused a false positive. https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/466607/ adds a new schema to the whitelist, but probably no events for that schema have been collected by Eventlogging yet (hence no table has been created on the DB).
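
A minimal sketch of how such a check could report a whitelisted schema without a table instead of failing on it. The function names and the `fail_on_missing` flag are hypothetical and do not come from the actual eventlogging_db_sanitization script; "ExistingSchema" is a made-up schema name.

```python
def missing_tables(whitelisted_schemas, existing_tables):
    """Return whitelisted schemas that have no table in the database yet."""
    return sorted(set(whitelisted_schemas) - set(existing_tables))


def check_whitelist(whitelisted_schemas, existing_tables, fail_on_missing=False):
    """Report whitelisted schemas without a table.

    With fail_on_missing=True the check behaves like a strict guard and
    raises. With the default it only returns the list, so a schema that has
    not received events yet does not break the run.
    """
    missing = missing_tables(whitelisted_schemas, existing_tables)
    if missing and fail_on_missing:
        raise RuntimeError(f"whitelisted schemas without a table: {missing}")
    return missing


if __name__ == "__main__":
    # Hypothetical data: one schema is whitelisted but has no events yet.
    whitelist = ["ResourceTiming", "ExistingSchema"]
    tables = ["ExistingSchema"]
    print(check_whitelist(whitelist, tables))
```

The trade-off is that a non-strict check can hide a real typo in the whitelist, so logging the returned list is probably still worthwhile.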

Tue, Oct 16, 12:49 PM · Analytics-Kanban, Analytics
elukey updated subscribers of T206965: Degraded RAID on dbstore1002.

@Cmjohnson this server is OOW, but the replacement will take time to arrive (it is still in procurement) and this host is really important for the research users. Do we have a spare disk that we can swap in?

Tue, Oct 16, 6:45 AM · Analytics, ops-eqiad, Operations
elukey triaged T206965: Degraded RAID on dbstore1002 as High priority.
Tue, Oct 16, 6:44 AM · Analytics, ops-eqiad, Operations

Mon, Oct 15

elukey added a comment to T203786: Mcrouter periodically reports soft TKOs for mc[1,2]035 leading to MW Memcached exceptions.

All mcrouters now use 5 persistent conns to each shard; the above graph shows the increase that each memcached server observed after the rollout. We still see some connection yields, so this might not be enough; it really depends on how mcrouter handles short bursts with more than one connection.

Mon, Oct 15, 4:10 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Performance-Team (Radar), Wikimedia-production-error, Gadgets, User-Elukey, MediaWiki-Cache, Operations
elukey moved T206943: JVM pauses cause Yarn master to failover from Next Up to In Progress on the Analytics-Kanban board.
Mon, Oct 15, 3:00 PM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics
elukey updated subscribers of T206331: Git push and pull don't complete.

Thanks Erik, it should work now :)

Mon, Oct 15, 2:04 PM · User-Elukey, Analytics, Analytics-Wikistats
elukey added a comment to T206331: Git push and pull don't complete.

I'd need to know the following data:

Mon, Oct 15, 12:30 PM · User-Elukey, Analytics, Analytics-Wikistats
elukey added a comment to T206331: Git push and pull don't complete.

Updating priority as I need to check in bug fixes, which is way overdue. Thanks.

Mon, Oct 15, 10:45 AM · User-Elukey, Analytics, Analytics-Wikistats
elukey added a project to T206943: JVM pauses cause Yarn master to failover: User-Elukey.
Mon, Oct 15, 7:48 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics

Sun, Oct 14

elukey triaged T206943: JVM pauses cause Yarn master to failover as High priority.
Sun, Oct 14, 8:54 AM · Patch-For-Review, User-Elukey, Analytics-Kanban, Analytics

Fri, Oct 12

elukey moved T206839: Upgrade to Druid 0.12.3 from Next Up to In Progress on the Analytics-Kanban board.
Fri, Oct 12, 4:05 PM · Analytics-Kanban, User-Elukey, Analytics
elukey added a project to T206839: Upgrade to Druid 0.12.3: Analytics-Kanban.
Fri, Oct 12, 4:05 PM · Analytics-Kanban, User-Elukey, Analytics
elukey added a comment to T206839: Upgrade to Druid 0.12.3.

Created the 0.12.3 debs on boron and deployed the new version in Labs, ready for testing!

Fri, Oct 12, 4:05 PM · Analytics-Kanban, User-Elukey, Analytics
elukey placed T196685: rack/setup/install rdb10[09|10].eqiad.wmnet up for grabs.
Fri, Oct 12, 12:52 PM · User-Joe, User-Elukey, Operations
elukey added a project to T206839: Upgrade to Druid 0.12.3: User-Elukey.
Fri, Oct 12, 9:18 AM · Analytics-Kanban, User-Elukey, Analytics
elukey updated the task description for T206839: Upgrade to Druid 0.12.3.
Fri, Oct 12, 8:55 AM · Analytics-Kanban, User-Elukey, Analytics
elukey triaged T206839: Upgrade to Druid 0.12.3 as Normal priority.
Fri, Oct 12, 8:52 AM · Analytics-Kanban, User-Elukey, Analytics
elukey closed T204970: setup/install an-coord1001/wmf7621 as Resolved.
Fri, Oct 12, 8:11 AM · User-Elukey, Patch-For-Review, ops-eqiad, Analytics, Operations
elukey closed T201939: rack/setup/install an-master100[12].eqiad.wmnet as Resolved.
Fri, Oct 12, 8:11 AM · Patch-For-Review, User-Elukey, Analytics, Operations
elukey moved T206524: Decommission analytics1003 from Backlog to Keep an eye on it on the User-Elukey board.
Fri, Oct 12, 8:09 AM · decommission, DC-Ops, User-Elukey, Analytics

Thu, Oct 11

elukey added a comment to T203786: Mcrouter periodically reports soft TKOs for mc[1,2]035 leading to MW Memcached exceptions.

Finally built and deployed the new prometheus-memcached-exporter on the mc* hosts; https://grafana.wikimedia.org/dashboard/db/memcache now shows two new metrics, including the rate of connection yields broken down by shard.

Thu, Oct 11, 2:37 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Performance-Team (Radar), Wikimedia-production-error, Gadgets, User-Elukey, MediaWiki-Cache, Operations
elukey updated the task description for T203165: Reboot Analytics hosts for kernel security upgrades.
Thu, Oct 11, 2:20 PM · Analytics-Kanban, Analytics
elukey updated the task description for T198694: Q1 2018/19 Analytics procurement.
Thu, Oct 11, 8:56 AM · Analytics-Kanban, User-Elukey
elukey closed T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement) as Resolved.

Done! Will follow up in another task to replace stat1005 with this new host.

Thu, Oct 11, 8:55 AM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics
elukey claimed T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement).
Thu, Oct 11, 8:54 AM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics
elukey moved T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement) from Backlog to In Progress on the User-Elukey board.
Thu, Oct 11, 7:26 AM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics

Wed, Oct 10

elukey updated the task description for T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement).
Wed, Oct 10, 4:58 PM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics
elukey updated the task description for T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement).
Wed, Oct 10, 4:58 PM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics
elukey added a project to T203693: Update to CDH 6 or other up-to-date Hadoop distribution: User-Elukey.
Wed, Oct 10, 3:57 PM · User-Elukey, Analytics-Cluster, Analytics
elukey added a comment to T203693: Update to CDH 6 or other up-to-date Hadoop distribution.
As of 6.0 we (Cloudera) no longer support/build on debian:
https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_deprecated_items.html#concept_ylw_bc2_rbb
Sorry to disappoint.
We continue support for debian on 5.x though.
Wed, Oct 10, 3:57 PM · User-Elukey, Analytics-Cluster, Analytics
elukey updated the task description for T203693: Update to CDH 6 or other up-to-date Hadoop distribution.
Wed, Oct 10, 3:56 PM · User-Elukey, Analytics-Cluster, Analytics
elukey added a comment to T206542: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005.

Both changes merged, the space consumption should go down on both eventlog1002 and stat1005 after the next logrotate run. Keeping this task open to verify this.

Wed, Oct 10, 3:19 PM · Analytics-Kanban, Patch-For-Review, Analytics
elukey added a comment to T205814: Switch the main etcd cluster in eqiad to use conf1004-1006.

Opened https://phabricator.wikimedia.org/T206626 to fully decom conf100[1-3] (not in service anymore and with role::spare::system).

Wed, Oct 10, 12:42 PM · Patch-For-Review, Operations
elukey triaged T206626: Decommission conf100[1-3] as Normal priority.
Wed, Oct 10, 12:40 PM · ops-eqiad, decommission, Operations
elukey changed the status of Unknown Object (Task), a subtask of T198694: Q1 2018/19 Analytics procurement, from Stalled to Open.
Wed, Oct 10, 9:03 AM · Analytics-Kanban, User-Elukey
elukey added a comment to T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement).

@Cmjohnson quick question (might be wrong since I am a n00b with Juniper): is stat1007 in the analytics VLAN?

Wed, Oct 10, 8:48 AM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics
elukey moved T206542: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005 from Next Up to In Progress on the Analytics-Kanban board.
Wed, Oct 10, 8:02 AM · Analytics-Kanban, Patch-For-Review, Analytics
elukey claimed T206542: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005.
Wed, Oct 10, 8:01 AM · Analytics-Kanban, Patch-For-Review, Analytics
elukey edited projects for T203852: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement), added: User-Elukey; removed Patch-For-Review.
Wed, Oct 10, 7:55 AM · Patch-For-Review, User-Elukey, ops-eqiad, Operations, Analytics

Tue, Oct 9

elukey moved T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) from In Progress to Done on the User-Elukey board.
Tue, Oct 9, 6:53 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey triaged T206542: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005 as High priority.
Tue, Oct 9, 3:13 PM · Analytics-Kanban, Patch-For-Review, Analytics
elukey updated the task description for T198694: Q1 2018/19 Analytics procurement.
Tue, Oct 9, 12:37 PM · Analytics-Kanban, User-Elukey
elukey closed T192642: Upgrade Analytics infrastructure to Debian Stretch as Resolved.
Tue, Oct 9, 10:53 AM · User-Elukey, Analytics-Kanban
elukey moved T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) from Ready to Deploy to Done on the Analytics-Kanban board.
Tue, Oct 9, 10:52 AM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey triaged T206524: Decommission analytics1003 as Normal priority.
Tue, Oct 9, 10:49 AM · decommission, DC-Ops, User-Elukey, Analytics

Mon, Oct 8

elukey added a project to T206484: Manage Hue via systemd unit: User-Elukey.
Mon, Oct 8, 6:23 PM · User-Elukey, Analytics-Cluster, Operations, Analytics
elukey moved T206020: Logrotate of refinery rotating on size rather than time from In Code Review to Done on the Analytics-Kanban board.
Mon, Oct 8, 4:40 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey claimed T206331: Git push and pull don't complete.
Mon, Oct 8, 3:37 PM · User-Elukey, Analytics, Analytics-Wikistats
elukey added a comment to T203786: Mcrouter periodically reports soft TKOs for mc[1,2]035 leading to MW Memcached exceptions.

I re-examined the problem from a fresh start, and also tried to validate Joe's initial point about mcrouter not handling a TKO by removing the (failing) shard from the consistent hashing. I think that there are two main issues from what I can see:

Mon, Oct 8, 2:58 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Performance-Team (Radar), Wikimedia-production-error, Gadgets, User-Elukey, MediaWiki-Cache, Operations
elukey added a comment to T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh).

This morning we successfully moved the Druid clusters to an-coord1001; tomorrow we will do hive/oozie and the cron jobs.

Mon, Oct 8, 12:55 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey moved T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) from Backlog to In Progress on the User-Elukey board.
Mon, Oct 8, 12:54 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics

Fri, Oct 5

elukey moved T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) from In Progress to Ready to Deploy on the Analytics-Kanban board.
Fri, Oct 5, 3:26 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey added a comment to T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh).

The host has been set up with basic functionality, and all daemons and mariadb seem to be working fine. I also set up root/hive/oozie/druid users in mariadb.

Fri, Oct 5, 3:15 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey added a project to T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh): User-Elukey.
Fri, Oct 5, 3:12 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey added a comment to T203244: analytics1068 doesn't boot.

ping :)

Fri, Oct 5, 3:05 PM · ops-eqiad, Operations, Analytics
elukey added a comment to T206331: Git push and pull don't complete.
Last login: Wed Oct  3 10:23:32 2018 from 91.198.174.113
elukey@stat1005:~$ cat /etc/gitconfig
# vim: set ts=4 sw=4 et:
# This file is managed by Puppet!
# puppet:://modules/git/gitconfig.erb
# git::userconfig for 'git::systemconfig'
Fri, Oct 5, 3:00 PM · User-Elukey, Analytics, Analytics-Wikistats
elukey added a comment to T206331: Git push and pull don't complete.

Hi Erik, can you give me an example of a command that hangs? By default git should use the http_proxy configs on stat1005 (system property), but it might not work with your settings.

Fri, Oct 5, 2:59 PM · User-Elukey, Analytics, Analytics-Wikistats
elukey updated the task description for T192642: Upgrade Analytics infrastructure to Debian Stretch.
Fri, Oct 5, 1:21 PM · User-Elukey, Analytics-Kanban
elukey moved T204970: setup/install an-coord1001/wmf7621 from Backlog to Done on the User-Elukey board.
Fri, Oct 5, 1:21 PM · User-Elukey, Patch-For-Review, ops-eqiad, Analytics, Operations
elukey set the point value for T202962: Upgrade bohrium (piwik/matomo) to Debian Stretch to 8.
Fri, Oct 5, 12:02 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey moved T202962: Upgrade bohrium (piwik/matomo) to Debian Stretch from In Progress to Done on the Analytics-Kanban board.
Fri, Oct 5, 12:02 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey closed T206315: Decommission bohrium as Resolved.
Fri, Oct 5, 12:02 PM · Patch-For-Review, Operations, Analytics-Kanban, Analytics
elukey closed T206315: Decommission bohrium, a subtask of T202962: Upgrade bohrium (piwik/matomo) to Debian Stretch, as Resolved.
Fri, Oct 5, 12:02 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey set the point value for T206315: Decommission bohrium to 3.
Fri, Oct 5, 12:00 PM · Patch-For-Review, Operations, Analytics-Kanban, Analytics
elukey added a comment to T206315: Decommission bohrium.
elukey@ganeti1001:~$ sudo gnt-instance remove bohrium.eqiad.wmnet
This will remove the volumes of the instance bohrium.eqiad.wmnet
(including mirrors), thus removing all the data of the instance.
Continue?
y/[n]/?: y
Fri, Oct 5, 12:00 PM · Patch-For-Review, Operations, Analytics-Kanban, Analytics
elukey triaged T206315: Decommission bohrium as Normal priority.
Fri, Oct 5, 11:37 AM · Patch-For-Review, Operations, Analytics-Kanban, Analytics

Thu, Oct 4

elukey added a comment to T206020: Logrotate of refinery rotating on size rather than time.

We do have a logrotate config on an1003:

Thu, Oct 4, 5:05 PM · Patch-For-Review, Analytics-Kanban, Analytics
elukey updated subscribers of T203786: Mcrouter periodically reports soft TKOs for mc[1,2]035 leading to MW Memcached exceptions.

Also adding @aaron to get his opinion; I have no idea how to trace back which piece of code uses the key listed above :)

Thu, Oct 4, 4:39 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Performance-Team (Radar), Wikimedia-production-error, Gadgets, User-Elukey, MediaWiki-Cache, Operations
elukey moved T205509: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) from Next Up to In Progress on the Analytics-Kanban board.
Thu, Oct 4, 4:33 PM · User-Elukey, Patch-For-Review, Analytics-Kanban, Analytics
elukey edited projects for T206217: Allow Analytics team members to restart Turnilo and Superset, added: SRE-Access-Requests; removed hardware-requests.
Thu, Oct 4, 1:20 PM · Patch-For-Review, SRE-Access-Requests, Analytics, Operations
elukey triaged T206217: Allow Analytics team members to restart Turnilo and Superset as Normal priority.
Thu, Oct 4, 1:19 PM · Patch-For-Review, SRE-Access-Requests, Analytics, Operations

Wed, Oct 3

elukey added a comment to T200792: Run A/B test on page issues (Farsi, Japanese, Russian, English).

@elukey can we proceed with Swatting this patch or do you need anything more?

Wed, Oct 3, 4:21 PM · Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q2), Patch-For-Review, Page-Issue-Warnings, User-notice, Wikimedia-Site-requests
elukey added a comment to T204970: setup/install an-coord1001/wmf7621.

Assigning to Rob to see if anything needs to be done from the DC ops side before closing.

Wed, Oct 3, 2:33 PM · User-Elukey, Patch-For-Review, ops-eqiad, Analytics, Operations
elukey reassigned T204970: setup/install an-coord1001/wmf7621 from elukey to RobH.
Wed, Oct 3, 2:32 PM · User-Elukey, Patch-For-Review, ops-eqiad, Analytics, Operations
elukey updated the task description for T204970: setup/install an-coord1001/wmf7621.
Wed, Oct 3, 2:32 PM · User-Elukey, Patch-For-Review, ops-eqiad, Analytics, Operations
elukey added a comment to T200792: Run A/B test on page issues (Farsi, Japanese, Russian, English).

We basically care about the following rates:

Wed, Oct 3, 2:23 PM · Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q2), Patch-For-Review, Page-Issue-Warnings, User-notice, Wikimedia-Site-requests
elukey added a comment to T204970: setup/install an-coord1001/wmf7621.

After hitting F10 by mistake, the host got stuck several times in:

Unified Server Configurator does not support console redirection

After some reboots:

UEFI0019: Lifecycle Controller (LC) is unable to complete a requested task or
function and prevented the boot process from completing on multiple attempts.
LC is in Recovery Mode.
Repair Lifecycle Controller firmware using the Lifecycle Controller Dell Update
Package (DUP) or Lifecycle Controller Repair Package via iDRAC. For more
information, see Lifecycle Controller User's Guide.

Tried to follow https://wikitech.wikimedia.org/wiki/Platform-specific_documentation/Dell_PowerEdge_RN10#Unified_Server_Configurator_does_not_support_console_redirection but I didn't manage to hit Ctrl+E.

Wed, Oct 3, 1:46 PM · User-Elukey, Patch-For-Review, ops-eqiad, Analytics, Operations