Ottomata (Andrew Otto)
User

User Details

User Since
Oct 9 2014, 4:50 PM (185 w, 15 m)
Availability
Available
IRC Nick
ottomata
LDAP User
Ottomata
MediaWiki User
Unknown

Recent Activity

Today

Ottomata updated the task description for T192557: Reimage the Debian Jessie Analytics worker nodes to Stretch..
Thu, Apr 26, 3:44 PM · Analytics-Kanban, Patch-For-Review, Analytics
Ottomata updated the task description for T192557: Reimage the Debian Jessie Analytics worker nodes to Stretch..
Thu, Apr 26, 1:45 PM · Analytics-Kanban, Patch-For-Review, Analytics
Ottomata added a comment to T139487: Get 'sparklyr' working on stats1005.

Anyone can kill YARN jobs that they own:

Thu, Apr 26, 1:09 PM · Product-Analytics, Analytics, Discovery-Analysis

Yesterday

Ottomata added a comment to T192832: Upgrade to Stretch and Java 8 for Kafka main cluster.

Ok, I think we are ready to do this! If there are no objections, I'll start on codfw tomorrow.

Wed, Apr 25, 8:19 PM · Patch-For-Review, EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata set the point value for T193080: Enable snappy compression for eventbus Kafka producer to 3.
Wed, Apr 25, 8:09 PM · Services (watching), Patch-For-Review, Analytics, EventBus
Ottomata created T193080: Enable snappy compression for eventbus Kafka producer.
Wed, Apr 25, 8:09 PM · Services (watching), Patch-For-Review, Analytics, EventBus
Ottomata updated the task description for T192557: Reimage the Debian Jessie Analytics worker nodes to Stretch..
Wed, Apr 25, 6:54 PM · Analytics-Kanban, Patch-For-Review, Analytics
Ottomata added a comment to T192839: [Spike] Plan ingress system to use with new Kafka topic for landing page and impression data.

As in, maybe we should change as little as possible, and instead think about how to get a much better system in the medium-term?

Possibly! A huge part of the Event Data Platform is to make it easier to get events into different stores, including MySQL, most likely using Kafka Connect.

Wed, Apr 25, 5:44 PM · Fundraising Sprint Ivory and eggshell white are the same color, Fundraising Sprint HTTP originally stood for Happy Turtle Transfer Protocol, Fundraising-Backlog
Ottomata added a comment to T189137: Migrate CirrusSearch jobs to Kafka queue.

I don't love it! I feel like 4Mb is already huge. Consider troubleshooting some problem with kafkacat -C | jq .. Gotta consume individual 4Mb messages.

Wed, Apr 25, 3:15 PM · Patch-For-Review, Services (doing), MediaWiki-JobQueue, ChangeProp, EventBus, Operations, User-Joe, Analytics, User-Elukey
Ottomata updated the task description for T192557: Reimage the Debian Jessie Analytics worker nodes to Stretch..
Wed, Apr 25, 2:22 PM · Analytics-Kanban, Patch-For-Review, Analytics
Ottomata added a comment to T189137: Migrate CirrusSearch jobs to Kafka queue.

I already feel like 4Mb messages are a lot, and would much prefer not to increase the max message size more. Can these jobs be split up?

Wed, Apr 25, 2:16 PM · Patch-For-Review, Services (doing), MediaWiki-JobQueue, ChangeProp, EventBus, Operations, User-Joe, Analytics, User-Elukey
Ottomata added a comment to T191859: [EPIC] Reading List Sync service analytics.

One nit! Remember that your json field names are going to be directly mapped to caseless SQL column names, so please avoid using camelCase when you can. snake_case is much better. E.g. app_install_id cross_device_id, etc. :)

Wed, Apr 25, 1:36 PM · Product-Analytics, Analytics, Privacy, Reading-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, Reading List Service
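
A minimal sketch of the snake_case advice in the comment above, assuming a client that currently emits camelCase JSON keys. The helper name to_snake_case and the camelCase inputs (appInstallId, crossDeviceId) are hypothetical; only the snake_case targets come from the comment.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase or PascalCase field name to snake_case."""
    # Put an underscore between a lowercase letter or digit and the capital that follows it.
    step = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalized word, e.g. "HTTPServer".
    step = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", step)
    return step.lower()


# Hypothetical camelCase inputs; names that are already snake_case pass through unchanged.
for field in ("appInstallId", "crossDeviceId", "app_install_id"):
    print(field, "->", to_snake_case(field))
```

Renaming at the schema level is still preferable to converting afterwards, since two camelCase names that differ only by case would collide in a caseless SQL column namespace.
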
Ottomata updated subscribers of T192819: Event Logging schemas for Wikipedia iOS app.
Wed, Apr 25, 1:32 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog
Ottomata added a comment to T192819: Event Logging schemas for Wikipedia iOS app.

We might need some input from another analytics team member (@joal ?) about how this would work in Druid. Druid is usually (but maybe not always) inherently time series, so the idea of putting an updated state into it seems a little weird. I might be wrong though.

Wed, Apr 25, 1:32 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog

Tue, Apr 24

Ottomata moved T192832: Upgrade to Stretch and Java 8 for Kafka main cluster from Next Up to In Progress on the Analytics-Kanban board.
Tue, Apr 24, 8:40 PM · Patch-For-Review, EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata moved T192831: Use profile and prometheus for role::kafka::main::broker from Next Up to Done on the Analytics-Kanban board.
Tue, Apr 24, 8:40 PM · Patch-For-Review, EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata added a comment to T192819: Event Logging schemas for Wikipedia iOS app.

BTW, don't forget you probably want is_anon and primary_language, etc.! I know you are trying to be consistent, but we should fix the old bad usage asap. Mixed case doesn't work well in SQL systems.

Tue, Apr 24, 7:01 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog
Ottomata added a comment to T192819: Event Logging schemas for Wikipedia iOS app.

I was trying to avoid this much simpler design just because I don't want to send all the properties every time when only one of them has changed

Tue, Apr 24, 6:59 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog
Ottomata added a comment to T191859: [EPIC] Reading List Sync service analytics.

BTW, just checking that you are synced up with the iOS team on this. It sounds like you are both trying to do similar things! See T192819.

Tue, Apr 24, 4:31 PM · Product-Analytics, Analytics, Privacy, Reading-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, Reading List Service
Ottomata added a comment to T191859: [EPIC] Reading List Sync service analytics.

would be much easier and simpler to do on the backend side.

Tue, Apr 24, 4:29 PM · Product-Analytics, Analytics, Privacy, Reading-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, Reading List Service
Ottomata added a comment to T192819: Event Logging schemas for Wikipedia iOS app.

The properties field is a nested data structure, so in a Hive table, you'll get a struct field, and be able to access the fields like:
SELECT event.properties.readlinglist_count or SELECT event.properties.*, etc. We recommend flat structures because there are many systems (like Druid), where nested fields aren't supported. (We actually have to flatten any EventLogging data for Druid anyway, because technically all schema fields are nested, since they are enclosed in the capsule as the event field.) We can support nested fields just fine, but they might not be very future proof, so we recommend in the schema guidelines that you usually avoid them.

Tue, Apr 24, 2:02 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog
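
To illustrate what flattening a nested event means in the comment above, here is a rough sketch. It is not the actual EventLogging-to-Druid code: the flatten helper, the underscore separator, and the sample event with its field names are all assumptions made for the example.

```python
def flatten(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Collapse nested dicts into a single level, joining key paths with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested structures, carrying the key path along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical event: the schema fields sit under "event", as in the capsule described above.
example = {
    "event": {
        "action": "sync",
        "properties": {"list_count": 3, "is_anon": False},
    }
}

print(flatten(example))
```

A schema that is flat to begin with needs no such step, which is the point of the guideline mentioned in the comment.
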

Mon, Apr 23

Ottomata added a comment to T192819: Event Logging schemas for Wikipedia iOS app.

Yes, we will follow ISO 8601 with a minor deviation

That deviation (using the timezone) is fine, but let's call the field something with dt in the name then. We try to use 'ts' when it is an integer timestamp. event_dt makes the most sense to me, and is consistent with the concept of 'event time' in general, not just 'client time'.

Mon, Apr 23, 7:57 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog
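
A small sketch of the dt versus ts naming distinction from the comment above, assuming 'ts' means integer Unix seconds and 'dt' means an ISO 8601 string carrying a UTC offset. The function names and the exact string layout are assumptions, not the iOS team's actual format.

```python
from datetime import datetime, timezone


def ts_to_dt(ts: int) -> str:
    """Render an integer Unix timestamp (seconds) as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def dt_to_ts(dt: str) -> int:
    """Parse an ISO 8601 string with a numeric UTC offset into integer Unix seconds."""
    return int(datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S%z").timestamp())


# An event record would carry one or the other, named accordingly.
print({"event_ts": 0, "event_dt": ts_to_dt(0)})
print(dt_to_ts("1970-01-01T01:00:00+0100"))
```
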
Ottomata triaged T192832: Upgrade to Stretch and Java 8 for Kafka main cluster as Normal priority.
Mon, Apr 23, 6:50 PM · Patch-For-Review, EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata triaged T192831: Use profile and prometheus for role::kafka::main::broker as Normal priority.
Mon, Apr 23, 6:48 PM · Patch-For-Review, EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata updated the task description for T167039: Upgrade Kafka on main cluster with security features.
Mon, Apr 23, 5:44 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata added a comment to T192819: Event Logging schemas for Wikipedia iOS app.

I think A-team is drafting a more comprehensive response, but here's a few quick ones:

Mon, Apr 23, 5:15 PM · iOS-app-v5.8.1-Manatee-On-A-Scootscoot, Product-Analytics, Wikipedia-iOS-App-Backlog
Ottomata set the point value for T192760: Please install JSON.pm at stat1005 for Wikistats_1 to 1.
Mon, Apr 23, 2:55 PM · Analytics-Kanban, Analytics-Wikistats
Ottomata moved T192387: Switch to --new.consumer for main -> analytics MirrorMaker from Ready to Deploy to Done on the Analytics-Kanban board.
Mon, Apr 23, 2:54 PM · Patch-For-Review, Analytics-Kanban, Analytics
Ottomata moved T192760: Please install JSON.pm at stat1005 for Wikistats_1 from Next Up to Done on the Analytics-Kanban board.
Mon, Apr 23, 1:47 PM · Analytics-Kanban, Analytics-Wikistats
Ottomata edited projects for T192760: Please install JSON.pm at stat1005 for Wikistats_1, added: Analytics-Kanban; removed Analytics.
Mon, Apr 23, 1:47 PM · Analytics-Kanban, Analytics-Wikistats
Ottomata added a comment to T192760: Please install JSON.pm at stat1005 for Wikistats_1.

Ah great! Was going to ask for a phab task. https://gerrit.wikimedia.org/r/#/c/428337/

Mon, Apr 23, 1:47 PM · Analytics-Kanban, Analytics-Wikistats
Ottomata added a comment to T192640: Reimage stat1004 with Debian Stretch.

Into it, but I think I'd like to wait until we get all the workers upgraded to stretch before we do stat1004.

Mon, Apr 23, 1:44 PM · Analytics, User-Elukey

Thu, Apr 19

Ottomata moved T192387: Switch to --new.consumer for main -> analytics MirrorMaker from Next Up to Ready to Deploy on the Analytics-Kanban board.
Thu, Apr 19, 4:16 PM · Patch-For-Review, Analytics-Kanban, Analytics
Ottomata added a comment to T136732: Puppetize job that saves old versions of Maxmind geoIP database.

I don't have much context of how geowiki runs, but storing this in HDFS would be fine. We (I?) just thought it would be better to use some non analytics based way of doing this, and since geoip already comes from Puppet, we just thought of expanding that there.

Thu, Apr 19, 2:29 PM · Puppet, Patch-For-Review, Analytics-Kanban
Ottomata added a comment to T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets.

Gonna ping @JAllemandou on this one ^ :)

Thu, Apr 19, 1:17 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering

Wed, Apr 18

Ottomata updated subscribers of T167039: Upgrade Kafka on main cluster with security features.
Wed, Apr 18, 4:40 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata updated the task description for T167039: Upgrade Kafka on main cluster with security features.
Wed, Apr 18, 4:40 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata added a comment to T136732: Puppetize job that saves old versions of Maxmind geoIP database.

That'd be fine!

Wed, Apr 18, 4:05 PM · Puppet, Patch-For-Review, Analytics-Kanban
Ottomata added a comment to T136732: Puppetize job that saves old versions of Maxmind geoIP database.

We could do that, but we wanted something centralized and reproducible (e.g. include a puppet class, get the historical dbs). We would have just put this as is in gerrit and auto-committed to it, but we can't host it anywhere publicly, since we pay for these files.

Wed, Apr 18, 3:16 PM · Puppet, Patch-For-Review, Analytics-Kanban
Ottomata updated subscribers of T192387: Switch to --new.consumer for main -> analytics MirrorMaker.
Wed, Apr 18, 2:20 PM · Patch-For-Review, Analytics-Kanban, Analytics
Ottomata added a comment to T136732: Puppetize job that saves old versions of Maxmind geoIP database.

@fdans, puppet will do that.

Wed, Apr 18, 2:05 PM · Puppet, Patch-For-Review, Analytics-Kanban
Ottomata closed T183935: rack/setup/install notebook100[34] as Resolved.
Wed, Apr 18, 2:03 PM · Patch-For-Review, Analytics, Operations
Ottomata added a comment to T164008: Update druid to 0.10.

K sounds good. I'd go for 0.11 (after labs testing), but if you prefer to 0.10 first, that sounds fine too.

Wed, Apr 18, 1:57 PM · Analytics-Kanban, User-Elukey, Analytics, Patch-For-Review
Ottomata added a comment to T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets.

Whaa? I did the very cumin rm apt-get update thing you did yesterday! Puppet ran everywhere fine then...or at least I thought it did.

Wed, Apr 18, 1:53 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering
Ottomata added a comment to T136732: Puppetize job that saves old versions of Maxmind geoIP database.

Great! 17G is a little big for puppetmasters as is now, but we can ask ops if we can expand the partition, or add another one. We'll talk about this with them today.

Wed, Apr 18, 1:45 PM · Puppet, Patch-For-Review, Analytics-Kanban

Tue, Apr 17

Ottomata moved T192387: Switch to --new.consumer for main -> analytics MirrorMaker from Incoming to Kafka Work on the Analytics board.
Tue, Apr 17, 6:00 PM · Patch-For-Review, Analytics-Kanban, Analytics
Ottomata triaged T192387: Switch to --new.consumer for main -> analytics MirrorMaker as Normal priority.
Tue, Apr 17, 6:00 PM · Patch-For-Review, Analytics-Kanban, Analytics
Ottomata added a comment to T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches.

Think so! We can reopen if there are more problems.

Tue, Apr 17, 5:57 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
Ottomata moved T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets from Next Up to Done on the Analytics-Kanban board.
Tue, Apr 17, 5:55 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering
Ottomata edited projects for T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets, added: Analytics-Kanban; removed Analytics.
Tue, Apr 17, 5:55 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering
Ottomata claimed T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets.
Tue, Apr 17, 5:55 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering
Ottomata added a comment to T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets.

Ok! Got it. Should be good now.

Tue, Apr 17, 5:54 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering
Ottomata added a comment to T192348: SparkR on Spark 2.3.0 - Testing on Large Data Sets.

Still working on this, am having versioning problems on different nodes...

Tue, Apr 17, 4:58 PM · User-GoranSMilovanovic, Analytics-Kanban, Patch-For-Review, WMDE-Analytics-Engineering
Ottomata claimed T167039: Upgrade Kafka on main cluster with security features.
Tue, Apr 17, 3:15 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata moved T189464: Fix Mirror Maker erratic behavior when replicating from main-eqiad to jumbo from Next Up to Paused on the Analytics-Kanban board.
Tue, Apr 17, 3:14 PM · Analytics-Kanban, Patch-For-Review, User-Elukey, Analytics, Analytics-Cluster
Ottomata moved T189464: Fix Mirror Maker erratic behavior when replicating from main-eqiad to jumbo from In Progress to Next Up on the Analytics-Kanban board.
Tue, Apr 17, 3:14 PM · Analytics-Kanban, Patch-For-Review, User-Elukey, Analytics, Analytics-Cluster
Ottomata moved T167039: Upgrade Kafka on main cluster with security features from Next Up to In Progress on the Analytics-Kanban board.
Tue, Apr 17, 3:14 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata added a comment to T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches.

Yeah, rats. Madhu's original rsync crons used the --delete flag. I disabled that flag in https://gerrit.wikimedia.org/r/426931, but by that time it had already run once. I'm now rsyncing pagecounts-ez over (without --delete) from (old) dataset1001 to restore anything that was there before.

Tue, Apr 17, 1:37 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown

Mon, Apr 16

Ottomata added a comment to T167039: Upgrade Kafka on main cluster with security features.

For reference, just tested in labs. MirrorMaker 0.9 works (but is flaky and buggy) with both 0.9 and 1.x brokers, but 1.x MirrorMaker will only work with 1.x brokers, not 0.9.

Mon, Apr 16, 8:43 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata updated the task description for T175461: Port Kafka clients to new jumbo cluster.
Mon, Apr 16, 8:10 PM · Analytics, Patch-For-Review, Analytics-Cluster
Ottomata added a comment to T188136: Migrate Mediawiki Monolog Kafka producer to Kafka Jumbo.

Ah @elukey I looked more into this and remembered that this might actually cause TCP issues after all. I think we should not move this to jumbo yet, but first look into getting a new PHP Kafka client deployed. Let's hold on that though, and focus on the Kafka main and MirrorMaker stuff. This can be our last holdout on Kafka analytics, and we don't need MirrorMaker for this, so it doesn't block us.

Mon, Apr 16, 8:10 PM · Analytics, Patch-For-Review, Analytics-Kanban, Analytics-Cluster
Ottomata added a parent task for T189618: Investigate group.initial.rebalance.delay.ms Kafka setting: T167039: Upgrade Kafka on main cluster with security features.
Mon, Apr 16, 8:02 PM · Services (blocked), User-Elukey, EventBus, Analytics
Ottomata added a subtask for T167039: Upgrade Kafka on main cluster with security features: T189618: Investigate group.initial.rebalance.delay.ms Kafka setting.
Mon, Apr 16, 8:02 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata merged T190853: Upgrade main Kafka clusters to 1.0 into T167039: Upgrade Kafka on main cluster with security features.
Mon, Apr 16, 8:00 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata merged task T190853: Upgrade main Kafka clusters to 1.0 into T167039: Upgrade Kafka on main cluster with security features.
Mon, Apr 16, 8:00 PM · Analytics
Ottomata added a comment to T177927: Refactor kafka_config.rb and and kafka_cluster_name.rb in puppet to avoid explicit hiera calls.

Hm, I just thought about this a little bit, and I'm not so sure we should do it. The hiera info that this function is looking up is always at the global common.yaml level. There is never a case in a given environment (e.g. labs vs prod) where the value of kafka_clusters is different per role or node or whatever.

Mon, Apr 16, 7:59 PM · Analytics, Traffic, Operations, User-Elukey
Ottomata set the point value for T190940: Use --new.consumer for main codfw <-> eqiad Kafka MirrorMaker to 8.
Mon, Apr 16, 7:27 PM · Patch-For-Review, Analytics-Kanban, EventBus, Services (watching), Analytics
Ottomata moved T190940: Use --new.consumer for main codfw <-> eqiad Kafka MirrorMaker from Paused to Done on the Analytics-Kanban board.
Mon, Apr 16, 7:27 PM · Patch-For-Review, Analytics-Kanban, EventBus, Services (watching), Analytics
Ottomata added a comment to T182163: Update to latest kafkacat.

That timeline sounds fine to me!

Mon, Apr 16, 2:46 PM · Analytics, Services (watching)
Ottomata added a comment to T189530: Possible statsv corruption?.

Not sure I totally understand the problem. What is an example of a good metric name and a corrupted one here? You are saying that metric names are coming in with dots in the name as they should, and sometimes something is replacing those dots with underscores, but not all of the time?

Mon, Apr 16, 2:46 PM · Performance-Team (Radar), Analytics
Ottomata added a comment to T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches.

The cron had been disabled, because the source locations didn't exist and were breaking things. I just reenabled them.

Mon, Apr 16, 2:44 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown

Thu, Apr 12

Ottomata added a comment to T167039: Upgrade Kafka on main cluster with security features.

The new producer in MirrorMaker 1.x is not compatible with old 0.9 brokers. So once this upgrade happens, we can no longer mirror to the old Kafka analytics cluster. What to do...?

Thu, Apr 12, 8:13 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata updated the task description for T167039: Upgrade Kafka on main cluster with security features.
Thu, Apr 12, 8:12 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata updated the task description for T167039: Upgrade Kafka on main cluster with security features.
Thu, Apr 12, 7:53 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata updated the task description for T167039: Upgrade Kafka on main cluster with security features.
Thu, Apr 12, 7:34 PM · EventBus, Services (watching), Analytics-Kanban, Analytics
Ottomata added a comment to T190940: Use --new.consumer for main codfw <-> eqiad Kafka MirrorMaker.

We had planned to pause this until after the main Kafka upgrade in T167039, but from https://kafka.apache.org/documentation/#upgrade_1_1_0:

Thu, Apr 12, 7:15 PM · Patch-For-Review, Analytics-Kanban, EventBus, Services (watching), Analytics
Ottomata moved T188025: Create refinery-spark package from In Code Review to Done on the Analytics-Kanban board.
Thu, Apr 12, 7:08 PM · Patch-For-Review, Analytics-Kanban
Ottomata moved T190400: Review changes to /etc/java-8-openjdk/security/java.security in Kafka from u162 update from In Code Review to Done on the Analytics-Kanban board.
Thu, Apr 12, 7:08 PM · Analytics-Kanban, Patch-For-Review, Operations
Ottomata moved T183145: Refresh SWAP notebook hardware from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Apr 12, 7:03 PM · Patch-For-Review, Analytics-Kanban
Ottomata updated the task description for T192103: Decommission notebook1001.
Thu, Apr 12, 7:02 PM · hardware-requests, Operations
Ottomata renamed T192103: Decommission notebook1001 from Decommission notebook100[12] to Decommission notebook1001.
Thu, Apr 12, 6:52 PM · hardware-requests, Operations
Ottomata added a subtask for T183145: Refresh SWAP notebook hardware: T192103: Decommission notebook1001.
Thu, Apr 12, 6:52 PM · Patch-For-Review, Analytics-Kanban
Ottomata added a parent task for T192103: Decommission notebook1001: T183145: Refresh SWAP notebook hardware.
Thu, Apr 12, 6:52 PM · hardware-requests, Operations
Ottomata renamed T192103: Decommission notebook1001 from Reclaim/Decommission (specify) hostname[S] to Decommission notebook100[12].
Thu, Apr 12, 6:51 PM · hardware-requests, Operations
Ottomata created T192103: Decommission notebook1001.
Thu, Apr 12, 6:51 PM · hardware-requests, Operations
Ottomata moved T190940: Use --new.consumer for main codfw <-> eqiad Kafka MirrorMaker from Ready to Deploy to Paused on the Analytics-Kanban board.
Thu, Apr 12, 6:50 PM · Patch-For-Review, Analytics-Kanban, EventBus, Services (watching), Analytics
Ottomata updated the task description for T159962: Spark 2 as cluster default (working with oozie).
Thu, Apr 12, 6:48 PM · Patch-For-Review, Analytics-Kanban
Ottomata added a comment to T183145: Refresh SWAP notebook hardware.

But in any case I can't get pyhive to work either right now

Thu, Apr 12, 6:33 PM · Patch-For-Review, Analytics-Kanban
Ottomata updated the task description for T191101: Let's do this: Rollout page previews to 100% of anons on English Wikipedia.
Thu, Apr 12, 1:25 PM · Readers-Web-Kanbanana-Board, RESTBase-API, Services (doing), Patch-For-Review, Wikimedia-Site-requests, Page-Previews, Readers-Web-Backlog
Ottomata added a comment to T192005: Disable MirrorMaker for job queue events.

Ok, but we should keep (other?) change-prop topics mirrored?

Thu, Apr 12, 1:22 PM · Services (done), Analytics-Kanban, Analytics, ChangeProp, MediaWiki-JobQueue, EventBus
Ottomata added a comment to T191464: Enable CP4JQ support for private wikis.

+1 :) To clarify: we haven't officially scheduled any MirrorMaker TLS work, but after we upgrade main Kafka clusters, it should be relatively easy to set this up.

Thu, Apr 12, 1:19 PM · Services (done), MW-1.31-release-notes (WMF-deploy-2018-04-10 (1.31.0-wmf.29)), Analytics, ChangeProp, MediaWiki-JobQueue, EventBus
Ottomata added a comment to T192005: Disable MirrorMaker for job queue events.

This goes for change-prop too, right? So blacklist .+\.(job|change-prop)\..+?

Thu, Apr 12, 1:18 PM · Services (done), Analytics-Kanban, Analytics, ChangeProp, MediaWiki-JobQueue, EventBus
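
A quick way to sanity-check the blacklist pattern proposed in the comment above, sketched with Python's re module. The assumptions here: the trailing "?" in the comment is sentence punctuation rather than part of the regex, the pattern must match the whole topic name, and the topic names below are made up for illustration.

```python
import re

# Pattern from the comment, without the trailing question mark.
BLACKLIST = re.compile(r".+\.(job|change-prop)\..+")


def is_mirrored(topic: str) -> bool:
    """Return True when the topic does NOT match the blacklist and would still be mirrored."""
    return BLACKLIST.fullmatch(topic) is None


# Hypothetical topic names, for illustration only.
for topic in (
    "eqiad.mediawiki.job.refreshLinks",
    "eqiad.change-prop.retry.mediawiki.revision-create",
    "eqiad.mediawiki.revision-create",
):
    print(topic, "mirrored" if is_mirrored(topic) else "blacklisted")
```

Note that the pattern requires at least one character and a dot before job or change-prop, so a topic that starts with "job." would not be blacklisted under these assumptions.
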

Wed, Apr 11

Ottomata added a comment to T191464: Enable CP4JQ support for private wikis.

Timeline for upgrading main is Q4, but MirrorMaker +TLS wasn't in the plan. I don't think we should block your work though (private cross DC data has happened for a long time, we just have to get incrementally better).

Wed, Apr 11, 3:23 PM · Services (done), MW-1.31-release-notes (WMF-deploy-2018-04-10 (1.31.0-wmf.29)), Analytics, ChangeProp, MediaWiki-JobQueue, EventBus
Ottomata added a comment to T191464: Enable CP4JQ support for private wikis.

Hm, however, we’re trying to make internal ‘private’ cross DC data all go over TLS. If we do this, we would want to have TLS enabled for main-eqiad <-> main-codfw MirrorMaker instance. To do that we need to upgrade main Kafkas first.

Wed, Apr 11, 1:18 PM · Services (done), MW-1.31-release-notes (WMF-deploy-2018-04-10 (1.31.0-wmf.29)), Analytics, ChangeProp, MediaWiki-JobQueue, EventBus
Ottomata added a comment to T191464: Enable CP4JQ support for private wikis.

Hm, currently the data we import into Hadoop is readable by anyone with a Hadoop account (not just analytics-privatedata-users) (but overlap of Hadoop by non privatedata-users is very small). We could change the permissions on the event database (or even possibly just a few job topic tables) to be readable only by analytics-privatedata-users.

Wed, Apr 11, 1:17 PM · Services (done), MW-1.31-release-notes (WMF-deploy-2018-04-10 (1.31.0-wmf.29)), Analytics, ChangeProp, MediaWiki-JobQueue, EventBus

Tue, Apr 10

Ottomata moved T189464: Fix Mirror Maker erratic behavior when replicating from main-eqiad to jumbo from Done to In Progress on the Analytics-Kanban board.
Tue, Apr 10, 8:37 PM · Analytics-Kanban, Patch-For-Review, User-Elukey, Analytics, Analytics-Cluster
Ottomata added a comment to T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches.

ping @ezachte

Tue, Apr 10, 2:54 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
Ottomata added a comment to T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches.

Or, more correctly, he hasn't updated his jobs to write the files to /srv/dumps on stat1005?

Tue, Apr 10, 2:53 PM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown

Mon, Apr 9

Ottomata moved T191567: Updating Wikistats 2.0 github Readme from Ready to Deploy to Done on the Analytics-Kanban board.
Mon, Apr 9, 3:10 PM · Analytics-Kanban, Patch-For-Review, Analytics, Analytics-Wikistats
Ottomata assigned T177460: Add the prometheus jmx exporter to all the Zookeeper daemons to elukey.
Mon, Apr 9, 3:03 PM · Analytics-Kanban, Patch-For-Review, User-Elukey, Analytics