Ottomata (Andrew Otto)
User

User Details

User Since
Oct 9 2014, 4:50 PM (232 w, 1 d)
Availability
Available
IRC Nick
ottomata
LDAP User
Ottomata
MediaWiki User
Ottomata

Recent Activity

Yesterday

Ottomata created T219032: EventGate should be able to configure hasty and guaranteed kafka producers individually.
Fri, Mar 22, 8:25 PM · Services (watching), EventBus, Analytics
Ottomata committed rDEPLOYCHARTS425b123c803a: eventgate-analytics - Incorrectly indexed 0.0.17, use 0.0.18 instead (authored by Ottomata).
eventgate-analytics - Incorrectly indexed 0.0.17, use 0.0.18 instead
Fri, Mar 22, 7:52 PM
Ottomata committed rDEPLOYCHARTSdc00c54c2801: eventgate-analytics - set broker.address.family: v4 to workaround k8s IPv6 issue (authored by Ottomata).
eventgate-analytics - set broker.address.family: v4 to workaround k8s IPv6 issue
Fri, Mar 22, 7:28 PM
Ottomata committed rDEPLOYCHARTSa549152be83d: eventgate-analytics - set broker.address.family: v4 to workaround k8s IPv6 issue (authored by Ottomata).
eventgate-analytics - set broker.address.family: v4 to workaround k8s IPv6 issue
Fri, Mar 22, 7:28 PM
Ottomata added a comment to T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.

librdkafka will use the system resolver to resolve the broker hostname. [...] librdkafka will, in a round-robin fashion, attempt to connect to all addresses the hostname resolves to. If the broker is only listening to the IPv4 address then the client's connection attempt to the IPv6 address will fail.

To limit the address families the client connects to, set the broker.address.family configuration property to v4 or v6.

Fri, Mar 22, 7:02 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
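
As a rough illustration of the broker.address.family setting quoted in the comment above, here is a minimal Python sketch. The bootstrap hostname, the port, and the helper candidate_addresses are placeholders invented for this example. The helper uses Python's standard resolver to show which addresses remain once the family is restricted; it is not how librdkafka is implemented internally.

```python
import socket

# Values accepted by librdkafka's broker.address.family, mapped to the
# socket families used by the standard resolver (mapping is illustrative).
FAMILY_FOR_SETTING = {
    "any": socket.AF_UNSPEC,
    "v4": socket.AF_INET,
    "v6": socket.AF_INET6,
}


def candidate_addresses(host, port, address_family="any"):
    """Return (family, ip) pairs the resolver yields for host:port.

    Hypothetical helper: restricting the family drops the addresses a
    client would otherwise also try (e.g. IPv6 when only IPv4 is served).
    """
    family = FAMILY_FOR_SETTING[address_family]
    infos = socket.getaddrinfo(host, port, family=family, type=socket.SOCK_STREAM)
    return [(info[0], info[4][0]) for info in infos]


# Producer configuration fragment; the bootstrap server is a placeholder.
producer_config = {
    "bootstrap.servers": "kafka-broker.example.org:9092",
    "broker.address.family": "v4",  # only attempt IPv4 connections
}

if __name__ == "__main__":
    # A numeric address avoids any DNS lookup in this demonstration.
    print(candidate_addresses("127.0.0.1", 9092, "v4"))
```

With "any", a dual-stack hostname would typically yield both IPv4 and IPv6 entries, which matches the round-robin connection attempts described in the quote.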
Ottomata added a comment to T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.

Ohooohooooo! Could this be related to IPv6? On a successful run, here's the connect message for kafka-jumbo1003:

Fri, Mar 22, 6:31 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata added a comment to T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.

Interesting fact: kafka-jumbo1003 and kafka-jumbo1006 are the only brokers in this cluster that are not in Row A or Row B. kubestage1001 is in Row A and kubestage1002 is in Row B. Could there be some very sneaky and intermittent network troubles between k8s nodes and hosts in different rows?

Fri, Mar 22, 5:38 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata added a comment to T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.

I tcpdumped while reproducing this on kubestage1001 and the kafka brokers. I could see the looped metadata requests going through fine, but the topic partition leader on kafka-jumbo1003 did not receive any traffic between the initial producer connection and the produce request that is finally sent about 2 minutes after the HTTP POST. Here are the tcpdump lines, 2 minutes apart, from kubestage1001:

Fri, Mar 22, 5:24 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata added a comment to T213193: Migrate changeprop to kubernetes.

BTW, I'm currently having a node-rdkafka + k8s related issue in EventGate: T218268: eventgate-analytics k8s pods occasionally can't produce to kafka. It is possible that there is some EventGate specific code causing this problem, but at the moment it only happens in k8s. I'd expect change-prop to exhibit this problem as well.

Fri, Mar 22, 4:58 PM · Patch-For-Review, Services (watching), Release-Engineering-Team (Next), Release Pipeline, serviceops, ChangeProp
Ottomata added a project to T218617: Fix EventLogging schemas that use array for items type: Analytics-EventLogging.
Fri, Mar 22, 4:11 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata claimed T218617: Fix EventLogging schemas that use array for items type.
Fri, Mar 22, 4:11 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata moved T218617: Fix EventLogging schemas that use array for items type from Next Up to In Progress on the Analytics-Kanban board.
Fri, Mar 22, 4:03 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata added a project to T218617: Fix EventLogging schemas that use array for items type: Analytics-Kanban.
Fri, Mar 22, 4:03 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata moved T218268: eventgate-analytics k8s pods occasionally can't produce to kafka from Backlog to In Progress on the EventBus board.
Fri, Mar 22, 3:47 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata added a comment to T218617: Fix EventLogging schemas that use array for items type.

I can make the EventLogging extension accept the proper items schema object format, but I can't seem to make it properly validate instances against it. I really don't think we should be implementing and debugging a custom JSONSchema validator implementation. Can't wait til we get rid of EventLogging! AHHAHGH.

Fri, Mar 22, 3:09 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata added a comment to T218617: Fix EventLogging schemas that use array for items type.
public function getSequenceChildRef( $i ) {
    // TODO: make this conform to draft-03 by also allowing single object

https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/includes/JsonSchema.php#L454

Fri, Mar 22, 3:02 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata added a comment to T218617: Fix EventLogging schemas that use array for items type.

So, wow. EventLogging's schemaschema.json is forcing the array items keyword to itself be an array, which only validates that each element in the instance array matches the item type in the schema items array. E.g.

Fri, Mar 22, 2:19 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
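
To make the two forms of items discussed in the comment above concrete, here is a minimal Python sketch of how JSON Schema draft-03 is generally read: a single schema object applies to every element, while an array of schemas is matched position by position, with extra elements left unconstrained by default. The function names and the tiny set of supported types are assumptions for this example; this is not EventLogging's validator, which may behave differently.

```python
# Minimal type checks for a handful of JSON Schema primitive types.
# bool is excluded from the numeric checks because bool subclasses int in Python.
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "any": lambda v: True,
}


def matches_type(value, schema):
    """Check value against the 'type' keyword of a schema (default: any)."""
    return _TYPE_CHECKS[schema.get("type", "any")](value)


def validate_array(instance, items):
    """Validate a list against an 'items' keyword in either form.

    - dict: the one schema applies to every element.
    - list: positional matching; elements beyond the list are not checked
      (the default when additionalItems is not set).
    """
    if isinstance(items, dict):
        return all(matches_type(v, items) for v in instance)
    return all(matches_type(v, s) for v, s in zip(instance, items))


if __name__ == "__main__":
    data = ["a", 1, 2]
    # Array form: only position 0 is constrained, so the integers pass.
    print(validate_array(data, [{"type": "string"}]))
    # Object form: every element must be a string, so this fails.
    print(validate_array(data, {"type": "string"}))
```

The difference matters for schemas that intend "a list of strings": only the single-object form expresses that for every element.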

Thu, Mar 21

Ottomata added a comment to T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.

I don't know much more, but I have a lot more data!

Thu, Mar 21, 10:59 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata added a comment to T218617: Fix EventLogging schemas that use array for items type.

HUH! I wonder if this is why the items are arrays......is EventLogging extension enforcing this with its weirdo JSONSchema? Will check...

Thu, Mar 21, 9:28 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata added a comment to T218617: Fix EventLogging schemas that use array for items type.

IIUC this only requires changes on the schema on meta wiki

Thu, Mar 21, 8:18 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata committed rDEPLOYCHARTS81f6dc926ba8: eventgate-analytics - allow for extra app config in values (authored by Ottomata).
eventgate-analytics - allow for extra app config in values
Thu, Mar 21, 3:29 PM
Ottomata committed rDEPLOYCHARTSf9517f22619d: eventgate-analytics Fix misplaced '_histogram' metric suffix (authored by Ottomata).
eventgate-analytics Fix misplaced '_histogram' metric suffix
Thu, Mar 21, 12:35 AM
Ottomata committed rDEPLOYCHARTSa33a23dbacb2: eventgate-analytics Set rdkafka log.connection.close: false (authored by Ottomata).
eventgate-analytics Set rdkafka log.connection.close: false
Thu, Mar 21, 12:35 AM
Ottomata committed rDEPLOYCHARTS26adacc88588: eventgate-analytics - remove confusing '_histogram' suffix from summary… (authored by Ottomata).
eventgate-analytics - remove confusing '_histogram' suffix from summary…
Thu, Mar 21, 12:35 AM
Ottomata committed rDEPLOYCHARTS4851afc2c26c: eventgate-analytics Fix misplaced '_histogram' metric suffix (authored by Ottomata).
eventgate-analytics Fix misplaced '_histogram' metric suffix
Thu, Mar 21, 12:35 AM
Ottomata committed rDEPLOYCHARTSa2f89ea9088d: eventgate-analytics - adjustments to statsd exporter matches (authored by Ottomata).
eventgate-analytics - adjustments to statsd exporter matches
Thu, Mar 21, 12:35 AM
Ottomata committed rDEPLOYCHARTSe7afe23b1b36: Change eventgate-analytics rdkafka statistics.interval.ms to 30s (authored by Ottomata).
Change eventgate-analytics rdkafka statistics.interval.ms to 30s
Thu, Mar 21, 12:35 AM
Ottomata committed rDEPLOYCHARTS02a87f2a639d: Add eventgate-analytics 0.0.11 chart package (authored by Ottomata).
Add eventgate-analytics 0.0.11 chart package
Thu, Mar 21, 12:35 AM
Gerrit Code Review <gerrit@wikimedia.org> committed rDEPLOYCHARTSd9e4ba89e566: Merge "eventgate-analytics - Add statsd prometheus mappings for node-rdkafka… (authored by Ottomata).
Merge "eventgate-analytics - Add statsd prometheus mappings for node-rdkafka…
Thu, Mar 21, 12:35 AM

Wed, Mar 20

Ottomata claimed T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.
Wed, Mar 20, 8:32 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata moved T218268: eventgate-analytics k8s pods occasionally can't produce to kafka from Next Up to In Progress on the Analytics-Kanban board.
Wed, Mar 20, 8:32 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata moved T218305: EventGate wikimedia implementation should emit rdkafka stats from Next Up to Done on the Analytics-Kanban board.
Wed, Mar 20, 8:31 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata set the point value for T218305: EventGate wikimedia implementation should emit rdkafka stats to 5.
Wed, Mar 20, 8:31 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata added a project to T218305: EventGate wikimedia implementation should emit rdkafka stats: Analytics-Kanban.
Wed, Mar 20, 8:31 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata added a comment to T218305: EventGate wikimedia implementation should emit rdkafka stats.

I like that idea! I'll make an overview.

Wed, Mar 20, 5:34 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata updated subscribers of T218305: EventGate wikimedia implementation should emit rdkafka stats.

@akosiaris, @fgiunchedi when you get a chance I'd appreciate a lookover of this dashboard:

Wed, Mar 20, 5:00 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata added a comment to T218758: Improve speed and reliability of Yarn's Resource Manager failover.

BTW great finds yall! Thanks so much for figuring this out and for the great write-up!

Wed, Mar 20, 1:17 PM · Patch-For-Review, Analytics, Analytics-Kanban
Ottomata added a comment to T218758: Improve speed and reliability of Yarn's Resource Manager failover.

Do we need to keep a history of 10k application ids in our Yarn state?

No I don't think so! 10K (4ish days) is nice to have in memory I suppose, but I see no reason to bog down start up times with 4 days of completed app ids. 1 or 2 days would be plenty.

Wed, Mar 20, 1:17 PM · Patch-For-Review, Analytics, Analytics-Kanban

Tue, Mar 19

Ottomata added a comment to T218680: EventGate Helm chart should POST test event for readinessProbe.

Ya, was thinking it'd have to be exec, and then if we can/should use service_checker that'd be fine. @akosiaris do you think we shouldn't do this?

Tue, Mar 19, 5:51 PM · EventBus, Analytics
Ottomata added a comment to T218680: EventGate Helm chart should POST test event for readinessProbe.

Good idea! The test x-amples are there. We should add a custom spec for the wikimedia-eventgate implementation with our x-ample event.

Tue, Mar 19, 4:31 PM · EventBus, Analytics
Ottomata created T218680: EventGate Helm chart should POST test event for readinessProbe.
Tue, Mar 19, 2:38 PM · EventBus, Analytics

Mon, Mar 18

Ottomata renamed T218617: Fix EventLogging schemas that use array for items type from MobileWikiAppiOSUserHistory schema uses array for items type to Fix EventLogging schemas that use array for items type.
Mon, Mar 18, 10:51 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata created T218617: Fix EventLogging schemas that use array for items type.
Mon, Mar 18, 10:33 PM · Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Fundraising-Backlog, Product-Analytics, Analytics
Ottomata moved T218305: EventGate wikimedia implementation should emit rdkafka stats from Backlog to In Progress on the EventBus board.
Mon, Mar 18, 10:27 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata committed rDEPLOYCHARTSfadc7ad8591d: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Mon, Mar 18, 4:06 PM
Ottomata committed rDEPLOYCHARTSb7cfa2d6794a: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Mon, Mar 18, 3:11 PM
Ottomata committed rDEPLOYCHARTSa7893c714aee: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Mon, Mar 18, 2:27 PM
Ottomata added a comment to T212529: Standardize datetimes/timestamps in the Data Lake.

Hm. But, also because the Hive/ANSI SQL (thanks for correction Neil, didn't realize that :) ) format works with the time-based functions, right? If not, then could we just switch the mediawiki_history event_timestamp etc. to event_dt ISO-8601 now? (I'm not saying we should!)

Mon, Mar 18, 1:24 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Patch-For-Review, Analytics, Product-Analytics
Ottomata committed rDEPLOYCHARTS67d9379e0acf: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Mon, Mar 18, 1:22 AM

Fri, Mar 15

Ottomata added a comment to T212529: Standardize datetimes/timestamps in the Data Lake.

Are you sure this is actually the case

Fri, Mar 15, 4:19 AM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Patch-For-Review, Analytics, Product-Analytics

Thu, Mar 14

Ottomata committed rDEPLOYCHARTS731d293a06e5: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Thu, Mar 14, 9:46 PM
Ottomata committed rDEPLOYCHARTS34357a99040b: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Thu, Mar 14, 9:36 PM
Ottomata committed rDEPLOYCHARTSef9a724ae06d: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Thu, Mar 14, 7:23 PM
Ottomata committed rDEPLOYCHARTS73d2b3147d12: eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd… (authored by Ottomata).
eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd…
Thu, Mar 14, 7:23 PM
Ottomata renamed T218346: Modern Event Platform: Deploy instance of EventGate service that produces events to kafka main from Modern Event Platform: Deploy instance of EventGate service that processes events from kafka main to Modern Event Platform: Deploy instance of EventGate service that produces events to kafka main.
Thu, Mar 14, 7:17 PM · Core Platform Team Backlog (Watching / External), Services (watching), Analytics-EventLogging, EventBus, Analytics
Ottomata committed rDEPLOYCHARTS6c891b211307: Add eventgate-analytics-0.0.10.tgz (authored by Ottomata).
Add eventgate-analytics-0.0.10.tgz
Thu, Mar 14, 4:19 PM
Ottomata committed rDEPLOYCHARTS5b4fd88d615a: eventgaate-analytics - Enable rdkafka statsd metrics (authored by Ottomata).
eventgaate-analytics - Enable rdkafka statsd metrics
Thu, Mar 14, 4:04 PM
Ottomata created T218305: EventGate wikimedia implementation should emit rdkafka stats.
Thu, Mar 14, 2:58 PM · Analytics-Kanban, Patch-For-Review, Services (watching), EventBus, Analytics
Ottomata committed rDEPLOYCHARTSb2617c22ef0a: eventgate-analytics - Fix liveness and readiness probes (authored by Ottomata).
eventgate-analytics - Fix liveness and readiness probes
Thu, Mar 14, 2:48 PM
Ottomata added a project to T218238: Vagrant initial provision fails on NodeJS version mismatch: Analytics.
Thu, Mar 14, 1:44 PM · Analytics-Kanban, Analytics, MediaWiki-Vagrant
Ottomata claimed T218238: Vagrant initial provision fails on NodeJS version mismatch.
Thu, Mar 14, 1:44 PM · Analytics-Kanban, Analytics, MediaWiki-Vagrant
Ottomata added a subtask for T210704: Migrate node-based services in production to node10: T218238: Vagrant initial provision fails on NodeJS version mismatch.
Thu, Mar 14, 1:43 PM · serviceops, Core Platform Team Backlog (Later), Patch-For-Review, Services (next), Operations
Ottomata added a parent task for T218238: Vagrant initial provision fails on NodeJS version mismatch: T210704: Migrate node-based services in production to node10.
Thu, Mar 14, 1:42 PM · Analytics-Kanban, Analytics, MediaWiki-Vagrant
Ottomata added a comment to T218238: Vagrant initial provision fails on NodeJS version mismatch.

So this problem is due to the fact that some node services aren't yet compatible (or haven't been checked for compatibility) with node 10. EventGate (which will eventually be replacing EventLogging, which includes the eventbus service) requires node 10.

Thu, Mar 14, 1:42 PM · Analytics-Kanban, Analytics, MediaWiki-Vagrant
Ottomata added a comment to T218260: Decrease timeout for EventBus extension for analytics events.

I wonder if we should also use ?hasty=true mode for mediawiki 'analytics' events? This would use a non-ACKed producer and never block MW waiting for a response.

Thu, Mar 14, 1:38 PM · Analytics-Kanban, Patch-For-Review, Analytics, Core Platform Team Kanban (Doing), Services (doing), EventBus
Ottomata added a comment to T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.
{
  "_index": "logstash-2019.03.13",
  "_type": "eventgate",
  "_id": "AWl4sxguNBo9dX1kfcii",
  "_score": 1,
  "_source": {
    "err_errno": -1,
    "source_host": "10.64.64.93",
    "level": "ERROR",
    "err_code": -1,
    "pid": 140,
    "err_origin": "local",
    "type": "eventgate",
    "message": "event 32d56482-45cc-11e9-be6d-1418776134a1 of schema at /mediawiki/api/request/0.0.1 destined to stream mediawiki.api-request encountered an error: message timed out",
    "version": "1.0",
    "normalized_message": "event 32d56482-45cc-11e9-be6d-1418776134a1 of schema at /mediawiki/api/request/0.0.1 destined to stream mediawiki.api-request encountered an error: message timed out",
    "tags": [
      "input-gelf-12201",
      "es",
      "gelf",
      "normalized_message_untrimmed"
    ],
    "err_message": "message timed out",
    "@timestamp": "2019-03-13T20:16:36.049Z",
    "err_name": "Error",
    "host": "eventgate-analytics-production-5d866bc9dd-nc2qk",
    "@version": "1",
    "gelf_level": "3",
    "err_stack": "Error: Local: Message timed out"
  },
  "fields": {
    "@timestamp": [
      1552508196049
    ]
  }
}
Thu, Mar 14, 1:38 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations

Wed, Mar 13

Ottomata updated subscribers of T218238: Vagrant initial provision fails on NodeJS version mismatch.

Ah hm. @Pchelolo does visualeditor require eventbus? I guess for restbase/changeprop stuff?

Wed, Mar 13, 10:00 PM · Analytics-Kanban, Analytics, MediaWiki-Vagrant
Ottomata updated subscribers of T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.

@akosiaris let's try to figure this out tomorrow. :)

Wed, Mar 13, 9:41 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata created T218268: eventgate-analytics k8s pods occasionally can't produce to kafka.
Wed, Mar 13, 9:41 PM · Analytics-Kanban, Patch-For-Review, Analytics, Prod-Kubernetes, EventBus, serviceops, Operations
Ottomata committed rDEPLOYCHARTS9b736fee8dd8: Use httpGet for liveness_probe (authored by Ottomata).
Use httpGet for liveness_probe
Wed, Mar 13, 9:10 PM
Ottomata committed rDEPLOYCHARTS6968e9240f90: Use httpGet for liveness_probe (authored by Ottomata).
Use httpGet for liveness_probe
Wed, Mar 13, 9:10 PM
Ottomata committed rDEPLOYCHARTS2a3c2b1c05f4: Use httpGet for liveness_probe (authored by Ottomata).
Use httpGet for liveness_probe
Wed, Mar 13, 9:10 PM
Ottomata added a parent task for T216297: Develop method for identifying reverts in EventBus data: T152434: Add method to Revision to check if it was a Revert, and whether an edit was Reverted.
Wed, Mar 13, 6:33 PM · Core Platform Team Backlog (Watching / External), Contributors-Analysis, Product-Analytics
Ottomata added a subtask for T152434: Add method to Revision to check if it was a Revert, and whether an edit was Reverted: T216297: Develop method for identifying reverts in EventBus data.
Wed, Mar 13, 6:33 PM · Core Platform Team Backlog (Watching / External), Readers-Web-Backlog (Tracking), Reading-Infrastructure-Team-Backlog, Trending-Service, Epic, MediaWiki-Page-editing, Contributors-Team, Collaboration-Team-Triage, MediaWiki-Interface
Ottomata added a comment to T216297: Develop method for identifying reverts in EventBus data.

Hmmmm, if MW can know if an edit is a revert via a revert tool (not just a copy/paste of old content), then I think we could include that fact in the event, e.g. is_revert: true or something.

Wed, Mar 13, 4:12 PM · Core Platform Team Backlog (Watching / External), Contributors-Analysis, Product-Analytics

Tue, Mar 12

Ottomata added a comment to T217619: Publishing html files generated on notebook hosts.

It is in /usr/local/bin, perhaps your PATH doesn't contain it!

Tue, Mar 12, 10:05 PM · Patch-For-Review, Analytics-Kanban, Product-Analytics, Analytics-SWAP, Analytics
Ottomata added a comment to T212529: Standardize datetimes/timestamps in the Data Lake.

@Neil_P._Quinn_WMF, I discussed this with Joseph a bit more today and realized that even though what I said above is true, it might not be exactly what we want. I didn't realize that the Hive/Java format fields you mentioned in mediawiki_history were actually 'timestamp' fields, not 'datetime' ones. While we prefer string datetime fields in ISO-8601 format for both human and machine readability and consistency, there isn't a hard requirement to use string datetimes over integer unix epoch timestamps. https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Schema_Guidelines#Schema_set_up says

Tue, Mar 12, 6:18 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Patch-For-Review, Analytics, Product-Analytics
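
For readers comparing the two representations mentioned in the comment above, here is a minimal Python sketch that converts between an integer unix epoch timestamp and an ISO-8601 string with a Z UTC suffix. The function names are made up for this example. The sample value is the @timestamp from the Logstash record quoted elsewhere in this feed, truncated to whole seconds.

```python
from datetime import datetime, timezone


def epoch_to_iso8601_z(epoch_seconds):
    """Format integer epoch seconds as an ISO-8601 UTC string ending in Z."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


def iso8601_to_epoch(dt_string):
    """Parse an ISO-8601 UTC string (Z or +00:00 suffix) into epoch seconds."""
    # Older Python versions of fromisoformat() do not accept a trailing Z,
    # so normalize it to the equivalent +00:00 offset first.
    normalized = dt_string.replace("Z", "+00:00")
    return int(datetime.fromisoformat(normalized).timestamp())


if __name__ == "__main__":
    print(epoch_to_iso8601_z(1552508196))            # 2019-03-13T20:16:36Z
    print(iso8601_to_epoch("2019-03-13T20:16:36Z"))  # 1552508196
```

Both suffixes, Z and +00:00, denote the same instant, so the round trip gives the same epoch value either way.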
Ottomata committed rDEPLOYCHARTSead18136492d: eventgate-analytics - add schema_precache_uris configuration (authored by Ottomata).
eventgate-analytics - add schema_precache_uris configuration
Tue, Mar 12, 3:29 PM
Ottomata committed rDEPLOYCHARTSee1c52ef66cd: eventgate-analytics - add schema_precache_uris configuration (authored by Ottomata).
eventgate-analytics - add schema_precache_uris configuration
Tue, Mar 12, 3:29 PM
Ottomata moved T217661: EventGate (in k8s) takes a long time to load new schemas from Next Up to Done on the Analytics-Kanban board.
Tue, Mar 12, 3:14 PM · Patch-For-Review, Analytics-Kanban, Services (watching), EventBus, Analytics
Ottomata moved T217661: EventGate (in k8s) takes a long time to load new schemas from Backlog to Done on the EventBus board.
Tue, Mar 12, 3:14 PM · Patch-For-Review, Analytics-Kanban, Services (watching), EventBus, Analytics
Ottomata moved T217041: Use Z UTC suffix in EventBus emitted events rather than +00:00 from Backlog to In Progress on the EventBus board.
Tue, Mar 12, 2:39 PM · Core Platform Team Backlog (Watching / External), Services (watching), EventBus, Analytics, Product-Analytics
Ottomata closed T216163: Add monolog adapters for Eventbus as Resolved.
Tue, Mar 12, 2:39 PM · Core Platform Team Kanban (Done with CPT), Patch-For-Review, Services (doing), Analytics-EventLogging, EventBus, Analytics
Ottomata closed T216163: Add monolog adapters for Eventbus, a subtask of T214080: Rewrite Avro schemas (ApiAction, CirrusSearchRequestSet) as JSONSchema and produce to EventGate, as Resolved.
Tue, Mar 12, 2:39 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Analytics-Kanban, Patch-For-Review, Services (watching), Discovery, Analytics-EventLogging, EventBus, Analytics
Ottomata added a comment to T217412: Enable encryption and authentication for TLS-based Hadoop services.

Ah I see! Is that a problem? Can a CA not create multiple certificates with the same CN?

Tue, Mar 12, 2:24 PM · Analytics-Kanban, Patch-For-Review, User-Elukey, Analytics
Ottomata added a comment to T212529: Standardize datetimes/timestamps in the Data Lake.

Hm, no I mean once we get Hive 1.2.0+, we can do

Tue, Mar 12, 1:52 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Patch-For-Review, Analytics, Product-Analytics
Ottomata closed T199432: Consider disabling automatic topic creation in main-kafka as Declined.
Tue, Mar 12, 1:46 PM · User-Elukey, Core Platform Team Backlog (Designing), ChangeProp, EventBus, WMF-JobQueue, Services (designing), Analytics
Ottomata added a comment to T199432: Consider disabling automatic topic creation in main-kafka.

I think we should close. Maybe we'll do this one day if we have a really solid 'stream (+topic) config' system, but I doubt we'll do it before that.

Tue, Mar 12, 1:46 PM · User-Elukey, Core Platform Team Backlog (Designing), ChangeProp, EventBus, WMF-JobQueue, Services (designing), Analytics
Ottomata added a comment to T217412: Enable encryption and authentication for TLS-based Hadoop services.

use a self signed CA

Tue, Mar 12, 1:45 PM · Analytics-Kanban, Patch-For-Review, User-Elukey, Analytics
Ottomata added a comment to T217967: Publish both shaded and unshaded artifacts from analytics refinery.

Cool thank you!!!

Tue, Mar 12, 1:38 PM · Patch-For-Review, Discovery-Search (Current work), Analytics
Ottomata added a comment to T211981: Improve article-recommender scripts.

Or, if you want to have your artifact deployed in analytics/refinery/artifacts, you can manually git add it there (as long as your local analytics/refinery checkout has git fat initialized.)

Tue, Mar 12, 1:35 PM · Patch-For-Review, Article-Recommendation, Research
Ottomata added a comment to T211981: Improve article-recommender scripts.

Hm, why do you need the artifact in analytics/refinery? Can you just use scap+git fat to deploy research/article-recommender/deploy to e.g. stat1007 and put the zip file wherever it needs to go (HDFS?)?

Tue, Mar 12, 1:33 PM · Patch-For-Review, Article-Recommendation, Research
Ottomata added a comment to T212529: Standardize datetimes/timestamps in the Data Lake.

@JAllemandou, my understanding is that if we get a newer version of Hive, we will be able to use Hive timestamp types with ISO-8601 string formats. If that's correct, I think we should keep ISO-8601 string as our convention, and then use Hive timestamps with them after we (one day) upgrade.

Tue, Mar 12, 1:28 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Patch-For-Review, Analytics, Product-Analytics

Fri, Mar 8

Ottomata claimed T215442: Make Refine use JSONSchemas of event data to support Map types and proper types for integers vs decimals.
Fri, Mar 8, 5:08 PM · MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), Patch-For-Review, Analytics-Kanban, EventBus, Analytics
Ottomata changed the point value for T215442: Make Refine use JSONSchemas of event data to support Map types and proper types for integers vs decimals from 0 to 8.
Fri, Mar 8, 5:05 PM · MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), Patch-For-Review, Analytics-Kanban, EventBus, Analytics

Thu, Mar 7

Ottomata moved T214080: Rewrite Avro schemas (ApiAction, CirrusSearchRequestSet) as JSONSchema and produce to EventGate from In Code Review to In Progress on the Analytics-Kanban board.
Thu, Mar 7, 10:21 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Analytics-Kanban, Patch-For-Review, Services (watching), Discovery, Analytics-EventLogging, EventBus, Analytics
Ottomata added a comment to T209857: Create Autonomous Systems ranking based on RUM data.

Not '/datasets mount', but just the /srv/published-datasets directory.

Thu, Mar 7, 9:38 PM · Patch-For-Review, MW-1.33-notes (1.33.0-wmf.8; 2018-12-11), Performance-Team
Ottomata updated subscribers of T212529: Standardize datetimes/timestamps in the Data Lake.

I'd be even more happy with standardizing on YYYY-mm-dd HH:MM:SS

Thu, Mar 7, 3:14 PM · MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), Patch-For-Review, Analytics, Product-Analytics
Ottomata added a comment to T211981: Improve article-recommender scripts.

I think the way Baho is doing this is fine. It isn't that different than how we package up dependencies and artifacts for Java in e.g. refinery, or for Python in e.g. superset, or for Node in e.g. EventGate and change-prop. In the superset and change-prop cases, we use a separate 'deploy' git repository for the dependency artifacts. In the refinery case, we use Archiva + git-fat to avoid keeping the dependencies in git. This is also similar to how the ORES folks are packaging, except they use git-lfs somehow.

Thu, Mar 7, 1:59 PM · Patch-For-Review, Article-Recommendation, Research
Ottomata added a comment to T206268: Evaluate using TypeScript on node projects.

5 seconds for tests sounds dreamy...come on over to the Java world...:p

Thu, Mar 7, 1:51 PM · Core Platform Team Backlog (Watching / External), Services (watching), Analytics