Ottomata (Andrew Otto)
User

User Details

User Since
Oct 9 2014, 4:50 PM (468 w, 1 d)
Availability
Available
IRC Nick
ottomata
LDAP User
Ottomata
MediaWiki User
Ottomata

Recent Activity

Fri, Sep 22

Ottomata added a project to T329327: Frequent `429 Client Error: Too Many Requests for url: https://stream.wikimedia.org/v2/stream/recentchange` errors in SULWatcher: EventStreams.
Fri, Sep 22, 3:25 PM · EventStreams, Data Engineering and Event Platform Team, Event-Platform, Data-Engineering, stewardbots
Ottomata added projects to T308931: Error 429: too many requests for stream.wikimedia.org: Event-Platform, Data Engineering and Event Platform Team.
Fri, Sep 22, 3:24 PM · Data-Engineering, Data Engineering and Event Platform Team, Event-Platform, EventStreams, Pywikibot
Ottomata added projects to T329327: Frequent `429 Client Error: Too Many Requests for url: https://stream.wikimedia.org/v2/stream/recentchange` errors in SULWatcher: Event-Platform, Data Engineering and Event Platform Team.
Fri, Sep 22, 3:23 PM · EventStreams, Data Engineering and Event Platform Team, Event-Platform, Data-Engineering, stewardbots

Wed, Sep 20

Ottomata added a comment to T346899: eventgate: cache refreshes should fetch stream configs using in batches.

Why is only eventgate-analytics-external configured to refresh its internal cache, without an embedded schema registry fallback?

Wed, Sep 20, 2:35 PM · Data-Engineering, Event-Platform, Data Engineering and Event Platform Team

Tue, Sep 5

Ottomata added a comment to T345193: Document the onboarding journey on Event Platfrom.

Would this task be more appropriately titled "Document the onboarding journey for building simple streaming enrichment apps"?

Tue, Sep 5, 2:19 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform

Thu, Aug 31

Ottomata added a comment to T345317: Make jsonschema-tools merge values of enums when merging allOf.

AIUI the json-schema-merge-allof package allows for its behaviour to be changed on a per-keyword

Thu, Aug 31, 4:29 PM · Patch-For-Review, Metrics Platform Backlog, Data-Engineering

Jul 27 2023

Ottomata added a comment to T341976: IP Masking: Change User::isRegistered() and User::isAnon() with User::isNamed in Growth-managed extensions.

Related: T336176: MediaWiki user types

Jul 27 2023, 6:33 PM · MW-1.41-notes (1.41.0-wmf.25; 2023-09-05), IP-Masking-Growth-Team, Growth-Team (Current Sprint), IP Masking

Jul 18 2023

Ottomata added a comment to T340765: jsonschema-tools test should fail if fields are removed in new (non major) version.

Added: https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#Exception:_major_schema_version_upgrades

Jul 18 2023, 4:35 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Data-Engineering
Ottomata added a comment to T341134: Investigate drift between `dt` and `meta.dt`.

Filter out null edits

HuH! Interesting. Current me does not remember this at all.

Jul 18 2023, 4:29 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Data Products (Sprint 0), Patch-For-Review
Ottomata updated the task description for T341229: ProduceCanaryEvents job should be scheduled by Airflow.
Jul 18 2023, 2:18 PM · Data-Engineering

Jul 11 2023

Ottomata added a comment to T341134: Investigate drift between `dt` and `meta.dt`.

What's weird about this is that these are actual old state changes. It's not just that there is drift; some events that happened years ago are being emitted into the stream now. E.g. why is an edit event for https://pl.wikipedia.org/w/index.php?diff=prev&oldid=1130401 being emitted now?

Jul 11 2023, 1:10 AM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Data Products (Sprint 0), Patch-For-Review

Jul 8 2023

Ottomata added a comment to T341134: Investigate drift between `dt` and `meta.dt`.

close it in a few days if there are no more comments,

We should keep this open, I really think this should not happen.

Jul 8 2023, 6:56 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Data Products (Sprint 0), Patch-For-Review
Ottomata added a comment to T340044: Make meta.dt required on all schemas that declare it.

We removed meta.dt requiredness in common/2.0.0.

Jul 8 2023, 6:52 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform
Ottomata added a comment to T338357: Pushing jobs to jobqueue is slow again.

I understand its impact for MediaWiki's clients, but would some value like 5 be worth considering to see how Kafka behaves? I mean if the queues start to go down after it.

Jul 8 2023, 6:38 PM · ChangeProp, WMF-JobQueue

Jul 7 2023

Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jul 7 2023, 2:36 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic
Ottomata updated subscribers of T341277: mediawiki page_content_change should generate new meta.id field.
Jul 7 2023, 2:35 PM · Data Engineering and Event Platform Team (Sprint 1), Data-Engineering, Event-Platform
Ottomata added a comment to T341229: ProduceCanaryEvents job should be scheduled by Airflow.

Oo, it would be really nice if we could modify the job logic a little bit, to be able to produce events with the time appropriate for the schedule task time. That way, we could backfill more easily.

Jul 7 2023, 2:23 PM · Data-Engineering
Ottomata added a comment to T340666: mediawiki-event-enrichment jobs alerting.

So cool!

Jul 7 2023, 1:35 PM · Observability-Alerting, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Discovery-Search

Jul 6 2023

Ottomata added a comment to T323786: REST framework: Add support for outputting an OpenAPI (swagger) spec.

Would love to see all of our REST APIs specify deterministic output types, ideally following some of the same schema guidelines used by Event Platform. This would allow us to use REST API endpoints as 'lookup tables', and allow us to join them and other datasets with SQL.

Jul 6 2023, 7:39 PM · MediaWiki-REST-API, RESTBase Sunsetting, Patch-For-Review, Platform Team Workboards (MW Expedition), Epic, Foundational Technology Requests, Code-Health
Ottomata updated the task description for T267648: Adopt conventions for server receive and client/event timestamps in non analytics event schemas.
Jul 6 2023, 6:52 PM · Data Engineering and Event Platform Team, Data-Engineering, MW-1.41-notes (1.41.0-wmf.15; 2023-06-27), Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Event-Platform, Better Use Of Data, Analytics
Ottomata added a comment to T267648: Adopt conventions for server receive and client/event timestamps in non analytics event schemas.

Meeting today, discussed / decided the following:

Jul 6 2023, 6:49 PM · Data Engineering and Event Platform Team, Data-Engineering, MW-1.41-notes (1.41.0-wmf.15; 2023-06-27), Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Event-Platform, Better Use Of Data, Analytics
Ottomata created T341277: mediawiki page_content_change should generate new meta.id field.
Jul 6 2023, 6:39 PM · Data Engineering and Event Platform Team (Sprint 1), Data-Engineering, Event-Platform
Ottomata added a comment to T341134: Investigate drift between `dt` and `meta.dt`.

When investigating, maybe start with event.mediawiki_page_change_v1 instead of page_content_change?

Jul 6 2023, 2:26 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Data Products (Sprint 0), Patch-For-Review
Ottomata updated the task description for T336817: Release mediawiki.page_change.v1 stream.
Jul 6 2023, 2:19 PM · Data Engineering and Event Platform Team (Sprint 1), MW-1.41-notes (1.41.0-wmf.13; 2023-06-13), Event-Platform (Sprint 14 B), Patch-For-Review
Ottomata updated subscribers of T312785: Change the way Refine handles its status (currently flags in partitions).
Jul 6 2023, 1:59 PM · Data Engineering and Event Platform Team
Ottomata added a comment to T312785: Change the way Refine handles its status (currently flags in partitions).

Possibly related: T340471: [Airflow] P.O.C. on Iceberg sensor using Snapshot metadata to keep status of updates, if/when we decide to move all event tables to Iceberg.

Jul 6 2023, 1:58 PM · Data Engineering and Event Platform Team
Ottomata merged tasks T296534: Deprecate Refine Scheduler, T296523: Refine to Airflow Migration: User Story, T296532: Complete Refine to Airflow Migration (100%), T296531: Migrate the Medium Risk Refine Jobs to Airflow (50%), T296530: Migrate the Selected Refine Jobs to Airflow (1%) into T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:57 PM · Data Engineering and Event Platform Team, Data-Engineering, Data Pipelines
Ottomata merged task T296531: Migrate the Medium Risk Refine Jobs to Airflow (50%) into T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:57 PM · Data Engineering and Event Platform Team, Data Pipelines
Ottomata merged task T296530: Migrate the Selected Refine Jobs to Airflow (1%) into T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:57 PM · Data Engineering and Event Platform Team, Data Pipelines
Ottomata merged task T296532: Complete Refine to Airflow Migration (100%) into T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:57 PM · Data Engineering and Event Platform Team, Data Pipelines
Ottomata merged task T296523: Refine to Airflow Migration: User Story into T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:57 PM · Data Engineering and Event Platform Team, Epic, Data Pipelines
Ottomata merged task T296534: Deprecate Refine Scheduler into T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:57 PM · Data Engineering and Event Platform Team, Data Pipelines
Ottomata added a parent task for T312785: Change the way Refine handles its status (currently flags in partitions): T307505: Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:56 PM · Data Engineering and Event Platform Team
Ottomata added a subtask for T307505: Refine jobs should be scheduled by Airflow: T312785: Change the way Refine handles its status (currently flags in partitions).
Jul 6 2023, 1:56 PM · Data Engineering and Event Platform Team, Data-Engineering, Data Pipelines
Ottomata renamed T307505: Refine jobs should be scheduled by Airflow from Migrate 1+ Refine jobs to Refine jobs should be scheduled by Airflow.
Jul 6 2023, 1:55 PM · Data Engineering and Event Platform Team, Data-Engineering, Data Pipelines
Ottomata created T341229: ProduceCanaryEvents job should be scheduled by Airflow.
Jul 6 2023, 1:51 PM · Data-Engineering
Ottomata added a comment to T326002: EventGate occasionally fails to ingest specific schemas.

FWIW, I think EventStreamConfig in wikimedia-event-utilities only requests all stream config once on instantiation, not individually for each stream. So it's not like there are a bunch of HTTP requests all at once. There is only one.

Jul 6 2023, 1:38 PM · Data Engineering and Event Platform Team (Sprint 3), Patch-For-Review, MW-1.41-notes (1.41.0-wmf.28; 2023-09-26), Event-Platform, Data-Engineering, Data Pipelines
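
The fetch-once behaviour described in the comment above can be sketched roughly as follows. This is a minimal illustration in Python, not the wikimedia-event-utilities API: the class name `StreamConfigCache`, the `fetch_all` callable, and the config values are all hypothetical.

```python
from typing import Callable, Dict


class StreamConfigCache:
    """Fetch every stream config once at construction, then serve lookups locally."""

    def __init__(self, fetch_all: Callable[[], Dict[str, dict]]):
        # One bulk request. fetch_all is a hypothetical stand-in for the HTTP call.
        self._configs = dict(fetch_all())

    def get(self, stream_name: str) -> dict:
        # No network access here: lookups read the in-memory copy.
        return self._configs[stream_name]


def demo():
    calls = []

    def fake_fetch_all():
        # Record each invocation so the number of "requests" can be counted.
        calls.append(1)
        return {
            "mediawiki.page_change.v1": {"example_setting": 1},
            "mediawiki.page_content_change": {"example_setting": 2},
        }

    cache = StreamConfigCache(fake_fetch_all)
    first = cache.get("mediawiki.page_change.v1")
    second = cache.get("mediawiki.page_content_change")
    return len(calls), first, second
```

Calling `demo()` looks up two streams while the fake fetch function runs only once, which is the point made in the comment: one request at instantiation rather than one per stream.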
Ottomata updated subscribers of T326002: EventGate occasionally fails to ingest specific schemas.
Jul 6 2023, 1:37 PM · Data Engineering and Event Platform Team (Sprint 3), Patch-For-Review, MW-1.41-notes (1.41.0-wmf.28; 2023-09-26), Event-Platform, Data-Engineering, Data Pipelines
Ottomata added a comment to T326002: EventGate occasionally fails to ingest specific schemas.

Let's wait a few days, and if we see none (or fewer?) canary events, let's resolve this task.

Jul 6 2023, 1:36 PM · Data Engineering and Event Platform Team (Sprint 3), Patch-For-Review, MW-1.41-notes (1.41.0-wmf.28; 2023-09-26), Event-Platform, Data-Engineering, Data Pipelines
Ottomata added a comment to T341140: Check if node-rdkafka's version on changeprop can be upgraded from 2.8.1.

Pretty sure there is a configurable env var BUILD_LIBRDKAFKA that can conditionally disable this. eventgate-wikimedia installs the librdkafka1 .deb and sets BUILD_LIBRDKAFKA=0

Jul 6 2023, 1:05 PM · Patch-For-Review, serviceops, ChangeProp, WMF-JobQueue

Jul 5 2023

Ottomata added a subtask for T331283: [NEEDS GROOMING] Store Flink HA metadata in Zookeeper: T341137: Test version compatibility between production Kafka and newer ZooKeeper.
Jul 5 2023, 5:18 PM · Data Engineering and Event Platform Team (Sprint 3), Discovery-Search, Event-Platform, serviceops-radar, Data-Engineering
Ottomata added a parent task for T341137: Test version compatibility between production Kafka and newer ZooKeeper: T331283: [NEEDS GROOMING] Store Flink HA metadata in Zookeeper.
Jul 5 2023, 5:18 PM · Discovery-Search (Current work), serviceops-radar, Data-Platform-SRE
apaskulin awarded T303546: Gitlab CI should be able to publish static html docs a Party Time token.
Jul 5 2023, 4:34 PM · Toolforge
Ottomata added a comment to T341096: mediawiki-event-enrichment taskmanager crashes at startup.

@gmodena and I debugged this today, and realized it was because we never implemented support for specifying the schema versions used by the Kafka source, and always use the latest. This conflicts with the RowTypeInfo we use when reading from the source into a DataStream.

Jul 5 2023, 4:24 PM · Data-Engineering, Event-Platform (Sprint 14 B)
Ottomata created T341138: mediawiki-event-enrichment deployment process should include producing an event in staging and verifying success.
Jul 5 2023, 3:21 PM · Data-Engineering, Event-Platform
Ottomata added a comment to T335706: eventutilities-python EventProcessFunction throws NPE if user func returns None.

@gmodena we can close this task, ya?

Jul 5 2023, 12:49 PM · Data Engineering and Event Platform Team, Event-Platform, Data-Engineering

Jul 3 2023

Ottomata added a comment to T338169: mw-page-content-change-enrich should partition by and process by wiki_id,page_id.

In the end the main drawback is that this modulo value must never change unless you drain the pipeline before changing it (the keyed state cannot be redistributed by flink automatically if the key function changes).

Jul 3 2023, 7:31 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
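
The drawback mentioned in the comment above can be illustrated with a small sketch. It assumes a key function that hashes `wiki_id` and `page_id` and reduces the hash with a modulo; the function names, the CRC32 hash, and the modulo value 16 are hypothetical and not taken from the task.

```python
import zlib

# Hypothetical number of key groups; not taken from the task.
KEY_MODULO = 16


def partition_key(wiki_id: str, page_id: int, modulo: int = KEY_MODULO) -> int:
    """Map (wiki_id, page_id) to a stable bucket in [0, modulo)."""
    raw = f"{wiki_id}:{page_id}".encode("utf-8")
    # crc32 is deterministic across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(raw) % modulo


def moved_keys(keys, old_modulo, new_modulo):
    """Return the (wiki_id, page_id) pairs whose bucket differs between two modulo values."""
    return [
        k for k in keys
        if partition_key(*k, modulo=old_modulo) != partition_key(*k, modulo=new_modulo)
    ]
```

Every pair returned by `moved_keys` would land on a different key after a modulo change, so any state stored under the old key would no longer be found under the new one. That is why the pipeline would need to be drained before the value changes.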

Jun 30 2023

Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jun 30 2023, 4:56 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic
Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jun 30 2023, 4:54 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic
Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jun 30 2023, 4:52 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic

Jun 29 2023

Ottomata added a comment to T335860: Implement job to transform mediawiki.page_content_change.

Backfill:

Jun 29 2023, 8:48 PM · Data Products (Sprint 0), Data Engineering and Event Platform Team (Sprint 1), Patch-For-Review
Ottomata added a comment to T340666: mediawiki-event-enrichment jobs alerting.

Also relevant: T329070: Automated event stream throughput alerting for important state change streams

Jun 29 2023, 4:36 PM · Observability-Alerting, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Discovery-Search
Ottomata added a comment to T340765: jsonschema-tools test should fail if fields are removed in new (non major) version.

@tchin can you go ahead and do this one along with T300404?

Jun 29 2023, 4:11 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Data-Engineering
Ottomata created T340765: jsonschema-tools test should fail if fields are removed in new (non major) version.
Jun 29 2023, 4:07 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Data-Engineering

Jun 28 2023

Ottomata added a comment to T321854: Move Spark JsonSchemaConverter out of analytics/refinery/source and into wikimedia-event-utilities.

could implement Draft-3 required field support in eventutilities-core JsonSchemaConverter

Done in patch. This will allow us to use this class in refinery-source, and delete the one there.

Jun 28 2023, 9:22 PM · Data Engineering and Event Platform Team, Data-Engineering, Patch-For-Review, Event-Platform
Ottomata closed T340067: EventBus should set dt fields with greater precision than second as Declined.

I'm declining this task. I realized that we let eventgate handle setting meta.dt right now anyway, so its precision should be fine. Not much we can do about MW provided timestamps, since they are in second precision anyway.

Jun 28 2023, 5:59 PM · Event-Platform (Sprint 14 B), Data-Engineering
Ottomata added a comment to T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023.

^ sounds good!

Jun 28 2023, 4:23 PM · Wikidata Analytics (Kanban), Wikidata
Ottomata placed T320655: Refactor EventBus extension Hooks to use new hook system up for grabs.
Jun 28 2023, 3:44 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform
Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jun 28 2023, 3:35 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic
Ottomata triaged T340666: mediawiki-event-enrichment jobs alerting as Medium priority.
Jun 28 2023, 3:34 PM · Observability-Alerting, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Discovery-Search
Ottomata moved T340666: mediawiki-event-enrichment jobs alerting from Backlog to Estimated/ Discussed on the Event-Platform board.
Jun 28 2023, 3:34 PM · Observability-Alerting, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Discovery-Search
Ottomata created T340666: mediawiki-event-enrichment jobs alerting.
Jun 28 2023, 3:33 PM · Observability-Alerting, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B), Discovery-Search
Ottomata moved T338169: mw-page-content-change-enrich should partition by and process by wiki_id,page_id from Backlog to Estimated/ Discussed on the Event-Platform board.
Jun 28 2023, 3:21 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata triaged T338169: mw-page-content-change-enrich should partition by and process by wiki_id,page_id as Medium priority.
Jun 28 2023, 3:21 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata moved T300404: jsonschema-tools tests should fail if schema $id does not match title or path from To be Estimated/To be discussed to Estimated/ Discussed on the Event-Platform board.
Jun 28 2023, 3:16 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata triaged T300404: jsonschema-tools tests should fail if schema $id does not match title or path as Medium priority.
Jun 28 2023, 3:15 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata assigned T300404: jsonschema-tools tests should fail if schema $id does not match title or path to tchin.
Jun 28 2023, 3:15 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata closed T325359: Flink Restart Strategy for Enrichment Service as Invalid.

Handled by flink operator and in config/documentation

Jun 28 2023, 3:15 PM · Data-Engineering-Planning, Event-Platform
Ottomata reassigned T311603: [Shared Event Platform][NEEDS GROOMING] should we guarantee ordering in Mediawiki Stream Enrichment? from Ottomata to gmodena.
Jun 28 2023, 3:14 PM · Data-Engineering-Planning, Event-Platform
Ottomata closed T311603: [Shared Event Platform][NEEDS GROOMING] should we guarantee ordering in Mediawiki Stream Enrichment? as Resolved.

This should be done: even though we process asynchronously, we emit events in the order the tasks receive them.

Jun 28 2023, 3:14 PM · Data-Engineering-Planning, Event-Platform
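
The idea in the closing comment above (asynchronous processing with output in arrival order) can be sketched with Python's `asyncio`. This is only an analogy under stated assumptions, not the enrichment job's implementation: the `enrich` coroutine, the artificial delays, and the event shape are hypothetical.

```python
import asyncio


async def enrich(event: dict, delay: float) -> dict:
    # Simulate an enrichment call whose latency varies per event.
    await asyncio.sleep(delay)
    return {**event, "enriched": True}


async def process_in_order(events_with_delays):
    # asyncio.gather runs the coroutines concurrently but returns results
    # in the order the awaitables were passed in, not in completion order.
    return await asyncio.gather(*(enrich(e, d) for e, d in events_with_delays))


def run_demo():
    # The first event is the slowest, so it finishes last but is still emitted first.
    events = [({"id": 1}, 0.03), ({"id": 2}, 0.01), ({"id": 3}, 0.02)]
    return asyncio.run(process_in_order(events))
```

In `run_demo()` the events complete out of order, yet the returned list keeps the input order.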
Ottomata triaged T340059: Flink k8s operator in staging sometimes will not sync changes to FlinkDeployments as High priority.
Jun 28 2023, 3:13 PM · serviceops-radar, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata moved T340059: Flink k8s operator in staging sometimes will not sync changes to FlinkDeployments from To be Estimated/To be discussed to Estimated/ Discussed on the Event-Platform board.
Jun 28 2023, 3:13 PM · serviceops-radar, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jun 28 2023, 3:08 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic
Ottomata moved T340059: Flink k8s operator in staging sometimes will not sync changes to FlinkDeployments from Backlog to To be Estimated/To be discussed on the Event-Platform board.
Jun 28 2023, 3:03 PM · serviceops-radar, Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata added a comment to T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023.

We should avoid team names in functional code / namespacing. Team names change often.

Jun 28 2023, 2:04 PM · Wikidata Analytics (Kanban), Wikidata
Ottomata updated the task description for T337399: Use ECS logging fields when adding extra info to mediawiki-event-enrichment.
Jun 28 2023, 1:11 PM · Event-Platform (Sprint 14 B), Data-Engineering
Ottomata moved T337399: Use ECS logging fields when adding extra info to mediawiki-event-enrichment from Next Up to Done on the Event-Platform (Sprint 14 B) board.
Jun 28 2023, 1:11 PM · Event-Platform (Sprint 14 B), Data-Engineering

Jun 27 2023

Ottomata updated the task description for T340067: EventBus should set dt fields with greater precision than second.
Jun 27 2023, 8:37 PM · Event-Platform (Sprint 14 B), Data-Engineering
Ottomata updated subscribers of T321854: Move Spark JsonSchemaConverter out of analytics/refinery/source and into wikimedia-event-utilities.

WIP patch for doing this ^.

Jun 27 2023, 8:28 PM · Data Engineering and Event Platform Team, Data-Engineering, Patch-For-Review, Event-Platform
Ottomata updated the task description for T253058: DRY kafka broker declaration in helmfiles.
Jun 27 2023, 4:50 PM · Data-Engineering, Data-Platform-SRE, Data Engineering and Event Platform Team, serviceops, SRE, Event-Platform
Ottomata updated the task description for T253058: DRY kafka broker declaration in helmfiles.
Jun 27 2023, 4:50 PM · Data-Engineering, Data-Platform-SRE, Data Engineering and Event Platform Team, serviceops, SRE, Event-Platform
Ottomata added a comment to T329629: Improve Event Platform and MediaWiki Event Enrichment wikitech documentation.
Jun 27 2023, 3:06 PM · Event-Platform (Sprint 14 B), Data-Engineering-Planning
Ottomata updated the task description for T329629: Improve Event Platform and MediaWiki Event Enrichment wikitech documentation.
Jun 27 2023, 3:05 PM · Event-Platform (Sprint 14 B), Data-Engineering-Planning
Ottomata updated the task description for T329629: Improve Event Platform and MediaWiki Event Enrichment wikitech documentation.
Jun 27 2023, 2:43 PM · Event-Platform (Sprint 14 B), Data-Engineering-Planning
Ottomata added a comment to T309717: Event Utilities partially downloads schemas.

Added more logging, and scheduled it for next week's train: https://etherpad.wikimedia.org/p/analytics-weekly-train

Jun 27 2023, 2:03 PM · Data-Engineering
Ottomata updated the task description for T329629: Improve Event Platform and MediaWiki Event Enrichment wikitech documentation.
Jun 27 2023, 1:39 PM · Event-Platform (Sprint 14 B), Data-Engineering-Planning
Ottomata updated the task description for T329629: Improve Event Platform and MediaWiki Event Enrichment wikitech documentation.
Jun 27 2023, 1:34 PM · Event-Platform (Sprint 14 B), Data-Engineering-Planning
Ottomata moved T340166: All eventgate clusters should be able to use remote schema repos from In Review to Done on the Event-Platform (Sprint 14 B) board.
Jun 27 2023, 1:08 PM · Event-Platform (Sprint 14 B), Data-Engineering
Ottomata moved T330236: Event partitions missing since 2023-02-21T10:00 for stream without events (canary events not produced?) from In Review to Done on the Event-Platform (Sprint 14 B) board.
Jun 27 2023, 1:08 PM · Event-Platform (Sprint 14 B), Data Pipelines (Sprint 14), Data-Engineering-Planning
Ottomata edited projects for T309717: Event Utilities partially downloads schemas, added: Data-Engineering; removed Data-Engineering-Planning.
Jun 27 2023, 12:56 PM · Data-Engineering
Ottomata added a comment to T309717: Event Utilities partially downloads schemas.

The conditional checks if the type is Java null (meaning not present), or if it is set to JSONSchema "null".

Wait, no, the check that throws this error is:

Jun 27 2023, 12:55 PM · Data-Engineering
Ottomata added a comment to T326002: EventGate occasionally fails to ingest specific schemas.

BTW, I don't think this is related to T309717: Event Utilities partially downloads schemas anymore.

Jun 27 2023, 12:53 PM · Data Engineering and Event Platform Team (Sprint 3), Patch-For-Review, MW-1.41-notes (1.41.0-wmf.28; 2023-09-26), Event-Platform, Data-Engineering, Data Pipelines
Ottomata added a comment to T326002: EventGate occasionally fails to ingest specific schemas.

I think we can close this? In T330236: Event partitions missing since 2023-02-21T10:00 for stream without events (canary events not produced?) we merged and deployed https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/894642/, which adds a timeout (and some retries?).

Jun 27 2023, 12:53 PM · Data Engineering and Event Platform Team (Sprint 3), Patch-For-Review, MW-1.41-notes (1.41.0-wmf.28; 2023-09-26), Event-Platform, Data-Engineering, Data Pipelines
Ottomata added a comment to T338233: mw-page-content-change-enrich should enable HA with k8s ConfigMaps.

The wikitech doc now says that we (as in service ops) are required to save and restore the state.

This would only be required after cluster updates (like for WDQS), not for regular application lifecycles.

Jun 27 2023, 12:47 PM · Event-Platform (Sprint 14 B), Data-Engineering
Ottomata added a comment to T309717: Event Utilities partially downloads schemas.

Alright! We are finally on Spark 3, and deployed the error logging change that Dan wrote last year!

Jun 27 2023, 12:39 PM · Data-Engineering

Jun 26 2023

Ottomata added a comment to T340491: Requesting access to Kerberos for cjming.

Approved.

Jun 26 2023, 9:07 PM · SRE, SRE-Access-Requests
Ottomata updated the task description for T307959: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content.
Jun 26 2023, 8:43 PM · Data Engineering and Event Platform Team, Data-Engineering, Event-Platform, Epic
Ottomata merged T331542: EventStreamCatalog should not remove user specified options in CREATE TABLE statements into T333795: Event Catalog: Standardize Options Handling.
Jun 26 2023, 8:42 PM · Data Engineering and Event Platform Team (Sprint 0), Event-Platform (Sprint 14 B)
Ottomata merged task T331542: EventStreamCatalog should not remove user specified options in CREATE TABLE statements into T333795: Event Catalog: Standardize Options Handling.
Jun 26 2023, 8:42 PM · Data-Engineering, Event-Platform
Ottomata created T340492: Set up multi DC Kafka stretch cluster.
Jun 26 2023, 8:39 PM · Data-Platform-SRE, Data Engineering and Event Platform Team, Discovery-Search, Event-Platform, Data-Engineering