Ottomata (Andrew Otto)
User

User Details

User Since: Oct 9 2014, 4:50 PM (545 w, 6 d)
Availability: Available
IRC Nick: ottomata
LDAP User: Ottomata
MediaWiki User: Ottomata

Recent Activity

Today

Ottomata added a comment to T390013: Bug: event validation error: bad mediawiki.job.* meta.request_id field.

If the errors are only for GlobalVanishJob, then I'd guess that job is setting $params['requestId'] as an integer.
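
For illustration only, a minimal sketch of the suspected failure mode. It assumes meta.request_id is declared as a string in the event schema (the schema is not quoted in this comment), and the function name and event shape are hypothetical. The job itself is PHP; Python is used here just to show the type mismatch.

```python
def request_id_errors(event: dict) -> list[str]:
    """Return type errors for meta.request_id.

    Assumption: the schema types meta.request_id as a string and
    treats the field as optional.
    """
    value = event.get("meta", {}).get("request_id")
    if value is None:
        return []  # field absent: nothing to check under the assumption above
    if not isinstance(value, str):
        return [f"meta.request_id should be string, got {type(value).__name__}"]
    return []


# A job that sets the request ID as an integer would be rejected,
# while the same value cast to a string would pass.
bad_event = {"meta": {"request_id": 12345}}
good_event = {"meta": {"request_id": str(12345)}}
print(request_id_errors(bad_event))
print(request_id_errors(good_event))
```

If this is the cause, casting the value to a string where the job sets it would be the likely fix.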

Wed, Mar 26, 4:11 PM · MW-Interfaces-Team, WMF-JobQueue, Event-Platform, Data-Engineering
Ottomata added a comment to T390013: Bug: event validation error: bad mediawiki.job.* meta.request_id field.

Likely culprit is in this block:

Wed, Mar 26, 4:09 PM · MW-Interfaces-Team, WMF-JobQueue, Event-Platform, Data-Engineering
Ottomata added a comment to T314956: [Event Platform] Declare webrequest as an Event Platform stream.

I still think we should do this, even if we are not doing it now.

Wed, Mar 26, 2:01 PM · Patch-For-Review, Data-Engineering, Event-Platform
Ottomata added a comment to T390012: Bug: event validation error: mediawiki.page-restrictions-change.

Hm, actually this looks like it has been happening longer than a week:

Wed, Mar 26, 2:35 AM · Event-Platform, Data-Engineering
Ottomata created T390013: Bug: event validation error: bad mediawiki.job.* meta.request_id field.
Wed, Mar 26, 2:25 AM · MW-Interfaces-Team, WMF-JobQueue, Event-Platform, Data-Engineering
Ottomata created T390012: Bug: event validation error: mediawiki.page-restrictions-change.
Wed, Mar 26, 2:22 AM · Event-Platform, Data-Engineering

Yesterday

Ottomata added a comment to T389819: Remove deprecated parameters from ServerSideAccountCreation.

Or... you could make a brand new Metrics Platform based instrumentation

oh yes, that would be best!

Tue, Mar 25, 7:23 PM · Data-Engineering, Event-Platform, Patch-For-Review, Technical-Debt, MediaWiki-extensions-Campaigns, MediaWiki-extensions-EventLogging
Ottomata updated subscribers of T389602: Model domain events for logged actions.
Tue, Mar 25, 7:17 PM · MW-Interfaces-Team, MediaWiki-DomainEvents
Ottomata added a comment to T389602: Model domain events for logged actions.

The aggregate is defined as the collection of all things that are changed together atomically

Tue, Mar 25, 7:16 PM · MW-Interfaces-Team, MediaWiki-DomainEvents
Ottomata moved T389666: NEW/CHANGE FEATURE REQUEST: make available the centralauth.globaluser table in Data Lake from Incoming (new tickets) to Needs Clarification on the Data-Engineering board.
Tue, Mar 25, 6:00 PM · Experimentation Lab, Data-Platform, Data-Engineering
Ottomata edited projects for T389720: kube-apiserver.service on dse-k8s-ctrl restarting and paging, added: Data-Platform-SRE; removed Data-Engineering.
Tue, Mar 25, 5:56 PM · Data-Platform-SRE (2025.03.22 - 2025.04.11), Kubernetes
Ottomata moved T389819: Remove deprecated parameters from ServerSideAccountCreation from Incoming (new tickets) to Needs Clarification on the Data-Engineering board.
Tue, Mar 25, 5:54 PM · Data-Engineering, Event-Platform, Patch-For-Review, Technical-Debt, MediaWiki-extensions-Campaigns, MediaWiki-extensions-EventLogging
Ottomata added a comment to T389819: Remove deprecated parameters from ServerSideAccountCreation.

@Reedy how can we help?

Tue, Mar 25, 5:54 PM · Data-Engineering, Event-Platform, Patch-For-Review, Technical-Debt, MediaWiki-extensions-Campaigns, MediaWiki-extensions-EventLogging
Ottomata updated subscribers of T389903: Analytics Cluster Dataset Usage Discovery Task.
Tue, Mar 25, 5:49 PM · Data-Engineering (Q4 2025 April 1st - June 30th)
Ottomata added a project to T370551: Bug: Cassandra Unique Devices data quality issue: mobile data: Movement-Insights.
Tue, Mar 25, 5:46 PM · Movement-Insights, Data-Engineering, Wikifunctions, Abstract Wikipedia team, Data-Platform
Ottomata added a comment to T370551: Bug: Cassandra Unique Devices data quality issue: mobile data.

Related: T214998: RFC: Remove m-dot subdomain, serve mobile and desktop variants through the same URL

Tue, Mar 25, 5:45 PM · Movement-Insights, Data-Engineering, Wikifunctions, Abstract Wikipedia team, Data-Platform
Ottomata moved T373871: Log Api-User-Agent header in Turnilo from Incoming (new tickets) to Next Up on the Data-Engineering board.
Tue, Mar 25, 5:42 PM · OKR-Work, Data-Engineering, Traffic, MW-Interfaces-Team
Ottomata added a project to T373871: Log Api-User-Agent header in Turnilo: OKR-Work.

This should fit into work for WE5 FY2025-2026

Tue, Mar 25, 5:41 PM · OKR-Work, Data-Engineering, Traffic, MW-Interfaces-Team
Ottomata added a comment to T376903: [Spike] How to Instrument metrics for summarization experiment.

Indeed for the upcoming iteration we have T389097 in our current sprint

Tue, Mar 25, 5:21 PM · Web-Team-Backlog-Archived (FY2024-25 Q2 Sprint 3), FY2024-25 KR 3.1 Content Discovery
Ottomata added a comment to T389602: Model domain events for logged actions.

Can you say more about the reasoning around wanting to make a LogEntryCreated event a subordinate of PageAggregate?

Tue, Mar 25, 5:19 PM · MW-Interfaces-Team, MediaWiki-DomainEvents
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Okay, T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields is done. I also fixed and merged @jsn.sherman's MR 50 in schemas-event-secondary.

Tue, Mar 25, 4:27 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata moved T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields from Next Up to Done on the Data-Engineering (Q3 2025 January 1st - March 31th) board.
Tue, Mar 25, 4:25 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata claimed T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields.
Tue, Mar 25, 4:25 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata triaged T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields as High priority.
Tue, Mar 25, 4:25 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata moved T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields from Incoming (new tickets) to Q3 2025 January 1st - March 31th on the Data-Engineering board.
Tue, Mar 25, 4:25 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata updated the task description for T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields.
Tue, Mar 25, 4:24 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata added a comment to T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields.

@Ottomata in T382147 I went back to just editing the 1.0.0 examples in place instead of incrementing and tests are still failing today;

Tue, Mar 25, 4:09 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform

Mon, Mar 24

Ottomata added a comment to T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields.

Patch up for review. I will not have time to write an extensive unit test for this.

Mon, Mar 24, 9:41 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
jsn.sherman awarded T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields a Manufacturing Defect? token.
Mon, Mar 24, 9:37 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata updated the task description for T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields.
Mon, Mar 24, 9:34 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata created T389881: Bug: jsonschema-tools generates non deterministic examples for date format fields.
Mon, Mar 24, 9:25 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Ah ha! I found it.

Mon, Mar 24, 9:22 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

^ sounds like it will pass until tomorrow!

Mon, Mar 24, 7:41 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Hm, you could consider manually editing the 1.0.0 versions and fixing their examples too. Just in case?

Mon, Mar 24, 6:47 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Very strange.

Mon, Mar 24, 6:45 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T389542: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation .

^ <3 thank you!

Mon, Mar 24, 5:48 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Experimentation Lab, Data-Platform
Ottomata added a comment to T389822: ServerSideAccountCreation violates Identifier Naming Rules?.

broke those rules ;)

Mon, Mar 24, 5:32 PM · Data-Engineering, Technical-Debt, Event-Platform, MediaWiki-extensions-Campaigns
Ottomata added a comment to T389822: ServerSideAccountCreation violates Identifier Naming Rules?.

New fields should use the new rules.

Mon, Mar 24, 5:27 PM · Data-Engineering, Technical-Debt, Event-Platform, MediaWiki-extensions-Campaigns
Ottomata added a comment to T389602: Model domain events for logged actions.

You could do both 1. and 2.?

Mon, Mar 24, 5:26 PM · MW-Interfaces-Team, MediaWiki-DomainEvents
Ottomata added a comment to T389822: ServerSideAccountCreation violates Identifier Naming Rules?.

This is a legacy schema, migrated from the older eventlogging system. From https://gitlab.wikimedia.org/repos/data-engineering/schemas-event-secondary

Mon, Mar 24, 5:22 PM · Data-Engineering, Technical-Debt, Event-Platform, MediaWiki-extensions-Campaigns
Ottomata renamed T382614: Document the Who Are Moderators work publicly from Document the work publicly to Document the Who Are Moderators work publicly.
Mon, Mar 24, 5:19 PM · Essential-Work, Research
Ottomata added a comment to T389602: Model domain events for logged actions.

Hm, can you clarify: do you mean to create events for logs? Or to use events to insert into the logging table?

Mon, Mar 24, 4:55 PM · MW-Interfaces-Team, MediaWiki-DomainEvents

Thu, Mar 20

Ottomata added a comment to T341649: Provide an easy way for MediaWiki to fetch aggregate statistics from the data lake.

BTW I gave a state of the data platform talk at the WMF Data Strategy convening in November 2024.

Thu, Mar 20, 8:26 PM · Data-Engineering, Data Pipelines
Ottomata added a comment to T389542: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation .

If you feel frisky, please submit a patch. The files you'll need to change are static html, at

Thu, Mar 20, 8:15 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Experimentation Lab, Data-Platform
Ottomata updated subscribers of T341649: Provide an easy way for MediaWiki to fetch aggregate statistics from the data lake.
Thu, Mar 20, 8:11 PM · Data-Engineering, Data Pipelines
Ottomata added a comment to T341649: Provide an easy way for MediaWiki to fetch aggregate statistics from the data lake.

Related: T258511: Data Lake incremental Data Updates

Thu, Mar 20, 8:10 PM · Data-Engineering, Data Pipelines
Ottomata awarded T341649: Provide an easy way for MediaWiki to fetch aggregate statistics from the data lake a Barnstar token.
Thu, Mar 20, 8:09 PM · Data-Engineering, Data Pipelines
Ottomata added a project to T341649: Provide an easy way for MediaWiki to fetch aggregate statistics from the data lake: Data-Engineering.
Thu, Mar 20, 8:08 PM · Data-Engineering, Data Pipelines
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Oh! incrementing the version is fine too.

Thu, Mar 20, 6:35 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Hm, technically you can just update. This only changes the example, so it is fine as long as you rematerialize all the right files (the fragment and the dependent concrete schemas).

Thu, Mar 20, 6:34 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Restricted Application added a project to T307328: Scalability issues of recentchanges table: Moderator-Tools-Team.
Thu, Mar 20, 1:37 PM · Moderator-Tools-Team, Wikidata-Campsite, Wikidata, Wikimedia-Slow-DB-Query, Data-Persistence (work done), MediaWiki-Recent-changes

Wed, Mar 19

Ottomata added a comment to T366544: Use the Spark-Iceberg built in CDC mechanism to PoC a replacement for wmf.wikimedia_wikitext_current.

(BTW, maybe the more correct term for this is 'star' rather than 'snowflake'? @JAllemandou ?)

Wed, Mar 19, 7:27 PM · Data-Engineering (Q3 2025 January 1st - March 31th)
Ottomata added a comment to T366544: Use the Spark-Iceberg built in CDC mechanism to PoC a replacement for wmf.wikimedia_wikitext_current.

+1 for wmf_content.mediawiki_content_current_v1

Wed, Mar 19, 6:18 PM · Data-Engineering (Q3 2025 January 1st - March 31th)
Ottomata closed T379936: Make DomainEvents serializable, a subtask of T379935: DomainEvents - Broadcasting and receiving cross-process events, as Invalid.
Wed, Mar 19, 5:19 PM · Data-Engineering-Roadmap, Epic, MediaWiki-DomainEvents, MW-Interfaces-Team
Ottomata closed T379936: Make DomainEvents serializable as Invalid.

We aren't doing any active development work on this, and the plan may change significantly when we do.

Wed, Mar 19, 5:19 PM · Data-Engineering (Q3 2025 January 1st - March 31th), MW-Interfaces-Team
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Maybe there is a subtle race condition in the test when it dereferences and materializes current.yaml to compare against the materialized file. Somehow maybe it is checking the current.yaml file for examples, not seeing any, generating them, THEN dereferencing and merging schemas together?

Wed, Mar 19, 5:05 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

I think I see it. I'm not exactly sure why this is happening, but:

Wed, Mar 19, 4:59 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

If a schema does not have examples, and shouldGenerateExamples is enabled (it is), then https://github.com/json-schema-faker/json-schema-faker is used to generate the examples.

Wed, Mar 19, 4:53 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

Very strange.

Wed, Mar 19, 4:50 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata added a comment to T384329: Update canary_events DAG to use an internal domain and/or the service mesh to obtain its eventstream config.

@brouberol lots of patches, is this done?

Wed, Mar 19, 3:57 PM · Data-Platform-SRE (2025.03.01 - 2025.03.21), Data-Engineering (Q3 2025 January 1st - March 31th), Essential-Work
Ottomata added a comment to T365203: [Data Quality] Implement wiki completeness check for MediaWiki History.

Thanks Sandra!

Wed, Mar 19, 2:56 PM · Essential-Work, Data-Engineering (Q3 2025 January 1st - March 31th)
Ottomata added a comment to T388397: Make webrequest_frontend being ingested using the in-data `dt` field.

BTW, there was a request to do this for varnishkafka, but it was declined because the intent was to do it from ATS instead:

Wed, Mar 19, 2:21 PM · Traffic, DPE HAProxy Migration, Data-Engineering (Q3 2025 January 1st - March 31th)

Tue, Mar 18

Ottomata added a comment to T333497: Include image/file changes in page-links-change.

which tried to unify the different link types internally

Tue, Mar 18, 8:29 PM · Data-Engineering, Event-Platform, EventStreams
Ottomata added a comment to T389248: Create a template for building Maven projects.

Duplicate of T386406: Create Gitlab CI templates for JVM packages?

Tue, Mar 18, 8:09 PM · Java-Scala-Standardization
Ottomata moved T375298: Stop using spark.jars.packages from Backlog to Next Up on the Data-Engineering board.
Tue, Mar 18, 5:57 PM · DPE-Mediawiki-Content, Data-Engineering
Ottomata added a subtask for T138207: [Open question] Improve bot identification at scale: Unknown Object (Task).
Tue, Mar 18, 5:54 PM · Data-Engineering-Icebox, Data-Engineering, Research-Freezer
Ottomata removed a project from T388855: Search Update Pipeline requests to Action API are logged as coming from 127.0.0.1: Data-Engineering.
Tue, Mar 18, 5:39 PM · Data-Platform-SRE (2025.03.22 - 2025.04.11), Discovery-Search, serviceops
Ottomata added a comment to T387631: Are eqiad.mediawiki.job.CampaignEventsFindPotentialInvitees jobs being processed in beta?.

I don't know what is wrong, but some details to help you search:

Tue, Mar 18, 4:13 PM · User-bd808, WMF-JobQueue, Beta-Cluster-Infrastructure
Ottomata added a comment to T389026: Rethink rev_sha1 field.

My question is whether a bigint or an int field would be "good enough" for your use cases.

Tue, Mar 18, 1:34 PM · Schema-change, DBA, Data-Engineering, MediaWiki-Core-Revision-backend
Ottomata added a member for Trusted-Contributors: SSalgaonkar-WMF.
Tue, Mar 18, 12:01 AM

Mon, Mar 17

Ottomata updated subscribers of T376903: [Spike] How to Instrument metrics for summarization experiment.

@Jdlrobson-WMF @SToyofuku-WMF a couple of your comments made me want to ask a question!

Mon, Mar 17, 6:18 PM · Web-Team-Backlog-Archived (FY2024-25 Q2 Sprint 3), FY2024-25 KR 3.1 Content Discovery
Ottomata added a comment to T380867: Allow looking up permissions directly across multiple wikis.

Aye okay, for a minute there it was also reminding me of T309738: Move Mediawiki QueryPages computation to Hadoop.

Mon, Mar 17, 6:01 PM · Trust and Safety Product Team, MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, CheckUser-GlobalContributions
Ottomata added a comment to T388925: Emit the PageMovedEvent before the PageRevisionUpdateEvent during page moves.

mass-renaming pages using a maintenance script would emit page moved events, but would not create dummy revisions.

Mon, Mar 17, 5:35 PM · MediaWiki-DomainEvents, MediaWiki-Page-rename, MW-Interfaces-Team, MediaWiki-Core-Revision-backend
Ottomata added a comment to T380867: Allow looking up permissions directly across multiple wikis.

Ideally the system would be generating the data on-the-fly because when permissions change they should ideally be applied immediately.

Mon, Mar 17, 5:27 PM · Trust and Safety Product Team, MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, CheckUser-GlobalContributions
Ottomata added a comment to T384874: DomainEvents - [Hypothesis] WE5.2.6 Event Broadcasting Discovery & Design.

Status update:
Design doc is in Draft 1.0 state and is ready for general review.

Mon, Mar 17, 5:11 PM · MW-Interfaces-Team (MWI-Roadmap), Data-Engineering (Q3 2025 January 1st - March 31th), MediaWiki-DomainEvents
Ottomata added a comment to T380867: Allow looking up permissions directly across multiple wikis.

Ignorant drive by comment:

Mon, Mar 17, 5:08 PM · Trust and Safety Product Team, MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, CheckUser-GlobalContributions
Ottomata updated the task description for T365203: [Data Quality] Implement wiki completeness check for MediaWiki History.
Mon, Mar 17, 4:05 PM · Essential-Work, Data-Engineering (Q3 2025 January 1st - March 31th)
Ottomata updated subscribers of T241741: Keep canonical_data.wikis updated.
Mon, Mar 17, 4:04 PM · Patch-For-Review, Data-Engineering-Icebox, Data-Engineering, Data Pipelines, Movement-Insights
Ottomata updated subscribers of T339928: Automatically update the canonical data tables.
Mon, Mar 17, 4:04 PM · Data-Engineering-Icebox, Data-Engineering, Analytics-Canonical-Data, Movement-Insights
Ottomata added a comment to T387908: [BUG] new eventgate-wikimedia header enrich config loses client set headers.

Note: this task is currently blocking T383814: Upgrade eventgate-wikimedia to node20 as the node 20 upgrade was done after the addition of enrich_fields_with_http_headers.

Mon, Mar 17, 3:14 PM · Data-Engineering (Q3 2025 January 1st - March 31th), Event-Platform, Product-Analytics
Ottomata updated subscribers of T389026: Rethink rev_sha1 field.

As this is being considered, please keep in mind that rev_sha1 is used in downstream data pipelines: in the WMF Data Platform, for training ML models, and I'd expect it is used by users of cloud replicas too!

Mon, Mar 17, 2:06 PM · Schema-change, DBA, Data-Engineering, MediaWiki-Core-Revision-backend
Ottomata added a comment to T388693: Requesting access to analytics-privatedata-users group for DSantamaria.

Hi @BCornwall ! group owner approval for analytics-privatedata-users is not needed for WMF or WMDE staff.

Mon, Mar 17, 2:00 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata added a comment to T388925: Emit the PageMovedEvent before the PageRevisionUpdateEvent during page moves.

FWIW, here is how a move looks in mediawiki.page_change.v1:

Mon, Mar 17, 1:41 PM · MediaWiki-DomainEvents, MediaWiki-Page-rename, MW-Interfaces-Team, MediaWiki-Core-Revision-backend
Ottomata added a comment to T388925: Emit the PageMovedEvent before the PageRevisionUpdateEvent during page moves.

Since these are different events, should a listener be reasoning about the order they receive them in?

Mon, Mar 17, 1:09 PM · MediaWiki-DomainEvents, MediaWiki-Page-rename, MW-Interfaces-Team, MediaWiki-Core-Revision-backend

Fri, Mar 14

Ottomata added a parent task for T120242: Eventually Consistent MediaWiki State Change Events: T291120: MediaWiki Event Carried State Transfer - Problem Statement.
Fri, Mar 14, 7:11 PM · Data-Engineering, Analytics, DBA, WMF-Architecture-Team, Platform Team Legacy (Later), Event-Platform, Services (later)
Ottomata added a subtask for T291120: MediaWiki Event Carried State Transfer - Problem Statement: T120242: Eventually Consistent MediaWiki State Change Events.
Fri, Mar 14, 7:11 PM · Data-Engineering, Platform Engineering, Event-Platform, tech-decision-forum

Thu, Mar 13

Ottomata updated subscribers of T388825: Some events in mediawiki.page_change.v1 refers to auth.wikimedia.org in meta.uri and meta.domain.

Huh! For the reconciliation.

Thu, Mar 13, 8:03 PM · Data-Engineering, Event-Platform
Ottomata added a comment to T209453: Refine: Use Spark SQL instead of Hive JDBC .

Amazing. So we just need to get that bug fixed, convert everything to Iceberg, and then we can stop using JDBC! ;)

Thu, Mar 13, 7:04 PM · Data-Engineering-Icebox, Data-Engineering, Data Pipelines
Ottomata added a comment to T388825: Some events in mediawiki.page_change.v1 refers to auth.wikimedia.org in meta.uri and meta.domain.

Actually, this could be a problem for Dumps 2 via page_content_change enrichment! Enrichment happens via an Action API URI constructed using meta.domain.
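
A rough sketch of why that matters, not the actual enrichment code: the function name, the event shape, and the query parameters chosen here are assumptions for illustration. It only shows that a URI built from meta.domain follows whatever domain the event carries.

```python
from urllib.parse import urlencode


def action_api_url(event: dict, rev_id: int) -> str:
    """Build an Action API URL for a revision from the event's meta.domain.

    Hypothetical helper: assumes the API lives at /w/api.php on the
    event's domain and that content is fetched by revision ID.
    """
    domain = event["meta"]["domain"]
    query = urlencode({
        "action": "query",
        "prop": "revisions",
        "revids": rev_id,
        "rvprop": "content",
        "rvslots": "main",
        "format": "json",
    })
    return f"https://{domain}/w/api.php?{query}"


# An event whose meta.domain is auth.wikimedia.org produces a request
# against that domain instead of the wiki the page belongs to.
event = {"meta": {"domain": "auth.wikimedia.org"}}
print(action_api_url(event, 123))
```

Under these assumptions, any event carrying auth.wikimedia.org in meta.domain would send the content lookup to the wrong host.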

Thu, Mar 13, 6:00 PM · Data-Engineering, Event-Platform
Ottomata added a comment to T388825: Some events in mediawiki.page_change.v1 refers to auth.wikimedia.org in meta.uri and meta.domain.

referencing the expected database.

Thu, Mar 13, 5:58 PM · Data-Engineering, Event-Platform
Ottomata added a comment to T388825: Some events in mediawiki.page_change.v1 refers to auth.wikimedia.org in meta.uri and meta.domain.

Responsible code:

Thu, Mar 13, 5:45 PM · Data-Engineering, Event-Platform
Ottomata added a comment to T209453: Refine: Use Spark SQL instead of Hive JDBC .

ADD COLUMN revision_content_slots.value.origin_rev_id bigint;

Thu, Mar 13, 3:39 PM · Data-Engineering-Icebox, Data-Engineering, Data Pipelines

Wed, Mar 12

Ottomata added a comment to T387010: Domain Events: allow listeners to execute code via the job queue.

Thought: if we do this, is there a way for the subscriber to submit an event to be processed by a foreign wiki?

Wed, Mar 12, 5:51 PM · Patch-For-Review, MediaWiki-DomainEvents
Ottomata updated subscribers of T367346: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets.
Wed, Mar 12, 4:48 PM · Data-Engineering
Ottomata added a comment to T382147: Configure a metrics platform stream with a custom schema to record how Nuke users filter pages to delete.

By default, streams have canary_events_enabled.
https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Hadoop_Event_Ingestion_Lifecycle#Canary_Events

Wed, Mar 12, 4:27 PM · Experimentation Lab Radar, Moderator-Tools-Team (Kanban), MediaWiki-extensions-Nuke
Ottomata closed T357468: [Dataset Config Store] Setup initial CI checks, a subtask of T354557: Dataset Config Store, as Declined.
Wed, Mar 12, 3:34 PM · Epic, Data-Engineering
Ottomata closed T357468: [Dataset Config Store] Setup initial CI checks as Declined.

Being bold and declining task, please reopen if incorrect.

Wed, Mar 12, 3:34 PM · Data-Engineering
Ottomata closed T362291: [Datasets Config] Define and implement SLOs, monitoring and logging as Declined.

Being bold and declining this island task.

Wed, Mar 12, 3:33 PM · Data-Engineering
Ottomata edited projects for T388537: Migrate data-engineering jobs to mw-cron, added: Structured Data Engineering, Image-Suggestions; removed Data-Engineering.
Wed, Mar 12, 3:38 AM · Image-Suggestions, Structured-Data-Backlog, Structured Data Engineering, serviceops
Ottomata renamed T384723: [Dumps 2] Pay down technical debt from Pay down technical debt to [Dumps 2] Pay down technical debt.
Wed, Mar 12, 1:41 AM · DPE-Mediawiki-Content