
Ottomata (Andrew Otto)
User

User Details

User Since
Oct 9 2014, 4:50 PM (505 w, 6 d)
Availability
Available
IRC Nick
ottomata
LDAP User
Ottomata
MediaWiki User
Ottomata

Recent Activity

Today

Ottomata added a comment to T367923: Event validation errors for mediawiki.page_change.v1 since 2024-03-20.

Hm. Technically, removing required-ness of a field is a breaking change. jsonschema-tools will require a major version bump.

Wed, Jun 19, 3:31 PM · Patch-For-Review, Data-Engineering, Event-Platform
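The comment above says that dropping a field from `required` is a breaking change. A minimal sketch of that check follows. It is not jsonschema-tools' actual logic (that tool is not written in Python), and the function names and the example schemas, including which fields are required, are hypothetical.

```python
def removed_required_fields(old_schema: dict, new_schema: dict) -> set:
    """Return fields that were required in the old schema but are not in the new one."""
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    return old_required - new_required


def needs_major_bump(old_schema: dict, new_schema: dict) -> bool:
    """Treat any removal from `required` as breaking: consumers may rely on the field."""
    return bool(removed_required_fields(old_schema, new_schema))


# Hypothetical schema fragments, for illustration only.
old = {"type": "object", "required": ["meta", "performer"]}
new = {"type": "object", "required": ["meta"]}

print(removed_required_fields(old, new))
print(needs_major_bump(old, new))
```

The reasoning is that a consumer written against the old schema could assume the field is always present, so a new version that allows it to be absent can break that consumer.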

Yesterday

Ottomata added a comment to T367923: Event validation errors for mediawiki.page_change.v1 since 2024-03-20.

I think I introduced this bug in T342487: [Event Platform] Actor performing suppression revealed publicly. We are not setting performer correctly, but we never made performer a non-required field in the event schema?

Tue, Jun 18, 8:47 PM · Patch-For-Review, Data-Engineering, Event-Platform
Ottomata created T367923: Event validation errors for mediawiki.page_change.v1 since 2024-03-20.
Tue, Jun 18, 8:37 PM · Patch-For-Review, Data-Engineering, Event-Platform
Ottomata claimed T346355: [Event Platform] Error: Call to a member function exists() on null (via EventBus PageChangeEventSerializer).
Tue, Jun 18, 6:13 PM · Event-Platform, Wikimedia-production-error, Data-Engineering
Ottomata added a comment to T346355: [Event Platform] Error: Call to a member function exists() on null (via EventBus PageChangeEventSerializer).

Just learned of this today. Will look into it.

Tue, Jun 18, 6:13 PM · Event-Platform, Wikimedia-production-error, Data-Engineering
Ottomata added a comment to T346046: [Search Update Pipeline] Source streams for private wikis.

As I wrote the 'Practical Short Term Solution' I came up again against the awkwardness of the wgEnableEventBus config. I wonder if we should consider just removing and refactoring that, and instead making use of the EventStreamConfig producer-specific setting (producers.mediawiki_eventbus.enabled = false) added in T259712: Allow disabling/enabling configured streams via wgEventStreams config.

Tue, Jun 18, 3:55 PM · Discovery-Search (Current work), Data-Engineering, CirrusSearch
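The producer-specific setting mentioned in the comment above (producers.mediawiki_eventbus.enabled = false) could be read roughly as in this sketch. The helper name, the stream names, the surrounding config shape, and the default of "enabled when unset" are all assumptions for illustration, not the EventStreamConfig implementation.

```python
def producer_enabled(stream_config: dict, producer: str, default: bool = True) -> bool:
    """Look up producers.<producer>.enabled, falling back to `default` when unset."""
    producers = stream_config.get("producers", {})
    return producers.get(producer, {}).get("enabled", default)


# Hypothetical stream configs, keyed by stream name.
streams = {
    "example.private_wiki_stream": {
        "producers": {"mediawiki_eventbus": {"enabled": False}},
    },
    "example.public_stream": {},
}

for name, config in streams.items():
    print(name, producer_enabled(config, "mediawiki_eventbus"))
```

A per-stream setting like this lets one stream be switched off for a producer without a global on/off flag.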
Ottomata updated subscribers of T259712: Allow disabling/enabling configured streams via wgEventStreams config.
Tue, Jun 18, 3:51 PM · Event-Platform, MW-1.40-notes (1.40.0-wmf.8; 2022-10-31), Data-Engineering, Better Use Of Data, Product-Data-Infrastructure, Core Platform Team Initiatives (Modern Event Platform (TEC2))
Ottomata closed T274775: Document our level of support of pyspark-based jobs as Declined.

Being bold and declining.

Tue, Jun 18, 3:48 PM · Data-Engineering, Documentation
Ottomata closed T274775: Document our level of support of pyspark-based jobs, a subtask of T311413: Documentathon, as Declined.
Tue, Jun 18, 3:48 PM · Data Pipelines, Data-Engineering-Planning
Ottomata closed T313859: Document destination_event_service Event Platform stream configuration as Resolved.

This is done:
https://wikitech.wikimedia.org/wiki/Event_Platform/Stream_Configuration#destination_event_service

Tue, Jun 18, 3:48 PM · Documentation, Data-Engineering
Ottomata closed T313859: Document destination_event_service Event Platform stream configuration, a subtask of T311413: Documentathon, as Resolved.
Tue, Jun 18, 3:47 PM · Data Pipelines, Data-Engineering-Planning
Ottomata added a comment to T346046: [Search Update Pipeline] Source streams for private wikis.

Discussed this in a meeting with Gabriele today. We discussed an ideal long-term solution and also a practical short-term solution.

Tue, Jun 18, 3:46 PM · Discovery-Search (Current work), Data-Engineering, CirrusSearch
Ottomata added a comment to T356762: [Refine refactoring] Extract refine schema management into a dedicated tool.

Hi, we should probably do T366487: Event Platform schemas should not support type changes to structs as array element or map value types along with this work. It is not urgent (it has been the status quo for a long time), but it would clarify what we can support in Hive now, and also in the Iceberg future.

Tue, Jun 18, 1:16 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Patch-For-Review
Ottomata renamed T366562: [Event Platforom] - Add schema CI test that array ensures properties with object types also enumerate object properties from Test that array properties with object types also enumerate object properties to [Event Platforom] - Add schema CI test that array ensures properties with object types also enumerate object properties.
Tue, Jun 18, 1:15 PM · EventStreams
Ottomata added a parent task for T366487: Event Platform schemas should not support type changes to structs as array element or map value types: T356762: [Refine refactoring] Extract refine schema management into a dedicated tool.
Tue, Jun 18, 1:14 PM · Event-Platform, Data-Engineering
Ottomata added a subtask for T356762: [Refine refactoring] Extract refine schema management into a dedicated tool: T366487: Event Platform schemas should not support type changes to structs as array element or map value types.
Tue, Jun 18, 1:14 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Patch-For-Review
Ottomata added a comment to T256891: EventGate throttling and DOS prevention.

Related: {T306580}

Tue, Jun 18, 12:45 PM · Data-Engineering-Icebox, Analytics
Ottomata added a subtask for T206785: Modern Event Platform: Stream Intake Service (EventGate): Implementation: T256891: EventGate throttling and DOS prevention.
Tue, Jun 18, 12:45 PM · Analytics-Kanban, Platform Team Legacy (Watching / External), Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics
Ottomata added a parent task for T256891: EventGate throttling and DOS prevention: T206785: Modern Event Platform: Stream Intake Service (EventGate): Implementation.
Tue, Jun 18, 12:45 PM · Data-Engineering-Icebox, Analytics
Ottomata added a comment to T367116: mw-page-content-change-enrich flink app is missing in k8s staging.

Ya let's close.

Tue, Jun 18, 12:35 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data-Engineering, Event-Platform

Mon, Jun 17

Ottomata added a comment to T367810: Spike: Can we recreate a skeleton page_change (revision_change) event from DB replica alone?.

are not available in mariadb?

Mon, Jun 17, 9:02 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

FYI, we discussed some of this in today's Dumps 2.0 meeting. Notes here

Mon, Jun 17, 8:56 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T264131: Experiment with package publishing workflows on GitLab.

FYI:

Mon, Jun 17, 8:54 PM · Release-Engineering-Team (Priority Backlog 📥), SecTeam-Processed, GitLab (Integrations), User-brennen, GitLab-Test
Ottomata updated the task description for T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator.
Mon, Jun 17, 5:49 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

@fkaelin I think your comment is an argument for pushing the 'eventual consistency' mechanism as far upstream as possible.

Mon, Jun 17, 4:32 PM · Dumps 2.0 (Kanban Board)

Thu, Jun 13

Ottomata updated subscribers of T367406: Migrate Python projects that depend on Archiva for deployment.
Thu, Jun 13, 3:58 PM · Java-Scala-Standardization, Discovery-Search, Data-Engineering, Data-Platform-SRE
Ottomata added a comment to T367406: Migrate Python projects that depend on Archiva for deployment.

FYI, DPE has gitlab CI templates to automate releasing and publishing of python packages to GitLab.

Thu, Jun 13, 3:58 PM · Java-Scala-Standardization, Discovery-Search, Data-Engineering, Data-Platform-SRE
Ottomata added a comment to T359583: Provide a way to get sampled POST body logs.

If/when these go to logstash, they should likely go in http.request.body.content as a string:
https://www.elastic.co/guide/en/ecs/current/ecs-http.html#field-http-request-body-content

Thu, Jun 13, 1:59 PM · MW-Interfaces-Team, Sustainability (Incident Followup), Observability-Logging
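As a sketch of the suggestion in the comment above, a sampled POST body could be placed under the ECS field http.request.body.content as a string. The function name, the truncation limit, and the decoding policy are assumptions for illustration. The field names http.request.body.content and http.request.body.bytes come from ECS.

```python
import json


def post_body_log_fields(body: bytes, max_chars: int = 1024) -> dict:
    """Build ECS-style nested fields for a sampled POST body.

    The body is decoded leniently and truncated so that the logged value is
    always a bounded string, whatever the payload contained.
    """
    text = body.decode("utf-8", errors="replace")
    return {
        "http": {
            "request": {
                "body": {
                    "bytes": len(body),          # size of the original payload
                    "content": text[:max_chars],  # string, possibly truncated
                }
            }
        }
    }


fields = post_body_log_fields(b'{"a": 1}')
print(json.dumps(fields))
```

Keeping the content as a plain string, rather than parsing it into nested JSON, avoids creating arbitrary new fields in the log index.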

Wed, Jun 12

Ottomata added a subtask for T294024: [Airflow] Automate sync'ing archiva packages to HDFS: T322690: Add support for repository artifacts in Airflow.
Wed, Jun 12, 10:44 PM · Data-Engineering-Kanban, Data Pipelines, Data-Engineering
Ottomata added a parent task for T322690: Add support for repository artifacts in Airflow: T294024: [Airflow] Automate sync'ing archiva packages to HDFS.
Wed, Jun 12, 10:44 PM · Data-Engineering, Data Pipelines
Ottomata added a subtask for T322690: Add support for repository artifacts in Airflow: T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.
Wed, Jun 12, 10:43 PM · Data-Engineering, Data Pipelines
Ottomata added a parent task for T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS: T322690: Add support for repository artifacts in Airflow.
Wed, Jun 12, 10:43 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata added a comment to T350911: Redesign Data Platform docs on Wikitech.

Heroic! <3

Wed, Jun 12, 9:30 PM · Epic, Data-Engineering, Goal, Tech-Docs-Team
Ottomata added a comment to T367034: Make MetricsPlatform MediaWiki extension deliver JS and PHP Client Libraries.

I have a feeling that Timo will have objections about 3. and 4. These are ones possibly used by 3rd parties?

Wed, Jun 12, 4:56 PM · Data Products, Metrics Platform Backlog
Ottomata updated the task description for T353817: Create legacy EventLogging proxy HTTP intake (for MediaWikiPingback) endpoint to EventGate.
Wed, Jun 12, 3:53 PM · Patch-For-Review, MW-1.43-notes (1.43.0-wmf.8; 2024-06-04), MediaWiki-Platform-Team (Radar), Data-Engineering, Event-Platform, MediaWiki-General
Ottomata added a comment to T367322: Create a global Maven package registry in Gitlab.

At the moment, I'd like to focus on the Maven part and hope that we don't abuse it as much as we've abused Archiva!

Wed, Jun 12, 2:56 PM · GitLab (Administration, Settings & Policy), Java-Scala-Standardization, Release-Engineering-Team, Data-Platform-SRE
Ottomata added a comment to T367322: Create a global Maven package registry in Gitlab.

A potential use case we have:

Wed, Jun 12, 2:54 PM · GitLab (Administration, Settings & Policy), Java-Scala-Standardization, Release-Engineering-Team, Data-Platform-SRE
Ottomata added a comment to T367322: Create a global Maven package registry in Gitlab.

We should probably try to isolate different type of packages

Possibly! Why do you think so?

Wed, Jun 12, 2:51 PM · GitLab (Administration, Settings & Policy), Java-Scala-Standardization, Release-Engineering-Team, Data-Platform-SRE
Ottomata updated subscribers of T367322: Create a global Maven package registry in Gitlab.

cc @gmodena @tchin

Wed, Jun 12, 2:45 PM · GitLab (Administration, Settings & Policy), Java-Scala-Standardization, Release-Engineering-Team, Data-Platform-SRE
Ottomata added a comment to T367322: Create a global Maven package registry in Gitlab.

I like this idea.

Wed, Jun 12, 2:44 PM · GitLab (Administration, Settings & Policy), Java-Scala-Standardization, Release-Engineering-Team, Data-Platform-SRE
Ottomata updated the task description for T323828: Update Pingback to use the Event Platform.
Wed, Jun 12, 2:12 PM · MW-1.43-notes (1.43.0-wmf.10; 2024-06-18), MediaWiki-Platform-Team (Radar), MediaWiki-General
Ottomata renamed T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils from Replace service runner with a simplified library to better support metrics and debugging to Replace service runner with a simplified library to better support metrics and debugging: service-utils.
Wed, Jun 12, 1:49 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata placed T361769: Migrate and re-deploy eventstreams using new service runner up for grabs.
Wed, Jun 12, 1:48 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata placed T361768: Migrate and re-deploy eventgate using new service runner up for grabs.
Wed, Jun 12, 1:48 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a parent task for T361769: Migrate and re-deploy eventstreams using new service runner: T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils.
Wed, Jun 12, 1:47 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a subtask for T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils: T361769: Migrate and re-deploy eventstreams using new service runner.
Wed, Jun 12, 1:47 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a subtask for T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils: T361770: Support metrics platform backend migration to new service runner.
Wed, Jun 12, 1:47 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a parent task for T361770: Support metrics platform backend migration to new service runner: T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils.
Wed, Jun 12, 1:47 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a subtask for T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils: T361768: Migrate and re-deploy eventgate using new service runner.
Wed, Jun 12, 1:47 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a parent task for T361768: Migrate and re-deploy eventgate using new service runner: T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils.
Wed, Jun 12, 1:47 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

Hm, @xcollazo @gmodena, another thing to consider: how difficult/possible will it be to reconstruct a mediawiki/page/change event from the MariaDB replicas? Xabriel's proposal has a list of wiki_db and revision_id. We could surely get more, but, perhaps we could create an HTTP API endpoint in EventBus that would cause it to produce a page_change event for a specific revision.

Wed, Jun 12, 1:45 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T366627: [MPIC] Analyse risk of potential performance issues with static approach to stream configuration.

we should use pre-computed essential metrics with all calculations offloaded to Airflow pipelines

Wed, Jun 12, 1:14 PM · Data Products (Data Products Sprint 15), Data-Engineering, Metrics Platform Backlog

Tue, Jun 11

Ottomata updated subscribers of T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.
Tue, Jun 11, 8:30 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T353817: Create legacy EventLogging proxy HTTP intake (for MediaWikiPingback) endpoint to EventGate.

Manually tested the code move today. Looks good! I'll deploy the latest stuff hopefully tomorrow.

Tue, Jun 11, 7:40 PM · Patch-For-Review, MW-1.43-notes (1.43.0-wmf.8; 2024-06-04), MediaWiki-Platform-Team (Radar), Data-Engineering, Event-Platform, MediaWiki-General
Ottomata added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

fetch and produce the latest state of the pair to the stream associated with table event.mediawiki_page_content_change_v1. Perhaps these events should be marked as a 'reconciliation' events, so that a consumer can distinguish them from regular revisions coming from EventBus.

Tue, Jun 11, 7:21 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T367173: Requesting access to Kubernetes deployment for ebysans.

Thank you!

Tue, Jun 11, 6:03 PM · SRE, Data-Engineering, SRE-Access-Requests
Ottomata added a comment to T353817: Create legacy EventLogging proxy HTTP intake (for MediaWikiPingback) endpoint to EventGate.

A way to test:

Tue, Jun 11, 5:57 PM · Patch-For-Review, MW-1.43-notes (1.43.0-wmf.8; 2024-06-04), MediaWiki-Platform-Team (Radar), Data-Engineering, Event-Platform, MediaWiki-General
Ottomata added a comment to T367134: [Refine Refactoring] Integrate Refine workflow configuration into ESC.

Can we connect this to an (epic?) parent task(s) and subscribe some more folks from DE?

Tue, Jun 11, 2:03 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
Ottomata updated the task description for T360969: [SPIKE] Evaluate and document solutions for table-management tooling.
Tue, Jun 11, 1:34 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata added a comment to T367173: Requesting access to Kubernetes deployment for ebysans.

@Snwachukwu needs this to finish T344730: Migrate Data Engineering Pipelinelib repos to GitLab

Tue, Jun 11, 1:02 PM · SRE, Data-Engineering, SRE-Access-Requests
Ottomata created T367173: Requesting access to Kubernetes deployment for ebysans.
Tue, Jun 11, 1:01 PM · SRE, Data-Engineering, SRE-Access-Requests

Mon, Jun 10

Ottomata updated the task description for T367116: mw-page-content-change-enrich flink app is missing in k8s staging.
Mon, Jun 10, 8:11 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data-Engineering, Event-Platform
Ottomata created T367116: mw-page-content-change-enrich flink app is missing in k8s staging.
Mon, Jun 10, 8:10 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data-Engineering, Event-Platform
Ottomata renamed T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions from [Dumps 2] Reconcillation PySpark job to detect and fetch missing/corrupted revisions to [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.
Mon, Jun 10, 6:55 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

Parking link to example of why we will miss events sometimes: https://wikitech.wikimedia.org/wiki/Incidents/2021-11-25_eventgate-main_outage

Mon, Jun 10, 6:31 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T351564: Implement enriched revision visibility stream.

You know, perhaps we need a more generic revision_change stream, akin to mediawiki.page_change.v1. This would often be redundant with page_change (when page_change_kind == 'create' or 'edit'), but would allow us to represent more than just visibility state changes of a revision.

Mon, Jun 10, 6:20 PM · Dumps 2.0 (Kanban Board), Data Products
Ottomata closed T367073: Requesting access to Kubernetes deployment for amastilovic as Resolved.
Mon, Jun 10, 5:42 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata added a comment to T366627: [MPIC] Analyse risk of potential performance issues with static approach to stream configuration.

How will these dashboards be served? Via Presto?

Mon, Jun 10, 5:10 PM · Data Products (Data Products Sprint 15), Data-Engineering, Metrics Platform Backlog
Ottomata updated the task description for T367073: Requesting access to Kubernetes deployment for amastilovic.
Mon, Jun 10, 5:01 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata added a comment to T367034: Make MetricsPlatform MediaWiki extension deliver JS and PHP Client Libraries.

FWIW, I think Data Products should own the EventLogging extension and do whatever they think is best.

Mon, Jun 10, 4:17 PM · Data Products, Metrics Platform Backlog
Ottomata added a comment to T367034: Make MetricsPlatform MediaWiki extension deliver JS and PHP Client Libraries.

Event platform is a way to publish events to a stream agnostically, open question about whether or not this creates a duplication of effort between MP and EP. Let's discuss!

Mon, Jun 10, 4:16 PM · Data Products, Metrics Platform Backlog
Ottomata updated subscribers of T367073: Requesting access to Kubernetes deployment for amastilovic.

@thcipriani for group approver

Mon, Jun 10, 4:00 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata added a comment to T367073: Requesting access to Kubernetes deployment for amastilovic.

^ patch to do this once approved.

Mon, Jun 10, 3:58 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata added a project to T367073: Requesting access to Kubernetes deployment for amastilovic: Data-Engineering.
Mon, Jun 10, 3:57 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata updated subscribers of T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.

Discussed in meeting:

Mon, Jun 10, 3:56 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata updated the task description for T367073: Requesting access to Kubernetes deployment for amastilovic.
Mon, Jun 10, 3:43 PM · Data-Engineering, SRE, SRE-Access-Requests
Ottomata added a comment to T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.

@lbowmaker can we clarify the user story / requirement here? As written it makes sense, but we might be missing something.

Mon, Jun 10, 3:18 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata added a comment to T366627: [MPIC] Analyse risk of potential performance issues with static approach to stream configuration.

If the INSERTs would typically include data for multiple experiment_ids

And if we don't do any special partitioning or inserting?

Mon, Jun 10, 3:03 PM · Data Products (Data Products Sprint 15), Data-Engineering, Metrics Platform Backlog

Fri, Jun 7

Ottomata added a comment to T366627: [MPIC] Analyse risk of potential performance issues with static approach to stream configuration.

We partition the table by experiment_id. Both Iceberg and Hive support this.

Fri, Jun 7, 10:08 PM · Data Products (Data Products Sprint 15), Data-Engineering, Metrics Platform Backlog
Ottomata added a comment to T366627: [MPIC] Analyse risk of potential performance issues with static approach to stream configuration.

I think the question at hand is, how much will the query latency of these 2 situations differ?

Fri, Jun 7, 6:37 PM · Data Products (Data Products Sprint 15), Data-Engineering, Metrics Platform Backlog
Ottomata updated subscribers of T344730: Migrate Data Engineering Pipelinelib repos to GitLab.

@Snwachukwu, can you 'archive' / blank(?) the migrated repos asap? And maybe update their descriptions in gerrit to say they have moved to GitLab with pointers?

Fri, Jun 7, 5:06 PM · Patch-For-Review, Data-Engineering (Q4 2024 April 1st - June 30th), GitLab (Pipeline Services Migration🐤), Event-Platform
Ottomata updated the task description for T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.
Fri, Jun 7, 1:12 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata updated subscribers of T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.

@thcipriani I think I recall you or RelEng mentioning a GitLab CD project. Got any links?

Fri, Jun 7, 1:11 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata added a comment to T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.

Tagging Release Engineering for consultation.

Fri, Jun 7, 1:11 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata added a project to T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS: Release-Engineering-Team.
Fri, Jun 7, 1:11 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata updated subscribers of T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.
Fri, Jun 7, 1:40 AM · Dumps 2.0 (Kanban Board)

Thu, Jun 6

Ottomata added a comment to T366611: Migrate Data Engineering NodeJS library repos to GitLab.

Note: I have moved the event schema repo GitLab migration to its own task: T366836: Migrate Event Platform Schema Respositories to Gitlab

Thu, Jun 6, 7:16 PM · Event-Platform, Data-Engineering
Ottomata set the point value for T366836: Migrate Event Platform Schema Respositories to Gitlab to 5.
Thu, Jun 6, 6:40 PM · Event-Platform, Data-Engineering
Ottomata set the point value for T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code to 5.
Thu, Jun 6, 6:40 PM · Data-Engineering
Ottomata set the point value for T366537: Create gitlab ci npm publish pipeline and job in workflow_utils gitlab_ci_templates to 5.
Thu, Jun 6, 6:40 PM · Data-Engineering
Ottomata set the point value for T366611: Migrate Data Engineering NodeJS library repos to GitLab to 8.
Thu, Jun 6, 6:40 PM · Event-Platform, Data-Engineering
Ottomata created T366836: Migrate Event Platform Schema Respositories to Gitlab.
Thu, Jun 6, 6:37 PM · Event-Platform, Data-Engineering
Ottomata updated the task description for T366611: Migrate Data Engineering NodeJS library repos to GitLab.
Thu, Jun 6, 6:37 PM · Event-Platform, Data-Engineering

Wed, Jun 5

Ottomata added a comment to T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS.

Q: Have we discussed these ideas with Release Engineering folks? They are currently working on a similar CD project, but it might be MediaWiki focused only.

Wed, Jun 5, 8:10 PM · Release-Engineering-Team, Data-Engineering (Q4 2024 April 1st - June 30th), Spike
Ottomata updated subscribers of T366627: [MPIC] Analyse risk of potential performance issues with static approach to stream configuration.

Are there any query performance optimisations that we couldn't implement because stream configurations would only be static?

Wed, Jun 5, 3:54 PM · Data Products (Data Products Sprint 15), Data-Engineering, Metrics Platform Backlog
Ottomata updated subscribers of T361347: Add documentation related to the kubernetes deployment to the MPIC service page.
Wed, Jun 5, 3:39 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data Products (Data Products Sprint 14), Metrics Platform Backlog
Ottomata added a comment to T361347: Add documentation related to the kubernetes deployment to the MPIC service page.

In case it is helpful, there are a lot of k8s deployment related instructions on the EventGate/Administration page.

Wed, Jun 5, 3:38 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data Products (Data Products Sprint 14), Metrics Platform Backlog
Ottomata updated subscribers of T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.
Wed, Jun 5, 3:31 PM · Dumps 2.0 (Kanban Board)
Ottomata added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

@xcollazo: Question related to reconciliation idea in T120242: Eventually Consistent MediaWiki State Change Events.

Wed, Jun 5, 3:31 PM · Dumps 2.0 (Kanban Board)
Ottomata renamed T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions from Reconcillation PySpark job to detect and fetch missing/corrupted revisions to [Dumps 2] Reconcillation PySpark job to detect and fetch missing/corrupted revisions.
Wed, Jun 5, 3:25 PM · Dumps 2.0 (Kanban Board)