gmodena (GModena (WMF))
User

User Details

User Since
Nov 2 2020, 1:15 PM (194 w, 2 d)
Availability
Available
IRC Nick
gmodena
LDAP User
Gmodena
MediaWiki User
GModena (WMF) [ Global Accounts ]

Recent Activity

Fri, Jul 19

gmodena added a comment to T346046: [Search Update Pipeline] Source streams for private wikis.

Couple of WIP patches up for discussion.

The second depends on the first.

@gmodena I like the way this is headed, but I'm not sure if following through (and fully deprecating EventBusFactory) is worth it. Let's get together and discuss.

Fri, Jul 19, 10:52 AM · Data-Engineering (Q1 2024 July 1st - September 30th), MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Patch-For-Review, Discovery-Search (Current work), CirrusSearch

Thu, Jul 18

gmodena added a comment to T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.

When gobblin moves to airflow, we can use Artifact sync to deploy the jar, instead of relying on analytics/refinery.

Thu, Jul 18, 11:14 AM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena updated the task description for T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.
Thu, Jul 18, 11:14 AM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

A dashboard for EventBus is available in Grafana

Thu, Jul 18, 9:38 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena updated the task description for T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Thu, Jul 18, 9:37 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Wed, Jul 17

gmodena created T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.
Wed, Jul 17, 9:22 PM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T365005: Evaluate ESC and explore an alternative design..

After re-scoping both Config Store and MPIC, we decided to not move forward with refactoring ESC at this stage. A summary of this task is available on wikitech. A design doc is available at Stream Registry.

Wed, Jul 17, 12:15 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena added a comment to T362785: Add host level instrumentation on webrequest.

DQ job and airflow dag have been updated. Deployment requires a new release of refinery-source and a dag deployment on analytics.

Wed, Jul 17, 11:16 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena updated the task description for T362785: Add host level instrumentation on webrequest.
Wed, Jul 17, 11:14 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena added a comment to T362783: Add instrumentation for actor signatures.

DQ job and airflow dag have been implemented. Deployment requires a new release of refinery-source and a dag deployment on analytics.

Wed, Jul 17, 11:13 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena updated the task description for T362783: Add instrumentation for actor signatures.
Wed, Jul 17, 11:12 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena moved T362785: Add host level instrumentation on webrequest from In Review to Ready to Deploy on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Wed, Jul 17, 11:11 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena moved T362783: Add instrumentation for actor signatures from In Review to Ready to Deploy on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Wed, Jul 17, 11:11 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review

Tue, Jul 16

gmodena moved T365005: Evaluate ESC and explore an alternative design. from In progress to Done on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Tue, Jul 16, 2:13 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena moved T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS from In Review to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Tue, Jul 16, 2:11 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Release-Engineering-Team, Spike
gmodena created P66612 Page Change Reconciliation Event (create action).
Tue, Jul 16, 11:50 AM
gmodena created P66609 Page Change Reconciliation Event (move action).
Tue, Jul 16, 11:40 AM
gmodena created P66596 Page Change Reconciliation Event (edit action).
Tue, Jul 16, 10:00 AM

Thu, Jul 11

gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

Instrumentation has been enabled in beta. You can test it by modifying a page on https://simple.wikipedia.beta.wmflabs.org.

Thu, Jul 11, 9:25 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena created P66281 eventbus metrics (beta).
Thu, Jul 11, 9:23 AM
gmodena updated the task description for T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Thu, Jul 11, 9:19 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Wed, Jul 3

gmodena added a comment to T368787: Flink job to enrich reconciliation events.

But can we assume the stream will be produced directly into jumbo, and won't have multi dc / replication requirements?

I think we decided we want to do reconciliation of page_change and page_content_change in general, so that it can be used for Search and others.

Wed, Jul 3, 2:00 PM · Dumps 2.0 (Kanban Board)
gmodena added a comment to T368787: Flink job to enrich reconciliation events.

Consume this new event stream

Wed, Jul 3, 9:09 AM · Dumps 2.0 (Kanban Board)
gmodena added a comment to T368787: Flink job to enrich reconciliation events.

@gmodena and @Ottomata the description above is just me thinking out loud. Kindly please modify as you see fit.

Wed, Jul 3, 8:57 AM · Dumps 2.0 (Kanban Board)

Mon, Jul 1

gmodena added a comment to T368745: MediaWiki reconciliation API and event enrichment pipeline.

This can proceed in parallel with {{ T368782 }} IMHO.

Mon, Jul 1, 2:01 PM · Dumps 2.0 (Kanban Board)
gmodena added a comment to T368782: MediaWiki Reconciliation API.

Are you thinking of targeting Action or REST endpoints?

Mon, Jul 1, 1:44 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board)
gmodena added a comment to T286814: '.event.pageViewId' should be string, '.event.subTest' should be string, '.event.searchSessionId' should be string.

Alertmanager has been firing again for this error as of 2024-07-01: https://mail.google.com/mail/u/0/#inbox/FMfcgzQVxRGDzSKGpXzlvVqVBwtdCSWR

Mon, Jul 1, 10:22 AM · MW-1.43-notes (1.43.0-wmf.14; 2024-07-16), Discovery-Search (Current work), Wikimedia-production-error, Data-Engineering

Thu, Jun 27

gmodena updated subscribers of T368667: [Event Platform] mw-page-content-change-enrich down in eqiad 2024-06-27.

@Ottomata ack. Alert manager might fire some alerts about kafka consumer lag (it's one of our SLIs). That's expected while the app backfills, and should be safe to silence. Alerts should resolve themselves once the app catches up.
cc / @BTullis who is on ops duty this week.

Thu, Jun 27, 8:51 PM · Data-Engineering, Event-Platform

Mon, Jun 24

gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

event service name (eventgate name)

We can parse this from $this->url

It won't hurt to add the event service name itself as an EventBus instance property.
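
A minimal sketch of the URL-parsing idea, in Python rather than the extension's PHP. It assumes the service name is the first DNS label of the event service URL; the function name and the example URL are hypothetical and not taken from EventBus.

```python
from urllib.parse import urlsplit


def event_service_name(url: str) -> str:
    """Return the first DNS label of the URL's host.

    Assumption: the event service name is encoded as the leading
    host label, e.g. 'eventgate-main' in the hypothetical URL below.
    """
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"URL has no host: {url!r}")
    return host.split(".")[0]


# Hypothetical URL, for illustration only.
name = event_service_name("http://eventgate-main.example.org:4492/v1/events")
```

Storing the name as an instance property, as suggested above, would avoid re-parsing the URL every time a metric is labelled.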

Mon, Jun 24, 6:19 PM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

I made some progress on this (see also comment above). Here's an idea of how I would like to name and label metrics. I would like to start small and iterate in Beta. Some code paths are difficult to test locally.

Mon, Jun 24, 2:08 PM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Jun 24 2024

gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

@gmodena I'm considering a change in EventBus for which I'd need to know stream name in the EventBus send() method. I think you said you'd need this too?

Jun 24 2024, 1:35 PM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Jun 20 2024

gmodena added a comment to T367923: Event validation errors for mediawiki.page_change.v1 due to missing performer field on revision suppressions.

@Ottomata one thing to be mindful about is downstream consumers. eventutilites_python stream descriptors pin event schema versions:
https://github.com/wikimedia/operations-deployment-charts/blob/master/helmfile.d/services/mw-page-content-change-enrich/values-main.yaml#L31

We might want to version bump that config too.

Jun 20 2024, 9:59 AM · Data-Engineering (Q1 2024 July 1st - September 30th), MW-1.43-notes (1.43.0-wmf.11; 2024-06-25), Event-Platform
gmodena added a comment to T367923: Event validation errors for mediawiki.page_change.v1 due to missing performer field on revision suppressions.

C. Do a minor version bump to 1.2.0 and skip the CI check in this case. I think we can add a CI exception for this rule in .jsonschema-tools skipSchemaTestCases.

Jun 20 2024, 9:49 AM · Data-Engineering (Q1 2024 July 1st - September 30th), MW-1.43-notes (1.43.0-wmf.11; 2024-06-25), Event-Platform

Jun 19 2024

gmodena moved T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib from Sprint Backlog to In Process on the Dumps 2.0 (Kanban Board) board.
Jun 19 2024, 2:02 PM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Jun 18 2024

gmodena updated subscribers of T350180: Upgrade prom-client in NodeJS service-runner and enable collectDefaultMetrics.

Apologies for the lack of activity on this task, it somehow fell through the cracks.

Jun 18 2024, 9:52 AM · Data-Engineering, observability, ChangeProp, Event-Platform, service-runner
gmodena added a comment to T367116: mw-page-content-change-enrich flink app is missing in k8s staging.

@amastilovic @Ottomata
Can we close this task? Pods have been up and running for a week, with no (obvious) issue.
I'd still rather not add alerting for staging. This deployment is essentially a no-op. We don't produce into the topic it consumes,
and it's not hooked up to any other downstream service. We use the staging deployment only for manual integration tests.

Jun 18 2024, 9:38 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data-Engineering, Event-Platform

Jun 13 2024

gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

The haproxy / benthos feed is now available in raw form under wmf_staging.webrequest_frontend_rc0 and post-processed in wmf_staging.webrequest.

Jun 13 2024, 10:43 AM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Jun 12 2024

gmodena awarded T358373: [Dumps 2] Reconcillation mechanism to detect and fetch missing/mismatched revisions a Love token.
Jun 12 2024, 12:54 PM · Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena claimed T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Jun 12 2024, 11:25 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena added a comment to T358373: [Dumps 2] Reconcillation mechanism to detect and fetch missing/mismatched revisions.

fetch and produce the latest state of the pair to the stream associated with table event.mediawiki_page_content_change_v1. Perhaps these events should be marked as 'reconciliation' events, so that a consumer can distinguish them from regular revisions coming from EventBus.

I like this idea!

There are probably a few variations on this but I think keeping the late/backfilled events separate from the main streams might be helpful.

Jun 12 2024, 10:51 AM · Patch-For-Review, Dumps 2.0 (Kanban Board)

Jun 11 2024

gmodena added a comment to T367116: mw-page-content-change-enrich flink app is missing in k8s staging.

[...]

Looking into a manual restart. Will ping SRE if we need privileged access to config maps.

Jun 11 2024, 8:53 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data-Engineering, Event-Platform
gmodena added a comment to T367116: mw-page-content-change-enrich flink app is missing in k8s staging.

[...]

Somehow, the Flink app stopped in staging. We don't alert on stuff in staging, so we never noticed.
This isn't affecting anything in production, but it does hurt our confidence when deploying. We should fix.
I'm not exactly sure how, but I think the deployment needs to be deleted from k8s staging in wikikube. Tagging DPE SRE for help.

Jun 11 2024, 8:30 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data-Engineering, Event-Platform

May 24 2024

gmodena added a comment to T346611: [JVM Stewardship] To be discussed: SDK Man.

Will adopting SDKMan be a prescriptive change or just a default option? Would this change affect only a user's development environment, or also impact CI?

May 24 2024, 1:32 PM · Java-Scala-Standardization

May 15 2024

gmodena moved T365005: Evaluate ESC and explore an alternative design. from Incoming (new tickets) to Q4 2024 April 1st - June 30th on the Data-Engineering board.
May 15 2024, 2:32 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena moved T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator from In progress to In Review on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
May 15 2024, 2:32 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
gmodena moved T365005: Evaluate ESC and explore an alternative design. from Next Up to In progress on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
May 15 2024, 2:32 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena claimed T365005: Evaluate ESC and explore an alternative design..
May 15 2024, 2:29 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena set the point value for T365005: Evaluate ESC and explore an alternative design. to 5.
May 15 2024, 2:28 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena renamed T361094: Orchestrate gobblin ingestion task with Airflow and config store. from [NEEDS GROOMING] Orchestrate gobblin ingestion task with Airflow to Orchestrate gobblin ingestion task with Airflow and config store..
May 15 2024, 1:24 PM · Event-Platform, Data-Engineering
gmodena created T365005: Evaluate ESC and explore an alternative design..
May 15 2024, 1:05 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform

May 14 2024

gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

adopt topic names that follow EP conventions: <dc>.<topic_name>

I'm sorry for not thinking about this earlier. There is a bit of a design flaw in the use of data center as a topic prefix, and really, for topics that are never mirrored to other Kafka clusters, there is no need for topic prefixes at all.

I just added documentation about this here:
https://wikitech.wikimedia.org/wiki/Kafka#Data_center_topic_prefixing_design_flaw

Given that, and the ever expanding list of data centers, and the fact that webrequest is the only stream we have that is produced to from non main data centers, I think we should not use topic prefixing for webrequest.

All producers should use the same topic name, independent of which data center they are in.

Thanks for clarifying @Ottomata.

@Ottomata @Fabfur If we remove prefixing, there is a potential clash between varnishkafka and benthos topics.
How about we name the production Haproxy/benthos topics as follows?

  • webrequest_frontent_text
  • webrequest_frontent_text.error
  • webrequest_frontent_upload
  • webrequest_frontent_upload.error

No problem for us to rename these topics, even with or without "variable" part...

May 14 2024, 12:58 PM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

adopt topic names that follow EP conventions: <dc>.<topic_name>

I'm sorry for not thinking about this earlier. There is a bit of a design flaw in the use of data center as a topic prefix, and really, for topics that are never mirrored to other Kafka clusters, there is no need for topic prefixes at all.

I just added documentation about this here:
https://wikitech.wikimedia.org/wiki/Kafka#Data_center_topic_prefixing_design_flaw

Given that, and the ever expanding list of data centers, and the fact that webrequest is the only stream we have that is produced to from non main data centers, I think we should not use topic prefixing for webrequest.

All producers should use the same topic name, independent of which data center they are in.
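
To illustrate the difference between the two naming schemes, here is a small sketch. The helper, the data center names, and the stream name are all hypothetical; only the `<dc>.<topic_name>` convention comes from the discussion above.

```python
from typing import Optional


def topic_name(stream: str, dc_prefix: Optional[str] = None) -> str:
    """Build a Kafka topic name, optionally prefixed with a data center."""
    return f"{dc_prefix}.{stream}" if dc_prefix else stream


DATA_CENTERS = ["dc_a", "dc_b", "dc_c"]  # hypothetical data center names
STREAM = "webrequest_example"            # hypothetical stream name

# With prefixing, every producing data center adds one more topic.
prefixed = {topic_name(STREAM, dc) for dc in DATA_CENTERS}

# Without prefixing, all producers share a single topic.
unprefixed = {topic_name(STREAM) for _ in DATA_CENTERS}
```

With prefixing, the number of topics grows with the list of data centers; without it, consumers subscribe to one name regardless of where the producer runs.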

May 14 2024, 10:06 AM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena updated subscribers of T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator.

FWIW, with this outcome, then dynamic ESC as implemented is fine with me :)

May 14 2024, 9:38 AM · Data-Engineering (Q4 2024 April 1st - June 30th)

May 13 2024

gmodena added a comment to T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator.

This comment is the result of some time spent collecting info from various stakeholders, and reviewing documentation and decision records.
It was initially shared as a Google doc (now moved to read only to preserve comment history).

May 13 2024, 9:46 AM · Data-Engineering (Q4 2024 April 1st - June 30th)

Apr 30 2024

gmodena added a comment to T361017: [SPIKE] Can we express Event Platform configs in Datasets Config?.

IMHO it should be explicitly stated that the system we are building is the Airflow Dataset Config store/service, not just a generic configuration repository.

@gmodena @JAllemandou, if this is the case, do we need an external service and datastore? The config is all in git.

Apr 30 2024, 6:33 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike, Event-Platform
gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

@Fabfur f/up from our chat earlier; these would be the pending config bits that we'll need to finalize when moving to prod topics:

Apr 30 2024, 2:28 PM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Apr 29 2024

gmodena moved T361017: [SPIKE] Can we express Event Platform configs in Datasets Config? from Next Up to In Review on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
Apr 29 2024, 6:36 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike, Event-Platform
gmodena claimed T361017: [SPIKE] Can we express Event Platform configs in Datasets Config?.
Apr 29 2024, 6:36 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike, Event-Platform
gmodena updated subscribers of T361017: [SPIKE] Can we express Event Platform configs in Datasets Config?.

We can easily express a stream config as jsonschema, and expose via datasets-config-service.
I am opposed to a monorepo for all configurations and suggest focusing current efforts on Airflow and Airflow-produced datasets. For integration with Metrics, Platform, and Mediawiki, I lean towards a service mesh approach. The service developed by @tchin could serve as a template.

Apr 29 2024, 6:34 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike, Event-Platform
gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

I like the overall idea, but I'd prefer to proceed DC-by-DC, switching topics and shutting down VarnishKafka once we are sure about the correctness of the data. I'm afraid having two pieces of software producing (and sending, and storing) the "same" data on 96 hosts (and soon also MAGRU) could be a little bit expensive for us in terms of bandwidth...

Makes sense. This would require some work on our end to generate webrequest data from two "raw" sources at once, but I think as long as we can filter on dc / hostnames, we should manage. Let me take a better look at how this ETL is set up.

Apr 29 2024, 4:10 PM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.
  • there's a couple of CRs pending (linked to this phab) and I'd like to have a second run on the event schema naming conventions (cc / @Fabfur). We might want to drop the webrequest_source since we don't currently use it in ETL (it's inferred from the HDFS path, not schema).

No problem here, for us it's just a matter of removing a line from the Benthos configuration. Let me know if I can proceed!

Apr 29 2024, 9:40 AM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Apr 25 2024

gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

The haproxy_id field has been added to messages.

Apr 25 2024, 2:06 PM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena moved T353940: We should provide DQ integration with Python from In Review to Done on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
Apr 25 2024, 9:07 AM · Data-Engineering (Q4 2024 April 1st - June 30th)

Apr 19 2024

gmodena closed T351117: Move analytics log from Varnish to HAProxy as Resolved.

I'm afraid mixing varnishkafka and benthos payloads would break ingestion pipelines, since old/new events have a different schema. We could reuse the current topics, but we'd have to drain them first.

We can do both, for us it's just a matter of changing a string on puppet. I think decision is more on your side, choose the easiest/best option for you and we'll implement!

Apr 19 2024, 7:00 AM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena changed the point value for T362780: [DQ] Add support for distribution metrics in data quality exporters from 2 to 3.
Apr 19 2024, 6:34 AM · Data-Engineering
gmodena set the point value for T362782: [DQ][NEEDS GROOMING] Add support for deequ's RowLevelSchemaValidator in refinery to 3.
Apr 19 2024, 6:33 AM · Data-Engineering
gmodena set the point value for T362780: [DQ] Add support for distribution metrics in data quality exporters to 2.
Apr 19 2024, 6:33 AM · Data-Engineering

Apr 18 2024

gmodena set the point value for T362783: Add instrumentation for actor signatures to 1.
Apr 18 2024, 1:46 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena set the point value for T362785: Add host level instrumentation on webrequest to 1.
Apr 18 2024, 1:46 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena moved T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator from Next Up to In progress on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
Apr 18 2024, 1:35 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

About the sequence issue, that's the most plausible hypothesis. We could append (or prepend) other pieces of information to the sequence number (like the haproxy process id) to avoid duplicates, but we couldn't guarantee the monotonic increase (or even any increase) in this case. I suggest using the current approach for the moment and reworking it later if needed.
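
A rough sketch of the duplicate problem and the proposed workaround, assuming each haproxy process keeps its own counter. The record fields and values are hypothetical and do not reflect the actual webrequest schema.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, NamedTuple, Set


class Record(NamedTuple):
    hostname: str     # hypothetical field names, for illustration only
    haproxy_pid: int
    sequence: int


def duplicate_keys(records: Iterable[Record],
                   key: Callable[[Record], Hashable]) -> Set[Hashable]:
    """Return the keys that occur more than once in `records`."""
    counts = Counter(key(r) for r in records)
    return {k for k, n in counts.items() if n > 1}


# Two processes on the same host, each counting from 1.
records = [
    Record("host1", 100, 1), Record("host1", 100, 2),
    Record("host1", 200, 1), Record("host1", 200, 2),
]

# Keyed by (hostname, sequence) the counters collide.
by_sequence = duplicate_keys(records, lambda r: (r.hostname, r.sequence))

# Adding the process id makes the key unique again.
by_pid_sequence = duplicate_keys(
    records, lambda r: (r.hostname, r.haproxy_pid, r.sequence)
)
```

The composite key removes the collisions, but sorting by it no longer reflects arrival order across processes, which is the loss of monotonicity mentioned above.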

Apr 18 2024, 1:12 PM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

Next steps: now that we are starting to collect more logs, we can start comparing current / new webrequest records.

Apr 18 2024, 11:04 AM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Apr 17 2024

gmodena moved T362783: Add instrumentation for actor signatures from Next Up to In Review on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
Apr 17 2024, 5:57 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena moved T362785: Add host level instrumentation on webrequest from Next Up to In Review on the Data-Engineering (Q4 2024 April 1st - June 30th) board.
Apr 17 2024, 5:57 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena created T362785: Add host level instrumentation on webrequest.
Apr 17 2024, 3:18 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena created T362783: Add instrumentation for actor signatures.
Apr 17 2024, 3:15 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena created T362782: [DQ][NEEDS GROOMING] Add support for deequ's RowLevelSchemaValidator in refinery.
Apr 17 2024, 3:08 PM · Data-Engineering
gmodena created T362780: [DQ] Add support for distribution metrics in data quality exporters.
Apr 17 2024, 3:03 PM · Data-Engineering

Apr 5 2024

gmodena added a parent task for T361017: [SPIKE] Can we express Event Platform configs in Datasets Config?: T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator.
Apr 5 2024, 6:42 AM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike, Event-Platform
gmodena added a subtask for T361853: [Datasets Config][Spike] Understand and document the details and conflicts between Datasets Config, Refine refactor, Dynamic EventStreamConfig, and Metrics Platform Instrumentation Configurator: T361017: [SPIKE] Can we express Event Platform configs in Datasets Config?.
Apr 5 2024, 6:42 AM · Data-Engineering (Q4 2024 April 1st - June 30th)

Apr 4 2024

gmodena added a comment to T351117: Move analytics log from Varnish to HAProxy.

@gmodena you should have some more data to play with now, while I work on the performance optimization and on Benthos internal metrics...

Apr 4 2024, 1:16 PM · Data Products, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Mar 27 2024

gmodena created T361094: Orchestrate gobblin ingestion task with Airflow and config store..
Mar 27 2024, 11:50 AM · Event-Platform, Data-Engineering

Mar 26 2024

gmodena moved T359051: eventstreams: change default num_workers to 0 from Ready to Deploy to Done on the Data-Engineering (Sprint 9) board.
Mar 26 2024, 3:37 PM · Data-Engineering (Sprint 9)
gmodena moved T359051: eventstreams: change default num_workers to 0 from In Review to Ready to Deploy on the Data-Engineering (Sprint 9) board.
Mar 26 2024, 3:37 PM · Data-Engineering (Sprint 9)
gmodena created T361017: [SPIKE] Can we express Event Platform configs in Datasets Config?.
Mar 26 2024, 1:57 PM · Data-Engineering (Q4 2024 April 1st - June 30th), Spike, Event-Platform

Mar 25 2024

gmodena updated the task description for T353940: We should provide DQ integration with Python.
Mar 25 2024, 8:13 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
gmodena added a comment to T353940: We should provide DQ integration with Python.

I need to add a wrapper to the Alert generation SerDe

Mar 25 2024, 8:04 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
gmodena moved T353940: We should provide DQ integration with Python from In progress to In Review on the Data-Engineering (Sprint 9) board.
Mar 25 2024, 8:01 PM · Data-Engineering (Q4 2024 April 1st - June 30th)

Mar 22 2024

gmodena added a comment to T314956: [Event Platform] Declare webrequest as an Event Platform stream.

Tagging T360642: Remove extra fields currently sent to Kafka

Mar 22 2024, 8:14 AM · Patch-For-Review, Data-Engineering, Event-Platform
gmodena added a comment to T360642: Remove extra fields currently sent to Kafka.

These are the fields that are sent from Benthos that aren't present in the current webrequest stream:

Mar 22 2024, 8:12 AM · Event-Platform, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena updated subscribers of T360642: Remove extra fields currently sent to Kafka.
Mar 22 2024, 8:08 AM · Event-Platform, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena added a project to T360642: Remove extra fields currently sent to Kafka: Event-Platform.
Mar 22 2024, 7:58 AM · Event-Platform, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Mar 21 2024

gmodena added a comment to T353940: We should provide DQ integration with Python.

let's maybe pair on it?

I'd love to hack on this at the offsite!!

Mar 21 2024, 2:10 PM · Data-Engineering (Q4 2024 April 1st - June 30th)
gmodena added a comment to T350180: Upgrade prom-client in NodeJS service-runner and enable collectDefaultMetrics.

See https://github.com/wikimedia/service-runner/commit/b9c98eab5398413c16df2317562745f6ffe74439

Mar 21 2024, 11:39 AM · Data-Engineering, observability, ChangeProp, Event-Platform, service-runner

Mar 19 2024

gmodena added a project to T360450: Add $schema key to Benthos payload: Event-Platform.
Mar 19 2024, 4:37 PM · Event-Platform, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic
gmodena updated subscribers of T360450: Add $schema key to Benthos payload.

For context: this is the approach we follow with other producers, e.g. Java.

Mar 19 2024, 4:33 PM · Event-Platform, Patch-For-Review, Data-Engineering, Observability-Logging, Traffic

Mar 8 2024

gmodena added a comment to T353940: We should provide DQ integration with Python.

IIUC, the necessity for py4j is only tied to the fact that we developed helper code like the case of HivePartition and DeequAnalyzersToDataQualityMetrics that we'd like to reuse, correct?

Mar 8 2024, 2:40 PM · Data-Engineering (Q4 2024 April 1st - June 30th)

Mar 7 2024

gmodena created T359561: Add user fabfur to analytics-privatedata-users.
Mar 7 2024, 4:19 PM · Patch-For-Review, Data-Platform-SRE (2024.03.25 - 2024.04.14), SRE, SRE-Access-Requests
gmodena moved T353940: We should provide DQ integration with Python from Next Up to In progress on the Data-Engineering (Sprint 9) board.
Mar 7 2024, 10:39 AM · Data-Engineering (Q4 2024 April 1st - June 30th)
gmodena updated subscribers of T353940: We should provide DQ integration with Python.

We can integrate our DQ framework with Python by piggybacking on pyspark's py4j gateway. The following is a rudimentary example that produces
metrics in the data_quality_metrics table format:

Mar 7 2024, 10:36 AM · Data-Engineering (Q4 2024 April 1st - June 30th)