Quick update: agreed with Eric to work on this as a separate component. I'll start on a patch now, keeping it in the Codex sandbox for the moment, with T363432 as the goal.
Thu, May 16
Wed, May 15
Thu, May 9
Wed, May 1
OK, I didn't do much here, just provided a very short description and detailed the schemas as Marcel had them in the design doc. Please let me know if anyone was imagining something else.
filling out the readme right now, thanks Ben!
I am not sure this is 100% squashed because the behavior is so weird. Here's what I found, in short:
Apr 18 2024
It would be cool to do a quick spike into Scalar and the customization we'd need there. Abstain as a voter here, I like all the options just fine and I have bad aesthetics when it comes to reading docs because I just start hacking and see what happens :)
+1 for Option 2. For what it's worth, when we initially put up the endpoint docs on wikitech we were just doing so while we waited for a better end user experience than the Swagger UI afforded us. I especially like the integration with wikitech described in option 2 (the discovery pages that would lead wiki users to the docs).
Apr 16 2024
+1, SSR is kind of a pain if done in fancier ways, but done this way you get a lot for free and it even helps reduce code. As a bonus, the user gets a great experience.
Apr 15 2024
I've broken this down into subtasks but I'm keeping this as something between an epic and an actual task: it coordinates the work and holds all the acceptance criteria, it was just too big as a single task. So I'll leave the other two subtasks on the boards while I'm on vacation and put this one in paused. It can be resumed whenever you'd like to continue work on coordination and deployment.
Apr 11 2024
Approved, welcome back Andy :)
Approved
Apr 4 2024
Apr 3 2024
Apr 1 2024
Mar 29 2024
I found a candidate bug. The script used to ask for the year and month; after the change it asks for the day. generate_druid_unique_devices_per_domain_daily_aggregated_monthly.hql seems to have been adapted to give the correct result, but despite that, the Druid output seems to contain only one day. Running it now to confirm or rule this out.
Merge request 582 seems to have changed how we do this monthly Druid segment aggregation, so the answer must be around there. I checked the new source table again, now Iceberg (wmf_readership.unique_devices_per_project_family_daily), and it too seems to have data for all of January, for example.
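To make the "only one day in the Druid output" suspicion concrete, this is the kind of coverage check I'd run once I have the distinct dates back from Druid (or from the Iceberg source table). A minimal sketch; the `missing_days` helper and the example date set are mine, not part of the pipeline.

```python
from datetime import date, timedelta

def missing_days(year: int, month: int, present: set[str]) -> list[str]:
    """Return the ISO dates in the given month that are absent from `present`."""
    d = date(year, month, 1)
    missing = []
    while d.month == month:
        iso = d.isoformat()
        if iso not in present:
            missing.append(iso)
        d += timedelta(days=1)
    return missing

# Symptom described above: the monthly segment appears to hold a single day.
present = {"2024-01-01"}
print(len(missing_days(2024, 1, present)))  # 30 days of January unaccounted for
```

If the source table is healthy (all of January present) but this check fails against the Druid datasource, the bug is in the aggregation/loading step rather than upstream.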
Looked at this a bit today.
Mar 25 2024
@VirginiaPoundstone: Looks like Giuseppe patched varnish to send more requestctls, so maybe that completely or partially solves the problem. I'd have to look through the data to see. I'm going to do a good job focusing and only do that if you put it in the sprint :) (it should take no more than an hour, but more than a few seconds if I want to be thorough)
Mar 22 2024
I made the puppet change but I need an SRE to merge it. This is indeed not well documented; we should talk about a better way to maintain this interface that so many people use.
Approved!
Mar 20 2024
deleted from meta
I believe this dataset that's already being published is strictly better and in my opinion should replace the current active editors by country data: https://analytics.wikimedia.org/published/datasets/geoeditors_weekly/ (also the monthly version)
Ah, thanks @will for finding T358793: Decommission AQS 1.0, @brouberol and others can go ahead and take AQS 1 offline and follow through with decommissioning. Take note of what Eric said there, the servers themselves are still useful, just AQS 1 is going away.
I'm working to find the relevant tickets, but AQS 1 should be sunset and I think it's ok to take it offline for now and follow through with the rest of the process. I've just been absent for a couple months and might be missing some nuance.
Mar 17 2024
Feb 5 2024
Jan 9 2024
Jan 8 2024
@VirginiaPoundstone this issue came up again (thanks very much to @xcollazo who remembered this task). I support option b) in Xabriel's plan above, and I think this should be triaged with high importance as a production issue. This table is used by lots of people and it seems to me it'll keep failing. If the folks looking into it don't remember this task, a lot of time will be wasted.
Quick mention of this other task where some of the work took place: T353296. Relevant to this, the gerrit change https://gerrit.wikimedia.org/r/c/analytics/refinery/+/982899 included updates to the following pipelines/datasets:
Jan 4 2024
TL;DR: the data pipeline up to AQS seems fine. My guess is we're not filtering properly to exclude redirects in AQS 2; the timeline corresponds with the reported problem. Sorry for the inconvenience, working on a fix.
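For anyone following along, the suspected missing filter amounts to dropping redirect rows before ranking. This is only an illustrative sketch; the row shape and the `is_redirect` field name are my assumptions, not the actual AQS 2 schema.

```python
def top_articles(rows: list[dict], limit: int = 10) -> list[dict]:
    """Rank pageview rows by views, excluding redirects.

    Each row is a dict like {"article": ..., "views": ..., "is_redirect": ...};
    these field names are hypothetical, for illustration only.
    """
    filtered = [r for r in rows if not r.get("is_redirect", False)]
    return sorted(filtered, key=lambda r: r["views"], reverse=True)[:limit]

rows = [
    {"article": "Main_Page", "views": 1000, "is_redirect": False},
    {"article": "Old_Title", "views": 900, "is_redirect": True},
    {"article": "Some_Article", "views": 500, "is_redirect": False},
]
print([r["article"] for r in top_articles(rows)])  # ['Main_Page', 'Some_Article']
```

Without the `filtered` step, `Old_Title` would rank second, which matches the kind of anomaly reported.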
@Mayakp.wiki the patch to watch is: https://gerrit.wikimedia.org/r/c/operations/puppet/+/981352/. This has not yet been merged and deployed. When it is, you'll start seeing the changes in x_analytics.
Datahub allows you to add descriptions at sub-field level. We should at some point get to consensus about where we want all this description stuff to live. We talked about:
Dec 22 2023
Dec 12 2023
Quick recap for anyone looking to implement lineage. First, a note regarding lineage as part of centralized configuration: I think centralized config would be very useful, and I'm in no way suggesting that we slow down the work that @JAllemandou and @lbowmaker are leading on that front. The reality, though, is that a centralized config may take a few more months to implement. In the meantime, we could instrument lineage in the airflow DAGs in a few minutes per DAG. Done in a standard way, this would be very easy to migrate to centralized config later. In addition, as we implement this we may find exceptions and edge cases that would inform the centralized config.

If anyone disagrees with anything here, that's very welcome; please don't take this as a "decision", just a thought. If we agree on this and migrating back to the centralized config causes some slow-down, I hereby promise that I'll do it myself on all DAGs.
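As a sketch of what "a few minutes per DAG, done in a standard way" could look like: a tiny shared helper producing a uniform lineage record that each DAG attaches to its tasks. The record shape and the `lineage` helper are assumptions of mine, not an agreed-on schema; `wmf.webrequest` and `wmf.pageview_hourly` are just example table names.

```python
def lineage(inputs: list[str], outputs: list[str]) -> dict:
    """Build a minimal, uniform lineage record for one pipeline step.

    Datasets are plain strings (e.g. Hive table names). Keeping the
    shape trivial is the point: a record like this is easy to migrate
    into a centralized config later.
    """
    return {"inputs": sorted(set(inputs)), "outputs": sorted(set(outputs))}

# In a DAG file this would be roughly one extra call per task:
step_lineage = lineage(
    inputs=["wmf.webrequest"],
    outputs=["wmf.pageview_hourly"],
)
print(step_lineage)  # {'inputs': ['wmf.webrequest'], 'outputs': ['wmf.pageview_hourly']}
```

Deduplicating and sorting the dataset names keeps records comparable across DAGs, which should make the eventual migration to centralized config a mechanical transformation.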
In T351117#9379025, @Fabfur wrote:Hi @Milimetric sorry for the late reply, I'll try to answer to your question but consider we're still investigating about all pro and cons of this "migration", and for sure we'll share our thought and our action plan before moving on with this...
The following is a quick rundown of what I would think about if something goes wrong, and how I would check.
Dec 11 2023
A full list of current use cases could only be compiled by reaching out to researchers who download this dataset. Limited to what we know, current use cases are roughly:
MediaWiki History is described in detail in the following places:
The algorithm is explained at length starting here.
A shortened and updated list of Changes and Known Problems.
MediaWiki History is described in detail in the following places:
wmf_raw.mediawiki_pagelinks and wmf_raw.mediawiki_page_props are available with snapshot 2023-11
Dec 8 2023
I agree, @stjn, hopefully that's not as hyper-urgent and maybe @VirginiaPoundstone + @lbowmaker can triage.
Dec 7 2023
I'm really sorry this didn't get through the pipeline sooner, someone only told me about the issue last week. Had I known sooner I would have made the fix sooner. We are going to bring this up in our retro.
In T333716#9389355, @stjn wrote:@Milimetric: this is great, but I think it should be also indicated under the map that some countries do not have any results, so people can see this easier. For example, page view stats have this in the bottom: Those countries with less than 100 views are not reported and are blank in the map. Seems like the absence of data for privacy reasons is good to report there as well. Can you also add that?
Dec 6 2023
The above patches do what I suggested in a comment on the talk page: https://meta.wikimedia.org/wiki/Talk:Requests_for_comment/Hiding_the_number_of_Russian/Belorussian/Kazakh_contributors_on_the_statistics_map which is to gray out the countries currently on the protection list and explain that the data is hidden. If and when the country list changes, we should update this or make it more reactive to the data itself.
Sqooping from the production replicas would mean applying the same sanitization rules on our side. I see the filter here is:
This is the varnish code (VCL) that does analytics-y things to create and update the X-analytics header. Adding stuff here would prevent us from having to change varnishkafka. Or maybe I misunderstood the whole thing, which is always possible in Varnish land :)
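For reference, X-Analytics is a semicolon-delimited list of key=value pairs, and the VCL snippet's job is essentially to append pairs to it. A minimal sketch of that operation in Python, to show the intended header shape; the example keys are illustrative, not a claim about what the VCL actually sets.

```python
def add_x_analytics(header: str, key: str, value: str) -> str:
    """Append a key=value pair to an X-Analytics header.

    The header is a semicolon-delimited list of key=value pairs;
    an empty header just becomes the single new pair.
    """
    pair = f"{key}={value}"
    return f"{header};{pair}" if header else pair

h = add_x_analytics("ns=0;page_id=123", "loggedIn", "1")
print(h)  # ns=0;page_id=123;loggedIn=1
```

Doing this append in VCL (rather than in varnishkafka) is exactly what saves us from having to change varnishkafka, since the kafka side just passes the finished header through.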
Dec 5 2023
This sounds like it would work... but I do want to point out a potential maintenance issue: