GGoncalves-WMF (Guilherme Gonçalves)
User

Projects (1)

User Details

User Since
May 19 2025, 3:26 PM (38 w, 4 d)
Availability
Available
LDAP User
Guilherme Gonçalves
MediaWiki User
GGoncalves-WMF [ Global Accounts ]

Recent Activity

Yesterday

GGoncalves-WMF added a comment to T416806: [Hypothesis] 5.3.3: Attribution API.

A quick note about the trending indicator: in T409601, DPE onboarded a data pipeline for WME that classifies articles as trending according to (I think) this methodology. You can see a sample query here. Naturally, your definition of trending doesn't have to agree with that, but this could be a starting point.

Thu, Feb 12, 9:31 PM · Epic, OKR-Work, [MWI] FY2025-26 Q3

Tue, Feb 10

GGoncalves-WMF updated the task description for T416933: Investigate and repair pageviews and unique devices spike starting in Nov 2025.
Tue, Feb 10, 9:50 PM · Movement-Insights, Data-Engineering

Mon, Feb 9

GGoncalves-WMF created T416933: Investigate and repair pageviews and unique devices spike starting in Nov 2025.
Mon, Feb 9, 10:12 PM · Movement-Insights, Data-Engineering
GGoncalves-WMF added a comment to T416312: Use wmf.mediawiki_history as baseline for slo completeness.

In general, I think this is good to consider, though I'm not seeing a very strong priority signal just yet. Can we estimate how incorrect our current completeness estimate is, e.g. with a one-time comparison to wmf.mediawiki_history?

Mon, Feb 9, 2:29 PM · DPE-Mediawiki-Content, Data-Engineering (Q3 FY25/26 January 1st - March 31th)
GGoncalves-WMF added a comment to T416672: dbt repository structure (Milestone 3).

This is great, thanks for the detailed proposal @JMonton-WMF!

Mon, Feb 9, 1:55 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights (FY25-26 H2)

Fri, Feb 6

GGoncalves-WMF updated subscribers of T412655: Sudden traffic increase on 1 November 2025.

@Hghani points out (on a first look) that this can be related to T413027. That ticket also hints at some signals we can incorporate into our classification from here.

Fri, Feb 6, 10:39 AM · Data-Engineering, Data-Engineering-Wikistats, Pageviews-Anomaly

Wed, Feb 4

GGoncalves-WMF moved T415202: Introduce a new AQS endpoint to expose video plays from Needs Clarification to Q3 FY25/26 January 1st - March 31th on the Data-Engineering board.
Wed, Feb 4, 3:02 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0
GGoncalves-WMF added a comment to T415202: Introduce a new AQS endpoint to expose video plays.

Having spoken to @Ladsgroup this morning (thanks!), here are my notes on this task.

Wed, Feb 4, 2:57 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0
GGoncalves-WMF created T416481: Adapt Sqoop for imagelinks schema changes.
Wed, Feb 4, 2:33 PM · Patch-For-Review, Data-Engineering (Q3 FY25/26 January 1st - March 31th)

Thu, Jan 29

GGoncalves-WMF removed a project from T361210: Changes to the cuc_agent column in the cu_changes table: Data-Engineering-Radar.
Thu, Jan 29, 11:13 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Product Safety and Integrity, CheckUser
GGoncalves-WMF added a comment to T361210: Changes to the cuc_agent column in the cu_changes table.

Update (from Slack): cuc_agent and cu_changes are no longer being read in MediaWiki, and writing will tentatively stop in the week of Feb 9.

Thu, Jan 29, 11:13 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Product Safety and Integrity, CheckUser
GGoncalves-WMF renamed T361210: Changes to the cuc_agent column in the cu_changes table from FYI: Changes to the cuc_agent column in the cu_changes table to Changes to the cuc_agent column in the cu_changes table.
Thu, Jan 29, 11:09 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Product Safety and Integrity, CheckUser

Dec 22 2025

GGoncalves-WMF created T413355: Identify LLM-mediated requests in our pageview data products.
Dec 22 2025, 11:34 AM · Data-Engineering

Dec 9 2025

GGoncalves-WMF added a comment to T410266: Explore how to migrate PyFlink to Java/Scala.

Sorry I'm late to this, but I basically second Andrew's comment. I think there are two things at play here:

Dec 9 2025, 10:57 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review, Spike, Event-Platform

Dec 8 2025

GGoncalves-WMF added a comment to T410940: WE1.5.3 Productize Data for Monthly Active Moderator Actions.

@fkaelin and I just chatted a little more about this, quoting here:

Dec 8 2025, 4:39 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), OKR-Work (WE1 FY2025-26)

Nov 25 2025

GGoncalves-WMF added a comment to T410796: NEW FEATURE REQUEST: Temp Accounts on Wikistats.

Thanks for checking! What annotation do you have in mind? Something like, "Here's when we enabled temp accounts, and they are included under Anonymous edits"?

Nov 25 2025, 12:08 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Engineering-Wikistats, Movement-Insights
GGoncalves-WMF updated subscribers of T410940: WE1.5.3 Productize Data for Monthly Active Moderator Actions.

Looking at the attached sheet, I see the following must-have actions listed as "complicated":

Nov 25 2025, 11:50 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), OKR-Work (WE1 FY2025-26)

Nov 24 2025

GGoncalves-WMF updated subscribers of T410796: NEW FEATURE REQUEST: Temp Accounts on Wikistats.

From a quick discussion at the DE team meeting:

Nov 24 2025, 6:04 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Engineering-Wikistats, Movement-Insights
GGoncalves-WMF updated subscribers of T410796: NEW FEATURE REQUEST: Temp Accounts on Wikistats.

This looks like a very reasonable change, and Q3 sounds feasible; I'm not entirely sure how complex this is, but it looks more like "weeks" of effort than "quarters" or "days".

Nov 24 2025, 2:11 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Engineering-Wikistats, Movement-Insights

Nov 20 2025

GGoncalves-WMF added a comment to T410528: Add monitoring / alerting on the number of MySQL queries done by Hive.

Sorry I'm a bit late to this - I just caught up a bit on the incident this morning.

Nov 20 2025, 12:01 PM · Data-Engineering, Sustainability (Incident Followup), Data-Platform-SRE (2025.11.07 - 2025.11.28)

Oct 29 2025

GGoncalves-WMF added a comment to T392065: Enable SEPA via Gr4vy for EU countries..

I've used the link to set up a donation. I get to the "thank you" page normally on my browser.

Oct 29 2025, 10:16 AM · MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Fundraising Sprint: UTM_key lime pie, Fundraising Sprint: Spaghetti code and makefiles, MW-1.45-notes (1.45.0-wmf.16; 2025-08-26), fr-current-sprint, payments-orchestration, Fundraising-Backlog

Oct 21 2025

GGoncalves-WMF renamed T406764: Provide a dbt-core development environment and production setup in the data-platform from Exlore the use of dbt-core and appropriate adapters in the data-platform environment to Explore the use of dbt-core and appropriate adapters in the data-platform environment.
Oct 21 2025, 1:50 PM · Patch-For-Review, Data-Engineering-Roadmap, Movement-Insights, Epic, Data-Platform-SRE
GGoncalves-WMF updated subscribers of T406531: NEWFEATURE REQUEST: Add new referral sources to pageview data.

@calbon mentioned today that we want to make sure to capture the referrals from x.com/twitter.com. We do look for Twitter in the Referer header, but not for X.

Oct 21 2025, 9:08 AM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Patch-For-Review, Essential-Work, Movement-Insights (FY25-26 H1), Data-Platform
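
A minimal sketch of the gap described in the comment above, assuming referrals are classified by the hostname of the Referer header. The host table, the labels, and the `classify_referer` helper are hypothetical and are not taken from the actual pageview pipeline; the point is only that x.com needs its own entry alongside twitter.com.

```python
from urllib.parse import urlsplit

# Hypothetical host table (an assumption for illustration only).
# Listing only "twitter.com" would miss referrals arriving from x.com.
SOCIAL_HOSTS = {
    "twitter.com": "Twitter/X",
    "x.com": "Twitter/X",
}


def classify_referer(referer: str) -> str:
    """Return a coarse referral label for a Referer header value."""
    # urlsplit(...).hostname is None for empty or scheme-less values.
    host = (urlsplit(referer).hostname or "").lower()
    for known, label in SOCIAL_HOSTS.items():
        # Match the host itself or any subdomain of it, so that
        # "mobile.twitter.com" matches but "notx.com" does not.
        if host == known or host.endswith("." + known):
            return label
    return "other"
```

Matching on the parsed hostname rather than on a substring of the full header avoids false positives from unrelated domains that merely end in "x.com".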

Oct 14 2025

GGoncalves-WMF added a comment to T401331: Request for a new request dataset for caching research.

Unfortunately it looks like we won't be able to prioritize this for the next couple of months in Data Engineering, but it's still something we want to get to. We'll revisit this in our next planning period towards the end of the year.

Oct 14 2025, 9:55 AM · Traffic, Data-Engineering

Oct 9 2025

GGoncalves-WMF reassigned T406531: NEWFEATURE REQUEST: Add new referral sources to pageview data from JAllemandou to Mayakp.wiki.

@JAllemandou and @OSefu-WMF just had a chat about this one and how we can address it, now and in the future.

Oct 9 2025, 4:04 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Patch-For-Review, Essential-Work, Movement-Insights (FY25-26 H1), Data-Platform

Sep 26 2025

GGoncalves-WMF added a comment to T401331: Request for a new request dataset for caching research.

Hi, just a quick update after my chat with Sukhbir. We should do this, not only for the value of the dataset itself, but also because it will be an excellent opportunity to make this kind of release a more documented and repeatable process.

Sep 26 2025, 9:43 AM · Traffic, Data-Engineering

Sep 25 2025

GGoncalves-WMF added a comment to T405535: Improve Superset's error message for draft dashboards.

I took a look at the upstream issue tracker and found this issue, which looks related. But that's from 2021 and the associated patch, which allows users to view draft dashboards, apparently went into the 1.2.0 release according to the GitHub tags on the right. We seem to be running Superset v4.2.0 at the moment.

Sep 25 2025, 9:22 AM · Data-Engineering-Radar, Data-Platform-SRE, Product-Analytics, superset.wikimedia.org

Sep 19 2025

GGoncalves-WMF added a comment to T221976: Have CDN edge set the `X-Request-Id` header for incoming external requests.

Thanks Valentín, I'm thinking probenet can be a useful signal, but our current focus is to experiment with specific known signals (e.g. presence of DOM properties). We'd like a bit more flexibility to deploy custom logic, and also control over what % of traffic we collect those signals from. More fundamentally, we believe we'll need the X-Request-Id correlation to even be able to evaluate with confidence whether client-side signals, including probenet, are helpful or not.

Sep 19 2025, 9:51 AM · MediaWiki-Platform-Team (Radar), Traffic, Platform Engineering (Icebox), SRE

Sep 15 2025

GGoncalves-WMF moved T370368: Gobblin-wmf Gitlab migration and maintenance from In progress to Done on the Data-Engineering (Q1 FY25/26 July 1st - September 30th) board.
Sep 15 2025, 3:43 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th), Essential-Work, Event-Platform
GGoncalves-WMF moved T402987: OpsWeek: Contact and/or remove dead dump mirrors from Blocked/Paused to Done on the Data-Engineering (Q1 FY25/26 July 1st - September 30th) board.
Sep 15 2025, 3:32 PM · Essential-Work, Data-Engineering (Q1 FY25/26 July 1st - September 30th)
GGoncalves-WMF moved T403169: Migrate and re-deploy eventgate-wikimedia using new service-utils from In progress to In Review on the Data-Engineering (Q1 FY25/26 July 1st - September 30th) board.
Sep 15 2025, 3:25 PM · Data-Engineering, MW-1.45-notes (1.45.0-wmf.20; 2025-09-23), Patch-For-Review, Event-Platform, service-utils

Sep 10 2025

GGoncalves-WMF updated subscribers of T403159: Airflow processes to import dump logs and generate monthly metrics.
Sep 10 2025, 4:28 PM · Data-Platform-SRE, Traffic, Data-Engineering, Wikidata, Wikidata Analytics
GGoncalves-WMF updated subscribers of T402963: Follow-up analysis to understand usage of dumps to inform v2 rollout.

That is a good question, I was just quoting @BTullis's wisdom in the original ticket (T383175#10440220).

Sep 10 2025, 10:06 AM · Product-Analytics (Kanban), Data-Engineering

Sep 8 2025

GGoncalves-WMF added a comment to T403863: Jupyterhub: Decide on/display escalation paths.

Nice, this makes a lot of sense! I think Jupyter users are generally used to going to either #data-platform-sre, #talk-to-data-engineering or (less frequently) #working-with-data for support.

Sep 8 2025, 10:31 AM · Data-Platform-SRE (2026.01.23 - 2026.02.13), Patch-For-Review, Essential-Work

Sep 2 2025

GGoncalves-WMF added a comment to T401892: Update MediaWiki Content History SLO draft for SRE review.

Excellent, thanks for coming up with this! It's very cool to see the performance of the real pipeline. My thoughts on the initial question:

Sep 2 2025, 10:13 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th)

Aug 27 2025

GGoncalves-WMF added a comment to T402963: Follow-up analysis to understand usage of dumps to inform v2 rollout.

Thanks KC, and agreed: if we need to make this into a dashboard for ongoing tracking, we should look into importing those logs into the Data Lake.

Aug 27 2025, 1:02 PM · Product-Analytics (Kanban), Data-Engineering
GGoncalves-WMF updated subscribers of T402963: Follow-up analysis to understand usage of dumps to inform v2 rollout.

Thanks Halley, a couple of comments and follow-up questions please:

Aug 27 2025, 10:23 AM · Product-Analytics (Kanban), Data-Engineering

Aug 26 2025

GGoncalves-WMF closed T400963: Data for display orientation in mobile views of Wikimedia projects as Resolved.

You're welcome, happy to help!

Aug 26 2025, 4:55 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)

Aug 20 2025

GGoncalves-WMF added a comment to T400963: Data for display orientation in mobile views of Wikimedia projects.

We had to ask around a little, and unfortunately we don't have a precise answer as we don't really instrument device orientation directly. However, we do instrument device viewport width (aggregated to buckets) for mobile web page views and keep it for 90 days, and I think that can help give you an approximate answer.

Aug 20 2025, 10:28 AM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)

Aug 14 2025

GGoncalves-WMF moved T401892: Update MediaWiki Content History SLO draft for SRE review from Incoming (new tickets) to Q1 FY25/26 July 1st - September 30th on the Data-Engineering board.
Aug 14 2025, 9:33 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th)
GGoncalves-WMF created T401892: Update MediaWiki Content History SLO draft for SRE review.
Aug 14 2025, 9:24 AM · Data-Engineering (Q3 FY25/26 January 1st - March 31th)

Aug 11 2025

GGoncalves-WMF moved T366487: Event Platform schemas should not support type changes to structs as array element or map value types from In progress to Done on the Data-Engineering (Q1 FY25/26 July 1st - September 30th) board.
Aug 11 2025, 3:34 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th), Event-Platform
GGoncalves-WMF updated subscribers of T400963: Data for display orientation in mobile views of Wikimedia projects.
Aug 11 2025, 2:30 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)

Aug 6 2025

GGoncalves-WMF closed T400282: Review Image Suggestion pipeline SLOs as Resolved.
Aug 6 2025, 1:03 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)
GGoncalves-WMF updated subscribers of T400282: Review Image Suggestion pipeline SLOs.
Aug 6 2025, 1:03 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)
GGoncalves-WMF updated subscribers of T400282: Review Image Suggestion pipeline SLOs.

I met with @mfossati to discuss this. The main takeaways are:

Aug 6 2025, 1:02 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)
GGoncalves-WMF updated subscribers of T400282: Review Image Suggestion pipeline SLOs.
Aug 6 2025, 12:56 PM · Data-Engineering (Q1 FY25/26 July 1st - September 30th)

Jul 8 2025

GGoncalves-WMF renamed T398958: Update documentation for banner_activity_minutely dataset. from Update documentation for banner_activity dataset. to Update documentation for banner_activity_minutely dataset..
Jul 8 2025, 12:39 PM · Data-Engineering, Data-Engineering-Radar
GGoncalves-WMF created T398958: Update documentation for banner_activity_minutely dataset..
Jul 8 2025, 12:39 PM · Data-Engineering, Data-Engineering-Radar

Jul 3 2025

GGoncalves-WMF attached a referenced file: F62816359: image.png.
Jul 3 2025, 9:41 AM · Data-Platform-SRE (2025.07.26 - 2025.08.15)
GGoncalves-WMF attached a referenced file: F62816190: image.png.
Jul 3 2025, 9:41 AM · Data-Platform-SRE (2025.07.26 - 2025.08.15)
GGoncalves-WMF created T398599: "no healthy upstream" errors in Datahub..
Jul 3 2025, 9:40 AM · Data-Platform-SRE (2025.07.26 - 2025.08.15)

Jun 19 2025

GGoncalves-WMF added a comment to T397338: Enable async queries for Superset with Celery.

Excellent discussion, thank you both.

Jun 19 2025, 9:29 AM · Data-Platform-SRE, Data-Engineering

Jun 17 2025

GGoncalves-WMF updated subscribers of T395988: [Request] User Research for Data Engineering / Data Platform.
Jun 17 2025, 3:36 PM · Research
GGoncalves-WMF added a comment to T395988: [Request] User Research for Data Engineering / Data Platform.

Hey Debra, thank you and welcome as well :) Sorry for the delay here, I wanted to get Virginia's input again to make sure we go in a sensible direction with this.

Jun 17 2025, 9:08 AM · Research

Jun 10 2025

GGoncalves-WMF closed T395428: Requesting access to analytics-privatedata-users, SSH and Kerberos for GGoncalves-WMF as Resolved.

Yep, I was able to run kinit, set my password and run it again. Thank you!

Jun 10 2025, 2:05 PM · SRE, SRE-Access-Requests
GGoncalves-WMF reopened T395428: Requesting access to analytics-privatedata-users, SSH and Kerberos for GGoncalves-WMF as "Open".

I was testing these credentials in the past couple of days. I can use Superset (so I'm in analytics-privatedata-users) and log in to bastion and stat machines (so SSH is also fine).

Jun 10 2025, 12:19 PM · SRE, SRE-Access-Requests

Jun 4 2025

GGoncalves-WMF created T395988: [Request] User Research for Data Engineering / Data Platform.
Jun 4 2025, 7:50 AM · Research

May 28 2025

GGoncalves-WMF created T395428: Requesting access to analytics-privatedata-users, SSH and Kerberos for GGoncalves-WMF.
May 28 2025, 9:40 AM · SRE, SRE-Access-Requests