Gilles (Gilles Dubuc)
Engineering Manager, WMF · Administrator

User Details

User Since
Oct 7 2014, 8:24 AM (335 w, 2 h)
Roles
Administrator
Availability
Available
IRC Nick
gilles
LDAP User
Gilles
MediaWiki User
GDubuc (WMF) [ Global Accounts ]

I am the engineering manager of the Wikimedia Foundation Performance team.

I am currently the Wikimedia Foundation's W3C Advisory Committee representative.

Recent Activity

Today

Gilles added a comment to T267270: Determine multi-dc strategy for CentralAuth.

@tstarling thank you for your in-depth response. Based on what you've just described, are the following items the minimum changes required to make CentralAuth work in an active-active setup?

Tue, Mar 9, 8:49 AM · Platform Team Workboards (Clinic Duty Team), Performance-Team (Radar), serviceops, Sustainability (MediaWiki-MultiDC), Code-Health-Objective, Platform Team Initiatives (Session Management Service (CDP2))

Yesterday

Gilles closed T256723: Review sampling rate for perceived performance survey as Resolved.

I guess my comment might have given a sufficient answer to your concerns? I'll close the task since it's been inactive for a long time now. Feel free to reopen if necessary.

Mon, Mar 8, 7:50 PM · QuickSurveys (Surveys), Performance-Team
Gilles moved T276121: asoranking timer failed on stat1007 from Inbox to Radar on the Performance-Team board.
Mon, Mar 8, 7:47 PM · Performance-Team (Radar)
Gilles moved T276112: Regression: Page Previews and Reference Previews tip triangle is in the wrong location in RTL wikis from Inbox to Radar on the Performance-Team board.
Mon, Mar 8, 7:46 PM · Performance-Team (Radar), MW-1.36-notes (1.36.0-wmf.34; 2021-03-09), Readers-Web-Backlog (Kanbanana-FY-2020-21), Page-Previews, RTL, I18n
Gilles moved T276195: Allow more than five wikis as projects in Global Watchlist from Inbox to Radar on the Performance-Team board.
Mon, Mar 8, 7:44 PM · Performance-Team (Radar), User-DannyS712, MediaWiki-extensions-GlobalWatchlist
Gilles moved T228467: thumbtime/seek thumbnailing broken with floating point offsets from Inbox to Backlog: Maintenance on the Performance-Team board.
Mon, Mar 8, 7:41 PM · Patch-For-Review, Performance-Team, Regression, Thumbor
Gilles changed the status of T228467: thumbtime/seek thumbnailing broken with floating point offsets from Open to Stalled.
Mon, Mar 8, 7:39 PM · Patch-For-Review, Performance-Team, Regression, Thumbor
Gilles moved T276279: Logo svgs for various projects are not optimized from Inbox to Radar on the Performance-Team board.
Mon, Mar 8, 7:37 PM · Performance-Team (Radar), Logos, Desktop Improvements, Wikimedia-Site-requests
Gilles added a comment to T276668: Regression: Settings button misplaced in article preview.

@Nomsterio let me know if you have time to look into this one as well, thanks!

Mon, Mar 8, 7:35 PM · Patch-For-Review, Performance-Team, Readers-Web-Backlog (Kanbanana-FY-2020-21), Page-Previews
Gilles assigned T276668: Regression: Settings button misplaced in article preview to Nomsterio.
Mon, Mar 8, 7:35 PM · Patch-For-Review, Performance-Team, Readers-Web-Backlog (Kanbanana-FY-2020-21), Page-Previews
Gilles moved T276826: Fundraising banner with inline SVG triggers `Uncaught TypeError: node.className.replace is not a function` for layout shift due to SVGAnimatedString className attribute. from Inbox to Doing: Prio Interrupt on the Performance-Team board.
Mon, Mar 8, 7:33 PM · Patch-For-Review, JavaScript, MediaWiki-extensions-NavigationTiming, Performance-Team, Wikimedia-production-error
Gilles claimed T276826: Fundraising banner with inline SVG triggers `Uncaught TypeError: node.className.replace is not a function` for layout shift due to SVGAnimatedString className attribute..
Mon, Mar 8, 7:33 PM · Patch-For-Review, JavaScript, MediaWiki-extensions-NavigationTiming, Performance-Team, Wikimedia-production-error

Thu, Mar 4

Gilles updated the task description for T276465: puppet admin module: Assign approvers to unix groups.
Thu, Mar 4, 3:15 PM · SRE, Puppet

Wed, Mar 3

Gilles added a comment to T276121: asoranking timer failed on stat1007.

Fixed by applying https://issues.apache.org/jira/browse/HIVE-19231

Wed, Mar 3, 11:04 AM · Performance-Team (Radar)
Gilles reassigned T276121: asoranking timer failed on stat1007 from Gilles to elukey.
Wed, Mar 3, 11:04 AM · Performance-Team (Radar)

Tue, Mar 2

Gilles closed T233644: SCAP python error on successful deploy as Invalid.

Hasn't happened to me since, I think, but I rarely deploy MediaWiki things these days. Closing it on the assumption that it magically went away; we can always reopen if someone else runs into it.

Tue, Mar 2, 5:53 PM · Release-Engineering-Team, Scap
Gilles added a comment to T276195: Allow more than five wikis as projects in Global Watchlist.

In that case, given that users like @1234qwer1234qwer4 and @IKhitron essentially generate the same amount of requests using the script as a workaround, for valid reasons since they're active editors on dozens of wikis, I think we should increase the limit to 50.

Tue, Mar 2, 5:17 PM · Performance-Team (Radar), User-DannyS712, MediaWiki-extensions-GlobalWatchlist
Gilles added a comment to T276121: asoranking timer failed on stat1007.

I can't find a way to start the unit myself. Is there a way I could temporarily be allowed to do that? Or maybe I tried the wrong commands. Being able to start the unit myself would let me iterate on the script until I find a fix (latest attempt failed).

Tue, Mar 2, 2:51 PM · Performance-Team (Radar)
Gilles updated the badge description for W3C AC rep.
Tue, Mar 2, 12:26 PM
Gilles added a comment to T276121: asoranking timer failed on stat1007.

@elukey please try running it again with the new version of /home/gilles/T276121.py which makes some encoding parameters explicit.

Tue, Mar 2, 12:24 PM · Performance-Team (Radar)
Gilles added a comment to T276121: asoranking timer failed on stat1007.

A workaround might be to make the encoding/decoding explicit in the Python script. Currently it might inherit it from the shell or systemd. I'll try to modify /home/gilles/T276121.py to that effect.
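
A minimal sketch of what making the encoding explicit could look like, assuming the script shells out with `subprocess`. The actual contents of /home/gilles/T276121.py are not shown here, so the helper name `run_with_explicit_encoding`, the `C.UTF-8` locale and the stand-in command are all hypothetical.

```python
import os
import subprocess
import sys


def run_with_explicit_encoding(cmd, encoding="utf-8"):
    """Run a command and decode its output with a fixed encoding.

    The child also receives explicit locale variables, so neither side
    depends on what the shell or systemd happens to provide.
    """
    env = dict(os.environ)
    env["LANG"] = "C.UTF-8"    # assumption: this locale exists on the host
    env["LC_ALL"] = "C.UTF-8"
    env["PYTHONIOENCODING"] = encoding  # only matters if the child is Python
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
        encoding=encoding,   # decode stdout/stderr explicitly
        errors="replace",    # never raise on an undecodable byte
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for the real beeline invocation, which is not shown in the task.
    print(run_with_explicit_encoding(
        [sys.executable, "-c", "print('caf\\u00e9,1')"]
    ))
```

With both the environment and the decoding pinned, the same call should behave identically from an interactive shell and from a systemd timer, at least as far as character encoding is concerned.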

Tue, Mar 2, 11:45 AM · Performance-Team (Radar)
Gilles added a comment to T276121: asoranking timer failed on stat1007.

The output is very different, which explains the parsing failures:

Tue, Mar 2, 11:38 AM · Performance-Team (Radar)
Gilles closed T276192: Turnilo split thresholds too low as Invalid.

Ah, yes, 100 is hardcoded, so I guess we'll see 100 countries at least. Thanks for that link, it let me find the drop-down menu that I didn't know existed to override the default split limit picked. 100 countries is probably good enough for now, and it's going to let me see all the buckets for Load Event End.

Tue, Mar 2, 10:32 AM · Analytics
Gilles updated subscribers of T276195: Allow more than five wikis as projects in Global Watchlist.

Thanks for the info. I see that you have 100+ edits on 26 wikis: https://meta.wikimedia.org/wiki/Special:CentralAuth?target=1234qwer1234qwer4 I presume that's the bulk of the wikis you watch this way?

Tue, Mar 2, 10:19 AM · Performance-Team (Radar), User-DannyS712, MediaWiki-extensions-GlobalWatchlist
Gilles added a comment to T276195: Allow more than five wikis as projects in Global Watchlist.

I would like to know whether and how people are working around this limit. For instance, does sticking to the limit of 5 mean that a significant number of users would still use the old gadget (what limit did that have?). Do they instead keep many browser tabs open on dozens of projects and refresh those manually?

Tue, Mar 2, 10:04 AM · Performance-Team (Radar), User-DannyS712, MediaWiki-extensions-GlobalWatchlist
Gilles added a comment to T276121: asoranking timer failed on stat1007.

We can put a throwaway script into a new systemd service/timer that will serve as a repro case and output what beeline gives. You can use /home/gilles/T276121.py for that purpose. Just set up a systemd unit that runs it, and its logs should contain the beeline output when run the same way through Python (at INFO level).

Tue, Mar 2, 9:51 AM · Performance-Team (Radar)
Gilles added a comment to T276121: asoranking timer failed on stat1007.

The errors showed up in the log and suggest that beeline was returning unexpected data when it ran through the timer. They just made the script unable to process the data and do its job.

Tue, Mar 2, 7:49 AM · Performance-Team (Radar)
Gilles updated the task description for T276192: Turnilo split thresholds too low.
Tue, Mar 2, 7:38 AM · Analytics
Gilles created T276192: Turnilo split thresholds too low.
Tue, Mar 2, 7:37 AM · Analytics

Mon, Mar 1

Gilles added a comment to T276121: asoranking timer failed on stat1007.

It runs fine without pandas csv parsing issues when run outside of the systemd timer, doing this:

Mon, Mar 1, 6:53 PM · Performance-Team (Radar)
Gilles created T276121: asoranking timer failed on stat1007.
Mon, Mar 1, 5:30 PM · Performance-Team (Radar)

Thu, Feb 25

Gilles added a comment to T275275: Convince Wikimedia Foundation to support reincanation of W3C working group Math.

OK, I'll ask our legal department if Moritz could join the MathML WG under the Foundation's W3C membership. I'll let you know when I hear back from them.

Thu, Feb 25, 9:22 AM · WMF-Strategy, Research, Math

Mon, Feb 22

Gilles moved T275319: Raise limit of $wgMaxArticleSize for Hebrew Wikisource from Inbox to Radar on the Performance-Team board.
Mon, Feb 22, 7:28 PM · Performance-Team (Radar), Language-Team, SRE, Wikimedia-Site-requests
Gilles added a comment to T275319: Raise limit of $wgMaxArticleSize for Hebrew Wikisource.

The fact that each character takes twice the storage space shouldn't affect parsing complexity and time, right? I'm not familiar with our parsing code but I don't imagine it would do any sub-character processing.
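
To illustrate the storage point, here is a small sketch comparing character count and UTF-8 byte count. The sample strings and the helper name `char_and_byte_counts` are illustrative only and are not taken from the task.

```python
def char_and_byte_counts(text):
    """Return (number of code points, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


# Four Hebrew letters, written as escapes to avoid bidirectional-text issues.
hebrew = "\u05e9\u05dc\u05d5\u05dd"
latin = "test"

print(char_and_byte_counts(hebrew))  # (4, 8)
print(char_and_byte_counts(latin))   # (4, 4)
```

Both strings are four characters long, but the Hebrew one takes twice as many bytes, so a limit expressed in bytes is reached with half as much text.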

Mon, Feb 22, 7:21 PM · Performance-Team (Radar), Language-Team, SRE, Wikimedia-Site-requests

Sun, Feb 21

Gilles added a comment to T261341: Performance review of WikiLambda extension.

Sounds good to me, thanks for requesting it early!

Sun, Feb 21, 7:06 AM · Abstract Wikipedia (Phase δ), Performance-Team
Gilles updated the badge description for W3C AC rep.
Sun, Feb 21, 6:39 AM
Gilles updated the badge description for W3C AC rep.
Sun, Feb 21, 6:38 AM
Gilles updated the badge description for W3C AC rep.
Sun, Feb 21, 6:37 AM
Gilles updated the badge description for W3C AC rep.
Sun, Feb 21, 6:37 AM
Gilles awarded W3C AC rep to recipient: Gilles.
Sun, Feb 21, 6:36 AM
Gilles created W3C AC rep.
Sun, Feb 21, 6:36 AM
Gilles updated Gilles.
Sun, Feb 21, 6:32 AM
Gilles updated Gilles.
Sun, Feb 21, 6:31 AM
Gilles added a comment to T275275: Convince Wikimedia Foundation to support reincanation of W3C working group Math.

I am the WMF's W3C AC representative. Thanks for bringing my attention to this, I hadn't been paying attention to the AC mailing list recently and missed that announcement. I've reviewed the charter, MathML and math on the web are obviously things Wikimedia should support. I'm surprised that there was only a community group for this until now. I've voted in support of the proposal.

Sun, Feb 21, 6:28 AM · WMF-Strategy, Research, Math

Tue, Feb 16

Gilles moved T274041: Investigate performance impact of HookContainer loading 500+ interfaces from Inbox to Radar on the Performance-Team board.
Tue, Feb 16, 7:43 PM · Performance-Team (Radar), Patch-For-Review, Platform Engineering, Performance Issue, MediaWiki-Core-Hooks
Gilles moved T274446: Avoid storing serialized ResourceLoaderFilePath in ExtensionRegistry cache from Inbox to Backlog: Maintenance on the Performance-Team board.
Tue, Feb 16, 7:41 PM · Performance-Team, MediaWiki-ResourceLoader, MediaWiki-Configuration

Mon, Feb 15

Gilles added a comment to T260504: Get rid of remaining non-Thumbor MediaWiki image scaling in WMF production.

Something we will have to watch out for when this gets deployed is that some Swift containers for "temp" might be missing some privileges for the Thumbor Swift user. This happened before (containers that were supposed to have the rights applied didn't when they came in use) and would show up as errors in the Thumbor logs.

Mon, Feb 15, 8:50 AM · MW-1.36-notes (1.36.0-wmf.32; 2021-02-23), MW-on-K8s, Performance-Team (Radar), Thumbor

Thu, Feb 11

Gilles added a comment to T260504: Get rid of remaining non-Thumbor MediaWiki image scaling in WMF production.

I don't think there's anything that would prevent wikis that don't have UploadWizard enabled from using $wgUploadStashScalerBaseUrl.

Thu, Feb 11, 8:04 AM · MW-1.36-notes (1.36.0-wmf.32; 2021-02-23), MW-on-K8s, Performance-Team (Radar), Thumbor

Wed, Feb 10

Gilles closed T245552: noc.wikimedia.org with X-Wikimedia-Debug routes to mwdebug but host is not served there as Resolved.

Thanks @jijiki !

Wed, Feb 10, 4:38 PM · Performance-Team (Radar), Developer Productivity, Traffic, MediaWiki-Debug-Logger, SRE
Gilles added a comment to T268167: Fetch mwdebug backend server list from noc.wikimedia.org.

The new version of the WikimediaDebug extension is good to go; we just have to tag and push a new version, which takes a few minutes. Only T245552 needs to be fixed before we do, and I've just filed a patch for it: https://gerrit.wikimedia.org/r/c/operations/puppet/+/663156/

Wed, Feb 10, 10:04 AM · serviceops, WikimediaDebug, Performance-Team
Gilles added a comment to T268167: Fetch mwdebug backend server list from noc.wikimedia.org.

The code is already there. We are still getting requests to update that list in the current reality: T274026: Add 3 new canaries to WikimediaDebug

Wed, Feb 10, 9:53 AM · serviceops, WikimediaDebug, Performance-Team
Gilles reopened T268167: Fetch mwdebug backend server list from noc.wikimedia.org as "Open".

Didn't test this enough, it's actually blocked by T245552: noc.wikimedia.org with X-Wikimedia-Debug routes to mwdebug but host is not served there. As soon as you pick a debug server from the extension and reload the page, opening the extension again will fail, as the attempt to fetch the server list from noc.wikimedia.org errors.

Wed, Feb 10, 9:48 AM · serviceops, WikimediaDebug, Performance-Team
Gilles added a comment to T245552: noc.wikimedia.org with X-Wikimedia-Debug routes to mwdebug but host is not served there.

This is needed before we can release the new version of the WikimediaDebug extension that reads the server list from noc.wikimedia.org. Otherwise, as soon as you enable a debug server, you can no longer fetch the server list.

Wed, Feb 10, 9:36 AM · Performance-Team (Radar), Developer Productivity, Traffic, MediaWiki-Debug-Logger, SRE
Gilles closed T274342: noc.wikimedia.org is a 404 when X-Wikimedia-Debug is enabled as Invalid.

Oh, indeed. That's exactly what I'm running into.

Wed, Feb 10, 9:33 AM · Internet-Archive, SRE
Gilles created T274342: noc.wikimedia.org is a 404 when X-Wikimedia-Debug is enabled.
Wed, Feb 10, 9:30 AM · Internet-Archive, SRE

Tue, Feb 9

Gilles reassigned T274228: Phabricator should cache tasks for a few minutes for logged-out users from 20after4 to mmodell.
Tue, Feb 9, 6:48 PM · SRE, Traffic, Phabricator
Gilles assigned T274228: Phabricator should cache tasks for a few minutes for logged-out users to 20after4.
Tue, Feb 9, 6:41 PM · SRE, Traffic, Phabricator
Gilles awarded Web Perf Hero to recipient: Joe.
Tue, Feb 9, 9:26 AM
Gilles updated subscribers of T274228: Phabricator should cache tasks for a few minutes for logged-out users.

@epriestley seeing your old comment on a related issue:

Tue, Feb 9, 9:06 AM · SRE, Traffic, Phabricator
Gilles created T274228: Phabricator should cache tasks for a few minutes for logged-out users.
Tue, Feb 9, 8:58 AM · SRE, Traffic, Phabricator

Mon, Feb 8

Gilles moved T273739: Get performance team green light for Cloud NAT to wikis change from Inbox to Doing (old) on the Performance-Team board.
Mon, Feb 8, 6:54 PM · Performance-Team (Radar), cloud-services-team (Kanban), Cloud-VPS
Gilles assigned T273739: Get performance team green light for Cloud NAT to wikis change to Krinkle.
Mon, Feb 8, 6:53 PM · Performance-Team (Radar), cloud-services-team (Kanban), Cloud-VPS
Gilles moved T274023: Convert mwdebug VMs to debian buster from Inbox to Radar on the Performance-Team board.
Mon, Feb 8, 6:47 PM · Performance-Team (Radar), Patch-For-Review, serviceops
Gilles moved T274026: Add 3 new canaries to WikimediaDebug from Inbox to Doing (old) on the Performance-Team board.
Mon, Feb 8, 6:47 PM · Performance-Team, WikimediaDebug
Gilles claimed T274026: Add 3 new canaries to WikimediaDebug.
Mon, Feb 8, 6:47 PM · Performance-Team, WikimediaDebug

Feb 4 2021

Gilles added a comment to T273741: Investigate unusual media traffic pattern for AsterNovi-belgii-flower-1mb.jpg on Commons.

You could even serve another image in its place to this UA, with some text and an email address to contact. You'd probably find out pretty quickly what it is from users of that mysterious thing. A throwaway email address is probably best 🙂

Feb 4 2021, 4:15 PM · Patch-For-Review, Commons, Traffic, SRE

Feb 3 2021

Gilles added a comment to T273741: Investigate unusual media traffic pattern for AsterNovi-belgii-flower-1mb.jpg on Commons.

Might be worth looking at the full unprocessed request headers? Do you have an example?

Feb 3 2021, 1:10 PM · Patch-For-Review, Commons, Traffic, SRE

Feb 2 2021

Gilles added a comment to T260913: Performance review of Wikipedia Preview.

Yes, by all means, create a new folder hierarchy in /static/ for what you need! It's fine if you duplicate some existing icons for the purpose of this project. Your feature should get enough traffic to keep your icons hot in cache.

Feb 2 2021, 8:07 PM · Wikipedia-Preview (Mobile-Prototype), Inuka-Team, Performance-Team
Gilles added a comment to T272530: Adding a graph to a page doubles JS payload on mobile and desktop.

This is not a surprise, indeed. We knew that moving this feature client-side only would have this impact, since Vega is huge and monolithic.

Feb 2 2021, 4:03 PM · Performance-Team (Radar), Readers-Web-Backlog (Tracking), Mobile, MediaWiki-extensions-Graph

Feb 1 2021

Gilles moved T248481: Mysterious replication lag observed by MW in Codfw from Inbox to Doing (old) on the Performance-Team board.
Feb 1 2021, 7:57 PM · Performance-Team, DBA, Patch-For-Review, Wikimedia-Rdbms
Gilles moved T272530: Adding a graph to a page doubles JS payload on mobile and desktop from Inbox to Radar on the Performance-Team board.
Feb 1 2021, 7:57 PM · Performance-Team (Radar), Readers-Web-Backlog (Tracking), Mobile, MediaWiki-extensions-Graph
Gilles moved T272979: Onboard Perf Team to new Alerting Toolset from Inbox to Radar on the Performance-Team board.
Feb 1 2021, 7:54 PM · Performance-Team (Radar), User-fgiunchedi, observability
Gilles moved T273247: Publish wikimedia/minify as its own repo and package from Inbox to Doing (old) on the Performance-Team board.
Feb 1 2021, 7:54 PM · Wikimedia-Minify, Patch-For-Review, Librarization, Performance-Team
Gilles moved T273249: MimeAnalyzer::improveTypeFromExtension() must be of the type string, null given from Inbox to Doing (old) on the Performance-Team board.
Feb 1 2021, 7:53 PM · Performance-Team (Radar), Platform Team Workboards (Clinic Duty Team), Patch-For-Review, MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), Commons, MediaWiki-File-management, User-brennen, MediaWiki-libs-Mime, MediaWiki-API
Gilles assigned T273249: MimeAnalyzer::improveTypeFromExtension() must be of the type string, null given to Krinkle.
Feb 1 2021, 7:53 PM · Performance-Team (Radar), Platform Team Workboards (Clinic Duty Team), Patch-For-Review, MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), Commons, MediaWiki-File-management, User-brennen, MediaWiki-libs-Mime, MediaWiki-API
Gilles reassigned T266904: Performance review of ext:StopForumSpam from Gilles to aaron.
Feb 1 2021, 7:50 PM · user-sbassett, Performance-Team
Gilles added a comment to T260913: Performance review of Wikipedia Preview.

37kb sounds a lot more reasonable, but what I'd like to know is why it needs to be bundled in the CSS and not hotlinked, with the images served from Wikimedia infrastructure. We do this for a number of icons: https://github.com/wikimedia/operations-mediawiki-config/tree/master/static

Feb 1 2021, 2:23 PM · Wikipedia-Preview (Mobile-Prototype), Inuka-Team, Performance-Team
Gilles added a comment to T272169: Regression: Page-Previews seem to be blurred for thumbnails with natural height less than 200px..

If I'm reading the new code correctly, this gets us from these thumbnail widths being used:

Feb 1 2021, 1:40 PM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Performance-Team (Radar), Readers-Web-Backlog (Kanbanana-FY-2020-21), Patch-For-Review, Page-Previews
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

All fixed:

Feb 1 2021, 11:03 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

loadEventEnd is just one of the fields recorded by the NavigationTiming schema. I notice that only the NavigationTiming schema is affected by the 15% drop, not PaintTiming, for example. On the dashboard you've linked to, I see that the amount of non-compliant events has dropped significantly.

Feb 1 2021, 8:00 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics

Jan 29 2021

Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

@Ottomata are events from the old pipeline still supposed to make it through to Kafka at this point for all these schemas? I think that a number of clients with cached old JS are going to keep sending events to the old endpoints for some time.

Jan 29 2021, 12:08 PM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

Canary events are now correctly filtered out. I'm a little skeptical that this will let us recover the missing 15% of report rate, when this error was only occurring once every 15 minutes. I have no idea where this drop caused by https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/659264/ might be coming from.

Jan 29 2021, 12:03 PM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

It is a canary message, I was just checking for those incorrectly:

Jan 29 2021, 11:33 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

Ignoring canary events didn't stop the error 🙁

Jan 29 2021, 11:16 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

It seems like this latest deployment of https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/659264/ caused another partial drop of report rate:

Jan 29 2021, 11:01 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

@Gilles, interesting, would it be possible to add checking if meta.domain === 'canary' and considering such an event 'non compliant', or discarding it altogether? See my comment here: https://phabricator.wikimedia.org/T271208#6778241
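
A rough sketch of the check suggested above, assuming events arrive as dictionaries with an optional `meta` block. Only the `meta.domain` field and the `'canary'` value come from the comment; the function names and the sample events are hypothetical, and this is not the navtiming code itself.

```python
def is_canary(event):
    """Return True if the event's meta.domain marks it as a canary."""
    meta = event.get("meta")
    if not isinstance(meta, dict):
        return False
    return meta.get("domain") == "canary"


def drop_canaries(events):
    """Keep only the events that are not canaries."""
    return [event for event in events if not is_canary(event)]


sample = [
    {"meta": {"domain": "canary"}, "schema": "NavigationTiming"},
    {"meta": {"domain": "en.wikipedia.org"}, "schema": "NavigationTiming"},
    {"schema": "PaintTiming"},  # no meta block at all
]
print(len(drop_canaries(sample)))  # 2
```

Treating a missing or malformed `meta` block as "not a canary" is a design choice in this sketch; a consumer could just as well count such events as non-compliant.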

Jan 29 2021, 8:31 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics

Jan 28 2021

Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

Seems like the fix worked as expected:

Jan 28 2021, 11:47 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

Seems caused by the navtiming agent expecting a userAgent key:

Jan 28 2021, 11:33 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.
Jan 28 2021, 11:29 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

Checking if the data is in Hive for all of these, comparing the amount of rows between 2021-01-26 and 2021-01-27:

Jan 28 2021, 11:24 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics
Gilles added a comment to T271208: NavigationTiming Extension schemas Event Platform Migration.

The migration of the 5 schemas to group0 and group1 yesterday caused a huge drop in report rate, something's not working.

Jan 28 2021, 11:00 AM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Analytics-Kanban, Patch-For-Review, Analytics-EventLogging, Performance-Team, Event-Platform, Analytics

Jan 27 2021

Gilles added a comment to T28741: Migrate file tables to a modern layout (image/oldimage; file/file_revision; add primary keys).

I don't think that pulling the original every time is desirable, it would cause a lot of unnecessary internal network traffic. Some of those documents are in the hundreds of MB. It could be a DDOS vector, even, if merely hitting a URL would trigger this mechanism.

Jan 27 2021, 4:15 PM · Patch-For-Review, Platform Engineering Roadmap Decision Making, Commons, Multimedia, Schema-change, MediaWiki-File-management
Gilles closed T273033: Coal graphs died around 2021-01-26 20:50 UTC as Resolved.

Restarting coal fixed the data, as expected:

Jan 27 2021, 4:00 PM · Analytics, Performance-Team
Gilles added a comment to T28741: Migrate file tables to a modern layout (image/oldimage; file/file_revision; add primary keys).

The thumbnailing service is unrelated to media DB tables. The point of the migration to Thumbor was to separate thumbnailing concerns from MediaWiki entirely. The fact that I still work on Thumbor is due to lack of ownership. I have no intention of undertaking a project as large as this data migration as a side project while being engineering manager. I'm probably already biting off more than I can chew with migrating Thumbor to Docker/Buster/Python 3 at the moment.

Jan 27 2021, 10:43 AM · Patch-For-Review, Platform Engineering Roadmap Decision Making, Commons, Multimedia, Schema-change, MediaWiki-File-management
Gilles triaged T273033: Coal graphs died around 2021-01-26 20:50 UTC as High priority.
Jan 27 2021, 9:36 AM · Analytics, Performance-Team
Gilles added a comment to T273033: Coal graphs died around 2021-01-26 20:50 UTC.

Seems like coal simply needed to be restarted; it hadn't been since python3-snappy was installed on the host a few days ago for navtiming's sake. Won't hurt to make the dependency explicit anyway.

Jan 27 2021, 9:26 AM · Analytics, Performance-Team
Gilles created T273033: Coal graphs died around 2021-01-26 20:50 UTC.
Jan 27 2021, 9:17 AM · Analytics, Performance-Team

Jan 25 2021

Gilles moved T269946: Enable webp thumbnails on all images for non-Commons wikis from Doing (old) to Backlog: Future Goals on the Performance-Team board.
Jan 25 2021, 7:48 PM · SRE, Traffic, Performance-Team
Gilles moved T272169: Regression: Page-Previews seem to be blurred for thumbnails with natural height less than 200px. from Doing (old) to Radar on the Performance-Team board.
Jan 25 2021, 7:47 PM · MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Performance-Team (Radar), Readers-Web-Backlog (Kanbanana-FY-2020-21), Patch-For-Review, Page-Previews
Gilles moved T271441: ResourceLoaderSkinModule features are not backwards compatible from Inbox to Radar on the Performance-Team board.
Jan 25 2021, 7:30 PM · MW-1.36-release, Performance-Team (Radar), MW-1.35-notes, MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), MediaWiki-ResourceLoader, MediaWiki-Core-Skin-Architecture