
ori (Ori Livneh)
Senior Grepper

User Details

User Since
Oct 3 2014, 4:18 AM (583 w, 4 d)
Availability
Available
IRC Nick
ori
LDAP User
Ori
MediaWiki User
ATDT

Recent Activity

Nov 9 2025

ori added a comment to T142542: LoggedOut cookie not set anymore.

The cookie has been defunct for almost a decade. The code should be deleted.

Nov 9 2025, 2:57 AM · Wikimedia-Performance-recommendation, MediaWiki-Platform-Team (Radar), Patch-For-Review, Platform Team Legacy (Watching / External), Regression, MediaWiki-User-login-and-signup
ori closed T293577: Error 500 after upgrading to PHP 7.4: Call to undefined method PHPVersionCheck::setFormat() as Invalid.

Obsolete, as MediaWiki now requires PHP 8.1.0+

Nov 9 2025, 1:25 AM · MediaWiki-General
ori closed T26712: Remove superfluous db freeResult calls as Resolved.

freeResult() was fully removed in 1.39:

Nov 9 2025, 1:22 AM · MediaWiki-General, Technical-Debt

Nov 8 2025

ori updated subscribers of T385404: Deploy Lilypond 2.24 with cairo support to shellbox containers.

@akosiaris Do you have any suggestions for getting this task un-stuck?

Nov 8 2025, 4:08 PM · serviceops, Upstream, Wikimedia-SVG-rendering, MediaWiki-extensions-Score

Nov 3 2025

ori closed T409075: Migrate ori to a FIDO-backed key as Resolved.
Nov 3 2025, 3:06 PM · SRE, SRE-Access-Requests
ori created T409075: Migrate ori to a FIDO-backed key.
Nov 3 2025, 2:18 PM · SRE, SRE-Access-Requests

Sep 20 2025

ori closed T391516: https://performance.wikimedia.org/php-profiling/ leads to 404 for all listed sources as Resolved.

I cleaned up the residual instances manually on arclamp1001 and arclamp2001.

Sep 20 2025, 4:09 PM · SRE Observability (FY2025/2026-Q1), Regression, observability, Arc-Lamp, WikimediaDebug

May 19 2025

ori added a comment to T393859: Consider deploying ChessBrowser to Wikipedias.

Deploying this to all Wikipedias would be risky. Chess is very popular, and without a dedicated team for this extension, bug reports and feature requests could pile up quickly.

May 19 2025, 4:47 AM · Wikimedia-Extension-setup, Wikimedia-extension-review-queue, Wikimedia-Site-requests, ChessBrowser
ori closed T239446: Behavior on mobile, with screen readers, and without javascript as Resolved.
May 19 2025, 4:25 AM · MW-1.38-notes (1.38.0-wmf.2; 2021-09-28), MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Accessibility, Mobile, ChessBrowser
ori updated the task description for T239446: Behavior on mobile, with screen readers, and without javascript.
May 19 2025, 4:24 AM · MW-1.38-notes (1.38.0-wmf.2; 2021-09-28), MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Accessibility, Mobile, ChessBrowser
ori added a comment to T239446: Behavior on mobile, with screen readers, and without javascript.

With T362586 resolved, I believe the final TODO is complete.

May 19 2025, 4:24 AM · MW-1.38-notes (1.38.0-wmf.2; 2021-09-28), MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Accessibility, Mobile, ChessBrowser
ori added a comment to T393513: Fatal exception of type "DBUnexpectedError: Database servers in extension1 are overloaded." affecting page views.

Are there active TCP sockets on the client app server matching the sleeping MySQL sessions? Knowing this would help pinpoint whether these are idle connections that are held by PHP workers or if these are orphaned (half-opened) connections.
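
The check asked about here can be framed as a set comparison. This is a minimal sketch, assuming the server's process list reports each sleeping client as a `host:port` string and that the established sockets on the app server have already been collected as `(host, port)` pairs. The function name and both input shapes are hypothetical and not taken from the task.

```python
def classify_sessions(sleeping_sessions, client_sockets):
    """Split sleeping MySQL sessions by whether the client still holds the socket.

    sleeping_sessions: iterable of "host:port" strings as seen by the server
                       (assumed input shape).
    client_sockets:    set of (host, port) tuples for sockets that are still
                       established on the client (assumed input shape).
    """
    held, orphaned = [], []
    for peer in sleeping_sessions:
        # Split on the last colon so the port is always the final component.
        host, _, port = peer.rpartition(":")
        if (host, int(port)) in client_sockets:
            held.append(peer)      # idle, but a PHP worker still owns the socket
        else:
            orphaned.append(peer)  # the server sees a session the client no longer has
    return held, orphaned
```

Sessions that land in the second list would point toward half-open connections rather than idle connections held by workers.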

May 19 2025, 1:12 AM · serviceops, Patch-For-Review, MW-1.44-notes (1.44.0-wmf.28; 2025-05-06), MW-1.45-notes (1.45.0-wmf.1; 2025-05-13), Editing-team (Tracking), DBA, Wikimedia-production-error

Apr 29 2025

ori updated the task description for T392788: Memory leak in excimer_log_get_speedscope_data.
Apr 29 2025, 2:32 PM · MediaWiki-Platform-Team (Radar), MediaWiki-Engineering, Excimer

Apr 28 2025

ori created T392788: Memory leak in excimer_log_get_speedscope_data.
Apr 28 2025, 4:50 AM · MediaWiki-Platform-Team (Radar), MediaWiki-Engineering, Excimer

Apr 24 2025

ori closed T391516: https://performance.wikimedia.org/php-profiling/ leads to 404 for all listed sources as Resolved.
Apr 24 2025, 11:03 PM · SRE Observability (FY2025/2026-Q1), Regression, observability, Arc-Lamp, WikimediaDebug

Apr 23 2025

ori closed T385199: Gather PHP 8.1 profiling data, a subtask of T383845: MediaWiki on PHP 8.1 production traffic ramp-up, as Resolved.
Apr 23 2025, 4:39 PM · Patch-For-Review, serviceops
ori closed T385199: Gather PHP 8.1 profiling data as Resolved.
Apr 23 2025, 4:39 PM · serviceops, Arc-Lamp

Mar 20 2025

ori added a comment to T385199: Gather PHP 8.1 profiling data.

Ack, thanks for the heads up.

Mar 20 2025, 6:10 AM · serviceops, Arc-Lamp

Jan 19 2025

ori added a comment to T376267: ☂ Wikitech account linking and SUL error reporting.

Hi, and thanks!

Jan 19 2025, 6:48 PM · wikitech.wikimedia.org
ori added a comment to T376267: ☂ Wikitech account linking and SUL error reporting.
Wikitech account/LDAP: ori
SUL account: ATDT
Account linked on IDM: Y
I have visited MediaWiki:Loginprompt: Y
I have tried to reset my password using Special:PasswordReset: Y
Jan 19 2025, 5:57 PM · wikitech.wikimedia.org

Sep 22 2024

ori closed T363075: Dataset of top-requested JPEG thumbnails as Resolved.

@VirginiaPoundstone Sorry for the delay. Yes, that should do the trick. Thank you.

Sep 22 2024, 6:43 AM · Data Pipelines, Test Kitchen, Data-Platform

Apr 21 2024

ori created T363075: Dataset of top-requested JPEG thumbnails.
Apr 21 2024, 8:27 PM · Data Pipelines, Test Kitchen, Data-Platform

Apr 8 2024

ori added a comment to T361888: [SPIKE] Determine best solution to solve out-of-memory errors on JPG thumbnail generation.

Hi Mark! Could you summarize the back-and-forth? What were the alternatives considered?

Apr 8 2024, 2:34 PM · Thumbor, Structured-Data-Backlog

Feb 1 2024

ori added a comment to T310087: Advance declaration of query parameters.

In lieu of exporting a route map, MediaWiki could, as a first pass at the problem, emit a response header that signals to the CDN that a request contained garbage parameters. The CDN could use this information to throttle clients that issue too many such requests. This may be less desirable than filtering all such requests at the edge, but it is also simpler.
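
The proposal can be sketched roughly as follows, under stated assumptions: the header name `X-Unrecognized-Params`, the allow-list of parameters, and the per-client strike counter are all hypothetical illustrations, not an existing MediaWiki or CDN interface.

```python
from collections import Counter
from urllib.parse import parse_qsl

KNOWN_PARAMS = {"title", "action", "oldid"}   # hypothetical allow-list
SIGNAL_HEADER = "X-Unrecognized-Params"       # hypothetical header name


def response_headers(query_string):
    """Application side: flag requests that carry unrecognized parameters."""
    names = {key for key, _ in parse_qsl(query_string, keep_blank_values=True)}
    unknown = names - KNOWN_PARAMS
    return {SIGNAL_HEADER: str(len(unknown))} if unknown else {}


class Throttle:
    """Edge side: count flagged responses per client and throttle past a limit."""

    def __init__(self, limit):
        self.limit = limit
        self.strikes = Counter()

    def observe(self, client, headers):
        if SIGNAL_HEADER in headers:
            self.strikes[client] += 1
        # True means the client has reached the limit and should be throttled.
        return self.strikes[client] >= self.limit
```

The edge never needs to know which parameters are valid in this arrangement; it only reacts to the signal, which is what makes this simpler than filtering against an exported route map.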

Feb 1 2024, 10:16 PM · User-MoritzMuehlenhoff, SRE, Traffic, MediaWiki-General

Sep 29 2023

ori updated the task description for T347660: Portable performance test representative of Wikimedia's production environment.
Sep 29 2023, 4:14 AM · Wikimedia-Performance-recommendation, Performance Issue, MediaWiki-Core-Benchmarker
ori created T347660: Portable performance test representative of Wikimedia's production environment.
Sep 29 2023, 4:13 AM · Wikimedia-Performance-recommendation, Performance Issue, MediaWiki-Core-Benchmarker

Aug 10 2023

ori closed T341471: A 'cache-control' header contains directives with invalid values: 'stale-while-revalidate=60' as Invalid.

It's a bug in webhint, AFAICT. It thinks stale-while-revalidate should not hold a value, but that is wrong. This is the problematic code:

Aug 10 2023, 9:14 PM · MediaWiki-Platform-Team, Performance-Team, MediaWiki-ResourceLoader
ori closed T244711: wmerrors needs tests as Resolved.
Aug 10 2023, 8:35 PM · Performance-Team, Test-Coverage, php-wmerrors

Jul 31 2023

ori added a comment to T211661: Automatically clean up unused thumbnails in Swift.

The other thing I can't quite leave alone is - why are we being asked for some thumbnails so often? Shouldn't the CDN be caching thumbs? If we served each thumb only once in that 24 hour period, that would have saved about 54 million requests to swift (which is 29% of the requests swift served), which is non-trivial...

Commonest-served thumbs on that day (with request counts):

8924 wikipedia-commons-local-thumb.8e/8/8e/Edit_remove.svg/15px-Edit_remove.svg.png
8053 wikipedia-commons-local-thumb.2c/2/2c/Broom_icon.svg/22px-Broom_icon.svg.png
6268 wikipedia-commons-local-thumb.de/d/de/Wynn.svg/25px-Wynn.svg.png
6264 wikipedia-commons-local-thumb.33/3/33/Crystal_Clear_action_viewmag.png/22px-Crystal_Clear_action_viewmag.png
6258 wikipedia-commons-local-thumb.1e/1/1e/Font_Awesome_5_solid_arrow-down.svg/19px-Font_Awesome_5_solid_arrow-down.svg.png
6256 wikipedia-commons-local-thumb.b2/b/b2/Font_Awesome_5_solid_arrow-up.svg/19px-Font_Awesome_5_solid_arrow-up.svg.png
5706 wikipedia-commons-local-thumb.b3/b/b3/Broom_icon_ref.svg/22px-Broom_icon_ref.svg.png
4990 wikipedia-commons-local-thumb.33/3/33/Crystal_Clear_action_viewmag.png/21px-Crystal_Clear_action_viewmag.png
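
The "served only once" estimate is the total request count minus the number of distinct objects. A minimal sketch of that arithmetic, assuming the per-object counts are available as a mapping from object path to request count (the function name is hypothetical):

```python
def requests_saved_if_served_once(counts):
    """Return how many backend requests disappear if each object is served once.

    counts: mapping of object path -> number of requests in the period.
    """
    # Every request beyond the first for a given object is avoidable.
    return sum(n - 1 for n in counts.values() if n > 0)
```

Applied to just the eight thumbnails listed above, which were requested 52,719 times in total, this gives 52,711 avoidable requests.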
Jul 31 2023, 12:46 PM · MediaWiki-Platform-Team (Radar), Performance Issue, Traffic, SRE-swift-storage, SRE

Jul 24 2023

ori added a comment to T211661: Automatically clean up unused thumbnails in Swift.

I also don't know how well Swift would handle 15k QPS of object metadata updates (cf. T211661#8377883).

Jul 24 2023, 3:53 PM · MediaWiki-Platform-Team (Radar), Performance Issue, Traffic, SRE-swift-storage, SRE
ori added a comment to T211661: Automatically clean up unused thumbnails in Swift.

Right. Now I remember. The initial expiration is indeed supposed to be set by Thumbor. The necessary functionality had some trouble landing in the Wikimedia Thumbor plugin repo, but it has since landed.

Jul 24 2023, 3:19 PM · MediaWiki-Platform-Team (Radar), Performance Issue, Traffic, SRE-swift-storage, SRE
ori added a comment to T211661: Automatically clean up unused thumbnails in Swift.

@MatthewVernon: my understanding is that rewrite.py is currently setting expiry headers for thumbnails on retrieval from Swift -- is that correct, and does that mean some thumbnails are already getting expired?

Jul 24 2023, 1:15 PM · MediaWiki-Platform-Team (Radar), Performance Issue, Traffic, SRE-swift-storage, SRE

May 8 2023

ori reopened T328842: Restructure paws away from special networking, a subtask of T328968: Revert changes in T328967, as Open.
May 8 2023, 5:29 AM · PAWS
ori reopened T328842: Restructure paws away from special networking, a subtask of T328971: Remove old ingress attach public IP to VM, as Open.
May 8 2023, 5:29 AM · PAWS
ori reopened T328842: Restructure paws away from special networking as "Open".
May 8 2023, 5:29 AM · PAWS
ori added a comment to T328842: Restructure paws away from special networking.

This is really confusing.

May 8 2023, 5:27 AM · PAWS

Apr 18 2023

ori added a comment to T334895: XSS via Graph extension.

Vega ships an optional interpreter that can evaluate graph expressions by traversing an AST and performing each operation, rather than relying on runtime code generation. Per https://github.com/vega/vega/pull/3019#issuecomment-749107902, the interpreter mode is not the default because it is 10% slower. Seems like a negligible price to me. This seems like the only sensible option for keeping support for graph expressions but rooting out XSS vectors systematically.

Apr 18 2023, 3:51 PM · Security-Incidents, SecTeam-Processed, WMDE-TechWish-Sprint-2023-04-05, Editing-team, Vuln-XSS, MediaWiki-extensions-Graph, Security, Security-Team

Mar 5 2023

ori added a comment to T330766: Decommission the EditorActivation instrument.

@phuedx I don't know, sorry.

Mar 5 2023, 6:56 AM · Data Engineering and Event Platform Team, MW-1.41-notes (1.41.0-wmf.2; 2023-03-27), Data-Engineering, Technical-Debt, MediaWiki-extensions-WikimediaEvents, Product-Analytics, Event-Platform

Feb 14 2023

ori added a comment to T327440: Post-deployment Vector 2022 metrics analysis on English Wikipedia.

Does the edits graph in T327440#8542723 include bots? Bots may not be a large proportion of users but they do contribute a large proportion of edits.

Feb 14 2023, 3:48 AM · Product-Analytics (Kanban), Web-Team-Backlog-Archived

Jan 13 2023

ori added a comment to T326607: Future of liuggio/statsd-php-client?.

+1 to @Tgr's proposal

Jan 13 2023, 1:49 AM · MediaWiki-Platform-Team (Radar), MediaWiki-libs-Stats, SRE Observability, observability, serviceops-radar, Technical-Debt

Jan 10 2023

ori added a comment to T326607: Future of liuggio/statsd-php-client?.

It might be worth it to try and contact the library's co-maintainer. His contact info is at https://eatingco.de/about/.

Jan 10 2023, 3:07 AM · MediaWiki-Platform-Team (Radar), MediaWiki-libs-Stats, SRE Observability, observability, serviceops-radar, Technical-Debt

Nov 14 2022

ori added a comment to T322964: reviewer comments missing on a specific change.

Nov 14 2022, 12:02 AM · Gerrit (Gerrit 3.5)

Oct 18 2022

ori updated subscribers of T316706: Run user-submitted code under gVisor.

@Jdforrester-WMF: the Beta Cluster instance of the function-evaluator now runs under gVisor. Some additional work will be required to make the production instance of the function-evaluator run under gVisor. There is documentation here: https://gvisor.dev/docs/user_guide/quick_start/kubernetes/.

Oct 18 2022, 4:05 PM · Abstract Wikipedia team, function-evaluator
ori updated the task description for T316706: Run user-submitted code under gVisor.
Oct 18 2022, 3:58 PM · Abstract Wikipedia team, function-evaluator
ori added a comment to T275945: Create Wikifunctions.org.

I created a new task for the alerts, T321099. Let's continue there.

Oct 18 2022, 3:56 PM · Patch-For-Review, MW-1.41-notes (1.41.0-wmf.19; 2023-07-25), User-Urbanecm, Wiki-Setup (Create), Epic, Abstract Wikipedia team (Phase λ – Launch)
ori updated subscribers of T321099: ProbeSlow alerts for Wikifunctions on Beta Cluster.

Wikifunctions on the Beta Cluster uses the *.wikimedia.beta.wmflabs.org wildcard cert, and the CertAlmostExpired alert was caused by automatic certificate renewal being broken on the Beta Cluster in general. T293585 is the issue; it looks like Valentin and Giuseppe fixed it.

Oct 18 2022, 3:54 PM · Abstract Wikipedia team
ori created T321099: ProbeSlow alerts for Wikifunctions on Beta Cluster.
Oct 18 2022, 3:47 PM · Abstract Wikipedia team

Oct 14 2022

ori added a comment to T318258: Decommission the EditConflict instrument.

@phuedx I'm not aware of anything actively using it, no, but I'm also out of the loop -- can you ask someone on the performance team to confirm?

Oct 14 2022, 2:13 PM · MW-1.40-notes (1.40.0-wmf.6; 2022-10-17), Performance-Team (Radar), MediaWiki-extensions-WikimediaEvents

Oct 12 2022

ori placed T307742: Memoize Wikifunction functions calls in memcached up for grabs.
Oct 12 2022, 2:36 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Abstract Wikipedia team, MW-1.40-notes (1.40.0-wmf.13; 2022-12-05)
ori closed T307699: Formalize the semantics of the function model, a subtask of T296326: Discuss How to Implement Unions, as Resolved.
Oct 12 2022, 2:35 PM · Abstract Wikipedia team
ori closed T307699: Formalize the semantics of the function model as Resolved.
Oct 12 2022, 2:35 PM · 2022 Wikimedia Google.org Fellowship, Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T307699: Formalize the semantics of the function model.

Done by Ali: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Semantics_of_Wikifunctions

Oct 12 2022, 2:35 PM · 2022 Wikimedia Google.org Fellowship, Abstract Wikipedia team (Phase θ – Throttling)
ori closed T307700: Observability for function-* services as Resolved.
Oct 12 2022, 2:33 PM · Abstract Wikipedia team (Phase θ – Throttling), 2022 Wikimedia Google.org Fellowship, function-evaluator, function-orchestrator
ori closed T307820: Prototype Abstract Wikipedia in Scribunto as Resolved.
Oct 12 2022, 2:33 PM · Abstract Wikipedia team (Phase θ – Throttling), 2022 Wikimedia Google.org Fellowship
ori added a comment to T308250: Should Wikifunctions use a WebAssembly runtime?.

Relevant: Provably-Safe Multilingual Software Sandboxing using WebAssembly

Oct 12 2022, 2:32 PM · Abstract Wikipedia team, 2022 Wikimedia Google.org Fellowship
ori closed T310199: Select fastest correct implementation as Declined.
Oct 12 2022, 2:31 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori closed T310093: Investigate why function evaluation is slow as Resolved.
Oct 12 2022, 2:31 PM · Abstract Wikipedia team (Phase θ – Throttling), 2022 Wikimedia Google.org Fellowship
ori closed T314788: Performance analysis documentation for Wikifunctions as Declined.
Oct 12 2022, 2:30 PM · Abstract Wikipedia team (Phase θ – Throttling), 2022 Wikimedia Google.org Fellowship
ori added a comment to T316706: Run user-submitted code under gVisor.

I've cherry-picked the two Puppet patches on the beta cluster. The mediawiki-function-evaluator service is now running under gVisor.

Oct 12 2022, 2:24 PM · Abstract Wikipedia team, function-evaluator

Oct 11 2022

ori added a comment to T316879: Make gVisor packages available via apt.wikimedia.org.

Never mind, I see that it is available for Bullseye -- sorry.

Oct 11 2022, 3:12 PM · Patch-For-Review, Infrastructure-Foundations, serviceops
ori added a comment to T316879: Make gVisor packages available via apt.wikimedia.org.

@Joe the Wikifunctions Beta Cluster instance is running Bullseye -- could you also pull it in there?

Oct 11 2022, 3:04 PM · Patch-For-Review, Infrastructure-Foundations, serviceops

Sep 8 2022

ori closed T315019: HTTP 500 errors from Beta Cluster Wikifunctions health-check API endpoint as Resolved.

There are no outstanding issues that are specific to the Beta Cluster environment, AFAIK.

Sep 8 2022, 5:28 PM · MW-1.39-notes (1.39.0-wmf.27; 2022-08-29), Abstract Wikipedia team (Phase θ – Throttling)
ori closed T316886: Internal server error when calling function on NLG types as Resolved.
Sep 8 2022, 5:27 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori triaged T315403: Framework for running experiments on a subset of the app server fleet as Low priority.
Sep 8 2022, 2:31 PM · serviceops, SRE

Sep 6 2022

ori closed T285312: Enable Logging in Backend Services, a subtask of T299598: Add security limits to the Wikifunctions system to maintain stability and integrity of the content, as Resolved.
Sep 6 2022, 4:02 PM · Abstract Wikipedia team (Phase λ – Launch), Epic
ori closed T285312: Enable Logging in Backend Services as Resolved.

@cmassaro We have some logging now, and instructions on Wikitech on how to access the logs. I think there are more places where we can add additional logging to make debugging easier, but that is better dealt with on an ongoing basis than a dedicated task.

Sep 6 2022, 4:02 PM · Abstract Wikipedia team (Phase θ – Throttling), function-evaluator, function-orchestrator
ori closed T290700: Use a Proper Logging Module in Orchestrator as Resolved.
Sep 6 2022, 3:59 PM · Patch-For-Review, Abstract Wikipedia Fix-It tasks, Abstract Wikipedia team (Phase θ – Throttling), function-orchestrator
ori closed T290700: Use a Proper Logging Module in Orchestrator, a subtask of T285312: Enable Logging in Backend Services, as Resolved.
Sep 6 2022, 3:58 PM · Abstract Wikipedia team (Phase θ – Throttling), function-evaluator, function-orchestrator
ori updated subscribers of T317064: History pages' caches not being invalidated after edits.

I suspect this is fallout from the URL query sorting change (cc @ori) not invalidating the cache of history pages properly.

Sep 6 2022, 3:44 AM · Patch-For-Review, Performance-Team (Radar), MediaWiki-Core-HTTP-Cache, SRE, Regression, Traffic, MediaWiki-Page-history

Sep 2 2022

ori added a comment to T132418: Evaluate using 'stale-while-revalidate' HTTP cache control.

Chrome is shipping this as of Chrome 75. Time to reconsider!

Sep 2 2022, 7:02 PM · MW-1.39-notes, MediaWiki-Platform-Team, MW-1.40-notes (1.40.0-wmf.14; 2022-12-12), Patch-For-Review, MediaWiki-ResourceLoader, Performance-Team
ori added a comment to T316886: Internal server error when calling function on NLG types.

Can we see the function call that was sent? Even just copying the ZObject from expert mode in the UI will help.

Sep 2 2022, 5:25 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T316886: Internal server error when calling function on NLG types.

That no longer looks like an error that would be specific to the Beta Cluster environment. @AAssaf-WMF, can you see if you get the same error locally?

Sep 2 2022, 3:19 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T316886: Internal server error when calling function on NLG types.

The API Sandbox request in the task description is still failing, but the underlying error is now different:

Sep 2 2022, 3:09 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T316859: Usage of babel?.

I think you can create a patch to remove it from package.json, and we'll see if all the integration tests pass. If anything breaks after merge we can always revert easily.

Sep 2 2022, 2:26 PM · Abstract Wikipedia team (Phase κ – Clean-up), WikiLambda
ori updated the task description for T310880: Post-creation work for pcmwiki.
Sep 2 2022, 2:10 PM · Wiki-Setup

Sep 1 2022

ori renamed T239609: The N'Ko language cannot be looked up by its English name in the languages search box on Mobile web from The N'Ko language cannot be looked up by it's English name in the languages search box on Mobile web to The N'Ko language cannot be looked up by its English name in the languages search box on Mobile web.
Sep 1 2022, 8:39 PM · MobileFrontend (Language overlay), patch-welcome
ori added a comment to T316886: Internal server error when calling function on NLG types.

OK, it looks like the default User-Agent string sent by node-fetch is blocked by Varnish:
https://github.com/wikimedia/puppet/blob/9843300dba/modules/varnish/templates/wikimedia-frontend.vcl.erb#L716-L718
We need to set a custom user-agent string for the orchestrator.

Sep 1 2022, 7:26 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T316886: Internal server error when calling function on NLG types.

Ok, I hacked in some debugging code to include the HTML body in the response, and it looks like the orchestrator is getting an error page with the message:

Sep 1 2022, 6:05 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T316886: Internal server error when calling function on NLG types.

It seems that the orchestrator is getting an invalid response from the MediaWiki API:

Sep 1 2022, 5:36 PM · Abstract Wikipedia team (Phase θ – Throttling)
ori added a comment to T268678: Make Wikifunctions a true multi-lingual wiki, exposing content in each language to readers and search engines with parity.

When a page on a wiki is updated, MediaWiki sends purge requests to the CDN layer to invalidate objects in the cache. Currently, this is URL-based. So, for example, if I edit the article on 'Science' on enwiki, MediaWiki will send purge requests to Varnish for the following URLs:

Sep 1 2022, 4:48 PM · Abstract Wikipedia team, MW-1.41-notes (1.41.0-wmf.19; 2023-07-25)
ori reopened T315019: HTTP 500 errors from Beta Cluster Wikifunctions health-check API endpoint as "Open".

We're seeing errors again.

Sep 1 2022, 4:15 PM · MW-1.39-notes (1.39.0-wmf.27; 2022-08-29), Abstract Wikipedia team (Phase θ – Throttling)
ori updated the task description for T316879: Make gVisor packages available via apt.wikimedia.org.
Sep 1 2022, 3:50 PM · Patch-For-Review, Infrastructure-Foundations, serviceops
ori created T316879: Make gVisor packages available via apt.wikimedia.org.
Sep 1 2022, 3:50 PM · Patch-For-Review, Infrastructure-Foundations, serviceops

Aug 30 2022

ori added a project to T316706: Run user-submitted code under gVisor: Abstract Wikipedia team.
Aug 30 2022, 7:35 PM · Abstract Wikipedia team, function-evaluator
ori created T316706: Run user-submitted code under gVisor.
Aug 30 2022, 7:34 PM · Abstract Wikipedia team, function-evaluator
ori closed T138093: Investigate query parameter normalization for MW/services as Resolved.

This is now rolled out for text frontends.

Aug 30 2022, 4:07 PM · MW-1.39-notes (1.39.0-wmf.25; 2022-08-15), Patch-For-Review, Traffic-Icebox, Platform Team Legacy (Watching / External), Services (watching), SRE, MediaWiki-General
ori updated the task description for T314868: Roll out query parameter normalization.
Aug 30 2022, 3:52 PM · MW-1.39-notes (1.39.0-wmf.23; 2022-08-01), Patch-For-Review, Traffic, SRE, MediaWiki-General
ori added a comment to T314868: Roll out query parameter normalization.

This is now complete. Many thanks to @Vgutierrez for partnering with me to get this rolled out.

Aug 30 2022, 2:51 PM · MW-1.39-notes (1.39.0-wmf.23; 2022-08-01), Patch-For-Review, Traffic, SRE, MediaWiki-General
ori closed T314868: Roll out query parameter normalization, a subtask of T138093: Investigate query parameter normalization for MW/services, as Resolved.
Aug 30 2022, 2:35 PM · MW-1.39-notes (1.39.0-wmf.25; 2022-08-15), Patch-For-Review, Traffic-Icebox, Platform Team Legacy (Watching / External), Services (watching), SRE, MediaWiki-General
ori closed T314868: Roll out query parameter normalization as Resolved.
Aug 30 2022, 2:34 PM · MW-1.39-notes (1.39.0-wmf.23; 2022-08-01), Patch-For-Review, Traffic, SRE, MediaWiki-General

Aug 28 2022

ori added a comment to T315398: Set MW appserver scaling_governor to performance.

I tried setting EPP to 0 using x86_energy_perf_policy, thinking that bypassing the sysfs interface and writing directly to the MSR would make the setting sticky. Unfortunately this does not seem to be the case -- the EPP is gradually reset to 128, same as when you tried changing it via sysfs. At this point I also don't see value in further experimentation with the EPP knob and agree that performance is the way to go.
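
To watch whether a setting sticks, the per-CPU values can be polled from sysfs. This is a minimal sketch, assuming the cpufreq sysfs layout exposed by intel_pstate, where each `cpuN/cpufreq` directory contains `scaling_governor` and `energy_performance_preference`. The function name is hypothetical, and the root path is a parameter so the reader can be pointed at a different tree.

```python
from pathlib import Path


def read_cpufreq_settings(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {file_name: value_or_None}} for each CPU with cpufreq."""
    settings = {}
    # cpu[0-9]* matches cpu0, cpu1, ... but not sibling dirs such as cpuidle.
    for cpufreq in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        entry = {}
        for name in ("scaling_governor", "energy_performance_preference"):
            path = cpufreq / name
            # A file may be absent, depending on the driver in use.
            entry[name] = path.read_text().strip() if path.exists() else None
        settings[cpufreq.parent.name] = entry
    return settings
```

Calling this at intervals after changing the EPP would show when, and on which CPUs, the value drifts back.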

Aug 28 2022, 9:14 PM · Performance-Team (Radar), SRE

Aug 26 2022

ori added a comment to T315398: Set MW appserver scaling_governor to performance.

Actually, let me not step on your toes. But if you can tolerate a short extension of this task, I would very much like to see this setting tested. I think there is a good chance it will give the same or very similar performance increase with less waste of power. Just to be fully explicit, the setting is:

Aug 26 2022, 11:12 AM · Performance-Team (Radar), SRE
ori added a comment to T315398: Set MW appserver scaling_governor to performance.

So 'powersave' with EPP=0 gives a broader range of operating frequencies than 'performance'. We should see if in this mode the frequency scaling is still responsive enough for the workload.

Aug 26 2022, 7:33 AM · Performance-Team (Radar), SRE

Aug 22 2022

ori updated the task description for T314868: Roll out query parameter normalization.
Aug 22 2022, 9:55 AM · MW-1.39-notes (1.39.0-wmf.23; 2022-08-01), Patch-For-Review, Traffic, SRE, MediaWiki-General

Aug 20 2022

ori added a comment to T290700: Use a Proper Logging Module in Orchestrator.

Unfortunately, continuation-local-storage and its more modern counterpart, AsyncLocalStorage, come with a substantial performance cost, particularly for workloads with a lot of async/await calls. I don't think we can afford the performance penalty.

Aug 20 2022, 1:58 AM · Patch-For-Review, Abstract Wikipedia Fix-It tasks, Abstract Wikipedia team (Phase θ – Throttling), function-orchestrator

Aug 19 2022

ori added a comment to T252719: Upgrade thumbor to Thumbor 7 and python3.

@Vlad.shapik thank you, but what about the other points I raised?

Aug 19 2022, 3:23 PM · Patch-For-Review, Thumbor Migration, Python3-Porting
ori added a parent task for T240685: MediaWiki Prometheus support: T315403: Framework for running experiments on a subset of the app server fleet.
Aug 19 2022, 2:25 PM · MW-1.44-notes (1.44.0-wmf.8; 2024-12-17), Patch-For-Review, SRE Observability (FY2023/2024-Q4), MW-1.41-notes (1.41.0-wmf.28; 2023-09-26), MW-1.40-notes (1.40.0-wmf.27; 2023-03-13), MW-1.38-notes (1.38.0-wmf.19; 2022-01-24), MediaWiki-libs-Stats, Platform Team Workboards (External Code Reviews), serviceops, SRE, MediaWiki-General, observability
ori added a subtask for T315403: Framework for running experiments on a subset of the app server fleet: T240685: MediaWiki Prometheus support.
Aug 19 2022, 2:25 PM · serviceops, SRE

Aug 18 2022

ori added a comment to T315398: Set MW appserver scaling_governor to performance.

Congratulations, this is a huge win! I think we should dig deeper to see if we can get the same or similar performance benefit, but waste less power.

Aug 18 2022, 8:21 PM · Performance-Team (Radar), SRE
ori added a comment to T252719: Upgrade thumbor to Thumbor 7 and python3.

Thanks @roman-stolar. I think it's a mistake to combine (a) changes from (multiple?) upstream(s), (b) unmerged changes from Gerrit, and (c) your own work into a single commit, as in Icabc39dab. This destroys some useful history (for example, the authorship, review comments and discussion on Id6ec6d62c), and it makes future reconciliation with upstream code harder. It's also error-prone.

Aug 18 2022, 3:40 PM · Patch-For-Review, Thumbor Migration, Python3-Porting