aaron (Aaron Schulz)
User

User Details

User Since
Oct 20 2014, 5:25 PM (396 w, 1 d)
Availability
Available
IRC Nick
AaronSchulz
LDAP User
Aaron Schulz
MediaWiki User
Aaron Schulz [ Global Accounts ]

Recent Activity

Yesterday

aaron added a comment to T275319: Raise limit of $wgMaxArticleSize for Hebrew Wikisource.

It would be good to look at the performance of pages at https://he.wikisource.org/wiki/%D7%9E%D7%99%D7%95%D7%97%D7%93:%D7%93%D7%A4%D7%99%D7%9D_%D7%90%D7%A8%D7%95%D7%9B%D7%99%D7%9D

Tue, May 24, 9:15 PM · Performance-Team (Radar), SRE, Wikimedia-Site-requests
aaron added a comment to T308893: Increase $wgMaxArticleSize to 4MB for ruwikisource.

It would be good to look at the performance of pages at https://ru.wikisource.org/wiki/%D0%A1%D0%BB%D1%83%D0%B6%D0%B5%D0%B1%D0%BD%D0%B0%D1%8F:%D0%94%D0%BB%D0%B8%D0%BD%D0%BD%D1%8B%D0%B5_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D1%8B

Tue, May 24, 7:19 PM · Performance-Team (Radar), SRE, Wikimedia-Site-requests, Russian-Sites
aaron moved T267990: Support stashing in page/html and revision/html endpoints in MW core from Inbox to Radar on the Performance-Team board.
Tue, May 24, 6:55 PM · Performance-Team (Radar), Platform Team Workboards (MW Expedition), Patch-For-Review, MediaWiki-REST-API
aaron moved T309126: Allow segmentation for large WANCache values from Inbox to Doing: Prio Interrupt on the Performance-Team board.
Tue, May 24, 6:38 PM · MediaWiki-libs-ObjectCache, Performance-Team
aaron edited projects for T309126: Allow segmentation for large WANCache values, added: MediaWiki-libs-ObjectCache; removed Wikimedia-Rdbms, User-brennen, Wikimedia-production-error.
Tue, May 24, 6:38 PM · MediaWiki-libs-ObjectCache, Performance-Team
aaron created T309126: Allow segmentation for large WANCache values.
Tue, May 24, 6:16 PM · MediaWiki-libs-ObjectCache, Performance-Team

Mon, May 23

aaron added a comment to T293530: Improve slow read query handling.

Note that MYSQLI_OPT_READ_TIMEOUT can only be set once per https://bugs.php.net/bug.php?id=76703

Mon, May 23, 10:37 PM · Platform Engineering, Sustainability (Incident Followup), SRE, DBA

Sat, May 21

aaron updated the task description for T308732: Cleanup interaction between per-job and root-job deduplication logic so they mix properly.
Sat, May 21, 12:03 AM · Patch-For-Review, MediaWiki-Core-JobQueue

Thu, May 19

aaron created T308732: Cleanup interaction between per-job and root-job deduplication logic so they mix properly.
Thu, May 19, 8:26 AM · Patch-For-Review, MediaWiki-Core-JobQueue

Wed, May 18

aaron updated subscribers of T273375: Raise minimum supported MySQL version to MySQL 5.6 (or later).

@aaron @jcrespo @tstarling Any thoughts on whether and how much to raise the MySQL requirement at this time for the 1.39 LTS release? (To be released Nov 2022.)

Wed, May 18, 4:43 AM · Data-Persistence (Consultation), Proposal, MediaWiki-Installer

Mon, May 16

aaron closed T162050: Implement GTID-MySQL support for mediawiki as Resolved.

Done in 315ffb840bf134e66d64998f3e5f88ac86c8ab26 .

Mon, May 16, 7:11 PM · Performance-Team, Sustainability, Wikimedia-Rdbms

Tue, May 10

aaron added a comment to T307133: DBSessionStateError: Cannot execute query from MediaWiki\Deferred\LinksUpdate\LinksUpdate::acquirePageLock while session status is ERROR.

Making RefreshLinksJob boost the net timeouts could work:

Tue, May 10, 2:21 AM · Patch-For-Review, Performance-Team, Wikimedia-Rdbms, User-brennen, Wikimedia-production-error
aaron added a comment to T307133: DBSessionStateError: Cannot execute query from MediaWiki\Deferred\LinksUpdate\LinksUpdate::acquirePageLock while session status is ERROR.
DBQueryDisconnectedError: A connection error occurred during a query. Query: COMMIT Function: RefreshLinksJob::runForTitle Error: 2006 MySQL server has gone away (db1104)
Tue, May 10, 2:05 AM · Patch-For-Review, Performance-Team, Wikimedia-Rdbms, User-brennen, Wikimedia-production-error

Mon, May 2

aaron added a comment to T306589: Add sharding to site_stats table.

I thought that we updated this table in autocommit mode within POSTSEND deferred updates. If there is a case that is not doing that, it should be fixed. Is there a user impact, then? Nothing stands out in the DBPerformance log either.

Mon, May 2, 6:52 PM · Performance-Team (Radar), DBA, Patch-For-Review

Thu, Apr 28

aaron closed T293859: Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query ... while transaction status is ERROR (after PHP timeout) as Resolved.
Thu, Apr 28, 5:25 PM · MW-1.39-notes (1.39.0-wmf.6; 2022-04-04), Commons, Performance-Team, Wikimedia-Rdbms, Wikimedia-production-error
aaron added a comment to T306729: Alert: Perf survey down as of Thu 2022-04-21 18:00.

The CSS for the response buttons seems to be wonky. The buttons look like plain text, discouraging their use.

Thu, Apr 28, 3:44 AM · QuickSurveys, Performance-Team

Tue, Apr 26

aaron added a comment to T293630: Investigate performance degradation at high concurrencies in php-fpm .

Using https://gist.github.com/AaronSchulz/28a2cc7701a33adca1479b5ff6530b2c and ab, APCu performance degradation was tested in a number of scenarios on a depooled host. When doing heavy writes to a set of keys of random sizes (128 bytes to 1MB), the global write locks slow down even simple read-only requests (e.g. apcu_fetch). Inducing memory fragmentation (reported by apc.php) only makes it worse. Another antipattern is quickly filling up the cache with an overly large working set and causing resets, which creates an endless cycle of sets and cache flushes, with reads being slow.

Tue, Apr 26, 3:16 AM · serviceops, Performance-Team

Mon, Apr 25

aaron added a comment to T306589: Add sharding to site_stats table.

Are there any metrics indicating the scale of the problem? Are there deadlocks or transaction slow downs? The idea sounds plausible.

Mon, Apr 25, 6:42 PM · Performance-Team (Radar), DBA, Patch-For-Review
aaron moved T306732: Consider applying critical sections to DeferredUpdates from Inbox to Backlog: Future Goals on the Performance-Team board.
Mon, Apr 25, 6:23 PM · Performance-Team, MediaWiki-General, Sustainability (Incident Followup)
aaron added a comment to T212129: Move MainStash out of Redis to a simpler multi-dc aware solution.

Next: Decide on how and whether to fragment the data in mainstashdb, e.g. like parser cache, like external store, or something else. @aaron to propose some ideas for DBAs to provide feedback/guidance on.

Mon, Apr 25, 5:45 PM · MW-1.39-notes (1.39.0-wmf.9; 2022-04-25), MW-1.38-notes (1.38.0-wmf.20; 2022-01-31), Patch-For-Review, Performance-Team, Sustainability (MediaWiki-MultiDC), MediaWiki-General, serviceops-radar, User-mobrovac, User-jijiki, SRE

Apr 19 2022

aaron placed T305384: Server error on https://de.wikipedia.org/wiki/Bild_(Zeitung) due to {{Auflagen-Diagramm}} up for grabs.
Apr 19 2022, 6:32 PM · affects-Kiwix-and-openZIM, Patch-For-Review, Performance-Team (Radar), MediaWiki-Parser, Performance Issue, Wikimedia-production-error
aaron updated subscribers of T9043: Templates should show parameters when not expanded due to preprocessor node count limit.
Apr 19 2022, 5:25 PM · MediaWiki-Templates
aaron closed T304960: Caller ignored an error originally raised from IndexPager::buildQueryInfo or ApiQueryUserContribs::execute as Resolved.
Apr 19 2022, 5:20 PM · MW-1.39-notes (1.39.0-wmf.7; 2022-04-11), Performance-Team, Wikimedia-Rdbms, Wikimedia-production-error

Apr 11 2022

aaron added a comment to T305615: Performance review of Extension:ImageSuggestions.

How many users might get notified in one run of the maintenance script? I doubt it would be a disk space issue, but we should make sure that the script throttles the Echo DB updates with batching and LBFactory::waitForReplication() calls.

Apr 11 2022, 7:04 PM · Performance-Team (Radar), Image-Suggestions, Structured-Data-Backlog (Current Work)
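
The throttling suggested in the comment above (batched writes with a replication wait between batches) can be sketched roughly as follows. This is only an illustration in Python: `notify_in_batches`, `write_batch`, and `wait_for_replication` are hypothetical names, with the last one standing in for a call such as LBFactory::waitForReplication().

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def notify_in_batches(
    user_ids: Sequence[int],
    batch_size: int,
    write_batch: Callable[[Sequence[int]], None],
    wait_for_replication: Callable[[], None],
) -> int:
    """Write notifications in small batches, pausing after each one.

    `write_batch` and `wait_for_replication` are hypothetical callbacks:
    the first performs the DB writes for one batch, the second blocks
    until replicas have caught up.
    """
    batches = 0
    for batch in chunked(user_ids, batch_size):
        write_batch(batch)
        wait_for_replication()  # keep replica lag bounded between batches
        batches += 1
    return batches


# Usage with in-memory stand-ins that just record what happened.
calls: List[str] = []
count = notify_in_batches(
    [1, 2, 3, 4, 5],
    2,
    lambda batch: calls.append("write %d" % len(batch)),
    lambda: calls.append("wait"),
)
print(count, calls)
```

Waiting after every batch trades some throughput for a bound on how far replicas can fall behind during a long maintenance run.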
aaron moved T304960: Caller ignored an error originally raised from IndexPager::buildQueryInfo or ApiQueryUserContribs::execute from Inbox to Doing: Prio Interrupt on the Performance-Team board.
Apr 11 2022, 6:58 PM · MW-1.39-notes (1.39.0-wmf.7; 2022-04-11), Performance-Team, Wikimedia-Rdbms, Wikimedia-production-error

Apr 4 2022

aaron changed the status of T96123: Make ReplicatedBagOStuff fail-over happen on corresponding slaves for redis masters from Resolved to Declined.

We are going with mainstash db for this reason. The remaining redis users do not need cross-dc replication.

Apr 4 2022, 6:27 PM · Performance-Team, MediaWiki-libs-ObjectCache, Sustainability

Mar 24 2022

aaron closed T303628: File: includes/objectcache/SqlBagOStuff.php: "PHP Warning: Undefined array key 1 @1090" as Declined.
Mar 24 2022, 5:38 PM · MW-1.39-notes (1.39.0-wmf.5; 2022-03-28), Performance-Team, MediaWiki-libs-ObjectCache, MW-1.37-release

Mar 22 2022

aaron added a comment to T303628: File: includes/objectcache/SqlBagOStuff.php: "PHP Warning: Undefined array key 1 @1090".

@Joerg.Bernau For MediaWiki <= 1.37 (before 65b1b6b56afd5b533a9213813ce9e6984c443ce4 ), you have to set $wgShellLocale to "C.UTF-8" .

Mar 22 2022, 10:22 PM · MW-1.39-notes (1.39.0-wmf.5; 2022-03-28), Performance-Team, MediaWiki-libs-ObjectCache, MW-1.37-release
aaron added a comment to T303628: File: includes/objectcache/SqlBagOStuff.php: "PHP Warning: Undefined array key 1 @1090".
aaron@SpectreX360:~/PhpstormProjects/mediawiki/core$ sudo locale-gen es_ES
[sudo] password for aaron: 
Generating locales (this might take a while)...
  es_ES.ISO-8859-1... done
Generation complete.
aaron@SpectreX360:~/PhpstormProjects/mediawiki/core$ sudo locale-gen es_ES.utf8
Generating locales (this might take a while)...
  es_ES.UTF-8... done
Generation complete.
aaron@SpectreX360:~/PhpstormProjects/mediawiki/core$ sudo update-locale
aaron@SpectreX360:~/PhpstormProjects/mediawiki/core$ wphp maintenance/shell.php 
Psy Shell v0.11.2 (PHP 8.0.8 — cli) by Justin Hileman
>>> setlocale(LC_ALL, 'es_ES.UTF-8');
=> "es_ES.UTF-8"
>>> explode( '.', sprintf( '%.6F', 14.4 ) );
=> [
     "14",
     "400000",
   ]
>>> explode( '.', sprintf( '%.6f', 14.446 ) );
=> [
     "14,446000",
   ]
>>>
Mar 22 2022, 4:26 AM · MW-1.39-notes (1.39.0-wmf.5; 2022-03-28), Performance-Team, MediaWiki-libs-ObjectCache, MW-1.37-release
aaron added a comment to T303628: File: includes/objectcache/SqlBagOStuff.php: "PHP Warning: Undefined array key 1 @1090".

@Joerg.Bernau What locale are you using (LC_*)? I think the sprintf() should use "F" instead of "f".

Mar 22 2022, 4:13 AM · MW-1.39-notes (1.39.0-wmf.5; 2022-03-28), Performance-Team, MediaWiki-libs-ObjectCache, MW-1.37-release
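
The shell session above shows the locale-aware "%.6f" producing "14,446000", so splitting on "." leaves no second element. One locale-independent alternative is to avoid string formatting altogether and split the number arithmetically. A minimal sketch in Python, where `split_timestamp` is a hypothetical helper and not the fix that was merged:

```python
import math
from typing import Tuple


def split_timestamp(mtime: float) -> Tuple[int, int]:
    """Split a float timestamp into (whole seconds, microseconds).

    Only arithmetic is used, so the decimal separator of the active
    locale never matters.
    """
    seconds = math.floor(mtime)
    micros = round((mtime - seconds) * 1_000_000)
    # Rounding can carry into the next whole second (e.g. x.9999996).
    if micros == 1_000_000:
        seconds += 1
        micros = 0
    return seconds, micros


# What the string-based approach runs into under a comma-decimal locale:
localized = "14,446000"  # the "%.6f" output seen in the session above
parts = localized.split(".")
print(len(parts))  # only one element, so index 1 does not exist

print(split_timestamp(14.446))
```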

Mar 21 2022

aaron removed a project from T304306: Investigate integration of FlaggedRevs ParserCache into ParserOutputAccess service: Performance-Team.
Mar 21 2022, 6:33 PM · Patch-For-Review, WMDE-TechWish-Sprint-2022-03-16, WMDE-GeoInfo-FocusArea, WMDE-TechWish

Mar 16 2022

aaron added a comment to T303092: quibble-vendor-mysql-php72-noselenium-docker fails for a noop PageTriage patch.

The method, getMultiWithUnionSetCallback(), seems to expect explicit "false" values for missing items...I'll update the docs/tests.

Mar 16 2022, 8:45 PM · MW-1.39-notes (1.39.0-wmf.5; 2022-03-28), Growth community maintenance, Growth-Team (Current Sprint), Patch-For-Review, ci-test-error (WMF-deployed Build Failure), Release-Engineering-Team, PageCuration, User-TheresNoTime
aaron added a comment to T292239: Flaky "LogicException: Trying to delete mock tables" failure in MW integration test.

Patch now gets:

SQL ERROR (ignored): FUNCTION wikidb.RELEASE_ALL_LOCKS does not exist (localhost:/workspace/db/quibble-mysql-5z8kvkdz/socket)

This is due to the MariaDB version not supporting that function (https://jira.mariadb.org/browse/MDEV-10569). Mine does, locally, and MySQL has since 2014. I'll have to use RELEASE_LOCK() then.

Mar 16 2022, 5:49 PM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), MW-1.38-notes, MW-1.38-release, Patch-For-Review, Platform Engineering, SQLite, MediaWiki-Core-Tests, ci-test-error (WMF-deployed Build Failure)
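
Falling back from RELEASE_ALL_LOCKS() to RELEASE_LOCK() means the client has to remember which named locks it holds and release them one by one. A rough sketch of that bookkeeping in Python, assuming a hypothetical `NamedLockTracker` class that only builds the SQL and parameters (it does not talk to a database, and it is not the MediaWiki implementation):

```python
from typing import List, Tuple

Statement = Tuple[str, Tuple[str, ...]]


class NamedLockTracker:
    """Remember acquired named locks so each can be released explicitly."""

    def __init__(self) -> None:
        self._held: List[str] = []

    def note_acquired(self, name: str) -> None:
        # Call this only after GET_LOCK() reported success for `name`.
        if name not in self._held:
            self._held.append(name)

    def release_all_sql(self) -> List[Statement]:
        """Return one RELEASE_LOCK() statement per held lock, newest first."""
        statements = [
            ("SELECT RELEASE_LOCK(%s)", (name,))
            for name in reversed(self._held)
        ]
        self._held.clear()
        return statements


tracker = NamedLockTracker()
tracker.note_acquired("page-links")
tracker.note_acquired("mock-tables")
for sql, params in tracker.release_all_sql():
    print(sql, params)
```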
aaron merged T281451: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-callbacks') into T303887: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-session').
Mar 16 2022, 5:45 PM · User-Ladsgroup, MW-1.39-notes (1.39.0-wmf.1; 2022-03-21-early), Wikimedia-production-error
aaron merged task T281451: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-callbacks') into T303887: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-session').
Mar 16 2022, 5:45 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), Performance-Team, Wikimedia-Rdbms, Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error
aaron closed T281451: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-callbacks') as Resolved.

I'm seeing this same error locally when testing specific behavior of some REST API endpoint changes for T303352, but having a hard time debugging the path from ParameterAssertionException being thrown here to this error being thrown. I think @aaron's comment above might explain the problem I'm seeing. Any chance of helping me step through this particular use case, @aaron? :)

Mar 16 2022, 5:39 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), Performance-Team, Wikimedia-Rdbms, Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error
aaron added a comment to T292239: Flaky "LogicException: Trying to delete mock tables" failure in MW integration test.

Having tweaked the error message, I now see:

Mar 16 2022, 4:49 PM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), MW-1.38-notes, MW-1.38-release, Patch-For-Review, Platform Engineering, SQLite, MediaWiki-Core-Tests, ci-test-error (WMF-deployed Build Failure)
aaron added a comment to T292239: Flaky "LogicException: Trying to delete mock tables" failure in MW integration test.

Thanks for working on this. These patches do not fix the bug either. We can't merge https://gerrit.wikimedia.org/r/771081 because it keeps failing the tests with the same error message.

Mar 16 2022, 4:48 PM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), MW-1.38-notes, MW-1.38-release, Patch-For-Review, Platform Engineering, SQLite, MediaWiki-Core-Tests, ci-test-error (WMF-deployed Build Failure)

Mar 15 2022

aaron closed T297424: Reduce LBFactory::rollbackPrimaryChanges() callers in core, a subtask of T281451: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-callbacks'), as Resolved.
Mar 15 2022, 8:21 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), Performance-Team, Wikimedia-Rdbms, Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error
aaron closed T297424: Reduce LBFactory::rollbackPrimaryChanges() callers in core, a subtask of T293859: Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query ... while transaction status is ERROR (after PHP timeout), as Resolved.
Mar 15 2022, 8:21 PM · MW-1.39-notes (1.39.0-wmf.6; 2022-04-04), Commons, Performance-Team, Wikimedia-Rdbms, Wikimedia-production-error
aaron closed T297424: Reduce LBFactory::rollbackPrimaryChanges() callers in core as Resolved.
Mar 15 2022, 8:21 PM · MW-1.38-notes (1.38.0-wmf.23; 2022-02-21), Platform Engineering, Performance-Team, Wikimedia-Rdbms
aaron added a comment to T303628: File: includes/objectcache/SqlBagOStuff.php: "PHP Warning: Undefined array key 1 @1090".

I don't really see how this is possible either. I'm curious what the type of the $mtime variable is in this case (should be float).

Mar 15 2022, 5:33 PM · MW-1.39-notes (1.39.0-wmf.5; 2022-03-28), Performance-Team, MediaWiki-libs-ObjectCache, MW-1.37-release

Mar 10 2022

aaron closed T227376: Move callers away from getMainObjectStash() that do not need it as Resolved.

Grepping for getMainObjectStash() callers, I don't see anything left to do here.

Mar 10 2022, 10:26 PM · Sustainability (MediaWiki-MultiDC), Platform Team Workboards (External Code Reviews), WMF-General-or-Unknown, Patch-For-Review, Performance-Team

Mar 8 2022

aaron added a comment to T303225: CI failure from non-fatal SqlModuleDependencyStore "database is locked" warning.

SqliteInstaller should probably set $wgMainStash to CACHE_DB since it already generates a separate DB with optimized settings (this error should be unlikely with the default PDO lock timeout)

Mar 8 2022, 9:43 PM · MW-1.39-notes (1.39.0-wmf.7; 2022-04-11), Performance-Team, ci-test-error (WMF-deployed Build Failure), MediaWiki-ResourceLoader

Mar 7 2022

aaron moved T302953: Deprecate wasDeadlock/wasLockTimeout/wasConnectionLoss as public IDatabase methods from Inbox to Backlog: Maintenance on the Performance-Team board.
Mar 7 2022, 7:25 PM · Technical-Debt (Deprecation process), Performance-Team, Wikimedia-Rdbms

Mar 3 2022

aaron renamed T302953: Deprecate wasDeadlock/wasLockTimeout/wasConnectionLoss as public IDatabase methods from wasDeadlock to Deprecate wasDeadlock/wasLockTimeout/wasConnectionLoss as public IDatabase methods.
Mar 3 2022, 5:37 AM · Technical-Debt (Deprecation process), Performance-Team, Wikimedia-Rdbms
aaron created T302953: Deprecate wasDeadlock/wasLockTimeout/wasConnectionLoss as public IDatabase methods.
Mar 3 2022, 5:35 AM · Technical-Debt (Deprecation process), Performance-Team, Wikimedia-Rdbms

Feb 22 2022

aaron moved T288702: Reduce complexity and time spent in WANObjectCache from Doing: Prio Interrupt to Backlog: Maintenance on the Performance-Team board.
Feb 22 2022, 8:00 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Patch-For-Review, Performance-Team, MediaWiki-libs-ObjectCache

Feb 3 2022

aaron created T300938: update.php fails with FlaggedRevs with sqlite.
Feb 3 2022, 11:39 PM · SQLite, MediaWiki-extensions-FlaggedRevs

Jan 26 2022

aaron closed T296610: MediumSpecificBagOStuff->guessSerialValueSize infinite loop when storing Title object (Special:Homepage throws "Maximum function nesting reached") as Resolved.
Jan 26 2022, 10:56 PM · MW-1.38-notes (1.38.0-wmf.20; 2022-01-31), Patch-For-Review, GrowthExperiments-NewcomerTasks, Performance-Team, Growth-Team, MediaWiki-libs-ObjectCache

Jan 25 2022

aaron added a comment to T193565: Foreign query for metawiki fails with "Table 'centralauth.page' doesn't exist" (DBConnRef mixup?).

Code that uses getConnection() is expected to use reuseConnection(). The only problem is having the deferred update scheduled in the middle, with a direct Database handle being given to it (rather than the deferred update calling openConnection). So it's really just two anti-patterns.

Jan 25 2022, 12:29 AM · MW-1.37-notes, MW-1.38-notes, MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), MW-1.37-release, MW-1.38-release, Performance-Team, Patch-For-Review, Sustainability (Incident Followup), Wikimedia-production-error, Wikimedia-Rdbms

Jan 24 2022

aaron added a comment to T299691: Break down monster class: Database.

The "Move replication-based stuff to its own class" bullet could use some clarity. I'd imagine a lot of that could easily stay in Database/subclasses (due to rdbms specific logic). I guess you could have separate per-rdbms class hierarchies and inject the objects. Was this referring to more of the LoadBalancer code?

Jan 24 2022, 11:55 PM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Performance-Team (Radar), Platform Engineering, Epic, Wikimedia-Rdbms, Data-Persistence
aaron closed T166911: Add support for Redis database selection in MediaWiki RedisBagOStuff as Declined.

The feature seems to be discouraged* by the author of redis, and doesn't gain much over key prefixes (MediaWiki code using BagOStuff already uses ones based on $wgDBname/$wgDBprefix and configurable IDs) or multiple redis instances.

Jan 24 2022, 11:52 PM · Performance-Team, Platform Engineering, MediaWiki-libs-ObjectCache

Jan 21 2022

aaron closed T297917: [WANObjectCache] ctype_digit(): Argument of type bool will be interpreted as string in the future as Resolved.
Jan 21 2022, 1:16 AM · MW-1.38-notes (1.38.0-wmf.19; 2022-01-24), Performance-Team, MediaWiki-libs-ObjectCache, PHP 8.1 support
aaron added a comment to T193565: Foreign query for metawiki fails with "Table 'centralauth.page' doesn't exist" (DBConnRef mixup?).

I recall seeing that it looked possible from DeferredUpdates too (similar kind of pattern to onTransaction*), though I didn't post a simplest repro case for that. Here it is:

Jan 21 2022, 1:11 AM · MW-1.37-notes, MW-1.38-notes, MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), MW-1.37-release, MW-1.38-release, Performance-Team, Patch-For-Review, Sustainability (Incident Followup), Wikimedia-production-error, Wikimedia-Rdbms

Jan 18 2022

aaron added a comment to T298485: MW scripts should reload the database config.

I don't think refreshing the DB config is going to work.

  • Various non-rdbms objects/code might keep $db/$lb references around.
  • Some callers might prefer to keep using an MVCC snapshot (from REPEATABLE-READ) when possible, rather than getting switched to another DB.
  • Config loading is still a mix of declarative/imperative logic with hooks and runtime files. This is still a problem even if we had an LBFactory::recycleAll() method to close connections and reload config if enough time passed since init or last "recycle".
Jan 18 2022, 9:39 PM · Patch-For-Review, Performance-Team (Radar), MediaWiki-Maintenance-system, User-Ladsgroup, DBA

Jan 11 2022

aaron added a comment to T212129: Move MainStash out of Redis to a simpler multi-dc aware solution.

I like "mainstash". If there is ever vertical sharding by extension, then "<group>stash" could be used as a DB name on separate clusters.

Jan 11 2022, 12:28 AM · MW-1.39-notes (1.39.0-wmf.9; 2022-04-25), MW-1.38-notes (1.38.0-wmf.20; 2022-01-31), Patch-For-Review, Performance-Team, Sustainability (MediaWiki-MultiDC), MediaWiki-General, serviceops-radar, User-mobrovac, User-jijiki, SRE

Jan 10 2022

aaron updated the task description for T288702: Reduce complexity and time spent in WANObjectCache.
Jan 10 2022, 6:38 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Patch-For-Review, Performance-Team, MediaWiki-libs-ObjectCache

Jan 7 2022

aaron triaged T278392: Storage solution for cross-datacenter tokens as Medium priority.
Jan 7 2022, 1:43 AM · Patch-For-Review, MediaWiki-extensions-OAuth, ConfirmEdit (CAPTCHA extension), MediaWiki-extensions-CentralAuth, Sustainability (MediaWiki-MultiDC), Performance-Team
aaron raised the priority of T293630: Investigate performance degradation at high concurrencies in php-fpm from Low to High.
Jan 7 2022, 1:43 AM · serviceops, Performance-Team
aaron triaged T296610: MediumSpecificBagOStuff->guessSerialValueSize infinite loop when storing Title object (Special:Homepage throws "Maximum function nesting reached") as Medium priority.
Jan 7 2022, 1:42 AM · MW-1.38-notes (1.38.0-wmf.20; 2022-01-31), Patch-For-Review, GrowthExperiments-NewcomerTasks, Performance-Team, Growth-Team, MediaWiki-libs-ObjectCache
aaron removed a parent task for T225968: Per component/extension profiling of hooks and pre-send DeferredUpdates with Grafana dashboards: T293630: Investigate performance degradation at high concurrencies in php-fpm .
Jan 7 2022, 1:42 AM · MW-1.37-notes (1.37.0-wmf.14; 2021-07-12), Arc-Lamp, Performance-Team
aaron removed a subtask for T293630: Investigate performance degradation at high concurrencies in php-fpm : T225968: Per component/extension profiling of hooks and pre-send DeferredUpdates with Grafana dashboards.
Jan 7 2022, 1:42 AM · serviceops, Performance-Team
aaron triaged T293630: Investigate performance degradation at high concurrencies in php-fpm as Low priority.
Jan 7 2022, 1:41 AM · serviceops, Performance-Team
aaron added a parent task for T225968: Per component/extension profiling of hooks and pre-send DeferredUpdates with Grafana dashboards: T293630: Investigate performance degradation at high concurrencies in php-fpm .
Jan 7 2022, 1:41 AM · MW-1.37-notes (1.37.0-wmf.14; 2021-07-12), Arc-Lamp, Performance-Team
aaron added a subtask for T293630: Investigate performance degradation at high concurrencies in php-fpm : T225968: Per component/extension profiling of hooks and pre-send DeferredUpdates with Grafana dashboards.
Jan 7 2022, 1:41 AM · serviceops, Performance-Team
aaron triaged T293859: Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query ... while transaction status is ERROR (after PHP timeout) as Medium priority.
Jan 7 2022, 1:22 AM · MW-1.39-notes (1.39.0-wmf.6; 2022-04-04), Commons, Performance-Team, Wikimedia-Rdbms, Wikimedia-production-error
aaron triaged T295439: Create a run book for save timing alerts as Low priority.
Jan 7 2022, 1:22 AM · Performance-Team
aaron merged T270273: Instrument major parts of save timing into T225968: Per component/extension profiling of hooks and pre-send DeferredUpdates with Grafana dashboards.
Jan 7 2022, 1:19 AM · MW-1.37-notes (1.37.0-wmf.14; 2021-07-12), Arc-Lamp, Performance-Team
aaron merged task T270273: Instrument major parts of save timing into T225968: Per component/extension profiling of hooks and pre-send DeferredUpdates with Grafana dashboards.
Jan 7 2022, 1:19 AM · Performance-Team
aaron triaged T283029: FlaggableWikiPage::preloadPreparedEdit() does not actually carry over the parser output, leading to double parses on save as Medium priority.
Jan 7 2022, 1:16 AM · MW-1.38-notes (1.38.0-wmf.23; 2022-02-21), Patch-For-Review, User-Ladsgroup, Platform Engineering, MediaWiki-Page-derived-data, MediaWiki-extensions-FlaggedRevs, Performance-Team
aaron triaged T294969: Deprecate and remove Database::lockTables and Database::unlockTables as Medium priority.
Jan 7 2022, 1:15 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, Technical-Debt (Deprecation process), Performance-Team, Wikimedia-Rdbms
aaron triaged T279977: Deprecate BagOStuff ATTR_EMULATION (use ATTR_DURABILITY) as Lowest priority.
Jan 7 2022, 1:11 AM · Performance-Team, Technical-Debt (Deprecation process), MediaWiki-libs-ObjectCache
aaron triaged T269161: Disallow direct "BEGIN"/"COMMIT"/"ROLLBACK" via Database::query() as Lowest priority.
Jan 7 2022, 1:11 AM · Performance-Team, Wikimedia-Rdbms, Platform Engineering
aaron triaged T265386: Rewrite LoadMonitor to better handle cache regeneration and improve separation of concern as Lowest priority.
Jan 7 2022, 1:10 AM · Wikimedia-Rdbms, Performance-Team

Jan 6 2022

aaron added a comment to T298682: Wikibase\Repo\Store\Sql\SqlIdGenerator::generateNewId deadlock on wb_id_counters when running selenium tests in parallel.

\Wikibase\Repo\Store\Sql\SqlIdGenerator definitely looks prone to deadlocks. It should probably work more like TableNameStore (named locks + auto-commit trx).

Jan 6 2022, 11:36 PM · MW-1.38-notes (1.38.0-wmf.19; 2022-01-24), Wikidata-Campsite (Team A Hearth 🏰🔥), Browser-Tests, MediaWiki-extensions-WikibaseRepository, wdwb-tech, ci-test-error, Wikidata
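
The "named locks + auto-commit trx" idea from the comment above can be sketched as follows. This is an illustration in Python under stated assumptions: the `id_counters` table, `named_lock`, and `generate_new_id` are hypothetical names, an in-process `threading.Lock` stands in for a database named lock such as GET_LOCK(), and SQLite in autocommit mode stands in for the real database.

```python
import sqlite3
import threading
from typing import Dict

_locks: Dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()


def named_lock(name: str) -> threading.Lock:
    """Return the lock registered under `name`, creating it on first use."""
    with _locks_guard:
        return _locks.setdefault(name, threading.Lock())


def generate_new_id(conn: sqlite3.Connection, id_type: str) -> int:
    """Bump and return the counter for `id_type`.

    Callers are serialized by a named lock instead of by row locks held
    inside a long transaction; each statement commits on its own.
    """
    with named_lock("id-counter:" + id_type):
        row = conn.execute(
            "SELECT id_value FROM id_counters WHERE id_type = ?", (id_type,)
        ).fetchone()
        if row is None:
            next_id = 1
            conn.execute(
                "INSERT INTO id_counters (id_type, id_value) VALUES (?, ?)",
                (id_type, next_id),
            )
        else:
            next_id = row[0] + 1
            conn.execute(
                "UPDATE id_counters SET id_value = ? WHERE id_type = ?",
                (next_id, id_type),
            )
        return next_id


# isolation_level=None puts the sqlite3 module in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE id_counters (id_type TEXT PRIMARY KEY, id_value INTEGER NOT NULL)"
)
print(generate_new_id(conn, "item"), generate_new_id(conn, "item"))
```

Because no row lock outlives a single statement, two callers cannot end up waiting on each other's transaction; they simply queue on the named lock.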

Dec 14 2021

aaron added a project to T297665: MapCacheLRUTest::testHasInvalidKey fails on PHP 8.0.8: MediaWiki-Core-Tests.
Dec 14 2021, 12:46 AM · PHP 8.0 support, MediaWiki-Core-Tests
aaron created T297665: MapCacheLRUTest::testHasInvalidKey fails on PHP 8.0.8.
Dec 14 2021, 12:46 AM · PHP 8.0 support, MediaWiki-Core-Tests

Dec 9 2021

aaron created T297424: Reduce LBFactory::rollbackPrimaryChanges() callers in core.
Dec 9 2021, 7:19 PM · MW-1.38-notes (1.38.0-wmf.23; 2022-02-21), Platform Engineering, Performance-Team, Wikimedia-Rdbms

Dec 7 2021

aaron added a project to T295706: Improve TransactionProfiler as replacement for tendril's slow queries: Performance-Team-publish.
Dec 7 2021, 7:48 PM · Performance-Team-publish, MW-1.38-notes (1.38.0-wmf.9; 2021-11-16), Patch-For-Review, Performance-Team (Radar), Developer Productivity, Wikimedia-Rdbms, User-Ladsgroup, DBA

Nov 17 2021

aaron added a comment to T253926: Wikimedia\Rdbms\Database::selectSQLText: aggregation used with a locking SELECT (NewsletterDb::addNewsletterIssue).

I chose nl_newsletters since the row should exist even if there have been no newsletter issues yet for the given newsletter ID. This would help avoid deadlocks for the "first issue case". SELECT FOR UPDATE for non-existing rows makes gap locks that are prohibitive-only (do not give insert permissions, causing deadlocks on the actual insertion).

Nov 17 2021, 10:32 PM · MW-1.38-notes (1.38.0-wmf.16; 2022-01-03), Platform Team Workboards (Clinic Duty Team), MediaWiki-extensions-Newsletter, Wikimedia-production-error

Nov 3 2021

aaron created T294969: Deprecate and remove Database::lockTables and Database::unlockTables.
Nov 3 2021, 7:18 PM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, Technical-Debt (Deprecation process), Performance-Team, Wikimedia-Rdbms

Nov 2 2021

aaron added a comment to T281451: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-callbacks').

Using mysql-level query timeouts would also be an option to limit this from occurring (though it could still happen with multiple slow queries).

Nov 2 2021, 11:01 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), Performance-Team, Wikimedia-Rdbms, Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error
aaron added a comment to T281451: Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback-callbacks').

One problem is that RevisionBasedEntityLookup is catching generic Exceptions and throwing a different exception, so any entrypoint logic (e.g. jobrunner, runSingleJob, MediaWiki class, MWExceptionHandler class) won't know that the original error was a timeout (via an interrupt, leaving a bunch of function calls unfinished).

Nov 2 2021, 10:45 PM · MW-1.38-notes (1.38.0-wmf.7; 2021-11-02), Performance-Team, Wikimedia-Rdbms, Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error
aaron closed T264604: Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe, a subtask of T244340: Reduce read pressure on mc* servers by adding a machine-local Memcached instance (on-host memcached), as Declined.
Nov 2 2021, 3:40 AM · User-jijiki, Sustainability (Incident Followup), Performance-Team, Patch-For-Review, SRE, serviceops
aaron closed T264604: Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe as Declined.
Nov 2 2021, 3:40 AM · MW-1.36-notes, MW-1.37-notes (1.37.0-wmf.1; 2021-04-13), Patch-For-Review, User-jijiki, SRE, serviceops, Performance-Team

Oct 27 2021

aaron added a comment to T280497: Benchmark performance of MediaWiki on k8s.

I have some scripts in my home dir on mwdebug1001.eqiad.wmnet (apcu_stats_test.php and apcu_rw_test.php).

Oct 27 2021, 1:35 AM · Patch-For-Review, Performance-Team (Radar), MW-on-K8s, serviceops, SRE

Oct 25 2021

aaron closed T255492: Consider making ILoadBalancer::getServerConnection private or internal as Resolved.
Oct 25 2021, 6:12 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Performance-Team, Platform Engineering Roadmap Decision Making, Developer Productivity, Wikimedia-Rdbms

Oct 18 2021

aaron placed T129093: SHOW SLAVE STATUS as a health check should have a low timeout up for grabs.
Oct 18 2021, 6:45 PM · EngProd-Virtual-Hackathon, Performance-Team, DBA, Wikimedia-Rdbms
aaron closed T286531: Make WANObjectCache handle "lowTTL"/checkKeys/touchedCallback with "onHostRoutingPrefix" properly, a subtask of T264604: Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe, as Declined.
Oct 18 2021, 6:44 PM · MW-1.36-notes, MW-1.37-notes (1.37.0-wmf.1; 2021-04-13), Patch-For-Review, User-jijiki, SRE, serviceops, Performance-Team
aaron closed T286531: Make WANObjectCache handle "lowTTL"/checkKeys/touchedCallback with "onHostRoutingPrefix" properly as Declined.

Going with feature removal instead. See https://gerrit.wikimedia.org/r/c/mediawiki/core/+/731793 .

Oct 18 2021, 6:44 PM · MediaWiki-libs-ObjectCache, Performance-Team
aaron added a comment to T259084: Fix broken DatabasePostgresTest cases that emit db error.

I can look at this as part of getting https://gerrit.wikimedia.org/r/c/mediawiki/core/+/574101 merged, since it involves the same methods and testing of all the DB types anyway.

Oct 18 2021, 6:31 PM · Performance-Team (Radar), MW-1.35-notes, MW-1.36-notes (1.36.0-wmf.3; 2020-08-04), PostgreSQL, Wikimedia-Rdbms, Platform Engineering
aaron moved T195792: Add support for setting individual query timeout in wikimedia/rdbms from Inbox to Radar on the Performance-Team board.
Oct 18 2021, 6:24 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), User-Ladsgroup, DBA, Performance-Team (Radar), User-Addshore, wdwb-tech, Wikidata, Platform Engineering (Icebox), Wikimedia-Rdbms
aaron merged T293536: MediaWiki should support setting a read query time limit into T195792: Add support for setting individual query timeout in wikimedia/rdbms.
Oct 18 2021, 6:23 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), User-Ladsgroup, DBA, Performance-Team (Radar), User-Addshore, wdwb-tech, Wikidata, Platform Engineering (Icebox), Wikimedia-Rdbms
aaron merged task T293536: MediaWiki should support setting a read query time limit into T195792: Add support for setting individual query timeout in wikimedia/rdbms.
Oct 18 2021, 6:22 PM · Performance-Team (Radar), Platform Engineering, Wikimedia-Rdbms, Sustainability (Incident Followup)

Oct 14 2021

aaron closed T252951: ResourceLoader DepStore lock acquired twice? as Resolved.

Seems fixed in https://gerrit.wikimedia.org/r/c/mediawiki/libs/WaitConditionLoop/+/713337/2/src/WaitConditionLoop.php#101

Oct 14 2021, 1:13 AM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, Performance-Team

Oct 7 2021

aaron added a comment to T253926: Wikimedia\Rdbms\Database::selectSQLText: aggregation used with a locking SELECT (NewsletterDb::addNewsletterIssue).

Some pitfalls are implied by https://dev.mysql.com/doc/refman/8.0/en/innodb-error-handling.html . Statement rollbacks and ROLLBACK TO SAVEPOINT queries do not release locks due to implementation details (no tracking of which savepoints caused what locks). InnoDB deadlocks, and InnoDB lock wait timeouts with innodb_rollback_on_timeout=ON, cause an automatic transaction rollback in order to release the locks and let other transactions progress. Rolling back the whole transaction implicitly deletes the savepoints. You might be running into that problem.

Oct 7 2021, 12:23 AM · MW-1.38-notes (1.38.0-wmf.16; 2022-01-03), Platform Team Workboards (Clinic Duty Team), MediaWiki-extensions-Newsletter, Wikimedia-production-error

Oct 5 2021

aaron added a comment to T291648: Stuck cache for pages of a changed DJVU file.

I suspect that the thumbnail is no longer in Swift, but still in CDN, since ?action=purge does not work on the file description page.

Oct 5 2021, 8:47 PM · MediaWiki-Core-HTTP-Cache, MediaWiki-DjVu
aaron moved T292300: Eliminate unnecessary duplicate parses from Inbox to Radar on the Performance-Team board.
Oct 5 2021, 7:13 PM · MW-1.38-notes (1.38.0-wmf.17; 2022-01-10), Patch-For-Review, Performance-Team (Radar), Platform Team Workboards (MW Expedition), MediaWiki-Parser
aaron moved T292302: CommonsMetadata extension causes every page on commons to be always parsed twice from Inbox to Radar on the Performance-Team board.
Oct 5 2021, 7:12 PM · MW-1.38-notes (1.38.0-wmf.9; 2021-11-16), Performance-Team (Radar), CommonsMetadata, Platform Team Workboards (MW Expedition), MediaWiki-Parser