Unable to save edits or delete pages on Commons – database lag
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue:

  • Try to save an edit on Commons
  • Try to delete a page on Commons

  • Check https://commons.wikimedia.org/wiki/Special:RecentChanges or Special:Contributions

What happens?:
Edits and deletions do not save. Log pages display the message:

"Due to high database server lag, changes newer than 527 seconds may not be shown in this list."

RecentChanges shows no edits for the past several minutes.

What should have happened instead?:
Edits and deletions should save immediately and appear in logs/RecentChanges without extended lag.

Other information:
Tested while logged in. Issue appears to affect all edits sitewide.

Event Timeline

This looks like it's impacting en.wiki and en.wikt also (deletion only afaik). VPT link.

NickK triaged this task as Unbreak Now! priority. (Aug 24 2025, 11:12 PM)
NickK subscribed.

A large portion of users can't edit, hence it's Unbreak Now.

Example error

Request served via cp3067 cp3067, Varnish XID 1036383093
Error: 503, Backend fetch failed at Sun, 24 Aug 2025 23:02:20 GMT

Seems to be back as of 23:12 UTC

Ladsgroup closed this task as Resolved. (Edited, Aug 24 2025, 11:18 PM)
Ladsgroup claimed this task.
Ladsgroup added a project: DBA.
Ladsgroup subscribed.

It has recovered.

The problem was that so many UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts */ queries (around 3000 of them at the same time) were running that the s4 master couldn't handle it and just died. I couldn't restart the service or reboot the host, as that could have caused data corruption and made things way worse.

This means T365303: Move update of category members count to CategoryMembershipChangeJob is UBN now.
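
To illustrate why moving the count update out of the edit request can help, here is a minimal Python sketch of one possible approach: collecting the per-edit deltas and applying a single net UPDATE per category. This is only an assumption about what a deferred job could do, not the implementation in T365303; the function names `coalesce_deltas` and `build_updates` are hypothetical.

```python
from collections import defaultdict

def coalesce_deltas(events):
    """Merge (cat_id, pages_delta, files_delta) events into one net delta per category."""
    totals = defaultdict(lambda: [0, 0])
    for cat_id, pages_delta, files_delta in events:
        totals[cat_id][0] += pages_delta
        totals[cat_id][1] += files_delta
    # Categories whose changes cancel out need no write at all.
    return {cat_id: tuple(delta) for cat_id, delta in totals.items() if delta != [0, 0]}

def build_updates(net_deltas):
    """Render one parameterised UPDATE per category that has a non-zero net delta."""
    sql = ("UPDATE category SET cat_pages = cat_pages + %s, "
           "cat_files = cat_files + %s WHERE cat_id = %s")
    return [(sql, (pages, files, cat_id))
            for cat_id, (pages, files) in sorted(net_deltas.items())]

# 3000 increments on one hot category plus two decrements on another,
# mirroring the shape of the queries seen in this incident.
events = [(360156659, 1, 1)] * 3000 + [(317237, -1, -1), (317237, -1, -1)]
updates = build_updates(coalesce_deltas(events))
print(len(updates))  # 2 statements instead of 3002
```

With this shape, thousands of concurrent writers contending for the same `category` row become one write per category per batch.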

To be sure that updating category membership is the culprit, I went through all slow write queries recorded by the master around the time of the outage. The top function was MediaWiki\Page\WikiPage::updateCategoryCounts, accounting for more than half of all slow write queries. Some were quite slow (4 seconds or more).

Specifically these edits seemed to be the main reason: https://commons.wikimedia.org/w/index.php?title=Special:Contributions/Yac%C3%A0wot%C3%A7%C3%A3&target=Yac%C3%A0wot%C3%A7%C3%A3&offset=20250824235443 updating members of this category: https://commons.wikimedia.org/wiki/Category:Photographs_by_Ricardo_Stuckert_taken_in_2023

root@db1244:/srv/sqldata# grep -i "updateCategoryCounts" binlog_slow
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 359117679
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id IN (282511428,6147469,362641266,154949224,359845911,346612774,359845558,1424684,346612848,361856346)
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
INSERT /* MediaWiki\Page\WikiPage::updateCategoryCounts  */ INTO `category` (cat_title,cat_pages,cat_subcats,cat_files) VALUES ('Russian_heritage_ID_7733072000',1,0,1) ON DUPLICATE KEY UPDATE cat_pages = cat_pages + 1,cat_files = cat_files + 1
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 344672727
INSERT /* MediaWiki\Page\WikiPage::updateCategoryCounts  */ INTO `category` (cat_title,cat_pages,cat_subcats,cat_files) VALUES ('Russian_heritage_ID_1002334000',1,0,1) ON DUPLICATE KEY UPDATE cat_pages = cat_pages + 1,cat_files = cat_files + 1
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 358815304
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages - 1,cat_files = cat_files - 1 WHERE cat_id = 317237
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages - 1,cat_files = cat_files - 1 WHERE cat_id = 317237
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages - 1,cat_files = cat_files - 1 WHERE cat_id IN (350019437,349832244,349825214)
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_subcats = cat_subcats + 1 WHERE cat_id IN (360098587,106600)
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 362465834
UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659
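
A quick way to see which category rows are contended in output like the above is to tally the `cat_id` values. The following is a small Python sketch, assuming the log lines have the same shape as the grep output shown here; the helper name `tally_category_ids` is hypothetical, and the sample uses three lines copied from the output above.

```python
import re
from collections import Counter

# Matches a single-row target such as "WHERE cat_id = 360156659".
SINGLE_ID = re.compile(r"WHERE cat_id = (\d+)")
# Matches a multi-row target such as "WHERE cat_id IN (1,2,3)".
ID_LIST = re.compile(r"WHERE cat_id IN \(([\d,\s]+)\)")

def tally_category_ids(lines):
    """Count how often each cat_id is targeted by updateCategoryCounts UPDATEs."""
    counts = Counter()
    for line in lines:
        if "updateCategoryCounts" not in line:
            continue
        single = SINGLE_ID.search(line)
        if single:
            counts[int(single.group(1))] += 1
            continue
        many = ID_LIST.search(line)
        if many:
            for raw in many.group(1).split(","):
                counts[int(raw)] += 1
    return counts

sample = [
    r"UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659",
    r"UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages + 1,cat_files = cat_files + 1 WHERE cat_id = 360156659",
    r"UPDATE /* MediaWiki\Page\WikiPage::updateCategoryCounts  */  `category` SET cat_pages = cat_pages - 1,cat_files = cat_files - 1 WHERE cat_id IN (350019437,349832244,349825214)",
]
counts = tally_category_ids(sample)
print(counts.most_common(1))  # [(360156659, 2)]
```

On a real file, the lines could be read with `open("binlog_slow")` and passed to the same function.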

@Ladsgroup: Just FYI, from the Cat-a-lot code side, the user was using a pre-August 18, 2024 version of Cat-a-lot, which didn't have the throttling code yet. I will ask the user to switch to a later version.

Thanks. I'd appreciate it.

Here is another graph of the issue:
https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&from=2025-08-24T11:50:40.862Z&to=2025-08-25T22:31:09.215Z&timezone=utc&var-job=$__all&var-server=db1244&var-port=9104&refresh=1m&viewPanel=panel-24

Now that T365303 is deployed, I'll monitor to see whether things improve.

This was almost certainly intentional avoidance of the throttling given the recent creation and importing. Given the previous sysadmin intervention for an incident caused by the gadget (T370304) and this incident, a sysadmin should consider deleting the user's fork that does not have throttling.

Well, I had to be woken up at 1am on a Sunday to bring the site back online, since for half an hour no one could edit Commons. Maybe someone should mention it to them?

The user replied "calm down" instead of making the requested change. Not a great sign. Agree that maybe a sysadmin should just make the edit for them.

Jonatan Svensson Glad is a Commons admin, so this side is currently being handled.

Only an interface sysop can edit the user's own .js pages (not a mere regular sysop like myself), but unless they act themselves, I might need to IAR (regarding deletion policy), and this will happen to their script:

Deleted because the script generated excessive automated requests without respecting Wikimedia’s API etiquette, violating [[foundation:Policy:Wikimedia Foundation API Usage Guidelines]]. Please see [[User talk:Yacàwotçã#Please update your version of Cat-a-lot to newer]]

This comment was removed by Josve05a.

However, I remain concerned that a determined attacker or a widely used non-compliant script could create the same load again. This risk highlights the need for server-side protections, not just reliance on community expectations or user discretion. ...

T365303 fixed the problem of updateCategoryCounts() creating spikes, i.e. it moved the updating to a job.

I am not as technically experienced as most participants here, so apologies if any of the above is misguided. From my limited understanding, however, the core point remains: without server-side rate limiting or throttling, the system is vulnerable, regardless of the intentions of individual editors.

My three pennies here (as a fiwiki admin and Commons user; I am not in WMF tech): the tricky part is how to implement this without breaking the service's usability. For example, Wikimedia Commons has 125M files and 1M new files per month. It's clearly a mass-editing community where users rely on automatic or semi-automatic editing tools to manage this volume. To enable large-scale curation, there are different throttling limits for different user groups. Trusted user groups have higher throttling limits than anonymous users or newcomers.

While it would have been possible in the current case to throttle edits further to prevent users from crashing the system via editing, the cost would have been severe disruption to how people use the site, which would be much harder to recover from than an outage.

Once it was understood what was causing the problem and how to mitigate it in emergencies (such as targeted attacks), there was no immediate need to tighten the throttling rules so much that they would rule out potential outages; waiting for T365303 (or similar) to be deployed was the better trade-off.

(I meant to edit my comment but deleted it… ugh)

@Josve05a: I could re-post your comment from my bugmail copy, if you want me to?

I believe we should revert T194864: Raise the rate limit for autopatrollers on Commons so autopatroller, patroller and image reviewer will no longer have higher rate limits than normal user.

You should be more specific about what you mean by "normal user". The Wikimedia default for a normal user on all wikis is 90 edits per 180 seconds; on Wikimedia Commons it is 900 edits per 180 seconds, and for trusted users it is 1500 edits per 180 seconds (it was lowered from 10500 to 1500 after August 18, 2024).
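
To make these limits easier to compare, the Python sketch below converts them to sustained edits per second and shows a simple sliding-window limiter that enforces an "N edits per W seconds" rule. It is only an illustration under stated assumptions: the group labels in `LIMITS` and the `SlidingWindowLimiter` class are hypothetical names, and this is not MediaWiki's actual rate-limiting code.

```python
from collections import deque

# (max edits, window in seconds), using the figures quoted in this comment.
LIMITS = {
    "wikimedia_default": (90, 180),
    "commons_user": (900, 180),
    "commons_trusted": (1500, 180),
}

class SlidingWindowLimiter:
    """Allow at most max_events actions within any window_seconds interval."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now):
        """Return True and record the action if it fits in the current window."""
        # Forget actions that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_events:
            self._timestamps.append(now)
            return True
        return False

for name, (edits, window) in LIMITS.items():
    # 0.5, 5.0 and 8.33 edits per second respectively.
    print(name, round(edits / window, 2), "edits/s sustained")
```

Passing the clock value in as `now` keeps the limiter easy to test without real waiting.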

Cat-a-lot's editing speed was also reduced after August 18, 2024 to ~60 edits per minute and then raised to 100-150 edits per minute, which is still too low for general daily usage. It needs to be increased once the backend can support it (see T375355). I think a good target would be for Cat-a-lot to reach ~10 edits per second for interactive use cases, as long as Cat-a-lot checks that server load is not high and slows down when needed.
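
The idea of speeding up while the server is healthy and slowing down when it is not can be sketched as an additive-increase, multiplicative-decrease pacer. The Python below is a hypothetical illustration, not Cat-a-lot's actual code (which is JavaScript): the class name `AdaptivePacer`, its parameters, and the `server_busy` signal are all assumptions, and how a client would detect load is left open.

```python
class AdaptivePacer:
    """Raise the edit rate slowly while the server is healthy, halve it when busy."""

    def __init__(self, min_rate=1.0, max_rate=10.0, step=0.5, backoff=0.5):
        self.min_rate = min_rate    # edits per second floor
        self.max_rate = max_rate    # edits per second ceiling (the ~10/s target)
        self.step = step            # additive increase per healthy observation
        self.backoff = backoff      # multiplicative decrease when the server is busy
        self.rate = min_rate

    def update(self, server_busy):
        """Adjust the rate after one observation and return the new rate."""
        if server_busy:
            self.rate = max(self.min_rate, self.rate * self.backoff)
        else:
            self.rate = min(self.max_rate, self.rate + self.step)
        return self.rate

    def delay(self):
        """Seconds to wait before sending the next edit at the current rate."""
        return 1.0 / self.rate

pacer = AdaptivePacer()
for _ in range(20):
    pacer.update(server_busy=False)   # climbs to the 10 edits/s ceiling
pacer.update(server_busy=True)        # one busy signal halves the rate
print(pacer.rate, pacer.delay())      # 5.0 0.2
```

The asymmetry is deliberate in this sketch: recovery is gradual, while the reaction to load is immediate.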

Larger category changes could IMHO be treated as batch jobs with lower limits. Editing speed needs to increase there too, but this concerns fewer users, so it can be coordinated with the users who are doing mass changes or curating incoming files.

I suggest something even more radical: Move CAL (and HotCat) to core. These two gadgets are among the most widely used and installed gadgets in the movement. Then, besides getting better, official support from WMF, it could simply hand the category changes off to the job queue and let the existing concurrency handling deal with them. Similarly, we had issues with the Nuke extension deleting all pages at the same time; I just turned that into jobs. From the users' perspective, if it's done in a minute or two it's fine. I think a big annoyance for them is that they have to keep the window open until it's over.

I suggest something even more radical: Move CAL (and HotCat) to core

I am not against this per se, but I would prefer focusing on increasing overall performance, as it would help not just with categories but also with Wikidata and SDC. Second on the wishlist would be an API for changing categories (both for JavaScript in mw and for the Action API). It would be even better if the API supported changing multiple categories at once, reverting the changes, and optionally notifying when the task is done.

From the users' perspective, if it's done in a minute or two it's fine. I think a big annoyance for them is that they have to keep the window open until it's over.

I think that this depends on what people are doing. If they are doing interactive things (i.e., select some, do something, select next, do something... repeat), then they need direct feedback from the UI (for example, whether changes were unsuccessful, which files they have already processed, etc.). The biggest annoyance, as far as I know, is that because changing categories with Cat-a-lot is slow, it feels like it wastes a lot of the users' time.

Then there is the other case, where people select a lot of files and change the categories for all of them. For example, moving 3000 pages from one category to another with quickcategories, which is made for moving categories in the background while following maxlag, would take hours. Here the problem is that it takes so much time that it becomes a blocker when a user is doing larger changes.
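
For readers unfamiliar with what "following maxlag" means for a client: the request carries a `maxlag` parameter, and when replication lag exceeds it the server answers with a `maxlag` error instead of performing the edit, so the client waits and retries. The Python sketch below shows only that retry loop. The response shape (a dict with `error.code` and an optional `retry_after` key standing in for the Retry-After header) and the injected `send` and `sleep` callables are simplifying assumptions, not a real client library.

```python
def call_with_maxlag_retry(send, sleep, max_attempts=5, default_wait=5.0):
    """Call send() until it returns something other than a maxlag error.

    send  -- callable performing one request and returning a response dict
    sleep -- callable taking a number of seconds (e.g. time.sleep)
    """
    for attempt in range(max_attempts):
        response = send()
        error = response.get("error")
        if not error or error.get("code") != "maxlag":
            return response  # success, or an unrelated error for the caller to handle
        if attempt < max_attempts - 1:
            # Wait as long as the server suggested, or a default if it gave no hint.
            sleep(float(response.get("retry_after", default_wait)))
    raise RuntimeError("server still lagged after %d attempts" % max_attempts)

# Simulated server: lagged twice, then accepts the edit.
responses = iter([
    {"error": {"code": "maxlag"}, "retry_after": 3},
    {"error": {"code": "maxlag"}},
    {"edit": {"result": "Success"}},
])
waits = []
result = call_with_maxlag_retry(lambda: next(responses), waits.append)
print(result, waits)  # {'edit': {'result': 'Success'}} [3.0, 5.0]
```

Every wait stalls the whole batch, which is one reason a well-behaved background tool can take hours on a large move.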