Drop stable check for images used in pages
Closed, ResolvedPublic

Description

Basically implement parts of https://gerrit.wikimedia.org/r/703040

This means the flaggedimages table will probably be dropped.

Event Timeline

Change 714181 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/FlaggedRevs@master] Drop support for files

https://gerrit.wikimedia.org/r/714181

Change 714181 merged by jenkins-bot:

[mediawiki/extensions/FlaggedRevs@master] Drop support for checking for stable versions of files

https://gerrit.wikimedia.org/r/714181

This is very good news as this table is reaching considerable sizes on some wikis, like:
dewiki: 67G
ruwiki: 55G

This should now be live on testwikis.

Unfortunately, the only test wiki that has FlaggedRevs seems to be test2wiki, and it's on group1 :/

It's on group1 now and I've tested it. So far it looks good, and no user has filed a ticket against FlaggedRevs.

It seems it's stable in production \o/

Change 717445 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/FlaggedRevs@master] Drop flaggedimages table

https://gerrit.wikimedia.org/r/717445

Change 717445 merged by jenkins-bot:

[mediawiki/extensions/FlaggedRevs@master] Drop flaggedimages table

https://gerrit.wikimedia.org/r/717445

I just noticed that both the FR_INCLUDES_FREEZE aspect for files (which required the huge flaggedimages table) and the FR_INCLUDES_STABLE aspect for files (which required the fr_img_* fields) were removed in 67d780fd38d2. The FR_INCLUDES_STABLE aspect for templates remains. This means that files can no longer be reviewed to prevent vandalism from being widely shown. Were the fr_img_* columns also taking up too much space? Since file uploads create revisions with the same timestamp, the logic to use stable file versions could be implemented without even having those columns (e.g. by reusing fr_rev_timestamp instead). Though finding File objects based on timestamp inequalities (to skip text-only changes) would be annoying...
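To illustrate the timestamp-inequality lookup mentioned above, here is a minimal sketch in Python rather than the extension's PHP. It assumes upload timestamps are 14-digit MediaWiki-style strings (which sort lexicographically), and the function name `stable_file_version` is hypothetical, not part of FlaggedRevs or core. It is not how the extension ever worked, only one way the idea might look.

```python
from bisect import bisect_right


def stable_file_version(upload_timestamps, stable_rev_timestamp):
    """Pick the newest upload at or before the stable revision's timestamp.

    Hypothetical helper: upload_timestamps is a list of 14-digit
    YYYYMMDDHHMMSS strings, one per file upload. Text-only revisions of
    the File: page have no matching upload, so they are skipped
    implicitly by the inequality. Returns None if no upload qualifies.
    """
    ordered = sorted(upload_timestamps)
    # Index of the first upload strictly newer than the stable revision.
    idx = bisect_right(ordered, stable_rev_timestamp)
    return ordered[idx - 1] if idx else None


uploads = ["20210101000000", "20210601120000", "20210901083000"]
# Stable revision was a text-only edit made between the 2nd and 3rd upload.
print(stable_file_version(uploads, "20210701000000"))
```

In a real implementation the comparison would run against the image and oldimage tables rather than an in-memory list, which is presumably where the annoyance comes in.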

I suppose one key difference between using stable templates vs files is that a lot of wikis mostly (or, for some, only) use Commons files, which were never reviewable... so FR_INCLUDES_STABLE wouldn't do much on such wikis anyway. Was that part of the reasoning? I guess the other issue was ?filetimestamp not being part of core File: handling, and MultimediaViewer probably showing the current version when thumbnails are clicked.

Yup, it was extremely low-value in practice: for local files it rarely worked as people expected across different tools, and for remote files it didn't work at all.

Aye. I've mostly been digging through things to understand the situation for Attribution signals. It does simplify the case of Attribution for File: pages themselves a bit, since the file shown is always the latest version (though the text is not). I was about to bring up a complication, but I double-checked and realized it isn't actually there anymore.

And they were extremely taxing on the infra. I'm talking "half of s5 and s6 was due to these tables" level of taxing. My estimate is that it has cost us around $100,000 every year in hardware budget alone, and it will continue to do so for the foreseeable future, since merging sections is not really doable (T408834). At least s7 is not growing that fast anymore. Not to mention that the volume of writes overloaded the masters too.