Page MenuHomePhabricator

Suppressed username shown on File pages
Open, MediumPublic6 Estimated Story Points

Assigned To
None
Authored By
csteipp
Feb 20 2015, 11:23 PM
Referenced Files
F183775: 0001-SECURITY-Add-img_deleted-column-b.patch
Jun 26 2015, 10:23 PM
F173490: 0001-Make-image-table-private.patch
Jun 3 2015, 1:21 AM
F173492: 0001-Hide-suppressed-usernames-in-image-table.patch
Jun 3 2015, 1:21 AM
F163752: 0001-SECURITY-Add-img_deleted-column-b.patch
May 11 2015, 11:45 PM
F95549: 0001-Hide-suppressed-usernames-in-image-table.patch
Mar 16 2015, 8:02 PM

Description

When a user is blocked with username suppression, the username is suppressed for old revisions of a file, but the current version has the (suppressed) username shown.

To reproduce,

  • Upload an image as a user
  • Upload a second version of the image
  • Login as an admin and block the uploading, selecting "Hide username from edits and lists"
  • As a logged out user, view the File page. The username is still visible on the current revision, but not for old revisions.

Example of affected file with a single revision: Schachboxen1.jpg
Example of affected file with multiple revisions: Toolse_linnuse_varemed.jpg

Patches

  • - add image.img_deleted, populate it from ipblocks, update oldimage.oi_deleted/filearchive.fa_deleted based on ipblocks to account for past image -> oldimage/filearchive transitions
  • - support revdel flags on current versions of images
  • - invalidate image cache when needed
  • - trigger cache invalidation from CentralAuth global locks
  • - make image table dump private (patch made against ariel branch of operations/dumps
  • - hide username in Labs replicas (patch made against operations/software)

One-time script files for WMF cluster update:

T90300-add-img_deleted.sql:

ALTER TABLE image ADD img_deleted tinyint unsigned NOT NULL default 0;

T90300-populate-img_deleted.sql:

UPDATE image JOIN ipblocks ON img_user_text = ipb_address
SET img_deleted = img_deleted | 12
WHERE ipb_deleted;

UPDATE oldimage JOIN ipblocks ON oi_user_text = ipb_address
SET oi_deleted = oi_deleted | 12
WHERE ipb_deleted;

UPDATE filearchive JOIN ipblocks ON fa_user_text = ipb_address
SET fa_deleted = fa_deleted | 12
WHERE ipb_deleted;

WMF cluster update plan

  1. run foreachdb sql.php /home/tgr/T90300-add-img_deleted.sql
  2. apply all patches
  3. run foreachdb sql.php /home/tgr/T90300-populate-img_deleted.sql
  4. run script via foreachwiki eval.php to invalidate cache (if we care enough):
$dbr = wfGetDB( DB_SLAVE ); $res = $dbr->select( 'image', LocalFile::selectFields(), array( $dbr->bitAnd( 'img_deleted', 12 ) . ' = 12' ), 'T90300' ); foreach ( $res as $row ) { RepoGroup::singleton()->getLocalRepo()->newFileFromRow( $row )->invalidateCache(); }

Notices

  • removes public image table dumps (image table will still be available via Tool Labs)

Details

Related Changes in Gerrit:

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@aaron, any ideas on how to deal with this caching issue?

Why not just have RevisionDeleteUser iterate over the image rows to be changed and call invalidateCache()? That should be able to handle a fair number of edits (and it's not likely such users will have many files). AFAIK, the doLocalSuppression call can already run in jobs. If we just removed $wgCentralAuthWikisPerSuppressJob and always used the queue for simplicity, and pushed the jobs to the local wikis rather than the central queue, the cache purges would work. CentralAuthSuppressUserJob looks like it would work on local wikis.

I made https://gerrit.wikimedia.org/r/212579 to speed up such purges a bit.

Did that, don't really like the result:


@aaron, couldn't we expose the shared repo cache key to other wikis instead?

Fixed a stupid mistake in the Labs replication patch.

@Tgr, I applied

, which doesn't look like it fixed the issue for local suppression blocks. Is there a local patch similar to the centralauth patch, so that the cache is invalidated? Wasn't sure if I missed something or if you're still working on it.

How do you trigger a local suppression block? RevisionDeleteUser::invalidateCache() needs to be called, and right now that's the responsibility of the code that calls the other RevisionDeleteUser methods (and I only did it for CentralAuth). Maybe it should be called by those methods instead; I didn't want to do that because those work when called cross-wiki and invalidateCache does not and that would have been confusing.

Just do a normal block on a username, and there's an option to also
suppress the username, if you have the right (iirc).

@csteipp: updated the patch (it's in the task summary). Looks like undoing a suppress block via Special:Block is currently broken, the patch fixes that as well.

In T90300#1332550, @Tgr wrote:

Did that, don't really like the result:


@aaron, couldn't we expose the shared repo cache key to other wikis instead?

Yes, though I think we should be careful about cross-wiki code and the existing pattern seems a bit like a mistake. It's already there though and building on it is not intolerable though. In any case it should either fully work crosswiki or not pretend to.

There was a stray comma in the sql patchfile.

@Tgr, thanks for the new patches, I confirmed the issue is fixed now!

I'm a little worried about the move to only using jobs in centralauth-- @aaron, do those run in their own queue so we can be sure that they're going to get run soon after the user is hidden?

If so, I'm happy to go ahead and deploy these.

@Tgr, thanks for the new patches, I confirmed the issue is fixed now!

I'm a little worried about the move to only using jobs in centralauth-- @aaron, do those run in their own queue so we can be sure that they're going to get run soon after the user is hidden?

If so, I'm happy to go ahead and deploy these.

They don't, just the regular high-priority queue. It might makes sense to leave it that way unless it's too slow. If we have slow jobs in the high priority queue that back up others, then they should be moved the low priority queue rather than making new queue loops, generally speaking.

@jcrespo / @Springle,

This all looks good from the security side, I'd like to start getting the fixes rolled out. Tgr outlined the task description, but basically,

WMF cluster update plan

  1. run foreachdb sql.php /home/tgr/T90300-add-img_deleted.sql
  2. apply all patches
  3. run foreachdb sql.php /home/tgr/T90300-populate-img_deleted.sql
  4. run script via foreachwiki eval.php to invalidate cache (if we care enough):

Can we start working on #1?

Unless @Springle is against it, I am ok with the intended work, I would be against the method, specially for commons which has 36 million rows on image and that would disable file uploads for over 1 hour, plus create a 1 hour lag on the slaves. For other wikis with no images or a small number, it is ok (<1000 rows).

I would recommend using pt-online-schema change for the larger tables (this table has a PRIMARY KEY). We can take care of this, with the usual questions (where and when?).

For the updates, make sure they are done in <1 sec steps, perform several of them otherwise.

Re: query optimization, there is a big difference between the IN () and the JOIN version (apparently MySQL is really stupid about handling uncorrelated IN subqueries), so using joins and the (both faster and more correct) username-based joins:

This is fixed on 5.6/10. We recently dumped 5.5 on the slaves, and will do on the masters soon.

Unless @Springle is against it, I am ok with the intended work, I would be against the method, specially for commons which has 36 million rows on image and that would disable file uploads for over 1 hour, plus create a 1 hour lag on the slaves. For other wikis with no images or a small number, it is ok (<1000 rows).

I would recommend using pt-online-schema change for the larger tables (this table has a PRIMARY KEY). We can take care of this, with the usual questions (where and when?).

@jcrespo, I'd be happy for you or Sean to do this. It would need to be applied to all wikis, whenever you're able to schedule it.

For the updates, make sure they are done in <1 sec steps, perform several of them otherwise.

I'm not exactly clear on what you're suggesting here-- are you saying setup a limit and repeatedly run this where each run finishes under 1 second? Something like running,

UPDATE image JOIN ipblocks ON img_user_text = ipb_address
SET img_deleted = img_deleted | 12
WHERE ipb_deleted AND img_deleted = 0
LIMIT <something sane that finishes under 1 second>;

Will apply the change, report back when finished.

I'm not exactly clear on what you're suggesting here-- are you saying setup a limit and repeatedly run this where each run finishes under 1 second? Something like running,

UPDATE image JOIN ipblocks ON img_user_text = ipb_address
SET img_deleted = img_deleted | 12
WHERE ipb_deleted AND img_deleted = 0
LIMIT <something sane that finishes under 1 second>;

That would probably work. If you want to be 100% sure of changes being applied in less than 1 second, page using the primary key (LIMIT could take sometimes too much time if a full table scan or large range is done). Take a 1 second pause between updates or monitor the lag during its application. If that is too difficult, ask for help to the DBAs after code is applied.

@csteipp I've deployed it to testwiki. Sincerely, this seems like a very strightforward change (except for large tables), but if you want to do a final check or test my previous comment suggestions there (for example, 1000 rows updated at a time is a good starting point), I will wait a day before applying it to the rest of the wikis.

Deployment update:

The schema change was applied to all wikis in an online fashion (pt-table-checksum), with the exception of commonswiki. The reason for that is that due to the high SELECT rate of that table on commons, the change created high contention due to metadata locking, so I stopped and rolled back the change. Please note that the issue here is the schema change, not the resulting schema.

I am now applying the change in a rolling-update fashion, depooling and repooling one server at a time -hopefully that will avoid the load issues. The issue with this method is that the change takes between 2-4 hours per server and has to be mostly serialized, so it will probably take me over a day to complete it (but is is ongoing right now).

ALTER TABLE image ADD img_deleted tinyint unsigned NOT NULL default 0; (or more precisely, a complex list of operations with the same result but without affecting production) has been run on ALL WMF wikis.

I suppose that concludes the first step.

Be careful with Commons: there, image is a very busy 30-million row table, and even simple operations can generate lag.

ALTER TABLE image ADD img_deleted tinyint unsigned NOT NULL default 0; (or more precisely, a complex list of operations with the same result but without affecting production) has been run on ALL WMF wikis.

You missed labswiki, which I just did (as there's only 260 rows there).

In my search to find out why this column had appeared on some wikis but not labswiki, I came across T38497 and T22742 which also suggested adding this column years ago.

@Tgr Just to clarify, are you still working on this/Does anyone know the status?

I think this was just waiting for the patches to be deployed (but that was 3 years ago so they need rebase / re-review now), not sure where / why exactly that got stuck. Are you still interested in getting it out? I'll be mostly on vacation until end of next week, can look at it after that.

Aklapper removed Tgr as the assignee of this task.Jun 19 2020, 4:25 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips how to manage individual work in Phabricator (noisy notifications, lists of task, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the records, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)

@sbassett is the Security team interested in seeing this through? If not, can it be made public? (It's an infoleak, but a very minor one, and not triggered by user action, so public knowledge of the bug wouldn't make much difference.)
It was a nontrivial amount of work, and the patch clears up a significant amount of tech debt in the image table, so I'd like to see it through one way or another (now that I remember it exists... thanks for the reminder, @Aklapper).

@Tgr - moving this to our clinic meeting today. If the work can go through gerrit sooner than later, that sounds like the easier path to get it deployed, especially for what I'd personally consider a fairly low-risk vulnerability.

@Tgr - Security-Team is fine with this being public, given what we perceive to be a fairly minor information disclosure, especially if the work can proceed sooner than later within gerrit. I'll make this public now.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".Jul 6 2020, 3:24 PM
sbassett moved this task from Incoming to Watching on the Security-Team board.

Change 609798 had a related patch set uploaded (by Reedy; owner: CSteipp):
[mediawiki/core@master] Add img_deleted column

https://gerrit.wikimedia.org/r/609798

Seems the img_deleted column was only recently removed from WMF wikis... T250055: Remove image.img_deleted column from production

@Tgr - Security-Team is fine with this being public, given what we perceive to be a fairly minor information disclosure, especially if the work can proceed sooner than later within gerrit. I'll make this public now.

Thanks! Will try to re-test things in the next week or two.

Aklapper changed the edit policy from "Custom Policy" to "All Users".Jan 23 2023, 8:06 PM

Change #609798 abandoned by Reedy:

[mediawiki/core@master] Add img_deleted column

https://gerrit.wikimedia.org/r/609798

@Tgr -- it looks like there has not been any work on this in 4+ years. Are we safe to close this out?

Bugs don't usually fix themselves when no one is working on them.

@Ladsgroup is changing the table schema so the patches on the task aren't valid anymore. Not sure if those changes fix/affect the bug.

My refactor won't fix the problem out of the box but it would make it easier to fix. Clearly I won't have time to work on the fix.

Bugs don't usually fix themselves when no one is working on them.

Fair :) A lot can change over 4 years though, including refactors that might fix things accidentally!

Thanks for the updates. Great to hear that your changes will make the fix easier too, @Ladsgroup . @Tgr are you on point to update the patches in response to the schema changes? If so, I might remove the MWI tag.

I'll try to find the time, but this was (and I assume still is) volunteer work so no guarantees.

Thanks again for the info! I'm removing MWI to get this off our board for now, but feel free to tag us again later if you need assistance, code reviews, or whatever else.