Page MenuHomePhabricator

flaggedrevs.fr_user is unindexed
Closed, ResolvedPublic

Description

As noticed in T104686

Apparently, such an opperation runs the query UPDATE /* MergeUser::mergeDatabaseTables */ flaggedrevs SET fr_user = 'XXXXXX' WHERE fr_user = 'XXXXX', which goes over 64 million rows unindexed and not batched (even if no rows are updated). That would be a hard blocker for future runs.

fr_user is unindexed

Wikis where this table is enabled: https://github.com/wikimedia/operations-mediawiki-config/blob/master/dblists/flaggedrevs.dblist

s1:

  • enwiki

s2:

  • eowiki
  • fiwiki
  • idwiki
  • plwiki
  • ptwiki
  • trwiki

s3:

  • alswiki
  • bawiki (not enabled)
  • bewiki
  • bnwiki
  • bswiki
  • cawikinews
  • cewiki
  • ckbwiki
  • de_labswikimedia (not enabled)
  • dewikiquote
  • dewiktionary
  • elwikinews
  • en_labswikimedia (not enabled)
  • enwikibooks
  • enwikinews
  • eswikibooks (not enabled)
  • eswikinews
  • fawikinews
  • flaggedrevs_labswikimedia (not enabled)
  • frwikinews
  • hewikisource
  • hiwiki
  • iawiki
  • iswiktionary
  • kawiki
  • lawikisource
  • mediawikiwiki (not enabled)
  • mkwiki
  • plwikisource
  • plwiktionary
  • ptwikibooks
  • ptwikinews
  • ptwikisource
  • ruwikinews
  • ruwikiquote
  • ruwikisource
  • ruwiktionary
  • siwiki (not enabled)
  • sqwiki
  • srwikinews
  • tawikinews
  • test2wiki
  • testwiki (not enabled)
  • trwikiquote
  • ukwiktionary
  • vecwiki
  • zh_classicalwiki

s5:

  • dewiki

s6:

  • frwiki
  • ruwiki

s7:

  • arwiki
  • eswiki (not enabled)
  • fawiki
  • huwiki
  • metawiki (not enabled)
  • ukwiki

ALTER to run: ALTER TABLE /*$wgDBprefix*/flaggedrevs ADD INDEX (fr_user);

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

From what I can see it is not a mistake on ops side (which wouldn't be that rare)- it is not on the code either (if I am not wrong): https://phabricator.wikimedia.org/diffusion/EFLR/browse/master/backend/schema/mysql/FlaggedRevs.sql;b7ec872bbd53fed2c72f615f35af17d3272a63fb$72

While technically we are not blocked to add it on the infrastructure now, normally we wait until there is consensus to add it on mediawiki (or in this case, the extension itself). For me this would be a blocker for other user merges.

I'm guessing there's not going to be many merges in the future, but I may be wrong.

And of course, it's only an issue if one of the wikis that a merge needs to happen has FR enabled, but all wikis don't

I'm guessing there's not going to be many merges in the future, but I may be wrong.

Yes, I am not indicating the priority of this, just stating facts- that if we were to merge a user, I would like this fixed first -and meanwhile we can avoid doing it :-)

And of course, it's only an issue if one of the wikis that a merge needs to happen has FR enabled, but all wikis don't

How could we get a list of those?

List of wikis running flagged revs? We have a dblist :)

https://noc.wikimedia.org/conf/highlight.php?file=flaggedrevs.dblist

Ah, thanks - didn't know! :-)

Looks like it, but we've come across it again.

Adding it to the extension is trivial in MW land, getting it deployed is harder.

Do we just mark it lowest?

I don't know how trivial or hard would be. I know no coding. However if it causes issues we should get this eventually done. Would the process be similar to what happened at T148242? Thanks.

Nope, nothing like it.

Though, I just noticed it does at least have a primary key, so the DBA's may have an easier job of making a schema change

Yep. For reference "only" 53 wikis use FlaggedRevs, although some are big; so it's not like we need to run this for the whole production wikis as opposed to the task I linked above. However it seems that some tables, such as the dewiki one, has millions of entries and might take a while to have it done once the fix is in the extension. I guess that'll be done using a maintenance script.

Change 369460 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/FlaggedRevs@master] Add flaggedrevs.fr_user index

https://gerrit.wikimedia.org/r/369460

Yep. For reference "only" 53 wikis use FlaggedRevs, although some are big; so it's not like we need to run this for the whole production wikis as opposed to the task I linked above. However it seems that some tables, such as the dewiki one, has millions of entries and might take a while to have it done once the fix is in the extension. I guess that'll be done using a maintenance script.

What maintenance script?

There's nothing to do other than add the index. Nothing needs populating

Nothing needs populating

Ah, okay. I though we need populating. Thanks :)

Yep. For reference "only" 53 wikis use FlaggedRevs, although some are big; so it's not like we need to run this for the whole production wikis as opposed to the task I linked above. However it seems that some tables, such as the dewiki one, has millions of entries and might take a while to have it done once the fix is in the extension. I guess that'll be done using a maintenance script.

For consistency, the table should be altered wherever it exists despite of whether it is enabled or not :-)
Creating an index is an online/in-place operation so there should be no lag, but we have seen that that is not always true, so we'd need to pool/depool slaves for those wikis where the table is big.
The good thing is that it can be altered on the masters too (without replication) as it will not lock the table \o/

I agree with Jaime and I would prefer to have the patch merged before getting started with this task.

Most wikis won't have the tables. There's probably a few that had flagged revs installed, but it's now disabled. It'd still going to be less than 10% of the total number of wikis

More like just shy of 6%

chad@notsexy /a/ops/mediawiki-config/dblists (master)$ cat flaggedrevs.dblist | wc -l
      53
chad@notsexy /a/ops/mediawiki-config/dblists (master)$ cat all.dblist | wc -l
     909

btw, what does flaggedrevs_labswikimedia do? I don't think we have flaggedrevs at any labswiki?

btw, what does flaggedrevs_labswikimedia do? I don't think we have flaggedrevs at any labswiki?

It is probably and old and not used thing (the whole DB hasn't been touched since 2015).
What I pasted there was where the table exists (on disk)

I am still not sure whether we should alter _only_ those dbs where the flaggedrevs are enabled or all the dbs where the table exists (for consistency across them)
Opinions?

flaggedrevs_labswikimedia is a deleted wiki. It predates beta et al, and presumably was a testing wiki for FlaggedRevs

I'm indifferent as to whether it's worth fixing everywhere, but might be worth it just for overall ease

I would spent time dropping the table on those places where the extension is disabled instead of altering- it will be faster and will help cleanup/db object spam- although we should check the "disabling" is not temporary and there are no expectations of keeping the data and/or reenable it soon. I may need to work with some of you to check those cases.

A quick look at https://github.com/wikimedia/operations-mediawiki-config/commits/master/dblists/flaggedrevs.dblist suggests we've had no enabling or disabling of the extension since before October 2015...

A quick look at https://github.com/wikimedia/operations-mediawiki-config/commits/master/dblists/flaggedrevs.dblist suggests we've had no enabling or disabling of the extension since before October 2015...

Shall we get another drop tables task then? :-)

I guess if we compare that dblist and the list you added to the description of what files on disk exist... And they're not equal, we should, yeah :)

I have compared the list and I believe it can be dropped from:

bawiki
de_labswikimedia
en_labswikimedia
eswikibooks
flaggedrevs_labswikimedia
mediawikiwiki
siwiki
frwiki
eswiki
metawiki

However I would like another pair of eyes to compare both lists and confirm it.

T172207#3493117 confirms that it not used on flaggedrevs_labswikimedia
T172207#3631191 confirms that it is not used on eswiki, eswikibooks and metawiki

Marostegui triaged this task as Medium priority.Aug 3 2017, 8:15 AM

Humm... eswikibooks never had flaggedrevs installed afaik. There was a discussion about enabling them but I don't remember having requested them to be installed. Memory might fail though. The same for Meta, without the discussion though. In any case neither meta nor eswikibooks use flaggedrevs.

Thanks @MarcoAurelio, so from your side the list of wikis that can get the table delete: T172207#3496301 looks good?
@jcrespo @Reedy ?

@Marostegui I'd have to check other wikis as well but as for meta, eswiki and eswikibooks I can confirm that they're not using flaggedrevs at this moment.

I can confirm it's unused on mediawikiwiki. Stupid experiment: I pushed for installation and I pushed for its removal.

Hi,

This is now blocking an user rename - T173859.
If we can get this merged on mediawiki, we could proceed and start creating the index in core. As Jaime said, getting it merged on mediawiki isn't a great blocker, but we would prefer if there is consensus to add it so we can proceed and create it on the databases.

@Marostegui and @jcrespo The operation this task refers to was a user merge, which is not the same as an user rename. Can we confirm the problem will happen again with user rename? I guess an UPDATE command is issued as well, right? Please advice.

@Marostegui and @jcrespo The operation this task refers to was a user merge, which is not the same as an user rename. Can we confirm the problem will happen again with user rename? I guess an UPDATE command is issued as well, right? Please advice.

I don't really know how that works underneath, I guess someone would need to look at the code to see if it is issued. Maybe @Reedy knows?

flaggedrevs.fr_user is an int. So it's a user id.

For a rename, we don't touch this, just any field storing the username (usually _user_text).

For a merge, we do update ids, as a user goes from having two ids (primary and secondary accounts, or however you want to look at it) to having only one user id.

This doesn't block any renames, just any potential user merges (though, likelihood of these seems to be quite slim)

And specifically, for a user rename, no flagged revs tables are touched anyway

Thanks a lot @Reedy for your explanation! :-)

Change 369460 merged by jenkins-bot:
[mediawiki/extensions/FlaggedRevs@master] Add flaggedrevs.fr_user index

https://gerrit.wikimedia.org/r/369460

Ok, so patch merged.

It doesn't need adding to WMF wikis, but it'd help solve any further user merges in future *if* they need to happen.

Leaving the decision upto DBA for them to decide how to proceed from now :)

Thanks @Reedy!

If we want to do a bit of clean up, let's check if the table can be drop from the wikis listed on: T172207#3496301
Meanwhile we can add it to the wikis we are sure it is present and used.

eswiki, eswikibooks and metawiki tables can go for sure. We ain't using the extension @Marostegui @Reedy

eswiki, eswikibooks and metawiki tables can go for sure. We ain't using the extension @Marostegui @Reedy

Thanks!!

Let's wait for confirmation on the other wikis listed on: T172207#3496301

However, there is nothing blocking us now from deploying the index on the wikis where we have no doubts where it is enabled: https://github.com/wikimedia/operations-mediawiki-config/blob/master/dblists/flaggedrevs.dblist

Marostegui moved this task from Pending comment to In progress on the DBA board.
Marostegui moved this task from Backlog to In progress on the Schema-change-in-production board.

Mentioned in SAL (#wikimedia-operations) [2017-11-08T06:57:20Z] <marostegui> Deploy alter table on s6 - on codfw master (db2028) with replication enabled - T172207

Mentioned in SAL (#wikimedia-operations) [2017-11-08T07:13:33Z] <marostegui> Deploy alter table on s7 - on codfw master (db2029) with replication enabled - T172207

Mentioned in SAL (#wikimedia-operations) [2017-11-08T07:21:18Z] <marostegui> Deploy alter table on s3 - on codfw master (db2018) with replication enabled - T172207

Mentioned in SAL (#wikimedia-operations) [2017-11-08T07:46:38Z] <marostegui> Deploy alter table on s2 - on codfw master (db2017) with replication enabled - T172207

Mentioned in SAL (#wikimedia-operations) [2017-11-08T07:55:18Z] <marostegui> Deploy alter table on s1 - on codfw master (db2048) with replication enabled - T172207

I have altered: s1,s2,s3,s6 and s7 in codfw (with replication)
s5 is pending on codfw as there is another alter table running at the moment and I do not want to combine both. Once s5 is done, I will start with eqiad hosts (without replication for most of them, will only enable replication on the sanitarium masters, so the index gets replicated across labs straightaway).

s7 eqiad is also done.
I have also decided to alter the tables even though if it is not enabled there, because if it is ever enabled, better to alter them now that they are empty, so it is fast :-)

Until the other big alter is done in codfw s5, I am altering slaves in s5 eqiad.
Once codfw is finished with the other task, I will alter codfw master with replication.

Only pending:

s5 primary master (db1063)
s5 codfw (will be done with replication once the other alter table currently running finishes).

Mentioned in SAL (#wikimedia-operations) [2017-11-10T06:50:21Z] <marostegui> Deploy alter table on s5 eqiad master (db1063) - T172207

Marostegui updated the task description. (Show Details)

s5 primary master done