Reset file count statistics on a few wikis with negative/off-by-one errors
Closed, Resolved (Public)

Description

https://meta.wikimedia.org/wiki/Talk:List_of_Wikipedias/Table#Wrong_file_count asks to act on a few wikis where Special:Statistics shows a file count different from Special:ListFiles. initSiteStats.php --update will fix the count (we run updateArticleCount regularly, but that doesn't do files).

The wikis are: {ady,io,mrj,zh-classical,nrm,jbo,da,es,cs,nn}.wikipedia (negative file counts as found in http://wikistats.wmflabs.org/display.php?t=wp, plus manual checks for miscounts; cs requested at T131922; nn requested here).

Event Timeline

Do we know how some wikis manage to get negative counts?

Presumably in the usual way: some addition (new upload) fails to increment the counter by 1 but then deletion manages to decrease count by 1. Or vice versa for the cases of es/da.
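The drift described above can be illustrated with a toy counter. This is only a sketch under stated assumptions: the variables and the `upload`/`delete_file` helpers are hypothetical and are not MediaWiki code. It models one upload whose counter increment is lost, followed by a deletion whose decrement succeeds, and then a recount from the real data.

```shell
#!/bin/sh
# Toy model of a cached counter drifting away from the real row count.
# All names here are hypothetical; this is not MediaWiki code.
files=0      # actual number of stored files
counter=0    # cached statistic shown to readers

upload() {
    files=$((files + 1))
    # $1 = "ok" means the counter update succeeded; anything else means it was lost
    if [ "$1" = "ok" ]; then
        counter=$((counter + 1))
    fi
}

delete_file() {
    files=$((files - 1))
    counter=$((counter - 1))
}

upload lost      # file stored, counter increment lost
delete_file      # file removed, counter decremented
echo "actual=$files cached=$counter"    # prints "actual=0 cached=-1"

# A recount resets the cached value from the real data.
counter=$files
echo "after recount: actual=$files cached=$counter"
```

The opposite case (a lost decrement) leaves the cached value too high, which would match the es/da situation mentioned above.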

@jcrespo Could we run this without any issue or should we sync with you?

What does initSiteStats.php do?
$counter = new SiteStatsInit(false); // don't use master

$counter->edits();
$counter->articles();
$counter->pages();
$counter->users();
$counter->files();

$counter->refresh();

@jcrespo Can you make sure to run that on the vslow hosts? If the vslow hosts are what is causing issues, run it on the analytics host (dbstore1002).

We do not want to run long-running queries on "regular" hosts, not because of replication issues, but because of buffer/undo table issues (which is why we have special slaves for that).

Yes, we can, if we ask SiteStatsInit to get a database with wfGetDB( DB_SLAVE, 'vslow' ) instead of wfGetDB( DB_SLAVE ).

I've prepared https://gerrit.wikimedia.org/r/#/c/280872/ to achieve that.

Thank you, it feels OK to me, but I do not have enough underlying MediaWiki knowledge to know whether there would be problems with the logic (running on slightly delayed hosts, even with only milliseconds of delay, etc.), so I will leave that to others more knowledgeable. I know we are shifting towards offloading the master, but I know it is a WIP.

Nemo_bis triaged this task as Medium priority. Apr 2 2016, 8:51 AM

There was no need to disturb jcrespo or I would have already added him. Running initSiteStats.php on request is standard practice.

@Nemo_bis Please keep the blocking task, as we need to cherry-pick the change for the wmf branches before running the script.

Please also note I'm not comfortable running any MySQL-intensive script without a proper assessment of its performance impact by someone knowledgeable in these matters. Currently, we have Jaime and Roan for that. I'm going to add a list of whitelisted tasks on Wikitech, so we'll have somewhere to note that this script is now OK with the Gerrit changes 114994 and 280872.

So the plan is:

  • cherry-pick change 280872 to the relevant wmf branch
  • once merged and deployed, run the script for each of these 8 wikis, sequentially
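The second step could be scripted roughly as follows. This is a sketch, not the procedure actually used: the `DRY_RUN` switch is a hypothetical safety added for illustration (by default the commands are only printed), and the database names are assumed from the wikis listed in the task description, without the later cs and nn requests.

```shell
#!/bin/sh
# Sketch: run initSiteStats.php for each wiki, one at a time.
# DRY_RUN is a hypothetical switch; with its default value of 1 the
# commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

# Database names assumed from the task description (cs and nn were requested later).
WIKIS="adywiki iowiki mrjwiki zh_classicalwiki nrmwiki jbowiki dawiki eswiki"

count=0
for wiki in $WIKIS; do
    cmd="mwscript initSiteStats.php --wiki $wiki --update"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        # Stop at the first failure so a problem on one wiki is noticed
        # before the next, possibly much larger, wiki is started.
        $cmd || { echo "failed on $wiki" >&2; exit 1; }
    fi
    count=$((count + 1))
done
echo "processed $count wikis"
```

Running the wikis in a plain loop rather than in parallel keeps only one long-running query on the database host at a time.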

I'm running them and will post the final output when it's done, but we no longer have negative numbers for files. It's instantaneous for small wikis and took a few minutes for da., but I suspect it will run for about an hour for es.wikipedia (10x the number of edits on da.).

Terbium
$ mwscript initSiteStats.php --wiki adywiki --update
Refresh Site Statistics

Counting total edits...5088
Counting number of articles...339
Counting total pages...857
Counting number of users...643
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki iowiki --update
Refresh Site Statistics

Counting total edits...893235
Counting number of articles...26577
Counting total pages...40057
Counting number of users...19036
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki mrjwiki --update
Refresh Site Statistics

Counting total edits...91476
Counting number of articles...10094
Counting total pages...14332
Counting number of users...4878
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki nrmwiki --update
Refresh Site Statistics

Counting total edits...205541
Counting number of articles...3598
Counting total pages...7972
Counting number of users...7396
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki jbowiki --update
Refresh Site Statistics

Counting total edits...107205
Counting number of articles...1188
Counting total pages...5369
Counting number of users...8377
Counting number of images...0

Updating site statistics...done.

Done.
Script started on Thu 14 Apr 2016 11:55:58 AM UTC
$ mwscript initSiteStats.php --wiki nnwiki --update
Refresh Site Statistics

Counting total edits...2864074
Counting number of articles...126025
Counting total pages...290546
Counting number of users...71113
Counting number of images...17

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki zh_classicalwiki --update
Refresh Site Statistics

Counting total edits...252281
Counting number of articles...4274
Counting total pages...67175
Counting number of users...58852
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki dawiki --update
Refresh Site Statistics

Counting total edits...8497144
Counting number of articles...215722
Counting total pages...725461
Counting number of users...265276
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki eswiki --update
Refresh Site Statistics

Counting total edits...90091612
Counting number of articles...1251356
Counting total pages...5445095
Counting number of users...4178848
Counting number of images...0

Updating site statistics...done.

Done.
$ mwscript initSiteStats.php --wiki cswiki --update
Refresh Site Statistics

Counting total edits...13522375
Counting number of articles...351144
Counting total pages...928177
Counting number of users...332110
Counting number of images...1

Updating site statistics...done.

Done.

Mentioned in SAL [2016-04-14T12:41:04Z] <Dereckson> Ran initSiteStats.php for 10 wikis to fix negative/off-by-one statistics errors (T131306)