
Error in Statistics for Doteli Wikipedia
Closed, Resolved, Public

Description

Since last month there has been a vast difference in the statistics for Doteli Wikipedia. The number of articles has changed by nearly 2000 (it was around 3000). Here is the statistics page - https://dty.wikipedia.org/wiki/बिशेष:Statistics

We tried to figure out the problem but could not find a solution. Here is the discussion we had -
https://dty.wikipedia.org/wiki/विकिपिडिया:चौपातो_(प्राविधिक)#.E0.A4.A1.E0.A5.8B.E0.A4.9F.E0.A5.87.E0.A4.B2.E0.A5.80_.E0.A4.B5.E0.A4.BF.E0.A4.95.E0.A4.BF.E0.A4.95.E0.A5.8B_.E0.A4.A4.E0.A4.A5.E0.A5.8D.E0.A4.AF.E0.A4.BE.E0.A4.99.E0.A5.8D.E0.A4.95.E0.A4.AE.E0.A4.BE_.E0.A4.A6.E0.A5.87.E0.A4.96.E0.A4.BF.E0.A4.8F.E0.A4.95.E0.A5.8B_.E0.A4.AB.E0.A4.B0.E0.A4.95

Thanks in advance.
Nirajan Pant
(User:Nirajan pant)

Event Timeline

Besides this there is an error on Wiki depth [गहिराई (48.251740567655) ] statistics also, and we were unable to get Sandbox on dty.wikipedia.org.

Thanks
(User: Janak Bhatta)

https://dty.wikipedia.org/wiki/बिशेष:Statistics currently lists 1074 content pages. You imply that the number was around 3000 a month ago?

Besides this there is an error on Wiki depth [गहिराई (48.251740567655) ] statistics also

I do not know what "depth" is and where to see that. :( Please follow https://mediawiki.org/wiki/How_to_report_a_bug and provide steps to reproduce. Thanks!

and we were unable to get Sandbox on dty.wikipedia.org.

Please create a separate task for separate topics, as this task is about statistics only.
(If you refer to the SandboxLink extension: it is not installed on dty.wikipedia.org as the extension is not listed on https://dty.wikipedia.org/wiki/बिशेष:Version . Please see meta:Requesting wiki configuration changes for the procedure to follow. As I said: A topic for a separate task, not this one.)

https://dty.wikipedia.org/wiki/बिशेष:Statistics currently lists 1074 content pages. You imply that the number was around 3000 a month ago?

Yes, it was. Right now I have no evidence to prove it, but you can check the Incubator statistics (https://incubator.wikimedia.org/wiki/Wp/dty/खास_पृष्ठ). After the site was created, the statistics matched those. But a month ago the statistics changed suddenly. I think you could help to solve this.

@Aklapper what is the status of this issue? It's been 1 month since the issue was posted. Can you help to solve it?

Peachey88 subscribed.

This just needs a script run to refresh the statistics.

Apologies for the lengthy post...

Newly imported wikis commonly have inflated article counts immediately after importing, presumably as a result of article-counting bugs that remain in the MediaWiki code. (It's not as bad as it used to be, when imported pages would add 1 to the article count for every individual edit in each imported page's history! But something is still not right.) The Punjabi Wikisource (pawikisource), for example, went from c. 100 articles down to 1 when it was recounted for the first time.

In any case, running either updateArticleCount.php (to update only the article count) or initSiteStats.php (to update all [?] site statistics) now will presumably give (approximately?) the same count as the current one, since "all" Wikimedia content wikis, including dtywiki, are recounted from scratch on the 21st of each month, and so dtywiki has been recounted twice since it was created. (The monthly recounting is done precisely by running updateArticleCount.php on each wiki.)

The first time dtywiki was recounted was May 21st, at which time it went from 3,127 articles down to 1,014 (the count immediately after importing was 2,808). When the wiki was recounted on Jun 21st, it changed from 1,119 to 1,450. (These counts are from statistics collected some number of hours after the importing and/or recounting was done.)

The first change is the kind that is to be expected on a first recount because of the (presumed) lingering counting bug I alluded to earlier. The second change is large enough to be worrying, however, as the count should not have gotten that far off in the normal course of wiki editing.
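To make the two recounts concrete, here is a small sketch that computes how far the running count had drifted from the recounted value each time. The figures are the ones quoted above; the variable and function names are made up for illustration only.

```python
# Article counts for dtywiki just before and just after each recount,
# as quoted in this comment. All names here are illustrative, not MediaWiki's.
recounts = [
    ("May 21", 3127, 1014),
    ("Jun 21", 1119, 1450),
]


def drift(before, after):
    """Return (absolute change, change as a fraction of the recounted value)."""
    change = after - before
    return change, change / after


for date, before, after in recounts:
    change, relative = drift(before, after)
    print(f"{date}: {before} -> {after} ({change:+d}, {relative:+.1%} of the recounted value)")
```

The first recount removed 2,113 articles from the count and the second added 331, which is why the second one looks odd for a single month of normal editing.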

So, sure, go ahead and recount it again, and we can see what happens... but note that it will be recounted anyway on July 21st, even if nothing is done as a result of this task. (I don't want to dissuade anyone from acting on this task, I'm just explaining the situation.)

As for why the count seems low, this may be due to a very widespread misunderstanding of how MediaWiki counts articles. Not every page in the main namespace (or "content namespace[s]") counts as an article: it has to contain at least one [[wikilink]] to another page on that same wiki. See, for example, mw:Manual:Article count. (Therefore, the "solution" to a low-but-correct article count is to link articles together with [[wikilinks]].)
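The counting rule can be sketched roughly as follows. This is only an approximation under stated assumptions: real MediaWiki relies on its parser and link tables, not a regular expression, and the function name, the sample pages, and the `is_redirect` flag (redirects are not counted, as far as I know) are all hypothetical. The regex is also too generous, since it would accept [[Category:...]] or [[File:...]] links as well.

```python
import re

# Simplified pattern for an internal [[wikilink]] or [[target|label]].
# Assumption for illustration: MediaWiki itself does not use a regex here.
WIKILINK = re.compile(r"\[\[[^\[\]|]+(?:\|[^\[\]]*)?\]\]")


def counts_as_article(namespace, wikitext, is_redirect=False, content_namespaces=(0,)):
    """Approximate the 'at least one wikilink' rule described above."""
    if namespace not in content_namespaces or is_redirect:
        return False
    return WIKILINK.search(wikitext) is not None


# Hypothetical sample pages: (namespace, wikitext, is_redirect)
pages = [
    (0, "Doteli is spoken in [[Nepal]].", False),  # counted
    (0, "A stub with no links at all.", False),    # not counted: no wikilink
    (10, "Template text with [[Nepal]].", False),  # not counted: template namespace
    (0, "#REDIRECT [[Nepal]]", True),              # not counted: redirect
]

article_count = sum(counts_as_article(ns, text, redirect) for ns, text, redirect in pages)
print(article_count)
```

Only the first sample page qualifies, so a wiki with many unlinked stubs will report far fewer articles than it has main-namespace pages.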

Finally, regarding the statistics back when the Doteli Wikipedia was in the Incubator, I can't comment on that since I never looked at the stats back then, and they are now gone (Wp/dty content deleted on the Incubator because importing was finished).

(Disclosure: I am not a developer nor a sysadmin, I just know some stuff about article counting from past research/experience.)

Thank you @Dcljr. Missing [[wikilinks]] might be the reason. There are many articles that do not have a [[wikilink]].

$ mwscript initSiteStats.php --wiki=dtywiki --update
Refresh Site Statistics

Counting total edits...23006
Counting number of articles...2006
Counting total pages...5626
Counting number of users...523
Counting number of images...0

Updating site statistics...done.

Done.
Reedy changed the task status from Open to Stalled. Jul 19 2017, 1:02 AM

Interesting… so the article count just changed from 1,686 about five hours ago to 2,006 after the recount. This is a weirdly large change (just like the last time it was recounted on June 21st), especially since there were no changes made on-wiki in that time.

So, presumably this 320-page "error" in the count is due to something that has happened on-wiki in the previous 4 weeks. In that time, 2 pages in the main namespace were deleted; 373 pages were created, including 60 in article space and 252 templates; and 3 pages were imported, all templates (but only 2 main-namespace pages transclude those templates).

I don't see what could be causing the count to be getting off so much, even if it's a MW bug. I mean, I still "don't trust" importing when it comes to keeping the article count correct, so I would have suspected that. But there just hasn't been enough importing activity to explain the difference we're seeing.

Anyone else have any ideas?

Could the count given by initSiteStats.php actually be different than that given by updateArticleCount.php? (Horrors!) I guess we'll find out in a few days when the latter script gets run on the wiki!

Interesting… so the article count just changed from 1,686 about five hours ago to 2,006 after the recount.

I don't know why that happened, but I've checked the monthly updates since then (which happen on the 21st of each month at 5:00 UTC) and couldn't find any similarly large changes:

2017-07: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17021708&oldid=17019990 (dtywiki: +3 articles)
2017-08: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17140059&oldid=17135099 (dtywiki: +9 articles)
2017-09: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17245579&oldid=17245165 (dtywiki: ±0 articles)
2017-10: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17348333&oldid=17348316 (dtywiki: +4 articles)
2017-11: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17450699&oldid=17449373 (dtywiki: +1 article)
2017-12: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17565589&oldid=17564466 (dtywiki: +2 articles)
2018-01: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17663115&oldid=17662085 (dtywiki: +2 articles)
2018-02: https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&diff=17760141&oldid=17758699 (dtywiki: +2 articles)
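As a quick sanity check on the list above, the per-month changes can be summarized like this. The deltas are copied from the list; the threshold of 100 articles is an arbitrary, hypothetical cut-off for "similarly large", chosen only for illustration.

```python
# Change in dtywiki's article count at each monthly recount, from the list above.
monthly_deltas = {
    "2017-07": 3,
    "2017-08": 9,
    "2017-09": 0,
    "2017-10": 4,
    "2017-11": 1,
    "2017-12": 2,
    "2018-01": 2,
    "2018-02": 2,
}

# Hypothetical cut-off: anything this large would resemble the earlier 320-article jump.
THRESHOLD = 100


def suspicious(deltas, threshold):
    """Return the months whose recount changed the count by at least `threshold`."""
    return {month: d for month, d in deltas.items() if abs(d) >= threshold}


largest = max(abs(d) for d in monthly_deltas.values())
total = sum(monthly_deltas.values())
print(f"largest single change: {largest}, total over 8 months: {total}")
print(f"months over threshold: {suspicious(monthly_deltas, THRESHOLD)}")
```

The largest single correction is 9 articles and the eight recounts add up to 23, nowhere near the earlier jump.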

Could the count given by initSiteStats.php actually be different than that given by updateArticleCount.php? (Horrors!)

No. Internally both call the same function to do the recount operation.

I can't find any evidence that this continues to be a problem. Can this task be closed?

From my perspective, yes, this can be closed.

EddieGP claimed this task.