
Report New articles for 2014 Oct-Dec
Closed, ResolvedPublic

Description

Generate number for quarterly report: https://meta.wikimedia.org/w/index.php?title=File%3AWikimedia_Foundation_Quarterly_Report%2C_2014-15_Q2.pdf&page=4

  • no historical data from VS
  • Meta counts
  • Wikistats / reportcard
  • Recommendation: use reportcard figures if available, if not default to Meta counts (EZ)

Event Timeline

kevinator raised the priority of this task from to Needs Triage.
kevinator updated the task description.
kevinator added a project: Analytics.

To clarify for onlookers, this refers to http://reportcard.wmflabs.org/graphs/articles or the corresponding Wikistats page. I would be happy to work from the data that's already available there, except that (as with T88403) the number for December 2014 is still missing.

So I have instead calculated the number from the "List of Wikipedias" page on Meta (which is also the source for the public stats maintained on the Foundation's home page: https://wikimediafoundation.org/wiki/Template:ALL-WP-COUNT).

684 596 = 34 127 177 (https://meta.wikimedia.org/w/index.php?oldid=10866718#Grand_Total) - 33 442 581 (https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias/Table&oldid=10058364#Grand_Total)
...and I could do the same for the trend data (comparison with Q1, and Q2 2013/14).
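For anyone who wants to re-check the subtraction or repeat it for the trend quarters, here is a minimal sketch. The two totals are the ones quoted above; the function and variable names are made up for illustration. Note that a difference of grand totals is a net figure, so pages deleted during the quarter reduce it.

```python
# Grand totals copied from the two Meta revisions cited above.
LATER_TOTAL = 34_127_177    # oldid=10866718
EARLIER_TOTAL = 33_442_581  # oldid=10058364


def net_new_articles(earlier_total: int, later_total: int) -> int:
    """Net article growth between two grand-total snapshots."""
    return later_total - earlier_total


q2_new_articles = net_new_articles(EARLIER_TOTAL, LATER_TOTAL)
print(q2_new_articles)  # 684596
```

The same function would cover the Q1 and Q2 2013/14 comparisons once the matching revision totals are looked up; those totals are not listed here.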

The data on Meta, in turn, is based on @Dzahn's http://wikistats.wmflabs.org/display.php?t=wp, which, as I understand it, polls the API of each wiki directly.

Yes, http://wikistats.wmflabs.org/display.php?t=wp has a "grand total" section at the bottom.

It fetches the numbers from the API of each Wikipedia and then simply adds them up. Note that there is a difference between "total" and "good". What counts as "good" is defined by MediaWiki.

It is the same distinction as "pages" and "content pages" on https://en.wikipedia.org/wiki/Special:Statistics

IMHO that is reliable. Unless Special:Statistics itself has a bug, I don't see why it would differ from parsing dumps (which seems more error-prone).
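The approach described above (poll each wiki's API, then add up the counts) can be sketched roughly as follows. This is an illustration, not the actual wikistats code: it assumes the standard MediaWiki `siteinfo` statistics query, in which `pages` corresponds to "total" and `articles` to "good" (content pages). The helper names and the User-Agent string are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def statistics_url(lang: str) -> str:
    """Build the siteinfo statistics query URL for one Wikipedia."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


def fetch_statistics(lang: str) -> dict:
    """Fetch the statistics block for one wiki (performs a network call)."""
    # Hypothetical User-Agent; replace with something that identifies you.
    request = Request(statistics_url(lang),
                      headers={"User-Agent": "article-count-sketch/0.1"})
    with urlopen(request) as response:
        return json.load(response)["query"]["statistics"]


def grand_total(stats_by_wiki: dict) -> dict:
    """Add up per-wiki counts into 'total' (pages) and 'good' (articles)."""
    return {
        "total": sum(s["pages"] for s in stats_by_wiki.values()),
        "good": sum(s["articles"] for s in stats_by_wiki.values()),
    }


# Offline example with made-up numbers, not real wiki statistics.
sample = {
    "aa": {"pages": 300, "articles": 100},
    "bb": {"pages": 50, "articles": 20},
}
print(grand_total(sample))  # {'total': 350, 'good': 120}
```

For live numbers, one would build the input with something like `{lang: fetch_statistics(lang) for lang in langs}`, where `langs` is whatever list of language codes is of interest.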

(Adding Dario in case he wants to weigh in on the reliability of the two different data sources. For now, I'm OK with using Daniel's data via the version history of the Meta page.)