
[REQUEST] End-of-year pageview statistics
Closed, Resolved, Public

Description

Comms are now prepping for our traditional end-of-year narrative about the most-popular Wikipedia articles of the year. (Here's our post from 2019!)

This topic is always among the most popular things we publish and pitch to press each year, if not the most popular, and we're excited to once again be looking at ways to put together a fun narrative around our analytics.

To assist with this, we'd like to ask if it would be possible for the Product Analytics team to pull this data twice:

  • In early December, so we can get a rough idea of what the list will look like and begin to pitch ideas to press
  • Sometime between December 15–22, so we have the most recent data possible

(In an ideal world, we'd pull this data after January 1st from the Topviews WMFlabs page and lots of people would tune in. Unfortunately, past experience has shown that community, public, and press interest in the prior year drops precipitously after the new year begins.)

Deliverables

  • Pageview data just from English Wikipedia
  • Top 50 lists: one for desktop, one for mobile, and one combined
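
The three requested lists can be sketched as follows. This is only an illustration under stated assumptions: the row layout `(title, platform, views)`, the platform labels, the function name `top_n_lists`, and the sample numbers are all hypothetical, and this is not the query the Product Analytics team actually ran.

```python
from collections import Counter

def top_n_lists(rows, n=50):
    """Build desktop, mobile, and combined top-n lists.

    rows: iterable of (title, platform, views) tuples, where platform is
    assumed to be either "desktop" or "mobile" (hypothetical labels).
    """
    desktop, mobile, combined = Counter(), Counter(), Counter()
    for title, platform, views in rows:
        if platform == "desktop":
            desktop[title] += views
        elif platform == "mobile":
            mobile[title] += views
        else:
            continue  # skip platforms this sketch does not know about
        combined[title] += views
    return {
        "desktop": desktop.most_common(n),
        "mobile": mobile.most_common(n),
        "combined": combined.most_common(n),
    }

# Made-up sample data, for illustration only.
sample_rows = [
    ("Article A", "desktop", 120),
    ("Article A", "mobile", 300),
    ("Article B", "desktop", 500),
    ("Article B", "mobile", 50),
    ("Article C", "mobile", 200),
]
lists = top_n_lists(sample_rows, n=2)
```

Note that the combined ranking is computed from summed views and can differ from both per-platform rankings, which is why it is listed as a separate deliverable.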

Event Timeline

Justarandomamerican raised the priority of this task from Medium to Needs Triage.

Actually, never mind.

Justarandomamerican removed MNeisler as the assignee of this task.
Justarandomamerican added a subscriber: MNeisler.
Justarandomamerican removed a subscriber: MNeisler.
LGoto triaged this task as Medium priority.
LGoto moved this task from Triage to Needs Investigation on the Product-Analytics board.

Hey @cchen, we have one extra ask that came up later... Would it be possible to add in views from redirects once we have a top 50? We suspect that doing this would add a lot of pageviews to the COVID pages especially, which have been moved several times since the beginning of the pandemic, so those old titles are now redirects.

@EdErhart-WMF Sure! I will add redirects once I have the results.
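
As a rough illustration of what adding redirect views involves: views recorded under old titles are summed into the page's current title. The sketch below assumes a hypothetical mapping from redirect title to target title and only follows a single hop; the names `fold_redirects`, `views`, and `redirects`, and all numbers, are made up and do not reflect the actual query.

```python
from collections import Counter

def fold_redirects(views_by_title, redirect_targets):
    """Sum views of redirect titles into their target titles.

    views_by_title: dict of title -> view count.
    redirect_targets: dict of redirect title -> current title
    (assumed to be single-hop; chains of redirects are not resolved here).
    """
    folded = Counter()
    for title, view_count in views_by_title.items():
        canonical = redirect_targets.get(title, title)
        folded[canonical] += view_count
    return folded

# Made-up sample data, for illustration only.
views = {
    "Current title": 1000,
    "Old title 1": 250,
    "Old title 2": 100,
    "Other page": 400,
}
redirects = {
    "Old title 1": "Current title",
    "Old title 2": "Current title",
}
folded = fold_redirects(views, redirects)
```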

Hi @EdErhart-WMF, please find the stats here. The pageview counts also include redirects and cover user-only pageviews.
The timeframe for the data is 01/01/2020 - 11/29/2020.

Thank you so much @cchen! Do you have any thoughts as to the legitimacy of Microsoft Office at #29? It's roughly 50/50 desktop/mobile views, but it's such a strange entry that I'm having trouble accepting it. (This is as opposed to the YouTube article, which is probably the result of people mistakenly navigating there instead of the website!)

@EdErhart-WMF Hmm, I checked the referral class for this page. A large share of its pageviews are from the "none" referral class, which means the traffic comes either from direct links to this page or from automated or bot traffic. We detected and removed a large amount of automated traffic in May (from desktop) and October (from mobile web).
In this case, some pageviews to this page are probably from undetected automated traffic in other months. Another possible reason is that the page URL is linked directly from some websites or social media.
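
A check like the one described above could look roughly like this: compute each referral class's share of a page's views and see how much falls under "none". Only the "none" class is mentioned in this thread; the other class labels, the function name `referrer_share`, and the numbers are hypothetical.

```python
def referrer_share(views_by_referrer_class):
    """Return each referral class's fraction of total views (0.0 to 1.0)."""
    total = sum(views_by_referrer_class.values())
    if total == 0:
        return {}  # no views, so no meaningful shares
    return {cls: count / total for cls, count in views_by_referrer_class.items()}

# Made-up sample data, for illustration only.
sample_views = {"none": 600, "internal": 150, "external": 250}
shares = referrer_share(sample_views)
none_share = shares["none"]
```

A high "none" share on its own does not prove automation, since direct links also arrive without a referrer, so it is a signal to weigh rather than a verdict.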

Thanks, Connie! That's really helpful info that I think is enough for us to remove it from the listing. :-)

Hey again @cchen! We've been advised by our partners that December 15th would make it much easier to pitch this story idea to press.

Would it be possible to pull the data on that day specifically? My apologies for the change from the previous "sometime between December 15–22."

Hi @EdErhart-WMF, sure. The full data for December 15 will be available on December 16, so I will rerun the query then.

That works great! Thanks @cchen. We really appreciate your flexibility!

@EdErhart-WMF Here's the updated data for pages and pageviews between 2020/01/01 and 2020/12/15.