
Visualization of Wikimedia traffic by language, country and region
Closed, Resolved, Public

Description

Tracking task for @ezachte's wikistats dataviz.

  • Prepare a static CSV on top of tabular data in HTML T170691
  • Review design and browser support T169751
  • Documentation on Meta
  • Upload source of the frontend to Github
  • Blog post T169530
  • Register the data on a repository T178695

Event Timeline

DarTar triaged this task as Medium priority. Jul 28 2017, 3:38 PM
DarTar updated the task description.
DarTar added a project: Epic.
DarTar updated the task description.
DarTar added a subscriber: MelodyKramer.

GitHub: https://github.com/wikimedia/analytics-wikistats/tree/master/traffic/wivivi

data files: https://stats.wikimedia.org/wikimedia/animations/pageviews/data.html (to do: registering on repository)

Ready to publish the viz tomorrow, just in time before the Wikimania buzz starts. Blog post and data repository can follow later.

Viz published today at https://stats.wikimedia.org/wikimedia/animations/pageviews/wivivi.html

Early bugs/wishes:

"Is it possible to have a permalink for a query, somewhere? I think some communities would be happy to have a direct link to the map in their language"
(me: sounds like a great idea, and probably small change)
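
A minimal sketch of how such a permalink could work, written in Python for brevity rather than in the frontend's own code. The query parameter names `lang` and `view` and their defaults are hypothetical and are not part of WiViVi; the idea is only that the current selection is written into the URL and read back on page load.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "https://stats.wikimedia.org/wikimedia/animations/pageviews/wivivi.html"


def build_permalink(language_code: str, view: str = "language") -> str:
    """Return a shareable URL that records the selected language and view."""
    # "lang" and "view" are hypothetical parameter names (assumption).
    return BASE_URL + "?" + urlencode({"lang": language_code, "view": view})


def read_permalink(url: str) -> dict:
    """Recover the selection from a permalink, falling back to defaults."""
    params = parse_qs(urlsplit(url).query)
    return {
        "lang": params.get("lang", ["en"])[0],
        "view": params.get("view", ["language"])[0],
    }


link = build_permalink("fr")
selection = read_permalink(link)
```

A community could then share `link` directly, and the page would restore the French map from `selection`.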

Can we have this for all projects?
(me: hmm not sure yet, to be triaged for priority)

Can we have more colors? 50% to 100% is all yellow
(me: for most languages yellow is quite rare, and more colors make viz more crowded, let's see what others say)

I've also noticed a font rendering issue. To reproduce it, select French, then Breakdown by country. At #16, the Réunion island is shown as R�union.
(me: possible encoding error, will look)
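
One way this symptom can arise, as a sketch only: the cause in WiViVi has not been confirmed. The assumption here is that the country name was written as Latin-1 bytes and then decoded as UTF-8, which turns the single byte for "é" into the replacement character.

```python
name = "Réunion"

# Assumption: the name was stored as Latin-1 bytes ...
latin1_bytes = name.encode("latin-1")

# ... but read back as UTF-8. The lone byte 0xE9 is not valid UTF-8,
# so a lenient decoder substitutes U+FFFD (the replacement character).
garbled = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were actually written in restores it.
restored = latin1_bytes.decode("latin-1")

print(garbled)
print(restored)
```

If this is the cause, the fix is to write and read the data files with the same encoding throughout.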

If you switch the language to German, Greenland is displayed as yellow (50+% German Wikipedia views). On mouseover, however, one sees that the real percentages are English (57%) and Danish (42%). Just a bug?
(me: Greenland, yeah that's a bug in the datamaps library. Sometimes the country under the mouse doesn't get updated if the map is repainted for another language. Refresh (F5) should fix this. But I'll list it as open bug)

Hyvä Suomi! Most readers for any Wikipedia per capita, if I read correctly. It would have been nice to confirm by reading in the table, but the numbers displayed in the table were different from what was displayed in the choropleth map.
(me: Great point, thanks. I need to look into that.)

Several readers report issues with ad blocker uBlock. Needs to be deactivated to see viz.

Beborah: for countries like South Africa, the details screen extends beyond the page itself—can that be adjusted? (Canvas: 1614x960)

from Facebook:

James Heilman By the way Erik Zachte it would be very very cool to have that graph available by year (maybe a scroll bar?). It would then show that while Europe was originally using mostly English, as the WPs in those languages developed, people switched from EN to the native language of the country. Will this happen in SE Asia?

Erik Zachte James Heilman I thought of this too, and it would be doable. As a first step I keep data files for earlier months in separate folders. See https://stats.wikimedia.org/wikimedia/animations/pageviews/2017-06/ I could generate files for earlier months but I hesitate going further back than May 2015. Before that we did not filter bots, to name one issue.

James Heilman Erik Zachte It is more proportional views between languages that is of interest. Even with bot views useful info would come through IMO.

Erik Zachte James Heilman bot traffic was about 22% in 2015, and surely not distributed proportionally over all countries, that's a major distortion, alas

Tilman Bayer Erik Zachte Indeed, we should not use the old-definition (pre May 2015) pageview data for that, for the reason you mention. However, if someone is interested in investigating this a bit further, we do have an internal archive of (sampled) new-definition data that goes back two more years, to 2013, with views by country and project, and bots filtered. It's what I use for these 2013-17 trend charts: https://commons.wikimedia.org/wiki/File:Wikimedia_monthly_pageviews_(desktop%2Bmobile),_2013-.png

Tilman Bayer PS: I have documented that 2013-15 dataset and its differences to today's dataset here, finding that these are small enough to justify merging both into a four-year view: https://meta.wikimedia.org/wiki/Research:Page_view#Differences_to_earlier_implementation_of_the_.22new.22_definition_.282013-2015_data.29

Highest range of values for map "Wikipedia pageviews, percentage to language ...." (map in red-orange-yellow) has been split into two ranges, on user request. So range 50%-100% is now range 50%-80% and range 80%-100%
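
The split can be sketched as a simple lookup. This is an illustration, not the viz's actual code: the ranges below 50% are not listed in this task, so they are lumped into one label here, and putting exactly 80% in the upper range is an assumption.

```python
from bisect import bisect_right

# Lower bounds of the two top ranges after the split.
LOWER_BOUNDS = [50, 80]
# "below 50%" stands in for the lower ranges, which are not listed here.
LABELS = ["below 50%", "50%-80%", "80%-100%"]


def share_range(percentage: float) -> str:
    """Map a language's share of a country's pageviews to its colour range."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # bisect_right counts how many lower bounds are <= percentage.
    return LABELS[bisect_right(LOWER_BOUNDS, percentage)]


greenland_english = share_range(57)
```

With this split, the 57% English share reported for Greenland falls in the 50%-80% range rather than sharing a colour with countries above 80%.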

For the other map "Pageviews per capita to any Wikipedia in June 2017" (map in green) an extra range has been added at the bottom: "views: 0.1-0.25". This makes the viz more nuanced for Africa: instead of nearly all sub-Saharan countries being shown in black, some countries are now shown in very dark green, indicating they reached a slightly higher level of participation.

ad-blocker uBlock blocks WiViVI (with 3 reports on this from a small audience, this better be solved before we publish the blog post)

Turns out that uBlock doesn't like the folder name 'pageviews' (it is blacklisted). So on thorium I moved all files to folder ../wivivi/.. and put a redirect in folder ../pageviews/..

bash file datamaps_views.sh has been migrated to stat1005, so monthly updates to WiViVi can now be generated

https://stats.wikimedia.org/wikimedia/animations/wivivi/wivivi.html now shows data for September 2017
change also recorded at https://phabricator.wikimedia.org/T176478

@Erik_Zachte I /think/ you can cross off the Blog post for this task, given that it's already done. Are there plans to work on the last checkbox as well? If yes, do you need our help anywhere?

DarTar edited projects, added Research-Archive; removed Research.