Wikipedia.org Portal Dashboard: add clicks by language
Closed, Resolved | Public | 8 Estimated Story Points

Description

As part of the Foundation's goals, we want to expand usage of Wikipedia in the Global South, in other countries, and in smaller languages. To support this goal, we need to see what the current usage is on the wikipedia.org portal page and display it on our dashboard.

To do this, we want a dashboard tab that tallies the clicks on the language links from the portal:

  • count by clicks onto individual language links
    • listed alphabetically
    • can also be sorted by highest to lowest counts (and vice versa)
  • total number of clicks onto all language wikis
  • create with as much timely data as we currently have stored in order to show trends
    • i.e., are the smaller wikis getting more or less traffic over time
    • i.e., use any old data that we still have and add in the new data that we're collecting
  • have a filter to sort by the top 10 languages
  • have a filter to sort by the bottom 50 languages
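
A minimal sketch of the tallies requested above, assuming each logged click has already been reduced to a language code. The function names and the sample data are hypothetical and are not taken from the dashboard's code.

```python
from collections import Counter


def tally_clicks(events):
    """Count clicks per language from an iterable of language codes."""
    return Counter(events)


def sort_languages(counts, by="alphabetical", descending=False):
    """Return language codes sorted alphabetically or by click count."""
    if by == "alphabetical":
        return sorted(counts, reverse=descending)
    # Sort by clicks; ties are broken alphabetically so the order is stable.
    sign = -1 if descending else 1
    return sorted(counts, key=lambda lang: (sign * counts[lang], lang))


def top_n(counts, n=10):
    """The n languages with the most clicks, highest first."""
    return sort_languages(counts, by="clicks", descending=True)[:n]


def bottom_n(counts, n=50):
    """The n languages with the fewest clicks, lowest first."""
    return sort_languages(counts, by="clicks", descending=False)[:n]


# Hypothetical sample: one entry per click on a language link.
clicks = ["en", "de", "en", "ab", "fr", "en", "de"]
counts = tally_clicks(clicks)
total = sum(counts.values())  # total clicks across all language wikis
```

Keeping the per-language counts in one mapping means the alphabetical, high-to-low, top 10, and bottom 50 views are all just different orderings of the same data.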

Event Timeline

Simple prototype from T138397 up at http://discovery-experimental.wmflabs.org/portal2sites/

Will build on it to satisfy the description of this task.

Thanks, @mpopov - the other projects always seem to be 0%. Can you double check that please?

"Other projects" will always be 0 for *.wikipedia.org because only search, primary links, and secondary links lead to those wikis. If you select mediawiki.org or meta.wikimedia.org, you will see only "other projects" clicks.

Change 303730 had a related patch set uploaded (by Bearloga):
Count clicks by language

https://gerrit.wikimedia.org/r/303730

Change 303730 merged by Bearloga:
Count clicks by language

https://gerrit.wikimedia.org/r/303730

Change 303815 had a related patch set uploaded (by Bearloga):
Bug-proofing

https://gerrit.wikimedia.org/r/303815

mpopov set the point value for this task to 8.

Had a meeting showcasing the dashboard. Here are the requested changes, features, and fixes to make before this becomes part of the Portal dashboard in production:

  • Fill in 0s for unobserved days (e.g. Abkhazian)
  • Explain users vs clicks and how multiple clicks are possible per visit per session
  • Explain sampling on the languages visited panel
  • Have a question mark for the tooltip about selecting languages
  • "Select all" button for top 10 and bottom 50
  • Make a note that languages will change depending on top 10 vs bottom 50 vs alphabetical
  • Remove multiplication by 200 and make it raw counts again
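
The first request, filling in 0s for unobserved days, could be sketched as follows. This assumes the daily counts for one language are held in a date-keyed mapping; `fill_missing_days` and the sample dates are hypothetical, not the dashboard's actual code.

```python
from datetime import date, timedelta


def fill_missing_days(daily_counts, start, end):
    """Return a {date: count} mapping covering every day from start to end
    (inclusive), using 0 for days with no observed clicks."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return {day: daily_counts.get(day, 0) for day in days}


# Hypothetical sample: a small-language wiki observed on only two days.
observed = {date(2016, 8, 1): 3, date(2016, 8, 4): 1}
filled = fill_missing_days(observed, date(2016, 8, 1), date(2016, 8, 4))
```

Without the explicit zeros, a line chart would connect the two observed days directly and overstate traffic on the days in between.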

Change 304505 had a related patch set uploaded (by Bearloga):
Add languages visited

https://gerrit.wikimedia.org/r/304505

Change 304505 merged by Chelsyx:
Add languages visited

https://gerrit.wikimedia.org/r/304505

I found a couple more issues:

  1. Clicking data: users and selecting "Include English Wikipedia" does not change the results on the tab.

include-en-wiki_doesnt_work.png (584×1 px, 92 KB)

  2. With only one language selected (the default language), splines cannot be used in the chart (the chart doesn't load).

no-splines_with_one_lang_selected.png (382×1 px, 46 KB)

Change 304954 had a related patch set uploaded (by Bearloga):
Language visits bug fixes

https://gerrit.wikimedia.org/r/304954

Change 304954 merged by Bearloga:
Language visits bug fixes

https://gerrit.wikimedia.org/r/304954

Small updates to be made:

On http://discovery-beta.wmflabs.org/portal/#languages_summary:

primary: the links around the Wikipedia globe logo, which are dynamically placed and sorted according to each visitor's language preferences.

  • change to:

primary: the links around the Wikipedia globe logo, which are dynamically placed and sorted according to each visitor's browser's language preferences.

Also, the x/y axes are hard to read (dark font on a dark page background).

...is this correct?

search: wikipedia.org visitors can search Wikipedias in different languages and end up on specific articles, in which case we know the language of the Wikipedia they visited. If they did not find a specific article, they are taken to all a search results page, in which case we will not know the language of the Wikipedia they visited.

  • it should probably be more along the lines of:

search: wikipedia.org visitors can search Wikipedias in different languages and end up on specific articles, and we log the language of the Wikipedia they visited. However, if the visitor did not find a specific article during their initial query from the search metadata that is displayed, or by hitting 'enter', they will be redirected to a default search results page in the language that they searched in (even if they changed the language in the small dropdown in the search box while on the portal). At this time, this search box language selection change is not logged.

  • and add in:

Note: Sister project link clickthroughs are not tracked on this page, see this page for more info.

On http://discovery-beta.wmflabs.org/portal/#languages_visited:

  • black x/y axes need to be white like the other graphs
  • when selecting 'top 10' - all top ten languages should be shown by default (currently, the user has to take an additional step of clicking the 'select all top 10' button to do this)
  • when selecting 'bottom 50' button - all 50 languages should be shown by default
  • update the 'sort languages' to default to be in this order: top 10, bottom 50, overall clicks (high to low), overall clicks (low to high), alphabetically (a to z), alphabetically (z to a)
  • after selecting several languages and then clicking through the sort languages selections - the languages selected change order: they should remain in the same order as selected previously
  • the languages listed below the chart should be in order of highest clicks to lowest clicks (it seems to be in alphabetical order)
  • when selecting a lot of languages - the language box gets very large and pushes the chart and the legend down over the notes

too many languages are overwriting notes.png (1×1 px, 561 KB)

On http://discovery-beta.wmflabs.org/portal/#languages_visited:

  • black x/y axes need to be white like the other graphs

Fixed in upcoming patch.

  • when selecting 'top 10' - all top ten languages should be shown by default (currently, the user has to take an additional step of clicking the 'select all top 10' button to do this)
  • when selecting 'bottom 50' button - all 50 languages should be shown by default

I'm putting a hard limit of 12 maximum selectable languages. Anything more than that is too much for anyone to make (or to want to make) sense of and as you noted in the point below, selecting a lot of languages (e.g. 50) breaks the UI.

  • update the 'sort languages' to default to be in this order: top 10, bottom 50, overall clicks (high to low), overall clicks (low to high), alphabetically (a to z), alphabetically (z to a)

Fixed in upcoming patch.

  • after selecting several languages and then clicking through the sort languages selections - the languages selected change order: they should remain in the same order as selected previously

Sorry, can't do it within the framework we're using. The selected languages always re-order according to how they would be ordered if they weren't selected. That is, sorting by clicks (high to low) will send English and German to the front of the selected languages pack if they were selected.
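
A small sketch of the behavior described above, assuming the selection widget always renders the selected items in the order of its current (sorted) list of choices. `displayed_selection` and the sample codes are hypothetical.

```python
def displayed_selection(sorted_choices, selected):
    """Return the selected languages in the order of the currently sorted
    choices, ignoring the order in which the user picked them."""
    chosen = set(selected)
    return [lang for lang in sorted_choices if lang in chosen]


# The user picked Abkhazian first, then English.
picked = ["ab", "en"]

# With choices sorted by clicks (high to low), English moves to the front.
by_clicks = displayed_selection(["en", "de", "fr", "ab"], picked)

# With choices sorted alphabetically, Abkhazian comes first again.
alphabetical = displayed_selection(["ab", "de", "en", "fr"], picked)
```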

  • the languages listed below the chart should be in order of highest clicks to lowest clicks (it seems to be in alphabetical order)

Fixed in upcoming patch.

  • when selecting a lot of languages - the language box gets very large and pushes the chart and the legend down over the notes

See above for the limit on the number of languages. I did fix it in the upcoming patch so that at least the legend won't run over the notes :P

Change 305150 had a related patch set uploaded (by Bearloga):
Languages visited bug fixes & re-wording

https://gerrit.wikimedia.org/r/305150

Change 305150 merged by Bearloga:
Languages visited bug fixes & re-wording

https://gerrit.wikimedia.org/r/305150

@debt Except the order of languages in the legend. Sorry! Only fixed the order of the languages in the selection box and the graph title, not the legend below the graph. Will try to fix now!

Change 305158 had a related patch set uploaded (by Bearloga):
Languages legend ordering bugfix

https://gerrit.wikimedia.org/r/305158

Change 305158 merged by Bearloga:
Languages legend ordering bugfix

https://gerrit.wikimedia.org/r/305158

Cool - but there are a few things that aren't working well yet.

For the bottom 50 - it defaults to whatever language is last in the overall list, instead of showing the bottom 50 languages as expected (don't make me guess what the bottom languages are). We need it to show all the bottom languages when the user first clicks on that selection (they can be pre-loaded and static).

Alternatively, we can change that selection to note that we'll only show the bottom 12 languages (which isn't as useful as showing the bottom 50) and pre-load those bottom 12 languages when the user selects that from the dropdown.

Also, the languages that I selected are not listed (above the notes) in order from highest number of clicks (or users) to lowest. I'm not sure if that got reverted with one of the other changes/check-ins?

Screen Shot 2016-08-17 at 1.32.51 PM.png (480×1 px, 117 KB)

Change 305400 had a related patch set uploaded (by Bearloga):
Language visited bugfixes & progress bar

https://gerrit.wikimedia.org/r/305400

Change 305400 merged by Bearloga:
Language visited bugfixes & progress bar

https://gerrit.wikimedia.org/r/305400

@debt Patch notes:

Fixes

  • Correctly sorts the languages in the top 10 and bottom 50 lists by clicks or users (high to low) depending on which the user selected
  • Now auto-selects the bottom 12 (the maximum number of languages that can be selected at the same time) when sort is set to "Bottom 50"

Adds

  • A progress bar that shows the first user of the day how far along the dashboard is in downloading the latest data
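
The two fixes in these patch notes could be sketched roughly as below. This assumes per-language totals for both clicks and users are available in one mapping; the function names and sample numbers are hypothetical, and only the limit of 12 comes from this thread.

```python
MAX_SELECTED = 12  # maximum selectable languages, as stated earlier in the thread


def bottom_list(stats, metric="clicks", n=50):
    """Return the n languages with the lowest value of `metric`,
    listed from high to low. `stats` maps lang -> {"clicks": int, "users": int}."""
    lowest = sorted(stats, key=lambda lang: (stats[lang][metric], lang))[:n]
    return sorted(lowest, key=lambda lang: (-stats[lang][metric], lang))


def auto_select(candidates, limit=MAX_SELECTED):
    """Pre-select the last `limit` entries of a high-to-low list,
    i.e. the languages with the lowest counts."""
    return candidates[-limit:]


# Hypothetical totals for four languages.
stats = {
    "en": {"clicks": 900, "users": 700},
    "de": {"clicks": 300, "users": 250},
    "ab": {"clicks": 2, "users": 2},
    "zu": {"clicks": 5, "users": 1},
}
by_clicks = bottom_list(stats, metric="clicks", n=3)
by_users = bottom_list(stats, metric="users", n=3)
preselected = auto_select(by_clicks, limit=2)
```

Note that the list membership and order can differ between clicks and users, which is why the sort has to follow whichever metric the user selected.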

Once this gets an OK from @debt, it will be deployed by @chelsyx as part of the deployment step in T142436

Change 305561 had a related patch set uploaded (by Bearloga):
Change bottom 50 to bottom 10

https://gerrit.wikimedia.org/r/305561

Change 305561 merged by Bearloga:
Change bottom 50 to bottom 10

https://gerrit.wikimedia.org/r/305561

@debt Per our discussion:

  • Changes "Bottom 50" to "Bottom 10"
  • Changes "Select all top 10" button to "Select all 10"
  • Button now shows up if user selected "Top 10" or "Bottom 10" sorting

Yes, @mpopov, although I would really love a static listing of the top 10 (or 15) languages and then a listing of the bottom 50 languages. I think that gives a much more accurate picture of things.

Static meaning that a user can't edit which languages are listed as top or bottom, the page just displays the information. Even if it's messy looking with the colors, the user can hover over the data and get the info they want by how they display below the graph, in exact numbers per day.

Change 306084 had a related patch set uploaded (by Bearloga):
'Bottom 10' to static 'Bottom 50', makes 'Top 10' static too

https://gerrit.wikimedia.org/r/306084

Change 306084 merged by Bearloga:
'Bottom 10' to static 'Bottom 50', makes 'Top 10' static too

https://gerrit.wikimedia.org/r/306084

Hi @mpopov - looks much better, yay! Maybe we can unbold the listing of languages for the bottom 50 - so that the emphasis is on the graph and the list of languages by user/click below the graph? :)

Also - it appears that on all pages the languages are in alphabetical order, not ordered by number of clicks/users. Did it regress?

Change 306271 had a related patch set uploaded (by Bearloga):
De-bold & make languages visited graph title smaller & re-fix language order

https://gerrit.wikimedia.org/r/306271

Change 306271 merged by Bearloga:
De-bold & make languages visited graph title smaller & re-fix language order

https://gerrit.wikimedia.org/r/306271

@debt: Both points addressed/fixed! :D

Sorry, maybe I'm missing the update, but I'm not seeing either the fix for the font size of the selected languages or the fix for ordering the languages from highest to lowest clicks (below the graph).

Screen Shot 2016-08-24 at 9.06.04 AM.png (684×983 px, 259 KB)

  1. Would it be better if I were to just hide the title if bottom 50 is selected? Making the font way small would make it illegible, so maybe just hiding makes the most sense.
  2. The languages are in order of highest to lowest clicks (overall), just not at any particular date.
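
To illustrate point 2: ordering by overall totals can disagree with the ordering on a single day. The sketch below uses hypothetical dates and counts, and the function names are illustrative only.

```python
from collections import Counter


def overall_order(daily):
    """Languages ordered by total clicks across all days, highest first."""
    totals = Counter()
    for counts in daily.values():
        totals.update(counts)
    return sorted(totals, key=lambda lang: (-totals[lang], lang))


def order_on(daily, day):
    """Languages ordered by clicks on one specific day, highest first."""
    counts = daily[day]
    return sorted(counts, key=lambda lang: (-counts[lang], lang))


# Hypothetical data: German beats English on one day but not overall.
daily = {
    "2016-08-22": {"en": 50, "de": 60},
    "2016-08-23": {"en": 70, "de": 40},
}
legend = overall_order(daily)
one_day = order_on(daily, "2016-08-22")
```

A legend sorted by overall totals therefore stays fixed while the user hovers over different dates, even if the per-day ranking changes.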

Change 306517 had a related patch set uploaded (by Bearloga):
Note language order in legend

https://gerrit.wikimedia.org/r/306517

Change 306517 merged by Bearloga:
Note language order in legend

https://gerrit.wikimedia.org/r/306517

@debt: is this latest version OK to deploy to production?

Yes - but can you make the 'combine languages' (for clicks) be the default screen for the bottom 50, please?

Change 306581 had a related patch set uploaded (by Bearloga):
Give option to 'combine languages' when multiple langs are selected && check it by default whenever 'Bottom 50' sorting option is chosen

https://gerrit.wikimedia.org/r/306581

Change 306581 merged by Bearloga:
Give option to 'combine languages' when multiple langs are selected && check it by default whenever 'Bottom 50' sorting option is chosen

https://gerrit.wikimedia.org/r/306581
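
A rough sketch of what a 'combine languages' option might do, assuming it sums the daily counts of all selected languages into a single series. That interpretation, the function name, and the sample data are assumptions, not taken from the patch.

```python
def combine_languages(series_by_lang, selected):
    """Sum the daily counts of the selected languages into one series.
    `series_by_lang` maps lang -> {day: count}; missing days count as 0."""
    combined = {}
    for lang in selected:
        for day, n in series_by_lang.get(lang, {}).items():
            combined[day] = combined.get(day, 0) + n
    return dict(sorted(combined.items()))


# Hypothetical daily clicks for three small-language wikis.
series = {
    "ab": {"2016-08-22": 1},
    "zu": {"2016-08-22": 2, "2016-08-23": 4},
    "kl": {"2016-08-23": 1},
}
combined = combine_languages(series, ["ab", "zu", "kl"])
```

For a 'Bottom 50' view, a single combined line is easier to read than 50 separate series with very small counts.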

Change 306964 had a related patch set uploaded (by Bearloga):
Annotate backfilled data

https://gerrit.wikimedia.org/r/306964

Change 306964 merged by Bearloga:
Annotate backfilled data

https://gerrit.wikimedia.org/r/306964