Generate list of most used special pages
Closed, Resolved · Public

Description

Background

In order to determine optimum expanded navigation for mobile contributions, we'd like to get a rough idea of the most used special pages on our sites.

Acceptance criteria

  • Generate a list of the top 50 special pages on enwiki

--> T198218#4322343 (desktop+mobile), T198218#4322957 (desktop only), T198218#4527184 (logged-in, desktop+mobile), T198218#4599248 (logged-in, mobile)

  • Generate a list of the top 50 special pages (by desktop and mobile pageviews) for logged-in users on the following projects:

English Wikipedia
Spanish Wikipedia
Japanese Wikipedia
Chinese Wikipedia
Portuguese Wikipedia
Italian Wikipedia
Persian Wikipedia
Hebrew Wikipedia
Arabic Wikipedia
Finnish Wikipedia
Vietnamese Wikipedia
Indonesian Wikipedia
English Wiktionary
Thai Wikipedia
Hindi Wikipedia
--> https://www.mediawiki.org/wiki/Reading/Web/Advanced_mobile_contributions/Special_pages_usage

  • Generate a list of the top 50 pages outside the article namespace (by desktop and mobile) for logged-in users for the same projects listed above

--> T198218#4600385 (enwiki)
--> https://www.mediawiki.org/wiki/Reading/Web/Advanced_mobile_contributions/Special_pages_usage#Top_non-mainspace_pages (all 15 projects)

Event Timeline

Here is a first result for enwiki, ordered by views in May 2018. This is by root page name, so it e.g. doesn't distinguish between the different logs (Special:Log/delete vs. Special:Log/block).
Linked to the desktop version for convenience, but this includes mobile views.

_c0 views
https://en.wikipedia.org/wiki/Special:Search 58397730
https://en.wikipedia.org/wiki/- 17491023
https://en.wikipedia.org/wiki/Special:RecentChangesLinked 6984810
https://en.wikipedia.org/wiki/Special:BookSources 4692749
https://en.wikipedia.org/wiki/Special:WhatLinksHere 4218669
https://en.wikipedia.org/wiki/Special:Contributions 4017440
https://en.wikipedia.org/wiki/Special:CreateAccount 3627632
https://en.wikipedia.org/wiki/Special:ElectronPdf 3002279
https://en.wikipedia.org/wiki/Special:Watchlist 2232102
https://en.wikipedia.org/wiki/Special:History 2227126
https://en.wikipedia.org/wiki/Special:MobileLanguages 1788997
https://en.wikipedia.org/wiki/Special:MobileDiff 1783171
https://en.wikipedia.org/wiki/Special:CiteThisPage 1725691
https://en.wikipedia.org/wiki/Special:MobileMenu 1172652
https://en.wikipedia.org/wiki/Special:MobileOptions 933428
https://en.wikipedia.org/wiki/Special:Log 907614
https://en.wikipedia.org/wiki/Special:Book 752099
https://en.wikipedia.org/wiki/Special:LinkSearch 607160
https://en.wikipedia.org/wiki/Special:RecentChanges 318573
https://en.wikipedia.org/wiki/Special:PasswordReset 266977
https://en.wikipedia.org/wiki/Special:Nearby 182586
https://en.wikipedia.org/wiki/Special:AllPages 173705
https://en.wikipedia.org/wiki/Special:GlobalUsage 164129
https://en.wikipedia.org/wiki/Special:UserLogout 145693
https://en.wikipedia.org/wiki/Special:SpecialPages 141800
https://en.wikipedia.org/wiki/Special:PrefixIndex 135855
https://en.wikipedia.org/wiki/Special:FeedItem 132252
https://en.wikipedia.org/wiki/Special:Preferences 125954
https://en.wikipedia.org/wiki/Special:ListFiles 68955
https://en.wikipedia.org/wiki/Special:UserRights 57589
https://en.wikipedia.org/wiki/Special:NewPages 52768
https://en.wikipedia.org/wiki/Special:BlockList 49602
https://en.wikipedia.org/wiki/Special:EditWatchlist 49486
https://en.wikipedia.org/wiki/Special:Notifications 46750
https://en.wikipedia.org/wiki/Special:ConfirmEmail 45514
https://en.wikipedia.org/wiki/Special:Block 44746
https://en.wikipedia.org/wiki/Special:AbuseLog 42117
https://en.wikipedia.org/wiki/Special:DeletedContributions 41900
https://en.wikipedia.org/wiki/Special:ListUsers 39768
https://en.wikipedia.org/wiki/Special:ChangeCredentials 36513
https://en.wikipedia.org/wiki/Special:EmailUser 33049
https://en.wikipedia.org/wiki/Special:ChangeEmail 32567
https://en.wikipedia.org/wiki/Special:CentralAuth 30159
https://en.wikipedia.org/wiki/Special:Uploads 28619
https://en.wikipedia.org/wiki/Special:Upload 28512
https://en.wikipedia.org/wiki/Special:Statistics 26866
https://en.wikipedia.org/wiki/Special:SiteMatrix 24599
https://en.wikipedia.org/wiki/Special:MovePage 23416
https://en.wikipedia.org/wiki/Search 23254
https://en.wikipedia.org/wiki/special:book 18953

Query:

SELECT CONCAT('https://en.wikipedia.org/wiki/',page_root), 
SUM(view_count) AS views FROM (
  SELECT IF(INSTR(page_title,'/')=0,page_title, SUBSTR(page_title,0,INSTR(page_title, '/')-1)) AS page_root,
  view_count
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month = 5
  AND namespace_id = -1
  AND project = 'en.wikipedia'
  AND agent_type = 'user') AS pr
GROUP BY page_root
ORDER BY views DESC LIMIT 50;

There is a source of inaccuracy here because - as noted in the documentation for pageview_hourly - some special pages (pages with names starting with "Special:") don't have the correct namespace ID set in the pageview_hourly table; but from a quick check it seems that this doesn't happen very often. Also, it could be that the "Special:" prefix is sometimes omitted, see the stray entry for https://en.wikipedia.org/wiki/Search above.
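The root-page grouping used in the query above (everything before the first "/") can be sketched in Python as follows. This is only an illustration of the IF/INSTR/SUBSTR expression, not part of the actual Hive job; the function name page_root is made up for this example.

```python
def page_root(page_title: str) -> str:
    """Return the part of a page title before the first '/'.

    Mirrors IF(INSTR(t,'/')=0, t, SUBSTR(t,0,INSTR(t,'/')-1)) from the query:
    titles without a slash are returned unchanged.
    """
    slash = page_title.find("/")
    return page_title if slash == -1 else page_title[:slash]


# Both log types collapse into the same root, as noted above.
print(page_root("Special:Log/delete"))
print(page_root("Special:Log/block"))
print(page_root("Special:Watchlist"))
```

Grouping this way keeps the list short, at the cost of hiding which subpage (e.g. which log type) was actually viewed.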

Linked to the desktop version for convenience, but this includes mobile views.

To check that I understand the table and the SQL query: this is the total views of special pages across the site?
Is it possible to see the view_count for mobile alone? Specifically, it would be interesting to see unusually high usage of special pages that are not available on mobile.

Another query that would be interesting to investigate is search terms on mobile. Currently you can get to special pages from search by typing the name of the special page. It would thus be a good idea (if possible) to look at Schema:MobileWebSearch queries for clues about what editors need.

Here is the same limited to desktop, but for five months (January-May) instead of one. It confirms that the high number for RecentChangesLinked above wasn't an outlier - it got 13x as much traffic as RecentChanges, which I would have expected to be much more relevant and useful.

_c0 views
https://en.wikipedia.org/wiki/Special:Search 182484019
https://en.wikipedia.org/wiki/Special:WhatLinksHere 22019298
https://en.wikipedia.org/wiki/Special:RecentChangesLinked 20430499
https://en.wikipedia.org/wiki/Special:BookSources 20199725
https://en.wikipedia.org/wiki/Special:CreateAccount 20056114
https://en.wikipedia.org/wiki/Special:Contributions 18085714
https://en.wikipedia.org/wiki/Special:ElectronPdf 12371826
https://en.wikipedia.org/wiki/Special:Watchlist 10383133
https://en.wikipedia.org/wiki/- 9104395
https://en.wikipedia.org/wiki/Special:CiteThisPage 7120982
https://en.wikipedia.org/wiki/Special:Book 5465846
https://en.wikipedia.org/wiki/Special:Log 4187284
https://en.wikipedia.org/wiki/Special:LinkSearch 3125436
https://en.wikipedia.org/wiki/Special:RecentChanges 1578842
https://en.wikipedia.org/wiki/Special:AllPages 915764
https://en.wikipedia.org/wiki/Special:SpecialPages 837283
https://en.wikipedia.org/wiki/Special:PrefixIndex 803735
https://en.wikipedia.org/wiki/Special:ListFiles 706313
https://en.wikipedia.org/wiki/Special:FeedItem 693090
https://en.wikipedia.org/wiki/Special:UserLogout 658928
https://en.wikipedia.org/wiki/Special:UserRights 618910
https://en.wikipedia.org/wiki/Special:Preferences 536678
https://en.wikipedia.org/wiki/Special:GlobalUsage 419132
https://en.wikipedia.org/wiki/Special:PasswordReset 404227
https://en.wikipedia.org/wiki/Special:DeletedContributions 307613
https://en.wikipedia.org/wiki/Special:Block 293906
https://en.wikipedia.org/wiki/Special:AbuseLog 292844
https://en.wikipedia.org/wiki/Special:NewPages 289853
https://en.wikipedia.org/wiki/Special:CentralAuth 212355
https://en.wikipedia.org/wiki/Special:ListUsers 210642
https://en.wikipedia.org/wiki/Special:BlockList 170480
https://en.wikipedia.org/wiki/Special:EmailUser 169868
https://en.wikipedia.org/wiki/Special:Upload 153332
https://en.wikipedia.org/wiki/Special:Notifications 144606
https://en.wikipedia.org/wiki/Special:MovePage 141108
https://en.wikipedia.org/wiki/Special:ConfirmEmail 128983
https://en.wikipedia.org/wiki/Special:Statistics 128634
https://en.wikipedia.org/wiki/Special:EditWatchlist 120165
https://en.wikipedia.org/wiki/special:book 109355
https://en.wikipedia.org/wiki/Special:Undelete 104293
https://en.wikipedia.org/wiki/Special:Categories 100132
https://en.wikipedia.org/wiki/Special:ChangeEmail 80694
https://en.wikipedia.org/wiki/Special:AbuseFilter 73921
https://en.wikipedia.org/wiki/Special:NewPagesFeed 71053
https://en.wikipedia.org/wiki/Special:OAuth 63009
https://en.wikipedia.org/wiki/wiki.phtml 61495
https://en.wikipedia.org/wiki/Special:ComparePages 61480
https://en.wikipedia.org/wiki/Special:CheckUser 60061
https://en.wikipedia.org/wiki/Special%3ASearch 59233
https://en.wikipedia.org/wiki/Special:MobileOptions 56050

Query:

SELECT CONCAT('https://en.wikipedia.org/wiki/',page_root), SUM(view_count) AS views FROM (
  SELECT IF(INSTR(page_title,'/')=0,page_title, SUBSTR(page_title,0,INSTR(page_title, '/')-1)) AS page_root,
  view_count
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month <= 5
  AND namespace_id = -1
  AND project = 'en.wikipedia'
  AND access_method = 'desktop'
  AND agent_type = 'user') AS pr
GROUP BY page_root
ORDER BY views DESC LIMIT 50;
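
As a quick arithmetic check of the ratios discussed above, here is a small Python sketch that uses only numbers copied from the two tables in this thread (the variable names are made up for the example):

```python
# Desktop-only views, January-May 2018 (second table).
desktop_rcl = 20430499   # Special:RecentChangesLinked
desktop_rc = 1578842     # Special:RecentChanges

# Desktop + mobile views, May 2018 (first table).
may_rcl = 6984810        # Special:RecentChangesLinked
may_rc = 318573          # Special:RecentChanges

desktop_ratio = desktop_rcl / desktop_rc
may_ratio = may_rcl / may_rc

print(f"Jan-May desktop: {desktop_ratio:.1f}x")
print(f"May, all platforms: {may_ratio:.1f}x")
```

In both periods RecentChangesLinked is ahead by an order of magnitude, which is the point of the "wasn't an outlier" remark.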

Curious about where users typically access https://en.wikipedia.org/wiki/Special:LinkSearch from? I see you can go to the Special Pages page and find it there, but wondering if there are other points of access (perhaps from article pages)?

Here is the same list limited to views by logged-in users, as an approximation of editors (for July 2018, on both the desktop and mobile domains). Special:BookSources ranks much lower here, confirming the hunch that it's mostly accessed while reading. And among logged-in users RecentChanges is more popular than RecentChangesLinked, the reverse of the all-users result above, which is more in line with expectations about their usefulness for editors.

Data via

SELECT '|',
CONCAT('[[https://en.wikipedia.org/wiki/',page_root,'|',page_root,']]') AS page, '|',
SUM(1) AS views FROM (
  SELECT IF(INSTR(pageview_info['page_title'],'/')=0,pageview_info['page_title'], SUBSTR(pageview_info['page_title'],0,INSTR(pageview_info['page_title'], '/')-1)) AS page_root
  FROM wmf.webrequest
  WHERE year = 2018 AND month = 7
  AND namespace_id = -1
  AND x_analytics_map['loggedIn'] IS NOT NULL
  AND is_pageview
  AND pageview_info['project'] = 'en.wikipedia'
  AND agent_type = 'user') AS pr
GROUP BY page_root
ORDER BY views DESC LIMIT 50
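
The logged-in filter in this query relies on the x_analytics_map field. As a rough sketch of the idea only: the code below assumes the raw X-Analytics header is a semicolon-separated list of key=value pairs and that the namespace is carried in an "ns" key. Both the header values and the function names are hypothetical, chosen for illustration.

```python
def parse_x_analytics(header: str) -> dict:
    """Split a 'key=value;key=value' string into a dict (assumed header format)."""
    pairs = (item.split("=", 1) for item in header.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def is_logged_in_special_view(header: str) -> bool:
    """Rough analogue of: namespace_id = -1 AND x_analytics_map['loggedIn'] IS NOT NULL."""
    fields = parse_x_analytics(header)
    # Assumption: the namespace id is exposed under the key 'ns'.
    return fields.get("ns") == "-1" and "loggedIn" in fields


# Made-up header values, not taken from real traffic.
print(is_logged_in_special_view("ns=-1;special=Watchlist;loggedIn=1"))
print(is_logged_in_special_view("ns=-1;special=Search"))
```

The check is on the presence of the key rather than its value, matching the IS NOT NULL condition in the query.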

For direct comparison, here is the analogous list for all users (logged-in and anonymous) and the same timespan, July 2018:

Data via

SELECT '|',
CONCAT('[[https://en.wikipedia.org/wiki/',page_root,'|',page_root,']]') AS page, '|',
SUM(view_count) AS views FROM (
  SELECT IF(INSTR(page_title,'/')=0,page_title, SUBSTR(page_title,0,INSTR(page_title, '/')-1)) AS page_root,
  view_count
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month = 7
  AND namespace_id = -1
  AND project = 'en.wikipedia'
  AND agent_type = 'user') AS pr
GROUP BY page_root
ORDER BY views DESC LIMIT 50


And here is the analogous list of special pages with the most logged-in views on the mobile site (en.m.wikipedia.org in this case), also for July 2018. Comparing with the above result, one finds e.g. that 6% of views to Special:Watchlist are on the mobile domain.

Data via:

SELECT '|',
CONCAT('[[https://en.m.wikipedia.org/wiki/',page_root,'|',page_root,']]') AS page, '|',
SUM(1) AS views FROM (
  SELECT IF(INSTR(pageview_info['page_title'],'/')=0,pageview_info['page_title'], SUBSTR(pageview_info['page_title'],0,INSTR(pageview_info['page_title'], '/')-1)) AS page_root
  FROM wmf.webrequest
  WHERE year = 2018 AND month = 7
  AND namespace_id = -1
  AND x_analytics_map['loggedIn'] IS NOT NULL
  AND is_pageview
  AND pageview_info['project'] = 'en.wikipedia'
  AND access_method = 'mobile web'
  AND agent_type = 'user') AS pr
GROUP BY page_root
ORDER BY views DESC LIMIT 50

(BTW, I don't know how Special:History got in there, would need to check.)

In mobile history is served via a special page Special:History: https://en.m.wikipedia.org/wiki/Special:History/Alabama
Does that answer your question?

Right, this occurred to me afterwards too.

@Tbayer thanks, that's helpful. Is there any way to include the use of Talk pages within these results?

Here is the (or an) answer to the second question from the task, about the top 50 pages outside the article namespace (enwiki, July, logged in views, desktop+mobile).

There are various ways to slice this. I grouped again by root page name, i.e. the part before the "/" - which e.g. means that all deletion discussions (say https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Maxwell%27s_zombie ) are counted towards https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion .

root page    logged-in views July
Special:Watchlist 2069902
Special:Contributions 1741339
Special:Search 1009612
Special:MobileDiff 400956
Wikipedia:Articles_for_deletion 192131
Special:WhatLinksHere 126225
Wikipedia:Administrators'_noticeboard 108928
Special:History 108133
- 102442
Special:Preferences 101638
Special:RecentChanges 93007
Wikipedia:New_user_landing_page 90860
Special:Log 81434
Wikipedia:Article_wizard 76944
Portal:Current_events 71403
Wikipedia:Sockpuppet_investigations 60146
Special:EditWatchlist 50354
Special:AbuseLog 44755
Special:Book 44187
Wikipedia:Arbitration 40605
Wikipedia:Reference_desk 39655
Special:Notifications 38079
Wikipedia:File_Upload_Wizard 35753
Special:CreateAccount 34427
Wikipedia:Requests_for_adminship 33523
Wikipedia:Administrator_intervention_against_vandalism 32212
Help:Getting_started 29672
Special:RecentChangesLinked 25465
Special:MobileOptions 24280
Wikipedia:In_the_news 22830
Special:MovePage 22018
Special:ConfirmEmail 21831
Wikipedia_talk:Manual_of_Style 21379
Special:ElectronPdf 19964
Wikipedia:Teahouse 19226
Wikipedia:Categories_for_discussion 18503
Help:IPA 17908
Wikipedia:Manual_of_Style 17579
Wikipedia:Village_pump_(technical) 16694
Special:MobileLanguages 16269
Wikipedia:Your_first_article 16062
Wikipedia:Requests_for_page_protection 15824
Template:Did_you_know_nominations 15762
User_talk:Dwayne_Reed 15196
Portal:Contents 14756
Wikipedia:Featured_article_candidates 14533
Special:Undelete 13978
Talk:Donald_Trump 13757
Wikipedia:Help_desk 13647
Wikipedia:Village_pump_(proposals) 13519

Data via

SELECT '|',
CONCAT('[[https://en.wikipedia.org/wiki/',page_root,'|',page_root,']]') AS page, '|',
SUM(1) AS views FROM (
  SELECT IF(INSTR(pageview_info['page_title'],'/')=0,pageview_info['page_title'], SUBSTR(pageview_info['page_title'],0,INSTR(pageview_info['page_title'], '/')-1)) AS page_root
  FROM wmf.webrequest
  WHERE year = 2018 AND month = 7
  AND namespace_id !=0
  AND x_analytics_map['loggedIn'] IS NOT NULL
  AND is_pageview
  AND pageview_info['project'] = 'en.wikipedia'
  AND agent_type = 'user') AS pr
GROUP BY page_root
ORDER BY views DESC LIMIT 50

In T198218#4599295, @alexhollender wrote:

@Tbayer thanks, that's helpful. Is there any way to include the use of Talk pages within these results?

We can look at the ratio of article talk page views among all mobile logged-in pageviews, in case that is useful?

The result above about the top non-mainspace pages might also be interesting in that regard, concerning discussion pages outside the proper article talk page namespace, e.g. AfD or the Administrators'_noticeboard.

We haven't really decided which direction to take this task in from here. Which of the above combinations would be most useful to extend to the other four languages? (logged-in vs. all views, mobile domain vs. mobile + desktop)

Does mobile domain include users using desktop on mobile? I would like to know the answer to
"which special pages are users using on a mobile phone that they feel the need to use desktop site for?"

I think knowing this would be useful to target pages that editors want to use on their mobile phones and help us work out where we can best fill those gaps to give them better experiences.

_c0 views
https://en.wikipedia.org/wiki/wiki.phtml 61495

TIL that MediaWiki used to have an entry point called wiki.phtml. It was removed in MediaWiki 1.29.

Results for the second question, expanding the earlier result for the top 50 special pages to 14 non-enwiki projects, are now posted at https://www.mediawiki.org/wiki/Reading/Web/Advanced_mobile_contributions/Special_pages_usage (consisting of 15 x 50 = 750 numbers, this information is a bit unwieldy and would not fit well into a table here on Phabricator).

For enwiki, the top 10 list remains the same as in the result for an earlier month (T198218#4527184, using a slightly different query). One of the most noteworthy differences on non-enwiki projects might be that Special:NewPages is generally much more popular there than on English Wikipedia.

Feedback on the format (wiki table) welcome. Unless there are other suggestions, I'm planning to post the other updated result (non-mainspace pages on non-enwiki projects) in a similar fashion.

Data source below. I took the opportunity to make a few changes to the query, including listing the percentage among all special page views instead of the absolute number of views, and making use of the x_analytics_map['special'] field that handily provides the original special page name instead of the localized name in the URL, e.g. Special:RecentChanges instead of Especial:CambiosRecientes (thanks to @phuedx for bringing it to my attention).

WITH 

  counted AS (
  SELECT pageview_info['project'] AS project,
  x_analytics_map['special'] AS specialpagename,
  SUM(1) AS views
  FROM wmf.webrequest
  WHERE year = 2019 AND month = 2
  AND namespace_id = -1
  AND x_analytics_map['loggedIn'] IS NOT NULL
  AND pageview_info['project'] IN ('en.wikipedia', 'es.wikipedia', 'ja.wikipedia', 
    'zh.wikipedia', 'pt.wikipedia', 'it.wikipedia', 'fa.wikipedia', 'he.wikipedia',
    'ar.wikipedia', 'fi.wikipedia', 'vi.wikipedia', 'id.wikipedia', 'en.wiktionary',
    'th.wikipedia', 'hi.wikipedia')
  AND agent_type = 'user'
  GROUP BY pageview_info['project'], x_analytics_map['special']
  ),

  ranked AS (
    SELECT project, specialpagename, views,
    RANK() OVER (PARTITION BY project ORDER BY views DESC) as ranking
    FROM counted
  ),
  
  totals AS (
    SELECT project, SUM(views) AS all_special_views
    FROM counted
    GROUP BY project
  )

SELECT specialpages.project AS project,
ranking,
special_page_url, 
views,
views/all_special_views AS ratio,
ROUND(100*views/all_special_views,2) AS percentage
FROM (
  SELECT project,
    ranking,
    CONCAT('[https://', project, '.org/wiki/Special:', specialpagename, ' ',specialpagename,']') AS special_page_url,
    views
  FROM ranked
  WHERE ranking <= 50) AS specialpages
  JOIN (
    SELECT project, all_special_views FROM totals) AS projects
  ON specialpages.project = projects.project
ORDER BY project ASC, ratio DESC LIMIT 10000
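
To make the ranking and percentage steps of the query above easier to follow, here is a minimal Python sketch of what the ranked and totals CTEs compute. The view counts are invented for illustration and the function name rank_with_percentage is hypothetical; this is not the code that produced the posted results.

```python
from collections import defaultdict

# Hypothetical (project, special page) -> view counts; not real data.
counted = {
    ("es.wikipedia", "Watchlist"): 600,
    ("es.wikipedia", "RecentChanges"): 300,
    ("es.wikipedia", "Contributions"): 100,
    ("fi.wikipedia", "Watchlist"): 50,
    ("fi.wikipedia", "NewPages"): 50,
}


def rank_with_percentage(counted, top_n=50):
    """Per project: rank pages by views and give each page's share of all views."""
    by_project = defaultdict(list)
    for (project, page), views in counted.items():
        by_project[project].append((page, views))

    rows = []
    for project in sorted(by_project):
        pages = sorted(by_project[project], key=lambda pv: pv[1], reverse=True)
        total = sum(views for _, views in pages)  # plays the role of the "totals" CTE
        for page, views in pages:
            # Like SQL RANK(): tied rows share the rank of the first tied row.
            ranking = 1 + sum(1 for _, v in pages if v > views)
            if ranking <= top_n:
                rows.append((project, ranking, page, views, round(100 * views / total, 2)))
    return rows


for row in rank_with_percentage(counted):
    print(row)
```

As in the query, the percentage is relative to all counted views for the project, not only to the views of the pages that make it into the top 50.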

Results for the third question about the most popular non-mainspace pages for logged-in users, expanding on the enwiki results of T198218#4600385 , are now posted at https://www.mediawiki.org/wiki/Reading/Web/Advanced_mobile_contributions/Special_pages_usage#Top_non-mainspace_pages
Note that as before, this is grouped by (and linking to) page roots, e.g. https://es.wikipedia.org/wiki/Wikipedia:Consultas_de_borrado/Cloud9_(League_of_Legends) counts for https://es.wikipedia.org/wiki/Wikipedia:Consultas_de_borrado . For some page roots, a corresponding page may not exist (e.g. there is https://en.wiktionary.org/wiki/Reconstruction:Proto-Slavic/-ati but no https://en.wiktionary.org/wiki/Reconstruction:Proto-Slavic ).
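
The page-root grouping described above can be sketched in Python like this. The pageview records are invented (only the titles come from the examples in this comment), and the helper page_root is a hypothetical stand-in for the IF/INSTR/SUBSTR expression in the query.

```python
from collections import Counter


def page_root(title: str) -> str:
    """Part of the title before the first '/', or the whole title if there is none."""
    return title.split("/", 1)[0]


# Hypothetical pageview records (project, page title); not real data.
pageviews = [
    ("es.wikipedia", "Wikipedia:Consultas_de_borrado/Cloud9_(League_of_Legends)"),
    ("es.wikipedia", "Wikipedia:Consultas_de_borrado"),
    ("en.wiktionary", "Reconstruction:Proto-Slavic/-ati"),
]

views = Counter((project, page_root(title)) for project, title in pageviews)

for (project, root), count in views.most_common():
    # Same link format as the CONCAT in the query; the link target may not exist.
    print(f"[https://{project}.org/wiki/{root} {root}] {count}")
```

The en.wiktionary row shows the caveat mentioned above: the link is built from the root alone, whether or not a page exists at that title.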

Closing this task for now, but feel free to post followup questions here.

Data source:

WITH 

  counted AS (
  SELECT project, page_root,
  SUM(1) AS views
  FROM (
    SELECT
    pageview_info['project'] AS project,
    IF(INSTR(pageview_info['page_title'],'/')=0,
        pageview_info['page_title'], 
        SUBSTR(pageview_info['page_title'],0,INSTR(pageview_info['page_title'], '/')-1)) 
    AS page_root
    FROM wmf.webrequest  
    WHERE year = 2019 AND month = 2
    AND namespace_id != 0
    AND x_analytics_map['loggedIn'] IS NOT NULL
    AND is_pageview
    AND pageview_info['project'] IN ('en.wikipedia', 'es.wikipedia', 'ja.wikipedia', 
      'zh.wikipedia', 'pt.wikipedia', 'it.wikipedia', 'fa.wikipedia', 'he.wikipedia',
      'ar.wikipedia', 'fi.wikipedia', 'vi.wikipedia', 'id.wikipedia', 'en.wiktionary',
     'th.wikipedia', 'hi.wikipedia')
    AND agent_type = 'user') AS pr
  GROUP BY project, page_root
  ),

  ranked AS (
    SELECT project, page_root, views,
    RANK() OVER (PARTITION BY project ORDER BY views DESC) as ranking
    FROM counted
  ),
  
  totals AS (
    SELECT project, SUM(views) AS all_nonmain_views
    FROM counted
    GROUP BY project
  )

SELECT pageroots.project AS project,
ranking,
page_root_link, 
views,
views/all_nonmain_views AS ratio,
ROUND(100*views/all_nonmain_views,2) AS percentage
FROM (
  SELECT project,
    ranking,
    CONCAT('[https://', project, '.org/wiki/', page_root, ' ',page_root,']') AS page_root_link,
    views
  FROM ranked
  WHERE ranking <= 50) AS pageroots
  JOIN (
    SELECT project, all_nonmain_views FROM totals) AS projects
  ON pageroots.project = projects.project
ORDER BY project ASC, ratio DESC LIMIT 10000;

@ovasileva looking at https://www.mediawiki.org/wiki/Reading/Web/Advanced_mobile_contributions/Special_pages_usage#Top_non-mainspace_pages I'm wondering if we somehow overlooked including the following pages in AMC navigation:

  • Wikipedia:Articles_for_deletion
  • Wikipedia:Administrators'_noticeboard
  • Special:Log
  • Wikipedia:Arbitration
  • Special:AbuseLog