
most popular related articles
Open, Low, Public

Description

Author: jsalsman

Description:
During a #wikimedia-strategy brainstorming session regarding a Question of the Week
("What changes to Wikimedia's technology would enable a friendlier and more welcoming
environment?"), it was suggested that the main issues could be addressed thusly:

[20:20] <jimmyps> we could address the first two (tallest) bars on http://strategy.wikimedia.org/wiki/File:091207_QOTW.png simply by publicizing statistics from http://stats.grok.se
[20:21] <eekim> jimmyps: the key question is, how would you publicize it, and how would you measure if you were being effective?
[20:22] <jimmyps> eekim: for each article find the top 10 articles also in its categories and list them in order on the sidebar after the interwikis with "x,xxx views/month" right-justified on every other line after each of the 10
[20:23] <jimmyps> that would indicate to people the most popular subjects that they are also interested in
[20:23] <jimmyps> this could be done in batch mode
[20:25] <jimmyps> does anyone disagree that listing the most popular "related articles" with their viewership counts on the sidebar after the interwikis would address the largest leftmost two bars on http://strategy.wikimedia.org/wiki/File:091207_QOTW.png ?

(no disagreements were forthcoming)
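
The ranking proposed in the log above can be sketched as follows. This is only an illustration under stated assumptions: the dictionaries `categories_of`, `members_of`, and `monthly_views` are hypothetical in-memory stand-ins for category data and view counts, and the function names are invented for this sketch rather than taken from MediaWiki or any existing tool.

```python
from typing import Dict, Iterable, List, Set, Tuple


def top_related(
    article: str,
    categories_of: Dict[str, Set[str]],   # hypothetical: article -> its categories
    members_of: Dict[str, Set[str]],      # hypothetical: category -> member articles
    monthly_views: Dict[str, int],        # hypothetical: article -> views per month
    limit: int = 10,
) -> List[Tuple[str, int]]:
    """Return up to `limit` (title, views) pairs sharing a category with `article`."""
    candidates: Set[str] = set()
    for category in categories_of.get(article, set()):
        candidates |= members_of.get(category, set())
    candidates.discard(article)  # an article is not "related" to itself
    # Most viewed first; ties are broken alphabetically so the order is stable.
    ranked = sorted(
        candidates,
        key=lambda title: (-monthly_views.get(title, 0), title),
    )
    return [(title, monthly_views.get(title, 0)) for title in ranked[:limit]]


def sidebar_lines(ranked: Iterable[Tuple[str, int]], width: int = 30) -> List[str]:
    """Emit each title followed by a right-justified 'x,xxx views/month' line."""
    lines: List[str] = []
    for title, views in ranked:
        lines.append(title)
        lines.append(f"{views:,} views/month".rjust(width))
    return lines
```

Because the inputs are plain dictionaries, a batch job could precompute the result per article and store it, which matches the "batch mode" remark in the log.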

Would someone who understands what is and is not possible with bots and MediaWiki please comment on the feasibility of this proposal? Thank you. 99.62.186.125 (talk) 04:49, 9 December 2009 (UTC)

It would be possible. The best method would probably be a toolserver account with a JavaScript function that retrieves the data from the toolserver once we set the rules for what is and is not related. βcommand 04:52, 9 December 2009 (UTC)

    Even better would be to have a statistics tab next to history. With graphs of metrics like readability, bytes size, html size, word count, number of references, incoming link count (backlinks), outgoing link count (links), traffic statistics, and possibly something like history flow. And maybe be able to compare to other pages. If the caching is done right it could be done on the toolserver. — Dispenser 05:41, 9 December 2009 (UTC)

    Perhaps 'related' is everything wikilinked and everything in the same categories? 99.62.186.125 (talk) 18:29, 9 December 2009 (UTC)

        The same algorithm as Related changes, I would say. Rich Farmbrough, 09:20, 15 December 2009 (UTC).
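
One reading of the "wikilinked plus same categories" definition suggested above, as a minimal sketch. The inputs `links_from` and `categories_of` are hypothetical dictionaries assumed for illustration, and this is not a reproduction of how the Related changes page selects pages.

```python
from typing import Dict, Set


def related_pages(
    article: str,
    links_from: Dict[str, Set[str]],     # hypothetical: article -> pages it links to
    categories_of: Dict[str, Set[str]],  # hypothetical: article -> its categories
) -> Set[str]:
    """Union of pages wikilinked from `article` and pages sharing a category with it."""
    linked = set(links_from.get(article, set()))
    own_categories = categories_of.get(article, set())
    same_category = {
        title
        for title, cats in categories_of.items()
        if cats & own_categories  # at least one category in common
    }
    return (linked | same_category) - {article}
```

The resulting set could then be ranked by view count; which definition of "related" is preferable remained an open question in the discussion.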

Note: per http://stats.grok.se/about, the statistics are not from the toolserver; they are from http://dammit.lt/wikistats/. Per Brion, the upstream data is from a Wikimedia-internal source.


Version: unspecified
Severity: enhancement
URL: http://en.wikipedia.org/wiki/Wikipedia:Bot_requests/Archive_32#Most_popular_related_articles

Details

Reference
bz21921

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 10:58 PM
bzimport set Reference to bz21921.
bzimport added a subscriber: Unknown Object (MLST).

jsalsman wrote:

*** Bug 6689 has been marked as a duplicate of this bug. ***

pdhanda wrote:

After some discussion on MediaWiki-General this is what I gathered.
Pageview tracking is disabled on Wikipedia for performance reasons.
Erik Zachte already does some analysis of Squid logs, but I am not sure about the accuracy, frequency, and technical details.

So I am adding Roan and Erik to this thread :)

jsalsman wrote:

(MediaWiki-General) rtmprus: domas: is bug 21921 with a round-robin iteration of article space more appropriate for the toolserver, dammit.lt, or somewhere else?
[4:12pm] Platonides: I don't see any work there for dammit.lt
[4:13pm] rtmprus: I don't want to pound its bandwidth if local copies of popularity logs aren't available on the toolserver
[4:13pm] Platonides: it would be a work for the toolserver
[4:13pm] Platonides: or http://stats.grok.se if he wants to do it
[4:13pm] Platonides: the toolserver already downloads copies
[4:13pm] Platonides: they are at a common folder
... rtmprus: oh good
Platonides: I don't completely understand the algorithm they propose, but it surely can be done
[4:17pm] rtmprus: someone suggested members of the same categories and wikilinks, and someone else suggested the Special:RecentChangesLinked algorithm

jsalsman wrote:

Thanks to mikelifeguard, https://wiki.toolserver.org/view/User-store says squid traffic logs live in /mnt/user-store/ on the toolserver.

jsalsman wrote:

[12:57] <mikelifeguard> jps: river said "A user has made this available in raw form at /mnt/user-store/stats"
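
A sketch of how raw view counts might be summed into per-article totals for a batch job. The thread does not state the file format, so this assumes space-separated lines of the form `project title count bytes`. The function name is hypothetical, and no file names under /mnt/user-store/stats are assumed: the caller supplies the lines.

```python
from collections import Counter
from typing import Iterable


def sum_page_views(lines: Iterable[str], project: str = "en") -> Counter:
    """Sum request counts per title for one project.

    Assumed line format (not confirmed in this thread):
        <project> <title> <count> <bytes>
    Lines that do not match the assumed shape are skipped.
    """
    totals: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # malformed or unexpected line
        proj, title, count, _size = fields
        if proj != project or not count.isdigit():
            continue
        totals[title] += int(count)
    return totals
```

Feeding it the lines of every file covering a month would give the "views/month" figure the proposal calls for, if the format assumption holds.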

Jdlrobson added a subscriber: Jdlrobson.

Should probably be closed out in the interest of being honest - this is not likely to get attention.