Random page in this category feature
Closed, Resolved · Public

Description

Author: hemanshu_desai

Description:
MediaWiki should have a "random page in this category" feature.


Version: unspecified
Severity: enhancement

bzimport added a project: MediaWiki-Special-pages. Via Conduit · Nov 21 2014, 8:29 PM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz2170.
bzimport created this task. Via Legacy · May 14 2005, 12:01 PM
Bawolff added a comment. Via Conduit · Jan 26 2006, 12:12 AM

Wouldn't [[Special:Random/Category]] do that?

bzimport added a comment. Via Conduit · Jan 26 2006, 2:06 PM

robchur wrote:

No. That retrieves a random category. The request is for a way of generating a
random page from a specified category.

bzimport added a comment. Via Conduit · Mar 31 2006, 1:21 PM

Wiki.Melancholie wrote:

*** Bug 5399 has been marked as a duplicate of this bug. ***

bzimport added a comment. Via Conduit · Oct 5 2006, 10:12 AM

anthony.bentley wrote:

My thoughts are that this could be done either by adding arguments to
SpecialRandompage or by creating a new special page for this purpose.
SpecialRandompage currently takes the namespace in which it searches as
an argument.
If carrying out a search within a category, then the namespace argument
shouldn't be required, as the search would always be over the main
namespace. I'm not aware that users etc. can assign themselves to categories?
Would this only be required over the main namespace, or is there a good
reason for allowing a choice of namespace?
Is category (cl_from) the only index that it's worth filtering on, or is it
worth having additional functionality to choose the index?

bzimport added a comment. Via Conduit · Apr 14 2007, 12:20 PM

badock wrote:

I had the same idea, but with Portals instead of Categories.

bzimport added a comment. Via Conduit · May 31 2007, 12:06 PM

dan.bolser wrote:

The DPL extension can do this (random n pages in category X).

DPL should be a part of MediaWiki :D

bzimport added a comment. Via Conduit · Nov 11 2007, 12:04 AM

ayg wrote:

This probably needs a cl_random column. Marking schema-change.

vvv added a comment. Via Conduit · Nov 11 2007, 12:34 PM

Fixed in r27380.

bzimport added a comment. Via Conduit · Nov 11 2007, 2:26 PM

dan.bolser wrote:

How about some docs as to how the feature works? Should I add a 'documentation' bug?

bzimport added a comment. Via Conduit · Nov 11 2007, 5:01 PM

ayg wrote:

Note, the patch might not be acceptable for efficiency concerns. It should stay in the software but might not be enabled on Wikimedia sites.

bzimport added a comment. Via Conduit · Jan 19 2009, 2:22 PM

Wiki.Melancholie wrote:

*** Bug 17068 has been marked as a duplicate of this bug. ***

bzimport added a comment. Via Conduit · Jan 27 2009, 12:32 AM

Wiki.Melancholie wrote:

As this is not fixed for Wikimedia wiki, a nice toolserver link:

bzimport added a comment. Via Conduit · Sep 11 2009, 5:59 AM

GregUbben wrote:

This new user script also works well for this:

bzimport added a comment. Via Conduit · Apr 13 2010, 11:57 AM

happy.melon.wiki wrote:

*** Bug 23181 has been marked as a duplicate of this bug. ***

Platonides added a comment. Via Conduit · Apr 15 2010, 11:09 AM

Vasiliev's implementation was reverted in r27436.

However, I don't think we can make it faster without adding a category_random; as it stands, the query seems quite good:

EXPLAIN SELECT page_namespace, page_title FROM page USE INDEX(page_random) JOIN categorylinks ON page_id = cl_from WHERE page_is_redirect = 0 AND page_random >= 0.15564 AND cl_to = 'GFDL' ORDER BY page_random LIMIT 1;

Both select_types are SIMPLE:

+---------------+--------+-------------+---------+---------------+--------------------------+
| table         | type   | key         | key_len | ref           | Extra                    |
+---------------+--------+-------------+---------+---------------+--------------------------+
| page          | range  | page_random | 8       | NULL          | Using where              |
| categorylinks | eq_ref | cl_from     | 261     | page_id,const | Using where; Using index |
+---------------+--------+-------------+---------+---------------+--------------------------+

It would change to "Using temporary; Using filesort" if we weren't using a LIMIT, but that's not the case.
Accessing the pages in the category is O(1); the problem is that for every result it needs to go to the page table to read page_random. For large categories, the worst case would mean checking thousands of entries.
My testing shows that in practice it runs in a fraction of a second, probably because of the index and the uniformly distributed random numbers.
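
For readers who want to try the query shape discussed above, here is a minimal sketch using Python's built-in sqlite3 module instead of MySQL. The table and column names follow the query in the comment; the index name `idx_page_random`, the helper names `make_db` and `random_page_in_category`, and the wrap-around retry are assumptions added for illustration. SQLite has no `USE INDEX` hint, so it is omitted, and this is not the code from r27380.

```python
import random
import sqlite3


def make_db():
    """Create an in-memory database with a cut-down page/categorylinks schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER NOT NULL,
            page_title TEXT NOT NULL,
            page_is_redirect INTEGER NOT NULL DEFAULT 0,
            page_random REAL NOT NULL
        );
        CREATE INDEX idx_page_random ON page (page_random);
        CREATE TABLE categorylinks (
            cl_from INTEGER NOT NULL,
            cl_to TEXT NOT NULL,
            PRIMARY KEY (cl_from, cl_to)
        );
    """)
    return conn


# Same shape as the query in the comment, minus the MySQL-only index hint.
QUERY = """
    SELECT page_namespace, page_title
    FROM page
    JOIN categorylinks ON page_id = cl_from
    WHERE page_is_redirect = 0
      AND page_random >= ?
      AND cl_to = ?
    ORDER BY page_random
    LIMIT 1
"""


def random_page_in_category(conn, category, r=None):
    """Return (namespace, title) of a pseudo-random page in `category`, or None."""
    if r is None:
        r = random.random()
    row = conn.execute(QUERY, (r, category)).fetchone()
    if row is None:
        # Assumed wrap-around: nothing at or above r, so retry from 0.
        row = conn.execute(QUERY, (0.0, category)).fetchone()
    return row
```

Passing `r` explicitly makes the result reproducible; leaving it out draws a fresh random threshold on each call.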

bzimport added a comment. Via Conduit · Apr 15 2010, 3:08 PM

ayg wrote:

AFAICT, that will have to scan the entire page_random index in the worst case, e.g., if there are no actual pages in the category. You left out part of the EXPLAIN -- this is the full thing for me on enwiki (on toolserver).

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: range
possible_keys: page_random
          key: page_random
      key_len: 8
          ref: NULL
         rows: 10001064
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: categorylinks
         type: eq_ref
possible_keys: cl_from,cl_timestamp,cl_sortkey
          key: cl_from
      key_len: 261
          ref: enwiki.page.page_id,const
         rows: 1
        Extra: Using where; Using index

Note rows: 10001064. Try running that on a large database with 'GFDL' replaced by 'Nonexistent category' and you'll see it takes forever. It's O(N) in number of pages in the worst case, only acceptable for very small sites.
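
The worst case described above can be illustrated with a toy model in plain Python. It is only a sketch of the access pattern (walk the page_random index in order, probe category membership for each row), not a measurement of MySQL; the function name `rows_examined` and the data layout are hypothetical, and the redirect filter is left out for brevity.

```python
import bisect


def rows_examined(pages, category_members, r):
    """Simulate the index walk and count how many page rows are touched.

    pages: list of (page_random, page_id) tuples, sorted by page_random.
    category_members: set of page_ids that belong to the category.
    r: the random threshold (the 0.15564 in the query above).
    Returns (page_id or None, number of page rows examined).
    """
    # Jump to the first row with page_random >= r, as a range scan would.
    start = bisect.bisect_left(pages, (r,))
    examined = 0
    for _page_random, page_id in pages[start:]:
        examined += 1
        # Stands in for the eq_ref lookup into categorylinks.
        if page_id in category_members:
            return page_id, examined
    return None, examined


pages = [(i / 1000, i) for i in range(1000)]

# Dense category: a hit turns up after a handful of rows.
print(rows_examined(pages, set(range(0, 1000, 10)), 0.0015))

# Empty ("Nonexistent") category: every remaining row is examined.
print(rows_examined(pages, set(), 0.0))
```

In this model the cost depends on how far the walk must go before it meets a category member, which is why a missing or very sparse category degrades to a scan over all pages.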

Platonides added a comment. Via Conduit · Apr 15 2010, 9:38 PM

Yes, I tried to fit it into Bugzilla's width.
Hmm. You are right. Is it checking the page table before categorylinks? If it checked categorylinks first, it should be immediate when there are no pages in the category, but O(N) in the number of pages in the category otherwise.
MySQL should have implemented a "choose a random row matching this" :(

bzimport added a comment. Via Conduit · Apr 16 2010, 12:01 AM

ayg wrote:

If it checks category first, it can't use the page_random index, so it's O(N log N) in the size of the category to sort its contents. You may as well skip the page table join and ORDER BY RAND() in that case.

I don't think it would have been trivial for MySQL to implement efficient "pick a random row" without some kind of special index. In any event, they don't, so we need cl_random if we really want this enough.
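
To make the cl_random idea concrete, here is a minimal sketch under stated assumptions, again in Python with sqlite3 rather than MySQL. The `cl_random` column, the composite index `idx_cl_to_random`, and the helper names are hypothetical; the thread does not show the actual schema change, and the redirect check against the page table is omitted here.

```python
import random
import sqlite3


def make_categorylinks(conn, links):
    """Create categorylinks with a hypothetical cl_random column and load rows."""
    conn.executescript("""
        CREATE TABLE categorylinks (
            cl_from INTEGER NOT NULL,
            cl_to TEXT NOT NULL,
            cl_random REAL NOT NULL,  -- hypothetical column, not in the thread's schema
            PRIMARY KEY (cl_from, cl_to)
        );
        -- Composite index so the lookup is a seek within one category.
        CREATE INDEX idx_cl_to_random ON categorylinks (cl_to, cl_random);
    """)
    conn.executemany("INSERT INTO categorylinks VALUES (?, ?, ?)", links)


PICK = """
    SELECT cl_from
    FROM categorylinks
    WHERE cl_to = ? AND cl_random >= ?
    ORDER BY cl_random
    LIMIT 1
"""


def random_member_via_cl_random(conn, category, r=None):
    """Return the page_id of a pseudo-random member of `category`, or None."""
    if r is None:
        r = random.random()
    row = conn.execute(PICK, (category, r)).fetchone()
    if row is None:
        # Assumed wrap-around when no member has cl_random >= r.
        row = conn.execute(PICK, (category, 0.0)).fetchone()
    return None if row is None else row[0]
```

With an index on (cl_to, cl_random), the lookup touches only rows of the requested category, so an empty category returns immediately instead of walking the whole page_random index.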

IAlex added a comment. Via Conduit · Nov 7 2010, 1:56 PM

*** This bug has been marked as a duplicate of bug 15824 ***

Platonides added a comment. Via Conduit · Nov 7 2010, 3:13 PM

The reason for closing bug 15824 was the existence of http://www.mediawiki.org/wiki/Extension:RandomInCategory

Platonides added a comment. Via Conduit · Nov 7 2010, 3:14 PM

*** Bug 15824 has been marked as a duplicate of this bug. ***

Technical13 added a comment. Via Conduit · May 23 2013, 11:29 AM

Either there is no documentation for this fix or the fix was never reinstated after it was reverted. http://www.mediawiki.org/wiki/Help_talk:Random_page/Archive_1 It is being asked for by others, and now I've found need of something like this myself on en.wikipedia. There is a banner for Today's Article For Improvement that is currently populated using a bot and a lot of "template" style pages with less than optimal code that could greatly be simplified if I could use a [[:Special:Random/Category:This_weeks_TAFIs]] to pick a random article from a category for the week. The only "extra" that I would ask is that the page would be picked on page load instead of clicking on the link so that the link when using the "piping trick" would show the name of the article it was going to take you to. Can this be done?

Technical13 added a comment. Via Conduit · May 24 2013, 12:13 AM

(In reply to comment #22)

> Either there is no documentation for this fix or the fix was never reinstated
> after it was reverted.
> http://www.mediawiki.org/wiki/Help_talk:Random_page/Archive_1 It is being
> asked for by others, and now I've found need of something like this myself on
> en.wikipedia. There is a banner for Today's Article For Improvement that is
> currently populated using a bot and a lot of "template" style pages with less
> than optimal code that could greatly be simplified if I could use a
> [[:Special:Random/Category:This_weeks_TAFIs]] to pick a random article from a
> category for the week. The only "extra" that I would ask is that the page
> would be picked on page load instead of clicking on the link so that the link
> when using the "piping trick" would show the name of the article it was going
> to take you to. Can this be done?

http://en.wikipedia.org/wiki/Wikipedia_talk:Today%27s_articles_for_improvement#Teahouse_TAFI_banner is the link to the full discussion for using this feature.

Legoktm added a comment. Via Conduit · May 24 2013, 1:19 AM

(In reply to comment #22)

> Either there is no documentation for this fix or the fix was never reinstated
> after it was reverted.

No, the extension in comment 20 was simply never deployed to WMF sites.

Technical13 added a comment. Via Conduit · May 24 2013, 1:48 AM

(In reply to comment #24)

> (In reply to comment #22)
> > Either there is no documentation for this fix or the fix was never reinstated
> > after it was reverted.
>
> No, the extension in comment 20 was simply never deployed to WMF sites.

Can this bug be re-opened since there was no actual implemented fix?

Legoktm added a comment. Via Conduit · May 24 2013, 1:51 AM

(In reply to comment #25)

> Can this bug be re-opened since there was no actual implemented fix?

No, this bug (implementing such a feature in MediaWiki) is fixed. If you want it deployed on enwiki or another WMF site, file a new bug under the Wikimedia category.
