
Results of Disambiguator extension are not complete
Closed, Resolved, Public

Description

Author: taste1at

Description:
On the English Wikipedia, the pages Special:DisambiguationPages and Special:DisambiguationPageLinks are incomplete: they give only 1000 results.

In particular, I want to point out that:

http://en.wikipedia.org/w/index.php?title=Special:PagesWithProp/disambiguation

gives exactly the same information as

http://en.wikipedia.org/w/index.php?title=Special:DisambiguationPages

But Special:DisambiguationPages gives only 1000 results and is served from a cache, while Special:PagesWithProp/disambiguation gives all results and is not cached. That makes Special:DisambiguationPages somewhat pointless.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=50979

Details

Reference
bz50832

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:49 AM
bzimport set Reference to bz50832.
bzimport created this task. Jul 5 2013, 8:04 PM

This isn't actually specific to Disambiguator. The cap is set by the wgQueryCacheLimit config var. For example, the Redirects list is also limited to 1000 results:
http://en.wikipedia.org/wiki/Special:ListRedirects

This config var is set to 1000 for enwiki, 2000 for dewiki, and 5000 for all other wikis. So any QueryPages that use caching (i.e. are marked as expensive on a wiki using MiserMode) are capped at the number of results set by wgQueryCacheLimit.
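
To make the cap concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not MediaWiki's code: the function and constant names are hypothetical, and the only facts taken from this comment are the per-wiki values and the "expensive plus MiserMode" condition.

```python
# Toy model of the result cap; NOT MediaWiki's actual implementation.
# Per-wiki values are the ones quoted in this comment; all names are hypothetical.
QUERY_CACHE_LIMIT = {"enwiki": 1000, "dewiki": 2000}
DEFAULT_QUERY_CACHE_LIMIT = 5000


def cached_result_count(wiki: str, total_rows: int, expensive: bool, miser_mode: bool) -> int:
    """Return how many rows a query page could list under this simplified model."""
    if expensive and miser_mode:
        # Cached query pages only store up to the configured number of rows.
        limit = QUERY_CACHE_LIMIT.get(wiki, DEFAULT_QUERY_CACHE_LIMIT)
        return min(total_rows, limit)
    # Uncached pages run the query live and can list every row.
    return total_rows
```

Under this model, a page marked as expensive on enwiki lists at most 1000 rows no matter how many match, while the same page marked as not expensive lists all of them.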

Unfortunately, this also limits the API results. I'll file a bug about increasing the limit for enwiki.

The other option would be to mark Special:DisambiguationPages as not being expensive. Since it joins the page and page_props tables, I imagine it is at least slightly expensive, but it may be worth doing some profiling on it.
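
For anyone who wants to see the shape of that join, here is a rough sketch using Python's sqlite3 with an in-memory database. The schema is cut down to a few columns, the sample rows are made up, and the ORDER BY is my assumption; the query the extension actually runs may differ.

```python
import sqlite3

# In-memory stand-in for the two tables; heavily simplified schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT);
""")
# Made-up sample data: pages 1 and 3 carry the 'disambiguation' page prop.
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?)",
    [(1, 0, "Mercury"), (2, 0, "Mercury_(planet)"), (3, 0, "Java")],
)
conn.executemany(
    "INSERT INTO page_props VALUES (?, ?, ?)",
    [(1, "disambiguation", ""), (3, "disambiguation", "")],
)


def disambiguation_pages(conn, limit, offset=0):
    """List titles of pages that have the 'disambiguation' page prop."""
    rows = conn.execute(
        """
        SELECT page_title
        FROM page
        JOIN page_props ON pp_page = page_id
        WHERE pp_propname = 'disambiguation'
        ORDER BY page_id
        LIMIT ? OFFSET ?
        """,
        (limit, offset),
    ).fetchall()
    return [row[0] for row in rows]
```

Whether a query like this is cheap on the real tables depends on the available indexes and the sort order, which is what profiling would need to establish.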

Change 73008 had a related patch set uploaded by Kaldari:
Optimizing Special:DisambiguationPages query to avoid filesort

https://gerrit.wikimedia.org/r/73008

Change 73008 merged by Anomie:
Optimizing Special:DisambiguationPages query to avoid filesort

https://gerrit.wikimedia.org/r/73008

Ryan Kaldari: This issue was assigned to you a while ago.
Could you please provide a status update and let us know whether you are still working (or still plan to work) on this issue?
If you no longer plan to work on it, should the assignee be set back to default? Thanks.

matmarex closed this task as Resolved. Feb 2 2015, 6:53 PM
matmarex added a subscriber: matmarex.

This is fixed now, probably by Kaldari's patch. I've been able to page through all the 256,165 disambiguation pages on the English Wikipedia: https://en.wikipedia.org/w/index.php?title=Special:DisambiguationPages&limit=5000&offset=255000.
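
For reference, paging like this can be scripted by stepping the offset parameter. The Python sketch below only builds the URLs and does not fetch them; the function name is hypothetical, and the title, limit and offset parameters are the ones visible in the link above.

```python
from urllib.parse import urlencode

BASE_URL = "https://en.wikipedia.org/w/index.php"


def page_urls(total, limit=5000):
    """Yield one URL per page of results, stepping the offset by `limit`."""
    for offset in range(0, total, limit):
        query = urlencode(
            {"title": "Special:DisambiguationPages", "limit": limit, "offset": offset},
            safe=":",  # keep the colon in the page title readable
        )
        yield f"{BASE_URL}?{query}"


urls = list(page_urls(256165))
```

With 256,165 results and 5000 per page, this gives 52 URLs, and the last one has offset=255000, matching the link above.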