Implement a special page to show items with the most sitelinks
Open, Low, Public

Description

A MediaWiki install has special pages (https://www.wikidata.org/wiki/Special:SpecialPages) such as "Most linked-to pages" and "Pages with the most interwikis". The Wikibase extension should add a special page for items with the most sitelinks.
Rough query:
SELECT ips_item_id, COUNT(ips_item_id) AS sitelinks FROM wb_items_per_site GROUP BY ips_item_id ORDER BY sitelinks DESC LIMIT 100;


Version: unspecified
Severity: minor
URL: https://www.wikidata.org/wiki/Special:SpecialPages

bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz46217.

Maarten: You mean we should have such a special page on Wikidata? Or on the client (Wikipedia, Wikivoyage, Commons, ...)?

I was thinking about Wikidata itself.

Change 94830 had a related patch set uploaded by Bene:
(bug 46217) Implement a special page to show items with the most sitelinks

https://gerrit.wikimedia.org/r/94830

The suggested patch needs to be changed to allow more efficient SQL (see bug 40157 and bug 58032).

Change 94830 abandoned by Bene:
(bug 46217) Implement a special page to show items with the most sitelinks

Reason:
this is not likely to get implemented this way

https://gerrit.wikimedia.org/r/94830

Lydia_Pintscher removed a subscriber: Unknown Object (MLST).
Multichill set Security to None.

Looks like I worked on this months ago, so I might as well finish it.

Change 181902 had a related patch set uploaded (by Multichill):
Implement a special page to show items with the most sitelinks

https://gerrit.wikimedia.org/r/181902

Patch-For-Review

I took a different approach than Bene: I extended the standard QueryPage class. I would appreciate input. If this approach works, I plan to add some more special pages and maybe rewrite Special:UnconnectedPages.

Change 181902 had a related patch set uploaded (by Siebrand):
Implement a special page to show items with the most sitelinks

https://gerrit.wikimedia.org/r/181902

Patch-For-Review

Restricted Application added a subscriber: Aklapper. Jul 27 2015, 10:35 AM

Change 181902 abandoned by Multichill:
Implement a special page to show items with the most sitelinks

Reason:
Too frustrating, not going to invest time in this any more.

https://gerrit.wikimedia.org/r/181902

Multichill removed Multichill as the assignee of this task. Aug 7 2015, 6:36 PM
Multichill added a subscriber: Ladsgroup.
Multichill removed a project: Patch-For-Review.

Abandoned. Too frustrating, not going to invest time in this any more. Up for grabs.

The major problem with both patches is that they use COUNT, JOIN and GROUP BY, which does not scale to tables this big. What we do now is store the number of sitelinks per item in a "wb-sitelinks" page property (in the page_props table). The numeric pp_sortkey field can then be read directly, and possibly used for ordering (this needs checking). This should already be deployed, as far as I know.

So whoever wants to pick this up, please pick one of the existing patches, reopen it and change the SQL query to query pp_sortkey in page_props instead.
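
To make the suggestion concrete, here is a minimal sketch that runs both query shapes against a toy SQLite database. The table layouts are heavily simplified, the helper names are hypothetical, and the toy data assumes that a page ID equals its item ID (a real query would need to map pp_page back to the item). It only illustrates that ordering by pp_sortkey gives the same ranking as the COUNT/GROUP BY query; it says nothing about how either performs on the production tables.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with simplified versions of both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wb_items_per_site (
            ips_item_id INTEGER, ips_site_id TEXT, ips_site_page TEXT
        );
        CREATE TABLE page_props (
            pp_page INTEGER, pp_propname TEXT, pp_value TEXT, pp_sortkey REAL
        );
        """
    )
    sitelinks = [
        (1, "enwiki", "A"), (1, "dewiki", "A"), (1, "frwiki", "A"),
        (2, "enwiki", "B"),
        (3, "enwiki", "C"), (3, "nlwiki", "C"),
    ]
    conn.executemany("INSERT INTO wb_items_per_site VALUES (?, ?, ?)", sitelinks)

    # Assumption for this toy data only: page ID == item ID.
    counts = {}
    for item_id, _site, _page in sitelinks:
        counts[item_id] = counts.get(item_id, 0) + 1
    conn.executemany(
        "INSERT INTO page_props VALUES (?, 'wb-sitelinks', ?, ?)",
        [(item_id, str(n), float(n)) for item_id, n in counts.items()],
    )
    # An unrelated property, to show that the propname filter matters.
    conn.execute("INSERT INTO page_props VALUES (2, 'wb-claims', '10', 10.0)")
    return conn


def most_sitelinks_by_count(conn, limit=100):
    """The rough query from the task description (aggregates on every run)."""
    rows = conn.execute(
        "SELECT ips_item_id, COUNT(ips_item_id) AS sitelinks "
        "FROM wb_items_per_site GROUP BY ips_item_id "
        "ORDER BY sitelinks DESC LIMIT ?",
        (limit,),
    )
    return [(item_id, int(n)) for item_id, n in rows]


def most_sitelinks_by_page_prop(conn, limit=100):
    """Read the precomputed count from page_props instead of aggregating."""
    rows = conn.execute(
        "SELECT pp_page, pp_sortkey FROM page_props "
        "WHERE pp_propname = 'wb-sitelinks' "
        "ORDER BY pp_sortkey DESC LIMIT ?",
        (limit,),
    )
    return [(page_id, int(n)) for page_id, n in rows]
```

Calling most_sitelinks_by_count(build_toy_db()) and most_sitelinks_by_page_prop(build_toy_db()) returns the same list for this data, with item 1 first.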

Change 181902 restored by JanZerebecki:
Implement a special page to show items with the most sitelinks

Reason:
Restoring to make it possible for Ricordisamoa to work on it.

https://gerrit.wikimedia.org/r/181902

With https://gerrit.wikimedia.org/r/232698 I am exploring a different approach: it would be possible to use something like https://www.wikidata.org/wiki/Special:PagesWithProp/wb-sitelinks?sortbyvalue=1.
The same would work with wb-claims, etc.
