listing of unprotected page with list=allpages
Open, Normal
Public

Description

There is no way to list only unprotected pages with the API.
Maybe add a 'none' value to 'apprlevel' to allow listing these pages.
Thanks.


Version: unspecified
Severity: enhancement

Details

Reference
bz18770
bzimport raised the priority of this task from to Normal.
bzimport set Reference to bz18770.
bzimport added a subscriber: Unknown Object (MLST).

Bryan.TongMinh wrote:

Not really trivial to accomplish this. LEFT JOIN against page_restrictions?

Reedy added a comment.Jan 6 2010, 11:29 PM

Left join the restriction table, then only output the page if a restriction column is NULL for that row?
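
The LEFT JOIN idea from the two comments above can be sketched as follows. This is only an illustration under stated assumptions, not the query that ended up in the API module: it uses an in-memory SQLite database with a cut-down version of the `page` and `page_restrictions` tables, and the helper names `make_demo_db` and `list_unprotected` are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the page / page_restrictions tables.

    Assumption: only the columns needed for the join are modelled here.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE page_restrictions (
            pr_id INTEGER PRIMARY KEY,
            pr_page INTEGER,
            pr_type TEXT,
            pr_level TEXT
        );
        INSERT INTO page VALUES
            (1, 0, 'Apple'), (2, 0, 'Banana'), (3, 0, 'Cherry'), (4, 1, 'Apple');
        -- 'Banana' is protected twice (edit and move); the others are unprotected.
        INSERT INTO page_restrictions VALUES
            (1, 2, 'edit', 'sysop'), (2, 2, 'move', 'sysop');
        """
    )
    return conn


def list_unprotected(conn, namespace=0):
    """Return titles in `namespace` that have no page_restrictions row."""
    sql = """
        SELECT page_title
        FROM page
        LEFT JOIN page_restrictions ON pr_page = page_id
        WHERE page_namespace = ?
          AND pr_id IS NULL          -- no matching restriction row was joined
        ORDER BY page_title
    """
    return [row[0] for row in conn.execute(sql, (namespace,))]


if __name__ == "__main__":
    print(list_unprotected(make_demo_db()))
```

A page with several restriction rows produces several joined rows, but none of them has a NULL `pr_id`, so the page is filtered out entirely rather than listed once per restriction.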

Reedy added a comment.Jan 11 2010, 1:24 PM

[13:20:06] <RoanKattouw> OK so about the unprotected pages in list=allpages thing
[13:20:19] <RoanKattouw> The LEFT JOIN approach you and Bryan came up with is the right one
[13:21:00] <RoanKattouw> I'm not 100% sure about its efficiency and scalability to Wikimedia levels, but I think it should be OK because protected pages are scarce

Reedy added a comment.Jan 11 2010, 2:15 PM

[13:35:33] <RoanKattouw> domas: Are queries like these OK to run on the cluster? http://pastebin.com/m72223c4b
[13:35:43] <RoanKattouw> EXPLAIN says "using where" but I think it's lying
[13:36:29] <domas> depends
[13:36:35] <RoanKattouw> Also the row count is huge but I don't believe it'll really examine that much rows for a query with WHERE foo=const ORDER BY bar LIMIT 51 when there's an index on (foo,bar)
[13:36:42] <RoanKattouw> *many
[13:36:48] <domas> if all pages have restrictions, this gets really expensive
[13:36:49] <domas> :)
[13:36:53] <Reedy> heh
[13:37:01] <RoanKattouw> Yeah it was kinda based on the assumption that protected pages are scarce
[13:37:29] <RoanKattouw> Of course the "get pages with protection X" variant (which already runs on the cluster) joins the tables in reverse order
[13:37:56] <RoanKattouw> ... hopefully
[13:37:59] RoanKattouw checks
[13:41:15] <RoanKattouw> Hm the "get pages with protection X" query does indeed join page_restrictions first at the cost of filesorting, but I guess that's the lesser of two evils unless I totally rewrite this module to page by page ID
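
The "WHERE foo=const ORDER BY bar LIMIT 51" shape mentioned in the log can be illustrated with a small sketch. The pastebin query itself is not reproduced in this task, so everything here is an assumption: `foo` and `bar` are taken to be `page_namespace` and `page_title`, the index is modelled as `(page_namespace, page_title)`, LIMIT 51 is read as "page size of 50 plus one extra row to detect whether more results exist", and the helper names are hypothetical. SQLite stands in for the real database, so it says nothing about how MySQL would plan the query on the cluster.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in with an index on (page_namespace, page_title)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE INDEX name_title ON page (page_namespace, page_title);
        CREATE TABLE page_restrictions (
            pr_id INTEGER PRIMARY KEY,
            pr_page INTEGER,
            pr_level TEXT
        );
        INSERT INTO page VALUES
            (1, 0, 'A'), (2, 0, 'B'), (3, 0, 'C'), (4, 0, 'D'), (5, 0, 'E');
        INSERT INTO page_restrictions VALUES (1, 3, 'sysop');  -- 'C' is protected
        """
    )
    return conn


def unprotected_batch(conn, namespace, start_title="", limit=50):
    """Return (titles, continue_from) for one batch of unprotected pages.

    Fetches limit + 1 rows: the extra row is not returned, it only tells
    the caller where the next batch starts (None when there is no more).
    """
    rows = conn.execute(
        """
        SELECT page_title
        FROM page
        LEFT JOIN page_restrictions ON pr_page = page_id
        WHERE page_namespace = ?
          AND page_title >= ?
          AND pr_page IS NULL
        ORDER BY page_title
        LIMIT ?
        """,
        (namespace, start_title, limit + 1),
    ).fetchall()
    titles = [r[0] for r in rows]
    if len(titles) > limit:
        return titles[:limit], titles[limit]
    return titles, None


if __name__ == "__main__":
    conn = make_demo_db()
    batch, cont = unprotected_batch(conn, 0, limit=2)
    print(batch, cont)
    print(unprotected_batch(conn, 0, start_title=cont, limit=2))
```

The concern raised by domas shows up in this shape too: the more rows that carry restrictions, the more rows have to be read and discarded before the limit is reached.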

Reedy added a comment.Jan 11 2010, 2:28 PM

[14:18:32] <RoanKattouw> Reedy: Short summary: given that the existing appr* queries are apparently OK, this one should not be a problem