API should surface whether isMainPage is true for a page cheaply
Open, Needs Triage, Public

Description

There is no way via the API to determine whether the page you are looking at is the main page.

https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=&meta=&titles=Main+Page

This is important for clients to know as main pages tend to require special handling.

https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions%7Cpageimages%7Cinfo&meta=siteinfo&titles=San+Francisco&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser%7Ccontentmodel&inprop=protection%7Cwatchers&siprop=general
can be used for this purpose via meta=siteinfo, but this returns a lot of additional irrelevant information which needs parsing.

Proposal

Jdlrobson created this task. Aug 1 2017, 9:37 PM
Jdlrobson renamed this task from "API should surface whether isMainPage is true for a page" to "API should surface whether isMainPage is true for a page cheaply". Aug 1 2017, 10:51 PM
Jdlrobson updated the task description.
Anomie added a subscriber: Anomie. Aug 2 2017, 10:26 PM

There is no way via the API to determine whether the page you are looking at is the main page.

You already noted the workaround, although you hid it inside a more complicated query. You can fetch meta=siteinfo&siprop=general once and cache the value for comparison as needed.

I'm inclined to decline this, unless you can make a case that comparing strings client-side is somehow problematic.
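The workaround described above (fetch meta=siteinfo&siprop=general once, cache the value, compare titles client-side) could look roughly like the following. This is a minimal sketch, not an official client: the class name MainPageChecker, the injectable fetch argument and the User-Agent string are made up for illustration, and the title normalization only treats underscores as spaces, so it will not handle redirects or other title variants.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def _http_fetch(api_url, params):
    """Default fetcher: GET the API and decode the JSON body."""
    request = Request(
        api_url + "?" + urlencode(params),
        # Hypothetical User-Agent, replace with one identifying your client.
        headers={"User-Agent": "mainpage-check-sketch/0.1"},
    )
    with urlopen(request) as response:
        return json.load(response)


def _normalize(title):
    # Simplification (assumption): only underscores vs. spaces are handled.
    return title.replace("_", " ").strip()


class MainPageChecker:
    """Caches the wiki's main page title and compares titles against it."""

    def __init__(self, api_url, fetch=_http_fetch):
        self._api_url = api_url
        self._fetch = fetch
        self._mainpage = None

    def main_page_title(self):
        # Only the first call hits the API; later calls reuse the cached value.
        if self._mainpage is None:
            data = self._fetch(self._api_url, {
                "action": "query",
                "format": "json",
                "meta": "siteinfo",
                "siprop": "general",
            })
            self._mainpage = data["query"]["general"]["mainpage"]
        return self._mainpage

    def is_main_page(self, title):
        return _normalize(title) == _normalize(self.main_page_title())
```

Usage would be something like MainPageChecker("https://en.wikipedia.org/w/api.php").is_main_page("Main_Page"), with the siteinfo request made only on the first check.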

The main problem here is that RESTBase is stateless so we have to make that call on every page view. I don't know how expensive calling meta=siteinfo&siprop=general is but it returns a lot of useless and irrelevant information. If this is expensive, an alternative would be to extend siteinfo with a filter, e.g. meta=siteinfo&siprop=general&sipropname=mainpage, that only returns the mainpage field.

In addition to this, my feedback as an API consumer is that it's not very intuitive to find and I see this as a small detail that makes our content easier to work with. It doesn't sound like it would be a difficult thing to implement so why not? Title already has an isMainPage method and we'd only need to return the flag on the main page itself.

The main problem here is that RESTBase is stateless so we have to make that call on every page view.

RESTBase is also a cache. Can't it cache the needed information somehow?

I don't know how expensive calling meta=siteinfo&siprop=general is

Not very. It mostly just dumps a bunch of configuration variables, with a few message expansions for stuff like 'mainpage'.

In addition to this, my feedback as an API consumer is that it's not very intuitive to find and I see this as a small detail that makes our content easier to work with. It doesn't sound like it would be a difficult thing to implement so why not? Title already has an isMainPage method and we'd only need to return the flag on the main page itself.

It might make it more "intuitive" for you, but it adds complexity for everyone else.

The main problem here is that RESTBase is stateless so we have to make that call on every page view.

I know this is a bit late, but to follow up here the MCS node processes are not completely stateless. They do cache the siteinfo data in RAM.
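For illustration only, an in-process cache of that kind might look like the sketch below. It is not MCS's actual code (MCS is a Node service, and its implementation is not shown in this task); the class name SiteInfoCache, the loader callback and the one-hour expiry are all assumptions. The point is just that a long-running process serving many wikis can keep one siteinfo entry per domain in memory and refresh it occasionally.

```python
import time
from typing import Callable, Dict, Tuple


class SiteInfoCache:
    """Keeps one siteinfo dict per domain in memory, refreshed after a TTL."""

    def __init__(
        self,
        loader: Callable[[str], dict],
        ttl_seconds: float = 3600.0,  # assumed expiry, tune as needed
        clock: Callable[[], float] = time.monotonic,
    ):
        self._loader = loader  # hypothetical callback that fetches siteinfo
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, domain: str) -> dict:
        now = self._clock()
        entry = self._entries.get(domain)
        # Serve from memory while the entry is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        info = self._loader(domain)
        self._entries[domain] = (now, info)
        return info
```

With something like this, the siteinfo request happens once per domain per expiry window rather than on every page view.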

@Jdlrobson if this is done in the new summary API are we good here?

I still think this should be fixed in the MediaWiki API. It's basically the usual trade-off between the complexity Anomie is afraid of and the lack of intuitiveness for consumers of the API that I'm afraid of (although this is a much larger problem). It doesn't impact the summary API and MCS, which hide this complexity.

What we'd really want for the "return all the information about one page in one request" use case that seems to be somewhat typical of requests from the mobile teams is a new, non-query action that does exactly that: returns more or less all the public information about one page, with few additional options, caching enabled by default, and purging done by WikiPage::doPurge().