Create Special:PageData as a canonical entry point for machine readable page data.
Closed, Resolved (Public)

Description

Special:PageData serves as a data retrieval endpoint, as described by T161527: RFC: Canonical data URLs for machine readable page content. Its purpose is not to provide an API, but to offer a unified naming scheme for accessing machine readable page content via any API.

The initial implementation of Special:PageData should do the following:

  • From the request, extract a page title and a slot name:
    • Subpage syntax should be supported; e.g. Special:PageData/main/User:Foo means slot "main", title "User:Foo". If subpage syntax is used, both slot and title must be given.
    • Simple request parameters can be used; e.g. Special:PageData?title=User:Foo&slot=main means slot "main", title "User:Foo". The slot parameter is optional and defaults to "main".
    • If no title is given in the request, a message explaining the purpose of the special page should be shown. A form for entering title and slot would be nice, but is not initially required.
  • Check that the given slot is supported by the given page.
    • Until T107595: [RFC] Multi-Content Revisions is implemented, all pages support the "main" slot, and only the "main" slot. So for now, this can just check that the slot parameter is "main".
  • If the client sends an Accept header for content negotiation, check that the page's (default) serialization format (MIME type) is compatible with that header. If not, respond with status 406.
    • It should be possible to extend/override content negotiation using a hook.
  • Respond with an HTTP redirect (status 303) to the page's raw data.
    • The default target for the redirect is the action=raw API, i.e. $title->getFullUrl( [ 'action' => 'raw' ] ). The slot name can be added in the future. A setting should be provided that allows this to be changed to use the MediaWiki REST API.
    • Call a hook that allows modifying the redirect target. For instance, Wikibase may redirect to its own Special:EntityData instead of action=raw.
    • The redirect should be marked as non-cacheable for now. Caching the redirect would require web caches to vary on the Accept header.
  • Special:PageData should not be listed on Special:SpecialPages.
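
The request handling listed above could be sketched roughly as follows. This is an illustration in Python, not the MediaWiki (PHP) implementation: the names `parse_request`, `build_redirect` and `PageDataError`, the base URL, the 400 status for malformed or unsupported input, and the `Cache-Control` value are all assumptions made for the example; the task only specifies the 303 redirect, the "main" default, and that the redirect must not be cached.

```python
from urllib.parse import urlencode

# Until multi-content revisions exist, "main" is the only supported slot.
SUPPORTED_SLOTS = {"main"}


class PageDataError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def parse_request(subpage="", params=None):
    """Return (slot, title), or None when no title was given.

    Subpage form:   "main/User:Foo"       -> ("main", "User:Foo")
    Parameter form: {"title": "User:Foo"} -> ("main", "User:Foo")
    """
    params = params or {}
    if subpage:
        # Split on the first slash only, since titles may contain slashes.
        slot, sep, title = subpage.partition("/")
        if not sep or not slot or not title:
            # Status 400 is an assumption; the task does not name a code.
            raise PageDataError(400, "subpage syntax needs both slot and title")
    else:
        title = params.get("title", "")
        slot = params.get("slot", "main")  # slot is optional here
        if not title:
            return None  # caller shows the explanatory message instead
    if slot not in SUPPORTED_SLOTS:
        raise PageDataError(400, "unsupported slot: " + slot)
    return slot, title


def build_redirect(title, base_url="https://example.org/w/index.php"):
    """Return (status, headers) for a non-cacheable 303 redirect to action=raw."""
    location = base_url + "?" + urlencode({"title": title, "action": "raw"})
    # "no-store" is one way to mark the response non-cacheable (assumption).
    return 303, {"Location": location, "Cache-Control": "no-store"}
```

With these definitions, `parse_request(subpage="main/User:Foo")` yields `("main", "User:Foo")`, and `build_redirect("User:Foo")` yields status 303 with a `Location` ending in `?title=User%3AFoo&action=raw`. A hook for changing the redirect target, as the task requests, would sit between the two calls.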

Event Timeline

I have been working on this for the past couple of days. HTTP content negotiation is not possible without moving most of the LinkedData namespace classes into core, which I think is a good thing. @daniel: Do you think we should move HttpAcceptNegotiator, HttpAcceptParser, and EntityDataFormatProvider (with some modifications) to core? As far as I have checked, their code is mostly unrelated to Wikibase.

@Ladsgroup: HttpAcceptNegotiator and HttpAcceptParser can and should be moved to core, yes.

I don't think we need the equivalent of EntityDataFormatProvider in core: Special:PageData doesn't need to (and really cannot) use file name extensions, or API serialization formats, or format names.

All that Special:PageData needs is a list of supported MIME types for a given page. That list is provided by ContentHandler::getSupportedFormats() and can be checked using isSupportedFormat(). But perhaps we should even support just one format per page for now, the one returned by ContentHandler::getDefaultFormat(). If there is just one format, we may not even need HttpAcceptNegotiator for now.
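
To illustrate the single-format case: with only one candidate MIME type, the check reduces to asking whether the Accept header admits that type at all. The Python sketch below is a simplified illustration, not the HttpAcceptNegotiator algorithm; the function names `parse_accept` and `is_acceptable` are hypothetical, media-type parameters other than `q` are ignored, and treating a missing or empty header as "anything goes" is an assumption.

```python
def parse_accept(header):
    """Parse an Accept header into {media_range: q}, ignoring other parameters."""
    ranges = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges[media_range] = q
    return ranges


def is_acceptable(accept_header, default_format):
    """Return True if default_format is admitted by the Accept header.

    The most specific matching range decides: exact type, then "type/*",
    then "*/*". A range with q=0 explicitly rejects the format.
    """
    if not accept_header or not accept_header.strip():
        return True  # no negotiation requested (assumption)
    ranges = parse_accept(accept_header)
    fmt = default_format.lower()
    main_type = fmt.split("/", 1)[0]
    for candidate in (fmt, main_type + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate] > 0
    return False
```

For a wikitext page whose default format is `text/x-wiki`, a header of `text/*;q=0.8` is acceptable, while `text/html, application/json;q=0.5` is not, and the latter case would lead to the 406 response described in the task.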

Change 356121 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Move HttpAcceptNegotiator and HttpAcceptParser from Wikibase to core

https://gerrit.wikimedia.org/r/356121

@daniel: I made a patch for moving these classes for now. Once it is merged, I will give Special:PageData another try to see what I can do about it.

Change 356616 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Start a very basic version of Special:PageData

https://gerrit.wikimedia.org/r/356616

Change 356121 merged by jenkins-bot:
[mediawiki/core@master] Move HttpAcceptNegotiator and HttpAcceptParser from Wikibase to core

https://gerrit.wikimedia.org/r/356121

Change 356616 merged by jenkins-bot:
[mediawiki/core@master] Start a very basic version of Special:PageData

https://gerrit.wikimedia.org/r/356616

Change 358360 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Use target instead title in SpecialPageData

https://gerrit.wikimedia.org/r/358360

Change 358360 merged by jenkins-bot:
[mediawiki/core@master] Use "target" instead "title" as the param name in SpecialPageData

https://gerrit.wikimedia.org/r/358360

Change 358372 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Move HttpAccept* to libs

https://gerrit.wikimedia.org/r/358372

Change 358372 merged by jenkins-bot:
[mediawiki/core@master] Move HttpAccept* to libs

https://gerrit.wikimedia.org/r/358372

Change 358379 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/Wikibase@master] Drop HttpAccept* from Wikibase and use moved ones from core

https://gerrit.wikimedia.org/r/358379

Change 358379 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Drop HttpAccept* from Wikibase and use moved ones from core

https://gerrit.wikimedia.org/r/358379

Change 358540 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Make Special:PageData accept two-part subpage

https://gerrit.wikimedia.org/r/358540

Change 358540 merged by jenkins-bot:
[mediawiki/core@master] Make Special:PageData accept two-part subpage

https://gerrit.wikimedia.org/r/358540

We can call this done now. Feel free to reopen it if you think otherwise.

Given the large number of applications, a coding convention seems desirable for PageData pages.
The usual standards (JS, Lua, ...) could serve as that convention, with values possibly containing wikicode.
Do we need a coding convention that is limited but extensible, along with discussions on how to extend it?

How is Special:PageData used? How are such pages created, and how are they read?
Could these questions be answered in the Lua reference manual, or elsewhere?