
Reader reads a page offline
Closed, Resolved | Public | 5 Estimated Story Points

Description

"As a Reader, I want to get a page and its contents, so that I can read it whenever I want."

This is the most basic reading use case, retrieving the metadata about the page and the page text in HTML form in a single request. There are optimizations that work better when the reader is online and using a browser or native HTML widget that can load an HTML stream directly; see the related "Reader reads a page online" user story T234377.

GET /page/{title}/with_html

Returns the page as JSON. Slashes in the title must be escaped.
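As an illustration of the title escaping, here is a minimal sketch of building the request URL. It assumes "escaped" means percent-encoded, which the task does not state explicitly. The base URL and the helper name `page_with_html_url` are hypothetical.

```python
from urllib.parse import quote

# Hypothetical base URL; the task does not say where the API is mounted.
BASE_URL = "https://wiki.example.org/w/rest.php/v1"


def page_with_html_url(title: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL for /page/{title}/with_html."""
    # safe="" makes quote() percent-encode "/" too, so a subpage title
    # such as "Talk:Main_Page/Archive_1" stays in one path segment.
    # Assumption: percent-encoding is the intended form of escaping.
    escaped = quote(title, safe="")
    return f"{base_url}/page/{escaped}/with_html"
```

With `safe=""` the colon in a namespace prefix is encoded as well, which is harmless for a path segment but not required by anything in the description above.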

Payload: empty

Notable Request headers: none

Status:
200 – this is the page
304 – not modified; body should be empty
400 – this page cannot be rendered as HTML because of its content model
404 – page does not exist (never created or deleted)
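A client might map the four documented status codes to outcomes along these lines. This is only a sketch: the enum and function names are hypothetical, and treating any other status as an error is an assumption, since the task lists only these four.

```python
from enum import Enum


class PageOutcome(Enum):
    OK = "ok"                          # 200: body contains the page
    NOT_MODIFIED = "not_modified"      # 304: body should be empty
    NOT_RENDERABLE = "not_renderable"  # 400: content model has no HTML rendering
    NOT_FOUND = "not_found"            # 404: never created, or deleted


_STATUS_TO_OUTCOME = {
    200: PageOutcome.OK,
    304: PageOutcome.NOT_MODIFIED,
    400: PageOutcome.NOT_RENDERABLE,
    404: PageOutcome.NOT_FOUND,
}


def classify_status(status: int) -> PageOutcome:
    """Translate an HTTP status code into one of the documented outcomes."""
    try:
        return _STATUS_TO_OUTCOME[status]
    except KeyError:
        # Assumption: statuses outside the documented list are unexpected.
        raise ValueError(f"unexpected status {status}") from None
```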

Notable response headers: none

Body: JSON

  • id: numeric id of the page
  • key: prefixed DB key of the page, like "Talk:Main_Page"
  • title: title for display, like "Talk:Main Page"
  • latest: latest revision of the page, an object with these properties:
    • id: revision ID
    • timestamp: revision timestamp
  • license: preferred license of the page, an object with these properties:
    • spdx: SPDX code
    • url: URL for the license
    • title: title of the license
  • other_licenses: array of objects with {spdx, url, title} for other licenses the page is available under
  • content_model: content model for the main slot of the page
  • html: Parsoid HTML for the main slot of the page and any other slots that render in-page
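The body described above could be parsed into typed objects roughly as follows. This is a sketch, not the implementation: the class names, the `parse_page` helper, and every value in `SAMPLE_BODY` are made up for illustration. The timestamp is kept as a plain string because the task does not specify its format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class License:
    spdx: str
    url: str
    title: str


@dataclass
class Revision:
    id: int
    timestamp: str  # format not specified in the task; kept as-is


@dataclass
class Page:
    id: int
    key: str
    title: str
    latest: Revision
    license: License
    other_licenses: List[License]
    content_model: str
    html: str


def _license(d: dict) -> License:
    return License(spdx=d["spdx"], url=d["url"], title=d["title"])


def parse_page(body: str) -> Page:
    """Parse a 200 response body into a Page."""
    data = json.loads(body)
    latest = data["latest"]
    return Page(
        id=data["id"],
        key=data["key"],
        title=data["title"],
        latest=Revision(id=latest["id"], timestamp=latest["timestamp"]),
        license=_license(data["license"]),
        other_licenses=[_license(d) for d in data["other_licenses"]],
        content_model=data["content_model"],
        html=data["html"],
    )


# Illustrative payload only; all values are invented.
SAMPLE_BODY = json.dumps({
    "id": 1,
    "key": "Talk:Main_Page",
    "title": "Talk:Main Page",
    "latest": {"id": 100, "timestamp": "2019-10-01T21:17:00Z"},
    "license": {"spdx": "CC-BY-SA-3.0",
                "url": "https://example.org/license",
                "title": "Example license"},
    "other_licenses": [],
    "content_model": "wikitext",
    "html": "<p>Hello</p>",
})
```

Calling `parse_page(SAMPLE_BODY)` yields a `Page` whose `key` is the prefixed DB key and whose `title` is the display form, mirroring the distinction drawn in the field list.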

Event Timeline

I'm not crazy about "with_html". Happy to get suggestions for a better way to express this.

eprodromou updated the task description. Oct 1 2019, 9:17 PM
eprodromou updated the task description. Oct 11 2019, 1:54 AM
eprodromou updated the task description. Oct 11 2019, 2:13 AM
eprodromou updated the task description. Oct 28 2019, 8:31 PM

I removed the caching headers, and brought the output in line with the schema.

eprodromou updated the task description. Nov 10 2019, 7:16 PM

I added the content_model output.

eprodromou updated the task description. Nov 12 2019, 8:05 PM
eprodromou updated the task description. Dec 4 2019, 6:09 PM
WDoranWMF set the point value for this task to 5. Jan 7 2020, 7:02 PM

Change 565408 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/core@master] REST: /page/{title}/(with_)?html endpoint backed by RESTBase.

https://gerrit.wikimedia.org/r/565408

Change 565408 merged by jenkins-bot:
[mediawiki/core@master] REST: /page/{title}/{bare,html,with_html} endpoints backed by RESTBase.

https://gerrit.wikimedia.org/r/565408

Parsoid/PHP hits composer.json in MediaWiki this week

daniel added a subscriber: daniel. Mar 2 2020, 11:48 AM

This is tracked as "waiting for review", but I see no open patches. Is this done? What is missing?

eprodromou closed this task as Resolved. Mar 11 2020, 6:09 PM
eprodromou added a subscriber: Pchelolo.

Looks good. Thanks, @Pchelolo