
Enhance section retrieval API
Closed, Resolved (Public)


In T94890 it's been decided to change the /transform/sections/to/wikitext format to accept an object in the following form:

  "mwAA": [ { "html": "<h1>bla bla bla</h1>" }, { "id": "mwAb" }, { "html": "<h1>bla</h1>", "data-mw": "asdas" } ]

So we might want to alter the section retrieval API to emit something similar. Here's the list of changes I propose:

  1. Make section retrieval emit something like:
  "mwAA": { "html": "<h1>bla bla bla</h1>" }

This format is in line with the section transformation format. Once data-mw is separated out, it would also allow us to build a /bundle endpoint that emits both html and data-mw.

  2. Potentially make section retrieval take an array instead of a comma-separated list.
  3. Potentially introduce a new hierarchy for sections to increase discoverability.
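
To make the symmetry in change 1 concrete, here is a minimal Python sketch. It assumes the retrieval API currently returns bare HTML strings keyed by section ID (an assumption, the task does not spell out the existing shape), and that a transform payload replaces one section with a list of `html` and `id` entries as in the T94890 example. The function names `wrap_sections` and `build_transform_payload` are hypothetical and not part of any existing API.

```python
def wrap_sections(raw):
    """Convert {"mwAA": "<h1>..</h1>"} (assumed current shape)
    into {"mwAA": {"html": "<h1>..</h1>"}} (proposed shape)."""
    return {section_id: {"html": html} for section_id, html in raw.items()}


def build_transform_payload(section_id, replacements):
    """Build a payload that replaces `section_id` with a list of entries.

    Each entry is either new content ({"html": ..., optionally "data-mw": ...})
    or a reference to an existing section ({"id": ...}).
    """
    for entry in replacements:
        if "html" not in entry and "id" not in entry:
            raise ValueError("each entry needs an 'html' or an 'id' key")
    return {section_id: list(replacements)}


# Example: the retrieval output and the transform input now share one shape.
retrieved = wrap_sections({"mwAA": "<h1>bla bla bla</h1>"})
payload = build_transform_payload(
    "mwAA",
    [retrieved["mwAA"], {"id": "mwAb"}, {"html": "<h1>bla</h1>", "data-mw": "asdas"}],
)
```

Because each section is an object rather than a bare string, further properties such as `data-mw` can be added later without changing the overall structure.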


This task is not a call to action, but a question of whether you think this is worth doing. @GWicke @mobrovac @Eevans what are your thoughts?

Event Timeline

Some comments:

  1. Make section retrieval emit an object per section: I like this direction, as it creates symmetry with the transform API, and lets us set up different bundle end points that all follow the same general structure. It is likely that we will have more separate metadata in the future (e.g. T55508: Move invisible page properties from the DOM to dedicated metadata), which makes the ability to add more properties to the response very useful.
  2. Potentially make section retrieval take an array instead of a comma-separated list: This would fit better with a move to JSON as the request format in general, as discussed in T111748: [RFC] Generalize POST parameter to JSON structure and header mapping in REST APIs. However, we need to address the issue of doc usability as raised in T111748#2580806.
  3. Potentially introduce a new hierarchy for sections to increase discoverability: I like this proposal. While it is unfortunate that we didn't do this before adding the mobile section end points, this would at least give us consistency and easier discovery going forward. We could also consider gradually migrating the mobile end points to this hierarchy later.

We just discussed 2) (array instead of comma-separated list) IRL, and came to the conclusion that it isn't really worth it for GET query parameters.
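
For illustration, a comma-separated list stays compact and readable in a GET URL. The following Python sketch encodes and decodes such a list; the query parameter name `sections` and both helper names are assumptions made for this example.

```python
from urllib.parse import parse_qs, urlencode


def encode_sections(ids):
    """Encode section IDs as one comma-separated query parameter."""
    # safe="," keeps the commas unescaped so the URL stays readable.
    return urlencode({"sections": ",".join(ids)}, safe=",")


def decode_sections(query):
    """Decode the comma-separated parameter back into a list of IDs."""
    values = parse_qs(query).get("sections", [""])
    return [section_id for section_id in values[0].split(",") if section_id]


query = encode_sections(["mwAA", "mwAb"])
ids = decode_sections(query)
```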

In terms of use cases for by-section retrieval, I think we are primarily interested in these two:

  1. Client has full Parsoid HTML for viewing, and needs data-mw for specific sections.
  2. Client has no data, and wants to load HTML (and possibly data-mw) for the first sections, ideally enough to fill the first screen.

Support for 2) is fairly poor right now, as clients would first need to retrieve a list of fairly random element IDs, and would then lack information on how large those sections actually are. Efforts like T114072 to define <section>s more in line with expected semantics could help here, especially in combination with a deterministic ID for the first section (such as mw0).
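
As a rough sketch of what use case 2 would need, suppose the API exposed an ordered list of section IDs together with their sizes. No such metadata exists today, so the input format, the byte budget, and the function name `pick_first_sections` are all hypothetical. A client could then pick the leading sections like this:

```python
def pick_first_sections(sections, byte_budget):
    """Pick leading sections until their combined size reaches the budget.

    `sections` is an ordered list of (section_id, size_in_bytes) pairs,
    which is hypothetical metadata. The first section is always included
    so the client has something to render.
    """
    chosen = []
    total = 0
    for section_id, size in sections:
        chosen.append(section_id)
        total += size
        if total >= byte_budget:
            break
    return chosen


first_screen = pick_first_sections(
    [("mw0", 500), ("mwAb", 700), ("mwAc", 900)], byte_budget=1000
)
```

With a deterministic first ID such as mw0, the client could even request the first section without fetching any listing beforehand.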

Pchelolo edited projects, added Services (later); removed Services.
GWicke triaged this task as Medium priority. Oct 12 2016, 7:38 PM
GWicke edited projects, added Services (next); removed Services (later).

The old section retrieval API has been deprecated and removed.