Page MenuHomePhabricator

Spike: Specify changes to page-summary endpoint
Closed, ResolvedPublic

Description

Services intend to implement v2.0.0 of the page summary endpoint as it provides an opportunity to create a good example of how to implement RESTBase migration handlers (v2.0.0 -> v1.1.2, for example). Per T165017#3271061, what they need from us in order to start work is for us to:

  • Specify/discuss any other major upcoming changes that can be done as part of the content type bump.
  • See T165619#3272994.
  • Document which textextracts request parameters to change for v2.0.0.
  • See T165619#3272926.

Related Objects

Event Timeline

Are we talking about RESTBase's /page/summary/{title} endpoint? Why are they working on version 2.0.0? Any background info and clarification to the task description would help. Thanks.

@bmansurov: Hopefully the description is a little clearer now.

Document which textextracts request parameters to change for v2.0.0.

Drop explaintext=true from the current request parameters, i.e.

const parameters = {
  prop: 'extracts',
  exsentences: 5,
  exintro: true
};

e.g. https://en.wikipedia.org/w/api.php?action=query&format=json&prop=extracts&titles=Monuments+(band)&exsentences=5&exintro=1&formatversion=2

phuedx updated the task description. (Show Details)

Specify/discuss any other major upcoming changes that can be done as part of the content type bump.

The intention is to whitelist HTML elements in the extract by tag name, CSS class, or ID so that the AC of T113094: [EPIC] The Page Summary API needs to provide useful content for the majority of articles are satisfied. Since this only affects the extract property of the response, any change to the whitelist only corresponds to a minor version bump.

@GWicke: I'm not entirely sure how RESTBase migrates from one version of the response to another but it seems as if we'll need to store both the untransformed and transformed textextract response so that we can migrate between versions easily, i.e.

function migrate( response, _mostRecentVersion, targetVersion ) {
  const extract = response.rawExtract; // Some property that isn't sent to the client as part of the response.
  let whitelist;

  switch ( targetVersion ) {
    case 'v2.2.0':
      whitelist = [ '.foo', '.bar', '#baz' ];

    case 'v2.1.0':
      whitelist = [ '.foo', '.bar', '#baz', '.qux.quux' ]; // More restrictive than v2.2.0, for example.

    default:
      whitelist = [ '.foo', '.bar' ];
  }

  response.extract = transformExtract( extract, whitelist );

  return response;
}

Should this have a time limit on it? Or is it "research until it is done"?

I'm not entirely sure how RESTBase migrates from one version of the response to another but it seems as if we'll need to store both the untransformed and transformed textextract response so that we can migrate between versions easily

Our API versioning policy considers minor versions non-critical, so backwards compatibility concerns only affect major versions. When a minor version is bumped, we will re-render all stored content with that version right away. For patch version bumps, we might or might not re-render.

If future whitelist changes are in fact breaking for old clients, then a new major version would be in order.