Reading List REST Interface: create REST endpoints
Closed, Resolved (Public)

Description

Create MediaWiki REST API handlers (and associated code) to (eventually) replace the existing RESTBase "lists" endpoints and the existing Reading Lists extension Action API endpoints.

The new endpoints should be call-compatible with the existing RESTBase endpoints. In all but one case, the existing RESTBase endpoints simply forward the call to Action API endpoints, so much of the work will be creating MediaWiki REST API endpoints that are call-compatible with the RESTBase endpoints but provide the functionality of the Action API endpoints. (The exception is /lists/{id}/entries. This one does make an Action API call, but also includes logic in the RESTBase code that will need to be duplicated.)

It may be necessary to modify/refine the MediaWiki REST API infrastructure in various ways (for example, T305973: JsonBodyValidator does not validate the parameter types). If any such work is significant, we can create separate subtasks.

Individual endpoint subtasks:

All handlers should include tests, any appropriate monitoring/logging support, etc.

At the RESTBase level, response header handling in the existing code is provided by mediawiki_auth_filter.js. This filter (if the author of this task is reading it correctly...) includes any 'cookie', 'x-forwarded-for', or 'x-client-ip' headers provided by the Action API endpoint. This means that we need to match the header handling provided by Action API for (only) those headers.

Additional headers may be provided by other layers of the infrastructure (varnish, etc). We should confirm header handling before considering the endpoints complete.

Callers are currently providing csrf tokens as a query parameter. For example:

curl -X 'POST' \
  'https://en.wikipedia.org/api/rest_v1/data/lists/?csrf_token=f63c343876da566045e6b59c4532450559c828d3%2B%5C' \
  -H 'accept: application/json; charset=utf-8' \
  -H 'Content-Type: */*' \
  -d '{
  "name": "Planets",
  "description": "Planets of the Solar System"
}'

The token is then included in the forwarded Action API request as a query parameter named "token", which is the normal way the Action API receives tokens. The MediaWiki REST API, on the other hand, supports csrf tokens sent as a "token" parameter in the request body. This support is implemented via the TokenAwareHandlerTrait. To avoid forcing callers to make changes, the Reading Lists handlers will need to be able to receive tokens via a "csrf_token" query parameter. TokenAwareHandlerTrait won't help with this directly, but code within it might be useful for inspiration.
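
For illustration only, here is a minimal Python sketch of the token lookup described above. It is not the extension's code (the real handlers are PHP). The function name extract_csrf_token is hypothetical, and checking the "csrf_token" query parameter before a body "token" field is an assumption about precedence, not something this task specifies.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_csrf_token(url: str, body: Optional[dict] = None) -> Optional[str]:
    """Hypothetical helper: find a CSRF token in a request.

    Looks first at the "csrf_token" query parameter (what existing
    RESTBase callers send), then at a "token" field in the parsed JSON
    body (what the MediaWiki REST API supports). The precedence is an
    assumption made for this sketch.
    """
    # parse_qs percent-decodes values, so "%2B%5C" becomes "+\".
    query = parse_qs(urlsplit(url).query)
    values = query.get("csrf_token")
    if values:
        return values[0]

    # Fall back to the request body, if one was parsed.
    if body is not None and isinstance(body.get("token"), str):
        return body["token"]

    return None
```

With the curl example above, the query-string value decodes to the 40-character hex string followed by "+\", which is the usual shape of a MediaWiki CSRF token.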

See the associated Miro board for endpoint map.
See parent task T336693: Re-implement reading lists REST interface outside RESTbase for context and details.

Event Timeline

Change 968971 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/extensions/ReadingLists@master] [DNM] Experimental REST handlers, bad code for discussion, do not merge

https://gerrit.wikimedia.org/r/968971

BPirkle renamed this task from Reading List REST Interface: create REST handlers to Reading List REST Interface: create REST endpoints. (Nov 14 2023, 1:32 AM)
BPirkle updated the task description.

@daniel, I added this sentence to the task description, but I don't have a lot of confidence in it:

At the RESTBase level, response header handling in the existing code is provided by mediawiki_auth_filter.js. This filter (if the author of this task is reading it correctly...) includes any 'cookie', 'x-forwarded-for', or 'x-client-ip' headers provided by the Action API endpoint. This means that we need to match the header handling provided by Action API for (only) those headers.

Can you fact-check me?

The action API doesn't handle most headers. Authentication cookies are handled by AuthManager, XFF is handled by BlockManager. (Not sure what x-client-ip is used for, if at all.) Integration with those components should be effortless, it happens when you interact with Authority objects (or when those objects are created).

Thank you, that's good news.

I'm also unsure if x-client-ip is used for anything, but it is already included in rest.php responses, so we should be fine there as well:

curl -v "https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/bare"

< HTTP/2 200
< date: Tue, 28 Nov 2023 21:21:55 GMT
< server: mw2292.codfw.wmnet
< x-client-ip: 132.147.157.140

(other headers snipped for brevity)

Change 978599 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/extensions/ReadingLists@master] Add token handling trait for use in Reading Lists REST endpoints

https://gerrit.wikimedia.org/r/978599

Change 968971 abandoned by BPirkle:

[mediawiki/extensions/ReadingLists@master] [DNM] Experimental REST handlers, bad code for discussion, do not merge

Reason:

Abandoned in favor of https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReadingLists/+/972399 (which may itself end up being split into multiple changes...)

https://gerrit.wikimedia.org/r/968971

Change 980831 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/extensions/ReadingLists@master] Setup and Teardown REST Handlers

https://gerrit.wikimedia.org/r/980831

Change 978599 abandoned by BPirkle:

[mediawiki/extensions/ReadingLists@master] Add token handling trait for use in Reading Lists REST endpoints

Reason:

Abandoning in favor of https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReadingLists/+/980831

https://gerrit.wikimedia.org/r/978599

Change 980831 merged by jenkins-bot:

[mediawiki/extensions/ReadingLists@master] Setup and Teardown REST Handlers

https://gerrit.wikimedia.org/r/980831

Change 983241 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/extensions/ReadingLists@master] REST Handlers for managing reading lists

https://gerrit.wikimedia.org/r/983241

There has been some discussion of continuation values (as they apply to the GET /lists/ and GET /lists/entries endpoints) in synchronous meetings. There's some related information in T182706: Make sure apps can continue /changes/since where they left off (which I do not pretend that I completely understand as of this writing). But as it applies to Reading Lists, here is what I think is going on (someone please correct me if they see that I am wrong):

  • There are two separate values involved. One is the "continuation data" used for pagination. The other is a "sync timestamp" for use in the /changes/since endpoint exposed by RESTBase, which maps to MODE_CHANGES in the Action API module. These are different values, for different purposes.
  • The format of the "continuation data" varies by endpoint and sorting type. The possibilities are:
    • id: the id of the first item that should be returned on a subsequent call. This is used only for the lists/pages/{project}/{title} endpoint
    • name, id: the name and id of the first item that should be returned on a subsequent call. This is used when sorting by name
    • updated, id: the timestamp and id of the first item that should be returned on a subsequent call. This is used when sorting by date updated
  • The "sync timestamp" is calculated using the "MaxUserDBWriteDuration" config value, which is the "Max time (in seconds) a user-generated transaction can spend in writes." Note that this is a MediaWiki core config value, not a Reading Lists extension config value.
  • The Action API returns a "sync timestamp" value only if the request did not include a "continuation data" parameter. In other words, a sync timestamp is not returned for requests for the second or later page within a set of results.
  • The parameter naming is adjusted in the RESTBase <=> Action API exchange:
    • Action API returns the "sync timestamp" value by the name "readinglists-synctimestamp".
    • RESTBase returns this value to the caller by the name "continue-from". It uses the Action API value verbatim, if it exists in the response. Otherwise, if there was a "continuation data" parameter in the request, RESTBase does not include a "sync timestamp" value (by any name) in the response. Otherwise (there was no "continuation data" parameter, but Action API did not return a "sync timestamp" value) RESTBase calculates its own sync timestamp value and returns that to the caller (by the name "continue-from"). It is not clear to me under what circumstances this could occur, or if it ever actually does.
    • RESTBase uses the name "next" for the "continuation data" parameter, while Action API uses the names "continue/rlcontinue"
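
The naming rules in the list above can be sketched in Python as follows. This is an illustrative reading of the description, not code from RESTBase or the extension. The function name rest_continuation_fields and the fallback_timestamp callable are hypothetical; how RESTBase actually computes its own fallback value is not covered here, so the sketch takes it as a parameter.

```python
from typing import Callable, Optional


def rest_continuation_fields(
    action_continue: Optional[str],
    action_sync_timestamp: Optional[str],
    request_had_next: bool,
    fallback_timestamp: Callable[[], str],
) -> dict:
    """Hypothetical mapping of Action API values to RESTBase-style names.

    action_continue:       continuation data from the Action API, if any
    action_sync_timestamp: "readinglists-synctimestamp" value, if any
    request_had_next:      whether the caller sent continuation data
    fallback_timestamp:    stand-in for RESTBase's own calculation
    """
    fields = {}

    # Continuation data for pagination is exposed as "next".
    if action_continue is not None:
        fields["next"] = action_continue

    if action_sync_timestamp is not None:
        # Use the Action API value verbatim when it is present.
        fields["continue-from"] = action_sync_timestamp
    elif not request_had_next:
        # First page, but no sync timestamp came back: compute one.
        fields["continue-from"] = fallback_timestamp()
    # Otherwise (second or later page): no sync timestamp at all.

    return fields
```

Keeping the two values in separate branches reflects the point made above that they serve different purposes: "next" is for pagination, "continue-from" is for synchronization.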

While this is all a little involved from an implementation standpoint (and I don't think my current implementation is quite right, I'll fix that up...), it is pretty straightforward from a caller's perspective:

  • If you get a "next" parameter in a response, you can send it with a follow-up request to get more data
  • If you get a "continue-from" parameter in a response, you can use it with the /changes/since endpoint for synchronization
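
From the caller's side, the two rules above might look like the following Python sketch. The names collect_lists and fetch_page are hypothetical, and the "lists" key for the page items is an assumption about the response shape. Only the "next" and "continue-from" names come from the discussion above.

```python
from typing import Callable, List, Optional, Tuple


def collect_lists(
    fetch_page: Callable[[Optional[str]], dict],
) -> Tuple[List[dict], Optional[str]]:
    """Hypothetical client loop: follow "next" until it runs out.

    fetch_page(next_value) stands in for one HTTP request; it receives
    the "next" value from the previous response (None for the first
    request) and returns the decoded JSON response.
    """
    items: List[dict] = []
    continue_from: Optional[str] = None
    next_value: Optional[str] = None

    while True:
        page = fetch_page(next_value)
        # "lists" as the item key is an assumption for this sketch.
        items.extend(page.get("lists", []))

        # The sync timestamp is only expected on the first page, so
        # keep the first one seen for later /changes/since calls.
        if continue_from is None:
            continue_from = page.get("continue-from")

        next_value = page.get("next")
        if not next_value:
            break

    return items, continue_from
```

The returned continue_from value is what a client would later pass to the /changes/since endpoint.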

Change 983241 merged by jenkins-bot:

[mediawiki/extensions/ReadingLists@master] REST Handlers for managing reading lists

https://gerrit.wikimedia.org/r/983241

The OpenAPI spec for all new REST endpoints is available here: https://meta.wikimedia.beta.wmflabs.org/w/rest.php/

Because the MW REST API does not yet support filtering specs, this link includes all endpoints exposed by core and by extensions enabled in WMF production. Search for "readinglists" to find the applicable section of the spec.

Also, it is a known limitation that this spec is less specific than we'd like in various ways. Improvements are planned, but there is no specific timeframe for when they will be implemented.

BPirkle updated the task description.