
Create a generic cached service module
Closed, Resolved, Public

Description

As we expand beyond Parsoid we need a way to expose and (optionally) store relatively simple responses from backend services. We should make this module fairly generic, so that we can easily configure it to support many services with similar caching needs.

Use cases:

  • Citoid
    • store JSON blob per format and search parameter (key), if response was 200
    • could also store only per search, but outputting the right format would then need further work in the backend module
  • Graphoid
    • store image under hash key
    • forward title, rev, hash to service on cache miss: /PageTitle/12345/a64b022a8fa5b7fc5e40a2c95cd0a114b2ae1174.png
  • MobileApps
    • store mobile-friendly HTML version of the page, store under key and revision
    • forward title, rev to service on cache miss
  • RevisionScoring service
    • generate and store JSON metadata per title & revision
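
As an illustration of the Graphoid case above, the following Python sketch builds the path that would be forwarded to the service on a cache miss. The function name and the validation rules are hypothetical; in particular, it assumes that graph hashes are always 40 lowercase hex characters, as in the example path, which the task itself does not state.

```python
import re
from urllib.parse import quote

# Assumption: graph hashes look like the example in the task,
# i.e. 40 lowercase hexadecimal characters.
_HASH_RE = re.compile(r"[0-9a-f]{40}")


def graphoid_miss_path(title: str, revision: int, graph_hash: str) -> str:
    """Build the /title/rev/hash.png path forwarded on a cache miss."""
    if revision <= 0:
        raise ValueError("revision must be a positive integer")
    if not _HASH_RE.fullmatch(graph_hash):
        raise ValueError("graph hash must be 40 lowercase hex characters")
    # Encode the title as a single path segment, so that a "/" in the
    # title cannot introduce extra segments.
    return f"/{quote(title, safe='')}/{revision}/{graph_hash}.png"
```

Calling it with the title, revision and hash from the example reproduces /PageTitle/12345/a64b022a8fa5b7fc5e40a2c95cd0a114b2ae1174.png.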

Requirements:

  • Provide an internal service endpoint, and map incoming requests to storage and backend service requests. The backend request format needs to be configurable.
  • The URI format in the spec needs to define a key and, optionally, a revision.
  • Store responses to GET requests only, and only if no query parameters were supplied.
  • Optionally (if configured), support refreshing stored content when a request carries a Cache-Control: no-cache header.
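
The storage rules above could be expressed roughly as follows. This is a minimal Python sketch, not the module's actual code: the Request type and both function names are hypothetical, and the status == 200 check is an assumption borrowed from the Citoid use case rather than a stated requirement.

```python
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class Request:
    """Hypothetical, minimal view of an incoming request."""
    method: str
    query: Mapping[str, str] = field(default_factory=dict)
    headers: Mapping[str, str] = field(default_factory=dict)


def is_storable(req: Request, status: int) -> bool:
    """Store only responses to GETs without query parameters.

    Assumption: only 200 responses are stored (taken from the Citoid case).
    """
    return req.method.upper() == "GET" and not req.query and status == 200


def wants_refresh(req: Request, no_cache_refresh: bool) -> bool:
    """True if refresh is enabled and the client sent Cache-Control: no-cache."""
    if not no_cache_refresh:
        return False
    # Header names are case-insensitive, so compare them in lower case.
    value = next(
        (v for k, v in req.headers.items() if k.lower() == "cache-control"), ""
    )
    directives = {d.strip().lower() for d in value.split(",")}
    return "no-cache" in directives
```

Keeping the two checks separate means a refresh request can bypass the stored copy while the new response is still subject to the same storage rules.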

Draft module config in config.yaml:

/{module:testservice}:
  x-modules:
    - name: simple_service
      version: 1.0.0
      type: file
      options:
        paths:
          /test/{key}:
            get:
              backend_request:
                uri: http://en.wikipedia.org/wiki/{+key}
              storage: 
                no-cache_refresh: true
                bucket_request: 
                  uri: /{domain}/sys/key_value/testservice.test
                item_request:
                  uri: /{domain}/sys/key_value/testservice.test/{key}
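
To make the intended request flow concrete, here is a hedged Python sketch of how a GET on /test/{key} might be resolved with this draft config. It assumes the templates follow RFC 6570-style semantics ({key} percent-encodes reserved characters, {+key} leaves them intact), uses an in-memory dict in place of the key_value bucket, and takes the backend fetch as a caller-supplied function. The names expand, handle_get and GET_CONFIG are illustrative only and are not RESTBase APIs.

```python
import re
from urllib.parse import quote

_UNRESERVED = "-._~"
_RESERVED = ":/?#[]@!$&'()*+,;="


def expand(template, params):
    """Expand {name} and {+name}, loosely following RFC 6570.

    Simplification: existing percent-escapes in values are re-encoded.
    """
    def substitute(match):
        reserved_ok, name = match.group(1) == "+", match.group(2)
        safe = _UNRESERVED + (_RESERVED if reserved_ok else "")
        return quote(str(params[name]), safe=safe)
    return re.sub(r"\{(\+?)(\w+)\}", substitute, template)


# The get handler from the draft config, written as a plain dict.
GET_CONFIG = {
    "backend_request": {"uri": "http://en.wikipedia.org/wiki/{+key}"},
    "storage": {
        "no-cache_refresh": True,
        "item_request": {"uri": "/{domain}/sys/key_value/testservice.test/{key}"},
    },
}


def handle_get(domain, key, storage, fetch_backend, no_cache=False,
               config=GET_CONFIG):
    """Serve from storage if possible; otherwise fetch from the backend and store."""
    store_conf = config["storage"]
    item_uri = expand(store_conf["item_request"]["uri"],
                      {"domain": domain, "key": key})
    # A refresh is honoured only if the config enables it.
    refresh = no_cache and store_conf.get("no-cache_refresh", False)
    if not refresh and item_uri in storage:
        return storage[item_uri]
    body = fetch_backend(expand(config["backend_request"]["uri"], {"key": key}))
    storage[item_uri] = body
    return body
```

With key "Foo/Bar", the backend URI keeps the slash (…/wiki/Foo/Bar) while the storage URI encodes it (…/testservice.test/Foo%2FBar), so the stored item stays a single path segment.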

Draft PR: https://github.com/wikimedia/restbase/pull/229

Event Timeline

GWicke raised the priority of this task from to Medium.
GWicke updated the task description.
GWicke added a project: RESTBase.
GWicke added a subscriber: GWicke.
Restricted Application added a subscriber: Aklapper. Apr 9 2015, 2:57 PM
GWicke updated the task description. Apr 9 2015, 2:58 PM
GWicke set Security to None.
GWicke raised the priority of this task from Medium to High. Apr 9 2015, 3:07 PM
GWicke updated the task description.
GWicke updated the task description. Apr 9 2015, 3:09 PM
GWicke updated the task description. Apr 9 2015, 11:49 PM
GWicke updated the task description.
GWicke updated the task description. Apr 9 2015, 11:52 PM
Yurik added a subscriber: Yurik. (Edited) Apr 10 2015, 2:03 AM

Don't forget private wikis: there, some images could be private (per user) and others public, e.g. hosted on the main page or accessible to anonymous users via other means. Because of how the backend works, the hash might even be the same for different graphs.

GWicke added a comment. (Edited) Apr 10 2015, 2:48 AM

For the simple case of a private wiki that does not allow read access unless you are authenticated, we can just disallow anonymous access to any resource on that wiki. Whitelisted pages themselves can be supported too, but for graphs the difficulty is verifying that a given graph is actually used in one of those pages.

Does MediaWiki actually support images or other media files in whitelisted pages?

Yurik added a comment. (Edited) Apr 10 2015, 3:21 AM

No idea about whitelisting files. Zero uses a whitelisted special page, which in turn knows who the user is and returns data accordingly (in a raw CSV format). It does set the caching headers accordingly.

Eevans added a subscriber: Eevans. Apr 10 2015, 3:36 PM
mobrovac updated the task description. Apr 10 2015, 5:37 PM
GWicke updated the task description. Apr 10 2015, 5:44 PM
GWicke updated the task description.
GWicke updated the task description. Apr 10 2015, 6:09 PM
GWicke updated the task description. Apr 10 2015, 7:47 PM
GWicke updated the task description. Apr 13 2015, 3:13 PM
GWicke lowered the priority of this task from High to Medium. Jun 23 2015, 12:13 AM

This has been merged and is in use for the Graphoid endpoint: https://en.wikipedia.org/api/rest_v1/?doc#!/Page_content/page_graph_png__title___revision___graph_id__get

The storage functionality is not yet used in production. We can start using it for immutable items like hieroglyphs (see T93787), but we need to figure out a more solid change propagation strategy for other services (see T102476 and related tasks).

GWicke moved this task from Backlog to Under discussion on the RESTBase board. Jun 29 2015, 5:37 PM
GWicke closed this task as Resolved. Dec 2 2015, 5:59 AM
GWicke claimed this task.

These use cases are now covered well by RESTBase's fairly rich handler functionality.