Page MenuHomePhabricator

Strip old metadata from old Parsoid content <head>: mw:TimeUuid, user, comment
Open, MediumPublic


As discussed in T125266 and implemented in, Parsoid recently removed user information from their <head>. For old content (spec version 1.1.0), we should strip this information before returning it to the client to avoid exposing possibly-sensitive user information.

Additionally, we used to inject a mw:TimeUuid meta tag as a fall-back solution for clients not supplying If-Match headers to the html2wt end point. We should verify that all major clients like VE have moved to the If-Match header, and strip it from old content as well.

Event Timeline

Related PR by @Pchelolo to document If-Match & log remaining mw:TimeUuid users:

After the last deployment we're logging the clients who don't supply the If-Match header, but do supply the mw:TimeUuid meta property. There's around 5 requests like that per hour. Here's a sample log entry:

  "_index": "logstash-2016.03.02",
  "_type": "restbase",
  "_id": "AVM4Ry_qO3D718AOW20G",
  "_score": null,
  "_source": {
    "host": "restbase1008",
    "level": "WARNING",
    "version": "1.0",
    "@version": "1",
    "@timestamp": "2016-03-02T17:01:41.790Z",
    "source_host": "",
    "pid": 289,
    "levelPath": "warn/parsoid/etag",
    "root_req.method": "post",
    "root_req.uri": "/",
    "root_req.headers.content-length": "377642",
    "root_req.headers.content-type": "application/x-www-form-urlencoded",
    "root_req.headers.user-agent": "wikimedia/multi-http-client v1.0",
    "root_req.headers.x-client-ip": "::ffff:",
    "root_req.headers.x-request-id": "6fa06749-e098-11e5-9d0c-d4de92971dc1",
    "request_id": "6fa06749-e098-11e5-9d0c-d4de92971dc1",
    "type": "restbase",
    "tags": [
    "message": "Client did not supply etag, fallback to mw:TimeUuid meta element",
    "normalized_message": "Client did not supply etag, fallback to mw:TimeUuid meta element",
    "gelf_level": "4"
  "sort": [

This particular log entry suggests that this edit was done by a bot, but more investigation is needed.

Flow doesn't use RESTBase, and I don't think the Parsoid change is applicable to the endpoint we use (we don't use the same kind of revisions as MW).

There's been 700 cases when the If-Match was not supplied over the last month and only 2 user agents:

  1. - a user script. I will reach out to the maintainer
  2. And, surprisingly, still VE

Tagging as a good onboarding bug as once the subtask is resolved, it will be easy to fix in code and it provides a great glimpse into what a render is, how Parsoid, RESTBase and VE interacts and what constraints we need to maintain in order to make the 3 works together correctly.