Page MenuHomePhabricator

Restbase math server and mediawiki EventBus have conflicting log mappings for "response"
Closed, ResolvedPublic

Description

With the upgrade of the elasticsearch cluster supporting logging to 2.3 the mappings of logs have become more strict. Specifically logs from different sources cannot have conflicting mappings.

EventBus sends events with the "response" field set to a simple string containing the error response, for example:

"_source": {
  "message": "Unable to deliver event: 500: Internal Server Error",
  "@version": 1,
  "@timestamp": "2016-06-07T12:01:02.000Z",
  "type": "mediawiki",
  "host": "deployment-mediawiki01",
  "level": "ERROR",
  "tags": [
    "syslog",
    "es",
    "es"
  ],
  "channel": "EventBus",
  "normalized_message": "Unable to deliver event: 500: Internal Server Error",
  "url": "/w/index.php?title=Benutzer_Diskussion:Perhelion&action=submit",
  "ip": "10.68.18.103",
  "http_method": "POST",
  "server": "de.wikipedia.beta.wmflabs.org",
  "referrer": "http://de.wikipedia.beta.wmflabs.org/w/index.php?title=Benutzer_Diskussion:Perhelion&action=edit&section=7",
  "unique_id": "V1a3fApEEaoAADVyTr8AAAAN",
  "wiki": "dewiki",
  "mwversion": "1.28.0-alpha",
  "reqId": "V1a3fApEEaoAADVyTr8AAAAN",
  "response": "<html><title>500: Internal Server Error</title><body>500: Internal Server Error</body></html>"
}

"_source": {
  "message": "Unable to deliver event: (curl error: 28) Timeout was reached",
  "@version": 1,
  "@timestamp": "2016-06-04T01:52:37.000Z",
  "type": "mediawiki",
  "host": "deployment-mediawiki02",
  "level": "ERROR",
  "tags": [
    "syslog",
    "es",
    "es"
  ],
  "channel": "EventBus",
  "normalized_message": "Unable to deliver event: (curl error: 28) Timeout was reached",
  "url": "/w/api.php",
  "ip": "10.68.18.103",
  "http_method": "POST",
  "server": "wikidata.beta.wmflabs.org",
  "referrer": null,
  "unique_id": "V1I0XwpEEH8AAFy6kxIAAAAA",
  "wiki": "wikidatawiki",
  "mwversion": "1.28.0-alpha",
  "reqId": "V1I0XwpEEH8AAFy6kxIAAAAA",
  "response": ""
}

The Math extension (i think, the type is mml, and logs reference restbase math server) sends a very structured response field, basically a hash containing many different properties of the response, such as:

"_source": {
  "message": "Restbase math server problem:",
  "@version": 1,
  "@timestamp": "2016-06-13T13:48:25.000Z",
  "type": "mml",
  "host": "deployment-mediawiki01",
  "level": "ERROR",
  "tags": [
    "syslog",
    "es"
  ],
  "channel": "Math",
  "normalized_message": "Restbase math server problem:",
  "url": "/wiki/User:Physikerwelt/sandbox",
  "ip": "10.68.18.103",
  "http_method": "GET",
  "server": "en.wikipedia.beta.wmflabs.org",
  "referrer": null,
  "unique_id": "V165qApEEaoAAA6qxCAAAAAR",
  "wiki": "enwiki",
  "mwversion": "1.28.0-alpha",
  "reqId": "V165qApEEaoAAA6qxCAAAAAR",
  "request": {
    "method": "GET",
    "url": "/mathoid/local/v1/media/math/render/mml/b5a503b0fbbb58ad91eb00328b1ddb2ccdf5ef19"
  },
  "response": {
    "code": 400,
    "reason": "",
    "headers": {
      "access-control-allow-origin": "*",
      "access-control-allow-headers": "accept, content-type",
      "access-control-expose-headers": "etag",
      "x-xss-protection": "1; mode=block",
      "x-content-type-options": "nosniff",
      "x-frame-options": "SAMEORIGIN",
      "content-security-policy": "default-src 'none'; frame-ancestors 'none'",
      "x-content-security-policy": "default-src 'none'; frame-ancestors 'none'",
      "x-webkit-csp": "default-src 'none'; frame-ancestors 'none'",
      "content-type": "application/json; charset=utf-8",
      "vary": "Accept-Encoding",
      "date": "Mon, 13 Jun 2016 13:48:25 GMT",
      "access-control-allow-methods": "GET",
      "cache-control": "private, max-age=0, s-maxage=0, must-revalidate",
      "x-request-id": "801ea415-316d-11e6-a610-ba67002d9c31",
      "content-length": "155",
      "connection": "keep-alive"
    },
    "body": "{\"type\":\"https://mediawiki.org/wiki/HyperSwitch/errors/bad_request\",\"title\":\"Bad Request\",\"method\":\"POST\",\"detail\":\"Expected width > 0.\",\"uri\":\"/complete\"}",
    "error": "",
    "0": 400,
    "1": "",
    "2": {
      "access-control-allow-origin": "*",
      "access-control-allow-headers": "accept, content-type",
      "access-control-expose-headers": "etag",
      "x-xss-protection": "1; mode=block",
      "x-content-type-options": "nosniff",
      "x-frame-options": "SAMEORIGIN",
      "content-security-policy": "default-src 'none'; frame-ancestors 'none'",
      "x-content-security-policy": "default-src 'none'; frame-ancestors 'none'",
      "x-webkit-csp": "default-src 'none'; frame-ancestors 'none'",
      "content-type": "application/json; charset=utf-8",
      "vary": "Accept-Encoding",
      "date": "Mon, 13 Jun 2016 13:48:25 GMT",
      "access-control-allow-methods": "GET",
      "cache-control": "private, max-age=0, s-maxage=0, must-revalidate",
      "x-request-id": "801ea415-316d-11e6-a610-ba67002d9c31",
      "content-length": "155",
      "connection": "keep-alive"
    },
    "3": "{\"type\":\"https://mediawiki.org/wiki/HyperSwitch/errors/bad_request\",\"title\":\"Bad Request\",\"method\":\"POST\",\"detail\":\"Expected width > 0.\",\"uri\":\"/complete\"}",
    "4": ""
  },
  "tex": "\\textstyle{}"
}

These two are incompatible with the new elasticsearch configuration. For the most part each day the response field will be dynamically mapped to whichever comes in first, and logs with a conflicting mapping will be rejected. Based on logging volume i would expect the restbase math server logs will just dissapear. We should normalize how this log field is used to something consistent.

A couple additional side notes:

  • The Math extension is setting the 'type' field. This should not be set by individual log messages, mediawiki sets this to 'mediawiki' and is used to distinguish where these log messages came from
  • The response hash contains all items indexed both by string and by

Event Timeline

One relatively simple option would be to change the EventBus code to submit a body nested inside response. To make things more explicit the mapping for this could be explicitly included in the template logstash uses to create indices rather than allowing it to be dynamically created

Change 295762 had a related patch set uploaded (by EBernhardson):
Remove duplicate keys from log message

https://gerrit.wikimedia.org/r/295762

Change 295765 had a related patch set uploaded (by EBernhardson):
Match the expected format of 'response' log key

https://gerrit.wikimedia.org/r/295765

Change 295762 merged by Mobrovac:
Remove duplicate keys from log message

https://gerrit.wikimedia.org/r/295762

Change 295944 had a related patch set uploaded (by EBernhardson):
Match the expected format of 'response' log key

https://gerrit.wikimedia.org/r/295944

Change 295765 abandoned by EBernhardson:
Match the expected format of 'response' log key

Reason:
new patch uploaded against master, https://gerrit.wikimedia.org/r/#/c/295944/

https://gerrit.wikimedia.org/r/295765

Change 295944 merged by Mobrovac:
Match the expected format of 'response' log key

https://gerrit.wikimedia.org/r/295944

Change 296473 had a related patch set uploaded (by EBernhardson):
Match the expected format of 'response' log key

https://gerrit.wikimedia.org/r/296473

Change 296474 had a related patch set uploaded (by EBernhardson):
Remove duplicate keys from log message

https://gerrit.wikimedia.org/r/296474

Change 296474 merged by jenkins-bot:
Remove duplicate keys from log message

https://gerrit.wikimedia.org/r/296474

Change 296473 merged by jenkins-bot:
Match the expected format of 'response' log key

https://gerrit.wikimedia.org/r/296473