Page MenuHomePhabricator

MediaWiki Math invalid JSON in logs on Restbase server error
Closed, ResolvedPublic

Description

From time to time Logstash/Elasticsearch emit indexing errors when ingesting MW Math logs (more context on the general issue at T236343), specifically errors when talking to Restbase math server (I'm assuming mathoid?) and trying to parse response.

[2019-11-14T15:16:28,895][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-mediawiki-2019.11.14", :_type=>"mediawiki", :_routing=>nil}, 2019-11-14T15:16:28.663Z mw1348 Restbase math server problem:], :response=>{"index"=>{"_index"=>"logstash-mediawiki-2019.11.14", "_type"=>"mediawiki", "_id"=>"AW5qfLf3KWrIH1QRSOcx", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [response]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:1009"}}}}}

From Math.log on mwlog some sample log is (itself truncated because of udp MTU, although Logstash has a higher (32k) limit, so the truncation isn't the (only?) problem)

2019-11-14 15:16:28 [bb106400-06f1-11ea-aa58-03f80b650a62] mw1347 enwiki 1.35.0-wmf.5 Math ERROR: Restbase math server problem: {"request":{"method":"GET","url":"/mathoid/local/v1/media/math/render/mml/a03664eb84bcf74e3ecc7acc9c1ae91b8232b47a"},"response":{"code":400,"reason":"","headers":{"content-type":"application/problem+json","access-control-allow-origin":"*","access-control-allow-headers":"accept, content-type, content-length, cache-control, accept-language, api-user-agent, if-match, if-modified-since, if-none-match, dnt, accept-encoding","access-control-expose-headers":"etag","x-xss-protection":"1; mode=block","x-content-type-options":"nosniff","x-frame-options":"SAMEORIGIN","content-security-policy":"default-src 'none'; frame-ancestors 'none'","x-content-security-policy":"default-src 'none'; frame-ancestors 'none'","x-webkit-csp":"default-src 'none'; frame-ancestors 'none'","vary":"Accept-Encoding","date":"Thu, 14 Nov 2019 15:16:28 GMT","content-location":"https://en.wikipedia.org/api/rest_v1/media/math/render/mml/a03664eb84bcf74e3ecc7acc9c1ae91b8232b47a","access-control-allow-methods":"GET,HEAD","referrer-policy":"origin-when-cross-origin","cache-control":"private, max-age=0, s-maxage=0, must-revalidate","x-request-id":"bb221740-06f1-11ea-8b58-99fcdfa5d9bf","server":"restbase1022","content-length":"195","connection":"keep-alive"},"body":"{\"type\":\"https://mediawiki.org/wiki/HyperSwitch/errors/bad_request\",\"title\":\"Bad Request\",\"method\":\"POST\",\"detail\":[\"TeX parse error: Double subscripts: use braces to clarify\"],\"uri\":\"/complete\"}","error":""},"math_type":"mml","tex":"{\\displaystyle y=\\sum_{i=0}^\\N a_i \\mathcal{\\prod(x _i^\\alpha,y_i^\\beta,z_i^\\gamma..)} \\equiv

I think a potential fix is to stop dumping all request/response headers, maybe keep the body for debugging purposes. Also note that body is correctly (IMHO) sent as a string with the json serialization inside, as opposed to request/response which are nested objects.

Details

Related Gerrit Patches:

Event Timeline

Restricted Application added subscribers: Liuxinyu970226, Aklapper. · View Herald TranscriptNov 14 2019, 4:04 PM
mobrovac claimed this task.Nov 14 2019, 4:17 PM
mobrovac triaged this task as Medium priority.

Change 550871 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/extensions/Math@master] Minor: Only log the code and body of the response in case of error

https://gerrit.wikimedia.org/r/550871

mobrovac closed this task as Resolved.Nov 15 2019, 10:40 AM

Will go out with the next MW train.

Change 550871 merged by jenkins-bot:
[mediawiki/extensions/Math@master] Minor: Only log the code and body of the response in case of error

https://gerrit.wikimedia.org/r/550871