
EventLogging fails to validate a Recentchanges event for he.wikipedia.org
Closed, Resolved, Public

Description

EventLogging is reporting failures, and the problem seems to be the following (already decoded):

Unable to validate: ?{"event":{"pagename":"Recentchanges","hidepatrolled":true,"namespace":"0"},"schema":"ChangesListFilters","revision":[REDACTED],"clientValidated":false,"wiki":"hewiki","webHost":"he.wikipedia.org","userAgent":"REDACTED"};   cp1066.eqiad.wmnet      6020266 2017-01-01T11:16:03     -       "MediaWiki/1.29.0-wmf.6" (u'0' is not of type 'integer')

(the REDACTED parts can be found in the logs on eventlog1001).
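The failure above can be reproduced with a minimal sketch. It uses only the Python standard library rather than EventLogging's real JSON Schema validator, and the `EXPECTED_TYPES` mapping and `type_errors` helper are hypothetical names introduced for illustration. The only constraint taken from the log is that `namespace` must be an integer; the sample payload is trimmed from the event above.

```python
import json

# Hypothetical stand-in for the schema: the only constraint known from the
# error message is that "namespace" must be an integer.
EXPECTED_TYPES = {"namespace": int}


def type_errors(event, expected_types):
    """Return a list of human-readable type mismatches for the given event."""
    errors = []
    for field, expected in expected_types.items():
        if field not in event:
            continue
        value = event[field]
        # bool is a subclass of int in Python, so reject it for integer fields.
        is_bool_as_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_int:
            errors.append(f"{field}: {value!r} is not of type {expected.__name__}")
    return errors


# Trimmed copy of the failing event: namespace arrives as the string "0".
raw = (
    '{"event":{"pagename":"Recentchanges","hidepatrolled":true,"namespace":"0"},'
    '"schema":"ChangesListFilters","wiki":"hewiki"}'
)
capsule = json.loads(raw)

bad = type_errors(capsule["event"], EXPECTED_TYPES)

# Casting the value to an integer before logging makes the check pass.
fixed_event = dict(capsule["event"], namespace=int(capsule["event"]["namespace"]))
good = type_errors(fixed_event, EXPECTED_TYPES)

print(bad)
print(good)
```

The string `"0"` and the integer `0` are distinct JSON types, which is why the event is rejected even though the value looks numeric.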

Event Timeline

elukey triaged this task as Medium priority. Jan 1 2017, 11:20 AM

Alarm:

Notification Type: PROBLEM

Service: Throughput of EventLogging EventError events
Host: graphite1001
Address: 10.64.32.155
State: CRITICAL

Date/Time: Sun Jan 1 10:29:21 UTC 2017

Additional Info:

CRITICAL: 71.43% of data above the critical threshold [30.0]

Screen Shot 2017-01-04 at 10.59.47 AM.png (602×2 px, 481 KB)

See the spike in errors; it appears as a spike on the EventError schema. The spike has since subsided, and I think the alarm can be reset.

Note to self: to see the actual EventErrors, run: kafkacat -C -b kafka1014.eqiad.wmnet -t eventlogging_EventError
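To find which schema is behind a spike, the consumed records can be tallied by schema. The sketch below assumes each EventError record is one JSON object per line, with the failing schema's name under `event.schema`. That layout is an assumption, not confirmed in this task, and `count_errors_by_schema` is a hypothetical helper. The sample lines are made up for illustration.

```python
import json
from collections import Counter


def count_errors_by_schema(lines):
    """Tally EventError records by the schema that failed validation.

    Assumes one JSON object per line, with the offending schema's name
    stored under record["event"]["schema"] (an assumed layout).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["<unparseable>"] += 1
            continue
        event = record.get("event", {}) if isinstance(record, dict) else {}
        if not isinstance(event, dict):
            event = {}
        counts[event.get("schema", "<unknown>")] += 1
    return counts


# Made-up sample lines standing in for consumer output.
sample = [
    '{"schema":"EventError","event":{"schema":"ChangesListFilters"}}',
    '{"schema":"EventError","event":{"schema":"ChangesListFilters"}}',
    '{"schema":"EventError","event":{"schema":"SomeOtherSchema"}}',
    "not json",
    "",
]
counts = count_errors_by_schema(sample)
print(counts.most_common())
```

In practice the same function could be fed the consumer's standard output line by line instead of the sample list.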

I looked at the error events and agree with the initial diagnosis that ChangesListFilters events are failing validation. Assigning to Roan, who seems to be the schema owner (https://meta.wikimedia.org/wiki/Schema:ChangesListFilters). Please fix the validation errors, or let us know if this schema is no longer used.

Change 330973 had a related patch set uploaded (by Catrope):
onChangesListSpecialPageFilters: Actually treat namespace as an integer

https://gerrit.wikimedia.org/r/330973

Change 330973 merged by jenkins-bot:
onChangesListSpecialPageFilters: Actually treat namespace as an integer

https://gerrit.wikimedia.org/r/330973