
Make channel(topic) a filterable field in logstash
Closed, Resolved (Public)

Description

The graph here: https://grafana-rw.wikimedia.org/d/000000566/overview?orgId=1&viewPanel=16&from=now-90d&to=now&forceLogin&editPanel=16
is meant to represent user-facing client-side errors, and to warn the web team when problematic code has been introduced to our servers.

We have an established baseline, stable for over a year, of fewer than 5k errors an hour. The low rate has made it extremely obvious when errors are introduced to group 0 and group 1 wikis.

However, it has become common that teams understandably want to log errors to debug issues. The Growth team in particular have been utilizing this.

A recent manually logged error has pushed the error rate up to 25k an hour. This makes it near impossible for the web team to notice errors as they roll out to group 1 wikis and before they roll out to group 2 wikis.

Needs

  • We'd like to be able to filter out any errors that are not from the main channel.

mw.errorLogger.logError takes a topic as a parameter, so this should be visible and filterable in logstash.

  • With the new field in place we'd need to update the logstash graph to only show errors in the main channel
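As a rough sketch of what these two needs amount to, the snippet below uses a hypothetical stand-in for mw.errorLogger.logError( error, topic ) that stores the topic next to each error, then keeps only the events on a chosen topic. The `logged` array, the `filterByTopic` helper, the `error.caught` default and the `error.visualeditor` topic are all assumptions made for illustration; this is not the real logging pipeline, and which topic counts as the "main channel" is left as a parameter.

```javascript
// Hypothetical stand-in for mw.errorLogger.logError( error, topic ); not the real module.
const logged = [];

function logError( error, topic ) {
	// Assumption: a default topic is used when the caller passes none.
	logged.push( { message: error.message, topic: topic || 'error.caught' } );
}

// Keep only events whose topic is in the list treated as the "main channel".
function filterByTopic( events, mainTopics ) {
	return events.filter( ( event ) => mainTopics.includes( event.topic ) );
}

// One error on the default topic, one logged manually on a team-specific topic.
logError( new Error( 'Unexpected null in reader code' ) );
logError( new Error( 'Incompatible skin: vector' ), 'error.visualeditor' );

const mainOnly = filterByTopic( logged, [ 'error.caught' ] );
console.log( mainOnly.map( ( event ) => event.message ) );
```

With the topic stored on every event, a dashboard query can make the same selection without matching on message text.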

Event Timeline

Jdlrobson renamed this task from "Make channel a filterable field in" to "Make channel(topic) a filterable field in logstash". Aug 10 2022, 5:46 PM

I wonder if there should be a separate warning and error channel. Errors typically should be treated no differently than exceptions - they are used e.g. when an exception was caught manually to provide graceful error handling, for logic errors (when some theoretically impossible condition occurs), for backend errors etc. I don't think they should be filtered out in general. The "Your skin is incompatible with VisualEditor" error should not be filtered out in theory either, as it is not supposed to happen in Wikimedia production (we have no such skins), that would have just been a temporary measure about a known bug with minimal impact.

That said, the field should be present in Logstash. It's useful for routing bug reports, for team-specific monitoring of error volume etc.

Currently there is no channel field associated with client error logs. The only thing we can do from our side is to manually exclude the message "Incompatible skin: vector".

Please reach out to the Observability team if you have any questions.

Change 700242 had a related patch set uploaded (by Jdlrobson; author: Gergő Tisza):

[mediawiki/extensions/WikimediaEvents@master] clientError: Log everything sent from mw.errorLogger.logError()

https://gerrit.wikimedia.org/r/700242

Change 700242 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] clientError: Log everything sent from mw.errorLogger.logError()

https://gerrit.wikimedia.org/r/700242

Is this done? It should be possible to filter by error_context.component.
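For illustration only, filtering on that field could look like the sketch below. Only the error_context.component field name comes from this task; the record shape, the component values and the `excludeComponents` helper are assumptions, and a real dashboard would express this as a query rather than in code.

```javascript
// Hypothetical log records; only the error_context.component field name is from this task.
const records = [
	{ message: 'TypeError: x is undefined', error_context: { component: 'unknown' } },
	{ message: 'Incompatible skin: vector', error_context: { component: 'visualeditor' } },
	{ message: 'Older record with no context' }
];

// Return the records whose component is not in the excluded list.
// Records lacking the field are kept, so older logs are not silently dropped.
function excludeComponents( items, excluded ) {
	return items.filter( ( item ) => {
		const component = item.error_context && item.error_context.component;
		return !excluded.includes( component );
	} );
}

const remaining = excludeComponents( records, [ 'visualeditor' ] );
console.log( remaining.map( ( item ) => item.message ) );
```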

Jdlrobson claimed this task.

Yes! This works great. Thank you for doing this!