Stack traces in logstash should follow ecs schema for field naming
Closed, ResolvedPublic

Description

In logstash, there is inconsistent field naming for stack traces.

  • Client-side errors use the field stack_trace, which at least matches the ECS field name.
  • MediaWiki errors, on the other hand, use exception.trace.

I can modify Phatality to support both fields, but the right thing to do here seems to be unifying the logging under the ECS schema.
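For illustration, supporting both fields could look roughly like the sketch below. This is not Phatality's actual code: the helper names (get_path, find_stack_trace) are hypothetical, and it assumes a document is available as a plain dict in which exception.trace may appear either as a flat key or as a nested object.

```python
from typing import Any, Optional

# Field names taken from the task description; checked in this order.
STACK_TRACE_FIELDS = ("stack_trace", "exception.trace")


def get_path(doc: dict, dotted: str) -> Optional[Any]:
    """Look up a dotted path, accepting either a flat key or nested objects."""
    if dotted in doc:
        return doc[dotted]
    current: Any = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find_stack_trace(doc: dict) -> Optional[Any]:
    """Return the first stack trace field present in the document, if any."""
    for field in STACK_TRACE_FIELDS:
        value = get_path(doc, field)
        if value is not None:
            return value
    return None
```

A lookup like this only papers over the naming difference. It does not address differences in stack trace format between the two sources.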

I couldn't find a todo for this, although it might fall under other ongoing ECS-related work. Apologies if this is duplicated elsewhere.

Event Timeline

mmodell renamed this task from logstash stack traces should follow ecs schema naming to Stack traces in logstash should follow ecs schema for field naming. May 26 2021, 6:42 PM
colewhite triaged this task as Medium priority. May 27 2021, 4:56 PM
colewhite moved this task from Inbox to Radar on the observability board.

This may be obsoleted by T284830. In its current form, I don't think Phatality should be enabled at all on the client-errors dashboard (or indeed for any event that is not "type:mediawiki"). If we do create modes of it for other event types, we'd have to do a lot more than select the right field for the stack trace. We'd also have to select different fields elsewhere, since there are logical differences beyond the names for the same thing (e.g. the reqId of the MW request, the frontend request, or the client-error beacon request), and apply different normalization to the data as well (e.g. a different stack trace format).

As for the name of exception.trace: it was chosen before we looked at ECS, so it seems like that's the one that should change. However, I would argue against renaming it here, since it's part of the larger exception structure. I don't know if ECS has a prescription for what such a field would look like, but it would differ from the one for messages that are in their entirety an exception. E.g. perhaps it would end up as exception.stack_trace for errors from PHP, but either way queries and such wouldn't work across them.

That's apart from the current limitation in Kibana, where querying the logstash indexes for mediawiki together with the logstash indexes for other datasets in a single query isn't supported. This appears to be an arbitrary UI limitation, but right now it seems we couldn't e.g. query across "mediawiki" and "scap" (which is a regression from before ECS was adopted for Scap, and has made deployment more confusing and time-consuming).

Change 832004 had a related patch set uploaded (by Cwhite; author: Cwhite):

[releng/phatality@master] Add document data model and support ECS

https://gerrit.wikimedia.org/r/832004

Change 832004 merged by jenkins-bot:

[releng/phatality@master] Add document data model and support ECS

https://gerrit.wikimedia.org/r/832004

Change 888737 had a related patch set uploaded (by Cwhite; author: Cwhite):

[releng/phatality@master] Update deploy/phatality-2.4.1.zip for deployment

https://gerrit.wikimedia.org/r/888737

Change 888737 merged by jenkins-bot:

[releng/phatality@master] Update deploy/phatality-2.4.1.zip for deployment

https://gerrit.wikimedia.org/r/888737

colewhite claimed this task.
colewhite subscribed.

It would be a larger lift to get MediaWiki to produce ECS than it is to enable Phatality to handle field mapping itself.

With the latest deploy, Phatality now maps fields based on document type.
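As a rough illustration of what mapping fields by document type can mean, here is a minimal sketch. It is not the code from the merged change: FIELD_MAPS, lookup and normalize are hypothetical names, the "default" entry is an assumed fallback for documents that already use stack_trace, and the only field names used are ones mentioned in this task (type, reqId, exception.trace, stack_trace).

```python
from typing import Any, Dict, Optional

# Hypothetical mapping: logical name -> source field, per document type.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "mediawiki": {"stack_trace": "exception.trace", "request_id": "reqId"},
    # Assumed fallback for documents that already use the ECS-style name.
    "default": {"stack_trace": "stack_trace"},
}


def lookup(doc: dict, dotted: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts; None if any part is missing."""
    current: Any = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def normalize(doc: dict) -> Dict[str, Any]:
    """Project a raw document onto the logical field names for its type."""
    field_map = FIELD_MAPS.get(doc.get("type"), FIELD_MAPS["default"])
    return {name: lookup(doc, source) for name, source in field_map.items()}
```

With a table like this, code that consumes the normalized document only ever reads stack_trace or request_id, and supporting another document type means adding an entry rather than renaming fields at the source.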