Page MenuHomePhabricator

Inconsistent use of structured log message properties in change-propigation
Closed, ResolvedPublic

Description

The log events generated by https://github.com/wikimedia/change-propagation reuse names for structured context data for values with inconsistent types. event seems to be especially overloaded with both object and string data when added to various different log messages. There also seems to be an inconsistent use of message and msg.

There's nothing wrong with this except when we try to store the log events in Elasticsearch as part of our ELK stack for unified log reporting. See T150106 for some of the issues that this causes there. Self-inconsistencies are bad and type collisions across applications are worse.

Event Timeline

@bd808 Is there a way to query logstash for a complete list of self-inconsistencies? If there is I will fix up all of them ChangeProp in one go, I'd like to avoid doing a lot of iterations on this.

For the cross-service inconsistencies I think Elasticsearch should handle that somehow internally (something like prefixing the field names with the service name may be?), because trying to communicate the rules on which field names correspond to which types all the services and all the service, core and extension developers seem like a completely impossible thing to do..

@bd808 Is there a way to query logstash for a complete list of self-inconsistencies? If there is I will fix up all of them ChangeProp in one go, I'd like to avoid doing a lot of iterations on this.

Sadly, no. When there is a type collision the event doesn't get indexed and that is only logged on the Elasticsearch master's error log. The error that led me here was specifically for "event" being sometimes an object and sometimes a string across all log events we see. When I looked at the change-prop code it seemed to be self inconsistent. I was basically just grepping for .log and looking at the labels and value used.

For the cross-service inconsistencies I think Elasticsearch should handle that somehow internally (something like prefixing the field names with the service name may be?), because trying to communicate the rules on which field names correspond to which types all the services and all the service, core and extension developers seem like a completely impossible thing to do..

Agreed. That's becoming the topic of the parent task.

Pchelolo added a project: Services (next).

@bd808 Ok, thank you. I will do the grep-check then and will create a PR shortly. Assigning this to myself.

Change deployed. Resolving.