
Unable to exclude "error" field in Logstash
Open, Medium, Public


This is not a very commonly used field name, but it is used for messages in channel:exec from type:mediawiki to describe shell commands that had stderr output.

It holds a long-ish multi-line string, similar to e.g. command, message and exception.trace.

The issue is that while I can "exclude" any of those from the current view, I cannot exclude this one:

cap1.png (1×1 px, 99 KB)
cap2.png (507×626 px, 28 KB)

Event Timeline

The error field appears to be in a cross-index field type conflict. I don't think fields in this state can be filtered.

Screenshot from 2021-03-04 08-32-56.png (575×1 px, 67 KB)

If you try the logstash-mediawiki-* index pattern (as opposed to logstash-*), the type conflict does not exist. Filters ought to work in that case.
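As a rough illustration of what a cross-index type conflict means, the sketch below compares per-index field types and reports fields that are mapped to more than one type. The index names and mappings are hypothetical, and `find_type_conflicts` is an illustrative helper, not part of Kibana or Elasticsearch. On a real cluster, the per-index types could be gathered from the Elasticsearch field capabilities API (`_field_caps`).

```python
from collections import defaultdict


def find_type_conflicts(mappings):
    """Return {field: {type: [indices]}} for fields mapped to more than one type.

    `mappings` is {index_name: {field_name: type_name}}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for index, fields in mappings.items():
        for field, field_type in fields.items():
            seen[field][field_type].append(index)
    return {
        field: {t: sorted(indices) for t, indices in types.items()}
        for field, types in seen.items()
        if len(types) > 1
    }


# Hypothetical mappings, for illustration only.
example_mappings = {
    "logstash-mediawiki-2021.03.04": {"error": "text", "message": "text"},
    "logstash-other-2021.03.04": {"error": "object", "message": "text"},
}

# Under logstash-*, both indices are considered, so `error` conflicts.
all_conflicts = find_type_conflicts(example_mappings)

# Under logstash-mediawiki-*, only matching indices are considered.
mediawiki_only = {
    name: fields
    for name, fields in example_mappings.items()
    if name.startswith("logstash-mediawiki-")
}
mediawiki_conflicts = find_type_conflicts(mediawiki_only)
```

With these made-up mappings, the conflict shows up only when both indices are in scope, which matches the behaviour described above for the narrower index pattern.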

colewhite triaged this task as Medium priority. Mar 4 2021, 3:39 PM

@colewhite Hm, I see. Does that mean there are messages in one of the non-mediawiki indexes that use this key for something that is not a string?

I could not find a way to limit a dashboard as a whole to a different index pattern. I did find this option in the Discover interface, and when editing an individual panel. However, I found to my surprise that when I switched from logstash-* to logstash-mediawiki-*, a significant portion of mediawiki messages were no longer included in the results, namely the error and exception channels that we separated in T234564. These are in indexes like logstash-deploy-2021.03.07.

I'm confused as to why one can't run these searches, given that these indexes are indeed internally separate. I thought these queries are essentially fanned out and run separately with results combined. If the query "foo:bar" is only valid on some indexes, then it seems fine to ignore those where it can't yield results.

It is especially unfortunate because these dashboards for MediaWiki, while technically using the default of logstash-*, also have at least type:mediawiki set as a base filter, so in practice they only ever match logstash-deploy-* and logstash-mediawiki-*. The reason for the separation in T234564 was so that non-essential messages can use freeform key-value pairs without risking breaking essential error monitoring through e.g. conflicting keys or reaching the maximum number of keys. (If and when that happens, we can address it, but with lower urgency.) But they are still shown together in the same dashboards, as before the split. And this seems to work fine, except that the search buttons are disabled if there is a type conflict with messages that are nowhere in the result set for type:mediawiki, nor elsewhere in the logstash-mediawiki or logstash-deploy indexes.

Does something come to mind as to how this could be solved in a somewhat future-proof way? E.g. is it possible to tell Kibana to not care about those other indexes, but consider these two patterns only? Or if not, is it possible to change the index patterns such that for this frontend purpose they are considered "one" pattern, but in the backend mediawiki and deploy would remain separate? I suspect this might be possible, because we have logstash-* today as a pattern which covers multiple "sub" patterns (mediawiki, deploy, and other stuff). Maybe if logstash-deploy-* were renamed logstash-mediawiki-deploy-*?

Relatedly, I notice that logstash-deploy-* does not actually show up in the list of index patterns. Should it? I don't see a need for anything to only query those. I'm just observing it here, in case that is not expected by you.

In 2023, the error field type conflict still exists. logstash-k8s indexes treat the error field as an object. logstash-default indexes treat it as a string field (type:parsoid-tests). logstash-mediawiki indexes include both string and object variants.

Any given field's data type must conform to a compatible data type mapping, otherwise filters cannot be applied to the field. We recommend following the ECS definition of the error field as an object.

Change 951881 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] logstash: move error to error.message when it is a string
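The idea named in the patch title can be sketched as follows. This is not the actual Logstash filter from the change, only a minimal Python illustration of the transformation under the assumption that an event is a plain dictionary; `normalize_error` is a hypothetical helper name.

```python
def normalize_error(event):
    """Return a copy of `event` in which a string `error` is moved to
    `error.message`; events whose `error` is already an object, or that
    have no `error` field at all, are returned unchanged."""
    error = event.get("error")
    if isinstance(error, str):
        return {**event, "error": {"message": error}}
    return event


# Hypothetical events, for illustration only.
string_event = {"type": "mediawiki", "error": "sh: 1: foo: not found"}
object_event = {"type": "example", "error": {"message": "boom", "code": "E1"}}

moved = normalize_error(string_event)
untouched = normalize_error(object_event)
```

After such a change the `error` field is always an object, so anything that reads `error` directly as a string would need to read `error.message` instead.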

The proposed fix is a breaking change: dashboards that expect the error field to be a string will show a - instead.

The affected sources are:

  1. type:parsoid-tests
  2. type:mediawiki
  3. kubernetes.container_name:revscoring*
  4. kubernetes.container_name:revertrisk

These can be identified by the lucene query: _exists_:error.keyword
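For anyone who wants to run this outside the Kibana search bar, the Lucene query above can be wrapped in an Elasticsearch `query_string` request body. The sketch below only builds that body. The helper name and the optional scoping expression are illustrative assumptions, and no request is sent.

```python
import json


def build_exists_query(scope=None):
    """Build a search body for `_exists_:error.keyword`, optionally
    narrowed by an extra Lucene expression such as `type:mediawiki`."""
    lucene = "_exists_:error.keyword"
    if scope:
        lucene = f"({scope}) AND {lucene}"
    return {"query": {"query_string": {"query": lucene}}}


# All affected events, and only those with type:mediawiki.
all_affected = build_exists_query()
mediawiki_affected = build_exists_query("type:mediawiki")

print(json.dumps(mediawiki_affected, indent=2))
```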