EventBus logs don't show up in logstash
Closed, Resolved, Public

Description

There's been a burst of 400 errors in EventBus and the local service log file is full of errors, but none of them are showing up in logstash.

Restricted Application added a subscriber: Aklapper. Dec 13 2016, 12:38 AM

OK, apparently @bd808 temporarily disabled logging in https://gerrit.wikimedia.org/r/#/c/320016/2 (can that be reverted now?)

Have you done anything to change the code that was causing the failures as noted in T150106#2774165 and T150106#2777178? We did not come up with a generic solution for either problem where mixed scalar and object values are stored under the same field name.
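To make the mixed-type problem concrete, here is a minimal sketch that scans a batch of log records and reports any top-level field seen as both a scalar and an object. The function names and the sample records are hypothetical and are not taken from EventBus, change-propagation, or the logstash configuration.

```python
from collections import defaultdict


def kind(value):
    """Classify a decoded JSON value as 'object', 'array', or 'scalar'."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "scalar"


def find_mixed_fields(records):
    """Return {field: sorted kinds} for top-level fields seen with more than one kind."""
    seen = defaultdict(set)
    for record in records:
        for field, value in record.items():
            seen[field].add(kind(value))
    return {field: sorted(kinds) for field, kinds in seen.items() if len(kinds) > 1}


# Hypothetical records: one service logs `event` as a string, another as an object.
records = [
    {"msg": "a", "event": "page-edit"},
    {"msg": "b", "event": {"schema": "EventError"}},
]
print(find_mixed_fields(records))  # {'event': ['object', 'scalar']}
```

Running something like this over a sample of the local log file would show which field names need attention before logging is re-enabled.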

Nuria moved this task from Incoming to Wikistats Production on the Analytics board.
Pchelolo added a comment. Edited Mar 6 2017, 8:22 PM

Looks like https://phabricator.wikimedia.org/T150106#2777178 was improved by https://github.com/wikimedia/change-propagation/pull/133

Now the event property in ChangeProp is always a string, so it's consistent within the service. The EventBus extension no longer includes the event entry at all, but I'm not sure what the EventLogging service does. @Ottomata?
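For illustration only, the "always a string" approach could look roughly like the sketch below. This is not the code from the change-propagation pull request; `stringify_event` is a hypothetical helper that serializes a non-string `event` value to JSON before the record is logged.

```python
import json


def stringify_event(record, field="event"):
    """Return a copy of record whose `field` value, if present, is a string."""
    out = dict(record)
    value = out.get(field)
    if value is not None and not isinstance(value, str):
        # Serialize objects and other non-string values so the field type stays uniform.
        out[field] = json.dumps(value, sort_keys=True)
    return out


print(stringify_event({"msg": "x", "event": {"b": 1, "a": 2}}))
```

Records that already carry a string, or that have no `event` field, pass through unchanged.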

Nuria triaged this task as Low priority. Mar 27 2017, 3:54 PM
Nuria added a project: Easy.
Restricted Application added a subscriber: TerraCodes. Mar 27 2017, 3:54 PM
Nuria moved this task from Wikistats Production to Dashiki on the Analytics board. Jun 12 2017, 3:54 PM
Nuria edited projects, added Analytics-Kanban; removed Analytics. Jul 13 2017, 4:09 PM

Ah ok, I remember what's going on here. So https://phabricator.wikimedia.org/T150106#2777178 is about eventlogging error event logs conflicting with change prop's event object. As @Pchelolo says, if changeprop now emits event as a string, this shouldn't conflict.

I'm not sure how https://gerrit.wikimedia.org/r/#/c/320016/2/wmf-config/InitialiseSettings.php was related though, as I don't think the EventBus extension really emits error logs with an event field.

Can we reenable EventBus extension logstash stuff and see what happens?

Ottomata added a comment. Edited Jul 17 2017, 2:52 PM

I think the conflict was with the eventlogging_EventError topic. This data contains EventLogging Analytics events that had errors, usually ones that did not validate. EventLogging Analytics events are all wrapped in a capsule, with the actual event data contained in an event object.

{
  "wiki": "",
  "uuid": "3548417061a911e7860e90b11c2d80e4",
  "timestamp": 1499276503,
  "schema": "EventError",
  "revision": 14035058,
  "recvFrom": "eventlog1001.eqiad.wmnet",
  "event": {
    "schema": "MobileWikiAppFindInPage",
    "revision": 14586774,
    "rawEvent": "xxxxxxxx",
    "message": "findText is a required property",
    "code": "validation"
  }
}
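As a rough sketch of how a consumer might read such a capsule, the hypothetical helper below pulls the validation details out of the nested event object. The field names come from the example above; the function name and the shape of the returned summary are assumptions made for illustration.

```python
import json


def summarize_event_error(raw):
    """Parse an EventError capsule and return a flat summary, or None for other schemas."""
    capsule = json.loads(raw)
    if capsule.get("schema") != "EventError":
        return None
    inner = capsule.get("event", {})
    return {
        "failed_schema": inner.get("schema"),
        "failed_revision": inner.get("revision"),
        "code": inner.get("code"),
        "message": inner.get("message"),
        "recv_from": capsule.get("recvFrom"),
    }


raw = json.dumps({
    "schema": "EventError",
    "recvFrom": "eventlog1001.eqiad.wmnet",
    "event": {
        "schema": "MobileWikiAppFindInPage",
        "revision": 14586774,
        "message": "findText is a required property",
        "code": "validation",
    },
})
print(summarize_event_error(raw))
```

Flattening the capsule this way keeps event out of the indexed record as an object, which is one possible way to avoid the mixed-type clash with services that log event as a string.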

Marcel had done some work in the past to ingest eventlogging_EventError from Kafka into logstash, but I'm not sure of the current status of that.

Nuria edited projects, added Analytics; removed Analytics-Kanban. Sep 14 2017, 4:30 PM
Nuria moved this task from Dashiki to Deprioritized on the Analytics board. Sep 14 2017, 6:17 PM
Pchelolo closed this task as Resolved. Nov 8 2017, 11:13 AM
Pchelolo edited projects, added Services (done); removed Services (watching).

The EventBus logs were fixed and can now be seen in logstash. Resolving.