Issue (include links if applicable):
The Logstash dashboard shows a high number of EventGate validation errors for CX events: https://logstash.wikimedia.org/goto/2234659c7b149ebfc3c1d4695568e602
The error is: '.event_source' should be string, '.event_source' should be equal to one of the allowed values
For all of these events, event_source was null.
~2.5K events were affected by this error during the last 90 days.
What should have happened instead?:
There are two possible scenarios for why this is happening (both should be checked):
- Events for which event_source is not relevant (for example, tab selection) but where it is explicitly set to null. Fields that are not relevant to an event should be omitted rather than set to null; missing fields will be set to null during ingestion. Please see: Event_Platform/Schemas/Guidelines#Optional_/_Missing_fields
- Events for which event_source is applicable but is null for some reason. For example, several dashboard_translation_start events have this error as well, and these should ideally have a source.
Also note: the spike in errors seems to have started in mid-March, coinciding with the release of the unified CX dashboard to desktop.
Update
After months of trying to solve this, we've come to the conclusion that having those events without event_source is better than not having them at all. We should update the code to remove the event_source field if it's null or empty so it doesn't cause the event as a whole to fail validation and be thrown out.
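A minimal sketch of the proposed change, to make the intended behaviour concrete. This is not the actual CX code: the `CxEvent` type, the `stripEmptyEventSource` helper, and the `"example_source"` value are hypothetical names chosen for illustration; only the `event_source` field and the `dashboard_translation_start` event name come from this task.

```typescript
// Hypothetical event shape: the real CX event schema has more specific fields.
type CxEvent = Record<string, unknown>;

// Hypothetical helper: returns a copy of the event without event_source
// when that field is null, undefined, or an empty string, so the field is
// omitted from the payload instead of failing schema validation.
function stripEmptyEventSource(event: CxEvent): CxEvent {
  const { event_source: eventSource, ...rest } = event;
  if (eventSource === null || eventSource === undefined || eventSource === "") {
    return rest;
  }
  // A non-empty value is kept untouched; validating it against the
  // allowed values remains EventGate's job.
  return { ...rest, event_source: eventSource };
}

// Usage with made-up payloads:
const withoutSource = stripEmptyEventSource({
  event_type: "dashboard_translation_start",
  event_source: null,
});
const withSource = stripEmptyEventSource({
  event_type: "dashboard_translation_start",
  event_source: "example_source", // hypothetical value
});
```

Returning a copy rather than deleting the key in place keeps the caller's object unchanged, which may or may not matter depending on how the real logging code builds its payloads.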
Test Case 1: Verify events with null or empty .event_source
- Trigger a CX event for which event_source is not applicable or is null (e.g., tab selection).
- Capture the event payload sent to EventGate.
- ✅❌❓⬜ AC1: Confirm that .event_source is removed entirely from the payload.
QA Results - Logstash
| AC | Status | Details |
|---|---|---|
| 1 | ✅ | T395418#11372655 |
