CX events EventGate validation errors: event_source should be string and equal to one of the enum values
Closed, Resolved | Public | 4 Estimated Story Points | BUG REPORT

Description

Issue (include links if applicable):

The Logstash dashboard shows a high number of EventGate validation errors for CX events: https://logstash.wikimedia.org/goto/2234659c7b149ebfc3c1d4695568e602

The error is '.event_source' should be string, '.event_source' should be equal to one of the allowed values

For all of these events, event_source was null.

~2.5K events were affected by this error during the last 90 days.
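
To illustrate why a null value produces both messages while an omitted field produces none, here is a minimal hand-rolled sketch of the check. It is not the EventGate validator or the actual CX schema: the function name and the set of allowed values are assumptions made for illustration, and only "direct_preselect" is a value that appears in this task.

```python
# Hypothetical allowed values. Only "direct_preselect" is mentioned in this
# task; the real enum lives in the CX event schema and is not reproduced here.
ALLOWED_EVENT_SOURCES = {"direct_preselect", "example_other_source"}


def validate_event_source(event: dict) -> list:
    """Return validation messages for the optional event_source field."""
    if "event_source" not in event:
        # An optional field that is simply absent passes validation.
        return []

    value = event["event_source"]
    is_string = isinstance(value, str)
    errors = []
    if not is_string:
        errors.append("'.event_source' should be string")
    if not is_string or value not in ALLOWED_EVENT_SOURCES:
        errors.append("'.event_source' should be equal to one of the allowed values")
    return errors


# A field explicitly set to null fails both checks; leaving it out is fine.
print(validate_event_source({"event_source": None}))
print(validate_event_source({"event_type": "dashboard_translation_start"}))
```
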

What should have happened instead?:

There are two possible scenarios for why this might be happening (both should be checked):

  • Events for which this field is not relevant (for example, tab selection) but where it is explicitly set to null. Fields that are not relevant to an event should be omitted rather than set to null; missing fields will be set to null during ingestion. Please see: Event_Platform/Schemas/Guidelines#Optional_/_Missing_fields
  • Events for which this field is applicable but is null for some reason. For example, several dashboard_translation_start events have this error as well, and they should ideally have a source.

Also note: the spike in errors seems to have started in mid-March, coinciding with the unified CX dashboard release to desktop.

Update

After months of trying to solve this, we've come to the conclusion that having those events without event_source is better than not having them at all. We should update the code to remove the event_source field if it's null or empty so it doesn't cause the event as a whole to fail validation and be thrown out.
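
A minimal sketch of that approach follows, assuming the event is a flat dictionary before it is submitted. The helper name `drop_empty_fields` and its `fields` parameter are hypothetical; this is not the code from the merged patch, which lives in the ContentTranslation extension and is not shown in this task.

```python
def drop_empty_fields(event: dict, fields=("event_source",)) -> dict:
    """Return a copy of the event without the listed fields when they are null or empty."""
    cleaned = dict(event)  # shallow copy, so the caller's event is left untouched
    for name in fields:
        if cleaned.get(name) in (None, ""):
            # pop() with a default also covers the case where the key is absent.
            cleaned.pop(name, None)
    return cleaned


event = {"event_type": "dashboard_translation_start", "event_source": None}
print(drop_empty_fields(event))
```

Dropping the key instead of sending null keeps the rest of the event valid, at the cost of losing the source information for those events.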


Test Case 1: Verify events with null or empty .event_source

  1. Trigger a CX event for which event_source is not applicable or is null (e.g., tab selection).
  2. Capture the event payload sent to EventGate.
  3. AC1: Confirm that .event_source is removed entirely from the payload.
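
The check in step 3 could be scripted roughly as below once the request body has been captured (for example, copied from the browser's network panel). The function name is hypothetical, and the sketch assumes the body is JSON containing either a single event object or an array of events.

```python
import json


def event_source_removed(raw_payload: str) -> bool:
    """Return True if no event in the captured body carries an event_source key."""
    body = json.loads(raw_payload)
    # Assumption: the body is either one event object or a list of them.
    events = body if isinstance(body, list) else [body]
    return all("event_source" not in event for event in events)


captured = '[{"event_type": "dashboard_translation_start"}]'
print(event_source_removed(captured))
```

Note that the check tests for the key itself, so a payload that still carries `"event_source": null` is reported as a failure.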

QA Results - Logstash

AC | Status | Details
1 | | T395418#11372655

Event Timeline

SBisson moved this task from In-progress to Prioritized on the LPL Hypothesis board.
SBisson subscribed.

I can't find where/why the empty event_source is happening. I'll let someone else take a look.

Not sure what fixed it, but this is no longer an issue! I don't see any errors on Logstash. There are some, fewer than 10 a week, so this can probably be closed.

Can you check once more? I see this still happening at the same level: https://logstash.wikimedia.org/goto/bcd0e6e008b59f4d4bfdf117e55ad8fc

Hmm, right! This error is not resolved. Thanks for double-checking.

Change #1170142 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] CX translation_start event: Log errors when empty event source is set

https://gerrit.wikimedia.org/r/1170142

Change #1170292 had a related patch set uploaded (by Nik Gkountas; author: Abijeet Patro):

[mediawiki/extensions/ContentTranslation@master] Use "direct_preselect" event source when starting translation from URL

https://gerrit.wikimedia.org/r/1170292

ngkountas set the point value for this task to 4. (Jul 17 2025, 1:32 PM)

Change #1170142 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX translation_start event: Log errors when empty event source is set

https://gerrit.wikimedia.org/r/1170142

Change #1170292 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] Use "direct_preselect" event source when starting translation from URL

https://gerrit.wikimedia.org/r/1170292

Change #1170411 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250717

https://gerrit.wikimedia.org/r/1170411

Change #1170412 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/ContentTranslation@wmf/1.45.0-wmf.10] CX3 Build 1.0.0+20250717

https://gerrit.wikimedia.org/r/1170412

Change #1170411 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250717

https://gerrit.wikimedia.org/r/1170411

Change #1170412 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@wmf/1.45.0-wmf.10] CX3 Build 1.0.0+20250717

https://gerrit.wikimedia.org/r/1170412

Mentioned in SAL (#wikimedia-operations) [2025-07-17T20:09:08Z] <sbisson@deploy1003> Started scap sync-world: Backport for [[gerrit:1170412|CX3 Build 1.0.0+20250717 (T388503 T395417 T395418)]]

Mentioned in SAL (#wikimedia-operations) [2025-07-17T20:11:08Z] <sbisson@deploy1003> sbisson: Backport for [[gerrit:1170412|CX3 Build 1.0.0+20250717 (T388503 T395417 T395418)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-07-17T20:20:18Z] <sbisson@deploy1003> Finished scap sync-world: Backport for [[gerrit:1170412|CX3 Build 1.0.0+20250717 (T388503 T395417 T395418)]] (duration: 11m 10s)

Change #1170581 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/ContentTranslation@master] useTranslationStart: set event source and context first

https://gerrit.wikimedia.org/r/1170581

Change #1170581 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] useTranslationStart: set event source and context first

https://gerrit.wikimedia.org/r/1170581

Change #1171255 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250721

https://gerrit.wikimedia.org/r/1171255

Change #1171255 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250721

https://gerrit.wikimedia.org/r/1171255

Change #1172592 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] Confirmation step: Add guard to reject navigation if URL params missing

https://gerrit.wikimedia.org/r/1172592

Change #1172607 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] CX "dashboard_translation_start" event: Assert event_source

https://gerrit.wikimedia.org/r/1172607

Change #1172592 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] Confirmation step: Add guard to reject navigation if URL params missing

https://gerrit.wikimedia.org/r/1172592

Change #1173366 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] CX instrumentation: Assert non-null fields

https://gerrit.wikimedia.org/r/1173366

Change #1172607 abandoned by Nik Gkountas:

[mediawiki/extensions/ContentTranslation@master] CX "dashboard_translation_start" event: Assert event_source

Reason:

Abandoning in favor of I05744e167ba305f2d9e4a94992e025bc906aaa67

https://gerrit.wikimedia.org/r/1172607

Change #1173366 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX instrumentation: Assert non-null fields

https://gerrit.wikimedia.org/r/1173366

Change #1175492 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250804

https://gerrit.wikimedia.org/r/1175492

Change #1175492 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250804

https://gerrit.wikimedia.org/r/1175492

Change #1179650 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] CX useEventLogging: Fix "assertNonNullFields"

https://gerrit.wikimedia.org/r/1179650

Change #1179651 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250818

https://gerrit.wikimedia.org/r/1179651

Change #1179650 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX useEventLogging: Fix "assertNonNullFields"

https://gerrit.wikimedia.org/r/1179650

Change #1179651 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20250818

https://gerrit.wikimedia.org/r/1179651

SBisson changed the task status from Open to In Progress. (Oct 22 2025, 7:06 PM)
SBisson moved this task from Prioritized to In-progress on the LPL Hypothesis board.

Change #1198142 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/ContentTranslation@master] Remove event fields that cause validation errors that we can't fix

https://gerrit.wikimedia.org/r/1198142

Change #1198142 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] Remove event fields that cause validation errors that we can't fix

https://gerrit.wikimedia.org/r/1198142

Change #1201184 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20251103

https://gerrit.wikimedia.org/r/1201184

Change #1201184 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] CX3 Build 1.0.0+20251103

https://gerrit.wikimedia.org/r/1201184

@SBisson After the patch was updated, the errors no longer appear in Logstash. I will move this to Sign-off. Thanks for all your work!

Test Result - Logstash

Status: ✅ PASS
Environment: Logstash
OS: macOS Tahoe 26.1
Browser: Chrome 142
Device: MBA
Emulated Device: N/A

Test Artifact(s):

Test Case 1: Verify events with null or empty .event_source

  1. Trigger a CX event for which event_source is not applicable or is null (e.g., tab selection).
  2. Capture the event payload sent to EventGate.
  3. AC1: Confirm that .event_source is removed entirely from the payload.

2025-11-12_11-18-30.png (1×1 px, 1 MB)

GMikesell-WMF updated Other Assignee, removed: GMikesell-WMF.
GMikesell-WMF moved this task from Needs QA to Design Signoff on the LPL Hypothesis board.

No design review is needed for this task, so moving it to product signoff.