While resolving T356938, it became apparent that performer.session_id may be sent with malformed values.
Description
Ensure that performer.session_id values match the constraints from the MP base schemas.
See validation errors in logstash:
https://logstash.wikimedia.org/goto/15eafe27f5a52735ee7cd0b509e2aaeb
'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern "^[0-9a-z]{20}$"
Filtered example:
https://logstash.wikimedia.org/goto/be2b358e83c911466837064b6c1f35a5
{ "_index": "logstash-default-1-7.0.0-1-2024.02.09", "_id": "0ZC4j40B8T_a4T-ee3z8", "_version": 1, "_score": null, "_source": { "errored_stream_name": "android.product_metrics.article_toolbar_interaction", "emitter_id": "eventgate-production", "@timestamp": "2024-02-09T21:14:08.629Z", "$schema": "/error/1.0.0", "tags": [ "input-kafka-eventgate-analytics-external-validation-error-eqiad", "kafka", "es", "eventgate", "normalized_message_untrimmed" ], "meta": { "dt": "2024-02-09T21:14:07.622Z", "request_id": "018b2b8e-bd05-4741-94b6-21d9d8e12622", "uri": "unknown", "domain": "de.wikipedia.org", "stream": "eventgate-analytics-external.error.validation", "id": "f86b443e-ef2a-4469-9b98-f073a761cb1f" }, "@version": "1", "errored_schema_uri": "/analytics/product_metrics/app/base/1.0.0", "normalized_message": "'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern \"^[0-9a-z]{20}$\"", "raw_event": "{\"action\":\"article_toolbar_interaction\",\"action_context\":\"time_spent_ms.87995\",\"action_subtype\":\"load\",\"agent\":{\"app_flavor\":\"betarelease\",\"app_install_id\":\"cda66242-172b-4a5b-ad46-d730fa7f45ad\",\"app_theme\":\"DARK\",\"app_version\":470,\"client_platform\":\"android\",\"client_platform_family\":\"app\",\"device_language\":\"de\",\"release_status\":\"dev\"},\"mediawiki\":{\"database\":\"dewiki\"},\"page\":{\"content_language\":\"de\",\"id\":158904,\"namespace_id\":0,\"namespace_name\":\"MAIN\",\"revision_id\":238320560,\"title\":\"Schmerzensgeld\",\"wikidata_qid\":\"Q562125\"},\"performer\":{\"groups\":[],\"id\":74471111,\"is_logged_in\":false,\"language_groups\":\"[de]\",\"language_primary\":\"de\",\"pageview_id\":\"cd3dab9c101d699f48ab\",\"session_id\":\"13c5b49e-cc58-4589-a774-6c01c803f39c\"},\"meta\":{\"domain\":\"de.wikipedia.org\",\"stream\":\"android.product_metrics.article_toolbar_interaction\",\"id\":\"803f5e76-a073-4753-8a3b-48f7b2a9a386\",\"dt\":\"2024-02-09T21:14:07.621Z\",\"request_id\":\"018b2b8e-bd05-4741-94b6-21d9d8e12622\"},\"$schema\":\"/analytics/product_metrics/app/base/1.0.0\",\"dt\":\"2024-02-09T21:14:02Z\",\"http\":{\"request_headers\":{\"user-agent\":\"Metrics Platform Client/Java 2.2\"}}}", "message": "'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern \"^[0-9a-z]{20}$\"", "type": "eventgate_validation_error" }, "fields": { "@timestamp": [ "2024-02-09T21:14:08.629Z" ] }, "highlight": { "normalized_message.keyword": [ "@opensearch-dashboards-highlighted-field@'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern \"^[0-9a-z]{20}$\"@/opensearch-dashboards-highlighted-field@" ], "type": [ "@opensearch-dashboards-highlighted-field@eventgate_validation_error@/opensearch-dashboards-highlighted-field@" ] }, "sort": [ 1707513248629 ] }
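To make the failure in the example above concrete, here is a minimal sketch that checks two values from the raw_event against the pattern quoted in the error message. The helper name is_valid_session_id is hypothetical and only for illustration; it is not part of EventGate or the Metrics Platform client. In this event the session_id looks like a 36-character UUID with hyphens, while the pageview_id from the same event already has the 20-character form.

```python
import re

# Pattern quoted in the EventGate validation error message.
SESSION_ID_PATTERN = re.compile(r"^[0-9a-z]{20}$")


def is_valid_session_id(value: str) -> bool:
    """Return True if value is exactly 20 characters drawn from [0-9a-z]."""
    return SESSION_ID_PATTERN.fullmatch(value) is not None


# Values copied from the performer block of the raw_event in the example.
rejected_session_id = "13c5b49e-cc58-4589-a774-6c01c803f39c"
pageview_id = "cd3dab9c101d699f48ab"

print(len(rejected_session_id), is_valid_session_id(rejected_session_id))  # 36 False
print(len(pageview_id), is_valid_session_id(pageview_id))                  # 20 True
```

The UUID fails both reported checks: it is longer than 20 characters and contains hyphens, which fall outside [0-9a-z].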
~59k errors in the last 3 days - https://logstash.wikimedia.org/goto/10356cfe95f40445a93fbe1be215a282
Technical Notes
- The beta release on 2/8 introduced the latest fixes for AgentData and Java lib 2.2.
- Stream config was updated and deployed on 2/5.
- The performer_session_id property has been included since 12/12/23 (see https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/980963).
Acceptance Criteria
- EventGate validation errors for performer.session_id stop.
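
As an illustration of what a conforming value looks like, the sketch below generates an identifier that satisfies the pattern from the error message. This is an assumption for discussion only: new_session_id is a hypothetical helper, and nothing here describes how the Java client library actually generates its IDs.

```python
import re
import secrets

# Pattern quoted in the EventGate validation error message.
SESSION_ID_PATTERN = re.compile(r"^[0-9a-z]{20}$")


def new_session_id() -> str:
    """Hypothetical generator: 10 random bytes rendered as 20 lowercase hex characters."""
    return secrets.token_hex(10)


candidate = new_session_id()
# Hex digits are a subset of [0-9a-z], so the candidate satisfies the pattern.
assert SESSION_ID_PATTERN.fullmatch(candidate) is not None
print(len(candidate))  # 20
```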