
[Java] Fix EventGate validation error related to performer.session_id
Closed, Resolved | Public | 2 Estimated Story Points

Description

While resolving T356938, it appeared that performer.session_id values sent by the Java client might be malformed.

Ensure that performer.session_id values match the constraints from the MP base schemas.

See validation errors in logstash:
https://logstash.wikimedia.org/goto/15eafe27f5a52735ee7cd0b509e2aaeb

'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern "^[0-9a-z]{20}$"

Filtered example:
https://logstash.wikimedia.org/goto/be2b358e83c911466837064b6c1f35a5

{
  "_index": "logstash-default-1-7.0.0-1-2024.02.09",
  "_id": "0ZC4j40B8T_a4T-ee3z8",
  "_version": 1,
  "_score": null,
  "_source": {
    "errored_stream_name": "android.product_metrics.article_toolbar_interaction",
    "emitter_id": "eventgate-production",
    "@timestamp": "2024-02-09T21:14:08.629Z",
    "$schema": "/error/1.0.0",
    "tags": [
      "input-kafka-eventgate-analytics-external-validation-error-eqiad",
      "kafka",
      "es",
      "eventgate",
      "normalized_message_untrimmed"
    ],
    "meta": {
      "dt": "2024-02-09T21:14:07.622Z",
      "request_id": "018b2b8e-bd05-4741-94b6-21d9d8e12622",
      "uri": "unknown",
      "domain": "de.wikipedia.org",
      "stream": "eventgate-analytics-external.error.validation",
      "id": "f86b443e-ef2a-4469-9b98-f073a761cb1f"
    },
    "@version": "1",
    "errored_schema_uri": "/analytics/product_metrics/app/base/1.0.0",
    "normalized_message": "'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern \"^[0-9a-z]{20}$\"",
    "raw_event": "{\"action\":\"article_toolbar_interaction\",\"action_context\":\"time_spent_ms.87995\",\"action_subtype\":\"load\",\"agent\":{\"app_flavor\":\"betarelease\",\"app_install_id\":\"cda66242-172b-4a5b-ad46-d730fa7f45ad\",\"app_theme\":\"DARK\",\"app_version\":470,\"client_platform\":\"android\",\"client_platform_family\":\"app\",\"device_language\":\"de\",\"release_status\":\"dev\"},\"mediawiki\":{\"database\":\"dewiki\"},\"page\":{\"content_language\":\"de\",\"id\":158904,\"namespace_id\":0,\"namespace_name\":\"MAIN\",\"revision_id\":238320560,\"title\":\"Schmerzensgeld\",\"wikidata_qid\":\"Q562125\"},\"performer\":{\"groups\":[],\"id\":74471111,\"is_logged_in\":false,\"language_groups\":\"[de]\",\"language_primary\":\"de\",\"pageview_id\":\"cd3dab9c101d699f48ab\",\"session_id\":\"13c5b49e-cc58-4589-a774-6c01c803f39c\"},\"meta\":{\"domain\":\"de.wikipedia.org\",\"stream\":\"android.product_metrics.article_toolbar_interaction\",\"id\":\"803f5e76-a073-4753-8a3b-48f7b2a9a386\",\"dt\":\"2024-02-09T21:14:07.621Z\",\"request_id\":\"018b2b8e-bd05-4741-94b6-21d9d8e12622\"},\"$schema\":\"/analytics/product_metrics/app/base/1.0.0\",\"dt\":\"2024-02-09T21:14:02Z\",\"http\":{\"request_headers\":{\"user-agent\":\"Metrics Platform Client/Java 2.2\"}}}",
    "message": "'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern \"^[0-9a-z]{20}$\"",
    "type": "eventgate_validation_error"
  },
  "fields": {
    "@timestamp": [
      "2024-02-09T21:14:08.629Z"
    ]
  },
  "highlight": {
    "normalized_message.keyword": [
      "@opensearch-dashboards-highlighted-field@'.performer.session_id' should NOT be longer than 20 characters, '.performer.session_id' should match pattern \"^[0-9a-z]{20}$\"@/opensearch-dashboards-highlighted-field@"
    ],
    "type": [
      "@opensearch-dashboards-highlighted-field@eventgate_validation_error@/opensearch-dashboards-highlighted-field@"
    ]
  },
  "sort": [
    1707513248629
  ]
}
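
The two constraints in the error message can be checked locally against values taken from the raw event above. This is a minimal sketch, not code from the Metrics Platform library: the class name `SessionIdCheck` and method `isValid` are hypothetical, and the only thing taken from the task is the pattern `^[0-9a-z]{20}$` and the two sample values.

```java
import java.util.regex.Pattern;

public class SessionIdCheck {
    // Pattern copied from the EventGate validation error message.
    // It implies both constraints: exactly 20 characters, all in [0-9a-z].
    private static final Pattern SESSION_ID = Pattern.compile("^[0-9a-z]{20}$");

    /** Returns true if the value satisfies the pattern (hypothetical helper). */
    public static boolean isValid(String sessionId) {
        return sessionId != null && SESSION_ID.matcher(sessionId).matches();
    }

    public static void main(String[] args) {
        // pageview_id from the raw event: 20 lowercase alphanumeric characters.
        System.out.println(isValid("cd3dab9c101d699f48ab")); // prints "true"

        // session_id from the raw event: a 36-character UUID with hyphens.
        System.out.println(isValid("13c5b49e-cc58-4589-a774-6c01c803f39c")); // prints "false"
    }
}
```

In the sample event, the session_id fails on both counts: it is longer than 20 characters and it contains hyphens, which matches the two messages reported by EventGate.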

~59k errors in the last 3 days - https://logstash.wikimedia.org/goto/10356cfe95f40445a93fbe1be215a282

Technical Notes

Acceptance Criteria

  • EventGate validation errors stop

Event Timeline

cjming updated the task description.
cjming moved this task from Wikistats Backlog to Data Products Sprint 09 on the Data Products board.
cjming set the point value for this task to 2.
phuedx triaged this task as Unbreak Now! priority. Feb 13 2024, 12:53 PM

Metrics Platform MR:
https://gitlab.wikimedia.org/repos/data-engineering/metrics-platform/-/merge_requests/32

^^ depends on https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/1003564

Once the schema changes and MP MR are merged, I'll publish a new release and make a PR for the Android app.

gah - this error is still happening after today's beta release
https://logstash.wikimedia.org/goto/0960a6638e0abf6609d8f9d3db82f192

moving it back to sprint 10 board :(

I'm not sure why the session ids are 24 chars instead of 20. The random id generation function is the same in the MP library as it is in the EventPlatformClient in the Android repo, and when I tested in the beta cluster the ids were all 20 characters.

Will dig in some more and hope to get another fix out for next release.
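
For reference, one way to guarantee a fixed 20-character id is to hex-encode a fixed number of random bytes with zero padding per byte. This is only a sketch under that assumption; it is not the generation function from the MP library or the EventPlatformClient, and the class name `SessionIdSketch` and method `generate` are hypothetical.

```java
import java.security.SecureRandom;

public class SessionIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /**
     * Returns 20 lowercase hex characters (hypothetical helper).
     * 10 random bytes, 2 hex digits per byte, so the length never varies.
     */
    public static String generate() {
        byte[] bytes = new byte[10];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(20);
        for (byte b : bytes) {
            // Emit the high and low nibble separately so each byte
            // always contributes exactly two characters.
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate());
    }
}
```

Because every byte maps to exactly two characters, the output always satisfies `^[0-9a-z]{20}$`, whatever the random values are.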

Fix published/merged -- next beta release is scheduled for 3/11

Turns out, beta was re-released on Friday 3/1 and validation errors are finally waning, which is a huge relief. Screenshot of EventGate validation errors from last week through today:

https://logstash.wikimedia.org/goto/3fae6d68a94497700ff7220bab9361c3

[Screenshot 2024-03-04 at 9.31.44 AM.png]

moving this to done