[Java] Ensure that missing client data from Android article events is populated
Closed, Resolved | Public | 5 Estimated Story Points

Description

Spun from T352947

While verifying data from the Metrics Platform data contract version of the migrated Android article instruments, several data objects appear to be missing properties.

The following data is missing:

  • agent: app_theme, app_flavor, app_version
  • page: page_title, page_content_language, page_id, page_namespace_id
  • performer: performer_is_logged_in, performer_session_id, performer_pageview_id, performer_language_groups, performer_language_primary, performer_groups
  • user_agent_map: browser_family, browser_major, device_family, os_family, os_major, os_minor, wmf_app_version

Push another Java library release and integrate it with the Android app to resolve the missing data.

Developer Notes

The following spreadsheet tracks queries and corresponding results from each version of the Android article instruments:

  • MEP: modern event platform - current instrument in production
  • MPdc: metrics platform data contract - in beta
  • MPmono: metrics platform monoschema - in beta but removed about a week ago; data from November/December still exists

https://docs.google.com/spreadsheets/d/1-sDP6zKWrAc8jJbeYdyst7AqfgjkZtQaavtweZkfa34/edit#gid=1794012242

Initial investigation suggests that the integration with the Android app might be faulty. It is still to be determined whether the library also drops the data during curation, queuing, or processing of the events.

The following document provides sample queries for verifying data in Hive:
https://docs.google.com/document/d/1uudMe6hdxrZDVEMj16Alokg_Jco-mqjicX5lmFEMWWo/edit

Acceptance Criteria

  • The properties from the data objects mentioned in the description are populated on the backend
  • Verify that MEP events data looks comparable to MP data contract events data
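
As a rough illustration of the first criterion, the sketch below checks a single decoded event for the properties listed in the description. It is not part of the Metrics Platform Java library: the class name `MissingClientDataCheck`, the method `findMissing`, and the assumption that an event arrives as a nested map keyed by data object name and then by the property names exactly as written in the description are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Metrics Platform library.
public class MissingClientDataCheck {

    // Expected properties per data object, copied from the task description.
    static final Map<String, List<String>> EXPECTED = new LinkedHashMap<>();
    static {
        EXPECTED.put("agent", List.of("app_theme", "app_flavor", "app_version"));
        EXPECTED.put("page", List.of(
                "page_title", "page_content_language", "page_id", "page_namespace_id"));
        EXPECTED.put("performer", List.of(
                "performer_is_logged_in", "performer_session_id", "performer_pageview_id",
                "performer_language_groups", "performer_language_primary", "performer_groups"));
        EXPECTED.put("user_agent_map", List.of(
                "browser_family", "browser_major", "device_family",
                "os_family", "os_major", "os_minor", "wmf_app_version"));
    }

    /** Returns "object.property" paths that are absent or null in the event. */
    public static List<String> findMissing(Map<String, Object> event) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : EXPECTED.entrySet()) {
            Object value = event.get(entry.getKey());
            // Treat a missing or non-map data object as an empty one.
            Map<?, ?> dataObject = Map.of();
            if (value instanceof Map) {
                dataObject = (Map<?, ?>) value;
            }
            for (String property : entry.getValue()) {
                if (dataObject.get(property) == null) {
                    missing.add(entry.getKey() + "." + property);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed sample event: agent is present but lacks app_version.
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("agent", Map.of("app_theme", "dark", "app_flavor", "beta"));
        System.out.println(findMissing(event));
    }
}
```

Running a check like this against sampled MPdc events would give a per-property list of gaps to compare with the spreadsheet results; the authoritative verification remains the Hive queries linked above.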

Event Timeline

cjming renamed this task from Ensure that missing client data from Android article events is populated to [Java] Ensure that missing client data from Android article events is populated. Jan 18 2024, 8:02 PM
cjming updated the task description.
cjming moved this task from Wikistats Backlog to Data Products Sprint 07 on the Test Kitchen board.
cjming set the point value for this task to 5.

Change 992541 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[operations/mediawiki-config@master] Update Android Metrics Platform stream configs

https://gerrit.wikimedia.org/r/992541

Android MR merged!

Next beta release for Android is Monday 2/5

I will deploy the new stream config on Monday 2/5

After 2/5, we should start seeing data coming in and I will verify data later in the week.

Change 992541 merged by jenkins-bot:

[operations/mediawiki-config@master] Update Android Metrics Platform stream configs

https://gerrit.wikimedia.org/r/992541

Mentioned in SAL (#wikimedia-operations) [2024-02-05T21:02:47Z] <cjming@deploy2002> Started scap: Backport for [[gerrit:992541|Update Android Metrics Platform stream configs (T355360)]]

Mentioned in SAL (#wikimedia-operations) [2024-02-05T21:05:29Z] <cjming@deploy2002> cjming: Backport for [[gerrit:992541|Update Android Metrics Platform stream configs (T355360)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-02-05T21:11:58Z] <cjming@deploy2002> Finished scap: Backport for [[gerrit:992541|Update Android Metrics Platform stream configs (T355360)]] (duration: 09m 10s)