Data Platform Engineering Bug Report or Data Problem Form.
What kind of problem are you reporting?
- Access related problem
- Service related problem
- Data related problem
For a data related problem:
- Is this a data quality issue? Yes
- What datasets and/or dashboards are affected?
unique_devices_per_domain_daily
unique_devices_per_domain_monthly
unique_devices_per_project_family_daily
unique_devices_per_project_family_monthly
Both the Druid and Hive datasets appear to be impacted.
- What are the observed vs expected results? Please include information such as location of data, any initial assessments, sql statements, screenshots.
There are significant and unexpected spikes in monthly and daily unique devices to the wikifunctions project family and domains starting in February 2024.
In unique_devices_per_project_family_monthly and unique_devices_per_project_family_daily, unique devices to wikifunctions increased about 20x across all countries in February 2024. These levels have been sustained through April 2024, with a further spike for Hong Kong, where unique devices went from 20K in March 2024 to 513K in April 2024.
Trends differ in unique_devices_per_domain_monthly, where a smaller spike appears only in Singapore and the United States. That increase is smaller than the one observed in the per project family datasets, but it is still unexpected based on project trends.
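To put a number on the size of these jumps, here is a minimal Python sketch that computes a month-over-month ratio and flags it against a threshold. The function names and the 10x threshold are hypothetical choices for illustration, not part of any existing pipeline. The only inputs are the rounded Hong Kong figures quoted above (20K and 513K).

```python
def month_over_month_ratio(previous: int, current: int) -> float:
    """Return current / previous for two monthly unique-device counts."""
    if previous <= 0:
        raise ValueError("previous must be a positive count")
    return current / previous


def is_spike(previous: int, current: int, threshold: float = 10.0) -> bool:
    """Flag a jump whose ratio meets the threshold (10x is an assumed cutoff)."""
    return month_over_month_ratio(previous, current) >= threshold


# Rounded Hong Kong figures from this report: 20K in March, 513K in April 2024.
ratio = month_over_month_ratio(20_000, 513_000)
print(f"Hong Kong month-over-month ratio: {ratio:.2f}x")  # 25.65x
print("Flagged as spike:", is_spike(20_000, 513_000))     # True
```

The same check could be run per country on the project family tables to list every country that crosses the threshold, assuming the counts are exported from Hive or Druid first.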
Here's the Superset dashboard I created while exploring this issue.
It's possible this issue might be related to the issue reported in T344381.